
Database Management Systems Versus File Management Systems

Database management systems (DBMS) allow access to multiple files or tables at a time, while file management systems (FMS) only allow access to single files or tables. FMS are simpler to use and less expensive than DBMS, but have limitations like lack of support for transactions, recovery capabilities, and concurrent multi-user access. DBMS offer advantages like flexibility, support for larger databases, data integrity, and security features. While FMS may be sufficient for some small databases, DBMS provide more power and functionality for medium to large organizations through features like queries, transactions, and scalability.

Uploaded by

SaimaSadiq
Copyright
© Attribution Non-Commercial (BY-NC)

Database Management Systems Versus File Management Systems

A Database Management System (DMS) is a combination of computer software,
hardware, and information designed to electronically manipulate data via
computer processing. Two types of database management systems are DBMS’s
and FMS’s. In simple terms, a File Management System (FMS) is a Database
Management System that allows access to a single file or table at a time. FMS’s
accommodate flat files that have no relation to other files. The FMS was the
predecessor of the Database Management System (DBMS), which allows access
to multiple files or tables at a time (see Figure 1 below).

File Management Systems


Typically, File Management Systems provide the following advantages and
disadvantages:

Advantages
• Simpler to use
• Less expensive
• Fits the needs of many small businesses and home users
• Popular FMS’s are packaged along with the operating systems of personal
computers (e.g. Microsoft Cardfile and Microsoft Works)
• Good for database solutions for handheld devices such as Palm Pilot

Disadvantages
• Typically does not support multi-user access
• Limited to smaller databases
• Limited functionality (e.g. no support for complicated transactions,
recovery, etc.)
• Decentralization of data
• Redundancy and integrity issues

The goals of a File Management System can be summarized as follows (Calleri,
2001):

• Data Management. An FMS should provide data management services to
the application.
• Generality with respect to storage devices. The FMS data abstractions and
access methods should remain unchanged irrespective of the devices
involved in data storage.
• Validity. An FMS should guarantee that at any given moment the stored
data reflect the operations performed on them.
• Protection. Illegal or potentially dangerous operations on the data should
be controlled by the FMS.
• Concurrency. In multiprogramming systems, concurrent access to the
data should be allowed with minimal differences.
• Performance. Data access speed and data transfer rate should be balanced
against functionality.
From the point of view of an end user (or application) an FMS typically provides
the following functionalities (Calleri, 2001):

• File creation, modification and deletion.
• Ownership of files and access control on the basis of ownership
permissions.
• Facilities to structure data within files (predefined record formats, etc).
• Facilities for maintaining data redundancies against technical failure
(back-ups, disk mirroring, etc.).
• Logical identification and structuring of the data, via file names and
hierarchical directory structures.
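
The services listed above map closely onto what an ordinary operating-system file API offers. The following is a minimal sketch in Python, not taken from the source: the directory names, file name and record contents are hypothetical, and the permission call only has its full effect on Unix-like systems.

```python
import shutil
import stat
import tempfile
from pathlib import Path


def demo_file_operations(root: Path) -> dict:
    """Walk through the basic file services on a scratch directory."""
    # Logical structuring: a hierarchical directory plus a file name.
    record_dir = root / "customers" / "2001"
    record_dir.mkdir(parents=True)
    record_file = record_dir / "contacts.txt"

    # File creation and modification.
    record_file.write_text("1,Smith\n")
    with record_file.open("a") as handle:
        handle.write("2,Jones\n")

    # Access control based on ownership: owner may read and write, nobody else.
    record_file.chmod(stat.S_IRUSR | stat.S_IWUSR)

    # Redundancy against technical failure: keep a back-up copy.
    backup_file = root / "backup" / "contacts.txt"
    backup_file.parent.mkdir()
    shutil.copy2(record_file, backup_file)

    # File deletion; the back-up copy survives.
    record_file.unlink()
    return {
        "original_exists": record_file.exists(),
        "backup_lines": backup_file.read_text().splitlines(),
    }


with tempfile.TemporaryDirectory() as scratch:
    result = demo_file_operations(Path(scratch))
    print(result)
```

Note what is absent: nothing here relates one file to another or protects a multi-step change, which is the gap the DBMS goals below address.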

Database Management Systems


Database Management Systems provide the following advantages and
disadvantages:

Advantages
• Greater flexibility
• Good for larger databases
• Greater processing power
• Fits the needs of many medium to large-sized organizations
• Storage for all relevant data
• Provides user views relevant to tasks performed
• Ensures data integrity by managing transactions (ACID test = atomicity,
consistency, isolation, durability)
• Supports simultaneous access
• Enforces design criteria in relation to data format and structure
• Provides backup and recovery controls
• Advanced security

Disadvantages
• Difficult to learn
• Packaged separately from the operating system (i.e. Oracle, Microsoft
Access, Lotus/IBM Approach, Borland Paradox, Claris FileMaker Pro)
• Slower processing speeds
• Requires skilled administrators
• Expensive
The goals of a Database Management System can be summarized as follows
(Connelly, Begg, and Strachan, 1999, pp. 54–60):

• Data storage, retrieval, and update (while hiding the internal physical
implementation details)
• A user-accessible catalog
• Transaction support
• Concurrency control services (multi-user update functionality)
• Recovery services (damaged database must be returned to a consistent
state)
• Authorization services (security)
• Support for data communication
• Integrity services (i.e. constraints)
• Services to promote data independence
• Utility services (i.e. importing, monitoring, performance, record deletion,
etc.)
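
Transaction support, integrity services and recovery to a consistent state can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a DBMS; the account table, the names and the amounts are assumptions made up for illustration. The transfer credits one account and then debits the other, and if the debit violates the integrity constraint the whole transaction is rolled back, so the credit disappears as well.

```python
import sqlite3


def transfer(conn, source, target, amount):
    """Move `amount` between two accounts as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 20)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are kept
failed = transfer(conn, "alice", "bob", 500)  # debit breaks the CHECK; credit undone
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

A flat-file program would have to implement the "undo the first write if the second fails" logic by hand.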

The components to facilitate the goals of a DBMS may include the following:

• Query processor
• Data Manipulation Language preprocessor
• Database manager (software components to include authorization control,
command processor, integrity checker, query optimizer, transaction
manager, scheduler, recovery manager, and buffer manager)
• Data Definition Language compiler
• File manager
• Catalog manager

Conclusion

The Database Management System evolved from the File Management System. Part
of the DBMS evolution was driven by the need for more complex databases that
the FMS could not support (e.g. interrelationships between files). Even so,
there will always be a need for the File Management System as a practical tool
and in support of small, flat file databases. Choosing a DBMS in support of
developing databases for interrelations can be a complicated and costly task.
DBMS’s are themselves evolving into another generation of object-oriented
systems. The Object-Oriented Database Management System market is expected to
grow at a rate of 50% per year (Connelly, Begg, and Strachan, 1999, p. 755).
Object-Relational Database Management System vendors such as Oracle, Informix,
and IBM have been predicted to gain a 50% larger share of the market than the
RDBMS vendors. Whatever the direction, the Database Management System has
gained its permanence as a fundamental root source of the information system.

Advantages of database systems

Database advantages include the following:
• shared data;
• centralized control;
• redundancy control;
• improved data integrity;
• improved data security; and,
• flexible conceptual design.

Disadvantages of database systems

Disadvantages are as follows:
• a complex conceptual design process;
• the need for multiple external databases;
• the need to hire database-related employees;
• high DBMS acquisition costs;
• a more complex programmer environment;
• potentially catastrophic program failures;
• a longer running time for individual applications; and,
• highly dependent DBMS operations.


Nowadays every open source CMS or other script, down to the most basic of
ideas such as simple password-protection scripts, demands the use of a (usually)
MySQL database to function. Simple scripts using Berkeley DB files or even
plain text files are seen as marginal, too limited or too awkward for use. I have
several sites which run a simple CMS based on plain text files, and I can't see a
reason to switch.
Is MySQL overkill for many uses, or does it offer any significant advantages
over flat files other than the fact that you can't reliably use flat files when
load-balancing between different servers? Flat files are usually faster, aren't they?
What do MySQL/PostgreSQL/MS-SQL offer that makes them so popular for web
development?

arran 5:48 pm on Sep. 5, 2005 (utc 0) #:1579015

Very strange - I was thinking about this yesterday!
SQL (the query language) is the main reason I abandoned my
plan to switch from database to flat-files. Although not
perfect, it's extremely easy to quickly throw together all the
queries you need for a site. The thought of writing a hacky php
function every time you think of a new way to display/sort
information doesn't appeal.
Other reasons that come to mind are:
- Transactions
- Scalability (Indexing in particular)
- Security
In general, I do agree that many sites could live without
databases and in terms of performance would definitely
benefit from switching to flat-files.
arran.

txbakers 6:44 pm on Sep. 5, 2005 (utc 0) #:1579016

>> flat files are usually faster

Flat files are NOT faster. They can only be read from top to
bottom, and usually they have to be read all the way through.
A good database engine will take a query (Select field from
table where field = '#*$!x') and first determine the best way to
go into the database to solve the query, then leave the
processing when the query is solved.
Another main reason to use databases is the ability to
minimize the repetitions of data. Think about having multiple
spreadsheets for a project. If you have an ID number assigned
to a particular item, you only need to reference the ID number
between different sets of data, rather than replicating all of
the data for each sheet.
Database tables give you the ability to do this easily with
"joins".
Still, if all you need to do is validate passwords, a flat file
might be easier to use. But beyond that, you're much better off
using a database.
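
The point about the engine first determining "the best way to go into the database" can be observed directly. The sketch below is an illustration rather than anything from the thread: it uses SQLite through Python's sqlite3 module instead of MySQL, and the table, index name and data are invented. It asks the engine for its query plan before and after an index exists.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan descriptions for a query (last column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO item VALUES (?, ?)",
    [(n, f"item-{n}") for n in range(10_000)],
)

lookup = "SELECT id FROM item WHERE name = ?"
plan_without_index = query_plan(conn, lookup, ("item-9999",))

conn.execute("CREATE INDEX item_name ON item (name)")
plan_with_index = query_plan(conn, lookup, ("item-9999",))

print(plan_without_index)  # the engine has to scan the whole table
print(plan_with_index)     # the engine searches via the item_name index
```

The exact wording of the plan text differs between SQLite versions, so treat the printed strings as informal diagnostics.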

Lord Majestic 6:53 pm on Sep. 5, 2005 (utc 0) #:1579017

Good databases have a key advantage over flat files --
concurrency. When you just read stuff from a file it's easy, but
try to synchronise multiple updates or writes into a flat file from
scripts that run in different process spaces.

AffiliateDreamer 2:38 am on Sep. 7, 2005 (utc 0) #:1579018

say you go the flat file route....(maybe i'll start a new thread)
I'm curious as to how many files in a single directory have you
guys stored in the past?
I mean, can you store 1 million files in a folder on a windows
server? unix server?
Will file access times differ between unix and windows2003?

Lord Majestic 2:46 am on Sep. 7, 2005 (utc 0) #:1579019

>> I mean, can you store 1 million files in a folder on a windows
server? unix server?

Possible, but this is a very bad idea to have so many files -- I
know, I have got 130,000+ files in one directory :o

physics 3:04 am on Sep. 7, 2005 (utc 0) #:1579020

Storing a lot of flat files in one directory is a very bad idea.
There is a serious slowdown for the directory listing (though
some unix file systems are designed so that this isn't as much
of a problem, but not the common ext3). If you must use tons
of flat files you should split them up into different directories
so that each directory doesn't have too many files in it... But
really your time will be better spent getting MySQL running.

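
One common way to split files across directories is to derive the subdirectory from a hash of the file name. This is only a sketch of that idea, not something the posters describe: the two-level layout, the use of SHA-1 and the page file names are all assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def sharded_path(root: Path, name: str, levels: int = 2) -> Path:
    """Map a file name to root/ab/cd/name using leading hex digits of its hash."""
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    parts = [digest[2 * i : 2 * i + 2] for i in range(levels)]
    return root.joinpath(*parts, name)


def store(root: Path, name: str, text: str) -> Path:
    """Write text to the sharded location, creating subdirectories as needed."""
    path = sharded_path(root, name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)
    return path


with tempfile.TemporaryDirectory() as scratch:
    root = Path(scratch)
    paths = [store(root, f"page-{n}.txt", f"content {n}") for n in range(500)]
    largest_directory = max(
        sum(1 for _ in directory.iterdir())
        for directory in {p.parent for p in paths}
    )
    print(len(paths), largest_directory)
```

Because the path is computed from the name alone, a reader can find a file again without listing any directory.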
mrMister 1:32 pm on Sep. 8, 2005 (utc 0) #:1579021

I use flat files to cache database output for data that doesn't
change very often.
I pull the data out of the database and store it in a flat file.
It is quicker, especially if you have a lot of processing to do (if
you're using joins for example) whenever you retrieve the data.

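
The hybrid approach described here could look roughly like the sketch below. It is an assumption-laden illustration: SQLite stands in for the database, the cache is a JSON file, and the five-minute freshness limit, table and function names are invented.

```python
import json
import sqlite3
import tempfile
import time
from pathlib import Path


def cached_query(conn, sql, cache_file: Path, max_age_seconds: float = 300.0):
    """Return query rows, reusing a flat-file copy while it is fresh enough."""
    if cache_file.exists():
        age = time.time() - cache_file.stat().st_mtime
        if age < max_age_seconds:
            return json.loads(cache_file.read_text()), "cache"
    rows = [list(row) for row in conn.execute(sql)]
    cache_file.write_text(json.dumps(rows))
    return rows, "database"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO article (title) VALUES (?)", [("Hello",), ("World",)])

with tempfile.TemporaryDirectory() as scratch:
    cache_file = Path(scratch) / "articles.json"
    query = "SELECT id, title FROM article ORDER BY id"
    first_rows, first_source = cached_query(conn, query, cache_file)
    second_rows, second_source = cached_query(conn, query, cache_file)
    print(first_source, second_source, second_rows)
```

The trade-off is staleness: a change in the database is not visible until the cached file expires or is deleted.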
Hester 9:03 am on Sep. 15, 2005 (utc 0) #:1579022

Hmm, this forum is actually built on flat files!

aspdaddy 11:56 am on Sep. 15, 2005 (utc 0) #:1579023

It has nothing to do with speed or performance. Simple flat
files will always out-do a database because manipulating
them is closer to the machine language.
The main reason for databases is independence from
hardware, user interfaces and data integrity.

trillianjedi 11:59 am on Sep. 15, 2005 (utc 0) #:1579024

>> I use flat files to cache database output for data that
doesn't change very often.

I also use that type of hybrid design and have found it gives
the best performance/data integrity balance.
Most CMS's are designed to do everything. That functionality
always comes with a compromise.
Absolutely any CMS can be customised to work better for you
than it does straight out of the box. Reducing unnecessary
queries by removing them, or caching them, is one of the
simplest things you can do that makes a significant
performance difference.

>> I have several sites which run a simple CMS based on plain
text files, and I can't see a reason to switch.

Great - a good sign of slick software and no unnecessary
bloat.
I think it's a horses for courses thing. If you have a largely
static site in terms of the data type you're displaying (i.e. what
would be the DB structure in a DB based site doesn't ever
need to change) then flatfile is great. And fast.
If you need to budget in the future for either complex
load-balancing, or a structure of data that will need to grow and
change, then a DB is probably the favourite choice - for
reasons of management rather than speed.

>> Is MySQL overkill for many uses.....?

Yes. But given the low cost these days of an incredibly fast
server, fast hard disk and plenty of RAM, does it really matter
all that much?

>> Why is SQL so popular?

Convenience, simplicity of design, portability and easy
management of content data.
TJ

physics 6:19 pm on Sep. 15, 2005 (utc 0) #:1579025

>> It has nothing to do with speed or performance. Simple flat
files will always out-do a database because manipulating
them is closer to the machine language.

This is first of all assuming that the code that you write is
'more efficient' than the code that the developers of the
database software have written. Plus you have to take the
time to write all of those functions and debug them. Also,
there are other things to consider such as the fact that if you
have tons of data stored on a hard disk in all different files
then every time you want to access a file a file handle has to
be opened at some random point on the disk, etc. With a
database it's stored in a 'central' location and the handle is
open... There are also all sorts of creature comforts like the
MySQL slow query log and things like mytop so you can keep
track of what queries might be slowing your system down. I
recently wrote an application that initially used flat files, then
switched to MySQL, and the speedup was immediately evident.
Finally, yes many sites are based on flat files but they would
probably be better off with an 'advanced' database system.

Hester 3:39 pm on Sep. 16, 2005 (utc 0) #:1579026

Only one file to backup too.

webprofessor 2:10 am on Sep. 19, 2005 (utc 0) #:1579027

>> Simple flat files will always out-do a database because
manipulating them is closer to the machine language.

What a silly thing to say. Do you know how php handles flat
files? Do you know how asp handles flat files? Do you know
how C++ handles flat files (the notion of which really
doesn't even exist)? All do it differently and with different
advantages and disadvantages.
Do you mean all flat files?
Are you using locks and semaphores?
Are you doing row processing?
Are you searching the flat file?
If you're searching, how many lines are in the flatfile?
Are there even lines?
And here's the doozy.. do you even know how to write
assembly and what advantages it brings by being closer to
the metal?
ALWAYS is a term you shouldn't use. Furthermore, for
data sets of significant size, or for doing anything
significant with them, you are usually wrong. It takes quite a skilled
programmer to use flat files to manage large amounts of data
efficiently.
** anyways back to the point of the thread. **
The use of SQL allows for modular design of your application.
If you think this is a one time app and you'll rarely if ever
need to change it or its data, then a flat file may be the choice
for you.
I always use SQL when there is the possibility of having to
scale the application larger or the complexity of the data may
increase. The reason is that for me it is easier to change
a SQL based application than to rewrite all my functions for
parsing files.
If you're worried about performance, getting a better server may
be a solution for you. Processing power is cheap these days.
For me it is cheaper than the amount of time it takes me to
write code.
FIN

FlatFileAdvantages
PmWiki stores pages in flat files instead of using a relational database such as
MySQL. This page explains why this design decision has been made.

Pm's Explanation
Pm: I chose flat files to store PmWiki pages because I haven't seen any real
advantages of using a database, and there are definitely some disadvantages.
For the standard operations (view, edit, page revisions), holding the
information in flat files is clearly faster than accessing them in a database, and
with page caching abilities (coming soon) it'll be even faster. The only
operations that really benefit are searches, but I've always believed that for
fast, flexible search capabilities it's much better to use existing search programs
such as ht://Dig or Google over reinventing another search engine. PmWiki's
Site.Search is functional/fast enough for most purposes, and if more
performance is needed it's just better to switch to a real search engine.
Indeed, as of January 2004 the Wikipedia uses a MySQL database to store its
190K+ entries, but even with the database Wikipedia has disabled its online
search because of performance issues and just forwards search queries directly to
Google.

And there are big disadvantages to using a database -- with a database we'd have
to write a bunch of "administrative" tools/scripts to handle things such as mass
page deletions in the database, backups/restores of the pages, recovering pages
that have been wrongly deleted, etc. Much of that administrative programming
overhead is eliminated by using a flat file system, as admins can use existing tools
(FTP clients, web-based file/directory managers, shell commands). They are
already comfortable with the administrative tools. It's also much easier to build
sophisticated and customized page management tools and scripts for specialized
applications.

Finally, PmWiki is already structured such that the flat file structure can be easily
replaced by a database if it ever proves necessary. However, even PmWiki sites
with more than 40 000 pages function well in a flat file system without any
noticeable performance problems.

PmWiki supports the ability to subdivide the wiki.d/ directory into separate
subdirectories for each group, avoiding the "too large" directory problem. Check
out the Cookbook:PerGroupSubDirectories for more information.

Comments:

• Flat files are indeed much easier to manage and my experience shows
that there is no problem at all for PmWiki. Still I had problems convincing
my boss to use PmWiki since it is not using a "real" database. Ever thought
of using subdirectories for each group like in Uploads? There are known
issues on Solaris for directories containing more than 20.000 files. Uli

PmWiki already supports the ability to subdivide the wiki.d/ directory
into separate subdirectories for each group, avoiding the "too large"
directory problem. Contact me via email or pmwiki-users. --Pm

This is now specified in Cookbook:PerGroupSubDirectories. Thanks, Ben
and Patrick! --Sproaticus

• On a Linux based operating system, with a filesystem like ReiserFS which
can handle directories with tons of file entries, performance should not be a
problem and should even be better than using a database. -- Pouik
• There is a lot of prejudice out there in favor of using database engines
instead of flat files. Choosing which to use in a project ought to be similar
to choosing what programming language to use. Some of the questions to
ask are:
o Which choice fits the problem domain best (databases fit random
queries against a very large set of records best, flat files fit Wikis
best)
o What are the programmers familiar with; what do they like?
o What is available; what does the corporate culture allow; how much
do they cost? -- David Spector
• Personally, I like to store unstructured data in flat files. However, I do
believe a database has its advantages for structured data. I felt this way when
I was using other wikis (Tiki, Wikipedia, phpWiki..) and always thought of
extending them to include flat files. So, how about a common API? -- Duncan
Hsu
o PmWiki already has a common API, implemented via the PageStore
class in pmwiki.php. Cookbook authors can create a class with the
same interface as PageStore that saves pages in alternate locations
such as a database. --Pm
• I've got a question: wouldn't there be a problem with same-time multi-user
access to a file? (I mean writing - losing other's changes possibly)
o That is one problem I guess. Another is the administration side of
it. Of course I can dive into FTP and work with the flat files there,
but I like an admin interface of restoring articles. Mainly because I
have editors who are not so familiar with FTP as I am. --sjoerd
o PmWiki handles any locking necessary to make sure that multiple
accesses to a file don't cause any changes to be lost. PmWiki also
supports automatic merging of simultaneous edits. --Pm
• I created an 8000-file wiki for fun and testing. Basic page handling is fine,
with no performance issues. Search is acceptable. However creating the
.linkindex file from scratch is a problem. The host I run the site on (and
my test-machine) has a time out of 30 seconds. I disabled the linkindex,
however now backlinks (pagelist link={$FullName}) are too slow. --BrBrBr

Re-enable the link index and run a few backlink searches (even if they
time out). PmWiki will incrementally build the link index. Once the link
index is built, everything will be fast and there won't be a big cost in
keeping the link index up-to-date. --Pm

• Another BIG advantage of flat files is that they are easy to edit directly. --
Babak
o Exactly! I know many scenarios where data-loss, caused by
hardware or transfer failure (storage medium crashes, power
dropouts and the likes), was easy to fix by simply using an editor on
the (flatfile) server's commandline and changing back what was
causing errors. I've never been able to do this with similar ease for
MySQL (and in such cases hate my job). -- SomeSysAdmin
• Maybe the reason flat files work so well is that a file system IS a
hierarchical database -- William
• Is a database more secure? That extra password protection needed to
access MySQL databases must mean something... Right? -- Xen
o Then why have no sites running PMwiki with flatfiles (that I know
of) ever been compromised? ;-) -- Julius
o If you can get access to PmWiki's flat files, you could also get access
to the php script containing the database password. So it doesn't
really provide any extra security. -- Andrew
▪ Exactly. But one should never store the php file containing the
(non-flatfile) database password in a web-server accessible
location. Instead do an include and put the php somewhere
outside of the web/doc root. -- Julius
• I think the biggest disadvantage of using a flatfile system is that it takes the
programmer too much time to design it and to keep it stable,
especially when more and more new features are added to the project
and more and more requirements come up. This also adds risk to
the user's data, as bugs are more likely to be brought in by program updates,
and it makes compatibility problems harder to resolve. On the other
hand, a flat file system does work more efficiently than a database in most
situations. -- Adam
o I would have to disagree (with part of that). Programming
something to speak to (and read from) MySQL for example can be
just as painful, precisely because it is not your own code or design.
That can be a huge disadvantage: You never know when an updated
MySQL needs changed queries, when it will do what, if it will do
what you need and so on. -- Julius
• I think that this could be an endless debate because the line is often thin
between advantage and disadvantage, imho the safe bet will always be to
give the option and let people choose given their own needs, cheers. -- h3
o I don't think the line is that thin. With a separate database you will
always have a much bigger chance of crashes and downtimes. You
make yourself more dependent by needing yet another service to be
running and backed up separately etc. Just count the times you see
things not working and giving you MySQL errors online. I have rarely,
if at all, seen that occur with flatfile databases. -- Ben
▪ Many people already have a copy of MySQL running, so that
isn't a problem. The MySQL problems come from sites that
run too many or too slow queries for their hardware. Something
as simple as retrieving a wiki page isn't going to have trouble
like that.
▪ More people don't have a copy of MySQL running. In
fact, I know more people who don't run it on their
servers, precisely because it is such a resource
monster for its purpose: merely some text-file storage
system. -- Steven
• Flat files have a very important advantage -- you can diff and merge pages
with merge tools. With that you would be able to keep more than one wiki
site across all your computers and merge them periodically. I think lots of
people need this function. At least, I switched to DokuWiki from MediaWiki
just because of this. -- Edward
• Databases are always on top of a filesystem -- At least all of the "real"
databases store their data on a filesystem. They provide an abstraction
layer for purposes such as authentication, transactions or simply
convenience across different OSes, and have a common query syntax (SQL).
Therefore the performance issue relies mainly on the following factors:
o Performance of the filesystem
o Efficient caching strategies (for data, queries, ...)
o Efficient internal file organization
o Efficient code (client and server)-- Heiko
• Most file systems map files to hard sectors on a disk. Databases offer a
level of virtualisation: the sectors can be on any disk or server. The result
is that you can use one server/disk for the DB, another server for PHP and a
third for the web server. You can share out load and get better overall
performance even in very heavy usage. Of course that may not be the goal of
PmWiki, ;-). -- Peter
o Well you can always use NFS if you want your files on another server.
But in both cases, NFS or a DB, running them on another server is
actually likely to increase your latency and not necessarily increase
your throughput. The advantage of a separate DB is more apparent when
you need more than one client accessing it at the same time, which, of
course, you can do with NFS also. The DB might provide better locking
mechanisms but they are not likely to be important to pmwiki (not
writer heavy enough). How do you suggest running PHP on another server
than your web server? And, whatever your solution for this, wouldn't
this also be available without a DB? -- Martin Fick

• Just to say, I prefer flatfiles in this case just because my home server is an
MMX, but aren't SQL servers loaded in memory? memory access time is
much slower than HD, not to mention the really old ones (mine is
2GbATA100). Of course not all the pages should be loaded in
memory all the time, but for the most accessed ones... Also, it is easier to
provide a single download file with all the wikidata for the user
who wants to have it offline. He will just need a way to read it... And my
third point is that it is better for a wiki because no JOIN is needed.

Logical vs. Physical File Organisation

The physical storage of a file is how it is stored on the storage medium.

The logical storage of a file is how it looks to the user when it is being processed.

File Access Methods

A serial access file has data stored in the order in which it was written. New
data goes to the end of the file. To read a record from the file it is necessary to
read through all of the records from the start of the file until reaching the
required record.

A sequential access file has data stored in the order of a key field.

An indexed sequential file stores data in the order of a key field, but also has
an index holding the key field values and the address at which they are stored.
This allows both sequential and direct access.

A direct access file is one where any record can be accessed without having to
access other records first. This is also known as Random Access.

NB Direct Access Files can only be stored on Direct Access Media.
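
The difference between reading through a file and jumping straight to a record can be sketched with fixed-length records. This is an illustrative sketch only: the 24-byte record layout, the keys and the names are assumptions, and an in-memory byte stream stands in for a file on a direct access medium.

```python
import io
import struct

RECORD = struct.Struct("<i20s")  # assumed layout: 4-byte key + 20-byte name


def write_records(stream, records):
    """Serial write: each record is appended in the order given."""
    for key, name in records:
        stream.write(RECORD.pack(key, name.encode("ascii")))


def serial_find(stream, wanted_key):
    """Read from the start of the file until the wanted record turns up."""
    stream.seek(0)
    reads = 0
    while chunk := stream.read(RECORD.size):
        reads += 1
        key, raw = RECORD.unpack(chunk)
        if key == wanted_key:
            return raw.rstrip(b"\0").decode("ascii"), reads
    return None, reads


def build_index(stream):
    """Index as in an indexed sequential file: key -> byte address of record."""
    stream.seek(0)
    index = {}
    position = 0
    while chunk := stream.read(RECORD.size):
        index[RECORD.unpack(chunk)[0]] = position
        position += RECORD.size
    return index


def direct_find(stream, index, wanted_key):
    """Direct access: jump to the stored address and read one record."""
    stream.seek(index[wanted_key])
    _, raw = RECORD.unpack(stream.read(RECORD.size))
    return raw.rstrip(b"\0").decode("ascii")


data_file = io.BytesIO()  # stands in for a file on disk
write_records(
    data_file, [(101, "Adams"), (205, "Baker"), (310, "Clark"), (422, "Davis")]
)
index = build_index(data_file)
print(serial_find(data_file, 422))         # name plus number of records read
print(direct_find(data_file, index, 422))  # same record after a single seek
```

Because the records were written in key order, the same file also serves as a sequential file; the index is what adds direct access on top of it.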

Why are files stored on a tape cartridge always serial access?

Advantages of Direct Access over Sequential Access

Records can be accessed in any chosen order

Records do not have to be put into any order when the file is created

Selected records can be accessed far more quickly from direct access files.

Advantages of Sequential Access over Direct Access


Sequential Files can be stored on Tape as well as disk

It is easier to write programs to handle sequential files

Advantages of Sequential Access over Serial Access

Updating master files with transaction files is made easier using sequential
files since they are both in the same order to begin with.
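
A sketch of such an update, under stated assumptions: both files are represented as Python lists of (key, value) pairs in ascending key order, there is at most one transaction per key, and a transaction value of None is taken to mean "delete". None of these conventions come from the notes themselves.

```python
def update_master(master, transactions):
    """Merge a sorted transaction file into a sorted master file in one pass.

    A transaction value of None deletes the record; any other value inserts
    a new record or replaces the existing one with the same key.
    """
    updated = []
    m = t = 0
    while m < len(master) or t < len(transactions):
        if t == len(transactions) or (
            m < len(master) and master[m][0] < transactions[t][0]
        ):
            updated.append(master[m])  # no transaction for this record
            m += 1
            continue
        key, value = transactions[t]
        if m < len(master) and master[m][0] == key:
            m += 1  # the transaction supersedes the old master record
        if value is not None:
            updated.append((key, value))
        t += 1
    return updated


old_master = [(1, "Adams"), (3, "Clark"), (5, "Evans")]
todays_transactions = [(2, "Baker"), (3, None), (5, "Evans-Jones")]
new_master = update_master(old_master, todays_transactions)
print(new_master)
```

Each file is read once from start to finish, which is why this style of update also works on tape.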

What would be the best method of access to find one record in a large file -
sequential or direct?

Why choose different methods of file access?

Some file organisations are better than others for particular tasks. These are
some of the reasons why a particular file organisation may be chosen:

The number of records to be accessed - If not many records are to be accessed,
direct access is more efficient. If most records are to be accessed during
processing then sequential access is better. This file activity is measured by
hit rate. (See later)

The size of the file - In large files sequential searches take a long time, so
direct access is better. In small files time delay is less important, so
sequential access is acceptable.

The type of application - Sequential access is usually suitable for batch
applications, but on line applications usually require direct access to give a
fast response time.

The type of storage media - Magnetic tape is a serial access medium so direct
access cannot be used.
Relational Databases

Information is very valuable and must be organised so that it can be accessed as
easily and quickly as possible. A database not only stores the data, but also
organises the data and controls access to it with a program called the Database
Management System (DBMS).

Flat files

The earliest data storage computer systems used flat files. A flat file is like
information stored in a grid or table.

Each row in the table contains a record —information about a particular person
or thing.

Each column in the file contains information on one field, for example Name,
Type, and so on.

Primary key

No two records in a file can be the same or it will lead to confusion. For example,
if two people have the same name, there must be some other means of identifying
which record refers to which person.

Therefore each record has to have a unique identifier - something that makes it
different from all the other records - called the primary key.

A person's surname cannot be used as a primary key because:

• it may not be unique - there are many people with the name Smith
• it may change (for example if the person marries) and you must never alter
the primary key
• it may be lengthy and so there is more chance of typing it incorrectly
• the database software creates an index of key fields and the shorter the key
field, the smaller the index, so the sorting and searching operations will be
performed faster - a surname can be long

It is a good idea to create a special field to act as the primary key. Sometimes
there is an obvious candidate, for example in a garage keeping a file on cars it
repairs, the registration number would be an ideal primary key.
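
As a minimal sketch of what a primary key does, the garage example can be tried with Python's built-in sqlite3 module. The table name, field names and registration numbers are invented for illustration, and SQLite is only one possible DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    """
    CREATE TABLE car (
        registration  TEXT PRIMARY KEY,   -- unique identifier for each record
        owner_surname TEXT NOT NULL
    )
    """
)

# Two owners called Smith are fine, because surname is not the key.
conn.execute("INSERT INTO car VALUES ('AB12 CDE', 'Smith')")
conn.execute("INSERT INTO car VALUES ('XY34 ZZZ', 'Smith')")

# A second record with the same registration is rejected by the DBMS.
try:
    conn.execute("INSERT INTO car VALUES ('AB12 CDE', 'Jones')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM car").fetchone()[0]
print(duplicate_rejected, row_count)  # True 2
```

The database itself refuses the duplicate key, so no program using the table has to check for it.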

Why are flat files inefficient?

In flat files data tends to be stored in several places. For example, in a school
information system information about teachers may be stored on the file for
classes, as well as on an administration file holding employment information.

This is very inefficient because repeating data wastes disk space. It could also lead
to inconsistent data if the teacher's address was stored differently in the two files.

We could store the information more efficiently by using a database with two
tables: the class table and the teacher table.

Entity Relationship Diagram

This diagram shows the relationship between the teacher and the class. The
diagram shows that a teacher can teach more than one class. It also shows that
each class can only be taught by one teacher.

We can find out the name and address of each teacher from the teacher table.

How is a database different from a flat file?

There can be more than one table in a database. A flat file database has only one
table. Each table in a database contains information about an entity, for example
teacher, class, and so on.

What is a relational database?

In a relational database the tables are related. This means the tables are linked
together in some way. For example in the school database, we can create a
relationship between the teacher number field in the teacher table and the
teacher number field in the class table.

In the class table, class code is the primary key as it uniquely identifies the class.
The teacher number, however, is not unique in the class table, as one teacher can
teach more than one class. By looking in the class table we can find out the teacher
number of the teacher who takes that class. By looking in the teacher table we can
find out more information about the teacher.
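
The two-step lookup described above can be sketched with Python's sqlite3 module. It assumes each class record holds the number of the teacher who takes it. The table names (the class table is called school_class here), field names and sample teachers are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce the link
conn.executescript(
    """
    CREATE TABLE teacher (
        teacher_number INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        address        TEXT NOT NULL
    );
    CREATE TABLE school_class (
        class_code     TEXT PRIMARY KEY,
        teacher_number INTEGER NOT NULL REFERENCES teacher(teacher_number)
    );
    INSERT INTO teacher VALUES (1, 'Ms Patel', '4 Mill Lane');
    INSERT INTO teacher VALUES (2, 'Mr Jones', '9 High Street');
    INSERT INTO school_class VALUES ('7A', 1);
    INSERT INTO school_class VALUES ('8B', 1);  -- one teacher, several classes
    INSERT INTO school_class VALUES ('9C', 2);
    """
)


def teacher_for_class(class_code):
    """Follow the relationship from a class to its teacher's details."""
    return conn.execute(
        """
        SELECT teacher.name, teacher.address
        FROM school_class
        JOIN teacher ON teacher.teacher_number = school_class.teacher_number
        WHERE school_class.class_code = ?
        """,
        (class_code,),
    ).fetchone()


print(teacher_for_class("8B"))  # ('Ms Patel', '4 Mill Lane')
```

Each teacher's address is stored once, in the teacher table, however many classes that teacher takes.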

Flat files or databases?

Flat files have been used by computers for many years. They are usually used for
one specific purpose. For example, a company might maintain an
employee file used to produce a pay slip. The personnel department might also
have a file of employees' records, which holds some different data.

When information is held on separate files in different parts of an organisation,
difficulties arise. Information on one file might be up-dated but not on the other,
leading to inconsistencies. Most businesses now use databases to organise
their data more efficiently and to give flexibility.

The file approach is program-oriented: the needs of the program determine how
the data is stored. The database approach is data-oriented: the type of data
determines how it is stored. This gives it a number of advantages over a file-based
system.

Advantages of the database approach

• Data independence. Any changes to the structure of a database, for
example adding a field or a table, will not affect any of the programs that
access the data. In a file-based system, a minor change in a file structure
may require a lot of reprogramming of all the programs that access this
file.
• Data consistency. Each data item is stored only once, however many
applications it is used for. There is no danger of an item, such as an
employee’s change of address, being up-dated on one system and not on
another. If this happened the data would not be consistent.
• No data redundancy. Redundancy occurs when data is duplicated
unnecessarily. In a file-based system, the same information may be held
on several different files, wasting space and making up-dating more
difficult.
• More information available to users. In a database system, all
information is stored together centrally. Authorised users have access to all
this information. In a file-based system data is held in separate files in
different departments, sometimes on incompatible systems.
• Ease of use. The DBMS provides easy-to-use queries that enable users to
get instant answers. In a file-based system a query would have to be
specially written by a programmer.
• Greater security and integrity of data. The DBMS will ensure that
only authorised users are allowed access to the data. Different users can
have different access privileges, depending on their needs. In a file-based
system using a number of files it is difficult to control access.
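
Data independence, the first advantage in the list above, can be illustrated with a small sketch using Python's sqlite3 module. The employee table, its fields and the payroll function are invented for this example. The point holds for programs that name the fields they need, as this one does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " address TEXT NOT NULL)"
)
conn.execute("INSERT INTO employee VALUES (1, 'A. Khan', '2 Park Road')")


def payroll_names():
    """An existing 'program' that asks only for the fields it needs."""
    return conn.execute(
        "SELECT employee_id, name FROM employee ORDER BY employee_id"
    ).fetchall()


before = payroll_names()

# The structure changes: the personnel department wants a new field.
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")

after = payroll_names()
print(before == after)  # True: the payroll program needed no changes
```

In a file-based system the same change would alter the record layout, and every program that reads the file would have to be revised to match.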

Disadvantages of the database approach

• Bigger computers and greater risk. A DBMS is a large program that
may require a larger disk and a bigger, more powerful computer than a
file-based system. Storing all the information in one database can be
dangerous if the database system fails. All departments of the organisation
will be affected, not just a single department. Complex procedures are
required to ensure that lost data can be recovered.
• Greater complexity. As databases have more uses, they get more
complex and their design is more difficult. This requires considerable
expertise and if not done well, the new system may fail to satisfy the user’s
needs.
• Possible inefficiency or poor performance. A tailor-made file-based
system may be more efficient than a database.

NB

A table is another name for a file in a database.

A relationship is a link between two tables.

An entity is a subject about which data is stored.

Types of relationship

A relational database consists of a number of entities or tables which are related
in some way. Entities can be related in any one of three ways:

• one-to-one
• one-to-many
• many-to-many

Examples

Each product in a supermarket has a bar code. The relationship between product
name and its bar code is one-to-one.

One supermarket company has many stores. This is an example of a one-to-many
relationship.

There are many different products, each on sale in many different stores. This is
an example of a many-to-many relationship.

Database Normalisation

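Normalisation means structuring the database in the most efficient manner. In a
relational database a many-to-many relationship cannot be represented directly
between two tables, as this will lead to ambiguities. It is normally broken down
into one-to-many relationships by using an extra table.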

Tables should be structured in such a way that:

• there is no redundant data, that is there is no unnecessary duplication
• data is consistent throughout the database
• many-to-many relationships are avoided
• the structure of each table is flexible enough to allow you to enter as many
or as few items as you want
• the structure should enable a user to make all kinds of complex queries
relating data from different tables.
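
One common way to avoid a many-to-many relationship is to add a link table, so that it becomes two one-to-many relationships. The sketch below applies this to the supermarket example using Python's sqlite3 module. The stock table, the bar codes and the town names are invented for illustration, and this is one possible design rather than the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (bar_code TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE store   (store_id INTEGER PRIMARY KEY, town TEXT NOT NULL);

    -- Link table: one row for each product on sale in each store.
    CREATE TABLE stock (
        bar_code TEXT    NOT NULL REFERENCES product(bar_code),
        store_id INTEGER NOT NULL REFERENCES store(store_id),
        PRIMARY KEY (bar_code, store_id)
    );

    INSERT INTO product VALUES ('5000001', 'Tea'), ('5000002', 'Rice');
    INSERT INTO store   VALUES (1, 'Leeds'), (2, 'York');
    INSERT INTO stock   VALUES ('5000001', 1), ('5000001', 2), ('5000002', 2);
    """
)


def stores_selling(product_name):
    """List the towns whose store sells the named product."""
    rows = conn.execute(
        """
        SELECT store.town
        FROM stock
        JOIN product ON product.bar_code = stock.bar_code
        JOIN store   ON store.store_id  = stock.store_id
        WHERE product.name = ?
        ORDER BY store.town
        """,
        (product_name,),
    ).fetchall()
    return [town for (town,) in rows]


print(stores_selling("Tea"))   # ['Leeds', 'York']
print(stores_selling("Rice"))  # ['York']
```

Each product name and each store's town is stored once. The link table holds only pairs of keys, so there is no unnecessary duplication and a product can be on sale in as many or as few stores as required.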

Database administration

The database administrator (DBA) is the person responsible for
maintaining the database. The administrator’s tasks include the following:

1. The design of the database and monitoring its performance. If problems
arise appropriate changes must be made to the database structure.
2. Keeping users informed of changes in the database structure that will
affect them.
3. Maintenance of the data dictionary for the database.
4. Implementing access privileges - specifying what a user can access
and/or change.
5. Protecting confidential information.
6. Allocating passwords to each user.
7. Providing training to users in how to access and use the database.
8. Ensuring adequate back-up procedures exist to protect data.

The Data Dictionary

The data dictionary is information about the database, such as:

• what tables are included and the fields in these tables
• the name and description of each data item
• the characteristics of each data item, such as its length and data type
• any restrictions on the value of certain fields
• the relationships between data items
• control information such as who is allowed access to files
• whether users can change data or only read it
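
Many DBMSs keep this kind of information in a catalogue that can itself be queried. As a rough illustration, the sketch below uses Python's sqlite3 module and reads SQLite's own catalogue through `sqlite_master` and `PRAGMA table_info`. The member table is invented, and SQLite's catalogue covers only part of the list above: it records tables, fields, data types and restrictions, but not who is allowed access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE member ("
    " member_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " date_of_birth TEXT)"
)

# Which tables does the database contain?
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]

# What are the characteristics of each field in the member table?
# Each row is (position, name, type, not-null flag, default value, key flag).
fields = [
    (name, col_type, bool(notnull), bool(pk))
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(member)")
]

print(tables)  # ['member']
for field in fields:
    print(field)  # (field name, data type, must be filled in?, part of key?)
```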

The Database Management System (DBMS)

The DBMS is the program that provides an interface between the database and
the user in order to make access to the data as simple as possible. It has several
functions:

1. Data storage, retrieval and up-date. Users must be able to store, retrieve and
up-date information as easily as possible. These users are not computer
experts and do not need to be aware of the internal structure of the
database or how to set it up.
2. Creation and maintenance of the data dictionary
3. Managing the facilities for sharing the database. Many databases
need a multi-access facility. Two or more people must be able to access a
record simultaneously and to up-date it without a problem.
4. Back-up and recovery. Information in the database must not be lost in
the event of system failure.
5. Security. The DBMS must check passwords and grant the appropriate
privileges.
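
One mechanism many DBMSs use to support safe up-dating and recovery is the transaction: a group of changes that is either completed in full or undone in full. The notes above do not name this mechanism, so the following is only a sketch of the idea using Python's sqlite3 module, with an invented stock table and store names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock ("
    " store TEXT PRIMARY KEY,"
    " quantity INTEGER NOT NULL CHECK (quantity >= 0))"
)
conn.execute("INSERT INTO stock VALUES ('Leeds', 100), ('York', 0)")
conn.commit()


def move_stock(amount, source, destination):
    """Move stock between stores; both up-dates succeed or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an error
            conn.execute(
                "UPDATE stock SET quantity = quantity + ? WHERE store = ?",
                (amount, destination),
            )
            conn.execute(
                "UPDATE stock SET quantity = quantity - ? WHERE store = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK rule failed, so the first up-date was undone


def quantities():
    """Return the current stock level of every store."""
    return dict(conn.execute("SELECT store, quantity FROM stock"))


print(move_stock(150, "Leeds", "York"), quantities())
# False {'Leeds': 100, 'York': 0}
print(move_stock(40, "Leeds", "York"), quantities())
# True {'Leeds': 60, 'York': 40}
```

The failed move leaves no half-finished change behind, which is the property a DBMS relies on when it recovers after a failure part-way through an up-date.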

Database security

As we have already seen, data stored in a database is very valuable. Good security
to prevent loss, theft or corruption of data is vital. Relational databases such as
Microsoft Access and Paradox are multi-access databases. This means that on a
network, more than one user may access the same database at the same time.
How can the software cope with two or more users opening the same file and making
alterations, and yet maintain the integrity of the database?

Relational databases provide different methods of database security:

1. The simplest method is to set a password for opening the database. Once
set, a password must be entered whenever the database is opened. Only
users who type the correct password will be allowed to open the database.
The password will be encrypted so that it can’t be accessed simply by
reading the database file. Once a database is open, all the features are
available to the user. For a database on a stand-alone computer, setting a
password is normally sufficient protection.
2. A more flexible method of database security is called user-level security,
which is similar to the sort of security found on networks. Users must type
a username and password when they load the DBMS. The database
administrator will allocate users to a group. Different groups will be
allowed to see different parts of the database.
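
User-level security can be sketched in plain Python. This is not how Access or Paradox implement it. The users, groups, table names and privileges are all invented, and the sketch stores a salted hash of each password (a swapped-in technique, used here in place of encryption) so that the password cannot be recovered simply by reading the stored data.

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    """Derive a hash that is stored in place of the password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def make_user(password, group):
    """Build a user record holding a random salt, the hash and the group."""
    salt = os.urandom(16)
    return {"salt": salt, "hash": hash_password(password, salt), "group": group}


# Hypothetical users, allocated to groups by the database administrator.
USERS = {
    "sam": make_user("correct horse", "counter_staff"),
    "alex": make_user("battery staple", "manager"),
}

# Hypothetical privileges: what each group may do to each table.
GROUP_PRIVILEGES = {
    "counter_staff": {"member": {"read"}, "hire": {"read", "write"}},
    "manager": {
        "member": {"read", "write"},
        "hire": {"read", "write"},
        "accounts": {"read", "write"},
    },
}


def log_in(username, password):
    """Return the user's group if the password is correct, otherwise None."""
    user = USERS.get(username)
    if user is None:
        return None
    attempt = hash_password(password, user["salt"])
    return user["group"] if hmac.compare_digest(attempt, user["hash"]) else None


def is_allowed(group, table, action):
    """Check whether members of a group may perform an action on a table."""
    return action in GROUP_PRIVILEGES.get(group, {}).get(table, set())


group = log_in("sam", "correct horse")
print(group)                                  # counter_staff
print(is_allowed(group, "hire", "write"))     # True
print(is_allowed(group, "accounts", "read"))  # False
```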

Databases or a manual system?

Why should a video shop use a database? Could a manual system be better than a
database system? Both systems record details of members and who has hired
which video. (This could easily be done in a paper-based system by using a list of
all videos and the name of the hirer written next to it.)

A manual system is cheaper, unlikely to break down and requires little training.
However the computerised system will probably be better for the following
reasons:

Management information is automatically gathered, for example details of each
hiring, financial details, how many times a customer has hired a video and
how many times a video has been hired. These figures can be used in
preparation of accounts or to analyse which videos are most popular.

Better service to customers. Using a bar code reader to enter the video code and
the member code is very quick. Queues at the counter will be shorter.

Details of members and videos can be found and printed quickly.

The names and addresses of members can be used for advertising purposes in a
direct-mail shot. The database can be queried to come up with a list of
people who haven’t hired a video for six months, and a letter can be written offering
them a discount if they hire a film this week. The letter can be personalised
using the mail-merge facility of a word-processing package.

Similarly, automatic reminders can be sent out to members who have not returned
a video by the due date.

The database can be extended to include the member’s date of birth. The
computer can be used to ensure that a member is old enough to hire, say, an
18-certificate video.
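
The queries described above can be sketched with Python's sqlite3 module. The table layout, member names, video codes and dates are all invented for illustration, and "six months" is approximated as 182 days. The date is passed in as `today` so that the results are repeatable.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE member (
        member_id INTEGER PRIMARY KEY, name TEXT, date_of_birth TEXT
    );
    CREATE TABLE hire (
        hire_id     INTEGER PRIMARY KEY,
        member_id   INTEGER REFERENCES member(member_id),
        video_code  TEXT,
        hired_on    TEXT,
        due_on      TEXT,
        returned_on TEXT            -- NULL until the video comes back
    );
    INSERT INTO member VALUES (1, 'J. Smith', '1990-05-01'),
                              (2, 'P. Brown', '2010-03-15');
    INSERT INTO hire VALUES
        (1, 1, 'V100', '2023-01-10', '2023-01-12', '2023-01-12'),
        (2, 2, 'V200', '2023-09-01', '2023-09-03', NULL);
    """
)


def lapsed_members(today):
    """Members with no hire in roughly the last six months (182 days)."""
    cutoff = (today - timedelta(days=182)).isoformat()
    rows = conn.execute(
        """
        SELECT member.name
        FROM member LEFT JOIN hire ON hire.member_id = member.member_id
        GROUP BY member.member_id
        HAVING MAX(hire.hired_on) IS NULL OR MAX(hire.hired_on) < ?
        ORDER BY member.name
        """,
        (cutoff,),
    ).fetchall()
    return [name for (name,) in rows]


def overdue(today):
    """Member name and video code for hires not returned by the due date."""
    return conn.execute(
        """
        SELECT member.name, hire.video_code
        FROM hire JOIN member ON member.member_id = hire.member_id
        WHERE hire.returned_on IS NULL AND hire.due_on < ?
        """,
        (today.isoformat(),),
    ).fetchall()


def old_enough(member_id, today, minimum_age=18):
    """Check a member's age against a video's age restriction."""
    (dob,) = conn.execute(
        "SELECT date_of_birth FROM member WHERE member_id = ?", (member_id,)
    ).fetchone()
    born = date.fromisoformat(dob)
    birthday_not_reached = (today.month, today.day) < (born.month, born.day)
    return today.year - born.year - birthday_not_reached >= minimum_age


today = date(2023, 9, 10)
print(lapsed_members(today))  # ['J. Smith']
print(overdue(today))         # [('P. Brown', 'V200')]
print(old_enough(2, today))   # False
```

The list from `lapsed_members` could feed a mail-merge for the discount letter, and the list from `overdue` could feed the reminder letters.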
