0% found this document useful (0 votes)
16 views43 pages

Introduction To Data Bases 1

The document outlines various types of file systems and database management systems (DBMS), emphasizing the importance of data organization and processing efficiency in businesses. It discusses flat files, relational databases, and different file organization methods such as serial, sequential, indexed sequential, and random access. Key concepts include data redundancy, normalization, and the advantages of relational databases in reducing duplication and maintaining data integrity.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
16 views43 pages

Introduction To Data Bases 1

The document outlines various types of file systems and database management systems (DBMS), emphasizing the importance of data organization and processing efficiency in businesses. It discusses flat files, relational databases, and different file organization methods such as serial, sequential, indexed sequential, and random access. Key concepts include data redundancy, normalization, and the advantages of relational databases in reducing duplication and maintaining data integrity.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 43

Unit 4.3.

0
 Candidates should have an
understanding of how organizations use
ICT, including
 sequential file systems (batch processing e.g.
payroll);
 Indexed sequential & random access files
(e.g. . payroll and personnel records.
 Relational database systems (e.g. customer
database linked to sales records)
 Youshould be able to describe these
systems, giving the hardware and
 Datais the lifeblood of most
businesses and organizations. Why
do they collect and store data?
 Because data is processed (sorted,
filtered, searched) to give us
A database is a collection of data
that is stored in an organized or
logical manner so that data can be
processed effectively or retrieved
quickly and efficiently.
 You should recall the following from
GCSE:
 Tables
 Fields
 Records
 Some databases exist
solely to process data
automatically – for
example databases held by
utility companies
 Some databases exist to
give us information when
we need it: for example,
the school database.
 The purpose of the
database obviously affects
the way that data is stored,
Different Which phone
Person? No is right? Inconsistent?

A Flat File is one data file containing a two dimensional table. It


generally contains data duplication as each record is self contained.
Q. How many records are in this flat file?
A record is one row of a table and contains all the data related to a
particular person or thing e.g. a loan record for Joe Smith
Q. How many fields are there in this flat file?
A field is one column of a table and contains one piece of data about a
person or thing e.g BorrowerName
Why does the above flat file store data inefficiently?
A flat file may contain data duplication where the same data item is stored
in two or more different locations.
Unnecessary duplication is known as data redundancy.
Redundancy often leads to inconsistency where the same data item is stored
differently in different places. Eg BorrPhone 454545,454555 due to typing
errors or being updated in one place but not another.
Flat files can be turned into more efficient related tables through a process
of normalisation and put into a database
•Take out any repeating groups of data and put them in a separate table.
•Make sure one field is present in both tables to form a link or relationship
between them.
•Generally this field is unique to one of the tables (its primary key).
•Separate tables only tend to contain data about one kind of thing.

Question: Identify the repeating groups in the flat file above.


•It contains 2 repeating groups, what are they?
•These details can be taken out into separate tables. What fields
should be left behind in the original table?
Original Loan Flat File

BORROWER TABLE LOAN TABLE (Original)

BOOK TABLE
Relational Diagram

NB. Need to add borrowerID to


give the borrower table a
unique key.
Problems with the traditional file approach:-

•Data redundancy - same data duplicated in many different files

•Data inconsistency - when the same items of data are held in


several different files, the data should be updated in each file
when it changes (if not -> data inconsistency)

•Program-data dependence - file format (i.e. which data fields


constitute a record) must be specified in each program.
Changes to the format of the data fields mean that every file
which uses that program has to be changed.

•Lack of flexibility - for non-routine data it could take weeks to


assemble data from various files and write new programs to
produce the required reports

•Non-sharable data - if two departments need the same data,


either a second copy of the data would be made (-> data
inconsistency) or the same file used (adding extra fields would
mean programs would need to be changed to reflect the new
file structure)
 A relational database consists of a number
of separate tables
 For example a payroll table and a staff table
 Tables are linked to each other…
 … using a key field
 For example the employee ID
 This field is part of other table(s)
 Data from one table combined with data
from other table(s) when producing reports.
 Can select different fields from each table
for output
 SQL is used for queries and producing
1. Tables are designed to reduce duplicate data to a minimum and therefore
remove any redundant data
2. No redundant data means data only has to be input once ensuring faster
data entry and consistency of data
3. Changes in the structure of the database do not affect programming that
accesses other parts of the database. This is called Data Independence
from the program. Eg Adding a new field called Gender to the Borrower
table doesn’t mean you have to reprogram the Loans Report. (You would
have to do this on a flat file)
4. Data Pool can be accessed by several different applications
5. Information held more than once (Key fields acting as links between
tables) are automatically updated by the system
6. Increased productivity as users can use report generators to customise
reports to meet particular needs.
7. Different access rights available for different parts of database
1. As all data for a range of applications is held
in one place there are greater security and
confidentiality issues.
• Eg Many users need to view and update
different combinations of tables or records or
fields in a database.
• Eg Very important to back up this data as all
data will be lost if a natural disaster occurs.
2. Backup and restoration processes are more
complex for databases than flat files
A relationship is a link or association between
entities

The links (relationships) may be...

one-to-one
Products and bar-codes in a supermarket.
one-to-many
One video club member may loan a number
 of videos.
many-to-many
Pupils and Teachers in a school.
Entity-relationship diagram - diagrammatic way of representing
the relationship between entities in a database.
An entity-relationship diagram shows the links between tables.

One-to-one
E.g. Products in a supermarket each have a
unique barcode number.

One-to-many E.g. A video club member may hire out a


number of videos.

Many-to-many Teachers and pupils in a school. Each teacher


teaches many pupils and each pupil has many
teachers.
DBMS
The DBMS (Database Management System) is a
program which allows the user access to data.

It must...
 allow users to create and edit the data
and provide facilities to search the data
using a query language.
 allow other applications to use the
data.
 create and maintain the data
dictionary
 maintain the integrity of the database. On a
multi-access system, this is done by locking
a record or table when a user is editing it.
This means that another user is unable to
edit it at the same time. When the data is
saved it is unlocked.
 check passwords of individual users and
only allow that user access to certain parts of
the database.
 ensure that recovery is possible if the
database is corrupted.
There are four types of file
organization that you need to know
about:
Serial
Sequential
Indexed Sequential
Direct /Random Access
A serial file is one in which the
records have been stored in the
order in which they have arisen.
They have not been sorted into
any particular order.

Ashopping list is an example


of a non-
computerised serial file.
▪ A collection of records
 Anexample of a serial file is an
unsorted transaction file (more on
this in a minute ).

 Cannot be used as master


 Used as temporary transaction file
 Records stored in the order received
A sequential file is one in which the
records are stored in sorted order of
one or more key fields.
 Sequential access means that data is
accessed in a predetermined, ordered
sequence.
 Sequential access is sometimes the only
way of accessing the data, for example if
it is on a tape.
 It may also be the access
method we need to use if the
application requires processing
a sequence of data elements in
order.
 Records are usually stored on
tape and processed one after
the other – for example when
utility companies issue bills, or
when businesses produce pay
slips for their workers at the end
of each month.
 A collection of records
 Stored in key sequence
 Adding/deleting record requires
making new file (so that the
sequence is maintained)
 Used as master files
 Serial files are often used as
transaction files.
 Sequential files are used as master
files.
 A company’s master file might hold all
the data about every employee
 A transaction file might hold a list of all the
employees who have gotten married this
month and changed their names.
 The master file would be read one
record at a time
 The transaction file would be used to
update the master file

Federer
r
Potte

Windsor
Cole
Hermione Grang

Britney Spears
Cheryl Tweedy

Kate Middleton
 Simple file design
 Very efficient when most of the
records must be processed e.g. Payroll
 Very efficient if the data has a natural
order
 Can be stored on inexpensive devices
like magnetic tape.
 Entire file must be processed even if a
single record is to be searched.
 Transactions have to be sorted before
processing
 Overall processing is slow, because you
have to go through each record until you
get to the one you want!
 Each record of a file has a key field
which uniquely identifies that record.
 An index consists of keys and
addresses, just like an index in a
book:
 The pages in a book are stored
sequentially, so you can read through it
page by page
OR
 You can look up the page you want
in the index and flick straight to it
 An indexed sequential file is a
sequential file (i.e. sorted into order
of a key field) which has an index.
 A full index to a file is one in which
there is an entry for every record.
 Because each record has an index, we
can access individual records directly,
without having to scroll through all the
other records first.
 Indexed sequential files are
important for applications where
data needs to be accessed.....
 sequentially , one record after another

OR
 randomly using the index.
A company may store details about its
employees as an indexed sequential file.
Sometimes the file is accessed....
sequentially. For example when the
whole of the file is processed to produce
pay slips at the end of the month.
Sometimes the file is accessed....
randomly. Maybe an employee
changes address, or a female employee
gets married and changes her surname.
 An indexed sequential file can only be stored
on a random access device e.g. magnetic disc
or CD.
 This is because we need a device that will allow
us direct access to random files, rather than
the sequential access that magnetic tape
allows.
 Provides flexibility for users who
need both type of access with the
same file
 Faster than sequential
 Extrastorage space for the index is
required, just like in a book: your text
book would be 372 pages without
the index (go on, check!) but is 380
pages with the index.
 Records are read directly from or
written on to the file.
 The records are stored at known
address.
 The address is calculated by applying
a mathematical function to the key
field.
A random file would have to be stored
on a direct access backing storage
medium e.g. magnetic disc, CD, DVD
 Example : Any information retrieval
system. Eg Train timetable system.
 Any record can be directly accessed.
 Speed of record processing is very
fast.
 Up-to-date file because of online
updating.
 Concurrent processing is possible.
 More complex than sequential
 Does not fully use memory
locations
 More security and backup problems

You might also like