File Processing Approaches: (1) Traditional Approach To Information Processing
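[Figure: In the traditional approach each application owns its files. The Purchasing
program uses the Purchasing master file and its transaction file to produce Purchasing
output; the General Ledger program uses the General Ledger master file and its
transaction file to produce G.L. output.]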
Files are custom designed for each application and generally there is little or no
sharing of data among various applications.
Example :
Payroll System
i. Data Redundancy : Often identical data are stored in two or more files. For
example, a customer may have a savings A/C, a current A/C and some fixed deposits.
In such a case, the name and address will be found in three different files.
ii. Data Integrity : The problem of data integrity arises due to redundancy. The same
data may be found in different forms in different files.
For example, if a customer changed his address and notified it to the bank, the
address may have been changed in the savings A/C file, but not in the fixed
deposit file. Thus two different addresses may be found for the same person.
iii. Lack of Data Integration : Data on different master files may be related, as in the
case of the Payroll and Personnel master files. If someone wants a report displaying
Emp-Code, Name, Basic and Department, the traditional approach does not readily allow
this, because the fields are held in separate files.
iv. Data Availability : Data is scattered in many files. If a customer's creditworthiness
in the bank has to be checked before he is given a loan, it would be necessary to look at
many files. Due to non-uniformity in file design, the same customer may have different
identification numbers in different files and hence obtaining the necessary data will be
difficult.
v. Management Control : As data is scattered in different files and relating them is not
easy, it is difficult to implement uniform policies in an organization.
vi. Program/Data Dependence : Under the file-oriented approach, programs are tied to
master files and vice-versa. Changes in the physical format of the master file, such as
the addition of a data field, require changes in all programs that access the master file.
Consequently, for each of the application programs that a programmer writes or
maintains, he must be concerned with data management.
vii. Lack of Flexibility : The information retrieval capabilities of most traditional systems
are limited to predetermined requests for data. Therefore, the system produces
information in the form of scheduled reports and queries which it has been previously
programmed to handle. If management needs unanticipated data, the information can
perhaps be provided if it is in the files, but extensive programming is often involved.
Thus by the time the programming is completed, the information may no longer be
required or useful. Ideally, information processing should be able to mix related data
elements from several different files and produce information with a fast turnaround to
service unanticipated requests for information.
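The redundancy and integrity problems in items i and ii can be illustrated with a minimal Python sketch. The two dictionaries stand in for separate application files; the customer number, name and addresses are hypothetical and serve only as an illustration.

```python
# Two application-specific "files", each holding its own copy of the address.
savings_file = {"C001": {"name": "A. Kumar", "address": "12 Park Road"}}
fixed_deposit_file = {"C001": {"name": "A. Kumar", "address": "12 Park Road"}}


def update_address(master_file, customer_id, new_address):
    """Update the address in ONE file only, as a single application would."""
    master_file[customer_id]["address"] = new_address


def inconsistent_customers(file_a, file_b):
    """Return the ids of customers whose address differs between the two files."""
    return sorted(
        cid for cid in file_a.keys() & file_b.keys()
        if file_a[cid]["address"] != file_b[cid]["address"]
    )


# The savings application records the change; the fixed-deposit file is never told.
update_address(savings_file, "C001", "7 Lake View")
conflicts = inconsistent_customers(savings_file, fixed_deposit_file)
```

After the update, `conflicts` lists the customer whose two stored addresses no longer agree, which is exactly the situation described for the bank customer.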
(2) DATABASE APPROACH TO INFORMATION PROCESSING :
A View is a virtual table derived from existing tables; it stores no data of its own and
may draw its data from more than one table.
[Figure: a View assembled from selected rows and columns of Table 1 and Table 2]
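A view over two tables can be sketched with Python's built-in sqlite3 module. The table names, column names and rows below are hypothetical; they echo the earlier payroll and personnel report (Emp-Code, Name, Basic, Department) and are not taken from any particular system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE personnel (emp_code TEXT PRIMARY KEY, name TEXT, department TEXT);
    CREATE TABLE payroll (emp_code TEXT PRIMARY KEY, basic REAL);
    INSERT INTO personnel VALUES ('E1', 'Asha', 'MIS'), ('E2', 'Ravi', 'Accounts');
    INSERT INTO payroll VALUES ('E1', 4200.0), ('E2', 3100.0);

    -- The view holds no rows of its own; it is a stored query over both tables.
    CREATE VIEW emp_report AS
        SELECT p.emp_code, p.name, y.basic, p.department
        FROM personnel AS p JOIN payroll AS y ON p.emp_code = y.emp_code;
""")

report = conn.execute("SELECT * FROM emp_report ORDER BY emp_code").fetchall()

# A later change to a base table is visible through the view immediately.
conn.execute("UPDATE payroll SET basic = 4500.0 WHERE emp_code = 'E1'")
updated_basic = conn.execute(
    "SELECT basic FROM emp_report WHERE emp_code = 'E1'"
).fetchone()[0]
```

Because the view is only a query definition, the updated Basic value appears in `emp_report` without the view being rebuilt.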
DBMS
A database management system is a set of procedures that manage the database and
provide access to the database in the form required by any application program (users). It
effectively ensures that necessary data in the desired form is available for diverse
applications of different departments in an organization.
[Figure: the application programs (Purchasing, Accounts Payable, Inventory, Payroll,
Personnel) all access a single Database through the DBMS]
Thus, DBMS is a set of programs that serve as an interface between application programs
and a set of co-ordinated and integrated physical files called a database.
A DBMS provides the capabilities for creating, maintaining and changing a database.
In case of DBMS, the data among the physical files are related with various pointers and
keys, which not only reduce data redundancy but also enable unanticipated retrieval of
related information.
[Figure: Application Program #1 and Application Program #2 see a Logical View through
the DBMS, which maps it to the Physical View of the Database]
The DBMS provides the facility to view data in logical form (data hiding), insulating
the user from concerns about how the data are physically stored.
Logical View :
[Figure: logical view of the database as a set of related tables]
TitleAuthor : Au-id, Title-id
Titles : Title-id, Title-name, Au-id, Pub-id, Price, Edition
Authors : Au-id, au-name, au-address, au-city
Sales : Store-id, Ord-num, Date, Qty, Title-id
Publisher : Pub-id, Pub-name, Pub-address, Pub-city
Stores : Store-id, Store-name, Store-addr
The figure marks the primary key and the foreign keys through which these tables are
linked.
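The primary-key and foreign-key links in the logical view can be sketched with sqlite3. Only two of the tables are shown, the column types and sample rows are assumptions, and the helper `try_insert_title` is a hypothetical name introduced for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE publisher (pub_id TEXT PRIMARY KEY, pub_name TEXT);
    CREATE TABLE titles (
        title_id   TEXT PRIMARY KEY,
        title_name TEXT,
        pub_id     TEXT REFERENCES publisher(pub_id),
        price      REAL
    );
    INSERT INTO publisher VALUES ('P1', 'Example Press');
    INSERT INTO titles VALUES ('T1', 'Intro to Databases', 'P1', 250.0);
""")


def try_insert_title(title_id, title_name, pub_id):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO titles (title_id, title_name, pub_id) VALUES (?, ?, ?)",
                (title_id, title_name, pub_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert_title("T2", "Advanced Databases", "P1")  # known publisher
rejected = try_insert_title("T3", "Orphan Title", "P9")        # no such publisher
duplicate = try_insert_title("T1", "Duplicate Key", "P1")      # primary key reused
```

The DBMS, not the application program, refuses the title that points at a missing publisher and the title that reuses an existing Title-id.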
(1) Data Sharing : The main motivation for designing a system to manage a database is
share-ability of data. It means not only the current applications can share the data, but
this should also serve needs of future applications.
(3) Control (Reduce) Redundancy : The system should identify the existence of common
data and avoid duplicate recording. Relationships or pointers should be used to locate
data which are used many times.
(4) Relating Data Items : Relationships between data items should be specified.
(5) Data Integrity (avoiding inconsistency) : Consistency of data values and
relationships must be preserved. In order to achieve this, the system must ensure
validity of data by using good data editing, synchronizing, updating of data and
propagating changes to other related data elements.
(6) Data Security : This is concerned with controlling access to data. Protection is
needed at many levels for access, modification, deletion or display. Access
restriction may be for individual data items or groups of items. For example, the
financial controller may be allowed access to employees' salaries but barred from
their medical records.
(7) Database Performance : The system should be able to provide timely information as
required.
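The item-level access restriction described under Data Security can be sketched as a permission table. This is a minimal illustration in plain Python, not how any particular DBMS implements authorization; the role names, field names and record are hypothetical.

```python
# Hypothetical permission table: which fields each role may read.
READABLE_FIELDS = {
    "financial_controller": {"emp_code", "name", "salary"},
    "medical_officer": {"emp_code", "name", "medical_record"},
}


def read_record(role, record):
    """Return only the fields of `record` that `role` is allowed to see."""
    allowed = READABLE_FIELDS.get(role, set())
    return {field: value for field, value in record.items() if field in allowed}


employee = {
    "emp_code": "E1",
    "name": "Asha",
    "salary": 4200,
    "medical_record": "confidential",
}
controller_view = read_record("financial_controller", employee)
```

Here `controller_view` contains the salary but not the medical record, and a role with no entry in the table sees nothing at all.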
DATABASE
The objective of the DBMS is to provide a convenient and effective method of defining,
storing and retrieving the contents in the database. To represent this information, some
means of modeling is used.
DBMS are large, complex software packages written in languages such as BAL
(Basic Assembler Language), FORTRAN, COBOL, PL/1 or C. They have filled a large
portion of the gap between the ordinary user (e.g., order entry clerk, marketing
vice-president), the application programmer, and the computer. Non-programmer users can
now at least “shake hands” with the computer on their own.
Thus a DBMS is a collection of software which constructs, expands and maintains the
database and also provides the interface between the user and the data in the database.
Now, it is clear that the Software that controls the input, storage and retrieval of
information from the database is called DBMS. DBMS is a tool for managing
information stored on a computer.
In other words DBMS is a collection of the h/w and s/w that provides the means for
organizing and structuring data in files, records and fields and has the ability to:
(i) Maintain the data in the database by adding new records, deleting ‘dead’ records
and amending records.
(ii) Expand the base by defining and adding new data files.
(iii) Easily retrieve data interactively in a variety of ways.
(iv) Generate reports based on data in the database.
(v) Maintain the integrity and consistency of data.
(vi) Provide security for data in the base, e.g., protect against unauthorized
access and safeguard data against corruption.
(vii) Provide recovery and restart facilities after a h/w or s/w failure.
Also the DBMS provides facilities for different types of file processing. It can
FUNCTIONS OF DBMS :
(1) The DBMS allocates storage to Data
(2) The DBMS maintains the data in database by
(i) Adding new records
(ii) Deleting dead records
(iii) Amending records
(3) The DBMS provides an interface with user-programs.
(4) It provides facilities for different file processing, e.g.
- process the complete file
- process the required record
- retrieve the individual record.
(5) Provides security and protection for data against
- accidental or intentional intrusion
- corruption
- catastrophes (natural disasters)
(6) Provides recovery and restart facility after a H/w or S/w failure.
(7) DBMS ensures data integrity (accuracy of data entered)
(8) It ensures data privacy
(9) DBMS keeps statistics of the use of data in the database so that frequently used
data can be kept in a readily accessible form.
(10) DBMS makes use of description (catalog) of all data types.
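Functions (2) and (6) above, maintaining records and recovering after a failure, can be sketched with sqlite3. The `account` table and its rows are hypothetical, and the failure is simulated with an exception rather than a real hardware or software fault.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no TEXT PRIMARY KEY, holder TEXT, balance REAL)")

with conn:  # one transaction: committed on success, rolled back on error
    conn.execute("INSERT INTO account VALUES ('A1', 'Asha', 100.0)")        # add
    conn.execute("INSERT INTO account VALUES ('A2', 'Ravi', 50.0)")         # add
    conn.execute("UPDATE account SET balance = 120.0 WHERE acc_no = 'A1'")  # amend
    conn.execute("DELETE FROM account WHERE acc_no = 'A2'")                 # delete

try:
    with conn:
        conn.execute("UPDATE account SET balance = 0 WHERE acc_no = 'A1'")
        raise RuntimeError("simulated failure before the transaction completes")
except RuntimeError:
    pass  # the context manager has already rolled the update back

remaining = conn.execute("SELECT acc_no, holder, balance FROM account").fetchall()
```

The interrupted update leaves no trace: the database returns to the state of the last committed transaction.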
(iii) Program/Data independence : This means that the programs and the data
are mutually independent, i.e. the data can be restructured without the need to
make alterations to the programs. Similarly, a program change does not call for
rearrangement of the data layout. The independence of the database and the
programs using it means that one can be changed without changing the other.
The ability to separate the logical database definition from its physical storage
organization increases the capability to redefine and restructure the database.
That is, the storage structures and access strategies may be altered in response
to changing requirements without altering existing applications.
There are, in fact, two distinct levels of data independence :
a) Physical data independence, which insulates applications from the
underlying physical storage organization of the data.
b) Logical data independence, which insulates applications from
changes made to the logical organization of data, e.g., the addition
of new record types or relations.
(iv) Data interrelationship : The data items in the base are linked or chained to
each other so that any required relationship between different items of data
can be established. As the base is expanded or as users' requirements change,
these relationships can be changed and new relationships can be established.
- The database includes all the necessary structural interrelations of data.
- This is necessary because the various applications use the data in
different ways.
(v) The database is common to all users of the database system. That is, the database
system is integrated and shared, with a common approach to the
retrieval, insertion and amendment of data.
(vi) Centralized Storage : All the information is stored in a central place, i.e. the
database is stored in a centralized computer system.
(viii) A database system must provide mechanisms for the protection of data from
unauthorized intrusion, whether accidental or malicious. Strict authorization
checks are provided for privacy and confidentiality.
(ix) Security checks are provided to safeguard the database from security hazards, e.g.
theft, fire and natural disaster.
(x) Response time is very low. A DBMS must offer a high standard of
performance and efficiency, especially for on-line query processing.
(xi) Usable by all programs – A database needs to be usable not only by all the
existing applications but also by all foreseeable applications.
(xii) Reduced programming effort: A user need not write programs for activities
such as querying database, report generation, addition/deletion of data etc.
A database must be open ended so as to accept new sets of data items and
changes to existing data item sets.
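The two levels of data independence described under item (iii) can be sketched with sqlite3. The `employee` table, its rows and the function `employees_in` (standing in for an application program) are hypothetical names chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_no TEXT PRIMARY KEY, dept_no TEXT, salary REAL);
    INSERT INTO employee VALUES ('E1', 'D413', 3000), ('E2', 'D415', 4500);
""")


def employees_in(dept_no):
    """The 'application program': it names the columns it needs and nothing else."""
    rows = conn.execute(
        "SELECT emp_no, salary FROM employee WHERE dept_no = ? ORDER BY emp_no",
        (dept_no,),
    )
    return rows.fetchall()


before = employees_in("D415")

# Physical change: a new access path. The query text above is untouched.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept_no)")
# Logical change: a new attribute is added to the record type.
conn.execute("ALTER TABLE employee ADD COLUMN skill TEXT")

after = employees_in("D415")
```

The application function is not edited between the two calls, yet it returns the same rows before and after both changes.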
DISADVANTAGES OF DBMS
COMPONENTS OF DBMS :
A database management system consists of five major components.
1. Data
2. Hardware
3. Software
4. Users : Application Programmer, End Users, DBA
5. DBMS Facilities : DDL, DML
1. Data : Data Stored in the system is partitioned into one or more databases. The
data stored in the database, in general, are both shared and integrated.
By shared data we mean that an individual piece of data in the database is shared
among several users. By integrated we mean that a database may be considered
as a unification of several otherwise distinct data files. Sharing of data also implies
concurrent sharing, that is, the ability of different users to access the database at
the same time. The data are shared among different users and applications, but a
common and controlled approach is used for inserting, deleting, modifying and
retrieving data.
3. Software : Software sits between the physical database and the users of the
system. All user requests for data are handled by the DBMS, a complex software
system.
4. Users : May be of three different types :
i) Application programmer
ii) End user
iii) D.B.A.
(i) Application Programmer : is responsible for writing application
programs in higher level languages to be utilized by End users.
(ii) End Users : End users are those who access the database from a terminal
and may employ a query language provided as an integral part of the
system. The user may also invoke a user-written program that accepts a
command from the terminal and in turn issues requests to the DBMS on the end
user’s behalf.
(a) Naïve user : Users who need not be aware of the presence of the
database system or any other system supporting their usage, are
considered naïve users. These users work through a menu oriented
application program.
(b) On-line user : These users may communicate with the database directly
via an online terminal or indirectly via a user interface and application
program. They use DML to manipulate the database directly.
[Figure: end users and application programmers working through application programs
AP1, AP2 and AP3]
Likewise, DBMS have greatly extended the application programmers’ ability to handle
complex data structures and to supply timely reports for a variety of users,
with less difficulty and a smaller investment in programming time than ever before.
To the user of the DBMS, most of the internal operations and data structures are
transparent. Although the degree of transparency varies among different packages, it has
the net effect of isolating users from technical considerations.
A database system is partitioned into modules that deal with the responsibilities of
the overall system. In almost all the cases the underlying operating system provides only
the most basic services and the database system has to build itself on top of it.
[Figure: application programs (object code) call the Database Manager within the DBMS,
which uses the File Manager to reach the data files and data dictionary held on disk]
Database Manager is an interface between low level data stored in the database and the
application programs and queries submitted to the system.
File Manager manages the allocation of space on disk storage and maintains the
data structures used to represent information stored on disk.
OPERATION & CONTROL FEATURES OF DBMS
1. Access : Users direct the DBMS by employing either a query language, which
is composed of special English-like statements, or a host computer language
such as COBOL, FORTRAN or BASIC. The mode of access may be either batch
(e.g. payroll) or interactive (e.g. a travel agency making airline reservations).
Sequential
Direct processing
Identification by attributes.
The following are some of the most popular, commercially available DBMS:
9. INGRES
10. UNIFY
11. SYBASE
13. MS-Access
14. FoxPro
15. dBase
ARCHITECTURE OF DBMS
An outline of the generalized architecture of a DBMS is given below. It has three
levels, which is why it is called the Three Level Architecture. A large number of
commercial and research database models fit into this framework.
The architecture is divided into three levels: the External, Conceptual and Internal. The
view at each of these levels is described by a schema. A schema is an outline or a plan
that describes the records and relationships existing in the view.
iii) Internal View : This describes how the actual database is stored on the
storage device and also describes the data structures and access methods (or
retrieval methods) to be used by the database.
EXTERNAL (two user views of the same employee data)
View 1 :
EMP_NO     A6
DEPT_NO    A4
SALARY     F5.2
View 2 (COBOL) :
01 EMPC
   02 EMPNO    PIC X(6)
   02 DEPTNO   PIC X(4)
CONCEPTUAL
EMPLOYEE
EMPL-NUM   Character (6)
DEPT-NUM   Character (4)
SALARY     Numeric (5)
INTERNAL
Length of records in bytes, index, pointers etc.
EXTERNAL
View 1 : Emp-name, Emp-address
View 2 : Emp-name, Emp-Soc-Sec-No, Emp-address, Emp-salary
CONCEPTUAL
Emp-name         : String
Emp-Soc-Sec-No   : Pr. Key
Emp-address      : String
Emp-Skill        : String
Emp-salary       : integer
INTERNAL
Employee record, length 120
Name          : String length 25    offset 0
Soc-Sec-No.   : 9 decimal           offset 25   unique
Deptt.        : String length 6     offset 34
Address       : String length 51    offset 40
Skill         : String length 20    offset 91
Salary        : 9.2 decimal         offset 111
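The internal employee record (length 120, with fields at offsets 0, 25, 34, 40, 91 and 111) can be sketched with Python's struct module. Storing every field, including the decimal ones, as space-padded ASCII text is an assumption made for this illustration; the source does not say how the decimals are encoded. The function names and sample values are hypothetical.

```python
import struct

# Field widths follow the internal schema: 25 + 9 + 6 + 51 + 20 + 9 = 120 bytes.
RECORD_FORMAT = "25s9s6s51s20s9s"
FIELD_NAMES = ("name", "soc_sec_no", "dept", "address", "skill", "salary")
FIELD_WIDTHS = (25, 9, 6, 51, 20, 9)


def pack_employee(name, soc_sec_no, dept, address, skill, salary):
    """Encode one employee as a fixed-length 120-byte record (space padded)."""
    values = (name, soc_sec_no, dept, address, skill, f"{salary:09.2f}")
    padded = [v.ljust(w).encode("ascii") for v, w in zip(values, FIELD_WIDTHS)]
    return struct.pack(RECORD_FORMAT, *padded)


def unpack_employee(record):
    """Decode a 120-byte record back into a dict of stripped strings."""
    raw = struct.unpack(RECORD_FORMAT, record)
    return dict(zip(FIELD_NAMES, (field.decode("ascii").strip() for field in raw)))


record = pack_employee("Asha Rao", "123456789", "MIS", "7 Lake View", "Analyst", 4200.5)
soc_sec_bytes = record[25:34]   # the Soc-Sec-No. field starts at offset 25
salary_bytes = record[111:120]  # the Salary field starts at offset 111
```

An application working at the conceptual level would see only the field names; the offsets matter only to code at the internal level, such as these two functions.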
DATABASE MODELS
The data in the database are usually logically organized according to some data model.
Database systems are generally based on one of the three data models, namely:
This model is based on the mathematical notion of a relation. In this model both the data
objects and their inter relationships are represented by two dimensional tables.
Relational database models use two-dimensional tables to store data. They arrange data
in tables made up of rows and columns and differ markedly from their hierarchical and
network counterparts. There are no parent and child data sets. Consider the following
tables containing information on employees and departments.
Employee Table
Dept Table
Dept_Code Department
413 Personnel
414 Accounts
415 MIS
(The set of values allowed under a column is known as its domain, e.g. the Dept_Code values here are 413, 414 and 415.)
The rows (records) of a table are known as tuples and the columns as attributes. A domain
is a pool of values from which the actual values appearing in a given column are drawn. A
relational database is composed of relations, and the tabular representation is easy to
comprehend and implement. Other database models can generally be converted to a
relational structure, and searching is usually fast. Examples of relational database
products are SQL/DS, INGRES, UNIFY, ORACLE, SYBASE, FOXPRO and dBASE III Plus.
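Relating the Employee and Dept tables purely by matching column values can be sketched with sqlite3. The Dept rows are those given in the text; the Employee columns and rows are hypothetical, since the text gives no Employee data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept_code INTEGER PRIMARY KEY, department TEXT);
    INSERT INTO dept VALUES (413, 'Personnel'), (414, 'Accounts'), (415, 'MIS');

    -- Hypothetical employee rows, for illustration only.
    CREATE TABLE employee (emp_code TEXT PRIMARY KEY, name TEXT, dept_code INTEGER);
    INSERT INTO employee VALUES
        ('E1', 'Asha', 415), ('E2', 'Ravi', 413), ('E3', 'Meena', 415);
""")

# No parent/child pointers: the tables are related only by matching Dept_Code values.
mis_staff = [
    name for (name,) in conn.execute(
        """SELECT e.name FROM employee AS e
           JOIN dept AS d ON e.dept_code = d.dept_code
           WHERE d.department = 'MIS' ORDER BY e.name"""
    )
]
```

The join is expressed entirely in terms of column values, which is what distinguishes this model from the pointer-based hierarchical and network structures.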
T-CODE    T-NAME    SPECIALIZATION    COURSE
N:N
BASICS OF RDBMS
1. Entity : Things, objects or events about which we collect information. An entity is an object or
event which can be distinctly identified. Entities are distinguishable objects of concern and are
modeled using their characteristics or attributes.
For example: a student with Enrolment_No U3199 is an entity. Similarly, an employee with
emp_id CS001 is an entity.
2. Attributes : Attributes are the items of information collected about an entity, i.e., a set of
attributes defines the characteristics or properties of an entity.
3. Relation : A relation is similar to a table which consists of rows and columns. The programmer
views a relation as a file in the database. Each row in the relation represents a record whereas the
various columns represent the fields within the record.
4. Domain : The domain is the set of possible values that an attribute can have.
For example, the attribute salary may have a domain such that the salary of an employee is any
value between 1000 and 5000.
5. Tuple : A row in a relation is also called a tuple. A tuple with n attributes is
termed an n-tuple.
At a given instant of time, the number of tuples in a relation is known as its cardinality, and the
number of attributes in a tuple is known as the degree of the tuple or relation.
6. Key : Each relation has at least one column (or attribute) for which each row must have a unique
value. Such an attribute is called a key. In other words, a key uniquely identifies a record in the
file. For example, Emp_No is a key in an employee relation.
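The terms degree, cardinality, key and domain defined above can be checked on a small relation held as a list of Python tuples. The attribute names, rows and function names are hypothetical; the salary bounds are the 1000 to 5000 range used in the domain example.

```python
# A relation as a list of tuples; attribute names and rows are hypothetical.
ATTRIBUTES = ("emp_no", "name", "salary")
employee = [
    ("CS001", "Asha", 3000),
    ("CS002", "Ravi", 4500),
    ("CS003", "Meena", 4500),
]


def degree(attributes):
    """Number of attributes in each tuple."""
    return len(attributes)


def cardinality(relation):
    """Number of tuples currently in the relation."""
    return len(relation)


def is_key(relation, attributes, name):
    """True if every tuple has a distinct value for the attribute `name`."""
    position = attributes.index(name)
    values = [row[position] for row in relation]
    return len(values) == len(set(values))


def in_domain(salary, low=1000, high=5000):
    """Domain check from the text: a salary is any value between 1000 and 5000."""
    return low <= salary <= high
```

With these rows, `emp_no` qualifies as a key while `salary` does not, because two employees share the value 4500.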
The Hierarchical and Network data models are inherently unstable because of the hidden
pointers in the records. In the event of system errors, the chain of address between the
records could be damaged resulting in reduced data integrity.
The relational model was proposed by Dr. E.F. Codd in 1970. This model simplified the
database structure.
Hierarchical structure makes it difficult to express the relationship where a child has
more than one parent. But the database is easy to comprehend, modify and search.
One-to-many relationship
STUDENT
Limitation:
3. Network Model : It is a parent-child data structure wherein child records can have
more than one parent. It is more flexible than the hierarchical model and is suitable for
representing many-to-many relationships.
In this model the database is represented by a directed graph, the nodes of which
represent the data objects (record types) and the arcs of which define the relationships
among the data objects.
Examples are HP’s IMAGE, UNIVAC’s DMS 1100 and DEC’s DBMS 10-20. Network
systems allow very general interdependencies to be expressed conveniently. But the
resulting structure can be difficult to comprehend, modify and reconstruct in case of
failure.
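The directed-graph picture of the network model can be sketched in a few lines of Python. The teacher and course names are hypothetical, and the dictionaries are only a stand-in for the owner/member links a network DBMS would maintain.

```python
from collections import defaultdict

# Arcs of a directed graph: (owner record, member record). Names are hypothetical.
teaches = [
    ("Teacher:T1", "Course:DBMS"),
    ("Teacher:T1", "Course:OS"),
    ("Teacher:T2", "Course:DBMS"),
]

children = defaultdict(set)  # owner -> members
parents = defaultdict(set)   # member -> owners
for owner, member in teaches:
    children[owner].add(member)
    parents[member].add(owner)

# In a hierarchy every child has exactly one parent; here one course has two.
multi_parent = sorted(node for node, owners in parents.items() if len(owners) > 1)
```

One teacher owns several courses and one course has several owners, which is the many-to-many case a strict hierarchy cannot express directly.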
Many-to-Many Relationship
TEACHER
Limitations:
DATABASE MODELLING
E-R Diagram: E-R Diagram is a tool which helps in expressing the relationships
amongst various entities and helps in modeling (deciding/designing) the database.
[Figure: E-R diagram in which the entity Teacher is connected through the relationship
Teaches, with cardinalities marked 1 and m]