Dbms Notes - Lal

detailed noted

Uploaded by

Yash Bhatnagar

File Processing System:

A file processing system stores data in separate computer files. It is a
way of storing and managing data in which each department or area within
an organization keeps its own set of files, which often creates data
redundancy and data isolation.

The file handling taught in C/C++ is an example of a file processing
system. Application programs written in languages such as C/C++ go
through the file system to access these flat files.

Limitations of the File Processing System / File-Based Approach

The file-based approach has the following problems:

1. Separated and Isolated Data: To make a decision, a user might need
data from two separate files. Analysts and programmers first had to
evaluate the files to determine the specific data required from each file and
the relationships between the data; only then could applications be written
in a programming language to process and extract the needed data. Imagine
the work involved if data from several files was needed.
2. Duplication of data: Often the same information is stored in more than
one file. Uncontrolled duplication of data is undesirable for several
reasons, such as:

• Duplication is wasteful. It costs time and money to enter the data more
than once.

• It takes up additional storage space, again with associated costs.

3. Duplication can lead to loss of data integrity; in other words, the
data is no longer consistent. For example, consider the duplication of data
between the Payroll and Personnel departments. If a member of staff moves
to a new house and the change of address is communicated only to Personnel
and not to Payroll, the person's pay slip will be sent to the wrong address. A
more serious problem occurs if an employee is promoted with an associated
increase in salary. Again, the change is notified to Personnel but does not
filter through to Payroll. Now the employee is receiving the wrong salary.

4. Data Dependence:

In the file-based approach, application programs are data dependent. This
means that when the physical representation of the data (how it is laid out
on disk) or the access technique (how it is physically accessed) changes,
the application programs are also affected and need modification. In other
words, application programs depend on how the data is physically stored
and accessed.

For example, if the physical format of the master/transaction file is
changed by modifying the field or record delimiter, the application
programs that depend on that file must be modified.

Let us consider a student file in which student information is stored in a
text file and each field is separated by a blank space, as shown below:

1 Rahat 35 Thapar

Now suppose the field delimiter changes from a blank space to a semicolon,
as shown below:

1; Rahat; 35; Thapar

The application programs using this file must then be modified, because
they now have to tokenize the fields on the semicolon, whereas earlier the
delimiter was a blank space.
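
The dependence on the delimiter can be seen in a minimal sketch. The function name parse_student and the field names roll, name, age and city are assumptions made for illustration; the notes give only the two sample records.

```python
def parse_student(line, delimiter=" "):
    """Split one student record into fields using the given delimiter."""
    roll, name, age, city = [field.strip() for field in line.split(delimiter)]
    return {"roll": int(roll), "name": name, "age": int(age), "city": city}


old_record = "1 Rahat 35 Thapar"
new_record = "1; Rahat; 35; Thapar"

# The program was written for the blank-space format and works on it.
print(parse_student(old_record))

# The unchanged program fails on the semicolon format: "1;" is not an integer.
try:
    parse_student(new_record)
except ValueError as error:
    print("old parser breaks:", error)

# Only after the program itself is modified does the new format parse.
print(parse_student(new_record, delimiter=";"))
```

The data itself did not change, only its physical layout, yet the program had to be edited. This is the coupling that a DBMS is meant to remove.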

5. Difficulty in setting relationship on data: To create useful
applications for the user, often data from various files must be combined. In
file processing it was difficult to determine relationships between isolated
data in order to meet user requirements.

6. Incompatible file formats: As the structure of files is embedded in the
application programs, the structures are dependent on the application
programming language. For example, the structure of a file generated by a
COBOL program may be different from the structure of a file generated by a
'C' program. The direct incompatibility of such files makes them difficult to
process jointly.

7. Data Security. The security of data is low in a file-based system because
data kept in flat files is easily accessible. For example, consider a banking
system. The Customer Transaction file has details about the total available
balance of all customers. A customer wants information about his account
balance. In a file system it is difficult to give the customer access to only
his own data in the file. Thus, enforcing security constraints for the entire
file or for certain data items is difficult.

8. Transactional Problems. The file-based approach does not satisfy the
transaction properties of Atomicity, Consistency, Isolation and Durability,
commonly known as the ACID properties.

For example, suppose that in a banking system a transaction transfers Rs.
1000 from account A to account B, where the initial balances of A and B are
Rs. 5000 and Rs. 10000 respectively. If the system crashes after Rs. 1000
has been withdrawn from account A but before the amount has been
deposited in account B, the system is left in an inconsistent state. A
transaction should therefore execute wholly or not at all, never partially.
This concept is known as atomicity of a transaction (either 0% or 100% of
the transaction). It is difficult to achieve this property in a file-based system.
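
For contrast, here is a minimal sketch of how a DBMS provides atomicity for this transfer, using Python's built-in sqlite3 module. The table name account, the helper names transfer and balances, and the crash_midway flag that simulates the crash are all assumptions made for illustration.

```python
import sqlite3


def transfer(conn, src, dst, amount, crash_midway=False):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if crash_midway:
                raise RuntimeError("simulated crash after the withdrawal")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 5000), ("B", 10000)])
conn.commit()

transfer(conn, "A", "B", 1000, crash_midway=True)
after_crash = balances(conn)    # the withdrawal was rolled back

transfer(conn, "A", "B", 1000)
after_success = balances(conn)  # both updates were applied together
print(after_crash, after_success)
```

After the simulated crash the balances are still 5000 and 10000, so the half-finished withdrawal never becomes visible. A program that rewrites flat files would have to implement this rollback logic itself.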

9. Concurrency problems. When multiple users access the same piece of
data during the same interval of time, this is called concurrency. When two
or more users read the data simultaneously there is no problem, but when
they try to update a file simultaneously, problems may result.

For example:

Consider a scenario in which, in transaction T1, a user transfers an amount
of 1000 from account A to account B (the initial value of A is 5000 and of B
is 8000). Meanwhile, another transaction T2, which displays the sum of
accounts A and B, is also executed. If both transactions run in parallel, the
result may be inconsistent: suppose T2 reads A and B after T1 has
withdrawn the amount from A but before T1 has deposited it in B.

This schedule gives an inconsistent result: it shows Rs. 12,000 as the sum
of accounts A and B instead of Rs. 13,000. The problem occurs because the
concurrently running transaction T2 reads A and B at an intermediate point
and computes their sum, which gives an inconsistent value.
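
The interleaving can be replayed step by step in a minimal sketch. It uses a plain dictionary in place of the account file and runs the steps of T1 and T2 in the problematic order by hand; no real concurrency is involved.

```python
accounts = {"A": 5000, "B": 8000}

# T1, step 1: withdraw 1000 from A.
accounts["A"] -= 1000

# T2 runs at this intermediate point and reads both balances.
sum_seen_by_t2 = accounts["A"] + accounts["B"]

# T1, step 2: deposit 1000 into B.
accounts["B"] += 1000

actual_sum = accounts["A"] + accounts["B"]
print(sum_seen_by_t2, actual_sum)  # 12000 13000
```

T2 reports 12,000 although the two accounts hold 13,000 both before and after the transfer.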

10. Poor data modeling of the real world. The file-based system is not able
to represent complex data and inter-file relationships, which results in poor
data modeling.

A database is a collection of data that is related in some respect. Data is a
collection of facts and figures that can be processed to produce
information.

The name of a student, her age, class and subjects can be counted as data
for recording purposes.

For example, if we have data about the marks obtained by all students, we
can then draw conclusions about the toppers, the average marks, and so on.

A database management system stores data in such a way that it is easier
to retrieve and manipulate, and helps to produce information.

Characteristics
• Relation-based tables: A DBMS allows entities and the relations among them to be represented as
tables. This simplifies data storage. A user can understand the architecture of the database just by
looking at the table names.
• Less redundancy: A DBMS follows the rules of normalization, which split a relation when any of
its attributes has redundant values. Normalization, which is itself a mathematically rich and
scientific process, leaves the entire database with as little redundancy as possible.
• Query Language: A DBMS is equipped with a query language, which makes it more efficient to
retrieve and manipulate data. A user can apply as many different filtering options as he or she
wants. This was traditionally not possible with a file-processing system.
• ACID Properties: A DBMS follows the ACID properties, which stand for Atomicity, Consistency,
Isolation and Durability. These concepts are applied to transactions, which manipulate data in the
database. The ACID properties keep the database in a healthy state in a multi-transactional
environment and in case of failure.

Atomicity
Atomicity refers to the ability of the database to guarantee that either all of the tasks of
a transaction are performed or none of them are. Database modifications must follow an
all-or-nothing rule. A transaction is said to be atomic if, when one part of the
transaction fails, the entire transaction fails.
Consistency
The consistency property ensures that the database is in a consistent state
before the transaction starts and after the transaction is over (whether successful
or not).

Isolation
Isolation refers to the requirement that other operations cannot access or see the data
in an intermediate state during a transaction. This constraint is required to maintain
performance as well as consistency between transactions in a database. Thus, each
transaction is unaware of other transactions executing concurrently in the system.

Durability
Durability refers to the guarantee that once the user has been notified of success, the transaction will
persist, and not be undone. This means it will survive system failure, and that the database system has
checked the integrity constraints and won't need to abort the transaction.

• Multiuser and Concurrent Access: A DBMS supports a multi-user environment and allows users
to access and manipulate data in parallel. There are restrictions on transactions when they
attempt to handle the same data item, but users are always unaware of them.
• Multiple views: A DBMS offers multiple views for different users. A user in the sales
department will have a different view of the database than a person working in the production
department. This gives each user a focused view of the database according to their requirements.
• Security: Features like multiple views offer security to some extent, since users are unable
to access the data of other users and departments. A DBMS offers methods to impose constraints while
entering data into the database and while retrieving it at a later stage. A DBMS offers many different
levels of security features, which enable multiple users to have different views with different
features. For example, a user in the sales department cannot see the data of the purchase department;
in addition, how much of the sales department's data he can see can also be managed. Because a DBMS
does not store its data on disk as plain files the way a traditional file system does, it is much
harder for an intruder to read the data directly.
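
A minimal sketch of the "multiple views" idea, using Python's built-in sqlite3 module. The employee table, the view name sales_view and the sample rows are made up for illustration. SQLite has no user accounts, so the sketch shows only what the view exposes, not how a particular user is restricted to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Asha", "Sales", 40000), ("Ravi", "Sales", 42000), ("Meena", "Purchase", 45000)],
)

# Hypothetical view for the sales department: only sales rows, no salary column.
conn.execute(
    "CREATE VIEW sales_view AS "
    "SELECT name, department FROM employee WHERE department = 'Sales'"
)

sales_rows = conn.execute("SELECT * FROM sales_view ORDER BY name").fetchall()
print(sales_rows)
```

A user who works only through sales_view sees neither the purchase department's rows nor anyone's salary, which is the "how much data he can see" control described above.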

Users
A DBMS is used by various users for various purposes. Some may be involved in retrieving data and
some in backing it up. Some of them are described as follows:

[Image: DBMS Users]


• Administrators: A group of users who maintain the DBMS and are responsible for administrating
the database. They look after its usage and decide by whom it should be used. They create user
access and apply limitations to maintain isolation and enforce security. Administrators also
look after DBMS resources such as the system license, the required software applications and tools,
and other hardware-related maintenance.
• Designers: This is the group of people who actually work on the design of the database. The
database starts with requirement analysis followed by a good design process. These people keep a
close watch on what data should be kept and in what format. They identify and design the whole set
of entities, relations, constraints and views.
• End Users: This group contains the people who actually take advantage of the database
system. End users can be simple viewers who pay attention to the logs or market rates, or they
can be as sophisticated as business analysts who make the most of it.
DBMS Architecture
The design of a Database Management System depends heavily on its architecture. DBMS
architecture can be seen as single-tier or multi-tier. An n-tier architecture divides the whole system
into n related but independent modules, which can be independently modified, altered, changed or
replaced.

In a 1-tier architecture, the DBMS is the only entity: the user sits directly on the DBMS and uses it.
Any changes made here are made directly on the DBMS itself. It does not provide handy tools for end
users, and it is mainly database designers who use the single-tier architecture.

If the architecture of the DBMS is 2-tier, there must be some application that uses the DBMS.
Programmers use the 2-tier architecture, in which they access the DBMS by means of an application.

3-tier architecture
The most widely used architecture is the 3-tier architecture. It separates its tiers from each
other on the basis of their users. It is described as follows:

3-tier DBMS architecture

• Database (Data) Tier: Only the database resides at this tier. The database, along with its query
processing languages, sits in layer 3 of the 3-tier architecture. It also contains all relations and
their constraints.
• Application (Middle) Tier: The application server and the programs that access the database
reside at this tier. For a user, the application tier works as an abstracted view of the database.
Users are unaware of any existence of the database beyond the application. For the database tier, the
application tier is its user. The database tier is not aware of any other user beyond the application
tier. This tier works as a mediator between the other two.
• User (Presentation) Tier: The end user sits at this tier. From a user's perspective this tier is
everything. He/she doesn't know about any existence or form of the database beyond this layer. At
this layer multiple views of the database can be provided by the application. All views are generated
by applications, which reside in the application tier.
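
The separation of the three tiers can be sketched in a single Python file. This is only an illustration under stated assumptions: in a real deployment each tier would usually run as a separate process or machine, and the product table and the function names products_cheaper_than and render are hypothetical.

```python
import sqlite3

# Data tier: the database itself, with its relations and constraints.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE product (name TEXT PRIMARY KEY, price INTEGER NOT NULL)")
db.executemany("INSERT INTO product VALUES (?, ?)", [("pen", 10), ("book", 120)])


# Application tier: the only code that talks to the database.
def products_cheaper_than(limit):
    rows = db.execute(
        "SELECT name, price FROM product WHERE price < ? ORDER BY name", (limit,)
    )
    return [{"name": name, "price": price} for name, price in rows]


# Presentation tier: formats what the application tier returns; no SQL here.
def render(products):
    return "\n".join(f"{p['name']}: Rs. {p['price']}" for p in products)


page = render(products_cheaper_than(100))
print(page)
```

The presentation code never sees a table or a query, and the database never sees the end user, which mirrors the mediator role of the application tier.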

DATABASE LANGUAGE:

Database languages are used for defining and accessing the database.

DDL

To describe data and data structures, a suitable description tool, a data definition language (DDL),
is needed. With its help a data schema can be defined and also changed later.
Typical DDL operations (with their respective keywords in the structured query language SQL):

Creation of tables and definition of attributes (CREATE TABLE ...)

Change of tables by adding or deleting attributes (ALTER TABLE …)

Deletion of whole table including content (!) (DROP TABLE …)

DML

In addition, a language is needed for describing operations on data such as store, search, read,
change, etc., the so-called data manipulation. Such operations can be done with a data
manipulation language (DML). Within such languages keywords like insert, modify, update, delete,
select, etc. are common.
Typical DML operations (with their respective keywords in the structured query language SQL):

Add data (INSERT)

Change data (UPDATE)

Delete data (DELETE)

Query data (SELECT)

Often these two languages for the definition and manipulation of databases are combined in one
comprehensive language, SQL.
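
The DDL and DML keywords listed above can be tried out in a minimal sketch using Python's built-in sqlite3 module. The student table and its sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define a table, then change it by adding an attribute.
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN age INTEGER")

# DML: add, change, delete and query data.
conn.execute("INSERT INTO student (roll, name, age) VALUES (1, 'Rahat', 35)")
conn.execute("INSERT INTO student (roll, name, age) VALUES (2, 'Simran', 22)")
conn.execute("UPDATE student SET age = 36 WHERE roll = 1")
conn.execute("DELETE FROM student WHERE roll = 2")
rows = conn.execute("SELECT roll, name, age FROM student").fetchall()
print(rows)

# DDL again: DROP TABLE removes the table together with its contents.
conn.execute("DROP TABLE student")
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(tables)
```

Note that DELETE removes rows and leaves the table in place, while DROP TABLE removes the table definition as well.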

Data Control Language (DCL)

Data Control Language (DCL) statements are used to control access to data stored in a database.
Some examples:
• GRANT - gives users access privileges to the database

• REVOKE - withdraws access privileges given with the GRANT command

Transaction Control (TCL)

Transaction Control (TCL) statements are used to manage the changes made by DML statements.
They allow statements to be grouped together into logical transactions.

Some examples:

• COMMIT - save work done

• SAVEPOINT - identify a point in a transaction to which you can later roll back

• ROLLBACK - restore the database to its state as of the last COMMIT
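
The three TCL statements can be seen together in a minimal sketch using Python's built-in sqlite3 module. The log table and the savepoint name before_second are assumptions made for illustration, and the exact syntax for rolling back to a savepoint differs slightly between database systems.

```python
import sqlite3

# isolation_level=None turns off the module's implicit transaction handling,
# so the BEGIN / SAVEPOINT / ROLLBACK / COMMIT statements below are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (entry TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('first')")
conn.execute("SAVEPOINT before_second")
conn.execute("INSERT INTO log VALUES ('second')")
conn.execute("ROLLBACK TO before_second")  # undoes only 'second'
conn.execute("COMMIT")                     # makes 'first' permanent

conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('third')")
conn.execute("ROLLBACK")                   # back to the state of the last COMMIT

entries = [row[0] for row in conn.execute("SELECT entry FROM log")]
print(entries)
```

Only 'first' survives: 'second' was undone by rolling back to the savepoint, and 'third' by rolling back the whole second transaction.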


Database Interfaces

Working Principle of a Database Interface

The application uses SQL, a query language, to pose a query to the database system. There,
the corresponding answer (result set) is prepared and returned to the application, again with the
help of SQL. This communication can take place interactively or be embedded into another language.

Type and Use of the Database Interface

Two important uses of a database interface like SQL are listed below:

Interactive: SQL can be used interactively from a terminal.

Embedded: SQL can be embedded into another language (the host language), which might be used to
create a database application.
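
A minimal sketch of the second use, with Python as the host language and the built-in sqlite3 module as the interface. Strictly speaking, this passes SQL text through a library call rather than using precompiled embedded SQL, but the working principle is the same: the application sends a query and receives a result set. The student table and the function name names_older_than are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Rahat", 35), (2, "Simran", 22)],
)


def names_older_than(min_age):
    """Host-language function that sends an SQL query and returns the result set."""
    cursor = conn.execute(
        "SELECT name FROM student WHERE age > ? ORDER BY name", (min_age,)
    )
    return [name for (name,) in cursor]


result = names_older_than(30)
print(result)
```

The `?` placeholder lets the host program pass its own variable into the query without building the SQL string by hand.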

The ability to modify a schema definition at one level without affecting the schema
definition at the next higher level is called data independence. There are two levels of
data independence: physical data independence and logical data independence.

1. Physical data independence is the ability to modify the physical schema without
causing application programs to be rewritten. Modifications at the physical level
are occasionally necessary to improve performance. It means we can change the
physical storage level without affecting the conceptual or external view of the
data. The changes are absorbed by the mapping between the levels.
2. Logical data independence is the ability to modify the logical schema without
causing application programs to be rewritten. Modifications at the logical level are
necessary whenever the logical structure of the database is altered (for example,
when money-market accounts are added to a banking system). Logical data
independence means that if we add new columns to a table or remove columns
from it, the user views and the programs that do not use those columns should
not have to change. For example, consider two users A and B, both selecting
empno and ename. If user B adds a new column salary to the table, this does
not affect the external view of user A, although the underlying schema of the
database has changed for both users. User A can now also print the salary if he
wants, but a program that uses the existing view does not need to be changed.
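
The empno/ename example can be checked in a minimal sketch using Python's built-in sqlite3 module. The table name emp, the function name user_a_report and the sample row are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT)")
conn.execute("INSERT INTO emp VALUES (1, 'Rahat')")


def user_a_report():
    """User A's program: it names only the columns it needs."""
    return conn.execute("SELECT empno, ename FROM emp").fetchall()


before = user_a_report()

# User B changes the logical schema by adding a salary column.
conn.execute("ALTER TABLE emp ADD COLUMN salary INTEGER")
conn.execute("UPDATE emp SET salary = 30000 WHERE empno = 1")

after = user_a_report()  # unchanged program, unchanged result
with_salary = conn.execute("SELECT empno, ename, salary FROM emp").fetchall()
print(before, after, with_salary)
```

User A's program returns the same rows before and after the schema change. A program that reads the underlying file by column position would not be insulated in this way.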

For the system to be usable, it must retrieve data efficiently. The need for
efficiency has led designers to use complex data structures to represent data
in the database. Since many database-systems users are not computer
trained, developers hide the complexity from users through several levels of
abstraction, to simplify users’ interactions with the system:
• Physical Level: The lowest level of abstraction describes how the
data are actually stored. The physical level describes complex low-
level data structures in detail.
• Logical Level: The next-higher level of abstraction describes what
data are stored in the database, and what relationships exist among
those data. The logical level thus describes the entire database in
terms of a small number of relatively simple structures. Although
implementation of the simple structures at the logical level may
involve complex physical-level structures, the user of the logical level
does not need to be aware of this complexity. Database
administrators, who must decide what information to keep in the
database, use the logical level of abstraction.

The three levels of data abstraction


• View Level: The highest level of abstraction describes only part of
the entire database. Even though the logical level uses simpler
structures, complexity remains because of the variety of information
stored in a large database. Many users of the database system do not
need all this information; instead, they need to access only a part of
the database. The view level of abstraction exists to simplify their
interaction with the system. The system may provide many views for
the same database.

Data Models
Data models are a collection of conceptual tools for describing data, data
relationships, data semantics and data constraints. There are three different
groups:
1. Object-based Logical Models.
2. Record-based Logical Models.
3. Physical Data Models.
We'll look at them in more detail now.

• Object-based Logical Models
o The E-R Model
o The Object-Oriented Model
• Record-based Logical Models
o The Relational Model
o The Network Model
o The Hierarchical Model
• Physical Data Models

The Entity-Relationship (ER) model is a high-level data model that is useful in developing a
conceptual design for a database.

Entities and Attributes

Entity: an object that is involved in the enterprise and that can be distinguished from other
objects. It can be a person, place, event, object or concept in the real world.

• Can be a physical object or an abstraction
• Ex: "John", "CSE305"

Entity Type: a set of similar objects or a category of entities; entity types are well defined

• A rectangle represents an entity set
• Ex: students, courses
• We often just say "entity" and mean "entity type"

Attributes:

– Properties of entities that describe their characteristics.

– Types:

• Simple: Attribute that is not divisible, e.g. age.

• Composite: Attribute composed of several simple attributes,
e.g. address (house number, street, district).

• Multivalued: Attribute with a set of possible values for the same
entity, e.g. phone (home, mobile etc.) or email.

• Key: Attribute that uniquely identifies the entity, e.g. PPSN, chassis number.

• Derived: Attribute that can be obtained from other attributes or related entities.

– Value Set (or domain): Each simple attribute is associated with a value set,
the set of values that may be assigned to that attribute for each individual entity,
e.g. age = integer, range [18, …, 65].
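
The attribute types can be illustrated with a small Python sketch. The class names Person and Address, the field names, and the choice of age as the derived attribute (computed from a stored birth date) are assumptions made for illustration; the notes do not prescribe any particular representation.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    """Composite attribute: built from several simple attributes."""
    house_number: str
    street: str
    district: str


@dataclass
class Person:
    person_id: str                              # key attribute
    birth_date: date                            # simple attribute
    address: Address                            # composite attribute
    phones: list = field(default_factory=list)  # multivalued attribute

    def age_on(self, today):
        """Derived attribute: computed from birth_date, not stored."""
        birthday = (self.birth_date.month, self.birth_date.day)
        had_birthday = (today.month, today.day) >= birthday
        return today.year - self.birth_date.year - (0 if had_birthday else 1)

    def age_in_domain(self, today, low=18, high=65):
        """Value-set check for the example domain 18..65."""
        return low <= self.age_on(today) <= high


p = Person("P-001", date(1990, 6, 15), Address("12", "Mall Road", "Patiala"),
           ["home", "mobile"])
print(p.age_on(date(2024, 6, 14)), p.age_on(date(2024, 6, 15)))  # 33 34
```

Storing the birth date and deriving the age keeps the data from going stale, which is the usual reason for treating an attribute as derived.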

Keys

Superkey: an attribute or set of attributes that uniquely identifies an entity; there can
be many of these

Composite key: a key requiring more than one attribute

Candidate key: a superkey such that no proper subset of its attributes is also a
superkey (minimal superkey – has no unnecessary attributes)

Primary key: the candidate key chosen to be used for identifying entities and
accessing records. Unless otherwise noted, "key" means "primary key"

Alternate key: a candidate key not used as the primary key
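
The difference between a superkey and a candidate key can be checked mechanically on sample data. The sketch below is only an illustration: the rows are made up, and a real key is a rule about all data the enterprise may ever store, not just about the rows that happen to be present.

```python
from itertools import combinations

# Made-up sample rows for illustration only.
rows = [
    {"roll": 1, "email": "a@x.org", "name": "Rahat", "city": "Thapar"},
    {"roll": 2, "email": "b@x.org", "name": "Rahat", "city": "Delhi"},
    {"roll": 3, "email": "c@x.org", "name": "Simran", "city": "Delhi"},
    {"roll": 4, "email": "d@x.org", "name": "Simran", "city": "Delhi"},
]


def is_superkey(attrs, rows):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Superkeys with no proper subset that is also a superkey."""
    attributes = sorted(rows[0])
    keys = []
    # Smaller attribute sets are tried first, so any superkey that contains
    # an already found key is not minimal and is skipped.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            is_minimal = not any(set(k) <= set(attrs) for k in keys)
            if is_minimal and is_superkey(attrs, rows):
                keys.append(attrs)
    return keys


print(is_superkey(("roll", "name"), rows))  # True, but not minimal
print(is_superkey(("name",), rows))         # False
print(candidate_keys(rows))                 # [('email',), ('roll',)]
```

If roll is chosen as the primary key, email is the alternate key, and (roll, name) remains a superkey that is not a candidate key.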

Relationship
A relationship type is a set of associations among entity types. For example,
the student entity type is related to the team entity type because each student is a
member of a team.

A relationship or relationship instance is an ordered pair consisting of particular
related entities.

The degree of a relationship type is the number of entity types that participate. If
two entity types participate, the relationship type is binary. A role name indicates
the purpose of an entity in a relationship.

A relationship type can also have attributes. The relationship type order connects
the entities chemical and supplier. The relationship is many-to-many because each
chemical can come from several suppliers and each supplier has a number of chemicals.
An order has a purchase date, amount, and total cost as well as the chemical and
supplier information. Thus, order has attributes PurchaseDate, amount,
and TotalCost that we cannot appropriately associate with chemical or supplier.

A recursive relationship is one in which the same entity participates more than once
in the relationship.
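
One common way to realize a many-to-many relationship type with attributes is a table of its own. The sketch below uses Python's built-in sqlite3 module; the table name chem_order (ORDER is a reserved word in SQL), the id columns and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chemical (chem_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE supplier (supp_id INTEGER PRIMARY KEY, name TEXT);
-- The relationship gets its own table and carries its own attributes.
CREATE TABLE chem_order (
    chem_id      INTEGER REFERENCES chemical(chem_id),
    supp_id      INTEGER REFERENCES supplier(supp_id),
    PurchaseDate TEXT,
    amount       INTEGER,
    TotalCost    INTEGER
);
INSERT INTO chemical VALUES (1, 'acetone'), (2, 'ethanol');
INSERT INTO supplier VALUES (1, 'Supplier X'), (2, 'Supplier Y');
INSERT INTO chem_order VALUES
    (1, 1, '2024-01-10', 5, 500),
    (1, 2, '2024-02-01', 2, 220),
    (2, 1, '2024-02-15', 4, 360);
""")

# One chemical can come from several suppliers ...
suppliers_of_acetone = [row[0] for row in conn.execute(
    "SELECT s.name FROM chem_order o JOIN supplier s ON s.supp_id = o.supp_id "
    "WHERE o.chem_id = 1 ORDER BY s.name")]

# ... and one supplier can supply several chemicals.
chemicals_from_x = [row[0] for row in conn.execute(
    "SELECT c.name FROM chem_order o JOIN chemical c ON c.chem_id = o.chem_id "
    "WHERE o.supp_id = 1 ORDER BY c.name")]

print(suppliers_of_acetone, chemicals_from_x)
```

PurchaseDate, amount and TotalCost live in chem_order because they describe a particular pairing of a chemical with a supplier, not either entity alone.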
