
dbms notes

Uploaded by

Kavita Rani
Copyright
© © All Rights Reserved

DBMS NOTES (Unit-1,2,3)

Database is a collection of related data and data is a collection of facts and figures that can be
processed to produce information.

Data mostly represents recordable facts. Data aids in producing information, which is based on
facts. For example, if we have data about the marks obtained by all students, we can draw
conclusions about the toppers and the average marks.

A database management system stores data in such a way that it becomes easier to retrieve,
manipulate, and produce information.

Q1. Characteristics
Traditionally, data was organized in file formats. DBMS was a new concept then, and research
was directed at overcoming the deficiencies of the traditional style of data management.
A modern DBMS has the following characteristics −

• Real-world entity − A modern DBMS is more realistic and uses real-world entities to
design its architecture. It uses their behavior and attributes too. For example, a school
database may use students as an entity and their age as an attribute.

• Relation-based tables − DBMS allows entities and relations among them to form tables.
A user can understand the architecture of a database just by looking at the table names.

• Isolation of data and application − A database system is entirely different from its data.
A database is an active entity, whereas data is said to be passive, on which the database
works and organizes. DBMS also stores metadata, which is data about data, to ease its
own process.

• Less redundancy − DBMS follows the rules of normalization, which splits a relation
when any of its attributes has redundancy in its values. Normalization is a
mathematically rich and scientific process that reduces data redundancy.

• Consistency − Consistency is a state where every relation in a database remains
consistent. There exist methods and techniques which can detect an attempt to leave the
database in an inconsistent state. A DBMS can provide greater consistency as compared to
earlier forms of data storing applications like file-processing systems.

• Query Language − DBMS is equipped with a query language, which makes it more
efficient to retrieve and manipulate data. A user can apply as many and as varied
filtering options as required to retrieve a set of data. Traditionally this was not possible
where a file-processing system was used.

• ACID Properties − DBMS follows the concepts of Atomicity, Consistency, Isolation,
and Durability (normally shortened as ACID). These concepts are applied to
transactions, which manipulate data in a database. ACID properties help the database
stay healthy in multi-transactional environments and in case of failure.

• Multiuser and Concurrent Access − DBMS supports a multi-user environment and
allows users to access and manipulate data in parallel. Although there are restrictions on
transactions when users attempt to handle the same data item, users are generally
unaware of them.

• Multiple views − DBMS offers multiple views for different users. A user who is in the
Sales department will have a different view of the database than a person working in the
Production department. This feature enables the users to have a concentrated view of the
database according to their requirements.

• Security − Features like multiple views offer security to some extent, where users are
unable to access data of other users and departments. DBMS offers methods to impose
constraints while entering data into the database and retrieving the same at a later stage.
DBMS offers many different levels of security features, which enable multiple users to
have different views with different features. For example, a user in the Sales department
cannot see the data that belongs to the Purchase department. Additionally, it can also be
managed how much data of the Sales department should be displayed to the user. Because
a DBMS does not store its data as plain files in the way a traditional file system does, it is
harder for miscreants to read or tamper with the data directly.
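
The "Multiple views" and "Security" points can be illustrated with a small sketch. It uses Python's built-in sqlite3 module only as a convenient stand-in for a DBMS; the table `orders`, its columns, and the view names `sales_view` and `production_view` are hypothetical and not part of these notes. SQLite has no user accounts, so the sketch shows only how a view narrows what a query can see, not how access is granted.

```python
import sqlite3

# In-memory database; all table, column and view names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " item TEXT, quantity INTEGER, price REAL, machine TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [(1, "bolt", 100, 2.5, "M1"), (2, "nut", 200, 1.0, "M2")],
)

# Each department gets a view exposing only the columns it needs.
conn.execute(
    "CREATE VIEW sales_view AS "
    "SELECT order_id, item, quantity, price FROM orders"
)
conn.execute(
    "CREATE VIEW production_view AS "
    "SELECT order_id, item, quantity, machine FROM orders"
)


def column_names(view):
    """Return the column names visible through the given view."""
    cursor = conn.execute(f"SELECT * FROM {view}")
    return [description[0] for description in cursor.description]


sales_columns = column_names("sales_view")
production_columns = column_names("production_view")
print(sales_columns)
print(production_columns)
```

Both views read the same stored rows, so nothing is duplicated; the Production view simply has no `price` column to query.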

Q2. DIFFERENCE BETWEEN FILE MANAGEMENT SYSTEM & DATABASE SYSTEM

Each point below compares a file management system (e.g. C++) with a database system
(e.g. ORACLE).

1. File management system - REDUNDANT DATA: Redundant data is the duplication of the same
   data at more than one storage place. Since all the files are designed independently, the
   same fields are stored in more than one file.
   Database system - REDUNDANCY CONTROL: By bringing data under central control, a DBMS
   removes duplication of data. Thus DBMS controls redundancy by ensuring that the same data
   is not stored many times.

2. File management system - ISOLATED DATA: Data is separated and scattered at various
   locations. It may be in different formats. To make a decision, the programmer may need
   different files. Thus, extracting data from different files and coordinating them is a
   difficult process.
   Database system - DATA CAN BE SHARED: Various users can use the same data in the database.
   Data can be shared not only by existing applications; new applications can also be
   developed to use the same stored data. This means that the data requirements of new
   applications may also be satisfied without creating new files.

3. File management system - SMALL SYSTEM: File management systems are small systems.
   Database system - LARGE SYSTEM: Database management systems are large systems.

4. File management system - RELATIVELY CHEAP: File management systems are relatively cheap
   as compared to database systems.
   Database system - RELATIVELY EXPENSIVE: Database systems are expensive. The cost can be of
   two types:
   (1) Software cost, which includes purchase of the DBMS and development of the DBMS.
   (2) Hardware cost, which includes the upgradation needed to allow for the extra overheads
   put up by the DBMS.

5. File management system - NO SECURITY: Security restrictions cannot be applied on the data.
   Database system - RIGOROUS SECURITY: As data is stored in a common place, DBMS assures a
   sound security system.

6. File management system - CHANCES OF INCONSISTENT OUTPUT: In a file processing system
   inconsistent output is often produced.
   Database system - CONSISTENT OUTPUT: In a database system consistent output is produced.

7. File management system - DISTRIBUTED CONTROL: It has distributed control.
   Database system - CENTRAL CONTROL: It has central control.

8. File management system - SIMPLE STRUCTURE: It has a simple structure.
   Database system - COMPLEX STRUCTURE: It has a complex structure.

9. File management system - SIMPLE BACKUP & RECOVERY: There are simple procedures for the
   backup and recovery process.
   Database system - COMPLEX BACKUP & RECOVERY: Data recovery in case of hardware or software
   failure is difficult due to the complexity of the database system.

10. File management system - NO SEARCH CAPABILITY: There is no search capability in a file
    system to give response to user queries.
    Database system - SEARCH CAPABILITY: There is search capability in a database system to
    give response to user queries.

11. File management system - OFTEN SINGLE USER: It often has a single user.
    Database system - MULTIPLE USERS: It has multiple users.

Q3. COMPONENTS /ELEMENTS OF DBMS (Do this question from notes given)

- DBMS is complex software, which provides interface for each category of the user.
- DBMS interprets user commands so that computer system can operate on that
command to perform the desired task.
- The important components of DBMS are: -
1. Data definition language (DDL) Compiler
2. Data manager
3. File manager
4. Disk manager
5. Query processor
6. Telecommunication system
7. Data files
8. Data dictionary

1. Data definition language (DDL) Compiler

- DDL is used by the database designers or application programmers to
specify the contents and structure of the database.
- DDL compiler converts the data definition statements that are in the
source form to object form.

2. Data manager

- Data manager is the central component of the DBMS.
- Data manager performs the following: -
a) Maintains the backup and recovery operations.
b) Converts the user queries from the user's logical view to the
physical file system.
c) Maintains the data consistency, integrity and security.
d) Synchronizes the simultaneous operations performed by
concurrent users.
e) Interfaces with the file manager.
3. File manager: -

- File manager:
• Manages the allocation of file space on disk storage.
• Manages the data structures used to represent information stored on disk.
• Receives the request from the data manager for a specific physical record,
locates the block containing the required record and then requests
that block of data from the disk manager. After receiving the required
block, it transmits the required record to the data manager.

4. Disk manager: -

- Performs all the physical input and output operations.
- Transfers the block requested by the file manager, so that the file manager
need not be aware of the physical characteristics of the storage media.

5. Query processor: -

- Query processor is the component of the DBMS responsible for
generating the best plan or strategy to process the query.
- Query processor interprets the online user query and converts it into a series
of efficient operations that are carried out by the data manager for
execution.
- The query processor has access to the following information stored in
the data dictionary: -

• Number of tuples in the relation.
• Size of the record in bytes.
• Number of blocks used to store the relation.
• Number of distinct values that appear in the relation.

- These statistics are used to estimate the cost of different access plans
and thus help in selecting the best strategy to process the query.
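
How such statistics might feed a cost estimate can be sketched as follows. These are simple textbook-style approximations, not the formulas of any particular DBMS; the function names, the assumption that records are not split across blocks, and the assumption that values are evenly distributed are all illustrative.

```python
import math


def blocks_for_full_scan(num_tuples, record_size_bytes, block_size_bytes):
    """Estimate how many blocks a full scan of the relation must read.

    Assumes fixed-size records that are never split across two blocks.
    """
    records_per_block = block_size_bytes // record_size_bytes
    if records_per_block == 0:
        raise ValueError("a record does not fit into one block")
    return math.ceil(num_tuples / records_per_block)


def estimated_matches(num_tuples, num_distinct_values):
    """Estimate rows matching `attribute = constant`.

    Assumes the distinct values are evenly distributed over the tuples.
    """
    return num_tuples / num_distinct_values


# Made-up statistics: 10,000 tuples of 100 bytes each, 4,000-byte blocks,
# and 50 distinct values in the attribute being filtered.
scan_cost = blocks_for_full_scan(10_000, 100, 4_000)
matches = estimated_matches(10_000, 50)
print(scan_cost, matches)
```

A query processor could compare the estimated scan cost with the cost of fetching only the estimated matching rows through an index, and pick the cheaper plan.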
6. Telecommunication System: -

- Users, remote or local, communicate with the computer system by
sending and receiving messages over communication lines. These
messages are routed by the telecommunication system.
7. Data files: -

- Data files store all the data portion of the database.

8. Data Dictionary: -

- Data dictionary stores the information about the data in the database.
- It contains the information about the entities, attributes, checks and
validation.
- Data dictionary is an integral component of the DBMS and is used to
control the database operations, data integrity and accuracy.

Objectives of DBMS:

The main objectives of DBMS are: -

1. Minimal Redundancy: -

- Data redundancy is the duplication of the same data at more than one
storage place.
- This duplication of data leads to wastage of storage space & time and
affects cost also.
- This redundancy has to be eliminated by integrating the data at one
place.

2. Consistency: -

- Data duplication means the same data has to be updated at several places.
- On some occasions, updating only some of the duplicated data entries may supply
incorrect or conflicting information. At such times, the database is said
to be inconsistent.
- Consistency of data has to be achieved through redundancy control.

3. Sharing of data: -

- This means various users can use the same data in the database.
- Moreover, new applications can be developed according to the needs to
operate against the same stored data. Hence the objective of DBMS is to
satisfy the data requirement of various new applications without the
need of having separate data for each application.

4. To provide multiple user interfaces: -

- In order to allow different users to access the database, DBMS provides: -
o Query languages, such as SQL or QBE, for casual users to access the
database.
o Programming language interfaces for application programmers.
o Menu-driven interfaces for stand-alone users.

5. Simplicity: -

- Another objective of DBMS is to make the application development procedure
simpler and easier.
- To achieve this, DBMS is accompanied by powerful query manipulation
and report generation tools.
- In order to hide the various data storage complexities, many 4GLs are
available with DBMS.

6. Flexibility: -

- DBMS allows changes to the structure of the database without affecting the
stored data or the existing applications.
- Thus it should make application development cheaper, faster and more flexible.

7. Data migration: -

- This objective is important to make the database economical.
- Not all data is referenced very frequently. Thus the rarely accessed data
can be stored on slower or cheaper devices, whereas more
frequently accessed data can be stored on fast access or direct access media
devices.
- Thus data migration implies placing data on costly or cheap
media devices according to how often it is accessed.

8. To restrict unauthorized access: -

- Data in the database must be secured. Hence an important objective of a
database system is to restrict unauthorized access.
- To assure this, DBMS must provide:
• Identification of users of the database before they can use the database.
• Monitoring of users' actions so that if they do something wrong they are
likely to be found.
• Protection of stored contents, so that they are not easy to crack.

9. Privacy and security: -

- Privacy means when, how and to what extent data should be given to users.
- Databases are costly products and hence their security is very important.
Data needs to be secured against accidental as well as intentional disclosure.
- Thus to achieve privacy and security is also an objective of DBMS.

10. To enforce integrity: -

- Integrity is data accuracy. It also implies that incorrect information cannot be
stored in the database.
- In order to achieve the objective of integrity, some integrity constraints are
enforced on the database.
- A DBMS should have the capabilities for defining and imposing consistency
constraints.

11. Maintain standards: -

- All applicable standards should be followed in the representation of data, such as
format, conventions on data names, documentation etc.
- The standardized data is very helpful during migration or interchanging of data.
- This will result in uniformity of the entire database as well as its usage.

Q4. Advantages of DBMS


The database management system has promising potential advantages, which are explained
below:

1. Controlling Redundancy: In a file system, each application has its own private files, which
cannot be shared between multiple applications. This can often lead to considerable redundancy
in the stored data, which results in wastage of storage space. By having a centralized database
most of this can be avoided. It is not possible to eliminate all redundancy.
Sometimes there are sound business and technical reasons for maintaining multiple copies of the
same data. In a database system, however, this redundancy can be controlled.

For example: In the case of a college database, there may be a number of applications like General
Office, Library, Account Office, Hostel etc. Each of these applications may maintain the
following information in its own private files:

It is clear from the above file systems that there is some common data of the student which has
to be mentioned in each application, like Rollno, Name, Class, Phone_No, Address etc. This
causes the problem of redundancy, which results in wastage of storage space and is difficult to
maintain. In the case of a centralized database, data can be shared by a number of applications and
the whole college can maintain its computerized data with the following database:

It is clear in the above database that Rollno, Name, Class, Father_Name, Address,
Phone_No and Date_of_birth, which are stored repeatedly in each application of the file system,
need not be stored repeatedly in the case of a database, because every other application can
access this information by joining relations on the basis of the common column, i.e. Rollno.
Suppose a user of the Library system needs the Name and Address of a particular student. By
joining the Library and General Office relations on the column Rollno, he/she can easily
retrieve this information.

Thus, we can say that the centralized system of DBMS reduces the redundancy of data to a great
extent but cannot eliminate it, because RollNo is still repeated in all the relations.
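
The join described above can be sketched with Python's built-in sqlite3 module. The table layouts, the `book_title` column and the sample row are assumptions made for illustration; the notes' own tables are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE general_office (
    rollno  INTEGER PRIMARY KEY,
    name    TEXT,
    address TEXT
);
-- The library table repeats only rollno, not the name or address.
CREATE TABLE library (
    rollno     INTEGER REFERENCES general_office(rollno),
    book_title TEXT
);
INSERT INTO general_office VALUES (5, 'Asha', 'Amritsar');
INSERT INTO library VALUES (5, 'Database Concepts');
""")

# The Library application gets name and address by joining on rollno.
rows = conn.execute(
    "SELECT g.name, g.address, l.book_title "
    "FROM library AS l "
    "JOIN general_office AS g ON g.rollno = l.rollno "
    "WHERE l.rollno = ?",
    (5,),
).fetchall()
print(rows)
```

Only `rollno` appears in both tables, which is the remaining, controlled redundancy mentioned above.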

2. Integrity can be enforced: Integrity of data means that data in database is always accurate,
such that incorrect information cannot be stored in database. In order to maintain the integrity of
data, some integrity constraints are enforced on the database. A DBMS should provide
capabilities for defining and enforcing the constraints.

For Example: Let us consider the case of a college database and suppose that the college has only
BTech, MTech, MSc, BCA, BBA and BCOM classes. If a user enters the class MCA, then
this incorrect information must not be stored in the database, and the user must be told that
this is an invalid data entry. In order to enforce this, an integrity constraint must be applied
to the class attribute of the student entity. But in the case of a file system, this constraint
must be enforced in all the applications separately (because all applications have a class field).

In the case of DBMS, this integrity constraint is applied only once, on the class field of the
General Office (because the class field appears only once in the whole database), and all other
applications get the class information about the student from the General Office table, so the
integrity constraint is applied to the whole database. So, we can conclude that an integrity
constraint can be enforced more easily in a centralized DBMS than in a file system.
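
One way to express this constraint is a CHECK clause on the class column. The sketch below uses Python's built-in sqlite3 module; the table layout, the helper `try_insert` and the student names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint is declared once, on the only table that stores the class.
conn.execute(
    "CREATE TABLE general_office ("
    " rollno INTEGER PRIMARY KEY,"
    " name   TEXT,"
    " class  TEXT CHECK (class IN"
    " ('BTech', 'MTech', 'MSc', 'BCA', 'BBA', 'BCOM')))"
)


def try_insert(rollno, name, class_name):
    """Insert a student; return False if the DBMS rejects the row."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO general_office VALUES (?, ?, ?)",
                (rollno, name, class_name),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when the CHECK constraint fails.
        return False


bca_accepted = try_insert(1, "Asha", "BCA")
mca_accepted = try_insert(2, "Ravi", "MCA")
stored = conn.execute("SELECT rollno, class FROM general_office").fetchall()
print(bca_accepted, mca_accepted, stored)
```

The application code does not check the class itself; the database refuses the invalid row.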

3. Inconsistency can be avoided : When the same data is duplicated and changes made at
one site are not propagated to the other site, it gives rise to inconsistency and the two
entries regarding the same data will not agree. At such times the data is said to be inconsistent.
So, if the redundancy is removed, the chances of having inconsistent data are also removed.

Let us again consider the college system and suppose that the General_Office file
indicates that Roll_Number 5 lives in Amritsar but the library file indicates that
Roll_Number 5 lives in Jalandhar. This is a state in which the two entries for the same
object do not agree with each other (that is, one is updated and the other is not). At such a
time the database is said to be inconsistent.

An inconsistent database is capable of supplying incorrect or conflicting information. So there
should be no inconsistency in a database. Inconsistency can be avoided much more easily
in a centralized system than in a file system.

Let us consider again the example of the college system and suppose that RollNo 5 has shifted from
Amritsar to Jalandhar. The address information of Roll Number 5 must then be updated wherever
Roll number and address occur in the system. In the case of a file system, the information must be
updated separately in each application; if we update it at only three places and forget to
update it in a fourth application, the whole system shows inconsistent results about
Roll Number 5.

In the case of DBMS, Roll number and address occur together only once, in the General_Office
table. So a single update is needed, and all other applications retrieve the address information
from General_Office, which is up to date. All applications therefore get the current
information from a single update operation, which is propagated to the whole database
automatically. This property is called Propagation of Update.

We can say that the redundancy of data greatly affects the consistency of data. If redundancy is
low, it is easy to maintain consistency of data. Thus, a DBMS can avoid inconsistency to a great
extent.
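
The effect of a single update can be sketched with Python's built-in sqlite3 module. The `library` and `hostel` tables, their columns and the sample values are assumptions for illustration; only `general_office` stores the address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE general_office (rollno INTEGER PRIMARY KEY, address TEXT);
CREATE TABLE library (rollno INTEGER, book_title TEXT);
CREATE TABLE hostel  (rollno INTEGER, room_no TEXT);
INSERT INTO general_office VALUES (5, 'Amritsar');
INSERT INTO library VALUES (5, 'Database Concepts');
INSERT INTO hostel  VALUES (5, 'B-12');
""")


def address_seen_by(app_table):
    """Address of roll number 5 as an application table sees it via a join."""
    row = conn.execute(
        f"SELECT g.address FROM {app_table} AS a "
        "JOIN general_office AS g ON g.rollno = a.rollno "
        "WHERE a.rollno = 5"
    ).fetchone()
    return row[0]


# One update, in the single place where the address is stored.
with conn:
    conn.execute(
        "UPDATE general_office SET address = 'Jalandhar' WHERE rollno = 5"
    )

library_address = address_seen_by("library")
hostel_address = address_seen_by("hostel")
print(library_address, hostel_address)
```

Because neither application keeps its own copy of the address, there is no second place that could be forgotten.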

4. Data can be shared: As explained earlier, the data about Name, Class, Father_name etc. of
General_Office is shared by multiple applications in a centralized DBMS, unlike in a file
system, so new applications can be developed to operate against the same stored data. The
applications may be developed without having to create any new stored files.

5. Standards can be enforced : Since DBMS is a central system, standards can be enforced
easily, whether at Company level, Department level, National level or International level. The
standardized data is very helpful during migration or interchanging of data. The file system is an
independent system, so standards cannot be easily enforced on multiple independent applications.

6. Restricting unauthorized access: When multiple users share a database, it is likely that some
users will not be authorized to access all information in the database. For example, account
office data is often considered confidential, and hence only authorized persons are allowed to
access such data. In addition, some users may be permitted only to retrieve data, whereas others
are allowed both to retrieve and to update. Hence, the type of access operation (retrieval or
update) must also be controlled. Typically, users or user groups are given account numbers
protected by passwords, which they can use to gain access to the database. A DBMS should provide
a security and authorization subsystem, which the DBA uses to create accounts and to specify
account restrictions. The DBMS should then enforce these restrictions automatically.

7. Serving Enterprise Requirements rather than Individual Requirements: Since many types of users
with varying levels of technical knowledge use a database, a DBMS should provide a variety of
user interfaces. The overall requirements of the enterprise are more important than the individual
user requirements. So, the DBA can structure the database system to provide an overall service
that is "best for the enterprise".

For example: A representation can be chosen for the data in storage that gives fast access for the
most important application at the cost of poor performance in some other application. The
file system, by contrast, favors individual requirements over enterprise requirements.

8. Providing Backup and Recovery: A DBMS must provide facilities for recovering from
hardware or software failures. The backup and recovery subsystem of the DBMS is responsible
for recovery. For example, if the computer system fails in the middle of a complex update
program, the recovery subsystem is responsible for making sure that the database is restored to
the state it was in before the program started executing.
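
A small-scale analogue of this behaviour is a transaction that is rolled back when a program fails part-way through. The sketch below uses Python's built-in sqlite3 module; the `accounts` table, the `transfer` function and the simulated failure are assumptions for illustration, and a real recovery subsystem also has to cope with crashes of the whole machine, which this does not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(amount, fail_midway=False):
    """Move `amount` from A to B as one transaction; return True on success."""
    try:
        with conn:  # commits if the block finishes, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'A'",
                (amount,),
            )
            if fail_midway:
                raise RuntimeError("simulated failure in the middle of the update")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'B'",
                (amount,),
            )
        return True
    except RuntimeError:
        return False


failed_ok = transfer(30, fail_midway=True)
balances_after_failure = dict(conn.execute("SELECT name, balance FROM accounts"))
print(failed_ok, balances_after_failure)
```

After the failed run the balances are the same as before the program started, so no half-finished update is left behind.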

9. Cost of developing and maintaining system is lower: It is much easier to respond to
unanticipated requests when data is centralized in a database than when it is stored in a
conventional file system. Although the initial cost of setting up a database can be large,
the cost of developing and maintaining application programs can be far lower than for a similar
service using conventional systems. The productivity of programmers can be higher when using
non-procedural languages that have been developed with DBMS than when using procedural
languages.

10. Data Model can be developed : The centralized system is able to represent complex data
and interfile relationships, which results in better data modeling properties. The data modeling
properties of the relational model are based on entities and their relationships, which are
discussed in detail in chapter 4 of the book.

11. Concurrency Control : DBMS systems provide mechanisms for concurrent access
to data by multiple users.

Disadvantages of DBMS:

The disadvantages of the database approach are summarized as follows:

1. Complexity : The provision of the functionality that is expected of a good DBMS makes the
DBMS an extremely complex piece of software. Database designers, developers, database
administrators and end-users must understand this functionality to take full advantage of it.
Failure to understand the system can lead to bad design decisions, which can have serious
consequences for an organization.

2. Size : The complexity and breadth of functionality makes the DBMS an extremely large piece
of software, occupying many megabytes of disk space and requiring substantial amounts
of memory to run efficiently.

3. Performance: Typically, a file-based system is written for a specific application, such as
invoicing. As a result, performance is generally very good. However, the DBMS is written to be
more general, to cater for many applications rather than just one. The effect is that some
applications may not run as fast as they used to.

4. Higher impact of a failure: The centralization of resources increases the vulnerability of the
system. Since all users and applications rely on the availability of the DBMS, the failure of any
component can bring operations to a halt.

5. Cost of DBMS: The cost of DBMS varies significantly, depending on the environment and
functionality provided. There is also the recurrent annual maintenance cost.

6. Additional Hardware costs: The disk storage requirements for the DBMS and the database
may necessitate the purchase of additional storage space. Furthermore, to achieve the required
performance it may be necessary to purchase a larger machine, perhaps even a machine
dedicated to running the DBMS. The procurement of additional hardware results in further
expenditure.

7. Cost of Conversion: In some situations, the cost of the DBMS and extra hardware may be
insignificant compared with the cost of converting existing applications to run on the new DBMS
and hardware. This cost also includes the cost of training staff to use these new systems and
possibly the employment of specialist staff to help with conversion and running of the system.
This cost is one of the main reasons why some organizations feel tied to their current systems
and cannot switch to modern database technology.

When not to Use a DBMS

In spite of the advantages of using a DBMS, there are a few situations in which such a system
may involve unnecessary overhead costs that would not be incurred in traditional file
processing.

The overhead costs of using a DBMS are due to the following:

+ High initial investment in hardware, software, and training.

+ Generality that a DBMS provides for defining and processing data.

+ Overhead for providing security, concurrency control, recovery, and integrity functions.

Additional problems may arise, if the database designers and DBA do not properly design the
database or if the database systems applications are not implemented properly.

Hence, it may be more desirable to use regular files under the following circumstances:

+ The database and applications are simple, well defined and not expected to change.

+ There are tight real-time requirements for some programs that may not be met because of
DBMS overhead.

+ Multiple user access to data is not required.

+ An application may need to manipulate the data in a way not supported by the query language.

Requirement for a DBMS

The software responsible for the management of data in computers, i.e. the DBMS (like Oracle,
Foxpro, SQL Server etc.), should meet the following requirements:

Q7. DBMS Users:

There are a number of users who can access or retrieve data on demand using the applications
and interfaces provided by the DBMS. Each type of user needs different software capabilities.
The users of a database system can be classified in the following groups, depending on their
degree of expertise or the mode of their interaction with the DBMS. The users can be:

• Naive Users

• Online Users

• Sophisticated Users

• Specialized Users

• Application Programmers

• Data Base Administrator (DBA)

Naive Users: Naive Users are those users who need not be aware of the presence of the database
system or any other system supporting their usage. Naive users are end users of the database who
work through a menu driven application program, where the type and range of response is
always indicated to the user.

A user of an Automatic Teller Machine (ATM) falls in this category. The user is instructed
through each step of a transaction. He or she then responds by pressing a coded key or entering a
numeric value. The operations that can be performed by naive users are very limited and affect
only a precise portion of the database. For example, in the case of the user of the Automatic
Teller Machine, the user's action affects only one or more of his/her own accounts.

Online Users : Online users are those who may communicate with the database directly via an
online terminal or indirectly via a user interface and application program. These users are aware
of the presence of the database system and may have acquired a certain amount of expertise
within the limited interaction permitted with a database.

Sophisticated Users : Such users interact with the system without writing programs.
Instead, they form their requests in a database query language. Each such query is submitted to a
query processor whose function is to break down DML statements into instructions that the storage
manager understands.

Specialized Users : Such users are those who write specialized database applications that do not
fit into the traditional data-processing framework. For example: computer-aided design systems,
knowledge base and expert systems, and systems that store data with complex data types (for
example, graphics data and audio data).

Application Programmers : Professional programmers are those who are responsible for
developing application programs or user interfaces. The application programs could be written
using a general purpose programming language or the commands available to manipulate a
database.

Database Administrator: The database administrator (DBA) is the person or group in charge
of implementing the database system within an organization. The DBA has all the system
privileges allowed by the DBMS and can assign (grant) and remove (revoke) levels of access
(privileges) to and from other users. The DBA is also responsible for the evaluation, selection and
implementation of the DBMS package.

Q8. Functions of Data Base Administrator

Data Base Administrator (DBA) is a person or group in charge of implementing DBMS in an
organization. The Database Administrator's job requires a high degree of technical expertise and the
ability to understand and interpret management requirements at a senior level. In practice the
DBA may consist of a team of people rather than just one person.

The main responsibilities of DBA are:

• Makes decisions concerning the content of the database: It is the DBA's job to decide
exactly what information is to be held in the database; in other words, to identify the entities of
interest to the enterprise and to identify the information to be recorded about those entities.

• Plans storage structures and access strategies: The DBA must also decide how the data is to
be represented in the database, and must specify the representation by writing the storage
structure definition (using the internal data definition language).

In addition, the associated mapping between the storage structure definition and the conceptual
schema must also be specified.

• Provides support to users: It is the responsibility of the DBA to provide support to the users,
to ensure that the data they require is available, and to write the necessary external schemas
(using the appropriate external data definition language).

In addition, the mapping between any given external schema and the conceptual schema must
also be specified.

• Defines security and integrity checks: The DBA is responsible for providing the authorization and
authentication checks so that no malicious user can access the database and it remains
protected. The DBA must also ensure the integrity of the database.

• Defines and implements backup and recovery strategies: In the event of damage to any portion of the
database (caused by human error, say, or by a failure in the hardware or supporting operating
system), it is essential to be able to repair the data concerned with a minimum of delay and with
as little effect as possible on the rest of the system.

The DBA must define and implement an appropriate recovery strategy to recover the database
from all types of failures.

• Monitoring performance and responding to changes in requirements: The
DBA is responsible for so organizing the system as to get the performance that is "best for the
enterprise," and for making the appropriate adjustments as requirements change.

Q9. DBMS Languages:

A DBMS must provide appropriate languages and interfaces for each category of users to
express database queries and updates. Database languages are used to create and
maintain a database on a computer. Database products such as Oracle, MySQL, MS Access,
dBase and FoxPro each provide such languages. The SQL statements commonly used in Oracle
and MS Access can be categorized as data definition language (DDL), data manipulation
language (DML) and data control language (DCL).

Data Definition Language (DDL)


It is a language that allows the users to define data and their relationship to other types of data. It
is mainly used to create files, databases, data dictionary and tables within databases.

It is also used to specify the structure of each table, set of associated values with each attribute,
integrity constraints, security and authorization information for each table and physical storage
structure of each table on disk.

The following table gives an overview about usage of DDL statements in SQL

Data Manipulation Language (DML)

It is a language that provides a set of operations to support the basic data manipulation operations
on the data held in the databases. It allows users to insert, update, delete and retrieve data from
the database. The part of DML that involves data retrieval is called a query language.

The following table gives an overview about the usage of DML statements in SQL:

Data Control Language (DCL)

DCL statements control access to data and the database using statements such as GRANT and
REVOKE. A privilege can be granted to a user with the GRANT statement. The
privileges assigned can be SELECT, ALTER, DELETE, EXECUTE, INSERT, INDEX etc. In
addition to granting privileges, you can also revoke them (take them back) by using the REVOKE
command.

The following table gives an overview about the usage of DCL statements in SQL:

In practice, the data definition and data manipulation languages are not two separate languages.
Instead they simply form parts of a single database language such as Structured Query Language
(SQL). SQL represents a combination of DDL and DML, as well as statements for constraint
specification and schema evolution.

Data Definition Language (DDL) statements are used to define the database structure or
schema. Some examples:

* CREATE - create objects in the database
* ALTER - alter the structure of the database
* DROP - delete objects from the database
* TRUNCATE - remove all records from a table and release the space allocated for them
* COMMENT - add comments to the data dictionary
* RENAME - rename an object
Data Manipulation Language (DML) statements are used for managing data within schema
objects. Some examples:

* SELECT - retrieve data from a database
* INSERT - insert data into a table
* UPDATE - update existing data within a table
* DELETE - delete records from a table (all of them, or only those matching a condition); the
space allocated for the records remains
* MERGE - UPSERT operation (insert or update)
* CALL - call a PL/SQL or Java subprogram
* EXPLAIN PLAN - explain the access path to data
* LOCK TABLE - control concurrency

Data Control Language (DCL) statements are used to control access to the data. Some examples:

* GRANT - give users access privileges to the database
* REVOKE - withdraw access privileges given with the GRANT command

Transaction Control Language (TCL) statements are used to manage the changes made by DML
statements. They allow statements to be grouped together into logical transactions.

* COMMIT - save work done
* SAVEPOINT - identify a point in a transaction to which you can later roll back
* ROLLBACK - restore the database to its state as of the last COMMIT
* SET TRANSACTION - change transaction options such as isolation level and which rollback
segment to use
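
The statement categories above can be tried out with a short script. The following is only a sketch, using Python's built-in sqlite3 module; the table name student and its columns are made up for illustration. SQLite has no user accounts, so the DCL statements GRANT and REVOKE are not shown.

```python
import sqlite3

# In-memory database; isolation_level=None lets us issue BEGIN/COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()

# DDL: define and then alter the structure of a (hypothetical) table.
cur.execute("CREATE TABLE student (s_id INTEGER PRIMARY KEY, s_name TEXT NOT NULL)")
cur.execute("ALTER TABLE student ADD COLUMN city TEXT")

# DML + TCL: changes made inside a transaction become permanent on COMMIT.
cur.execute("BEGIN")
cur.execute("INSERT INTO student (s_id, s_name, city) VALUES (401, 'Adam', 'Noida')")
cur.execute("INSERT INTO student (s_id, s_name, city) VALUES (402, 'Alex', 'Panipat')")
cur.execute("COMMIT")

# TCL: rolling back to a SAVEPOINT undoes only the work done after it.
cur.execute("BEGIN")
cur.execute("UPDATE student SET city = 'Delhi' WHERE s_id = 401")
cur.execute("SAVEPOINT before_delete")
cur.execute("DELETE FROM student WHERE s_id = 402")
cur.execute("ROLLBACK TO before_delete")  # the DELETE is undone, the UPDATE is kept
cur.execute("COMMIT")

# DML (query): SELECT retrieves the data.
rows = cur.execute("SELECT s_id, s_name, city FROM student ORDER BY s_id").fetchall()
print(rows)
```

After the script runs, the student with s_id 402 is still present because the DELETE was rolled back, while the UPDATE made before the savepoint was committed.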

Q10. 3-tier Architecture (Three Schema Architecture)


(do from Navathe book with diagram)

Internal level: It is the lowest level of data abstraction that deals with the physical
representation of the database on the computer and thus, is also known as physical level. It
describes how the data is physically stored and organized on the storage medium. At this level,
various aspects are considered to achieve optimal runtime performance and storage space
utilization. These aspects include storage space allocation techniques for data and indexes,
access paths such as indexes, data compression and encryption techniques, and record
placement.

Conceptual level: This level of abstraction deals with the logical structure of the entire database
and thus, is also known as logical level. It describes what data is stored in the database, the
relationships among the data and complete view of the user’s requirements without any concern
for the physical implementation. That is, it hides the complexity of physical storage structures.
The conceptual view is the overall view of the database and it includes all the information that
is going to be represented in the database.

External level: It is the highest level of abstraction that deals with the user’s view of the
database and thus, is also known as view level. In general, most of the users and application
programs do not require the entire data stored in the database. The external level describes a
part of the database for a particular group of users. It permits users to access data in a way that
is customized according to their needs, so that the same data can be seen by different users in
different ways, at the same time. In this way, it provides a powerful and flexible security
mechanism by hiding the parts of the database from certain users, as the user is not aware of
existence of any attributes that are missing from the view.
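
In SQL systems, the external level is commonly realised with views. The sketch below uses Python's sqlite3 module; the employee table, its columns and the view name employee_directory are assumptions made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the full logical table (hypothetical names and data).
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 50000)")
conn.execute("INSERT INTO employee VALUES (2, 'Ravi', 42000)")

# External level: a view that exposes only part of the data to one group of users.
conn.execute("CREATE VIEW employee_directory AS SELECT emp_id, name FROM employee")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

A user who queries only employee_directory never sees the salary attribute, which is the hiding effect described above.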

Q11. Database Schema


A database schema is the skeleton structure that represents the logical view of the entire
database. It defines how the data is organized and how the relations among them are associated.
It formulates all the constraints that are to be applied on the data.

A database schema defines its entities and the relationship among them. It contains a descriptive
detail of the database, which can be depicted by means of schema diagrams. It’s the database
designers who design the schema to help programmers understand the database and make it
useful.
A database schema can be divided broadly into two categories −

 Physical Database Schema − This schema pertains to the actual storage of data and its
form of storage like files, indices, etc. It defines how the data will be stored in a
secondary storage.

 Logical Database Schema − This schema defines all the logical constraints that need to
be applied on the data stored. It defines tables, views, and integrity constraints.

Database Instance
It is important to distinguish between a database schema and a database instance. A database
schema is the skeleton of the database. It is designed before the database exists, and once the
database is operational, it is very difficult to change. A database schema does not contain
any data or information.

A database instance is the state of an operational database, with its data, at a given time. It is a
snapshot of the database. Database instances tend to change with time. A DBMS ensures that
every instance (state) is valid, by diligently enforcing all the validations, constraints,
and conditions that the database designers have imposed.

If a database system is not multi-layered, then it becomes difficult to make any changes in the
database system. Database systems are designed in multi-layers as we learnt earlier.

Q12. Data Independence


Note: (Diagram is the same as for the three schema architecture)

A database system normally contains a lot of data in addition to users' data. For example, it
stores data about data, known as metadata, to locate and retrieve data easily. It is rather difficult
to modify or update a set of metadata once it is stored in the database. But as a DBMS expands,
it needs to change over time to satisfy the requirements of the users. If all the data were
interdependent, making such changes would be a tedious and highly complex job.

Metadata itself follows a layered architecture, so that when we change data at one layer, it does
not affect the data at another level. The layers are independent of, but mapped to, each other.

Logical data independence: It is the ability to change the conceptual schema without affecting
the external schemas or application programs. The conceptual schema may be changed due to
change in constraints or addition of new data item or removal of existing data item, etc., from
the database. The separation of the external level from the conceptual level enables the users to
make changes at the conceptual level without affecting the external level or the application
programs.

Physical data independence: It is the ability to change the internal schema without affecting
the conceptual or external schema. An internal schema may be changed due to several reasons
such as for creating additional access structure, changing the storage structure, etc. The
separation of internal schema from the conceptual schema facilitates physical data
independence.

Logical data independence is more difficult to achieve than the physical data independence
because the application programs are always dependent on the logical structure of the
database. Therefore, the change in the logical structure of the database may require change in
the application programs.
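
Both kinds of independence can be observed on a small scale. The following sketch uses Python's sqlite3 module with a made-up student table, a made-up view student_names and a made-up index name: adding an index is a physical change, and adding a column is a change to the conceptual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (s_id INTEGER PRIMARY KEY, s_name TEXT, city TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(401, "Adam", "Noida"), (402, "Alex", "Panipat")])
conn.execute("CREATE VIEW student_names AS SELECT s_id, s_name FROM student")

query = "SELECT s_name FROM student WHERE city = 'Noida'"
view_query = "SELECT * FROM student_names ORDER BY s_id"
before = conn.execute(query).fetchall()
view_before = conn.execute(view_query).fetchall()

# Physical change: add an access structure. The query text stays the same.
conn.execute("CREATE INDEX idx_student_city ON student (city)")
after_index = conn.execute(query).fetchall()

# Logical change: add a new data item to the conceptual schema.
conn.execute("ALTER TABLE student ADD COLUMN phone TEXT")
view_after = conn.execute(view_query).fetchall()

print(before == after_index, view_before == view_after)
```

The query and the view return the same rows before and after each change. Removing or renaming a column that a view or program uses would not be absorbed so easily, which is why logical data independence is the harder of the two to achieve.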

Q13. ER Diagram
The ER model defines the conceptual view of a database. It works around real-world entities
and the associations among them. At view level, the ER model is considered a good option for
designing databases.

Components of E-R Diagram


The E-R diagram has three main components.
1) Entity

An Entity can be any object, place, person or class. In E-R Diagram, an entity is represented
using rectangles. Consider an example of an Organisation. Employee, Manager, Department,
Product and many more can be taken as entities from an Organisation.

Entities are represented by means of rectangles. Rectangles are named with the entity set they
represent.

Weak Entity

A weak entity is an entity that depends on another entity. A weak entity doesn't have a key
attribute of its own. A double rectangle represents a weak entity.
Symbols and Notations

Attributes (do from Navathe book)
Entities are represented by means of their properties, called attributes. All attributes have
values. For example, a student entity may have name, class, and age as attributes.

There exists a domain or range of values that can be assigned to attributes. For example, a
student's name cannot be a numeric value. It has to be alphabetic. A student's age cannot be
negative, etc.

Types of Attributes
 Simple attribute − Simple attributes are atomic values, which cannot be divided further.
For example, a student's phone number is an atomic value of 10 digits.

 Composite attribute − Composite attributes are made of more than one simple attribute.
For example, a student's complete name may have first_name and last_name.

 Derived attribute − Derived attributes are the attributes that do not exist in the physical
database, but their values are derived from other attributes present in the database. For
example, average_salary in a department should not be saved directly in the database,
instead it can be derived. For another example, age can be derived from date_of_birth.

 Single-value attribute − Single-value attributes contain a single value. For example −
Social_Security_Number.

 Multi-value attribute − Multi-value attributes may contain more than one value. For
example, a person can have more than one phone number, email_address, etc.

These attribute types can be combined as follows (do in detail) −

 simple single-valued attributes

 simple multi-valued attributes

 composite single-valued attributes

 composite multi-valued attributes


2) Attribute

An Attribute describes a property or characteristic of an entity. For example, Name, Age,
Address etc. can be attributes of a Student. An attribute is represented using an ellipse.

Key Attribute

A key attribute represents the main characteristic of an entity. It is used to represent the primary
key. An ellipse with the attribute name underlined represents a key attribute.

Composite Attribute

An attribute can also have its own attributes. Such an attribute is known
as a composite attribute.

Attributes are the properties of entities. Attributes are represented by means of ellipses. Every
ellipse represents one attribute and is directly connected to its entity (rectangle).

If the attributes are composite, they are further divided in a tree-like structure. Every node is
then connected to its attribute. That is, composite attributes are represented by ellipses that are
connected with an ellipse.

Multivalued attributes are depicted by double ellipse.


Derived attributes are depicted by dashed ellipse.

3) Relationship

A Relationship describes relations between entities. Relationship is represented using diamonds.


There are three types of relationship that exist between Entities.

 Binary Relationship
 Recursive Relationship

 Ternary Relationship

Binary Relationship

Binary Relationship means relation between two Entities. This is further divided into four types.

1. One to One : This type of relationship is rarely seen in the real world. For example, one
student can enroll for only one course, and a course will also have only one student. This is
not what you will usually see in a relationship.

2. One to Many : It reflects the business rule that one entity is associated with many instances
of another entity. The example for this relation might sound a little odd, but it means that one
student can enroll in many courses, while one course will have only one student. The arrow in
the diagram marks the "one" side: each course has only one student.

3. Many to One : It reflects the business rule that many entities can be associated with just one
entity. For example, a Student enrolls for only one Course but a Course can have many
Students.

4. Many to Many : Many students can enroll in more than one course, and a course can have
many students.

Recursive Relationship

When an Entity is related with itself it is known as Recursive Relationship.


Ternary Relationship

Relationship of degree three is called Ternary relationship.

Participation Constraints
 Total Participation − Each entity is involved in the relationship. Total participation is
represented by double lines.

 Partial participation − Not all entities are involved in the relationship. Partial
participation is represented by single lines.

The relational data model is the primary data model, used widely around the world for data
storage and processing. This model is simple and has all the properties and capabilities
required to process data with storage efficiency.

Concepts
Tables − In the relational data model, relations are saved in the format of tables. This format stores
the relation among entities. A table has rows and columns, where rows represent records and
columns represent the attributes.

Tuple − A single row of a table, which contains a single record for that relation is called a tuple.

Relation instance − A finite set of tuples in the relational database system represents relation
instance. Relation instances do not have duplicate tuples.

Relation schema − A relation schema describes the relation name (table name), its attributes, and
their names.

Relation key − Each row has one or more attributes, known as the relation key, which can identify
the row in the relation (table) uniquely.

Attribute domain − Every attribute has some pre-defined value scope, known as attribute
domain.

Constraints
Every relation has some conditions that must hold for it to be a valid relation. These conditions
are called Relational Integrity Constraints. There are three main integrity constraints −

 Key constraints
 Domain constraints

 Referential integrity constraints

Key Constraints
There must be at least one minimal subset of attributes in the relation that can identify a tuple
uniquely. This minimal subset of attributes is called a key for that relation. If there is more than
one such minimal subset, these are called candidate keys.

Key constraints force that −


 in a relation with a key attribute, no two tuples can have identical values for key
attributes.

 a key attribute can not have NULL values.

Key constraints are also referred to as Entity Constraints.

Domain Constraints
Attributes have specific values in real-world scenarios. For example, age can only be a positive
integer. The relational model applies the same kind of constraint to the attributes of a relation:
every attribute is bound to a specific range of values. For example, age cannot be less than zero
and telephone numbers cannot contain a digit outside 0-9.

Referential integrity Constraints


Referential integrity constraints work on the concept of Foreign Keys. A foreign key is an
attribute (or set of attributes) of one relation that refers to a key attribute of another relation.

The referential integrity constraint states that if a relation refers to a key attribute of a different or
the same relation, then that key value must exist in the referenced relation.
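
The three kinds of integrity constraint can be seen in action with a small script. This is a sketch using Python's sqlite3 module; the course and student tables and their columns are made up for illustration, and the PRAGMA line is specific to SQLite, which does not enforce foreign keys unless asked to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE course (c_id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE student (
        s_id  INTEGER PRIMARY KEY,               -- key constraint
        age   INTEGER CHECK (age >= 0),          -- domain constraint
        c_id  INTEGER REFERENCES course (c_id)   -- referential integrity constraint
    )
""")
conn.execute("INSERT INTO course VALUES (10, 'DBMS')")
conn.execute("INSERT INTO student VALUES (401, 19, 10)")

def is_rejected(sql):
    """Return True if the DBMS refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_key = is_rejected("INSERT INTO student VALUES (401, 20, 10)")   # key already used
negative_age = is_rejected("INSERT INTO student VALUES (402, -5, 10)")    # outside the domain
missing_course = is_rejected("INSERT INTO student VALUES (403, 18, 99)")  # no such course
print(duplicate_key, negative_age, missing_course)
```

Each of the three INSERT statements violates exactly one constraint, and each is refused by the DBMS.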

Database Keys
Keys are a very important part of a relational database. They are used to establish and identify
relations between tables. They also ensure that each record within a table can be uniquely
identified by a combination of one or more fields within the table.

Super Key

A super key is defined as a set of attributes within a table that uniquely identifies each record
within the table. Every super key is a superset of some candidate key.

Candidate Key

Candidate keys are defined as the set of fields from which the primary key can be selected. A
candidate key is an attribute or set of attributes that can act as a primary key for a table to
uniquely identify each record in that table.

Primary Key

The primary key is the candidate key that is most appropriate to become the main key of the table.
It is a key that uniquely identifies each record in a table.

Composite Key

A key that consists of two or more attributes that together uniquely identify an entity occurrence is
called a composite key. No single attribute that makes up the composite key is a key on
its own.

Secondary or Alternative key

The candidate keys which are not selected as the primary key are known as secondary keys or
alternative keys.

Non-key Attribute

Non-key attributes are attributes other than candidate key attributes in a table.

Non-prime Attribute

Non-prime attributes are attributes that are not part of the primary key (more precisely, not part
of any candidate key).
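
The difference between a super key and a candidate key (a minimal super key) can be checked mechanically on sample data. The sketch below is plain Python; the rows and attribute names are invented for illustration. A real key is a property of the schema, so a check on one instance can only show that a set of attributes is not a key.

```python
from itertools import combinations

# Hypothetical sample instance of a student table.
rows = [
    {"s_id": 401, "email": "adam@example.com", "s_name": "Adam", "city": "Noida"},
    {"s_id": 402, "email": "alex@example.com", "s_name": "Alex", "city": "Panipat"},
    {"s_id": 403, "email": "adam2@example.com", "s_name": "Adam", "city": "Noida"},
]

def is_superkey(attrs, rows):
    """True if no two rows agree on all attributes in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Minimal superkeys of this instance, smallest first."""
    attributes = sorted(rows[0])
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(k) <= set(combo) for k in keys):
                continue
            if is_superkey(combo, rows):
                keys.append(combo)
    return keys

print(candidate_keys(rows))
```

Here (s_id, s_name) is a super key but not a candidate key, because s_id alone already identifies every row. If s_id is chosen as the primary key, email becomes a secondary (alternative) key.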


[Figure: E-R diagram for a Hospital Management System]

Normalization of Database
Database normalization is a technique of organizing the data in the database. Normalization is a
systematic approach of decomposing tables to eliminate data redundancy and undesirable
characteristics like insertion, update and deletion anomalies. It is a multi-step process that puts
data into tabular form by removing duplicated data from the relation tables.

Normalization is used mainly for two purposes,

 Eliminating redundant (useless) data.

 Ensuring data dependencies make sense, i.e. data is logically stored.


Problem Without Normalization

Without normalization, it becomes difficult to handle and update the database without facing data
loss. Insertion, update and deletion anomalies are very frequent if the database is not normalized. To
understand these anomalies let us take an example of a Student table.

S_id   S_Name   S_Address   Subject_opted
401    Adam     Noida       Bio
402    Alex     Panipat     Maths
403    Stuart   Jammu       Maths
404    Adam     Noida       Physics

 Update Anomaly : To update the address of a student who occurs twice or more than twice in
a table, we will have to update the S_Address column in all the rows, else the data will become
inconsistent.

 Insertion Anomaly : Suppose for a new admission we have the student id (S_id), name and
address of a student, but the student has not opted for any subjects yet. Then we have to
insert NULL there, leading to an insertion anomaly.

 Deletion Anomaly : If (S_id) 401 has only one subject and temporarily drops it, then when we
delete that row, the entire student record will be deleted along with it.

Normalization Rule

Normalization rules are divided into the following normal forms.

1. First Normal Form

2. Second Normal Form

3. Third Normal Form

4. BCNF

Functional Dependency
Functional dependency (FD) is a constraint between two sets of attributes in
a relation. A functional dependency says that if two tuples have the same values
for attributes A1, A2,..., An, then those two tuples must also have the same
values for attributes B1, B2, ..., Bn.

Functional dependency is represented by an arrow sign (→), that is, X→Y,
where X functionally determines Y. The left-hand side attributes determine
the values of attributes on the right-hand side.

Armstrong's Axioms
If F is a set of functional dependencies then the closure of F, denoted as F+,
is the set of all functional dependencies logically implied by F. Armstrong's
Axioms are a set of rules that, when applied repeatedly, generate the closure
of a set of functional dependencies.

 Reflexive rule − If alpha is a set of attributes and beta is a subset of alpha, then
alpha → beta holds.

 Augmentation rule − If a → b holds and y is a set of attributes, then ay → by also
holds. That is, adding the same attributes to both sides of a dependency does not change
the basic dependency.

 Transitivity rule − Same as the transitive rule in algebra: if a → b holds and b → c
holds, then a → c also holds. Here a → b is read as "a functionally determines b".
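
Listing all of F+ is rarely practical. A related and commonly used technique is to compute the closure of a set of attributes X, written X+, since X → Y is in F+ exactly when Y is a subset of X+. The sketch below is plain Python; the function names and the sample dependencies are chosen only for illustration.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    closure = set(attrs)  # reflexive rule: attrs determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the closure already contains the left side, the axioms let
            # us add the right side as well.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

def implies(fds, lhs, rhs):
    """True if the FD lhs -> rhs follows from fds, i.e. it is in F+."""
    return set(rhs) <= attribute_closure(lhs, fds)

# Sample FDs (illustrative): Stu_ID -> Stu_Name, Zip and Zip -> City.
fds = [({"Stu_ID"}, {"Stu_Name", "Zip"}), ({"Zip"}, {"City"})]
print(sorted(attribute_closure({"Stu_ID"}, fds)))
print(implies(fds, {"Stu_ID"}, {"City"}), implies(fds, {"Zip"}, {"Stu_Name"}))
```

Stu_ID → City follows by transitivity, while Zip → Stu_Name does not follow from the given dependencies.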

Trivial Functional Dependency

 Trivial − If a functional dependency (FD) X → Y holds, where Y is a subset of X,
then it is called a trivial FD. Trivial FDs always hold.

 Non-trivial − If an FD X → Y holds, where Y is not a subset of X, then it is called
a non-trivial FD.

 Completely non-trivial − If an FD X → Y holds, where X ∩ Y = ∅, it is
said to be a completely non-trivial FD.
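
The three cases translate directly into set operations. A small illustrative sketch in Python follows; the function name classify_fd is made up, and an FD is reported under the most specific category that applies.

```python
def classify_fd(lhs, rhs):
    """Classify the FD lhs -> rhs by comparing its two attribute sets."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"                  # Y is a subset of X
    if lhs & rhs:
        return "non-trivial"              # Y is not a subset of X, but they overlap
    return "completely non-trivial"       # X and Y share no attribute

print(classify_fd({"Stu_ID", "Stu_Name"}, {"Stu_Name"}))
print(classify_fd({"Stu_ID", "Zip"}, {"Zip", "City"}))
print(classify_fd({"Zip"}, {"City"}))
```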

First Normal Form


First Normal Form is defined in the definition of relations (tables) itself. This
rule defines that all the attributes in a relation must have atomic domains.
The values in an atomic domain are indivisible units.

To convert a relation (table) to First Normal Form, we re-arrange it so that
each attribute contains only a single value from its pre-defined domain.

Second Normal Form


Before we learn about the second normal form, we need to understand the
following −

 Prime attribute − An attribute that is a part of the prime key (a candidate key
of the relation) is known as a prime attribute.

 Non-prime attribute − An attribute that is not a part of the prime key is
said to be a non-prime attribute.

If we follow second normal form, then every non-prime attribute should be
fully functionally dependent on the prime key. That is, if X → A holds,
then there should not be any proper subset Y of X for which Y → A also
holds true.

In the Student_Project relation, the prime key attributes are
Stu_ID and Proj_ID. According to the rule, the non-key attributes, i.e. Stu_Name
and Proj_Name, must be dependent upon both and not on any of the prime
key attributes individually. But we find that Stu_Name can be identified by
Stu_ID and Proj_Name can be identified by Proj_ID independently. This is
called partial dependency, which is not allowed in Second Normal Form.

To remove it, we break the relation in two, so that there
exists no partial dependency.
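
A partial dependency can be detected by checking whether a proper subset of the key already determines a non-prime attribute. The sketch below is plain Python; the helper names are illustrative, and the dependencies are the ones stated above for Student_Project.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes functionally determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(key, non_prime, fds):
    """List (subset_of_key, attribute) pairs that violate Second Normal Form."""
    violations = []
    key = sorted(key)
    for size in range(1, len(key)):  # proper, non-empty subsets of the key only
        for subset in combinations(key, size):
            for attr in sorted(non_prime):
                if attr in closure(subset, fds):
                    violations.append((subset, attr))
    return violations

key = {"Stu_ID", "Proj_ID"}
non_prime = {"Stu_Name", "Proj_Name"}
fds = [({"Stu_ID"}, {"Stu_Name"}), ({"Proj_ID"}, {"Proj_Name"})]
print(partial_dependencies(key, non_prime, fds))
```

Both non-prime attributes are reported, each depending on only one half of the key, which is why the relation has to be decomposed.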

Third Normal Form


For a relation to be in Third Normal Form, it must be in Second Normal Form
and the following must hold −

 No non-prime attribute is transitively dependent on a prime key attribute.

 Equivalently, for any non-trivial functional dependency X → A, either −

o X is a superkey, or

o A is a prime attribute.

In the Student_detail relation, Stu_ID is the key and the only
prime key attribute. We find that City can be identified by Stu_ID as well as by
Zip itself. Neither is Zip a superkey nor is City a prime attribute.
Additionally, Stu_ID → Zip → City, so there exists a transitive dependency.

To bring this relation into third normal form, we break it into two
relations: Student_Detail (Stu_ID, Stu_Name, Zip) and ZipCodes (Zip, City).
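
The effect of this decomposition can be checked with a few rows of sample data. The sketch below uses Python's sqlite3 module; the rows, zip codes and lower-case table names are invented for illustration. It shows that joining the two smaller tables on Zip gives back the original rows, while each city is stored only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE student_detail_old
                (stu_id INTEGER PRIMARY KEY, stu_name TEXT, zip TEXT, city TEXT)""")
conn.executemany("INSERT INTO student_detail_old VALUES (?, ?, ?, ?)", [
    (1, "Adam", "201301", "Noida"),
    (2, "Alex", "132103", "Panipat"),
    (3, "Stuart", "201301", "Noida"),
])

# Decompose: City moves to a table keyed by Zip, removing Stu_ID -> Zip -> City.
conn.execute("CREATE TABLE zip_codes (zip TEXT PRIMARY KEY, city TEXT)")
conn.execute("""CREATE TABLE student_detail
                (stu_id INTEGER PRIMARY KEY, stu_name TEXT,
                 zip TEXT REFERENCES zip_codes (zip))""")
conn.execute("INSERT INTO zip_codes SELECT DISTINCT zip, city FROM student_detail_old")
conn.execute("INSERT INTO student_detail SELECT stu_id, stu_name, zip FROM student_detail_old")

original = conn.execute("SELECT * FROM student_detail_old ORDER BY stu_id").fetchall()
rejoined = conn.execute("""
    SELECT s.stu_id, s.stu_name, s.zip, z.city
    FROM student_detail AS s JOIN zip_codes AS z ON s.zip = z.zip
    ORDER BY s.stu_id
""").fetchall()
zip_rows = conn.execute("SELECT COUNT(*) FROM zip_codes").fetchone()[0]
print(original == rejoined, zip_rows)
```

Three student rows share two zip codes, so zip_codes holds two rows and the city name for a zip needs to be updated in only one place.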

Boyce-Codd Normal Form


Boyce-Codd Normal Form (BCNF) is a stricter version of Third Normal Form.
BCNF states that −

 For any non-trivial functional dependency, X → A, X must be a super-key.

In the decomposed relations, Stu_ID is the super-key in the relation Student_Detail
and Zip is the super-key in the relation ZipCodes. So,

Stu_ID → Stu_Name, Zip

and

Zip → City

which confirms that both the relations are in BCNF.
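
The BCNF condition can also be tested mechanically: for every non-trivial FD, the closure of its left-hand side must contain every attribute of the relation. The sketch below is plain Python with illustrative helper names; the attribute sets follow the Student_Detail and ZipCodes relations described above.

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose left side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial FDs never violate BCNF
        if closure(lhs, fds) != set(attributes):
            violations.append((lhs, rhs))
    return violations

student_detail = {"Stu_ID", "Stu_Name", "Zip"}
student_fds = [({"Stu_ID"}, {"Stu_Name", "Zip"})]
zip_codes = {"Zip", "City"}
zip_fds = [({"Zip"}, {"City"})]

# The relation before decomposition, for comparison.
undecomposed = student_detail | zip_codes
undecomposed_fds = student_fds + zip_fds

print(bcnf_violations(student_detail, student_fds))
print(bcnf_violations(zip_codes, zip_fds))
print(bcnf_violations(undecomposed, undecomposed_fds))
```

The two decomposed relations report no violations, while the undecomposed relation reports Zip → City, because Zip alone does not determine Stu_ID or Stu_Name.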
