Relational Database Management Systems by N.P. Singh

The document provides an overview of database systems, defining databases as organized collections of data used for storing, managing, and retrieving information. It discusses the types of databases, including operational and analytical databases, and highlights the functions and advantages of Database Management Systems (DBMS), such as data independence, integrity, and security. Additionally, it addresses the roles of various database users, particularly the Database Administrator (DBA), who is responsible for managing database access and maintenance.

Chapter 1

DATABASE SYSTEMS

1.1 INTRODUCTION TO DATABASE SYSTEMS


Databases offer a convenient and powerful way to organize information.
Databases are not specific to computers. Examples of non-computerised databases are
phone books, dictionaries, almanacs, etc. Databases are designed to offer an organized
mechanism for storing, managing and retrieving information. Databases are probably
one of the most common uses of computers, and are available on just about every type
of computer. There are three main components to any database application:
• A method for entering or editing data, usually data entry screens or import functions
• A data storage mechanism, that is, a way of storing the data on the computer
• A report generator to extract and interpret information from the stored data
1.1.2 Data
Data is a collection of raw, unorganized facts and figures. It simply exists and has
no significance beyond its existence. It can exist in any form, usable or not, and it
always needs to be processed before it becomes useful. Raw data may be a collection of
numbers, characters or images, and it can be acquired from many different sources. Data
are typically the results of measurements and can be the basis of symbols, graphs,
images, or observations of a set of variables. For example, the ages, heights, marks and
blood groups of students are generally considered “data”.
1.1.3 Information
Valuable or useful data is called information. After data has been processed, organized
and presented in a given situation so as to make it useful, it becomes information. For
example, the marks or height of a particular student may be considered “information”.
Let us take another example. Suppose that you want to know how you’re doing in a
particular course. You have taken two 10-question multiple-choice tests. On the first test,
you got questions 1, 3, and 4 wrong; on the second test, you did worse, missing items 3, 5,
6, and 9. The items that you got wrong are merely data: unprocessed facts. What’s
important is your total score. You scored 70% on the first exam and 60% on the second.
These two numbers constitute information: data that have been processed, or turned into
some useful form.
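The step from data to information in the test example can be sketched in a few lines of Python. This is only an illustration; the function name percent_score and the variable names are assumptions made for this sketch, not part of the original text.

```python
def percent_score(wrong_items, total_questions=10):
    """Turn the raw list of missed questions (data) into a percentage (information)."""
    correct = total_questions - len(set(wrong_items))
    return 100 * correct // total_questions


# Raw data: the question numbers answered incorrectly on each test.
first_test_wrong = [1, 3, 4]
second_test_wrong = [5, 3, 6, 9]

# Information: the processed scores.
scores = [percent_score(first_test_wrong), percent_score(second_test_wrong)]
print(scores)  # [70, 60]
```

The lists of missed questions say little on their own; the two percentages are what a student would actually act on.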
1.2 CONCEPT OF A DATABASE
The simplest definition of a database is a collection of data items stored for later retrieval.
A database is a collection of information, preferably related and organized.
In other words, a database is an organized collection of data used for the purpose of
modeling some type of organization or organizational process. It really doesn’t matter
whether we are using paper or a computer software program to collect and store the data.
As long as we are gathering data in some organized manner for a specific purpose, we have
a database. The structure is achieved by organizing the data according to a database
model. The model that is most commonly used today is the relational model. A database
model describes the logical structure of the data; a database, on the other hand, is the
implementation of that model as a physical database on a computer. A database model is
used to create a database.
1.2.1 Types of Databases
There are two types of databases found in database management: operational databases
and analytical databases.
Operational databases are the backbone of many organizations, companies, universities
and institutions throughout the world today. This type of database is primarily used in
on-line transaction processing (OLTP) scenarios, where data is collected, modified and
maintained on a daily basis. An operational database is dynamic in nature,
meaning that it changes constantly and always reflects up-to-the-minute information.
Organizations such as retail stores, stock markets, banks, railways, airlines,
manufacturing companies, hospitals, and publishing houses use operational databases
because their data is in a constant state of change. Traditionally, data elements stored in
databases have been simple. Transactions have been of short duration and have often needed
to access only a few data items. A typical example is a college database that maintains
information concerning students, courses, and marks in a college environment.
For example, a university or college database might contain information about the
following:
• Entities such as students, faculty, and courses.
• Relationships between entities, such as students’ enrollment in courses and faculty
teaching courses.
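As a rough sketch of how such entities and relationships might be stored, the following Python example uses the standard sqlite3 module. The table names, column names and sample rows are assumptions chosen for illustration only; a real college database would be considerably larger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Entities: one table per kind of thing we keep data about.
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Relationship: "student enrolls in course" gets its own table.
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course_id  INTEGER NOT NULL REFERENCES course(course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# Follow the relationship to list which student takes which course.
rows = conn.execute("""
    SELECT s.name, c.title
    FROM enrollment e
    JOIN student s ON s.student_id = e.student_id
    JOIN course  c ON c.course_id  = e.course_id
    ORDER BY s.name, c.title
""").fetchall()
print(rows)
```

Keeping the relationship in a separate table lets one student take many courses and one course have many students without repeating student or course details.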
Analytical databases are primarily used in on-line analytical processing (OLAP) scenarios,
where there is a need to store and track historical and time-dependent data. An analytical
database is a valuable asset when there is a need to track trends, view statistical data over
a long period of time, and make tactical or strategic business projections. This type of
database stores static data, meaning that the data is rarely, if ever, modified. The information
gleaned from an analytical database reflects a point-in-time snapshot of the data. Chemical
labs, geological companies, and marketing-analysis firms are examples of organizations
that use analytical databases.
1.2.2 Characteristics of the database approach
• Data abstraction: A data model is used to hide storage details and present the users
with a conceptual view of the database. Programs refer to the data model constructs
rather than data storage details.
• Support of multiple views of the data: Each user may see a different view of the
database, which describes only the data of interest to that user.
• Sharing of data and multi-user transaction processing: A set of concurrent
users is allowed to retrieve from and to update the database. Concurrency control within
the DBMS guarantees that each transaction is correctly executed or aborted. The recovery
subsystem ensures that each completed transaction has its effect permanently
recorded in the database.
OLTP (Online Transaction Processing) is a major part of database applications. Multi-user
transaction processing allows hundreds of concurrent transactions to execute per second.
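The idea that a transaction is either correctly executed or aborted can be illustrated with a small Python sketch using the standard sqlite3 module. The account table, the transfer function and the amounts are hypothetical, and a single in-memory connection does not demonstrate concurrency control between users; the sketch only shows the all-or-nothing behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money in one transaction: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "A", "B", 30)        # succeeds
failed = transfer(conn, "A", "B", 500)   # would overdraw A, so it is aborted
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

In the aborted transfer the credit to B has already been applied when the debit to A fails, and the rollback removes it again, so the database never shows a half-finished transfer.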
1.3 DATABASE MANAGEMENT SYSTEM
A database management system (DBMS) is a software system that is used to define, create,
and maintain databases. It assists in maintaining and utilizing large collections of data and
also provides various levels of access control to users. It allows users to communicate with
databases. In other words, we can say that it is a mediator between users and databases.
A DBMS may include utility software such as database design tools, report writers and
application development tools. Common examples of DBMSs are Oracle,
Access, SQL Server, Sybase, DB2, FoxPro, dBase, etc.
A DBMS presents a logical view of the data to the users. How this data is stored and
retrieved is hidden from the users. A DBMS ensures that the data is consistent across the
database and controls who can access what data.
DBMS software is installed on a computer called the database server, which holds the
databases. The configuration of the server computer (such as processing capability and
storage capacity), the size of the organization and the volume of data determine whether it
is a single-user or multi-user system. In a single-user system only one user is allowed to
access the database at a time and the whole database resides on a single computer. The user
performs various tasks such as designing databases, maintaining the database, writing
application programs for the system and other related activities.
In large organizations, the number of users is larger and the volume of data is high. This
large amount of data is difficult for a single user to manage. In this case data is
integrated and shared. Data is kept on various database servers, and with the help of
computer networks data can be transmitted from one server to another. For example, the
examination department and the accounts department of a college or university may both need
the address of a student, so the same data is shared by two departments. It is the
responsibility of the DBMS to provide data to various users at the same time and to ensure
that no two users can modify the same piece of data at the same time. It may be that users
in the accounts department cannot view examination-related data. The departments share
data but may have different levels of access to the databases. Some users may only have
access to view databases; others may also have access to modify them. These permissions to
access databases are called privileges.
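One common way to give two departments different windows onto the same shared data is to define a view for each. The Python sketch below uses the standard sqlite3 module; the table, view and column names are assumptions made for illustration. SQLite has no user accounts and no GRANT or REVOKE statements, so this sketch only shows how different columns can be exposed to different groups; in a multi-user DBMS the privileges themselves would be assigned separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY, name TEXT, address TEXT, marks INTEGER, fee_due INTEGER
    );
    INSERT INTO student VALUES (1, 'Asha', 'Delhi', 82, 0), (2, 'Ravi', 'Pune', 67, 1500);
    -- Each view exposes only the columns one department needs.
    CREATE VIEW exam_view     AS SELECT roll_no, name, address, marks   FROM student;
    CREATE VIEW accounts_view AS SELECT roll_no, name, address, fee_due FROM student;
""")


def column_names(conn, relation):
    """Return the column names visible through a table or view."""
    cursor = conn.execute(f"SELECT * FROM {relation}")
    return [col[0] for col in cursor.description]


exam_cols = column_names(conn, "exam_view")
accounts_cols = column_names(conn, "accounts_view")
print(exam_cols)
print(accounts_cols)
```

Both views share the address column, which is stored only once in the underlying table, while marks and fee details each appear in only one view.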
A second type of data, which is also very important and holds information about the data,
is called metadata or the data dictionary. The data dictionary keeps all kinds of
information about users, the internal structure of databases, privileges and rights, etc.
A database system may be distributed or centralized. Earlier, mainframes or minicomputers
were used to keep centralized databases. This was called a single-tier system. Terminals
were used to access databases through the DBMS in a single-tier system.
A two-tier system is an example of distributed computing in which different software is
required for the server machine and for the client machine. This system is called a
client/server database system. A client-server environment was common in the pre-Internet
days, when a transactional database serviced users within a single company. The number of
users could range from as few as one to thousands, depending on the size of the company.
The critical factor was actually a mixture of both individual record change activity and
modestly sized reports. Client-server database models typically catered for low concurrency
and low throughput at the same time because the number of users was always manageable.
In a three-tier system, special software called middleware is used to connect a client of
one DBMS to another DBMS. In a distributed computing system there may be a number of
DBMSs installed on different server machines. A distributed database system allows
applications to access data from local and remote databases. It is of two types: homogeneous
and heterogeneous distributed database systems. A homogeneous distributed database
system is a network of two or more databases of the same type that reside on one or more
machines. In a heterogeneous distributed database system, at least one of the databases is
of a different type. Data is distributed geographically and resides near where it will be
used. For example, a banking system with branches in several cities may store customer
data at each branch. In this decentralized database system, it is the job of the DBMS to
provide customer information from any location while hiding all the details of how the data
is searched. Another advantage of this system is that if one server goes down, users can
still find information from the others.
1.3.1 Functions of DBMS
• A DBMS frees programmers from the need to worry about the organization and location
of the data, i.e. it shields the users from complex hardware-level details.
• A DBMS can organize, process and present data elements from the database. This
capability enables decision makers to search and query database contents in order to
extract answers that are not available in regular reports.
• Programming is sped up because the programmer can concentrate on the logic of the
application.
• It includes special user-friendly query languages which are easy for non-programming
users of the system to understand.
1.3.2 Advantages of database systems
Using a DBMS to manage data has many advantages:
1. Data independence: Application programs should be as independent as possible from
details of data storage and representation. A DBMS provides an abstract view of the data
to insulate application code from such details.
2. Data administration: Using DBMS software, it becomes easy to manage a database.
Centralizing the administration of data can improve the performance of the database system
when several users share the data.
3. Data integrity and security: A DBMS can provide security to databases by assigning
privileges to different users. It provides authorization or access controls to different
classes of users to perform different operations on databases, such as creation,
modification, deletion and updating of data. A DBMS can also enforce integrity constraints
on the data. For example, before inserting a student’s marks into the database, the DBMS
can verify whether the student is on the roll or not. In conventional systems, data
is duplicated in multiple files, so updates or changes may sometimes lead to incorrect
data remaining in some of the files where it exists.
4. Flexibility: Since changes are often necessary to the contents of the data stored in
any system, these changes are made more easily in a centralized database than in a
conventional system. Because programs and data are independent, programs do not have
to be modified when types of unrelated data are added to or deleted from the database, or
when physical storage changes.
5. Fast response to information requests: Because data are integrated into a single
database, complex requests can be handled much more rapidly than if the data were
located in separate, non-integrated files. In many businesses, faster response means better
customer service.
6. Efficient data and multiple accesses: A DBMS uses a variety of techniques to store
and retrieve data efficiently. A DBMS allows data to be accessed in a variety of ways (such
as through various key fields) and often by using several programming languages (both
3GL and nonprocedural 4GL programs).
7. Lower user training costs: Users often find it easier to learn such systems, and
training costs may be reduced. Also, the total time taken to process requests may be
shorter, which would increase user productivity.
8. Less storage: Theoretically, all occurrences of data items need be stored only once,
thereby eliminating the storage of redundant data. System developers and database
designers often use data normalization to minimize data redundancy.
9. Standards can be enforced: Since all access to the database must be through the
DBMS, standards are easier to enforce. Standards may relate to the naming of data, the
format of data, the structure of the data, etc. Standardizing stored data formats is usually
desirable for the purpose of data interchange or migration between systems.
10. Controlling data redundancy: In the conventional file processing system, every
user group maintains its own files for handling its data. This may lead to:
• Duplication of the same data in different files.
• Wastage of storage space, since duplicated data is stored.
• Errors generated by updating the same data in different files.
• Time wasted in entering data again and again.
• Needless use of computer resources.
• Difficulty in combining information.
11. Elimination of inconsistency: In the file processing system, information is
duplicated throughout the system, so changes made in one file may need to be
carried over to another file. If they are not, the data becomes inconsistent. So we need to
remove this duplication of data in multiple files to eliminate inconsistency.
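The integrity check mentioned under data integrity and security, where marks are accepted only for a student who is on the roll, can be sketched with a foreign key constraint. The example below uses Python's standard sqlite3 module; the table and column names and the helper insert_marks are assumptions made for illustration. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE marks (
        roll_no INTEGER NOT NULL REFERENCES student(roll_no),
        subject TEXT NOT NULL,
        score   INTEGER NOT NULL CHECK (score BETWEEN 0 AND 100)
    );
    INSERT INTO student VALUES (1, 'Asha');
""")


def insert_marks(conn, roll_no, subject, score):
    """Return True if the DBMS accepted the row, False if a constraint rejected it."""
    try:
        with conn:
            conn.execute("INSERT INTO marks VALUES (?, ?, ?)", (roll_no, subject, score))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = insert_marks(conn, 1, "Databases", 78)   # student 1 is on the roll
rejected = insert_marks(conn, 99, "Databases", 65)  # no student with roll number 99
stored = conn.execute("SELECT COUNT(*) FROM marks").fetchone()[0]
print(accepted, rejected, stored)
```

Because the rule lives in the database rather than in each application program, every program that inserts marks is subject to the same check.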
1.3.3 Disadvantages of database systems
Because of the larger number of users accessing the data when a database is used, the
enterprise may face additional risks as compared to a conventional data processing
system in the following areas.
1. Complexity: The supply and operation of a database management system with
several users and databases is quite costly and demanding. A DBMS is a complex piece of
software, optimized for certain kinds of workloads (e.g., answering complex queries or
handling many concurrent requests), and its performance may not be adequate for certain
specialized applications.
2. Data quality: Since the database is accessible to users remotely, adequate controls
are needed over users updating data and over data quality. With an increased
number of users accessing data directly, there are many more opportunities for users to
damage the data. Unless there are suitable controls, the data quality may be compromised.
3. Cost of hardware and software: A computer with high data processing speed and
a large memory is required to run the DBMS software. This means that you have to
upgrade the hardware used for the file-based system. Similarly, DBMS software is also very
costly.
4. Cost of data conversion: When a computer file-based system is replaced with a
database system, the data stored in data files must be converted to database files. It is
difficult and costly to convert data from data files into a database. You have to hire
database and system designers along with application programmers. Alternatively, you
have to take the services of a software house. So a lot of money has to be paid for
developing software.
5. Cost of staff training: Most DBMSs are complex systems, so users need training
to use the DBMS. Training is required at all levels, including
programming, application development, and database administration. The organization has
to pay a large amount for the training of staff to run the DBMS.
6. Database damage: In most organizations, all data is integrated into a single
database. If the database is damaged due to a power failure or is corrupted on the
storage media, then valuable data may be lost forever.
1.4 CLASSIFICATION OF DBMS USERS
People may be direct or indirect DBMS users. They operate in various roles requiring
various levels of training. They can access and retrieve data from databases as required.
The following is a list of a few types of database users:
• Database Administrator (DBA)
• Database Designers
• End Users
• Application Programmers
1.4.1 Database Administrator (DBA)
The DBA is a person or a group of persons responsible for the supervision of the
database. In any organization the DBA plays a very crucial role. The DBA is responsible for
giving users permission to access the database by granting and revoking privileges. The DBA
is also responsible for coordinating and monitoring its use, managing backups, repairing
damage due to hardware and/or software failures, and acquiring hardware and software
resources as needed. In a small organization the role of DBA is performed by a
single person, and in large organizations there is a group of DBAs who share
responsibilities.
For example, in a university or college, it is the job of the DBA to make sure that the DBMS
makes the correct address of students or staff available from one central store. Some of the
roles of the DBA may include:
Installation, upgrading and configuration of software and hardware: It is the job of the
DBA to install DBMS software, application software, and other software related to DBMS
administration. The DBA tests new software before it is moved into a production
environment. The DBA also configures hardware and software.
Security administration: In client/server or distributed computing, where a number of
users share databases, security is a very important issue. The DBA monitors and administers
DBMS security. The DBA is also responsible for assigning users to databases and determining
the proper security level for each user. Within each database, the DBA is responsible for
assigning permissions to the various database objects such as tables, views, and stored
procedures.
Database design: The DBA is often involved at the preliminary database-design stages.
The DBA is familiar with the DBMS and the system, and so can help other
users, such as the development team, with special performance considerations and can point
out potential problems.
Data analysis: It is also very important to analyze stored data to increase the performance
and efficiency of the system. The DBA is called on to perform this data analysis.
Data modeling and optimization: By modeling the data, it is possible to optimize the
system layout to take the most advantage of the I/O subsystem. The system administrator
performs periodic maintenance. The DBA also understands and implements schemas (the layout
of the database).
Performing backup and recovery: Backup and recovery are the DBA’s most critical
tasks; they include the following aspects:
• Establishing standards and schedules for database backups
• Developing recovery procedures for each database
• Making sure that the backup schedules meet the recovery requirements
Establishing and enforcing standards: The DBA should establish naming
conventions and standards for the DBMS server and databases and make sure that
everyone sticks to them.
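Of the roles listed above, backup and recovery is the easiest to illustrate in code. The sketch below uses Python's standard sqlite3 module and its Connection.backup method (available from Python 3.7). The table and the simulated damage are assumptions made for illustration, and both databases are kept in memory only so that the sketch is self-contained; a real backup would be written to separate storage on a schedule.

```python
import sqlite3

# Hypothetical production database.
live = sqlite3.connect(":memory:")
live.executescript("""
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
""")

# Backup: copy the whole live database into a second database.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Simulate damage to the live database.
live.execute("DELETE FROM student")
live.commit()
damaged_count = live.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# Recovery: copy the backup over the damaged database.
backup.backup(live)
restored = live.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()
print(damaged_count, restored)
```

A recovery procedure is only as good as the most recent backup, which is why the backup schedule has to be matched to the recovery requirements.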
1.4.2 Database Designers
They are responsible for identifying the data to be stored in the database and for choosing
appropriate structures to represent and store the data. It is the responsibility of database
designers to communicate with all prospective database users in order to understand
their requirements, so that they can create a design that meets those requirements. The
database designer is responsible for defining the detailed database design, including tables,
indexes, views, constraints, triggers, stored procedures, and other database-specific
constructs needed to store, retrieve, and delete objects. Database designers also
identify the data (entities and attributes), the relationships between the data and the
constraints on the data. The designer maps the logical database design into a set of tables
and integrity constraints, and designs any security measures required on the data.
1.4.3 End Users
End users are the people who interact with the database through applications or utilities.
The various categories of end users are:
Casual End Users - These users occasionally access the database but may need different
information each time. They use a sophisticated database query language to specify their
requests. For example, high-level managers who access the data weekly or biweekly.
Naive End Users - These users frequently query and update the database using standard
types of queries. The operations that can be performed by this class of users are very
limited and affect a precise portion of the database.
For example, reservation clerks for airlines or hotels check availability for a given request
and make reservations. Persons using Automated Teller Machines (ATMs) also fall under
this category, as they have access to a limited portion of the database.
Standalone End Users/On-line End Users - Those end users who interact with the
database directly via an on-line terminal or indirectly through menu-based or graphics-based
interfaces. For example, a user of a text package, or of library management software that
stores a variety of library data such as the issue and return of books for calculating fines.
1.4.4 Application Programmers
As we know, computer programs are composed of three parts:
a. input
b. processing
c. output
A programmer is a person who writes, tests, debugs, and maintains the detailed
instructions, called computer programs. There are mainly two types of computer
programmers: system programmers and application programmers. System programmers write
programs to manage and maintain computer systems software, such as operating systems
and utility software. Application software is written according to the requirements of the
users or organizations. The programmer converts the software design into a logical series of
instructions that the computer can follow. Application programmers write programs to
handle a specific job, such as a program for railway reservation, hospital management,
banking, inventory control, etc. Different programming languages are used depending on
the use of the program. Database programmers write programs to access data and perform
calculations. These programs could be written in general-purpose programming languages
such as Visual Basic, Developer, C, FORTRAN, COBOL, etc. to manipulate the database.
These application programs operate on the data to perform various operations such as
retrieving information, creating new information, and deleting or changing existing
information.
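The four kinds of operation an application program performs on data can be sketched as follows. Python and its standard sqlite3 module are used here purely for illustration in place of the languages named above, and the book table with its sample rows is an assumption of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT, copies INTEGER)")

# Creating new information
conn.executemany(
    "INSERT INTO book (book_id, title, copies) VALUES (?, ?, ?)",
    [(1, "Database Systems", 3), (2, "Operating Systems", 2), (3, "Networks", 1)],
)
# Changing existing information
conn.execute("UPDATE book SET copies = copies + 2 WHERE book_id = ?", (1,))
# Deleting information
conn.execute("DELETE FROM book WHERE book_id = ?", (3,))
conn.commit()

# Retrieving information
books = conn.execute("SELECT book_id, title, copies FROM book ORDER BY book_id").fetchall()
print(books)
```

The program states what should happen to the data; how the rows are located and stored is left to the DBMS.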
For example, these days Visual Basic is a very popular language for front-end
programming. Menus, forms, macros and reports created in Visual Basic can
be connected to a back-end DBMS with the help of software tools and utilities. Another
job of database programmers is to manage the information of an organization without
reducing the flexibility of the data storage, manipulation and retrieval process. In the
process of testing and debugging, the programmer tests the program, makes the appropriate
modifications, and then rechecks the program until errors are at an acceptably low level.
Programmers also prepare program documentation and write operating procedures.
1.4.5 System Analysts
System analysis and design involves the building of systems. The systems analyst plays a
vital role in the systems development process. System analysts are the architects of the
systems. Systems analysts bring the ideas of users and owners together to create a
solution for their business needs. System analysis and design are two different activities
performed by system analysts. A systems analyst must have technical,
managerial, analytical, and interpersonal skills. The major tasks of a system analyst are
listed below. The analyst:
1. Defines the overall objectives of the system project.
2. Creates a road map of the existing organization/system, identifying the creators of data
and the primary users of data.
3. Describes the hardware and software that serve the organization.
4. Interacts with the customers to learn their requirements.
5. Prepares test data and tests programs as necessary to eliminate errors in logic, coding or
performance.
6. Identifies the areas of required organizational change or implements the new system.
7. Prepares quality documentation and writes operating procedures.
8. Trains and prepares training material for operators and program users, or arranges for
training.
1.6 WORKERS BEHIND THE SCENE
1.6.1 DBMS system designers and implementers
DBMS system design is the process of defining the architecture, components, modules,
interfaces, and data for a system to satisfy specified requirements. The designer decides
how the logical database design is to be physically realized. DBMS system design ensures
that the direction of database development ultimately supports corporate objectives. The
aim of DBMS system designers and implementers is to design the DBMS software packages.
1.6.2 Tool developers
Tools are special-purpose software used to assist designers, developers and users in doing a
specific task on the computer. Tools are optional packages that are often purchased
separately. The job of tool developers is to design and implement tools that facilitate the
use of the DBMS software. Tools include design tools, performance tools, special interfaces,
graphical interfaces, and tools for prototyping, simulation, test data generation, etc.
1.6.3 Operator and maintenance personnel
They work on running and maintaining the hardware and software environment for the
database system.
REVIEW QUESTIONS

Short Answer Type


1. DBMS stands for……………..
2. DBA stands for………………
3. OLTP stands for………………
4. DBMS is a software package. (True/False)
5. DBA is a software package. (True/False)
6. Software development is the main job of a DBA. (True/False)
7. Data and information are similar. (True/False)
8. Server is a general purpose computer. (True/False)
9. Which of the following is not a DBMS?
a. FoxPro
b. MS Access
c. Oracle
d. Java
Long Answer Type
1. What is the difference between data and information?
2. What is a database?
3. What is the difference between system and application programming?
4. What is DBMS? Write various functions of DBMS.
5. Write various advantages and disadvantages of DBMS.
6. What is DBA? What are the duties of DBA?
7. Write about the functioning of Database designer and application programmers.
8. Classify various users of DBMS.
9. What is the difference between single-tier, two-tier and multi-tier database systems?
10. Mention various workers behind the scene.

Chapter 2
DATABASE SYSTEM CONCEPTS AND ARCHITECTURE

2.1 The Evolution of Database Modeling


Data models are used to explain the logical layout of the data. A data model also shows
the relationship of the various parts to each other. Data must be stored in some fashion in
a file for it to be useful. In database circles over the past 20 years or so, there have been
three basic camps of “logical” database models (hierarchical, network, and relational), that
is, three ways of logically perceiving the arrangement of data in the file structure.
The relational database model is currently widely regarded as the best general-purpose
solution for both storage and retrieval of data. Examining the relational database model
from its roots can help you understand the critical problems the relational database model
is used to solve; therefore, it is essential that you understand how the different data
models evolved into the relational database model as it is today. The evolution of database
modeling occurred as each database model improved upon the previous one. The initial
solution was virtually no database model at all: the file system (also known as flat files).
The file system is provided by the operating system.

2.1.1 Flat Files Systems


Using a file system database model implies that no modeling techniques are applied and
that the database is stored in flat files in a file system, utilizing the structure of the
operating system alone. The term “flat file” is a way of describing a simple text file,
containing no structure whatsoever—data is simply dumped in a file.
Any searching through flat files for data has to be explicitly programmed. The advantage
of the various database models is that they provide some of this programming for you. For
a file system database, data can be stored in individual files or multiple files. Similar to
searching through flat files, any relationships and validation between different flat files
would have to be programmed and would likely be of limited capability.
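To make the point about explicit programming concrete, here is a minimal Python sketch of searching a flat file. The file name, the comma-separated layout and the sample records are all assumptions made for this illustration.

```python
import os
import tempfile

# Hypothetical flat file: one student per line, fields separated by commas.
# Nothing in the file says what the fields mean; that knowledge lives in the program.
lines = ["1,Asha,Delhi", "2,Ravi,Pune", "3,Meena,Delhi"]
path = os.path.join(tempfile.mkdtemp(), "students.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("\n".join(lines) + "\n")


def find_students_by_city(path, city):
    """Scan the whole file and return the names of students living in the given city."""
    names = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            roll_no, name, student_city = line.rstrip("\n").split(",")
            if student_city == city:
                names.append(name)
    return names


delhi_students = find_students_by_city(path, "Delhi")
print(delhi_students)  # ['Asha', 'Meena']
```

Every program that reads this file has to repeat the same knowledge of the field order and separator, and any change to the layout means changing all of those programs.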
Flat files are not databases at all. However, it is important to understand them for two
reasons. First, flat files are often used to store database information. In this case, the
operating system is still unaware of the contents and structure of the files, but the DBMS
has metadata that allows it to translate between the flat files in the physical layer and the
database structures in the logical layer. Metadata, which literally means “data about data,”
is the term used for the information that the database stores in its catalog to describe the
data stored in the database and the relationships among the data. The metadata for a
customer, for example, might include a list of all the data items collected about the
customer, along with the length, minimum and maximum data values, and a brief
description of each data item. Second, flat files existed before databases, and the earliest
database systems evolved from flat file systems that preceded them.
Overall, the worst problem with the flat file approach is that the definition of the contents
of each file and the logic required to correlate the data from multiple flat files have to be
included in every application program that requires those files, thus adding to the expense
and complexity of the application programs. It was this very problem that provided
computer scientists of the day with the incentive to find a better way to organize data.
2.1.2 The Hierarchical Model
The hierarchical database model is an inverted tree-like structure. A tree-structure
diagram is the schema for a hierarchical database. Such a diagram consists of two basic
components:
1. Boxes, which correspond to record types
2. Lines, which correspond to links
A hierarchical database consists of a collection of records that are connected to each other
through links. Each record is a collection of fields (attributes), each of which contains only
one data value. A link is an association between precisely two records. The tables of this
model take on a child-parent relationship. Each child table has a single parent table, and
each parent table can have multiple child tables. Child tables are completely dependent on
parent tables; therefore, a child table can exist only if its parent table does. It follows that
any entries in child tables can only exist where corresponding parent entries exist in parent
tables. All relationships between records in a hierarchical model have a cardinality of
one-to-many or one-to-one, but never many-to-one or many-to-many. The most popular
hierarchical database was the Information Management System (IMS) from IBM.
[Figure: hierarchical structure. Root File (Parent) at the top; Parent Files (Child to Root) below it; Children Files (Child to Parent) at the bottom.]
There are, of course, several other ways of making the parent–child link. Each method has
advantages and disadvantages, but imagine the difficulty with the linked list system if you
wanted to have multiple parents for each child record. Also note that some system must be
chosen to be implemented in the underlying database software. Once the linking system is
chosen, it is fixed by the software implementation; the way the link is done has to be used
to link all child records to parents, regardless of how inefficient it might be for one
situation.
Example: 1
The following figure shows an example hierarchical database model. There is
a one-to-many relationship between Customer and Order because a customer may have multiple
orders. We cannot search for an order detail without first finding the customer, and then
the order.
[Figure: hierarchical model example. Customer at the top, Order below it, Order Detail at the bottom.]
Following figure shows the contents of selected records within the hierarchical model
design. The record for customer Surjeet has a pointer to its first order (ID 101), and that
order has a pointer to the next order (ID 110). We know that Order 110 is the last order for
the customer because it does not have any pointers to additional orders. Looking at the
next layer in the hierarchy, Order 30 has a pointer to its first Order Detail record (for
Product 38), and that record has a pointer to the next detail record, and so forth.
Characteristics of a hierarchical model
The characteristics of a hierarchical model are listed below:
All child node occurrences can use the data of their parent node occurrences.
All child nodes are deleted with the deletion of the parent node.
Root node may have multiple child nodes.
A child node has only one parent node.
There are three major drawbacks to the hierarchical model:
1. Not all situations fall into the one-to-many, parent–child format.
2. The choice of the way in which the files are linked impacts performance, both
positively and negatively.
3. The linking of parent and child records is done physically. If the dependent file were
reorganized, then all pointers would have to be reset.
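The characteristics listed above (a single parent per child, deletion of children with their parent, and top-down navigation) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Record` class and the `find_order_detail` function are hypothetical names, and attaching Product 38 to Order 101 is an arbitrary choice for the example, not the contents of the original figure.

```python
class Record:
    """A record in a hierarchical database: one parent, many children."""

    def __init__(self, record_type, key, parent=None):
        self.record_type = record_type
        self.key = key
        self.parent = parent          # a child has exactly one parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def delete(self):
        """Deleting a parent also deletes every record beneath it."""
        for child in list(self.children):
            child.delete()
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None

def find_order_detail(customer, order_id, product_id):
    """Navigate top-down: customer first, then order, then order detail."""
    for order in customer.children:
        if order.key == order_id:
            for detail in order.children:
                if detail.key == product_id:
                    return detail
    return None

customer = Record("Customer", "Surjeet")
order_101 = Record("Order", 101, parent=customer)
order_110 = Record("Order", 110, parent=customer)
detail = Record("OrderDetail", 38, parent=order_101)

found = find_order_detail(customer, 101, 38)
```

Note that there is no way to reach the order detail except through its customer and order, and that a record with two parents cannot be expressed in this structure at all.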
2.1.3 Network Database Model
A network database model organizes files in a manner that associates each file with any
number of other files. This approach uses pointers to create a relationship between records
in one file and records in another file. The network approach provides more flexibility
than the hierarchical approach and allows a database designer to optimize the database
using detailed control and data organization. The network model allows child tables to
have more than one parent, thus creating a network-like table structure.
A node represents a collection of records, and a set structure establishes and represents a
relationship in a network database. It is a transparent construction that relates a pair of
nodes together by using one node as an owner and the other node as a member. A user can
access data from within the network database, starting from any node and working
backward or forward through related sets. Multiple parent tables for each child allow for
many-to-many relationships, in addition to one-to-many relationships.
A record in the member node cannot exist without being related to an existing record in
the owner node. One or more sets (connections) can be defined between a specific pair of
nodes, and a single node can also be involved in other sets with other nodes in the
database.
The most popular database based on the network model was the Integrated Database
Management System (IDMS), originally developed by Cullinane (later renamed Cullinet).
The product was enhanced with relational extensions, named IDMS/R and eventually sold
to Computer Associates.
Example 2:
In the network model contents example shown in figure, each parent-child relationship is
depicted with a different type of line, illustrating that each has a different name. This
difference is important because it points out the largest downside of the network model,
which is complexity.
Instead of a single path that may be used for processing the records, there are now many
paths. To find all the other orders for this customer, there must be a way to work forward
from where we are to the end of the chain and then wrap around to the beginning and
forward from there until we return to the order from which we started. It is to satisfy this
processing need that all pointer chains in network model databases are circular. The
process of navigating through a network database was called “walking the set” because it
involved choosing paths through the database structure much like choosing walking paths
through a forest when there can be multiple ways to get to the same destination.
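The circular pointer chain described above can be sketched in a few lines of Python. The class `OrderRecord`, the helper names, and order ID 105 are assumptions introduced for illustration; real network DBMSs implemented these chains inside their storage structures.

```python
class OrderRecord:
    """An order record with a pointer to the next order in its set."""

    def __init__(self, order_id):
        self.order_id = order_id
        self.next = self  # a chain of one record points back to itself

def link_circular(records):
    """Link records so that the last one points back to the first."""
    for current, following in zip(records, records[1:] + records[:1]):
        current.next = following

def other_orders(start):
    """Walk the set from `start` until the chain returns to it."""
    found = []
    record = start.next
    while record is not start:
        found.append(record.order_id)
        record = record.next
    return found

orders = [OrderRecord(101), OrderRecord(105), OrderRecord(110)]
link_circular(orders)
from_middle = other_orders(orders[1])
```

Starting from the middle order, the walk runs forward to the end of the chain, wraps around to the beginning, and stops when it reaches the starting record again.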

Some of the characteristics shared by the hierarchical and network database approaches
are:
They use pointers.
Their architecture uses redundancy to create relationships and optimization.
They evolved over time, almost on a trial and error basis.
Models were developed after the fact to explain and improve them, but not as
part of the original design and theory.
Advantages
The network database gives fast data access.
It also allows users to create queries that are more complex than those they created using
a hierarchical database.
Disadvantages
A user has to be familiar with the structure of the database in order to work through the
set structures.
Change in the database structure is difficult without affecting the application programs
that interact with it.
2.1.4 Relational Database Model
It is a data model in which the data is stored in tables (also referred to as relations).
Each table has rows (also referred to as records or tuples) and columns (also referred to
as attributes). In the relational database model, any tables can be linked together. The relational
model is based on a collection of mathematical principles drawn primarily from set
theory and predicate logic. These principles were first applied to the field of data modeling
in the late 1960s by Dr. E. F. Codd. Each row in a table has a unique identifier (UID). In
this database model, relationships between tables are defined by joining the UID of one table
with the UID of another table. Pointers are not used. Data redundancy
is reduced via a process called normalization.
The relational model presents data in familiar two-dimensional tables, much like
a spreadsheet does. Unlike a spreadsheet, the data is not necessarily stored in tabular
form and the model also permits combining (joining in relational terminology) tables to
form views, which are also presented as two-dimensional tables.
The entire structure is, as we’ve said, a relation. The relation is divided into two
sections, the heading and the body. The tuples make up the body, while the heading is
composed of the attribute (column) names. The body of the relation consists of an unordered set of
zero or more tuples. A relation is a relation provided that it’s arranged in row and
column format and its values are scalar. Its existence is completely independent of any
physical representation. As such, a user isn’t required to know the physical location of
a record in order to retrieve its data. This is unlike the hierarchical and network database
models, in which knowing the layout of the structures is crucial to retrieving data. As
long as a user is familiar with the relationships among the tables in the database, he can
access data in an almost unlimited number of ways. He can access data from tables that
are directly related and from tables that are indirectly related.
Example 3:
The relational model categorizes relationships as one-to-one, one-to-many, and many-to-
many. A relationship between a pair of tables is established implicitly through matching
values of a shared field. In the figure, the Student and Course tables are related via the
Course ID field; a specific course is associated with a student through a matching Course
ID.
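A relationship established through matching values of a shared field can be demonstrated with Python's built-in `sqlite3` module. The column names and sample rows below are assumptions chosen for illustration; only the Student and Course tables and the Course ID field come from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Course (
        course_id   INTEGER PRIMARY KEY,
        course_name TEXT NOT NULL
    );
    CREATE TABLE Student (
        student_id  INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        course_id   INTEGER REFERENCES Course(course_id)
    );
    INSERT INTO Course VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO Student VALUES (1, 'Asha', 10), (2, 'Vikram', 20), (3, 'Meena', 10);
""")

# The relationship is expressed only by matching course_id values; no pointers.
rows = conn.execute("""
    SELECT Student.name, Course.course_name
    FROM Student
    JOIN Course ON Student.course_id = Course.course_id
    ORDER BY Student.student_id
""").fetchall()
print(rows)
```

The join condition is stated in the query itself, so the user does not need to know where or how either table is physically stored.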

Relational Model Structure


In general terms, relational database systems have the following characteristics:
All data is conceptually represented as an orderly arrangement of data into rows and
columns, called a relation.
All values are scalar. That is, at any given row/column position in the relation there is one
and only one value.
All operations are performed on an entire relation and result in an entire relation, a
concept known as closure.
Advantages of a Relational Database
The relational database model provides a number of advantages over network
and hierarchical models, such as the following:
Data integrity is built into the model at several levels. At the field level, it ensures the
accuracy of the data. At the table level, it prevents duplicate records and detects missing
primary key values. At the relationship level, it ensures that the relationship between a pair
of tables is valid. At the business level, it ensures that the data is accurate in terms of the
business itself.
Changes to the logical design or physical implementation of the database will not adversely
affect the applications developed upon it.
In this model, data remains consistent and accurate due to the various levels of integrity
we can impose within the database.
Data can be retrieved either from a particular table or from any number of related tables
within the database. Users can view information in an almost unlimited number of ways.
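The levels of integrity mentioned above can be tried out with a small `sqlite3` sketch. The table definitions, the `fee` column, and the helper `rejected` are assumptions for illustration. A `CHECK` constraint stands in for a field-level rule, a primary key for a table-level rule, and a foreign key for a relationship-level rule; business-level rules are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Course (
        course_id INTEGER PRIMARY KEY,
        fee       INTEGER NOT NULL CHECK (fee >= 0)
    );
    CREATE TABLE Student (
        student_id INTEGER PRIMARY KEY,
        course_id  INTEGER NOT NULL REFERENCES Course(course_id)
    );
    INSERT INTO Course VALUES (10, 5000);
""")

def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        conn.rollback()
        return True
    conn.commit()
    return False

results = {
    "field level (negative fee)": rejected("INSERT INTO Course VALUES (20, -1)"),
    "table level (duplicate key)": rejected("INSERT INTO Course VALUES (10, 100)"),
    "relationship level (unknown course)": rejected("INSERT INTO Student VALUES (1, 99)"),
    "valid row": rejected("INSERT INTO Student VALUES (2, 10)"),
}
```

The first three statements are refused by the DBMS itself, without any checking code in the application; only the last row is stored.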
Disadvantage
The relational data model has some disadvantages:
An early disadvantage of the relational database was that software programs based on it ran very
slowly; a relational database system depends heavily on machine performance.
It is not suited to all organizations.
Relationships among tables are defined only implicitly, through matching values. A user has to
know the structure of the relations to work with them. In the above example, the user
may not be familiar with the relationship between the two tables without viewing
their data definition.
2.2 Schemas
A schema is the physical layout of tables in a database. In other words, a schema is a
specification of the physical database’s information content and logical structure. The
structure could be the overall database or an object inside the database. A table’s structure
would be considered a schema because it defines the structure in terms of its columns or
attributes. Think of the logical database design as the architectural blueprints and the
physical database implementation as the completed home.
The logical database design describes the size, shape, and necessary systems for a
database; it addresses the informational and operational needs of your business. We then
build the physical implementation of the logical database design using a DBMS software
program. Once the tables have been created, the table relationships set up, and the
appropriate levels of data integrity established, the database is complete, and applications
can be created that allow users to interact easily with the data stored in the database.
These applications then provide timely and accurate information to the user.
Users have different requirements and therefore each different kind of user must be
provided with a schema that is specific to his or her requirements.
A database has three schema levels:
1) a system level,
2) a logical level, and
3) a user level.
The user level represents distinct views that are customized to the needs of the individual
users. For example, users in the accounting department would have a view of the database
that included the salaries of employees or fees collection of students while users in the
examination department would have views of academic records of students.
The logical level would be a collection of table schemas such as an employee/ student
table and an examination table, each with attributes as columns. This level would include
mappings to the user schemas.
The system level is a collection of software and file schemas. Schema levels make it
possible for users, developers, and database administrators to work within their level
without having to know anything about the other levels.
Let us take an example of college database where we have three tables Student, Course
and Marks Record.
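A possible rendering of this college database, with one user-level view on top of the logical-level tables, is sketched below using `sqlite3`. All column names and the view `Examination_View` are assumptions made for illustration; the text names only the three tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Logical level: table schemas with attributes as columns.
    CREATE TABLE Student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Course  (course_id  INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Marks_Record (
        student_id INTEGER REFERENCES Student(student_id),
        course_id  INTEGER REFERENCES Course(course_id),
        marks      INTEGER,
        PRIMARY KEY (student_id, course_id)
    );
    -- User level: the examination department sees names, titles and marks only.
    CREATE VIEW Examination_View AS
        SELECT s.name, c.title, m.marks
        FROM Marks_Record m
        JOIN Student s ON s.student_id = m.student_id
        JOIN Course  c ON c.course_id  = m.course_id;
""")

def columns(table):
    """Read the column names of a table or view from the catalog."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

logical_schema = {t: columns(t) for t in ("Student", "Course", "Marks_Record")}
user_schema = columns("Examination_View")
```

The system level (files and storage software) does not appear in the sketch at all, which is the point of separating the levels: neither the table definitions nor the view mention how SQLite stores the data.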

2.3 Database Instance


The term instance is typically used to describe a complete database environment,
including the RDBMS software, table structure, stored procedures and other functionality.
It is most commonly used when administrators describe multiple instances of the same
database. In data-model terms, an instance is also the actual data stored in a database at a
particular moment in time; this is also called the database state or occurrence.
2.4 Database State
The database state changes every time the database is updated. A database is always in
one specific state. In Microsoft SQL Server, for example, these states include ONLINE,
OFFLINE, and SUSPECT.
The following are a few of the database states:
ONLINE: Database is available for access. The primary file group is online.
OFFLINE: Database is unavailable. A database becomes offline by explicit user action
and remains offline until additional user action is taken. For example, the database may be
taken offline in order to move a file to a new disk. The database is then brought back
online after the move has been completed.
RESTORING: One or more files of the primary file group are being restored, or one or
more secondary files are being restored offline. The database is unavailable.
RECOVERING: Database is being recovered. The recovering process is a transient state;
the database will automatically become online if the recovery succeeds. If the recovery
fails, the database will become suspect. The database is unavailable.
RECOVERY PENDING: The database is not damaged, but files may be missing or
system resource limitations may be preventing it from starting. The database is
unavailable.
SUSPECT: At least the primary file group is suspect and may be damaged. The database
cannot be recovered during startup of database server. The database is unavailable.
Additional action by the user is required to resolve the problem.
EMERGENCY: User has changed the database and set the status to EMERGENCY. The
database is in single-user mode and may be repaired or restored.
2.5 DBMS Architecture
The architecture shown in following figure was first developed by ANSI/SPARC
(American National Standards Institute Standards Planning and Requirements Committee)
in the 1970s and quickly became a foundation for much of the database research and
development efforts that followed. Most modern DBMSs follow this architecture, which is
composed of three primary layers: the physical layer, the logical layer, and the external
layer. The original architecture included a conceptual layer, which has been omitted here
because none of the modern database vendors implemented it.

Database layers of abstraction


Databases have the unique capability of presenting multiple users of the data with their
own distinct views of that data while storing the underlying data only once. These are
collectively called user views. Because views store no actual data, they automatically
reflect any data changes made to the underlying database objects. This is all possible
through layers of abstraction.
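That views store no data of their own, and therefore reflect changes to the underlying objects, can be checked with a short `sqlite3` sketch. The `Employee` table, its columns, and the two view names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO Employee VALUES (1, 'Asha', 'Accounts', 40000), (2, 'Vikram', 'Exams', 35000);
    -- Two user views over the same stored rows.
    CREATE VIEW Payroll_View   AS SELECT name, salary FROM Employee;
    CREATE VIEW Directory_View AS SELECT name, dept   FROM Employee;
""")

before = conn.execute("SELECT salary FROM Payroll_View WHERE name = 'Asha'").fetchone()[0]
# Update the underlying table, not the view.
conn.execute("UPDATE Employee SET salary = 45000 WHERE emp_id = 1")
after = conn.execute("SELECT salary FROM Payroll_View WHERE name = 'Asha'").fetchone()[0]
directory = conn.execute("SELECT name, dept FROM Directory_View ORDER BY name").fetchall()
```

The data is stored once in `Employee`; each view is only a stored query, so the new salary is visible through `Payroll_View` immediately.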
2.5.1 The Physical / Internal Layer
The physical layer contains the data files that hold all the data for the database. All
modern DBMSs allow the database to be stored in multiple data files, which are usually
spread out over multiple physical disk drives. Maximum performance can be achieved
with this arrangement. However, Microsoft Access, a noted DBMS, stores the entire
database in a single physical file. This arrangement limits the ability of the DBMS to scale
to accommodate many concurrent users of the database, making it inappropriate as a
solution for large enterprise systems, while simplifying database use on a single-user
personal computer system.
The users of the database do not need to know how the data is actually stored within
these files, or even which file contains the data item(s) of interest. In most organizations,
the DBA handles the details of installing and configuring the database software and data files
and making the database available to the database users. The DBMS, in association with the
computer’s operating system, automatically manages the data files, including all file
opening, reading, closing, updating and writing operations. The database user should not
be required to refer to physical data files when using a database, which is in sharp contrast
with spreadsheets and word processing, where the user must consciously save the
document(s) and choose file names and storage locations.
2.5.2 The Logical / Conceptual Layer
The logical layer or conceptual layer is the first of two layers of abstraction in the
database. The physical layer has a concrete existence in the operating system files, whereas the
logical layer exists only as abstract data structures assembled from the physical layer as
needed. It describes what data is being stored in the database and what kinds of
relationships exist among the data. The DBMS transforms the data in the data files into a
common structure. This layer is also called the schema. The schema is a collection of all the
data items stored in a particular database. Depending on the particular DBMS, this can be
a set of two-dimensional tables, a hierarchical structure similar to a company’s
organization chart, or some other structure.
2.5.3 The External Layer
The external layer is the second layer of abstraction in the database. This layer is
composed of the user views, which are collectively called the subschema. In this layer,
users and application programs can access the database by connecting and issuing queries
(commands) against the database. Ideally, only the DBA deals with the physical and logical
layers. The DBMS handles the transformation of selected items from one or more data
structures in the logical layer to form each user view. The user views can be predefined
and stored in the database for reuse, or they can be temporary items that are built by the
DBMS to hold the results of a single ad hoc database query until no longer needed by the
database user.
2.6 Data Independence
Data independence is a form of database management that keeps data separated from all
programs that make use of the data. Data independence ensures that the data cannot be
redefined or reorganized by any of the programs that make use of the data. In this manner,
the data remains accessible, but is also stable and cannot be corrupted by the applications
using it. Data independence has two types: physical data independence and logical data
independence.
2.6.1 Physical Data Independence
The ability to modify the physical file structure of a database without disrupting existing
users and application programs is known as physical data independence. As shown earlier
in figure, it is the separation of the physical layer from the logical layer that provides
physical data independence in a DBMS. The measure, sometimes called the degree of
physical data independence, is how much change can be made in the file system without
impacting the logical layer. In systems without data independence, even the slightest
change to the way data was stored required the programmers to make changes to every
computer program that used the data.
All modern computer systems have some degree of physical data independence. However,
on most personal systems, the user must still remember where they placed the file so they
can locate it when they need it again. DBMSs expand greatly on the physical data
independence provided by the computer system in that they allow database users to access
database objects (tables in a relational DBMS) without having to reference the physical
data files in any way. The DBMS catalog keeps track of where the objects are physically
stored. Here are some examples of physical changes that may be made in a data-
independent manner:
Moving a database data file from one device to another or one directory to another
Splitting or combining database data files
Renaming database files
Moving a database object from one data file to another
Adding new database objects or data files
2.6.2 Logical Data Independence
The ability to change the logical layer without disrupting existing users and application
programs is called logical data independence. It is the transformation between the logical
layer and the external layer that provides logical data independence. It is important to
know that most logical changes also involve a physical change. For example, a new
database object (such as a table in a relational DBMS) cannot be added without physically
storing the data somewhere; hence, there is a corresponding change in the physical layer.
Moreover, deletion of objects in the logical layer will cause anything that uses those
objects to fail but should not affect anything else.
Here are some examples of changes in the logical layer that can be safely made thanks to
logical data independence:
Adding a new database object
Adding data items to an existing object
Any change where a view can be placed in the external model that replaces (and
processes the same as) the original object in the logical layer, such as combining or
splitting existing objects
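The last kind of change, splitting an object and placing a view in the external layer that replaces the original, can be sketched with `sqlite3`. The table names, columns, and the query held in `APP_QUERY` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (cust_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO Customer VALUES (1, 'Surjeet', 'Delhi');
""")

# The query an existing application is assumed to issue.
APP_QUERY = "SELECT name, city FROM Customer WHERE cust_id = 1"
before = conn.execute(APP_QUERY).fetchall()

# Logical change: split Customer into two tables, then put a view named
# Customer in the external layer so the application query is unchanged.
conn.executescript("""
    ALTER TABLE Customer RENAME TO Customer_Old;
    CREATE TABLE Customer_Name    (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Customer_Address (cust_id INTEGER PRIMARY KEY, city TEXT);
    INSERT INTO Customer_Name    SELECT cust_id, name FROM Customer_Old;
    INSERT INTO Customer_Address SELECT cust_id, city FROM Customer_Old;
    DROP TABLE Customer_Old;
    CREATE VIEW Customer AS
        SELECT n.cust_id, n.name, a.city
        FROM Customer_Name n
        JOIN Customer_Address a ON a.cust_id = n.cust_id;
""")
after = conn.execute(APP_QUERY).fetchall()
```

The application's read query returns the same result before and after the split. Whether inserts and updates through such a view also keep working depends on the DBMS, so this sketch covers retrieval only.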
2.7 Database Languages
Database languages and interfaces deal with how a data model gets into a database and
how the information gets to the user. They cover the interaction of an application with a
database management system, as well as a user's queries to a database and the view of the
results. The DBMS interface will comprise a database sublanguage. A database sublanguage is
a programming language designed specifically for initiating DBMS functions.
2.7.1 Data Definition Languages (DDL)
Data Definition Languages are used by the DBA and database designers to specify the
conceptual schema of a database. In other words, they are used to describe data and data
structures. They are also used to define internal and external schemas (views).
Examples of DDL commands:
CREATE: To make a new database, table, index, or stored query.
DROP: To destroy an existing database, table, index, or view.
ALTER: To modify an existing database object.
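The three DDL commands can be tried with Python's `sqlite3` module. The `Product` table and its columns are assumptions for illustration, and the exact `ALTER` syntax supported varies between DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names():
    """List the tables currently recorded in SQLite's catalog."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]

# CREATE: make a new table.
conn.execute("CREATE TABLE Product (product_id INTEGER PRIMARY KEY, name TEXT)")
after_create = table_names()

# ALTER: modify the existing table by adding a column.
conn.execute("ALTER TABLE Product ADD COLUMN price REAL")
product_columns = [r[1] for r in conn.execute("PRAGMA table_info(Product)")]

# DROP: destroy the table.
conn.execute("DROP TABLE Product")
after_drop = table_names()
```

None of these statements touches rows of data; they change only the definitions held in the catalog.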
2.7.2 Data Manipulation Language (DML)
Data Manipulation Language is used to specify database retrievals and updates.
Various operations like store, read, search, change, etc. are performed by DML. DML
commands can be embedded in a general purpose programming language such as
PASCAL, COBOL or PL/1. Stand alone DML commands can be applied directly (query
language).
Examples of DML Commands:
Select : To query data from tables in a database
Insert : To insert new row(s) into a database.
Update: To update row(s) in a table
Delete: To delete row(s) from a table.
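The four DML commands can be seen together in a short `sqlite3` sketch, which also shows DML embedded in a general purpose language. The `Product` table and its rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (product_id INTEGER PRIMARY KEY, name TEXT, price REAL)")

# INSERT: add new rows (parameters are passed separately from the SQL text).
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [(38, "Pen", 10.0), (39, "Ink", 25.0)],
)
# UPDATE: change a value in an existing row.
conn.execute("UPDATE Product SET price = ? WHERE product_id = ?", (12.0, 38))
# DELETE: remove a row.
conn.execute("DELETE FROM Product WHERE product_id = ?", (39,))
# SELECT: query what is left.
remaining = conn.execute("SELECT product_id, name, price FROM Product").fetchall()
```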
Types of Data Manipulation Language:
Procedural DML
It is also called record-at-a-time or low-level DML. It must be embedded in a
programming language. It searches for and retrieves individual database records. Looping
and other constructs of the host programming language are used to retrieve multiple records
from databases.
Non-procedural DML
It is also called set-at-a-time or high-level DML. It can be used as a stand-alone
query language or can be embedded in a programming language. It searches for and
retrieves information from multiple related database records in a single command.
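The contrast between the two styles can be sketched in Python with `sqlite3`. SQL itself is a non-procedural language, so the record-at-a-time half below only imitates the procedural style by letting the host language loop over individual records. The `Orders` table and its rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER);
    INSERT INTO Orders VALUES (101, 'Surjeet', 500), (110, 'Surjeet', 300), (120, 'Anita', 200);
""")

# Record-at-a-time style: the host language loops over individual records
# and decides, one by one, what to keep.
total_by_loop = 0
for order_id, customer, amount in conn.execute("SELECT order_id, customer, amount FROM Orders"):
    if customer == "Surjeet":
        total_by_loop += amount

# Set-at-a-time style: one command states what is wanted, not how to find it.
total_by_query = conn.execute(
    "SELECT SUM(amount) FROM Orders WHERE customer = ?", ("Surjeet",)
).fetchone()[0]
```

Both approaches give the same total; the second leaves the choice of access path to the DBMS.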
2.7.3 Data Control Language
Data control language (DCL) is used to control user access to database and specific data
within it.
Examples of DCL:
Grant: Gives a privilege or role to a user.
Revoke: Takes a privilege or role back from a user.
2.8 Interfaces
These are different ways to connect with database system.
2.8.1 Stand alone query languages interfaces
The application uses SQL, a query language, to pose a query to the database
system. There, the corresponding answer (result set) is prepared and returned to the
application, again by means of SQL. This communication can take place interactively or be
embedded into another language.

Type and Use of the Stand-alone query languages interfaces


Two important uses of a database interface like SQL are listed below:
Interactive: SQL can be used interactively from a terminal.
Embedded: SQL can be embedded into another language, which might be used
to create a database application.
2.8.2 User Interfaces
A user interface is the view of a database interface that is seen by the user. User interfaces
are often graphical, or at least partly graphical (GUI - graphical user interface),
and offer tools which make the interaction with the database easier.
a) Menu-Based interfaces for Web Clients or Browsing — These interfaces present the
user with lists of options, called menus, that lead the user through the formulation of a
request. Menus do away with the need to memorize the specific commands and syntax of
a query language. Pull-down menus are a very popular technique in Web based user
interfaces. They are also often used in browsing interfaces, which allow a user to look
through the contents of a database.
b) Form-based Interfaces — These interfaces consist of forms which are adapted to the
user. He/She can fill in all of the fields and make new entries to the database or only some
of the fields to query the other ones. But some operations might be restricted by the
application.
Form-based user interfaces are widespread and are a very important means of interacting
with a DBMS. They are easy to use and have the advantage that the user does not need
special knowledge about database languages like SQL.
c) Text-based Interfaces — For database administration, and for other
professional users, it is possible to communicate with the DBMS directly in the
query language via an input/output window. Text-based interfaces are very powerful tools
and allow a comprehensive interaction with a DBMS. However, their use requires
active knowledge of the respective database language.
d) GIS Interface — A GIS user interface often integrates features of a database interface.
The database interaction takes place through the combination of different interfaces:
Graphical interaction via a selection on the map
Combination of form-based and text-based interaction (e.g. special Query
Wizards for the easier creation of database queries)
e) Interfaces for the DBA — Most database systems contain privileged commands that
can be used only by the DBA’s staff. These include commands for creating accounts,
setting system parameters, granting account authorization, changing a schema, and
reorganizing the storage structures of a database.
2.9 Classification of DBMS
DBMSs can be classified as follows:
1. Based on the Data Model
- Traditional : Hierarchical, Network, Relational (already discussed)
- Emerging : Object Oriented, Semantic, Entity Relationship
2. Other Classification
- Single-user versus multi-user DBMS
- Centralised versus distributed DBMS
2.9.1 Object Oriented Data Model
An object is a logical grouping of related data and program logic that represents a real
world thing, such as a customer, employee, order, or product. Individual data items, such
as customer code and customer name, are called variables in the object-oriented model and
are stored within each object. A method is a piece of application program logic that
operates on a particular object and provides a finite function, such as checking a
customer’s credit limit or updating a customer’s address. Among the many differences
between the object-oriented model and the models already presented, the most significant
is that variables may only be accessed through methods. This property is called
encapsulation.
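Encapsulation can be illustrated with a small Python class. The class layout, the method names, and the sample values are assumptions for illustration. Python marks internal variables by naming convention only and does not enforce access control, so this is a sketch of the idea rather than of an object-oriented DBMS.

```python
class Customer:
    """An object bundling variables with the methods that operate on them."""

    def __init__(self, code, name, credit_limit):
        # Leading underscores mark these variables as internal by convention;
        # Python does not enforce access control.
        self._code = code
        self._name = name
        self._credit_limit = credit_limit
        self._address = None

    def within_credit_limit(self, order_total):
        """Method: check an order total against the customer's credit limit."""
        return order_total <= self._credit_limit

    def update_address(self, new_address):
        """Method: the only supported way to change the address."""
        if not new_address.strip():
            raise ValueError("address must not be empty")
        self._address = new_address.strip()

    def address(self):
        return self._address

customer = Customer("C001", "Surjeet", credit_limit=1000)
customer.update_address("  12 Park Road  ")
```

Because every change to the address passes through `update_address`, the validation in that method applies to all callers.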
The object-oriented model incorporates all of the characteristics of an object-oriented
programming language and essentially relegates the relational database to the status of a
data store. The fundamental idea here is that the database developer handles every aspect
of the database, including the sets of operations that manipulate the data in the database
from within the object-oriented database programming software.
2.9.2 Semantic DBMS
Semantic databases represent information as a collection of objects and relationships
between these objects. Data items related to objects can be of arbitrary size, multi-valued,
or missing entirely. This approach has been applied to various types of data, including
scientific and multi-media data.
This system should be useful for most typical database applications, as well as for
specialized domains such as Earth Sciences.
Many database applications, e.g. those for Earth Sciences, have three essential needs:
(1) Strong semantics embedded in the database — to handle the complexity of
information;
(2) Storage of multi-dimensional spatial, image, scientific, and other non-conventional
data; and
(3) Very high performance — to allow massive data flow.
2.9.3 Centralised DBMS
A centralized database keeps all its data in one place, in contrast to a
distributed database, which keeps data in different places. Because all the
data resides in one place, bottlenecks can occur, and data availability is not
as good as in a distributed database. The advantages of distributed databases, discussed
later, make the difference between centralized and distributed databases clearer.
In this setting, any user could use an application on the corporate mainframe through a
“dumb” terminal. A dumb terminal has no processing capability, only the ability to send
information to and display information given to it by the central computer (mainframe).
The mainframe is responsible for database management and interaction with users in a
multi-user environment. Processing of applications was divided into two parts:
1. A front end, responsible for interfacing with the user
2. A back end, responsible for management of data
This division of work allows more system flexibility because multiple front ends can access
the database governed by a single DBMS.
Advantages of centralized DBMS
A single DBA can manage the single computer.
Back up of data for protection was easy.
User can share hardware peripherals.
Disadvantages of centralized DBMS
More horsepower of mainframe needs to serve the number of users. Large amount of
money is being charged by control company for operating system, application, hardware.
2.9.4 Distributed Database System
Distributed means the partitioning (dividing up) of the application and/or database into
parts and the placement of different parts on different computing devices, all connected by
a network.
A distributed database is a collection of multiple, logically interrelated databases
distributed over a computer network; it stores data on multiple computers over the network
and permits access from any node to the joint data.
A distributed database management system (DDBMS) is a software system that permits
the management of the distributed databases and makes the distribution transparent to the
users.
Database users can access the distributed database through local applications (applications
which do not require data from other sites) and global applications (applications which do
require data from other sites). A distributed database does not share main memory or disks.

Several factors have led to the development of DDBS:
Distributed nature of some database applications
Increased reliability and availability
Allowing data sharing while maintaining some measure of local control
Improved performance
Promises of Distributed DBMS
It provides transparent management of distributed, fragmented, and replicated data.
Transparency refers to separation of the higher-level semantics of a system from lower-
level implementation details.
It provides improved reliability and availability through distributed transactions. The
users can still access part of the distributed database with “proper care” even though some
of the data is unreachable. Distributed transactions facilitate maintenance of consistent
database state even when failures occur.
Performance can be improved. Since each site handles only a portion of a database, the
contention for CPU and I/O resources is not that severe. Data localization reduces
communication overheads.
It has the ability to add new sites, data, and users over time without major restructuring.
Disadvantages of DDBSs
No operating true distributed database systems in existence
DDBS problems are inherently more complex than centralized DBMS ones
More hardware, software and people costs
Problems of synchronization and coordination to maintain data consistency
Security is harder: remote database fragments are not centralized, so the remote sites and
the network infrastructure must be secured as well.
Database design is more complex: besides the normal difficulties, the design of a
distributed database has to consider fragmentation of data, allocation of fragments to
specific sites and data replication.
2.9.5 Client/Server Model
The client/server model includes one or more shared computers, called servers, that are
connected by a network to the individual users’ workstations, called clients. Client/server
computing started in the 1980s. The original model is now called the two-tier
client/server model; it later evolved into the three-tier client/server model, and finally into
the N-tier client/server model, which is also known as the Internet computing model.
Two-Tier Client/Server Model
The two-tier client/server model, shown in following figure, is almost the reverse of the
centralized model in that all the business and presentation logic is placed on the client
workstation, which typically is a high-powered personal computer system. The only thing
remaining on a centralized server is the database.

The benefits of the two-tier client/server model include the following:
The user interface is improved as compared with systems using dumb terminals.
The workstation processor does most of the work and does not have to be shared with
anyone else, so it offers the potential for improved performance.
Here are the drawbacks:
Expensive client workstations are required because all the application logic runs on the
client.
Applications are installed on every client workstation, and all have to be updated with a
new software release at the same time.
REVIEW QUESTIONS
Short Answer Type
1. Metadata literally means “data about data”. (True/False)
2. In the network model, a root node may have multiple child nodes. (True/False)
3. In the network model, a child node has only one parent node. (True/False)
4. A schema is the physical layout of tables in a database. (True/False)
5. A back end is responsible for interfacing with the user. (True/False)
6. DDL stands for …………..
7. DCL stands for …………..
8. DML stands for …………..
9. SQL stands for …………..
10. Drop command is used to ………….. a table.
11. Create command is used to ………….. a table.
12. Insert command is used to ………….. data.
13. Select command is used to ………….. data.
Long Answer Type
1. What is data modeling?
2. Explain various types of data models.
3. What do you understand by the architecture of a DBMS?
4. Explain database schema levels.
5. What is the role of schemas?
6. What is the difference between a distributed and a centralized DBMS?
7. Classify DBMS.
8. Explain database schema levels.
9. Write various advantages or disadvantages of the Relational Model.
10. What is database state and instance?
11. What do you mean by data independence?
12. Explain the terms DDL, DML, and DCL.
13. What do you mean by interface?
14. What is the role of the view level in data abstraction?
15. What is the difference between Procedural DML and Non-procedural DML?

Chapter 3
DATA MODELING USING E.R. MODEL

3.1 DATA MODELS


In the E-R approach, a given universe of discourse is represented using an entity
model: a model built up of entities, relationships and attributes. The most abstract level of
a database design is the data model, the conceptual description of a problem space. Data
models are expressed in terms of entities, attributes, domains, and relationships.
Relational database systems have the following characteristics:
All data is conceptually represented as an orderly arrangement of data into rows
and columns, called a relation.
All values are scalar (a single, non-repeating value). That is, at any given row/column
position in the relation there is one and only one value.
All operations are performed on an entire relation and result in an entire relation,
a concept known as closure.
3.2 ENTITIES
Definition :- An entity type is a class of entity occurrences characterized by the same
attributes.
An entity is anything about which the system needs to store information. They can be
physical (e.g., individuals, products, or buildings) or logical (e.g., departments, accounts,
or ideas). We have to capture data about entities and then store data about them in a
database. An entity is represented as a rectangle on the diagram. An entity is a physical or
abstract object that exists and can be distinguished from other objects.
For example, the Student entity in the following figure represents the collection of all students
of a college/university. An individual student is called an instance of the entity. Each entity
in the entity class is represented by a row of data, often referred to as a record.
Example 1
Ram Pal with a student registration number 006226124023 is an entity since the distinct
features described by (registration-number, name, address, city) uniquely identify a
particular person existing in the universe.
3.2.1 Entity Types
An entity type is a class of entity occurrences characterized by the same attributes.
Some entities can be broken down into more specific categories or types. The more
detailed entities are called subtypes. The more general entity to which they belong is
called a super type. The super type is called a super class and the subtypes are called
subclasses of the super class. It is essential to understand that subtypes break down
entities by type rather than by state, meaning their mode or condition. An easy way to
distinguish the two is that existing entities can change state, but they seldom, if ever,
change type.
For example, every college member, whether Student or Teacher, is described by
attributes common to both types (ID, name, address), while the differentiating attributes
apply to one type only. In this case, Teacher and Student are two subtypes with different
attributes, whereas College_Member is the supertype holding the common fields.
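The supertype/subtype idea can be sketched in Python with class inheritance. This is only an illustration: the class names follow the College_Member, Student and Teacher example above, while the attributes `course` and `department` are hypothetical differentiating attributes chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass
class CollegeMember:
    """Supertype: attributes common to every member."""
    member_id: int
    name: str
    address: str


@dataclass
class Student(CollegeMember):
    """Subtype: adds an attribute that applies only to students."""
    course: str  # hypothetical differentiating attribute


@dataclass
class Teacher(CollegeMember):
    """Subtype: adds an attribute that applies only to teachers."""
    department: str  # hypothetical differentiating attribute


student = Student(1, "Ram Pal", "Hoshiarpur", course="Computer Engg.")
teacher = Teacher(2, "Anjali", "Jalandhar", department="Computer Engg.")

# Both subtypes are also members of the supertype.
print(isinstance(student, CollegeMember), isinstance(teacher, CollegeMember))
```

A Student never becomes a Teacher by changing an attribute value, which mirrors the point that entities change state but seldom change type.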
3.2.2 Sets of entities
The rows or records associated with an entity in an entity class are called “sets of entities”.
An entity set is a set of entities of the same type, which share the same properties or
attributes. For example, the set of all students who are studying in a college is called the
entity set of students.
Some examples of entities include:
A person entity can be employee or student.
A concept entity can be account or department.
A place entity can be state or country.
An object entity can be building or product.
3.3 ATTRIBUTES
An entity is characterised by a number of properties or attributes. Values assigned
to attributes are used to distinguish one entity from another. The following figure identifies
the attributes of an entity in the entity class Individual. The attribute Rollno is a primary
key, as indicated by the underline of that attribute. A primary key is the unique attribute that
distinguishes one record in the entity type from another. We discuss primary keys in
detail when discussing constraints. For every individual record or entity in the entity
class, the college/university may want to remember their Rollno, Name, Date of Birth and
Address. Attributes can be stored as single or multiple columns in a table.
3.3.1 Simple attribute
Simple or atomic attributes cannot be further divided or subdivided, hence the notion
“atomic.” For example rollno is a simple attribute.
3.3.2 Composite Attribute
A composite attribute, also called a group attribute, is an attribute formed by combining or
aggregating related attributes. The naming of composite attributes should be descriptive
and general. Most data processing applications divide the name into component parts. In
that case, Name is called a composite attribute or an aggregate. Name is usually composed
of a first name, a middle name and a last name. These are called sub-attributes. The
sub-attributes, such as first name, middle name, and last name, are simple (atomic)
attributes.
3.3.3 Multi-Valued Attribute
Another type of non-simple attribute that has to be managed is called a multivalued
attribute. The multi-valued attribute, as the name implies, may take on more than one
value for a given occurrence of an entity. For example, the attribute school could easily be
multi-valued if a person attends more than one school.
3.3.4 The Derived Attribute
Derived attributes are attributes that the user may envision but may not be recorded per se.
These derived attributes can be calculated from other data in the database. An example of
a derived attribute would be an age that could be calculated once a student’s birthdate is
entered.
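As a minimal sketch of a derived attribute, the function below computes an age from a stored birthdate instead of storing the age itself. The function name `age_on` and the sample dates are assumptions made for illustration.

```python
from datetime import date


def age_on(birthdate: date, today: date) -> int:
    """Return completed years between birthdate and today."""
    years = today.year - birthdate.year
    # Subtract one year if this year's birthday has not been reached yet.
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years


# The day before the 50th birthday the derived age is still 49.
print(age_on(date(1950, 10, 12), date(2000, 10, 11)))
print(age_on(date(1950, 10, 12), date(2000, 10, 12)))
```

Because the value is recomputed on demand, it cannot go stale the way a stored age column would.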
3.3.5 Keys
The main function of a database is to store data and retrieve data on demand.
A key is an attribute that may be used to find a particular entity occurrence. If an attribute
can be thought of as a unique identifier for an entity, it is called a candidate key. When
a candidate key is selected as the unique identifier, it is called the primary key for the entity.
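The uniqueness requirement can be checked against sample data with a short sketch like the one below. The helper `is_candidate_key` and the sample rows are hypothetical. Note that passing this check on one set of rows does not prove that an attribute is a key; a key is a property of the design and must hold for all data that could ever be stored.

```python
def is_candidate_key(rows, attributes):
    """True if no two rows share the same values for the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attributes)
        if value in seen:
            return False
        seen.add(value)
    return True


# Hypothetical sample rows for a student entity.
students = [
    {"rollno": 1, "name": "Aman", "city": "Ludhiana"},
    {"rollno": 2, "name": "Akash", "city": "Jalandhar"},
    {"rollno": 3, "name": "Aman", "city": "Patiala"},
]

print(is_candidate_key(students, ["rollno"]))  # rollno values are all distinct
print(is_candidate_key(students, ["name"]))    # "Aman" appears twice
```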
3.3.6 Strong and weak entities
Entities with at least one identified key are called strong entities or regular entities.
Entities that rely on other entities for their existence are called weak entities. Weak
entities may not have candidate keys, although the actual meaning of a weak entity is
“one that depends on another for existence.”

3.4 DOMAIN OF ATTRIBUTES


A domain is the set of all possible values that an attribute may validly contain.
Domain is also known as the range of values from which an attribute can be drawn.
Domains are often confused with data types; they are not the same. Data type is a
physical concept while domain is a logical one. “number” is a data type; “age” is a
domain.
To give another example, “SubjectName” and “Surname” might both be text fields, but
they are obviously different kinds of text fields; they belong to different domains.
Consider also the domain AwardedDegree, which represents the degrees awarded by a
university. This attribute might be defined as text of up to three characters in the database
schema, but its value must be a member of the set {BA, BE, MA, ME, MD, MS, PhD}.
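The difference between a data type and a domain can be sketched as follows. The set of degrees comes from the example above; the names `AWARDED_DEGREE` and `in_awarded_degree_domain` are assumptions made for this illustration.

```python
# The domain: the set of values the attribute may validly contain.
AWARDED_DEGREE = frozenset({"BA", "BE", "MA", "ME", "MD", "MS", "PhD"})


def in_awarded_degree_domain(value) -> bool:
    """A value must have the right data type AND belong to the domain."""
    return isinstance(value, str) and value in AWARDED_DEGREE


# "XYZ" has the right data type (short text) but lies outside the domain.
print(in_awarded_degree_domain("PhD"), in_awarded_degree_domain("XYZ"))
```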
3.5 RELATIONSHIPS
Relationships are the associations among the various entities. The relationships become
the glue that holds the database together. Relationships are shown on the conceptual
design diagram (refer to figure 1) as lines connecting one or more entities. Each end of a
relationship line shows the maximum cardinality of the relationship, which is the
maximum number of instances of one entity that can be associated with the entity on the
opposite end of the line.
The entities that are related are called participants, and the number of participants in a
relationship is its degree. Most relationships are binary, with two participants, but
unary relationships (a relation that is related to itself) are also common.
By cardinality, relationships may be classified into three types: one-to-one, one-to-many, and
many-to-many.
A relationship set can be thought of as a set of n-tuples:
$\{(e_1, \ldots, e_n) \mid e_1 \in E_1, \ldots, e_n \in E_n\}$
Each n-tuple denotes a relationship involving n entities $e_1$ through $e_n$, where entity $e_i$ is in
entity set $E_i$.

Formal Terminology
Conceptual    Physical
relation      table
attribute     field
tuple         record
3.5.1 Constraint and Cardinality
Constraints are rules that govern how data is stored in or retrieved from a database. They
restrict what data users can store and retrieve. Constraints should be added after a logical data
model is stable. A primary key consists of one or more attributes of an entity that
distinguish each record from the others. Cardinality concerns the number of
instances involved in a relationship. A relationship can be said to be either a 1:1
(one-to-one) relationship, a 1:M (one-to-many) relationship, or an M:M (many-to-many)
relationship.
3.6 ENTITY RELATIONSHIP MODELING
Entity relationship modeling is the process of visually representing entities, attributes, and
relationships, producing a diagram called an entity relationship diagram (ERD).
Non-technical people can easily understand it, and it provides great value to technical persons.
ERDs are platform independent and can even be used for non-relational databases if
desired. It was developed by Peter Chen in 1976. Many variations of the ERD have been
developed by various vendors, computer scientists, and academics. The elements
common to all ERD formats are shown below:
Entities are represented as rectangles or boxes.
Relationships are represented as lines.
Line ends indicate the maximum cardinality of the relationship.
Symbols near the line ends indicate the minimum cardinality of the relationship.
Attributes may be optionally included.
3.6.1 One-to-One ( 1: 1) Relationships
A one-to-one relationship is an association where an instance of one entity can be
associated with at most one instance of the other entity, and vice versa. The one-to-one
relationship is the simplest type of relationship. A pair of tables has a one-to-one relationship
when a single record in the first table is related to only one record in the second table, and a
single record in the second table is related to only one record in the first table. In a
one-to-one relationship, one table serves as a “parent” table and the other
serves as a “child” table. One-to-one relationships are not very common among entities.
In the following figures, the relationship between the Employee and Pay Receivable entities is
one-to-one. This means that an Employee can have at most one associated Pay Receivable, and a
Pay Receivable can have at most one associated Employee. The relationship is also mandatory
in both directions, meaning that an employee must have at least one pay receivable
associated with it, and a pay receivable must have at least one employee associated with
it. Putting this all together, we can read the relationship between the Employee and Pay
Receivable entities as “one employee has one and only one associated pay receivable, and one pay
receivable has one and only one associated employee.”
The following figure shows an example of a typical one-to-one relationship. In this figure,
Employee is the parent table and Pay Receivable is the child table. A single record in the
Employee table can be related to only one record in the Pay Receivable table, and a single
record in the Pay Receivable table can be related to only one record in the Employee table.
The two tables are linked together by Employee ID, which is the primary key in both
tables.
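A one-to-one relationship with a shared primary key can be sketched with Python's built-in sqlite3 module. The table and column names and the sample values below are assumptions made for illustration. The sketch enforces "at most one pay receivable per employee" and "every pay receivable belongs to an existing employee"; it does not enforce that every employee must have a pay receivable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE pay_receivable (
    -- Same key as the parent table: primary key and foreign key at once.
    employee_id INTEGER PRIMARY KEY REFERENCES employee(employee_id),
    amount      REAL NOT NULL
);
""")
conn.execute("INSERT INTO employee VALUES (1, 'Ajit Kumar')")
conn.execute("INSERT INTO pay_receivable VALUES (1, 25000.0)")


def try_insert(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second pay record for employee 1 violates the primary key.
second_pay_ok = try_insert("INSERT INTO pay_receivable VALUES (1, 30000.0)")
# A pay record for a non-existent employee violates the foreign key.
orphan_pay_ok = try_insert("INSERT INTO pay_receivable VALUES (2, 30000.0)")
print(second_pay_ok, orphan_pay_ok)
```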
3.6.2 One-to-Many (1: M) Relationships
A one-to-many relationship is an association between two entities where any instance of
the first entity may be associated with one or more instances of the second, and any
instance of the second entity may be associated with at most one instance of the first.
The relationship between Department and Faculty as shown in the figure, which is
mandatory in only one direction, is read as follows: “At any point in time, each
Department can have zero to many Faculties, and each Faculty must have one and only
one Department.” One-to-many relationships are quite common. In fact, they are the
fundamental building block of the relational database model in that all relationships in a
relational database are implemented as if they are one-to-many.
A one-to-many relationship exists between a pair of tables when a single record in the first
table can be related to many records in the second table, but a single record in the second
table can be related to only one record in the first table.
The example in the following figure illustrates a typical one-to-many relationship. A single
record in the Department table can be linked to one or more records in the Faculty table,
but a single record in the Faculty table is related to only one record in the Department
table. Department ID is a foreign key in the Faculty table (more detail in Chapter 4).
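The Department and Faculty relationship can be sketched with sqlite3 as shown below. Column names, lecturer names and the sample rows are hypothetical. The NOT NULL foreign key expresses that each Faculty record must have one and only one Department, while a Department may have no Faculty at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (
    department_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE faculty (
    faculty_id    INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    -- The foreign key lives on the "many" side of the relationship.
    department_id INTEGER NOT NULL REFERENCES department(department_id)
);
INSERT INTO department VALUES (1, 'Computer Engg.'), (2, 'Electrical Engg.');
INSERT INTO faculty VALUES (230, 'Lecturer A', 1), (234, 'Lecturer B', 1);
""")

# LEFT JOIN keeps departments that have no faculty (count 0).
rows = conn.execute("""
    SELECT d.name, COUNT(f.faculty_id)
    FROM department AS d
    LEFT JOIN faculty AS f ON f.department_id = d.department_id
    GROUP BY d.department_id
    ORDER BY d.department_id
""").fetchall()
print(rows)
```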
3.6.3 Many-to-Many ( M:M ) Relationships
A many-to-many relationship is an association between two entities where any instance of
the first entity may be associated with zero, one, or more instances of the second, and vice
versa. The following figure shows a many-to-many relationship between Teacher and Student.
The relationship is read as: “At any point in time, each Teacher teaches zero to many students,
and each student is being taught by zero to many teachers.”
This particular relationship has data associated with it, as shown in the diamond on the
diagram. Data that belongs to a many-to-many relationship is called intersection data.
Many-to-many relationships are quite common, and most of them will have intersection
data. The relational model does not directly support many-to-many relationships. In a
relational database, some changes are required to map the conceptual model to the
corresponding logical model. The intersection data is mapped to a separate table (an
intersection table). The many-to-many relationship is mapped to two one-to-many
relationships. The intersection table is kept in the middle, on the “many” side of both
relationships.
A pair of tables has a many-to-many relationship when a single record in the X table can
be related to many records in the Y table and a single record in the Y table can be related
to many records in the X table. The relationship is established with a linking table. A
linking table makes it easy for the database designer to link records from one table with those
of the other. It helps users add, delete, or modify related data without any problem. A
linking table is defined by taking copies of the primary key of each table in the
relationship. The structure of the new table is formed from these primary keys.
As shown in the following figure, there is no direct link between the Teacher and Student tables.
To implement the many-to-many relationship, a third table, teaching, is created with the
primary keys from both tables.
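The linking table can be sketched with sqlite3 as follows. The table name teaching follows the text; the column names, the `subject` column (standing in for intersection data) and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE teacher (teacher_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE teaching (
    teacher_id INTEGER NOT NULL REFERENCES teacher(teacher_id),
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    subject    TEXT,  -- hypothetical intersection data
    PRIMARY KEY (teacher_id, student_id)
);
INSERT INTO teacher VALUES (234, 'Teacher A'), (235, 'Teacher B');
INSERT INTO student VALUES (34698, 'Student X'), (37798, 'Student Y');
INSERT INTO teaching VALUES
    (234, 34698, 'DBMS'), (234, 37798, 'DBMS'), (235, 34698, 'Networks');
""")

# One teacher, many students ...
students_of_234 = [r[0] for r in conn.execute(
    "SELECT student_id FROM teaching WHERE teacher_id = 234 ORDER BY student_id")]
# ... and one student, many teachers.
teachers_of_34698 = [r[0] for r in conn.execute(
    "SELECT teacher_id FROM teaching WHERE student_id = 34698 ORDER BY teacher_id")]
print(students_of_234, teachers_of_34698)
```

The teaching table sits on the "many" side of two one-to-many relationships, one from teacher and one from student.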
3.6.4 Unary or Recursive Relationships
Unary relationships have only one participant: the relation is associated with itself. These
relationships can exist between entity instances of the same type. These are also called
recursive relationships. An example of a unary relationship is Employee to Manager.
One’s manager is, in most cases, also an employee with a manager of his or her own.
Unary relationships can be of any cardinality. One-to-many unary relationships are used to
implement hierarchies, such as the organizational hierarchy implicit in the Employee-
Manager relationship.
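A one-to-many unary relationship can be sketched as a table whose foreign key refers back to the same table. The column names, the sample employees and the helper `chain_of_command` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    manager_id  INTEGER REFERENCES employee(employee_id)  -- NULL at the top
);
INSERT INTO employee VALUES (1, 'Director', NULL);
INSERT INTO employee VALUES (2, 'Manager', 1);
INSERT INTO employee VALUES (3, 'Clerk', 2);
""")


def chain_of_command(employee_id):
    """Follow manager_id links upward until an employee has no manager."""
    chain = []
    current = employee_id
    while current is not None:
        name, manager_id = conn.execute(
            "SELECT name, manager_id FROM employee WHERE employee_id = ?",
            (current,)).fetchone()
        chain.append(name)
        current = manager_id
    return chain


print(chain_of_command(3))
```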
Example 1:
In following figure the line drawn between ‘Computer Engg.’ and ‘230’ indicates that
lecturer 230 is employed by the Computer Engg department. Note that the cardinality of
the relationship is 1:M. The department ‘Computer Engg’, for instance, has two lecturers
associated with it. The entity Lecturer has mandatory participation while Department has
optional participation.
Electrical Engg., for instance, is not associated with any lecturers.
Example 2:
In following figure, Lecturer to Student is a many-to-many relationship. Lecturer 234, for
example, teaches students 34698 and 37798. The participation of Lecturer is optional;
Student is mandatory.
3.7 MAPPING THE ENTITY DIAGRAM TO A RELATIONAL DATABASE
Converting an ER diagram into a database is called mapping. A relational database is also
represented by two-dimensional tables called “relations.” The tables are composed of rows
and columns. The rows are called tuples and the columns, attributes. All attributes of a
relation must be atomic and keys must not be null. Mapping rules are required to convert
ER diagram into a relational database.
3.7.1 Rule 1: Mapping of strong entities
For strong entities, create a new table/relation for each strong entity. The indicated key of
the strong entity is made the primary key of the table. If there is more than one candidate
key in ER diagram, choose one as a primary key.
3.7.2 Rule 2: Mapping atomic attributes
For entities with atomic attributes, entities are mapped to a table by forming columns for
the atomic attributes.
Example 3:
Conversion of the following ER diagram with atomic attributes into a relational database:
A relational database for the above entity diagram, with some data, would look like the following
table:
The entity name, Student, would be the name of the relation/table. The attributes in the
diagram become the column headings.

3.7.3 Rule 3: Mapping of composite attributes


For entities with composite attributes, entities are mapped to a table by forming columns
from the atomic parts of the composite attributes.
Example 4:
Mapping of the composite attributes of the following ER diagram:
A relational database that corresponds to the entity diagram in the above figure, with some
data, would look like the following table:

3.7.4 Rule 4: Mapping of multi-valued attributes


For multi-valued attributes, make a separate table for the multi-valued attribute. Keep a
row for each value of the multivalued attribute, with the key from the original table. The
key of the new table will be the combination of the multivalued attribute and the key of
the owner entity. Eliminate the multi-valued attribute from the original table.
Example 5:
In the following figure, school is a multi-valued attribute, meaning that a student may
have more than one school.

Now suppose that the above example had name as a key. It would be mapped
into two relations: a relation with the multi-valued attribute, and a resulting relation
with the multi-valued attribute excised.
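Rule 4 can be sketched with sqlite3 as below. As in the text, name is assumed to be the key of student; the table name student_school, the `city` column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Original relation with the multi-valued attribute removed.
CREATE TABLE student (
    name TEXT PRIMARY KEY,
    city TEXT  -- hypothetical single-valued attribute
);
-- One row per value of the multi-valued attribute.
CREATE TABLE student_school (
    name   TEXT NOT NULL REFERENCES student(name),
    school TEXT NOT NULL,
    PRIMARY KEY (name, school)  -- owner key combined with the attribute value
);
INSERT INTO student VALUES ('Aman', 'Ludhiana'), ('Akash', 'Jalandhar');
INSERT INTO student_school VALUES
    ('Aman', 'School A'), ('Aman', 'School B'), ('Akash', 'School A');
""")

# Collect the schools of each student from the separate table.
schools = {}
for name, school in conn.execute(
        "SELECT name, school FROM student_school ORDER BY name, school"):
    schools.setdefault(name, []).append(school)
print(schools)
```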
3.8 MAPPING RELATIONSHIPS TO A RELATIONAL DATABASE
In this section we learn how to map relationships. Before mapping, go
through these steps:
Identify the entities.
Add attributes to entities, identifying primary keys.
Identify the relationships that exist between the entities.
3.8.1 Rule: mapping M:M relationships
For each M:M relationship, form a new table (relation) with the primary keys of
the two entities (owner entities) that are being related in the M:M relationship. The key of
this newly created table will be the concatenated keys of the owner entities. Include any
attributes that the M:M relationship may have in this new table.
Example 6:
For example, refer to following ER diagram. The student and course tables have
the following data:

x = students, y = courses, relationship = admission


Students who are recorded in the database are admitted to many courses.
If RegNo and CourseID are the primary keys of the student and course relations,
respectively, then to map the M:M relationship we form a relation called admission, as
follows:

Both CourseID and RegNo together are the primary key of the relation admission.
3.8.2 Rule: Mapping 1:M relationships
For binary 1:M relationships, we first have to find what kind of participation constraint the M
side of the relationship has. If the M side has full
participation, include the key of the entity from the 1 side in the relation on the M side as
a foreign key.
Example 7:
For example, in the following ERD, we assume full participation on the student side:
Rooms may have zero or more students, and students must live in one and only one room.
The relational implementation would be:

Here, the full participation is on the student entity side, that is, the M side. The key is taken
from the room relation (RoomNo) and included in the student relation. If the relationship
had an attribute, it would be included on the M side, that is, in student.
3.8.3 Rule: Mapping 1:M relationships with partial participation
For binary 1:M relationships, if the M-side has partial participation, the 1:M relationship is
handled just like a binary M:M relationship. A separate table is created for the
relationship. The key of the newly created table consists of a concatenation of the keys of
the related entities. Include any attributes that were on the relationship, on this new table.
3.8.4 Rule: Mapping 1:1 relationships
For binary 1:1 relationships, include the primary key of Entity X into Entity Y as the
foreign key.
Example:
In the following ERD there is a one-to-one relationship between employee and
house.

To map this relationship, the key of one table is added to the other; for example, EmployeeID is added to
the house relation.
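One way to sketch this mapping with sqlite3 is shown below. EmployeeID is added to the house relation as a foreign key, and a UNIQUE constraint (an addition for this sketch, not stated in the rule itself) keeps the relationship one-to-one. The column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE house (
    house_no    INTEGER PRIMARY KEY,
    -- UNIQUE stops two houses from pointing at the same employee.
    employee_id INTEGER UNIQUE REFERENCES employee(employee_id)
);
INSERT INTO employee VALUES (1, 'Ajit Kumar'), (2, 'Aman');
INSERT INTO house VALUES (101, 1);
""")

try:
    conn.execute("INSERT INTO house VALUES (102, 1)")  # employee 1 already has a house
    second_house_allowed = True
except sqlite3.IntegrityError:
    second_house_allowed = False
print(second_house_allowed)
```

Without the UNIQUE constraint the same tables would describe a one-to-many relationship from employee to house.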

REVIEW QUESTIONS

Short Answer type questions


1. ERD stands for ………..
2. Entities that rely on other entities for their existence are called ………..
3. M:M is a ……….. relationship.
4. Unary relationships have only …. participant; the relation is associated with itself.
5. The rows are called ……….. and the columns, ………..
6. A relation is also known as a table. (True/False)
7. An entity is represented as a rectangle on the diagram. (True/False)
8. An entity set is a set of entities of the same type. (True/False)
9. A composite attribute is formed by combining related attributes. (True/False)
10. A key uniquely identifies rows. (True/False)
Long Answer type questions
1. What do you mean by an entity and an entity set?
2. What is an attribute?
3. Describe the various types of attributes.
4. What do you mean by a relation?
5. What is ER diagram?
6. What is the goal of ER Modeling?
7. What do you mean by domain of Attributes?
8. What do you mean by Relationships?
9. Explain following terms
a. 1:1 relationships
b. 1:M relationships
c. M:1 relationships
d. M:M relationships
e. Unary Relationships
10. What is mapping?
11. Explain various mapping rules for Entity Diagram with suitable examples.
12. How do you map relationships?

Chapter 4
RELATIONAL MODEL

4.1 INTRODUCTION
The relational data model is unusual in being largely due to the efforts of one man, E.F.
Codd. In 1970 E.F. Codd published a seminal paper which laid the foundation for
probably the most popular of the contemporary data models. A database is effectively a set
of data structures for organising and storing data. In any data model, and consequently in
any DBMS, we must have a set of principles for exploiting such data structures for
information systems applications within organisations. Data definition is the process of
exploiting the inherent data structures of a data model for a particular organisational
application.
4.2 DOMAINS
For each column of a table there is a set of possible values called its domain. The primary unit
of data in the relational data model is the data-item. Such data-items are said to be non-
decomposable or atomic. A set of such data-items of the same type is said to be a domain.
Domains are therefore pools of values from which actual values appearing in the columns
of a table are drawn.
In the above figure we have five columns in the student table. Let d1 represent the set of all
registration numbers, d2 the set of names, d3 the set of semesters, d4 the set of branches and
d5 the set of cities.
So the table Student can be represented as a subset of:
d1 × d2 × d3 × d4 × d5
Example 1:
The possible domain definitions for the attributes person-name, date-of-birth,
and city might be:
PERSON-NAMES = {Ajit Kumar, Aman, Ram Gopal, Akash, Anjali, Balraj}
CITY = {Hoshiarpur, Jalandhar, Ludhiana, Patiala, Bathinda}
DATE-OF-BIRTH = {a string in dd-mm-yyyy format, such as 03–02–1940, where dd, mm, and
yyyy represent the day, month, and year}
To make the distinction between the attributes and their corresponding domains, we are
using small letters for the names of the attributes and capital letters for the domains. An
empty set {} is a member of any of these domains.
4.3 TABLE
According to the relational model, data in a relational database is stored in relations,
which are perceived by the user as tables. Each relation is composed of tuples (records)
and attributes (fields). A table that stores data used to supply information is called a data
table, and it is the most common type of table in a relational database. Data in this type of
table is dynamic because you can manipulate it (modify, delete, and so forth) and process
it into information in some form or fashion.
4.4 FIELD
A field (also called an attribute in theory of relational database) is the smallest structure in
the database. It shows a characteristic of the subject of the table to which it belongs. Fields
are the structures that actually hold data. The data in these fields can then be retrieved and
presented as useful information. The importance of fields cannot be ignored. Every field
contains one and only one value, and its name identifies the type of value it stores. If
you see fields with names such as RollNo, Name, City, State, and Zipcode, you can easily
imagine exactly what type of values go into each field.
Three other types of fields appear in an improperly or poorly designed database:
A multipart field (composite field), which holds two or more distinct items within
its value.
A multivalued field, which contains multiple instances of the same type of value.
A calculated field, which holds a concatenated text value or the result of a
mathematical expression.
4.5 RECORD
A record (a tuple) represents a unique instance of the subject of a table. It is composed of
the entire set of fields in a table, regardless of whether or not the fields contain values.
Each record should be identified throughout the database by a unique value in the primary
key field of that record. Records are a key factor in understanding table relationships.
Table relationships describe how a record in one table relates to records in another table.
4.6 RELATIONS
A relation may be thought of as a set of rows. A relation may alternately be thought of as a
set of columns. There is only one data structure in the relational data model – the relation.
The relational data model consists of the following three component parts:
structure - a uniform single data structure type called a relation
manipulations - a set of operators that transform relations into other relations
behaviour - general integrity rules that guard the consistency of any database
Example 2:
Let us take a product of the domains defined in Example 1, that is P = PERSONNAMES ×
DATES-OF-BIRTH × CITIES. P will then contain all the tuples obtained by
combining all the values from these three domains:
P={<Ajit Kumar, 03–02–1940, Hoshiarpur>, <Ajit Kumar, 03–02–1940, Jalandhar>,
……………, <Ajit Kumar, 13–02–1947, Ludhiana>,…………, <Aman, 1–09–1960,
Hoshiarpur>,..…<Balraj, 31–12–1999, Bathinda>}
For the proposition ‘A person with person-name exists, is 50 years of age or more, and was born
on date-of-birth at city’, the corresponding relation might hold only the following triples:
PERSON50={<Ajit Kumar, 12–10–1950, Hoshiarpur>,
<Aman, 12–11–1950, Ludhiana>,
<Akash, 13–11–1951, Jalandhar>}
PERSON50 is a subset of P and contains only those tuples of P that represent information
about the people in that society who are 50 years of age or more.
It is very easy to show relations in a tabular form as shown in the following figure. Each row
of the table is a distinct tuple; the degree of the relation is the number of columns and its
cardinality is the number of rows in the table. The attribute names may not be the same as
the names of their underlying domains. In a relation, no two tuples can be identical and the
order in which the tuples appear is not significant. The values in the relations will thus be
referred to through a combination of a relation name and the attribute name. Tuples are
identified solely through their attribute values in the relational model. An attribute (or a set
of attributes) uniquely identifies any particular tuple within a relation.
Because the idea of a relation is modeled on a mathematical construct, a relation is a table
which obeys a certain restricted set of rules:
Every relation in a database must have a unique name.
Every column in a relation must have a unique name within the relation.
All entries in a column must be of the same kind. They are said to be defined on the same domain.
Each row in a relation must be unique. Duplicate rows are not allowed in a relation.
The ordering of rows and columns in a relation is not significant.
Each cell or column/row intersection in a relation should contain only an atomic value. Multiple values are not allowed in the cells of a relation.
4.5.1 Definition
A relation R is a subset of an expanded Cartesian product of n not necessarily distinct
domains D1 × D2 × … × Dn, such that for every element dk = <dk1, dk2, …, dkn> ∈ R a
predefined proposition p(<dk1, dk2, …, dkn>) is true;
dki ∈ Di for every i = 1, 2, …, n, where k is the number of n-tuples in the relation R
(cardinality of R) and n is the number of attributes in the relation R (degree of R). Let us
take the first row of the above table as an example of a tuple t: t(Registration No.) =
6226204201, t(Name) = “Amrik Singh”, t(Semester) = 1, t(Branch) = “CSE”, t(City)
= “Jalandhar”.
A relation scheme R can be represented by a set of attributes a1, a2, a3, …, an. The non-empty
set Di (1 ≤ i ≤ n) is known as the domain of attribute ai. It is denoted as Dom(ai). A relation r of
relation scheme R is a set of tuples {t1, t2, t3, …, tm}, each a mapping from R to D.
4.6 VIEW
A view is a “virtual” table made of fields from one or more tables in the database. The tables
used to compose a view are called base tables. The relational model refers to a view as
“virtual” because it takes data from base tables rather than storing data on its own. The
only information about a view that is kept in the database is its structure. Many major
RDBMS programs support views. Views enable us to see the information in our database
from many different aspects, providing us with a great amount of flexibility. Views are
created in a variety of ways. They are very useful when they are based on multiple related tables.
For example, in a college scheduling database, we can create a view that consolidates data
from the students, classes, courses and class schedules tables.
4.7 CONSTRAINTS
It is very important for a user to know what data can be stored in or retrieved from a
database. Constraints are rules that restrict how data is stored and retrieved. They should be
added after a logical data model is stable. Primary and, to some extent, foreign key
constraints fit this definition. They are critical components of a relational database.
4.7.1 Key Constraints
Keys are important for a table structure for the following reasons:
They ensure that each record in a table is uniquely identified. The complete set of records
within the table constitutes the collection, and each record represents a unique instance of
the table’s subject within that collection. A key is a means of accurately identifying each
instance.
Various types of integrity are established and enforced by keys. Keys play a major role in
table-level integrity and relationship-level integrity. Keys ensure that a table has unique
records and that the fields that are used to establish a relationship between a pair of
tables always contain matching values.
Table relationships are established by keys.
4.8 CANDIDATE KEYS
The first type of key that is established for a table is the candidate key, which is a field or
group of fields that uniquely identifies a single instance of the table’s subject. Each table
must have at least one candidate key. We examine the table’s pool of available
candidate keys and designate one of them as the official primary key for the table.
Before a field is designated as a candidate key, we must make certain it complies with all
of the Elements of a Candidate Key. This set of guidelines determines whether the field is fit to
serve as a candidate key. A field cannot be designated as a candidate key if it fails to
conform to any of these elements.
4.8.1 Elements of a Candidate Key
It cannot be a multipart field. Using a multipart field as an identifier is a bad idea.
It should not contain null values, because a null value represents the absence of a value.
Its value is not optional, in whole or in part. An optional value may be null at some point.
It comprises the minimum number of fields necessary to define uniqueness. A combination of fields can be used as a candidate key, so long as each field contributes to defining a unique value.
Its values must uniquely identify each record in the table. This protects us against duplicate records.
Its value must exclusively identify the value of each field within a given record. The table’s candidate keys thus give the only means of identifying each field value within the record.
Its value can be modified only in rare cases.
Example 3:
In the following figure we might initially identify Registration No., Name, Address, Branch and
City as potential candidate keys. However, we have to examine these fields more closely to
determine which ones are truly eligible to become candidate keys.

Upon close examination, we can draw the following conclusions:


Registration No. is eligible because this field conforms to every element of a candidate
key.
Name is ineligible because it can contain duplicate values. The values of a candidate key
must be unique, but in this case there can be more than one occurrence of a particular
name.
Address is ineligible because it can contain duplicate values. Many people can live in the
same house.
Name and Address together are eligible. The combined values of both fields give a unique
identifier for a given record.
Branch is ineligible because it can contain duplicate values, as shown in the figure.
City is ineligible because it can contain duplicate values, as shown in the figure.
The Student table therefore has two candidate keys: Registration No. and the combination of the Name
and Address fields.
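The uniqueness test applied in Example 3 can be sketched in a few lines of Python. This is only an illustration: the sample rows are hypothetical (the data of the figure is not reproduced here) and `is_unique` is a helper name invented for this sketch.

```python
def is_unique(rows, fields):
    """Return True if no two rows share the same values for `fields`."""
    seen = set()
    for row in rows:
        key = tuple(row[f] for f in fields)
        if key in seen:
            return False
        seen.add(key)
    return True


# Hypothetical sample rows for a Student table
students = [
    {"RegistrationNo": 1001, "Name": "Amrik Singh", "Address": "12 Model Town", "City": "Jalandhar"},
    {"RegistrationNo": 1002, "Name": "Amrik Singh", "Address": "7 Civil Lines", "City": "Ludhiana"},
    {"RegistrationNo": 1003, "Name": "Balraj", "Address": "12 Model Town", "City": "Jalandhar"},
]

for fields in (["RegistrationNo"], ["Name"], ["Address"], ["Name", "Address"], ["City"]):
    print(fields, "unique in sample:", is_unique(students, fields))
```

A check of this kind can only rule a field out. Passing it on sample data does not prove that a field is a candidate key, because eligibility depends on every value the table may ever hold.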
4.8.2 Primary Keys (PK)
A primary key is used to uniquely identify a record in a table. Unique identification of
each record is required because, without a unique identifier, there is no way to look up a
record without the possibility of finding more than one.
A primary key value uniquely identifies a given record within a table and represents that
record throughout the entire database. It also helps to protect against duplicate records. A
primary key must conform to exactly the same elements as a candidate key. A primary key is
selected from a table’s pool of available candidate keys: identify each qualified candidate
key in the table, and select one of them to become the official primary key of the table. In
addition to being unique, a primary key cannot be null. There must be a value for the
attribute every time. Another characteristic is that primary keys are indexed. This allows
data to be sorted and retrieved faster.
Here are a couple of guidelines to select an appropriate primary key:
If there is a single-field candidate key and a composite candidate key, choose the
single-field candidate key. The candidate key that contains the fewest fields is the most suitable
primary key.
Choose a candidate key that incorporates part of the table name within its own name. For
example, a candidate key with a name such as EmployeeID is a good choice for the
Employee table.
Example 4:
In the above figure, Registration No. uniquely identifies a student. Therefore, it is
better to use Registration No. as the primary key for the Student table.
Example 5:
Examine the candidate keys and choose one to serve as the primary key for the table. The
choice is largely arbitrary—you can choose the one that you believe most accurately
identifies the table’s subject or the one that is the most meaningful to everyone in the
organization. For example, consider the Student table again in following figure.
In this table we have two keys; one is Registration No., which is a candidate key (CK), and
the other is the combination of the Name and Address fields, called a composite candidate key
(CCK). Either of the candidate keys (Registration No. or Name + Address) within the table
could serve as the primary key. Most users would prefer Registration No. if
everyone in the college is accustomed to using this number as a means of identifying
students. The selected candidate key becomes the primary key of the table and is governed by
the Elements of a Primary Key. These elements are exactly the same as those for the
candidate key.
4.8.3 Rules for Establishing a Primary Key (PK)
Each table must have one and only one primary key. Only one primary key is necessary
for a particular table.
Each primary key within the database must be unique. No two tables should have the
same primary key unless one of them is a subset table. A table is identified throughout the database
structure by its primary key; therefore, each table must have its own unique
primary key. This avoids any possible confusion concerning the table’s identity.
4.8.4 Alternate Keys
When a candidate key is selected as the primary key of a particular table, the remaining
candidate keys are called alternate keys. They give users an alternative means of uniquely identifying a
particular record within the table. An alternate key is marked with “AK” or
“CAK” (composite alternate key) in the table structure.
4.8.5 Foreign keys (FK)
Foreign keys are copies of primary keys placed in child tables to form the opposite
side of the link in an inter-table relationship. A foreign key defines the reference for each
record in the child table, referencing back to the primary key in the parent table.
Elements of a Foreign Key
It should have the same name as the primary key from which it was copied.
It uses a replica of the field specifications for the primary key from which it was copied.
It gets its values from the primary key to which it refers.
Example 6:
In the following figure, Customer ID is the primary key (PK) in the Customer table (parent table)
and acts as a foreign key (FK) in the Order table (child table). Similarly, Employee Number is the
primary key in the Employee table (parent table) and a foreign key in the Order table (child table).
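The parent/child arrangement of Example 6 can be tried out with Python's built-in sqlite3 module. The sketch below is an assumption-laden illustration rather than the book's own schema: the table layouts, the sample values and the name Orders (used because ORDER is a reserved word in SQL) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Parent table: CustomerID is the primary key.
conn.execute("CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT)")
# Child table: CustomerID is a foreign key referring back to Customer.
conn.execute(
    "CREATE TABLE Orders ("
    " OrderID INTEGER PRIMARY KEY,"
    " CustomerID INTEGER REFERENCES Customer(CustomerID))"
)

conn.execute("INSERT INTO Customer VALUES (1, 'Aman')")
conn.execute("INSERT INTO Orders VALUES (100, 1)")  # parent record exists: accepted

try:
    conn.execute("INSERT INTO Orders VALUES (101, 99)")  # there is no customer 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
print("second order rejected:", rejected, "| orders stored:", order_count)
```

The second insert fails because the foreign key value has no matching primary key value in the parent table, which is the behaviour the text describes for child records.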

4.8.6 Non-keys
A non-key is a field that does not act as a candidate, primary, alternate, or foreign key. Its
sole purpose is to represent a characteristic of the table’s subject. Its value is determined
by the primary key. There is no particular designation for a non-key, so it is not marked in
the table structure.
4.9 CONSTRAINTS ON NULL
A null represents a missing or unknown value. A null does not represent a zero or a text
string of one or more blank spaces. The reasons are quite simple.
A zero can have a very wide variety of meanings. It can represent the state of an account
balance or the number of available products in stock.
Although a text string of one or more blank spaces is guaranteed to be meaningless to
most of us, it is still a valid value to languages such as SQL.
A zero-length string, written as two consecutive single quotes with no space in between (‘’), is also an
acceptable value to languages such as SQL.
4.9.1 The Value of Nulls
Missing values in a database are commonly the result of human error. Unknown values
appear in a table for a variety of reasons. One reason may be that a specific value needed
for a field is not yet defined. Each null in the State Code field shows a missing or unknown
State Code for the record in which it appears.
4.9.2 The Problem with Nulls
Nulls adversely affect mathematical operations. An operation involving a null evaluates to
null. If a number is unknown then the result of the operation is obviously unknown.
The following examples demonstrate the outcome of operations with nulls:
(20 × 3) + 7 = 67
(Null × 7) + 8 = Null
(20 × Null) + 14 = Null
(23 × 4) + Null = Null
Example 7 :
The Product table in the following figure illustrates the effects of null values on mathematical
expressions. The value for the Total Value field is derived from the mathematical
expression “[Price] × [Qty on Hand]”. Where the Qty on Hand value is null, the expression
evaluates to null, so the Total Value field is null as well. This
leads to a serious undetected error when all the values in the Total Value field
are added together: an inaccurate total. The only way to avoid this problem is to ensure
that the values for the Qty on Hand field cannot be null.
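The behaviour described above can be observed with Python's built-in sqlite3 module, where SQL NULL is returned as None. The Product rows below are hypothetical stand-ins for the figure, and the column names are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Arithmetic involving NULL evaluates to NULL (None in Python).
no_null = conn.execute("SELECT (20 * 3) + 7").fetchone()[0]
with_null = conn.execute("SELECT (NULL * 7) + 8").fetchone()[0]

conn.execute("CREATE TABLE Product (Name TEXT, Price REAL, QtyOnHand INTEGER)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [("Pen", 10.0, 5), ("Ink", 50.0, None), ("Pad", 20.0, 2)],  # Ink has an unknown quantity
)

# Total Value per row: the row with a null quantity yields a null total.
row_totals = [r[0] for r in conn.execute(
    "SELECT Price * QtyOnHand FROM Product ORDER BY rowid")]

# SUM silently skips the null row, so the grand total looks valid but is incomplete.
grand_total = conn.execute("SELECT SUM(Price * QtyOnHand) FROM Product").fetchone()[0]

print(no_null, with_null, row_totals, grand_total)
```

The grand total is computed without any warning even though one product contributed nothing to it, which is the undetected error that Example 7 warns about.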
4.10 DATA INTEGRITY
Data integrity refers to the validity, consistency, and accuracy of the data in a database.
There are four types of data integrity. The following is a brief description of each:
1. Table-level integrity (known as entity integrity) ensures that there are no duplicate
records within the table and that the field that identifies each record within the table is
unique and never null.
2. Field-level integrity ( known as domain integrity) ensures that the values in each field
are valid, consistent, and accurate; and that fields of the same type (such as NAME fields)
are consistently defined throughout the database.
3. Relationship-level integrity (known as referential integrity) ensures that the records in
the tables are synchronized whenever data is entered into, updated in, or deleted from
either table.
4. Business rules impose restrictions or limitations on certain aspects of a database based
on the ways an organization perceives and uses its data.
4.11 DOMAIN INTEGRITY
Domain integrity is an integrity constraint that determines the range of possible values for a
domain. A field specification represents all the elements of a field. This type of data integrity warrants the
following: the identity and purpose of a field is clear and all of the tables in which it
appears are properly identified; field definitions are consistent throughout the database;
the values of a field are consistent and valid; and the types of modifications, comparisons,
and operations that can be applied to the values in the field are clearly identified. Each
field specification incorporates three types of elements: general, physical, and logical.
General elements constitute the most fundamental information about the field and include
items such as Field Name, Description, and Parent Table.
Physical elements determine how a field is built and how it is represented to the person
using it. This category includes items such as Data Type, Length, and Display Format.
Logical elements describe the values stored in a field and include items such as Required
Value, Range of Values, and Default Value.
4.12 ENTITY INTEGRITY
Definition
No prime attribute of a relation may hold a null value
Entity constraints ensure the integrity of the entities being modeled by the system. Entity
integrity is an integrity rule which states that every table must have a primary key and that
the column or columns chosen to be the primary key should be unique and not null. If
each value of a primary key must be distinct, no duplicate rows can logically appear in a
table.
Example 8:
Let Account (AccountNo, Name, Type of Account, Amount) with a primary key
AccountNo be a relation in a bank database. Following figure shows its possible instance.

According to the definition, a NULL value in a primary key is not allowed; therefore the marked row
cannot be recorded in the database.
4.13 REFERENTIAL INTEGRITY
Referential Integrity ensures the integrity of relationships between primary and foreign
key values in related tables. In a relation between two tables, one table has a primary key
and the other a foreign key. The primary key uniquely identifies each record in the first
table. There can be only one record in the first table with the same primary key value. The
foreign key is placed into the second table in the relationship such that the foreign key
contains a copy of the primary key value from the record in the related table.
Primary and foreign keys are both constraints. A constraint is a piece of metadata (data
catalog) defined for a table defining restrictions on values. A primary key constraint forces
the primary key field to be unique.
Referential integrity constraints ensure that relationships between entities remain valid. No
record in the foreign table can contain a foreign key that doesn’t match a record in the
primary table.
Referencing (or referential) foreign key constraints can be in any table, including the same
table as the primary key constrained field referenced by the foreign key (a self join). A
foreign key constraint uses its reference to refer back to the referenced table, containing the
primary key constraint, to ensure that the values in the primary key field and foreign
key field match.
Referential integrity also allows cascading updates, where records in a foreign table are updated
automatically when the corresponding record in the primary table is changed. There are some specific
circumstances to consider in terms of how referential integrity is generally enforced:
A primary key table is assumed to be a parent table and a foreign key table a child table.
When adding a new record to a child table, if a foreign key value is entered, it must exist
in the related primary key field of the parent table.
Foreign key fields can contain NULL values. Primary key fields can never contain
NULL values, because they must uniquely identify every record.
When changing a record in a parent table, if the primary key is changed, the change must
be cascaded to all records with matching foreign key values in any related child tables.
When changing a record in a child table, a change to a foreign key requires that a related
primary key must be checked for existence, or changed first. If a foreign key is changed to
NULL, no primary key is required. If the foreign key is changed to a non-NULL value, the
foreign key value must exist as a primary key value in the related parent table.
When deleting a parent table record, the related foreign key records in child tables must
either be cascade deleted or deleted from the child tables first.
4.14 RELATIONAL ALGEBRA
The relational algebra is a set of eight operators. Each operator takes one or more relations
as input and produces one relation as output. The three main operators of the algebra are
restrict, project and join. Using these three operators most of the manipulation required of
relational systems can be accomplished.
The additional operators – product, union, intersection, difference and division are
modeled on the traditional operators of set theory. There is no standard syntax for the
operators of the relational algebra.
Relational Algebra operations:
select (σ),
project (π),
cross product (×),
union (∪),
intersection (∩),
difference (–),
join (⋈)
4.14.1 Restrict
Restrict (select) is an operator which takes one relation as input and produces a single
relation as output. Restrict can be considered a ‘horizontal slicer’ in that it extracts rows
from the input relation matching a given condition and passes them to the output relation.
It is also known as the selection operator. It is denoted by the symbol sigma (σ). The notation T
where c, or σc(T), is used, where T is a table expression and c is a condition. The selection
condition can use any of the comparison operators (=, !=, <=, <, >=, >).
Syntax for the restrict operator is as follows:
RESTRICT <table name> [WHERE <condition>] → <result table>
Example 9:
In the following figure, only those rows are extracted from table product where
Qty on Hand is greater than 10.

Example 10:
To select all rows from the Product relation/table where Qty on Hand is greater than
10, we can write:
σ Qty on Hand > 10 (Product)
4.14.2 Project
The project operator takes a single relation as input and produces a single relation
as output. Project is a ‘vertical slicer’ in that it produces in the output relation a subset of
the columns in the input relation. It is denoted by the symbol pi (π). We can represent the
projection operation by T[c1, c2, …], where T is a table expression and [c1, c2, …] is a
column list, also called a projection list.
Syntax for the project operator is as follows:
PROJECT <table name> [<column list>] → <result table>
Example 11:
In the following figure, only three columns (Product ID, Product Name, Price)
are extracted from table product.
Example 12:
To see three fields (Product ID, Product Name, Price), we can write:
π Product ID, Product Name, Price (Product)
Another example of the projection operator:
π Product Name (σ Price > 1000 (Product))
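Restrict and project can be sketched in Python by treating a relation as a list of rows, each row being a dictionary. The function names `restrict` and `project` and the sample Product rows are hypothetical, chosen only to mirror the two operators described above.

```python
def restrict(relation, condition):
    """Horizontal slice: keep the rows for which condition(row) is true."""
    return [row for row in relation if condition(row)]


def project(relation, columns):
    """Vertical slice: keep only `columns`, dropping duplicate rows."""
    result = []
    for row in relation:
        slim = {c: row[c] for c in columns}
        if slim not in result:  # a relation cannot hold duplicate rows
            result.append(slim)
    return result


# Hypothetical Product table
product = [
    {"ProductID": 1, "ProductName": "Monitor", "Price": 7000, "QtyOnHand": 12},
    {"ProductID": 2, "ProductName": "Mouse", "Price": 300, "QtyOnHand": 40},
    {"ProductID": 3, "ProductName": "Printer", "Price": 9000, "QtyOnHand": 4},
]

# Restrict: rows where Qty on Hand is greater than 10
in_stock = restrict(product, lambda r: r["QtyOnHand"] > 10)

# Project applied to the result of a restrict: names of products priced above 1000
names = project(restrict(product, lambda r: r["Price"] > 1000), ["ProductName"])

print(in_stock)
print(names)
```

Because each operator takes a relation and returns a relation, the output of restrict can be passed directly to project, as in the nested expression of Example 12.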

4.14.3 Product
Joins are based on the relational operator product, a direct analogue of an operator in set
theory known as the Cartesian product. It is denoted by the cross symbol (×). Product takes
two relations as input and produces as output one relation composed of all the possible
combinations of input tuples/rows. Product is a little-used operator in practice because of
its potential for generating an ‘information explosion’.
Syntax of the product operator is given below:
PRODUCT <table 1> WITH <table 2> → <result table>
Example 13:
PRODUCT Teacher WITH Subject → R
In the following figure we have two relations, Teacher and Subject.

The output of this product contains all combinations of the tuples/rows of both relations/tables.
We get a relation of six tuples with six attributes.
4.14.4 Equi-Join
The join operator takes two relations as input and produces one relation as output.
In Equi-Join two tables are combined together but only for records where the values
match in the join columns of two tables. The equi-join operator is a product with an
associated restrict. We shall assume that the primary key of one table and the foreign
key of the other table form the default join columns.
Syntax is:
EQUIJOIN <table1> WITH <table2> → <result table>
Example 14:
EQUIJOIN Teacher WITH Module → R

Here we have joined Teacher and Module, but only produced a row in R where a TeacherID value
in the Teacher table matches a TeacherID value in the Module table.

4.14.5 Natural Join


An equi-join does not remove the duplicate join column in the resulting table.
You can see in the above figure that TeacherID appears twice in the resulting table. A natural join
removes one of these join columns. The natural join operator is a product with an
associated restrict followed by a project that removes one of the join columns.
Syntax of Join:
JOIN <table 1> WITH <table 2> → <result table>
Example 15:
A natural join of Teacher with Module will produce the table R:
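The relationship between product, equi-join and natural join can be sketched in Python. The function names and the sample Teacher and Module rows below are hypothetical, and the natural join shown assumes that the join column is the only column the two tables share.

```python
def product(r, s, r_name, s_name):
    """Cartesian product; every column is prefixed with its table name."""
    return [
        {**{f"{r_name}.{k}": v for k, v in a.items()},
         **{f"{s_name}.{k}": v for k, v in b.items()}}
        for a in r for b in s
    ]


def equijoin(r, s, r_name, s_name, column):
    """Product followed by a restrict on equal join-column values."""
    return [row for row in product(r, s, r_name, s_name)
            if row[f"{r_name}.{column}"] == row[f"{s_name}.{column}"]]


def natural_join(r, s, column):
    """Equi-join in which the join column appears only once."""
    return [{**a, **b} for a in r for b in s if a[column] == b[column]]


# Hypothetical sample data
teacher = [{"TeacherID": 101, "TeacherName": "Manoj Kumar"},
           {"TeacherID": 102, "TeacherName": "Aman Kumar"}]
module = [{"ModuleName": "Introduction to DBMS", "TeacherID": 101},
          {"ModuleName": "Programming in C", "TeacherID": 101},
          {"ModuleName": "Networks", "TeacherID": 102}]

all_pairs = product(teacher, module, "Teacher", "Module")
equi = equijoin(teacher, module, "Teacher", "Module", "TeacherID")
natural = natural_join(teacher, module, "TeacherID")

print(len(all_pairs), "rows in the product")
print(len(equi), "rows in the equi-join, columns:", sorted(equi[0]))
print(len(natural), "rows in the natural join, columns:", sorted(natural[0]))
```

The equi-join keeps both Teacher.TeacherID and Module.TeacherID, while the natural join keeps a single TeacherID column, which is the difference the text points out.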
4.14.6 Union
Union is an operator which takes two compatible relations as input and produces one
relation as output. By compatible we mean that the tables have the same structure: the same
columns defined on the same domains.
Syntax:
<table 1> UNION <table 2> → <result table>
Example 16:
Teacher UNION Administrator → R

Note that although Manoj Kumar appears in both the Teacher and Administrator table, he
only appears once in R. This is because, since R is a relation, it cannot have duplicate
rows.

4.14.7 Intersection
Intersection is fundamentally the opposite of union. Whereas union produces the
combination of two sets or tables, intersection produces a result table which contains rows
common to both input tables.
Syntax of Intersection is:
<table 1> INTERSECTION <table 2> → <result table>
Example 17:
Lecturers INTERSECTION Administrators → R
It gives us the rows that are common to the Lecturers and Administrators tables.

4.14.8 Difference
In most operators of the relational algebra, the order of specifying input relations is
insignificant. A union of table 1 with table 2, for instance, is exactly the same as a union of
table 2 with table 1. Using difference, in contrast, the order of specifying the input tables
does matter.
Example 18:
Teacher DIFFERENCE Administrator → R
will produce all Teachers who are not administrators:
Teacher ID    Teacher Name    Designation
102           Aman Kumar      Sr. Lecturer
103           Nigam           Lecturer
Administrator DIFFERENCE Teacher → R will show the rows from the Administrator table that
are not Teachers.
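Union, intersection and difference map directly onto Python's set operations when each relation is held as a set of tuples. In this sketch the Designation column is omitted for brevity, the ID 101 for Manoj Kumar is an assumption, and the administrator Ravi is a hypothetical row added so that the two differences give different results.

```python
# Each relation is a set of (TeacherID, Name) tuples, so duplicates cannot occur.
teacher = {(101, "Manoj Kumar"), (102, "Aman Kumar"), (103, "Nigam")}
administrator = {(101, "Manoj Kumar"), (104, "Ravi")}

union = teacher | administrator          # Teacher UNION Administrator
intersection = teacher & administrator   # Teacher INTERSECTION Administrator
teacher_only = teacher - administrator   # Teacher DIFFERENCE Administrator
admin_only = administrator - teacher     # Administrator DIFFERENCE Teacher

print(sorted(union))
print(sorted(intersection))
print(sorted(teacher_only))
print(sorted(admin_only))
```

Manoj Kumar appears only once in the union, and swapping the operands of the difference changes the result, whereas swapping the operands of union or intersection does not.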

4.14.9 Division
Division or divide takes two tables as input and produces one table as output.
One of the input tables must be a binary table (i.e. a table with two columns). The other
input table must be a unary table (i.e. a table with one column). The unary table must
also be defined on the same domain as one of the columns in the binary table. The
fundamental idea of division is that we take the values of the unary table and check
them off against the associated column in the binary table. Whenever all the values in
the unary table are paired with the same value in the binary table, we output that value to
the output table.
Example 19:
Suppose that we maintain a table (ModuleDay) with a list of days on which
particular modules are taught:

We might want to find a common day on which both Introduction to DBMS and
Programming in C are taught. Hence, our unary table (PairedModule) would be:

If we divide the relation ModuleDay by the relation PairedModule we would get the following output relation:
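The checking-off procedure of division can be sketched in Python. The contents of ModuleDay below are hypothetical (the figure's data is not reproduced here), as are the function name `divide` and the choice to store the day in the first column of the binary table.

```python
def divide(binary, unary):
    """Return each value x such that (x, y) is in `binary` for every y in `unary`."""
    candidates = {x for x, _ in binary}
    return {x for x in candidates if all((x, y) in binary for y in unary)}


# Hypothetical binary table: (Day, Module)
module_day = {
    ("Monday", "Introduction to DBMS"),
    ("Monday", "Programming in C"),
    ("Tuesday", "Introduction to DBMS"),
    ("Wednesday", "Programming in C"),
}

# Unary table: the modules that must all be taught on the same day
paired_module = {"Introduction to DBMS", "Programming in C"}

common_days = divide(module_day, paired_module)
print(common_days)
```

With this sample data only Monday is paired with every module in the unary table, so it is the only value in the output.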
4.14.10 Formal notation for the relational algebra
A more formal notation for expressing operations in this formal language is presented in
the table below.
Operator: Restrict
Syntax: σ predicate (R)
The restrict / select operator transforms a single relation R into resulting tuples matching
the specified condition
Operator: Project
Syntax: π a1, …, an (R)
The project operator transforms a single relation R into a subset consisting of specified
attributes a1, …, an
Operator: Union
Syntax: R ∪ S
The union operator produces output from two relations R and S containing all the tuples of
R or S, or both R and S. Duplicate tuples are removed from the output
Operator: Difference
Syntax: R – S
The difference operator produces output from two relations R and S in which output tuples
exist in R but not S
Operator: Intersection
Syntax: R ∩ S
The intersection operator produces output from two relations R and S in which output
tuples exist both in R and S
Operator: Natural join
Syntax: R |><| S
A natural join is an equi-join of two relations R and S. One instance of each of the
common attributes in the output relation is projected out
4.15 RELATION SCHEME
Definition
It consists of a relation name and a set of attributes (also called field names or column names). Each
attribute has an associated domain.
Example 20 :
Example 21:
Relational Scheme
student (regID, name, degree, year, deptID, guide)
Here, degree is the program (BA, MA, LLB, LLM, PhD etc.) for which the student has
joined, year is the year of admission and guide is the empID of a faculty member.
department (deptID, name, hod, contactNo)
teacher (empID, name, gender, doj, deptID, phone)
Here, doj is the year of joining of the faculty member in the department deptID.
course (courseID, cname, MaxMarks, deptID)
Here, deptID indicates the department that offers the course.
admission (regID, courseID, semester, year, marksObt)
Here, semester can be either “odd” or “even”, indicating the two semesters of an academic
year. The value of marksObt will be NULL for the current semester and non-NULL for
past semesters.
teaching (empID, courseID, semester, year, classRoom)
4.15.1 Queries on Relation Scheme
Retrieve the list of female PhD students
σ degree = ‘PhD’ ∧ gender = ‘F’ (student)
Obtain the name and regID of all male BA students
π regID, name (σ degree = ‘BA’ ∧ gender = ‘M’ (student))
Obtain the regID of students who never got more than 800 marks
π regID (student) – π regID (σ marksObt > 800 (admission))
Obtain the department IDs for departments with no lady teacher
π deptID (department) – π deptID (σ gender = ‘F’ (teacher))
Obtain the regID of male students who have obtained at least 800 marks
π regID (σ gender = ‘M’ (student)) ∩ π regID (σ marksObt >= 800 (admission))
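The query for departments with no lady teacher shows why a difference is needed rather than a simple restrict. The Python sketch below uses hypothetical rows and keeps only the attributes of the department and teacher schemes that the query touches.

```python
# Hypothetical rows; attributes not used by the query are omitted.
department = [{"deptID": "D1"}, {"deptID": "D2"}]
teacher = [
    {"empID": 1, "gender": "F", "deptID": "D1"},
    {"empID": 2, "gender": "M", "deptID": "D1"},
    {"empID": 3, "gender": "M", "deptID": "D2"},
]

# π deptID (department)
all_depts = {d["deptID"] for d in department}

# π deptID (σ gender = 'F' (teacher))
with_female = {t["deptID"] for t in teacher if t["gender"] == "F"}

# Difference: departments that have no female teacher at all
no_female = all_depts - with_female

# A tempting but wrong alternative: departments having at least one non-female teacher
naive = {t["deptID"] for t in teacher if t["gender"] != "F"}

print(no_female, naive)
```

The naive restrict wrongly reports D1, which employs both a female and a male teacher. Subtracting the departments that do have a female teacher from the full list gives the intended answer.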

REVIEW QUESTION

Short Answer type Questions


1. Every relation in a database must have a unique name (True / False)
2. Every column in a relation must have a unique name within the relation (True / False)
3. A primary key is used to uniquely identify a record in a table. (True / False)
4. Table-level integrity is also known as entity integrity. (True / False)
5. Referential integrity supports cascade operations. (True / False)
6. The tables used to compose a view are called……………..
7. RDBMS stands for……………..
8. CCK stands for……………..
9. CAK stands for……………..
10. …………….. operator can be considered a ‘horizontal slicer’.
11. ……………..operator is a ‘vertical slicer’
12. Examples of an entity are
a. A customer
c. A customer order
d. An employee’s paycheck
e. A customer’s name
13. Examples of an attribute are
a. An employee
b. An employee’s name
d. An alphabetical listing of employees
e. An employee’s birth date
Long Answer type Questions
1. What does a null represent?
2. What is a null’s major disadvantage?
3. What is a view?
4. What are the three types of relationships that can exist between a pair of tables?
5. What are the three ways in which you can characterize a relationship?
6. What is a field specification?
7. What is data integrity?
8. Name the four types of data integrity.
9. Explain data, entity, domain and referential integrity.
10. What are the differences between candidate and primary key?
11. Write short note on NULL constraints.
12. What are the differences between a table and a view?
13. Explain various relational algebra operators with suitable examples.
14. Explain key constraints.
15. What are foreign keys (FK)?

Chapter 5
NORMALIZATION

5.1 NON-LOSS DECOMPOSITION


Non-loss decomposition is a process in which relations/tables are divided in such a manner
that they can be recombined without loss of information. The relational model permits us
to join relations/tables in various ways by linking attributes. The process of getting a fully
normalized data model includes removing redundancy. Relations are divided in such a
way that the resultant relations can be recombined without losing any of the information.
This is the principle of lossless decomposition.
5.2 FUNCTIONAL DEPENDENCY
Functional dependency is a relationship between two attributes such that the value of one
attribute finds the value in the other. Functional dependency is extremely useful tool for
thinking about data structures. Given any tuple T, with two sets of attributes {X1……….…
Xn} and {Y1…………Yn} , then set Y is functionally dependent on set X if, for any legal
value of X, there is only one legal value for Y.
The functional dependency between sets of attributes can be indicated as shown in figure.
Functional dependencies can be expressed as X (ProductID) Y(Description), which
reads “X functionally determines Y.”
Example 1:
ProductID Description
Example 2: In a college personnel database, Staff_No functionally determines Staff_Name.
Staff_No is the determinant and Staff_Name is the dependent
data-item. For every Staff_No there is only one associated value of Staff_Name.
For example, 135 may be associated with the value G. P. Singh. This does not mean that
there cannot be more than one member of staff named G. P. Singh in the college. It simply
means that each G. P. Singh will have a different Staff_No. There is a functional
dependency from Staff_No to Staff_Name, but the same is not true in the opposite
direction. Staff_No will probably also functionally determine Department_Name: for every
member of staff there is only one associated department.
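Whether a functional dependency holds in a given set of rows can be tested mechanically. The sketch below is illustrative only: the helper name holds and all sample rows other than staff number 135 are assumptions made for the example.

```python
def holds(rows, lhs, rhs):
    """True if lhs -> rhs holds in rows: equal lhs values imply equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical staff rows; two different people share the same name.
staff = [
    {"Staff_No": 135, "Staff_Name": "G. P. Singh", "Department_Name": "Physics"},
    {"Staff_No": 136, "Staff_Name": "G. P. Singh", "Department_Name": "Chemistry"},
    {"Staff_No": 137, "Staff_Name": "R. Kaur", "Department_Name": "Physics"},
]

no_to_name = holds(staff, ["Staff_No"], ["Staff_Name"])   # Staff_No -> Staff_Name
name_to_no = holds(staff, ["Staff_Name"], ["Staff_No"])   # Staff_Name -> Staff_No
```

Sample data can only refute a dependency; that a dependency holds for all legal data is a statement about the meaning of the attributes, not about one table instance.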
5.2.1 Transitive dependence
Z is transitively dependent on X when X determines Y and Y determines Z.
Transitive dependence thus describes that Z is indirectly dependent on X through its
relationship with Y.
Here is the transitivity rule restated:
Given X → Y
Given Y → Z
Then X → Z
5.2.2 The Reflexive Rule
If X is a composite, composed of A and B, then X → A and X → B.
Example:
X = Name, City. Then we are saying that X → Name and X → City.
The rule, which seems quite obvious, says: if I give you the combination <Sachin, Amritsar>, what
is this person’s Name? What is this person’s City?
5.2.3 The Augmentation Rule
If X → Y, then XZ → Y. You might call this rule, “more information is not really needed,
but it doesn’t hurt.”
5.2.4 The Decomposition Rule
The decomposition rule says that if it is given that X → YZ (that is, X defines both
Y and Z), then X → Y and X → Z.
5.2.5 The Union Rule
The union rule is the reverse of the decomposition rule in that if X → Y and X → Z,
then X → YZ.
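These rules can be applied mechanically to find everything a set of attributes determines, usually called the attribute closure. The following is a minimal sketch; the function name closure and the representation of dependencies as pairs of Python sets are choices made for this example.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies in fds.

    fds is a list of (lhs, rhs) pairs of sets, each meaning lhs -> rhs.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we also get the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The transitivity example: X -> Y and Y -> Z.
fds = [({"X"}, {"Y"}), ({"Y"}, {"Z"})]

from_x = closure({"X"}, fds)        # X determines X, Y and Z
from_y = closure({"Y"}, fds)        # Y determines Y and Z, but not X
from_xw = closure({"X", "W"}, fds)  # augmentation: extra attributes do not hurt
```

X → Z holds exactly when Z appears in the closure of X, which is how the transitivity, augmentation, decomposition and union rules show up in the result.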
5.2.6 Full functional dependence
Y is fully functionally dependent on X when X determines Y and no proper subset of X
determines Y. Y depends on the whole of X and on nothing less. The question arises only
when the determinant X is a composite key, that is, a key containing more than one field
(say A with B): if A alone determines Y, then Y is only partially dependent on X and there
is not full functional dependence.
5.2.7 Multiple valued dependency
Not all dependencies can be modeled in terms of functions. Y is said to be non-functionally
dependent on data-item X if for every value of data-item X there is a
delimited set of values for data-item Y. This is also known as a non-functional
dependency.
A commonly used example of a multi-valued dependency is a field containing a
comma-delimited list or a collection of some kind. A collection could be an array of values
of the same type. Those multiple values are dependent as a whole on the primary key, where
“as a whole” means the entire collection in the comma-delimited list. A trivial multi-valued
dependency occurs between two fields when they are the only two fields in the table.
5.3 THE CONCEPT OF ANOMALIES
The aim of relational database design is to prevent anomalies from occurring in a
database. Anomalies can occur during changes to a database, and an anomaly can leave data
logically corrupted. An anomaly with respect to relational database design is
essentially an erroneous change to data, more specifically to a single record. There are
three main types of anomaly that can corrupt data:
5.3.1 Insert anomaly
An insert anomaly occurs when a record is inserted into a detail table with no related record
existing in the master table. Adding a new subject to the detail table (see following figure)
requires that the teacher first be added to the master table, assuming that the teacher
does not already exist.

5.3.2 Delete anomaly


The delete anomaly is just the opposite of the insert anomaly. A delete anomaly occurs when
a record is deleted from a master table without first deleting all child records in a detail
table. In a cascade deletion, deleting a master record automatically deletes all child
records in all related detail tables before deleting the parent record in the master table. For
example, referring to the following figure, deleting a Teacher requires the prior deletion of any
Subject that the teacher might have taught. If a Teacher were deleted from the master table and
the subjects were left in the detail table without a corresponding parent teacher, the
detail table records would become orphaned records. If you want to delete the
teacher Pawan Kumar from the master table, then you first have to delete all the records
associated with Pawan Kumar in the detail table.
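A DBMS can refuse both kinds of erroneous change when a foreign key is declared. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table layout (a Teacher master table and a Subject detail table) and the sample values other than the name Pawan Kumar are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE Teacher (TeacherID INTEGER PRIMARY KEY, TeacherName TEXT)")
conn.execute(
    "CREATE TABLE Subject ("
    " SubjectName TEXT PRIMARY KEY,"
    " TeacherID INTEGER NOT NULL REFERENCES Teacher(TeacherID))"
)
conn.execute("INSERT INTO Teacher VALUES (1, 'Pawan Kumar')")
conn.execute("INSERT INTO Subject VALUES ('DBMS', 1)")


def attempt(sql):
    """Run one statement and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# Insert anomaly: a subject for a teacher that does not exist in the master table.
insert_orphan = attempt("INSERT INTO Subject VALUES ('Java', 99)")
# Delete anomaly: removing the teacher while a subject still refers to him.
delete_parent = attempt("DELETE FROM Teacher WHERE TeacherID = 1")
# Deleting the detail records first makes the master deletion legal.
conn.execute("DELETE FROM Subject WHERE TeacherID = 1")
delete_after_children = attempt("DELETE FROM Teacher WHERE TeacherID = 1")
```

Here the first two statements are rejected and the last one succeeds. Declaring the foreign key with ON DELETE CASCADE would instead give the cascade deletion described above.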
5.3.3 Update anomaly
Sometimes an update of a single data value requires multiple rows of data to be updated.
This situation is called an update anomaly. Take the example of an invoice: if a user
wanted to change a customer’s address, he would have to change it on every single
invoice for that customer, because the customer address is redundantly
stored in every invoice. Redundant data requires many copies of
the data to be updated; if a few of them are missed, the data becomes inconsistent. A skilled
database designer therefore captures an attribute once, stores it once, and uses that one copy everywhere.
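The difference between the two designs can be seen in a few lines of Python. This is only an illustrative sketch; the customer name, addresses and invoice numbers are made up.

```python
# Redundant design: the address is repeated on every invoice.
invoices = [
    {"InvoiceNo": 1, "Customer": "Sachin", "Address": "Amritsar"},
    {"InvoiceNo": 2, "Customer": "Sachin", "Address": "Amritsar"},
]
# The user changes the address on one invoice and misses the other.
invoices[0]["Address"] = "Jalandhar"
addresses = {row["Address"] for row in invoices if row["Customer"] == "Sachin"}

# Normalized design: the address is stored once and invoices refer to the customer.
customers = {"Sachin": {"Address": "Amritsar"}}
invoices_nf = [
    {"InvoiceNo": 1, "Customer": "Sachin"},
    {"InvoiceNo": 2, "Customer": "Sachin"},
]
customers["Sachin"]["Address"] = "Jalandhar"  # a single update
addresses_nf = {customers[row["Customer"]]["Address"] for row in invoices_nf}
```

In the redundant design the customer ends up with two different addresses, while the normalized design still shows exactly one.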
5.4 NORMALIZATION
Normalization is the process of decomposing and arranging the attributes in a schema,
which results in a set of tables with very simple structures. The main task of normalization
is to make tables as simple as possible. Normalization removes duplication of records and
minimizes redundant chunks of data, so the database becomes well organized and the physical
storage space of a computer system is used effectively. A database that has undergone
the process of normalization is one that has been divided into smaller tables to reduce
redundancies and aid in the management of the stored data. The definitions of each level
of normal form and the process required to arrive at each one are covered in the sections
that follow.
5.4.1 Benefits of normalization
Save disk storage space: The primary reason for using normalization is to reduce
redundancy in a database. Duplicate records occupy more space on a computer. Disk space
is expensive, and minimizing redundancy reduces disk storage requirements, which saves
an expensive hardware resource.
Easy to maintain: Updating redundant data can become a cumbersome task. For
example, if a person changes his mobile number, updating the number is easy
if it resides in only one place. When data exists in multiple locations, it can become
inconsistent if all copies are not updated together.
Lessen input/output (I/O) activity: Redundant data may require the manipulation of
large data blocks. Writing to disk and reading from disk (input/output operations) are the
slowest processes within a database. Reducing read/write operations can improve database
performance.
Querying and reporting: Querying or reporting on an un-normalized database
is quite difficult. For example, when first and last names are combined into a single field,
searching or sorting the name field by last name becomes cumbersome.
Security control: Normalization makes security easier because the DBA can
restrict access to each table to a limited number of users. For example, within a staff directory, all
staff members may be permitted to view the name, home address, city, date of birth and
phone number of other staff members, but not their pay, perks, leave and other
information.
5.5 DEFINITIONS OF NORMAL FORMS
There are six levels of normalization associated with six normal forms. A database’s level
of normalization is, therefore, referred to as its normal form. The normal form is a measure
of the level or depth to which a database has been normalized. The rules of
normalization are cumulative in nature. Each subsequent level depends upon the
normalization steps taken in the previous normal form. A database must first be in the first
normal form (1NF) before it can be normalized to the second normal form (2NF), as we
will see in the following definitions and examples. The rules for normalizing tables in a
database are explained as follows.
5.5.1 First Normal Form (1NF): Eliminating Repeating Data
Definition: A relation is in first normal form if and only if every non-key attribute is
functionally dependent upon the primary key.
First Normal Form (1NF) removes repeating groups such that all records in all tables can
be identified uniquely by a primary key in each table. All fields other than the primary key
must depend on the primary key. 1NF does the following:
1. Create a primary key to identify each row of the table.
2. Create a new table to move the repeating groups from the original table.
3. All fields other than the primary key must depend on the primary key, either directly or
indirectly.
4. All fields must contain a single value.
5. All values in each field must be of the same datatype.
The data-set shown in tabular form in the following figure is said to be an unnormalised data
set. This can be seen if the data-item SubjectName is chosen as the key of this data-set
and underlined to indicate this.

A given cell of the table for the attributes StudentID, StudentName and MarksObt has
multiple values. In the above table StudentID, StudentName and MarksObt all repeat with
respect to SubjectName. The attributes StudentID, StudentName and MarksObt are clearly
not functionally dependent on the primary key SubjectName. The attributes TeacherID and
TeacherName clearly are. This means that we have to form two tables: one for the
functionally dependent attributes, and one for the non-dependent attributes. SubjectName
and StudentID together are declared as the primary key of this second table, Results.
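The split into two tables can be sketched in Python as follows. The attribute names come from the text; the rows themselves, and the way the repeating group is held as a list inside each record, are assumptions made for this example because the original figure is not reproduced here.

```python
# Unnormalised data: each subject carries a repeating group of students.
unnormalised = [
    {"SubjectName": "DBMS", "TeacherID": "T1", "TeacherName": "Pawan Kumar",
     "Students": [("S1", "Amit", 78), ("S2", "Ravi", 65)]},
    {"SubjectName": "Java", "TeacherID": "T2", "TeacherName": "Neha Sharma",
     "Students": [("S1", "Amit", 81)]},
]

# Table 1: attributes that depend on SubjectName stay with it.
subject = [
    {key: row[key] for key in ("SubjectName", "TeacherID", "TeacherName")}
    for row in unnormalised
]

# Table 2 (Results): one row per student per subject, single values only.
results = [
    {"SubjectName": row["SubjectName"], "StudentID": student_id,
     "StudentName": student_name, "MarksObt": marks}
    for row in unnormalised
    for student_id, student_name, marks in row["Students"]
]

# (SubjectName, StudentID) identifies each row of Results.
result_keys = {(r["SubjectName"], r["StudentID"]) for r in results}
```

Every field now holds a single value, and the compound key of Results is unique across its rows.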

5.5.2 Second Normal Form (2NF): Eliminating Partial Dependencies
Definition: A relation is in second normal form if and only if it is in first normal form and
every non-key attribute is fully functionally dependent on the primary key.
In Second Normal Form (2NF) all non-key values must be fully functionally dependent on
the primary key. To move from first normal form to second normal form, remove part-key
dependencies. No partial dependencies are allowed in 2NF. A partial dependency exists
when a field is fully dependent on only a part of a composite primary key. 2NF does the
following:
1. Because normalization rules are cumulative, the table must already be in 1NF.
2. Each column in a table must depend on the whole key for that table. Non-key fields
that are not completely and individually dependent on the primary key are not allowed.
3. Partial dependencies must be removed. A partial dependency is a special type of functional
dependency that exists when a field is fully dependent on a part of a composite primary key.
4. Form a new table to separate the partially dependent part of the primary key and its
dependent fields.
2NF performs a similar task to that of 1NF, but forms a table where repeating values rather
than repeating fields are removed to a new table. The result is that a many-to-one relationship is
created between the original and the newly created tables. The newly created table gets a
primary key consisting of a single field.
Let us take the example of the table named Results. We have a two-part compound key,
SubjectName and StudentID. Both parts of the key are required to tell us the
MarksObt from the Results table. SubjectName has no influence on the StudentName;
StudentID alone determines StudentName. We break out the determinant and dependent
data-items into their own table. This leads to a decomposition of the tables as follows:
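A minimal Python sketch of this step is given below. The sample rows and the names student and results_2nf are assumptions made for illustration; only the attribute names are taken from the text.

```python
# Results in 1NF: StudentName depends on StudentID alone (a partial dependency).
results = [
    {"SubjectName": "DBMS", "StudentID": "S1", "StudentName": "Amit", "MarksObt": 78},
    {"SubjectName": "DBMS", "StudentID": "S2", "StudentName": "Ravi", "MarksObt": 65},
    {"SubjectName": "Java", "StudentID": "S1", "StudentName": "Amit", "MarksObt": 81},
]

# New table: the determinant StudentID with its dependent StudentName.
student = {row["StudentID"]: row["StudentName"] for row in results}

# Remaining table: MarksObt depends on the whole key (SubjectName, StudentID).
results_2nf = [
    {"SubjectName": row["SubjectName"], "StudentID": row["StudentID"],
     "MarksObt": row["MarksObt"]}
    for row in results
]

# Joining the two tables on StudentID rebuilds the original rows.
rebuilt = [dict(row, StudentName=student[row["StudentID"]]) for row in results_2nf]
```

The name Amit is now stored once instead of once per subject, and the original rows can still be rebuilt by joining on StudentID.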
5.5.3 Third Normal Form (3NF): Eliminating Transitive Dependencies
Definition: A relation is in third normal form if and only if it is in second normal form
and every non-key attribute is non-transitively dependent on the primary key.
Third Normal Form (3NF) eliminates transitive dependencies, in which a field is
indirectly determined by the primary key. An attribute that depends on another attribute
that is not the primary key of the relation is said to be transitively dependent. This is
because the field is functionally dependent on another field, which in turn is
dependent on the primary key.
3NF does the following:
1. The table must already be in 2NF.
2. Eliminate transitive dependencies (that is, all the non-key attributes depend only
on the primary key).
3. Any kind of derived data is allowed, such as total columns. Derived columns are
defined in terms of other columns, rather than in terms of specific attributes.
4. Form a new table to contain any separated fields.
To transform a second normal form relation into third normal form, move
transitively dependent attributes to relations where they depend only on the primary key.
To move from second normal form to third normal form, inter-data dependencies have to
be removed. Every table is examined: is the value of data-item A dependent on the
value of data-item B, or vice versa? If so, the relevant data-items are split off into a
separate table. In the previous example the Subject data is divided into Subject and
TeachingStaff tables. Here, TeacherID determines TeacherName, so TeacherName is
transitively dependent on SubjectName. TeacherID is therefore a natural primary key.
A separate table called TeachingStaff is formed, with TeacherID as the primary key.
This is demonstrated below:
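The same split can be sketched with SQL run through Python's built-in sqlite3 module. The table name SubjectOld and the sample rows are assumptions made for this example; note also that CREATE TABLE ... AS SELECT copies data only, so in a real schema the primary keys would be declared explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SubjectOld ("
    " SubjectName TEXT PRIMARY KEY, TeacherID TEXT, TeacherName TEXT)"
)
conn.executemany(
    "INSERT INTO SubjectOld VALUES (?, ?, ?)",
    [("DBMS", "T1", "Pawan Kumar"),
     ("Java", "T2", "Neha Sharma"),
     ("Networks", "T1", "Pawan Kumar")],
)

# TeacherName depends on TeacherID, so it moves to its own table.
conn.execute(
    "CREATE TABLE TeachingStaff AS "
    "SELECT DISTINCT TeacherID, TeacherName FROM SubjectOld"
)
# Subject keeps only the attributes that depend directly on SubjectName.
conn.execute("CREATE TABLE Subject AS SELECT SubjectName, TeacherID FROM SubjectOld")

staff_rows = conn.execute("SELECT COUNT(*) FROM TeachingStaff").fetchone()[0]
original = conn.execute("SELECT * FROM SubjectOld ORDER BY SubjectName").fetchall()
rebuilt = conn.execute(
    "SELECT s.SubjectName, s.TeacherID, t.TeacherName "
    "FROM Subject s JOIN TeachingStaff t ON t.TeacherID = s.TeacherID "
    "ORDER BY s.SubjectName"
).fetchall()
```

Each teacher's name is stored once in TeachingStaff, and joining the two tables on TeacherID returns the original rows.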
5.5.4 Boyce/Codd Normal Form
In Boyce/Codd Normal Form (BCNF) every determinant in a table is a candidate key. A
field on which the value of other fields depends is called a determinant. BCNF is a stronger
normal form than third normal form. It is designed to cover those anomalies that may arise
when there is more than one candidate key in some set of data requirements. Boyce/Codd
normal form states, essentially, that there must be no functional dependencies between
candidate keys.
BCNF is often considered an extension or variation of 3NF because it addresses
situations where multiple, overlapping candidate keys exist. The following conditions
must hold true:
1. The table must be in 3NF.
2. All the candidate keys are composite keys made up of more than one column.
3. The relation must have two or more candidate keys.
4. The candidate keys overlap, that is, each has at least one column in
common with another candidate key.
Let us take as an example the relation shown in the following figure. The relation is in third
normal form (assuming supplier names are unique), but it still has some redundant data.

The two candidate keys in this example are (SupplierID, ProductID) and (SupplierName,
ProductID), and the functional dependency is illustrated in the following figure.
So there is a functional dependency {SupplierID} → {SupplierName}, which is in
violation of BCNF because SupplierID on its own is not a candidate key. A correct BCNF
model is shown in the following figures.
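The BCNF test, that every determinant is a candidate key (or contains one), can be sketched in Python. The attribute Quantity and the exact list of dependencies are assumptions added for illustration, since the original figure is not reproduced here; the function names are also hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """List the non-trivial dependencies whose determinant does not determine the whole relation."""
    return [
        (sorted(lhs), sorted(rhs))
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)
    ]


# Original relation; Quantity is a hypothetical non-key attribute.
attrs = {"SupplierID", "SupplierName", "ProductID", "Quantity"}
fds = [
    ({"SupplierID"}, {"SupplierName"}),
    ({"SupplierName"}, {"SupplierID"}),
    ({"SupplierID", "ProductID"}, {"Quantity"}),
]
before = bcnf_violations(attrs, fds)

# After decomposition: Supplier(SupplierID, SupplierName) and
# SupplierProduct(SupplierID, ProductID, Quantity).
after_supplier = bcnf_violations({"SupplierID", "SupplierName"}, fds[:2])
after_product = bcnf_violations({"SupplierID", "ProductID", "Quantity"}, fds[2:])
```

The original relation reports the two supplier dependencies as violations, while each of the decomposed relations reports none.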

5.5.5 Fourth Normal Form


Fourth Normal Form (4NF) removes multiple sets of multivalued dependencies.
4NF applies to one-to-many and many-to-many relationships
and states that independent entities cannot be stored in the same table within those
relationships. Multiple sets of data that are not directly related are divided and
separated into independent tables. Independent repeating groups should not be
combined in a single relation. A table is therefore in 4NF when:
1. It is in 3NF, or in BCNF (which implies 3NF).
2. It does not contain more than one multi-valued dependency.
3. Multi-valued dependencies have been transformed into functional dependencies. This
implies that one value and not multiple values are dependent on a primary key.
Let us discuss an example where an organization has completely unnormalized data, as
shown in the following table.

In the above figure, the multi-valued dependency is {ProductName} →→ {Size} | {Supplier},
which means “ProductName multi-determines Size and Supplier.” According to 4NF,
multi-valued dependencies must be divided into separate relations, as shown in the following
figures. A relation is in fourth normal form if it is in BCNF and all its multi-valued
dependencies are also functional dependencies out of the candidate keys. A multiple
valued set is a field containing a comma-delimited list or a collection of some kind. A
collection could be an array of values of the same type.
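The effect of splitting the two independent multi-valued facts can be sketched in Python. The product, sizes and supplier names below are made up for illustration.

```python
# One table holding two independent facts about a product:
# every size has to be paired with every supplier.
rows = {
    ("Shirt", size, supplier)
    for size in ("S", "M", "L")
    for supplier in ("Alpha Traders", "Beta Textiles")
}

# 4NF: one table per multi-valued dependency.
product_size = {(product, size) for product, size, _ in rows}
product_supplier = {(product, supplier) for product, _, supplier in rows}

# Joining on ProductName gives back the original table.
rejoined = {
    (product, size, supplier)
    for product, size in product_size
    for other_product, supplier in product_supplier
    if product == other_product
}
```

The single table needs six rows to say what the two separate tables say in five, and the gap grows as sizes or suppliers are added.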
5.5.6 Fifth Normal Form
Fifth Normal Form (5NF) removes cyclic dependencies. 5NF is also known as
Project-Join Normal Form (PJNF). 5NF deals with join dependencies. A join dependency
expresses the cyclical constraint “if Entity A is connected to Entity B, and Entity B is
connected to Entity C, and Entity C is connected back to Entity A, then all three entities must
necessarily coexist in the same tuple.” In other words, a fourth normal form table is in
fifth normal form if it cannot be non-loss decomposed into a series of smaller tables.
The last stage of normalization is Fifth Normal Form (5NF). 5NF is
achieved when:
1. The table is in 4NF.
2. The table cannot be broken down into any smaller tables with different keys
from which the original table can be reconstructed without any loss of data.
The following tables show data that does not conform to 5NF. Decomposition is required to
break the relation into three distinct relations.
The following figures show the decomposition of a table with a single three-field composite
primary key (ProjectDescription, Employee, Manager) into three semi-related tables, each
with a two-field composite primary key.
The following figure shows the actual data structures that reflect the 5NF structure.
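A small Python sketch shows why three tables are needed. The four sample rows are invented for illustration (the book's figures are not reproduced here) and are chosen so that the cyclical constraint holds.

```python
# Rows of (ProjectDescription, Employee, Manager); hypothetical values.
rows = {
    ("Payroll", "Asha", "Mohan"),
    ("Payroll", "Bina", "Lata"),
    ("Billing", "Asha", "Lata"),
    ("Payroll", "Asha", "Lata"),
}

# The three two-field tables.
project_employee = {(p, e) for p, e, _ in rows}
employee_manager = {(e, m) for _, e, m in rows}
project_manager = {(p, m) for p, _, m in rows}

# Joining only two of the tables creates a row that was never in the original.
two_way = {
    (p, e, m)
    for p, e in project_employee
    for e2, m in employee_manager
    if e == e2
}
spurious = two_way - rows

# Joining all three tables removes it again.
three_way = {row for row in two_way if (row[0], row[2]) in project_manager}
```

The table cannot be split into two tables without loss, but it can be split into three, which is the situation that 5NF addresses.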
REVIEW QUESTION

Short Answer type questions


1. BCNF stands for……………
2. 5NF stands for…………………
3. A 3NF relation should be in……………
4. A relation is in first normal form if the domains on which its attributes are defined are
scalar. (T/F)
5. Non-loss decomposition is the ability to divide relations in such a manner that they can
be recombined without loss of information. (T/F)
6. Functional dependency is a relationship between two attributes such that the value of
one attribute determines the value in the other. (T/F)
7. An integrity constraint is a data integrity rule. (T/F)
8. A join dependency is not a cyclical relationship between three relations. (T/F)
Long answer type questions
1. What is normalization? Describe its benefits and potential hazards.
2. What do you mean by non-loss decomposition? Explain with an example.
3. What is functional dependency?
4. What is the difference between transitive dependence and multiple valued dependency?
5. Explain the term anomalies and its types.
6. Explain the different types of normalization.
7. Explain the process of normalization.

Chapter 6
DATABASE ACCESS AND SECURITY

6.1 Database security


Security has become an essential consideration in modern systems. The foundation of any
security system is user authentication and authorization. This is the process by which a
user is validated to ensure that he is allowed to perform the requested operation on the database. Some
DBMSs integrate with operating system security for this, others maintain their own user
and password lists, and still others integrate with external directory servers.
Enforcing security restrictions and implementing a security scheme are the responsibility
of the DBMS software. Each time the DBMS retrieves, inserts, deletes, or updates data, it
does so on behalf of some user. The DBMS permits or prohibits the action depending on
which user is making the request. Security is usually applied to tables and views, but other
objects such as forms, application programs, and entire databases can also be protected.
Most users will have permission to use certain database objects but will be prohibited
from using others. Privileges are the actions that a user is permitted to carry out for a
given database object.
6.1.1 Why Is Security Necessary?
Here are some reasons why security must be designed into your computer systems:
Databases accessed through the Internet, or through any network, are vulnerable to hackers
and other criminals who can damage or steal the data.
Individuals or competitors may be interested in whatever they can obtain that has
economic value.
Some attackers are emotionally unbalanced or simply malicious.
Employees of an organization sometimes commit fraud, so internal security for the
database is a must.
Honest mistakes by authorized users can cause security exposures, loss of data, and
processing errors.
Some hackers seek the notoriety that comes from penetrating an organization’s systems.
6.2 Access control
Access control is the ability to permit or deny the use of a particular resource by a
particular entity. Access control mechanisms can be used in managing physical resources.
Every organization prescribes the security policies and procedures that must be followed.
We explore these layers in the sections that follow.
Access control
Auditing
Authentication
Encryption
Physical Security
Network Security
System-Level Security
Auditing
The evaluation of a person, organization, process, project, system, or product is known as an
audit. Audits are performed to ascertain the validity and reliability of information. They also
provide an assessment of a system’s internal controls. The main purpose of an audit is to
give an opinion on the person, organization or system under evaluation, based on work
done on a test basis.
Authentication
Authentication involves confirming the identity of a user, establishing the origin of an artifact,
or assuring that a computer program is a trusted one.
Encryption
Encryption is a process in which information is transformed, using an algorithm, into
a form that is usable by the intended user and useless to others. The intended user
makes it readable again with the help of a key. The result of the process is encrypted
information (in cryptography, referred to as ciphertext). The word encryption also
implicitly refers to the reverse process, decryption, which makes the encrypted information
readable.
Physical Security
For physical security, database servers should be kept in a locked
room to which only authorized personnel have access. Depending on the sensitivity and
value of the data in the database, the following additional measures might be needed:
The organization should have a proper video surveillance system.
The DBA can use “token” security devices, for example cards or keys that must be inserted
into the server in order to gain access, together with a PIN code that must be entered to
obtain a password.
Administrators can use biometric devices, so that a user must pass a fingerprint or
retinal scan to obtain database access.
As a matter of policy, entry into the server room should be restricted.
As a matter of policy, the removal of any hardware, such as tapes and disks, should be
strictly prohibited.
Network Security
Physical security is not enough when the database server is accessible via a
network. Intruders can manage to obtain a network connection to the database server.
A holistic approach is required to provide network security; in addition, it must be
ensured that every computer system attached to that network is equally secure.
System-Level Security
After providing network security, the next area of focus is the system
that will run the DBMS. A poorly secured database server can create problems for an
organization. Following are some measures to secure the database server:
Operating system software should be minimal: Install only the software components that
are needed. While installing the operating system, use the “custom”
installation option to choose only the required components.
Operating system services should be minimal: Disable software and services that are not
required at start-up.
DBMS software installation should be minimal: Install the DBMS software with only the
required features, using the custom installation option. Fewer features mean less exposure
to problems such as buffer overflow vulnerabilities.
Security patches: To keep the system updated and secure, security alerts should be reviewed
as they are announced, and security patches must be applied in a timely manner. All default
passwords should also be changed.
6.3 Transaction
A transaction is a bundle of one or more SQL statements that together form a logical unit
of work. The SQL statements that form the transaction are typically closely related and
perform interdependent actions. Each statement in the transaction performs some part of a
task, but all of them are required to complete the task. Grouping the statements as a single
transaction tells the DBMS that the entire statement sequence should be executed
atomically: all of the statements must be completed for the database to be in a consistent
state.
We use the term transaction to indicate a meaningful atomic operation that may or may
not be composite. Successful termination is called commitment, and a successful
transaction is assumed to terminate with a commit operation. After a successful commit
operation, the changes that the transaction has made to the system state are guaranteed to
persist. This is the durability property of transactions. A transaction management system
must be crash-resilient in order to enforce the property of atomicity of transactions: either
all or none of the operations of a transaction are carried out. If a transaction has not been
committed, it cannot be assumed that all its operations are complete. When the system
restarts after a crash, it must be able to roll back (undo) the effects of any transactions that
were uncommitted at the time of the crash. This is called aborting a transaction. A
transaction is defined to end with a commit or an abort operation.
Transaction processing
We now use an example to study serializability. Each data object has an associated set of
operations, in this case:
create a new account;
delete an existing account;
read-balance takes an account name as argument and returns the balance of the
account;
check-balance takes an account name and a value as arguments and returns
true if the balance of the account is greater than or equal to the argument value,
else it returns false;
credit takes an account name and a value as arguments and adds the argument value to
the balance. Note that the value of the balance is not output to the client;
debit takes an account name and a value as arguments and subtracts the argument value
from the balance. Note that the value of the balance is not output to the client.
It is assumed here that the client is responsible for checking the balance before doing a
debit. For example, the transfer transaction would contain:
if check-balance (account-A, Rs1000) then debit (account-A, Rs1000) …
set-interest-rate (r%) is used to set the daily interest rate to a given percentage;
add-interest-to-balance is run daily by the system administration. This operation
computes the interest accrued to the account, based on its value at midnight, and adds the
interest to the balance.
In order for a set of actions to qualify as a transaction, it must pass the ACID test. ACID is
an acronym commonly used when referring to the four characteristics of a transaction:
Atomic: This refers to the all-or-nothing nature of a transaction. Either all operations in a
transaction are performed or none are performed. If some statements are executed and the
transaction fails at any point before it is completed, the results of these executions are
rolled back. Only when all statements have executed properly are the results of the
transaction applied to the database.
Consistent: The database must be consistent at the beginning and at the end of the
transaction. A transaction is a set of actions that moves the database from one consistent
state to another. All rules that define and constrain the data must be applied to that data as
a result of any changes that occur during the transaction. All structures of the database must
be correct at the end of the transaction.
Isolated: Data might temporarily be in an inconsistent state during a transaction. It should not
be made available to other transactions until the data is once again consistent. No user should be
able to access inconsistent data during a transaction. For a transaction to be isolated, no
other transaction can affect that transaction.
Durable: Committed changes must be preserved, and the data should remain in a consistent state
and reliable, even if hardware or application errors occur.
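Atomicity can be observed with Python's built-in sqlite3 module. The sketch below is a simplified illustration, not the book's banking system: the table layout and opening balances are assumptions, and a CHECK constraint stands in for the client's check-balance call. The credit is deliberately issued before the debit so that a failed debit shows the earlier credit being rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("account-A", 1500), ("account-B", 200)],
)
conn.commit()


def transfer(conn, source, target, amount):
    """Credit target and debit source as one transaction; True if committed."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the debit would have made the balance negative


def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))


first = transfer(conn, "account-A", "account-B", 1000)   # enough money: commits
second = transfer(conn, "account-A", "account-B", 1000)  # not enough: rolled back
final = balances(conn)
```

After the second call the balances are exactly what the first transfer left behind, so the credit that had already been applied inside the failed transaction did not survive.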
6.4 Locking Mechanisms
A lock is a control placed in the database to protect data so that only one database session
may change it; a lock may therefore deny access to other database sessions. When data is
locked, no other database session can update the data until the lock is released, which is
usually done with a COMMIT or ROLLBACK SQL statement.
Typical lock levels are as follows:
Database: The entire database can be locked so that only one database session may apply
updates. It is very useful for maintenance of the database, such as upgrading to a new version of
the database software. In Oracle, the database can be opened in exclusive mode, which restricts the
database to only one user session.
File: An entire database file can be locked. A file can hold part of a table, an entire table,
or parts of many tables. This level is less favored in modern databases.
Table: An entire table can be locked. This is useful when a table-wide change is
performed, such as reloading the data in the table, updating every row, or altering the table
to add or remove columns.
Block: A block within a database file can be locked. A block is the smallest unit of data
that the operating system can read from or write to a file. The block size is also called the
sector size. Some operating systems use pages instead of blocks. A page is a virtual block
of fixed size, typically 2K or 4K.
Row: A row in a table can be locked. This is the most common locking level. Most modern
RDBMSs support row-level locking.
Column: Some columns within a row in the table can be locked. This is not very practical
because of the resources required to place and release locks at column level of granularity.
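The behaviour of row-level locks can be modelled with a small lock table. The class below is a toy sketch written for this explanation, not the mechanism of any particular DBMS; the class name, method names and session names are all hypothetical, and real systems also distinguish shared from exclusive locks and make waiting sessions queue.

```python
class LockManager:
    """Toy row-level lock table: one session at a time may hold a given row."""

    def __init__(self):
        self._owner = {}  # (table, row_id) -> session holding the lock

    def acquire(self, session, table, row_id):
        """Try to lock a row; True if this session now holds (or already held) it."""
        holder = self._owner.setdefault((table, row_id), session)
        return holder == session

    def release_all(self, session):
        """Release every lock of a session, as happens on COMMIT or ROLLBACK."""
        for key in [k for k, s in self._owner.items() if s == session]:
            del self._owner[key]


locks = LockManager()
s1_row1 = locks.acquire("session-1", "STUDENT", 1)        # granted
s2_row1 = locks.acquire("session-2", "STUDENT", 1)        # denied: row 1 is locked
s2_row2 = locks.acquire("session-2", "STUDENT", 2)        # granted: a different row
locks.release_all("session-1")                             # session-1 commits
s2_row1_retry = locks.acquire("session-2", "STUDENT", 1)  # granted now
```

Because the lock is held per row, the second session is blocked only on the row that the first session is changing, which is why row-level locking allows more concurrency than table-level locking.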
6.5 The two-phase commit protocol
Each object is assumed to have lock and unlock operations. We should consider how the
two phases, of acquiring and releasing locks, can be implemented. In a centralized system
the transaction manager knows when locks on all the objects of a transaction have been
acquired and the operations done. The unlock operation can then be invoked on all the
objects.
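When several nodes take part in a transaction, the decision to commit is reached by voting, in the two phases listed below. The following is a minimal Python sketch of that exchange; the class and function names are hypothetical, and the sketch ignores message loss, timeouts and crashes, which a real commit manager has to handle.

```python
class Participant:
    """A node taking part in a distributed transaction."""

    def __init__(self, name, can_commit):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def vote(self):
        """Phase 1: answer the commit manager's request with a vote."""
        self.state = "prepared" if self.can_commit else "aborted"
        return "commit" if self.can_commit else "abort"

    def finish(self, decision):
        """Phase 2: carry out the decision made by the commit manager."""
        self.state = "committed" if decision == "commit" else "aborted"


def two_phase_commit(participants):
    """Collect the votes, decide, and propagate the decision to every node."""
    votes = [p.vote() for p in participants]                               # phase 1
    decision = "commit" if all(v == "commit" for v in votes) else "abort"
    for p in participants:                                                  # phase 2
        p.finish(decision)
    return decision


all_ready = [Participant("node-1", True), Participant("node-2", True)]
one_fails = [Participant("node-1", True), Participant("node-2", False)]

outcome_ready = two_phase_commit(all_ready)
outcome_fail = two_phase_commit(one_fails)
```

A single abort vote is enough to abort the transaction at every node, which keeps the all-or-nothing property across the whole system.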
Phase 1: the commit manager requests and assembles the ‘votes’ for commit or abort of
the transaction from each participating node;
Phase 2: the commit manager decides to commit or abort, on the basis of the votes, and
propagates the decision to the participating nodes.
6.6 Grant and revoke
Some RDBMSs provide security features to protect data stored in databases from
unauthorized viewing and damage. Appropriate rights are assigned to the users. These
rights are also called privileges. Objects such as tables, views and sequences that are
created by users, designers or the DBA are controlled by the person who created them. If any
user wants to access objects belonging to another user, he/she has to obtain permission for such
access. Giving permission to other users is called granting of privileges. Privileges fall
into two broad categories:
System privileges permit the grantee to perform a general database function, like creating
new user accounts or connecting to the database.
Object privileges permit the grantee to perform specific actions on specific objects, such
as selecting, deleting or updating data in the STUDENT table.
Sometimes we assign the same privileges to a number of users. A group of privilege
definitions can be stored as a single named object called a role.
In Oracle, the data control statements include the GRANT and REVOKE commands.
Syntax of GRANT command:
GRANT < Object privileges >
ON <objectname>
TO < username>
[ WITH GRANT OPTION ];
There are a number of privileges, and we can grant all of them or only specific ones.
These privileges are listed below:
ALTER: Allows a user to change the table definition or structure with the
ALTER TABLE command.
INSERT: Allows a user to add new records to the table.
SELECT: Allows a user to see the data in the table(s).
UPDATE: Allows a user to modify the records in the table.
DELETE: Allows a user to remove records from the table.
GRANT SELECT, INSERT ON STUDENT TO AJIT;
Or
GRANT ALL ON STUDENT TO AJIT;
The first statement grants the SELECT and INSERT privileges on the STUDENT table to user
AJIT. The second statement gives all data manipulation permissions on the STUDENT table
to user AJIT.
REVOKE Statement
The REVOKE statement is used to withdraw privileges that were granted to a user on an object.
Syntax of REVOKE command:
REVOKE < Object privileges >
ON <objectname>
FROM < username>;
REVOKE SELECT, INSERT, UPDATE ON STUDENT FROM AJIT;
The SELECT, INSERT and UPDATE privileges on table STUDENT are taken back from user AJIT.
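The effect of GRANT and REVOKE can be pictured as maintaining a set of privileges for each user and object. The following Python sketch is a toy model under that assumption, not how Oracle stores privileges; the names grant, revoke and is_allowed are hypothetical.

```python
from collections import defaultdict

# (user, object) -> set of privileges; a toy stand-in for the data dictionary.
privileges = defaultdict(set)

def grant(user, obj, *privs):
    """Model of: GRANT privs ON obj TO user."""
    privileges[(user, obj)].update(privs)

def revoke(user, obj, *privs):
    """Model of: REVOKE privs ON obj FROM user."""
    privileges[(user, obj)].difference_update(privs)

def is_allowed(user, obj, priv):
    return priv in privileges.get((user, obj), set())

grant("AJIT", "STUDENT", "SELECT", "INSERT")
revoke("AJIT", "STUDENT", "INSERT")
can_select = is_allowed("AJIT", "STUDENT", "SELECT")
can_insert = is_allowed("AJIT", "STUDENT", "INSERT")
print(can_select, can_insert)
```

After the REVOKE, user AJIT can still read the STUDENT table but can no longer insert into it.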
Short answer type Questions
1. Authentication involves confirming the identity of a user. (True/False)
2. A transaction is a bundle of one or more SQL statements that together form a logical
unit of work. (True/False)
3. Atomic refers to the all-or-nothing nature of a transaction. (True/False)
4. An entire table cannot be locked. (True/False)
5. GRANT and REVOKE commands are the same. (True/False)
6. Enforcing security restrictions and implementing a security scheme are the
responsibility of the ……….. software.
7. Privileges are given to users in SQL using the ……….. statement.
8. Privileges can be withdrawn using the ……….. statement.
Long answer type Question
1. What do you mean by database security?
2. What is Access control?
3. Write a short note on Transaction Processing.
4. Explain types of Locking.
5. What is the function of the GRANT and REVOKE statements?

Chapter 7
SQL USING ORACLE
7.1 Introduction to SQL
SQL, pronounced “sequel,” is an acronym for Structured Query Language. A standards
body called ANSI, the American National Standards Institute, maintains this language.
SQL is a powerful query language that was created as a means to communicate with
databases. Databases store data. SQL can be used to view, manipulate, and create this data.
It can even define the structures that will hold the data. Because SQL is a standards-
controlled language, it is reusable from database to database.
What are the advantages of SQL?
1. SQL is not a proprietary language used by specific database vendors. Almost
every major DBMS supports SQL, so learning this one language will enable you
to interact with just about every database you’ll run into.
2. SQL is easy to learn. The statements are all made up of descriptive English words,
and there aren’t that many of them.
3. Despite its apparent simplicity, SQL is actually a very powerful language, and
by cleverly using its language elements you can perform very complex and
sophisticated database operations.
7.2 SQL Tools
Oracle has provided a user-friendly interactive tool for running SQL since its first release.
The SQL*Plus tool today has four variations from which to choose:
SQL*Plus Command Line. Use this when you don’t have a Windows interface, such as
when using telnet to reach a remote UNIX database server.
SQL*Plus Windows. Use this in a Windows-capable environment (can be invoked using a
network name from a client or directly on the database server, regardless of the operating
system).
SQL*Plus Worksheet. This comes as part of Oracle Enterprise Manager, a Windows-like
user interface created to support the database administrator and simplify many tasks.
iSQL*Plus. This gives you the same interface as SQL*Plus Windows, except it runs in a
Web browser. Use this to run SQL commands and automatically generate a report in
HTML format.
7.3 Data Definition Languages
7.3.1 Creating and Modifying Table Structure
There are generally two ways to create database tables:
Most DBMSs come with an administration tool that can be used to create and manage
database tables interactively.
Tables may also be manipulated directly with SQL statements.
A table is a named schema object defined by a table definition in a CREATE TABLE
statement. Persistent base tables hold the SQL data that is stored in your database.
Syntax of Create Table:
CREATE TABLE tablename (
columnname1 datatype(size),
columnname2 datatype(size),
…………………………..
columnnameN datatype(size));
Column Data Types
Whenever you define a column in a CREATE TABLE statement, you must, at the very
least, provide a name for the column and an associated data type or domain. The data type
or domain (discussed in Chapter 4) restricts the values that can be entered into that
column. For example, some data types limit a column’s values to numbers, while other
data types allow any character to be entered.
Data Type: Description
CHARACTER: Specifies the exact number of characters that will be stored for each value.
For example, if we define the number of characters as 15, but the value contains only 10
characters, the remaining five characters will be spaces. The data type can be abbreviated
as CHAR. Example: STUDENT_NAME CHAR(40)
VARYING CHARACTER: Specifies the maximum number of characters that can be included in a
value. The number of characters stored is exactly the same number as the value entered, so
no spaces are added to the value. For example, if we define the number of characters as 25,
but the value contains only 15 characters, the space of the remaining ten characters is
saved. In Oracle the data type is written as VARCHAR2. Example: STUDENT_NAME VARCHAR2(40)
INTEGER: Only integers are accepted. No parameters are specified with this data type. It is
abbreviated as INT. Example: PRODUCT_ID INT
NUMBER: It is used to store numbers (fixed or floating point). It specifies the precision
and the scale of a numeric value; the precision may also be specified on its own. Numbers of
any magnitude may be stored with up to 38 digits of precision. If the precision is omitted,
values are stored with up to a maximum of 38 digits. Example: COST NUMBER(7,2)
DATE: This data type is used to represent date and time. It specifies the year, month, and
day value of a date. The date format is represented as DD-MON-YY, as in 27-MAY-99. The year
may also be four digits, supporting the values 0001 through 9999; the month is two digits
and supports the values 01 through 12; and the day is two digits and supports the values 01
through 31. Time specifies the hour, minute, and second values of a time. Example:
DATE_OF_BIRTH DATE
BOOLEAN: The Boolean data type is very simple and easy to apply. The data type holds only
three values: true, false, or unknown. A null value evaluates to unknown. In Boolean
comparisons, true is greater than false, and a comparison involving an unknown (null)
value will return an unknown result. Example: ALLOWED BOOLEAN
LONG: This data type is used to store variable-length character strings of up to 2 GB.
Example: Create a table Student with the following structure:
Column Name   Data Type   Size
RollNo        Number      4
Name          Character   20
DOB           Date
Address       Character   50
City          Character   15
Phone         Number      13
Sol:
CREATE TABLE Student
( Rollno number(4),
Name varchar2(20),
DOB date,
Address varchar2(50),
City char(15),
Phone number(13)
);
Example: Create a table Staff with the following structure:
Column Name   Data Type   Size
StaffID       Number      4
Name          Character   20
DOB           Date
Designation   Character   15
Department    Character   15
Address       Character   50
City          Character   15
Phone         Number      13
Sol:
CREATE TABLE Staff (
StaffID number(4),
Name varchar2(20),
DOB date,
Designation varchar2(15),
Department varchar2(15),
Address varchar2(50),
City char(15),
Phone number(13)
);
Example: Create a table Student_result with the following structure:
Column Name   Data Type   Size
RollNo        Number      4
Semester      Number      1
Marksobt      Number      4
Maxmarks      Number      4
Remarks       Character   15
Sol:
CREATE TABLE Student_result
(
Rollno number(4),
Semester number(1),
Marksobt number(4),
Maxmarks number(4),
Remarks char(15));
Creating table with Primary Key or NULL values
A constraint is an option that further defines a table or a column. It will either add more
information to or put certain restrictions on the table or column. The first constraint is
NULL or NOT NULL. This allows you to specify whether or not a column accepts NULL
values. If a column is defined as NOT NULL, some value has to be assigned to that
column or else the database will produce an error. There are also constraints available that
will define primary keys and foreign key relationships. Using these while creating tables
will enable your database to enforce referential integrity. Using the PRIMARY KEY
constraint after a column will designate that column, and that column alone, as the
primary key.
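The effect of the NOT NULL and PRIMARY KEY constraints can be tried out with a small script. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption made only so that the example runs anywhere); the error messages differ between the two systems, but both reject the offending rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Staff (
        StaffID INTEGER PRIMARY KEY,
        Name    VARCHAR(20) NOT NULL,
        City    CHAR(15)
    )""")
conn.execute("INSERT INTO Staff (StaffID, Name, City) "
             "VALUES (101, 'Manoj Kumar', 'Jalandhar')")

errors = []
bad_rows = [
    (102, None, "Ludhiana"),     # violates NOT NULL on Name
    (101, "Aman Kumar", None),   # violates PRIMARY KEY (duplicate StaffID)
]
for row in bad_rows:
    try:
        conn.execute("INSERT INTO Staff (StaffID, Name, City) VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        errors.append(str(exc))

row_count = conn.execute("SELECT COUNT(*) FROM Staff").fetchone()[0]
print(row_count, errors)
```

Both invalid rows are refused, so the table still contains only the first row.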
Example: Create a table Staff with Primary Key and NOT NULL constraints:
Sol:
CREATE TABLE Staff
(
StaffID number(4) PRIMARY KEY,
Name varchar2(20) NOT NULL,
DOB date NOT NULL,
Designation varchar2(15) NOT NULL,
Department varchar2(15) NOT NULL,
Address varchar2(50),
City char(15),
Phone char(13)
);
Example: Create a table product_info with Primary Key and NOT NULL constraints as
per the following structure:
Column Name        Data Type   Size   Constraint
Product_ID         Number      5      PRIMARY KEY
Product_Name       Character   20     NOT NULL
Date_of_purchase   Date
Qty_on_hand        Number      6
Cost_price         Number      7,2
Sell_price         Number      7,2
Reorder_limit      Number      6
Specification      Character   100
Sol:
CREATE TABLE product_info (
Product_ID number(5) PRIMARY KEY,
Product_Name varchar(20) NOT NULL,
Date_of_purchase date,
Qty_on_hand number(6),
Cost_price number(7,2),
Sell_price number(7,2),
Reorder_limit number(6),
Specification varchar(100)
);
Exercise: Insert Following data into product_info table

Exercise: Create Customer Table with following structure.


Column             Datatype
Customer_id        Number
Customer_name      Character
Customer_address   Character
Customer_city      Character
Customer_state     Character
Customer_phone     Character
Exercise: Create Order Table with following structure.
Column        Datatype
Order_no      Number
Order_date    Date
Customer_id   Number
Exercise: Create OrderItem Table with following structure.
Column       Datatype
Order_no     Number
Order_item   Number
Product_ID   Number
Quantity     Number
Sell_price   Item price
Modifying Tables Structure
Modifying the structure of a table is done with the ALTER TABLE command. There are
three basic things that can be modified: columns, column constraints, and table
constraints. The ALTER TABLE statement may vary from one DBMS to another. Making
alterations to existing tables is neither simple nor consistent. A user can add a new
column or modify an existing column. To change a table using ALTER TABLE, the following
information must be specified:
The name of the table to be altered, after the keywords ALTER TABLE. The table must
already exist.
The list of changes to be made.
Adding and Dropping New Column
Syntax:
ALTER TABLE tablename
ADD columnname data type( size);
Or
ALTER TABLE tablename
DROP COLUMN columnname;
Example: Add another column branch to table student
Sol: ALTER TABLE Student
ADD Branch varchar( 15);
Example: Delete a column Phone from table Staff
Sol: ALTER TABLE Staff
DROP COLUMN Phone;
Modifying Existing Column
Syntax:
ALTER TABLE tablename
MODIFY ( columnname newdatatype( new size));
Example: Modify table student on column branch
Sol: ALTER TABLE Student
MODIFY ( Branch varchar(20));
Deleting Tables
Deleting or dropping a table will delete almost everything that is connected to it. All of the
data in that table will be gone. The table structure, indexes, constraints, and privileges will
be lost.
Syntax:
DROP TABLE tablename;
Example: Delete table Student and all the data in it.
Sol:
DROP TABLE Student;
Renaming Tables
In Oracle RDBMS, the RENAME statement is used to change the name of a table. The basic
syntax for all rename operations requires that you specify the old name and a new name.
Syntax:
RENAME oldtablename TO newtablename;
Example: Change the name of the Student table to Student_info.
Sol:
RENAME Student TO Student_info;
Creating a Table from an Existing Table
The most common way to create a table is with the CREATE TABLE command.
However, some time it is required to create a table similar to the existing table and fill it
with similar data for temporary modification. Oracle RDBMS provides an alternative
method of creating tables, using the same format and data of an existing table.
Syntax:
CREATE TABLE NEW_TABLE(field1, field2, field3) AS
(SELECT field1, field2, field3 FROM OLD_TABLE <WHERE…>);
This syntax allows us to create a new table with the same data types as those of the fields
that are selected from the old table. The fields in the new table can be renamed by giving
them new names.
Example:
CREATE TABLE Staff_CSE(StaffID, Name, Designation) AS
(SELECT StaffID, Name, Designation FROM Staff WHERE Department = 'CSE');
This command selects the StaffID, Name and Designation fields of the rows of table Staff
that belong to the CSE department and stores the selected data in the newly created table
Staff_CSE.
Inserting Data into Tables
To add record(s) to a table INSERT statement is used. It can be used in several
ways:
insert a single complete row
insert a single partial row
insert the results of a query
INSERT statement needs a table and the values to be inserted into the new row.
If the column names are not specified in the INSERT INTO clause, then there must be
one value for each column in the table and the values must be in the same order as they
are defined in the table.
If the column names are specified in the INSERT INTO clause, then there must be exactly
one value per specified column and those values must be in the same order in which they
are defined in the INSERT INTO clause. However, the column names and values do not
have to be in the same order as the columns in the table definition.
Each value with a character string data type must be enclosed in single quotes. You may
use the keyword NULL (or null) as the data value in the VALUES clause to assign a null
value to any column that allows nulls.
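The rules above for complete rows, partial rows and column order can be checked with a short script. This is a sketch that assumes Python's built-in sqlite3 module as a stand-in for Oracle; the reduced four-column Staff table is a simplification for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (StaffID INTEGER, Name TEXT, City TEXT, Phone TEXT)")

# No column list: one value per column, in table-definition order.
conn.execute("INSERT INTO Staff VALUES (101, 'Manoj Kumar', 'Jalandhar', NULL)")

# Column list in a different order from the table definition (a partial row);
# the columns that are not listed receive NULL.
conn.execute("INSERT INTO Staff (Name, StaffID) VALUES ('Aman Kumar', 102)")

rows = conn.execute(
    "SELECT StaffID, Name, City, Phone FROM Staff ORDER BY StaffID"
).fetchall()
for row in rows:
    print(row)
```

Python shows SQL NULL values as None, so the second row prints with None for City and Phone.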
Syntax:
INSERT INTO tablename (columnname1, columnname2,……….. columnnameN)
VALUES (expression1, expression2,……………. expressionN);
Example: Insert the following data into the Staff table using SQL:
StaffID  Name         DOB        Designation   Department  Address            City       Phone
101      Manoj Kumar  23-Jun-66  HOD           CSE         H. No 23 JP Nag.   Jalandhar
102      Aman Kumar   12-May-72  Sr. Lecturer  ECE         H. No 56 SP Nag.   Ludhiana
103      Nigam        11-Jan-75  Lecturer      ECE         H.No 64 SP Nag.    Ludhiana
104      Ajay Kumar   10-Feb-77  Lecturer      CSE         H.No 69 SP Nag.    Ludhiana
105      Rohit        9-Jul-82   Lecturer      CSE         H.No 543 JS Nag    Amritsar
201      Sham         5-May-76   Lecturer      EE          H.No 558 JS Nag    Amritsar
234      Ram          8-Jun-87   Technician    ME          H.No 513 JS Nag    Amritsar
246      Arun         7-Jan-86   Technician    CSE         H. No 29 JP Nag.   Jalandhar
267      Sanjeev      7-Jan-88   Lab Astt.     ECE         H. No 65 JP Nag.   Jalandhar
Sol:
INSERT INTO Staff (StaffID, Name, DOB, Designation, Department, Address, City, Phone)
VALUES (101, 'Manoj Kumar', '23-Jun-66', 'HOD', 'CSE', 'H. No 23 JP Nag.', 'Jalandhar', '');
INSERT INTO Staff (StaffID, Name, DOB, Designation, Department, Address, City, Phone)
VALUES (102, 'Aman Kumar', '12-May-72', 'Sr. Lecturer', 'ECE', 'H. No 56 SP Nag.', 'Ludhiana', '');
INSERT INTO Staff (StaffID, Name, DOB, Designation, Department, Address, City, Phone)
VALUES (103, 'Nigam', '11-Jan-75', 'Lecturer', 'ECE', 'H.No 64 SP Nag.', 'Ludhiana', '');
INSERT INTO Staff (StaffID, Name, DOB, Designation, Department, Address, City, Phone)
VALUES (104, 'Ajay Kumar', '10-Feb-77', 'Lecturer', 'CSE', 'H.No 69 SP Nag.', 'Ludhiana', '');
INSERT INTO Staff (StaffID, Name, DOB, Designation, Department, Address, City, Phone)
VALUES (105, 'Rohit', '9-Jul-82', 'Lecturer', 'CSE', 'H.No 543 JS Nag', 'Amritsar', '');
INSERT INTO Staff (StaffID, Name, DOB, Designation, Department, Address, City, Phone)
VALUES (201, 'Sham', '5-May-76', 'Lecturer', 'EE', 'H.No 558 JS Nag', 'Amritsar', '');
INSERT INTO Staff (StaffID, Name, DOB, Designation, Department, Address, City, Phone)
VALUES (234, 'Ram', '8-Jun-87', 'Technician', 'ME', 'H.No 513 JS Nag', 'Amritsar', '');
INSERT INTO Staff (StaffID, Name, DOB, Designation, Department, Address, City, Phone)
VALUES (246, 'Arun', '7-Jan-86', 'Technician', 'CSE', 'H. No 29 JP Nag.', 'Jalandhar', '');
INSERT INTO Staff (StaffID, Name, DOB, Designation, Department, Address, City, Phone)
VALUES (267, 'Sanjeev', '7-Jan-88', 'Lab Astt.', 'ECE', 'H. No 65 JP Nag.', 'Jalandhar', '');
7.3.2 Retrieving Data From Tables
SELECT Statement
The SELECT statement is a specialized way to see the data in a database. Thus, a
SELECT statement is also called a query because it quite literally “queries” or asks
questions of a database. There are several uses for the SELECT statement:
Simple query: A SELECT statement can be used to retrieve data from a table or a group of
related tables. All columns or specific columns can be retrieved. Similarly, all rows or
specific rows can be retrieved.
Complex query: A SELECT statement can be embedded within another SELECT
statement. You can write a query within a query.
Create a view or table: A SELECT statement can be used to create a view or a new table.
A view is a stored query that is executed whenever another SELECT statement retrieves
data from the view by using the view in a query. Views are very useful to enforce security
by limiting the columns or rows that particular users are allowed to see.
Insert, update, or delete data: A SELECT statement can be used within the INSERT,
UPDATE, or DELETE statements to make these commands more flexible.
Retrieving Individual Columns
The required column name is specified right after the SELECT keyword, and the
FROM keyword tells the name of the table from which to get the data.
Syntax:
SELECT columnname FROM tablename;
Example: Show the data of the StaffID column from the Staff table.
SELECT StaffID FROM Staff;
Retrieving Multiple Columns
The same SELECT statement is used to get multiple columns from a table. But in this
case, multiple column names must be written after the SELECT keyword, and each
column must be separated by a comma.
Syntax:
SELECT columnname1, columnname2,………. columnnameN FROM tablename;
Retrieving All Columns
SELECT statements can also be used to see the data of all columns. This is done
using the asterisk (*) wildcard character, which means all columns in the table.
Syntax:
SELECT * FROM tablename;
Example: Show all records of the Staff table.
SELECT * FROM Staff;
Filtering Table Data
It is rare that all the records or data from a table will be required. Database tables
usually contain large amounts of data. Retrieving the data as per user requirement involves
specifying search criteria, known as a filter condition. Data can be filtered in the
following ways:
Select few rows and all columns of the table(s)
Select few columns and all rows
Select few rows and few columns
Use of WHERE Clause
Data can be filtered by mentioning search criteria in the WHERE clause. The WHERE
clause is written right after the table name. RDBMS shows only those records that meet
the specific condition.
Syntax:
SELECT column1,………… columnN FROM tablename
WHERE search condition;
Example: Retrieve Product_ID, Product_Name from table product_info where the
Cost_price is greater than 500.
Sol:
SELECT Product_ID, Product_Name FROM product_info WHERE Cost_price > 500;
WHERE Clause Operators
The following conditional operators can be used with the WHERE clause:
Operator   Description
=          Equal
!=         Not equal
<          Less than
<=         Less than or equal to
>          Greater than
>=         Greater than or equal to
<>         Not equal
BETWEEN    Between two specified values
IS NULL    Is a NULL value
Example: Demonstrate the use of the BETWEEN operator.
Sol:
SELECT Product_ID, Product_Name
FROM product_info
WHERE Cost_price BETWEEN 500 AND 2000;
Using the AND/OR Operator with WHERE Clause
Every predicate evaluates to true, false, or unknown. If more than one predicate is
included in the WHERE clause, they are joined together by the OR keyword or the AND
keyword. The OR operator instructs the database management system software to retrieve
rows that match either condition. If AND is used, then the predicates on both sides must
evaluate to true for the row to pass the filter.
Example: Retrieve data about the lecturers of the CSE department from the staff table.
Sol:
SELECT * FROM staff
WHERE Department = 'CSE' AND Designation = 'Lecturer';
Example: Retrieve data from the staff table where the designation is lecturer or the city
is Ludhiana.
Sol:
SELECT * FROM staff
WHERE City = 'Ludhiana' OR Designation = 'Lecturer';
IN Operator
It tests whether a data value matches one of a list of target values. IN takes a
comma-delimited list of valid values, all enclosed within parentheses.
Example: Demonstration of IN operator .
Sol:
SELECT Product_ID, Product_Name
FROM product_info
WHERE Cost_price IN ( 500 , 2000) ;
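BETWEEN and IN are easy to confuse when both are given the same two numbers. The sketch below contrasts them, using Python's built-in sqlite3 module as a stand-in for Oracle and a few made-up product rows (both are assumptions for illustration only).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product_info (Product_ID INTEGER, Cost_price REAL)")
conn.executemany(
    "INSERT INTO product_info VALUES (?, ?)",
    [(1, 300), (2, 500), (3, 1200), (4, 2000), (5, 2500)],  # hypothetical data
)

# BETWEEN selects the whole range, including both end values.
between_ids = [row[0] for row in conn.execute(
    "SELECT Product_ID FROM product_info "
    "WHERE Cost_price BETWEEN 500 AND 2000 ORDER BY Product_ID")]

# IN selects only rows whose value equals one of the listed values.
in_ids = [row[0] for row in conn.execute(
    "SELECT Product_ID FROM product_info "
    "WHERE Cost_price IN (500, 2000) ORDER BY Product_ID")]

print(between_ids, in_ids)
```

The product priced at 1200 is returned by BETWEEN but not by IN.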
Pattern Matching Test (LIKE)
A simple comparison test can be used to retrieve rows where the contents of a text column
match some particular text. The LIKE operator extends this test to patterns. The
percentage sign (%) and the underscore (_) characters are pattern-matching wildcard
characters. The percentage sign (%) represents zero or more characters in a string, while
the underscore (_) represents exactly one character.
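The meaning of the two wildcard characters can be made concrete by translating a LIKE pattern into a regular expression. This is a Python sketch of the matching rule only, assuming case-sensitive comparison and no escape character; the function name like is hypothetical.

```python
import re

def like(value, pattern):
    """Case-sensitive LIKE: '%' matches zero or more characters, '_' exactly one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None

names = ["Aman Kumar", "Ajay Kumar", "Arun", "Ram", "Sham"]
starts_with_a = [n for n in names if like(n, "A%")]   # 'A' followed by anything
second_is_a = [n for n in names if like(n, "_a%")]    # any one character, then 'a'
print(starts_with_a)
print(second_is_a)
```

The whole value must match the pattern, which is why a pattern without % or _ behaves like a plain equality test.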
Example: Find staff members with the letter “A” in the first position of their names.
Sol:
SELECT * FROM staff WHERE NAME LIKE 'A%';
GROUP BY Clause
The GROUP BY clause has a function very different from the WHERE clause. As the
name implies, the GROUP BY clause is used to group together types of information in
order to summarize related data. The GROUP BY clause can be included in a SELECT
statement whether or not the WHERE clause is used. The GROUP BY clause instructs the
DBMS to group the data and then perform the aggregate on each group rather than on the
entire result set.
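Example: Show the number of staff members in each department of the Staff table.
Sol:
SELECT Department, COUNT(*) FROM Staff
GROUP BY Department ;
Every column in the SELECT list must either appear in the GROUP BY clause or be used
inside an aggregate function.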
HAVING Clause
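The HAVING clause is similar to the WHERE clause in that it defines a search
condition. The difference is that WHERE filters individual rows before they are grouped,
whereas HAVING filters the groups produced by GROUP BY. HAVING supports all of WHERE’s
operators.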
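GROUP BY and HAVING are easiest to understand side by side. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption for the sake of a runnable example) and loads only the StaffID and Department values of the nine staff rows used earlier in this chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (StaffID INTEGER, Department TEXT)")
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?)",
    [(101, "CSE"), (102, "ECE"), (103, "ECE"), (104, "CSE"), (105, "CSE"),
     (201, "EE"), (234, "ME"), (246, "CSE"), (267, "ECE")],
)

# One result row per department, with the aggregate computed per group.
per_department = conn.execute(
    "SELECT Department, COUNT(*) FROM Staff "
    "GROUP BY Department ORDER BY Department").fetchall()

# HAVING keeps only the groups that satisfy the condition.
large_departments = conn.execute(
    "SELECT Department, COUNT(*) FROM Staff "
    "GROUP BY Department HAVING COUNT(*) > 1 ORDER BY Department").fetchall()

print(per_department)
print(large_departments)
```

The condition COUNT(*) > 1 could not be written in a WHERE clause, because the count is only known after the rows have been grouped.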
ORDER BY Clause
It is used to add sorting parameters, allowing rows to be rearranged in a specified order.
Sorting query results helps make your report more readable and useful. For example, it is
very useful to list people in alphabetical order by last name. The ORDER BY clause of the
SELECT statement provides sorting capability. The ORDER BY clause appears after the
WHERE clause. The ORDER BY clause is an optional clause, as is the WHERE clause.
Example: Show the data sorted by name.
SELECT * FROM Staff
ORDER BY Name ;
SELECT Clause Ordering
Clause     Description
SELECT     Columns or expressions to be returned
WHERE      Row-level filtering
GROUP BY   Group specification
HAVING     Group-level filtering
ORDER BY   Output sort order
7.3.3 Manipulation on Tables
Operators can be divided into several groups. Arithmetic operators allow things like 1 + 1
or 5 * 3, where + and * are the arithmetic operators. Logical operators allow merging of
multiple expressions.
Arithmetic Operators
Various arithmetic operators can be used while retrieving records from a table.
The following are the data manipulation operators:
+ Addition
- Subtraction
** Exponentiation
* Multiplication
/ Division
( ) Enclosed operation
Example: Show 110% of the values contained in the Cost_price column of the product_info
table.
Sol:
SELECT Product_ID, Product_Name, Date_of_purchase, 1.1*Cost_price FROM
product_info;
Using Functions
SQL has built-in functions available for your use, just like in application
development tools like Visual Basic, C etc. These functions allow you to extend your
productivity on the database server and not rely as much or at all on an application tool
to perform these tasks. This means you can get the results you want straight from the
server without having to write code in another language to manipulate that data to see
those results.
Aggregate Functions
Aggregate functions give you a single answer based on a set of data passed into
the function. These functions provide a particular statistic about the data set.
COUNT
A very commonly used aggregate function is COUNT. COUNT takes a set of data
and counts the number of items in the set. It returns a single value, the count.
For example:
SELECT COUNT(Product_Name) AS num
FROM product_info;
It is also used to count all the rows of the result set without specifying a column name by
using *. This is used quite often to determine how many rows are in a particular table.
SELECT COUNT(*) FROM product_info;
SUM
The SUM function is used to return the sum (total) of the values in a specific column.
However, * will not work with SUM, as it would not know what exactly you would like
to add together.
SELECT SUM(Cost_price)
FROM product_info;
MAX and MIN
MIN and MAX are short for minimum and maximum. Given a list of values,
they return the smallest value or the biggest value, respectively.
SELECT MAX(Cost_price) AS max_price
FROM product_info;
OR
SELECT MIN(Cost_price) AS min_price
FROM product_info;
AVG
Another aggregate function is the AVG function. This function averages the
values passed in to it. AVG() can be used to return the average value of a column over
all rows or over specific rows. So, using the product_info table again, let’s find the
average of the Cost_price column.
SELECT AVG(Cost_price) AS "Average Cost"
FROM product_info;
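The aggregate functions described above can be run together in a single query. The sketch below assumes Python's built-in sqlite3 module as a stand-in for Oracle and three made-up product rows, one of which has no name, to show that COUNT(column) skips NULL values while COUNT(*) does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product_info (Product_Name TEXT, Cost_price REAL)")
conn.executemany(
    "INSERT INTO product_info VALUES (?, ?)",
    [("Mouse", 300.0), ("Keyboard", 700.0), (None, 2000.0)],  # hypothetical data
)

stats = conn.execute("""
    SELECT COUNT(*),             -- all rows
           COUNT(Product_Name),  -- rows where Product_Name is not NULL
           SUM(Cost_price),
           MIN(Cost_price),
           MAX(Cost_price),
           AVG(Cost_price)
    FROM product_info""").fetchone()
print(stats)
```

Each aggregate collapses the whole table into one value, so the query returns a single row.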
Numeric Functions
ABS
ABS is a function used to return the absolute value of the passed-in column value or
expression. You’ll see that positive values come out positive, zero values come out zero,
and negative values come out positive .
SELECT ABS(-35) FROM dual ;
OUTPUT: 35
POWER
POWER(m, n) returns m raised to the nth power. If m is negative, n must be an integer.
SELECT POWER(4,2) FROM dual ;
OUTPUT: 16
CEIL (or CEILING) and FLOOR
The CEIL and FLOOR functions can be used to find the nearest integer above or
below the supplied value. This works like rounding except you specify which direction the
value will round. Oracle uses the name CEIL; some other DBMSs call the function CEILING.
SELECT CEIL(35.31) AS RoundUp, FLOOR(35.31) AS RoundDown FROM dual;
OUTPUT: RoundUp RoundDown
36 35
ROUND
Speaking of rounding, let’s see what the actual ROUND function will do. ROUND is used
to round a value to the precision specified. If you specify a negative number for the
precision, it counts off that value left from the decimal point. If you specify a positive
number, it counts off right from the decimal point. A value of zero places the precision at
the decimal point. If a negative number is specified that is greater than the number for
digits left of the decimal, 0 is returned. The following query demonstrates several results
from specifying different precisions.
SELECT ROUND(13.18,1) FROM dual;
Output: 13.2
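The effect of positive, zero and negative precision values can be imitated outside the database. The following Python sketch assumes that ties are rounded away from zero, as Oracle does for NUMBER values; the function name sql_round is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

def sql_round(value, places=0):
    """Round at the given decimal place; a negative place counts left of the point."""
    value = Decimal(str(value))
    scale = Decimal(10) ** places
    # Shift the wanted digit to the units position, round there, then shift back.
    return (value * scale).quantize(Decimal(1), rounding=ROUND_HALF_UP) / scale

one_decimal = sql_round(13.18, 1)    # one digit right of the decimal point
units = sql_round(13.18, 0)          # precision at the decimal point
hundreds = sql_round(1318, -2)       # two digits left of the decimal point
too_far = sql_round(13.18, -3)       # more digits than the value has
print(one_decimal)
```

Python's own round function is not used here because it rounds ties to the nearest even digit, which would give 2 rather than 3 for 2.5.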
SIGN
If you just want to find out if a value is positive or negative, you can use the SIGN
function. If the value is zero, it returns 0. If the value is positive, it returns 1. If the value is
negative, it returns -1.
SELECT SIGN(5.31), SIGN(0), SIGN(-5.31) FROM dual;
Output: 1 0 -1
SQUARE and SQRT
You can square a value using SQUARE or find the square root using SQRT. SQUARE is
provided by some DBMSs such as SQL Server; Oracle has no SQUARE function, so
POWER(n, 2) is used instead.
SELECT POWER(3, 2) FROM dual;
Output: 9
Or
SELECT SQRT(16) FROM dual;
Output: 4
String Functions
UPPER and LOWER
Two of the things we want to do the most with string data are changing the characters to
either all uppercase characters or all lowercase characters. UPPER is used for the former
and LOWER for the latter. As with all expressions, these can be used in any part of the
statement that supports expressions.
SELECT UPPER(Name)
FROM Staff;
LTRIM and RTRIM
Another very useful function pair is LTRIM and RTRIM. These functions trim spaces off the
left or right side of the column or expression passed in. LTRIM trims the left side, whereas
RTRIM trims the right side. This becomes important when working with columns of the
data type CHAR. So you can visualize this, let’s plug in values instead of columns. In
Oracle, strings are concatenated with the || operator.
SELECT RTRIM('Hello ') || ' ' || 'Every' || ' ' || 'Body' FROM dual;
OUTPUT
--------------------
Hello Every Body
SUBSTRING (or SUBSTR)
SUBSTRING retrieves the requested number of characters, taking a starting position as
a parameter. This allows users to retrieve a portion of a string from the middle. In
Oracle the function is written SUBSTR(string, start, length). The following query returns
four characters starting at position 2.
SELECT SUBSTR('Hello Every Body',2,4) FROM dual;
Output: ello
LENGTH
LENGTH is a function that can be used to find the length of a column or
expression.
SELECT LENGTH('Hello') FROM dual;
REPLACE
REPLACE is another handy string function. It can be used to replace a portion of a string
with another value. You use this function by specifying, first, the string that contains the
value you’d like to replace, then the portion that needs replacing, and finally the
replacement value.
SELECT Phone, REPLACE(Phone, '317', '111') FROM staff;
DAY, MONTH, and YEAR
In DBMSs such as SQL Server, the DAY, MONTH, and YEAR functions provide the same results
as using the DATEPART function for the day, month, or year, respectively. Oracle provides
the EXTRACT function for the same purpose, for example EXTRACT(YEAR FROM DOB). The
following example shows the syntax.
SELECT DAY('6/21/2003') AS Day,
MONTH('6/21/2003') AS Month,
YEAR('6/21/2003') AS Year;
OUTPUT
Day Month Year
21 6 2003
Conversion Functions
TO_NUMBER
This function is used to convert a character value containing a number to the NUMBER
datatype.
TO_CHAR
This function is used to convert a value of the NUMBER datatype to a value of character
datatype. For example:
SELECT TO_CHAR(42326, '099,999') FROM dual;
Output: 042,326
Joining Tables
A join is the retrieval of data from more than one table. This section shows you how to
merge rows from multiple tables into a single query; this merging of rows is known as a
join. This section experiments with example SELECT statements containing many different
types of joins. A join is created by the RDBMS as needed, and it persists for the duration
of the query execution. The query must name all the tables to be included and state how
they are related to each other.
Types of Joins
1. Cross-join or Cartesian product: Merges all data selected from both tables into a single
result set.
2. Inner join: Combines rows from both tables using matching column names and column
values. The result set includes only rows that match.
3. Outer join: Selects rows from both tables as with an inner join but including rows from
one or both tables that do not have matching rows in the other table. Missing values are
replaced with null values.
4. Self-join: This joins a table to itself.
5. Equi-joins, anti-joins, and range joins: An equi-join combines table data based on
equality (=), an anti-join matches data based on inequality (!=, <> or NOT), and a range
join compares data using a range of values (<, > or BETWEEN).
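The difference between the first three join types in the list above shows up clearly in the row counts. The sketch below assumes Python's built-in sqlite3 module as a stand-in for Oracle, uses the explicit JOIN ... ON syntax rather than a comma-separated table list, and works on made-up students, one of whom has no result row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (RollNo INTEGER, Name TEXT);
    CREATE TABLE Student_result (RollNo INTEGER, Semester INTEGER, Marksobt INTEGER);
    INSERT INTO Student VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meena');
    INSERT INTO Student_result VALUES (1, 1, 640), (2, 1, 580);
""")

# Inner join: only students that have a matching result row.
inner = conn.execute("""
    SELECT s.Name, r.Marksobt
    FROM Student s INNER JOIN Student_result r ON s.RollNo = r.RollNo
    ORDER BY s.RollNo""").fetchall()

# Left outer join: every student; missing results come back as NULL (None).
outer = conn.execute("""
    SELECT s.Name, r.Marksobt
    FROM Student s LEFT OUTER JOIN Student_result r ON s.RollNo = r.RollNo
    ORDER BY s.RollNo""").fetchall()

# Cross join: every student paired with every result row (3 * 2 rows).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM Student CROSS JOIN Student_result").fetchone()[0]

print(inner, outer, cross_count)
```

The student without a result disappears from the inner join but is kept, with a null mark, by the outer join.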
Example: Select RollNo, Name, Semester, Marksobt and Maxmarks from the Student and
Student_result tables.
Sol: SELECT Student.RollNo, Name, Semester, Marksobt, Maxmarks
FROM Student, Student_result
WHERE Student.RollNo = Student_result.RollNo;
Example: Select RollNo, Name, Semester, Marksobt and Maxmarks from the Student and
Student_result tables where Marksobt is more than 600.
Sol: SELECT Student.RollNo, Name, Semester, Marksobt, Maxmarks
FROM Student, Student_result
WHERE Student.RollNo = Student_result.RollNo
AND Student_result.Marksobt > 600;
A column that exists in both tables, such as RollNo, must be qualified with its table name.
Using Table Aliases
An alias is an alternative name for a field, table or value. Column aliases are assigned
with the AS keyword; a table alias is written directly after the table name in the FROM
clause.
Example:
SELECT Customer_name, Customer_address
FROM Customer C, Order O
WHERE C.Customer_id = O.Customer_id;
Cross-Join
A cross-join merges all data from all tables into a single result set regardless of
matching column names or their values.
Example:
SELECT COUNT(*) FROM Student;
SELECT COUNT(*) FROM Staff;
SELECT COUNT(*) FROM
(SELECT St.Name AS Student_name, Sf.Name AS Staff_name FROM Student St, Staff Sf);
If the Student table has 20 rows and the Staff table has 10 rows, the cross-join has
200 (20 * 10) rows.
Self Join
Referring to the same table more than once in a single SELECT statement is called a
self join.
Example:
SELECT Cr1.Customer_id, Cr1.Customer_name, Cr1.Customer_contact
FROM Customer Cr1, Customer Cr2
WHERE Cr1.Customer_name = Cr2.Customer_name AND Cr2.Customer_contact = 'Vikas';
Combined Queries
SQL queries can be combined using the UNION operator. The results of multiple SELECT
statements are combined into a single result set, and duplicate rows are removed.
Example:
SELECT Customer_name, Customer_contact
FROM Customer
WHERE Customer_state IN ('UP','PB','HP')
UNION
SELECT Customer_name, Customer_contact
FROM Customer
WHERE Customer_name = 'Ajit';
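The effect of UNION on duplicate rows can be sketched as follows. The example assumes SQLite via Python's sqlite3 module and three invented customers; UNION ALL, which keeps duplicates, is shown only for contrast and is not used elsewhere in this chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (Customer_name TEXT, Customer_contact TEXT, Customer_state TEXT);
    INSERT INTO Customer VALUES
        ('Ajit',  '98110', 'UP'),
        ('Vikas', '98220', 'PB'),
        ('Meena', '98330', 'DL');
""")

by_state = ("SELECT Customer_name, Customer_contact FROM Customer "
            "WHERE Customer_state IN ('UP','PB','HP')")
by_name = ("SELECT Customer_name, Customer_contact FROM Customer "
           "WHERE Customer_name = 'Ajit'")

# Ajit satisfies both queries; UNION returns him once, UNION ALL twice.
union_rows = conn.execute(f"{by_state} UNION {by_name} ORDER BY Customer_name").fetchall()
union_all_rows = conn.execute(f"{by_state} UNION ALL {by_name} ORDER BY Customer_name").fetchall()

print(union_rows)
print(union_all_rows)
```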
7.3.4 Subqueries
Queries within a query are referred to as subqueries. A subquery is a SELECT statement
nested inside another query. It returns a value to the containing query for evaluation. The
query containing the subquery is referred to as the outer query. A subquery is often
referred to as an inner query, inner select, subselect, or nested query. In a SELECT
statement, subqueries can appear in the SELECT, WHERE, and HAVING clauses. The
statement having a subquery is called the parent statement. The parent statement uses the
result returned by the subquery.
It’s also possible to write a subquery within a subquery. This is called nesting. The
outermost query is the first level, and each level of query below that is a nested level. You
can go several levels deep with nested queries. The breaking point is determined by how
complex the query is and how powerful a system you are using for the RDBMS.
Subqueries within the WHERE Clause
A nested query in the WHERE clause executes another SELECT statement, whose result is
then evaluated with the remainder of the Boolean expression on that line.
Example: Retrieve a list of first-semester students who scored more than 600 marks.
Sol:
SELECT Name FROM Student
WHERE RollNo IN
(SELECT RollNo FROM Student_result WHERE Semester = 1 AND MarksObt > 600);
IN is used rather than = because the subquery can return more than one row.
Example:
SELECT Customer_name, Customer_city FROM Customer
WHERE Customer_id IN (SELECT Customer_id
FROM Order
WHERE Order_no IN (SELECT Order_no FROM OrderItem
WHERE Product_ID = 1003));
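The two-level nesting in the last example can be tried on a toy data set. This is a sketch under stated assumptions: SQLite via Python's sqlite3 module, invented rows, and the order table named Orders because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (Customer_id INTEGER, Customer_name TEXT, Customer_city TEXT);
    CREATE TABLE Orders (Order_no INTEGER, Customer_id INTEGER);   -- 'Orders' is an assumed name
    CREATE TABLE OrderItem (Order_no INTEGER, Product_ID INTEGER);
    INSERT INTO Customer VALUES (1, 'Ajit', 'Patiala'), (2, 'Vikas', 'Amritsar');
    INSERT INTO Orders VALUES (132, 1), (133, 2);
    INSERT INTO OrderItem VALUES (132, 1003), (133, 1004);
""")

# Innermost query: orders containing product 1003.
# Middle query: customers who placed those orders.
# Outer query: name and city of those customers.
buyers = conn.execute("""
    SELECT Customer_name, Customer_city FROM Customer
    WHERE Customer_id IN (SELECT Customer_id FROM Orders
                          WHERE Order_no IN (SELECT Order_no FROM OrderItem
                                             WHERE Product_ID = 1003))
""").fetchall()

print(buyers)
```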
Updating Data
The UPDATE statement is used to modify the data in a table. UPDATE can be used
in two ways:
to update a few rows in a table (with a WHERE clause)
to update all rows in a table (without a WHERE clause)
Example: The staff member with StaffID 105 has moved to Patiala, so his record needs
updating.
Sol:
UPDATE Staff
SET City = 'Patiala' WHERE StaffID = 105;
Example: Updating multiple columns.
Sol:
UPDATE Staff
SET City = 'Patiala', Address = 'H.No. 56 Jaswant Nag' WHERE StaffID = 104;
Example: In the Student_result table, put the value 0 in the field Marksobt where Remarks
are 'Result Late'.
Sol:
UPDATE Student_result
SET Marksobt = 0
WHERE Remarks = 'Result Late';
Deleting Data
The DELETE statement is used to remove row(s) from a table. DELETE can be
used in two ways:
to delete a few rows from a table (with a WHERE clause)
to delete all rows from a table (without a WHERE clause)
The DELETE command deletes only rows, not columns.
Example: Delete the row from Student where RollNo is 701.
Sol:
DELETE FROM Student
WHERE RollNo = 701;
Example: Delete many rows from the Student_result table where Marksobt is less than or
equal to 300.
Sol:
DELETE FROM Student_result
WHERE MarksObt <= 300;
Example: Delete all rows of the Student_result table.
Sol:
DELETE FROM Student_result;
Example: Delete the table Student_result itself. This removes the table definition as well
as its rows, so DROP TABLE is used instead of DELETE.
Sol:
DROP TABLE Student_result;
7.3.4 Maintaining Database Objects
Creating and Using Views
A view is a saved query that can function in many of the ways a table does. In
other words, views are virtual tables. Views can be used to control security. We can even
create a view that comprises more than one table. Each view must have a unique name. The
CREATE VIEW statement is used to form a view. To remove a view, the DROP VIEW statement
is used.
Here are some of the more common reasons for creating a view:
Security: Users are permitted to use the view, but not the base tables.
Simplicity: a view can be created by combining tables that have complex relationships so
users writing queries do not need to understand the relationships.
Complex Joins: Sometimes queries cannot be done without great difficulty unless you
create a view in something like a temporary table first.
The general syntax is pretty simple.
CREATE VIEW view_name
AS
select_statement
Example: Form a view CustomerList of all customers who have ordered any product.
Sol:
CREATE VIEW CustomerList AS
SELECT Customer_name, Customer_address, Product_ID
FROM Customer, Order, OrderItem
WHERE Customer.Customer_id = Order.Customer_id
AND OrderItem.Order_no = Order.Order_no;
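Because a view is a saved query rather than a copy of the data, it always reflects the current contents of its base tables. The sketch below illustrates this with a simplified CustomerList view; it assumes SQLite via Python's sqlite3 module, invented rows, and an order table named Orders (ORDER is a reserved word).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (Customer_id INTEGER PRIMARY KEY, Customer_name TEXT);
    CREATE TABLE Orders (Order_no INTEGER PRIMARY KEY, Customer_id INTEGER);
    INSERT INTO Customer VALUES (1, 'Ajit'), (2, 'Vikas');
    INSERT INTO Orders VALUES (132, 1);

    -- Simplified version of the CustomerList view: customers with at least one order.
    CREATE VIEW CustomerList AS
        SELECT DISTINCT c.Customer_name
        FROM Customer c, Orders o
        WHERE c.Customer_id = o.Customer_id;
""")

query = "SELECT Customer_name FROM CustomerList ORDER BY Customer_name"
before = conn.execute(query).fetchall()

# A new order in the base table is visible through the view immediately.
conn.execute("INSERT INTO Orders VALUES (133, 2)")
after = conn.execute(query).fetchall()

print(before)
print(after)
```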
Understanding Indexes
Indexes provide fast access to data and improve the performance of retrieval
operations. Indexes are also used for data filtering and data sorting. An SQL index is
created on a column or a group of columns within a table, so multiple columns can be
defined in one index. The syntax to create an index:
CREATE [UNIQUE] INDEX indexname
ON tablename (columnname [ASC | DESC] [,…n]);
Dropping an Index
An index can be deleted at any time. No data will be lost from the table, and the
index can be recreated at any time.
The syntax to drop an index:
DROP INDEX tablename.indexname;
Some systems, Oracle among them, identify the index by its name alone:
DROP INDEX indexname;
Example: Create an index on the Customer_name column of the Customer table.
Sol:
CREATE INDEX idx_LastName
ON Customer (Customer_name);
Example: Remove the index on the Customer_name column of the Customer table.
Sol:
DROP INDEX Customer.idx_LastName;
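The optional UNIQUE keyword makes an index enforce uniqueness as well as speed up lookups, and dropping the index leaves the table data untouched. A minimal sketch, assuming SQLite via Python's sqlite3 module (where DROP INDEX takes the index name alone) and a hypothetical index name idx_customer_name:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Customer_id INTEGER, Customer_name TEXT)")
conn.execute("INSERT INTO Customer VALUES (1, 'Ajit')")

# A UNIQUE index rejects a second row with the same indexed value.
conn.execute("CREATE UNIQUE INDEX idx_customer_name ON Customer (Customer_name)")
try:
    conn.execute("INSERT INTO Customer VALUES (2, 'Ajit')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Dropping the index removes the constraint but not the rows already stored.
conn.execute("DROP INDEX idx_customer_name")
conn.execute("INSERT INTO Customer VALUES (2, 'Ajit')")
row_count = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]

print(rejected, row_count)
```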
Triggers
A trigger is very similar to a stored procedure, except that a trigger is executed
automatically in response to a specific event. A trigger can be defined to automatically
“fire” whenever an INSERT, an UPDATE, or a DELETE command is issued on a particular
table. We can do the following things with triggers:
Write to an audit log when rows are changed.
Synchronize changes to a backup database.
Cascade changes and maintain referential integrity.
Enforce complex data validation and business rules.
Example: Create a trigger that converts the Customer_state column in the Customer table
to uppercase on all INSERT and UPDATE operations.
Sol:
CREATE TRIGGER customer_st
BEFORE INSERT OR UPDATE ON Customer
FOR EACH ROW
BEGIN
:NEW.Customer_state := UPPER(:NEW.Customer_state);
END;
A BEFORE row trigger is used so that the incoming value can be changed through :NEW
before it is stored; a row trigger that issues an UPDATE on its own table would fail in
Oracle with a mutating-table error.
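The behaviour of such a trigger can be observed in a small experiment. This is a sketch only, assuming SQLite via Python's sqlite3 module: SQLite triggers cannot assign to the new row, so the sketch uses two AFTER triggers (hypothetical names customer_st_ins and customer_st_upd) that issue a corrective UPDATE, guarded by a WHEN clause so that they fire only when a change is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (Customer_id INTEGER PRIMARY KEY,
                           Customer_name TEXT, Customer_state TEXT);

    -- Fires after each inserted row whose state is not already uppercase.
    CREATE TRIGGER customer_st_ins AFTER INSERT ON Customer
    FOR EACH ROW WHEN NEW.Customer_state <> UPPER(NEW.Customer_state)
    BEGIN
        UPDATE Customer SET Customer_state = UPPER(NEW.Customer_state)
        WHERE Customer_id = NEW.Customer_id;
    END;

    -- Same correction when the state column is updated.
    CREATE TRIGGER customer_st_upd AFTER UPDATE OF Customer_state ON Customer
    FOR EACH ROW WHEN NEW.Customer_state <> UPPER(NEW.Customer_state)
    BEGIN
        UPDATE Customer SET Customer_state = UPPER(NEW.Customer_state)
        WHERE Customer_id = NEW.Customer_id;
    END;
""")

get_state = "SELECT Customer_state FROM Customer WHERE Customer_id = 1"

conn.execute("INSERT INTO Customer VALUES (1, 'Ajit', 'pb')")
state_after_insert = conn.execute(get_state).fetchone()[0]

conn.execute("UPDATE Customer SET Customer_state = 'up' WHERE Customer_id = 1")
state_after_update = conn.execute(get_state).fetchone()[0]

print(state_after_insert, state_after_update)
```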
7.3.5 Commit and Rollback
Rollback
A transaction can be rolled back in either of two ways:
A ROLLBACK statement is explicitly defined in the transaction. When the
statement is executed, actions are undone, the database is returned to the state it was in
when the transaction was initiated, and the transaction is terminated. If the ROLLBACK
statement references a save point, only the actions taken after the save point are undone,
and the transaction is not terminated.
The program that initiated the transaction is interrupted, causing the program to abort.
In the event of an abnormal interruption, which can be the result of hardware or software
problems, all changes are rolled back, the database is returned to its original state, and
the transaction is terminated. A transaction terminated in this way is similar to
terminating a transaction by using a ROLLBACK statement.
Commit
A transaction can be committed in either of two ways:
A COMMIT statement is explicitly defined in the transaction. When the statement is
executed, all transaction-related changes are saved to the database, and the transaction is
terminated. Once these changes are committed, they cannot be rolled back.
The program successfully completes its execution. All transaction-related
changes are saved to the database, and the transaction is terminated. A transaction
terminated in this way is similar to terminating a transaction by using a COMMIT
statement.
Syntax of COMMIT:
COMMIT ;
Syntax of ROLLBACK:
ROLLBACK ;

Example: Use of the COMMIT statement.
Sol:
DELETE FROM OrderItem WHERE Order_no = 132;
DELETE FROM Order WHERE Order_no = 132;
COMMIT;
Example: Undo a DELETE operation that has just been performed.
Sol:
DELETE FROM Staff;
ROLLBACK;
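The two examples above can be replayed to confirm that a rollback restores deleted rows while a commit makes a change permanent. This is a minimal sketch, assuming SQLite via Python's sqlite3 module, whose connection methods commit() and rollback() play the role of the COMMIT and ROLLBACK statements; the three staff rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (StaffID INTEGER, Name TEXT)")
conn.executemany("INSERT INTO Staff VALUES (?, ?)",
                 [(104, 'Ajit'), (105, 'Vikas'), (289, 'Rajiv')])
conn.commit()  # the three rows are now permanent

def staff_count():
    return conn.execute("SELECT COUNT(*) FROM Staff").fetchone()[0]

# DELETE followed by ROLLBACK: the rows come back.
conn.execute("DELETE FROM Staff")
count_inside_txn = staff_count()
conn.rollback()
count_after_rollback = staff_count()

# DELETE followed by COMMIT: a later ROLLBACK cannot undo it.
conn.execute("DELETE FROM Staff WHERE StaffID = 105")
conn.commit()
conn.rollback()
count_after_commit = staff_count()

print(count_inside_txn, count_after_rollback, count_after_commit)
```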
Short Answer type Question
1. SQL stands for………………
2. ……………….. statement is used to form a new table.
3. ……………..statement is used to change the name of a table
4. To add record(s) to a table ……………. statement is used.
5. DDL stands for…………..
6. DML stands for……………..
7. The IS NULL operator tests for …………..
8. A table may have multiple primary keys. (True/False)
9. DROP COLUMN statement is used to create a new column. (True/False)
10. ALTER TABLE and CREATE TABLE are same statements. (True/False)
11. The SELECT statement is a way to view the data. (True/False)
12. SQL is a procedural language. (True/False)
13. Self-join joins a table to itself. (True/False)
14. A subquery within a subquery is called nested query. (True/False)
15. View and table are same. (True/False)
Long Answer type Question
1. Write various SQL Tools.
2. Explain various Data Types.
3. How can you create tables with Primary Key or NOT NULL values?
4. Explain modification of table structure with examples.
5. How can you retrieve data from tables?
6. Write the various conditional operators.
7. What is the difference between the IN and BETWEEN operators?
8. Explain the Order By, Group By and Having clauses.
9. Explain Aggregate Functions, Numeric Functions and String Functions.
10. How can you join tables using SQL?
11. What is a subquery?
12. How can you update data using SQL?
13. Write short notes on the following:
i. Trigger
ii. View
iii. Index
iv. Commit
v. Rollback
vi. Transaction

Chapter 8
INTRODUCTION TO PL/SQL

8.1 What Is PL/SQL?


PL/SQL stands for “Procedural Language extensions to the Structured Query Language.”
PL/SQL truly is an easy language, compared to other programming languages. A computer
language is a particular way of giving step-by-step instructions to a computer.
Procedural refers to a series of ordered steps that the computer should follow to produce a
result. A procedural language includes data structures that hold information that can be
used several times. Programs written in such a language use its sequential, conditional,
and repetitive constructs to express algorithms. PL/SQL is in the same family of languages
as BASIC, COBOL, Pascal, and C.
It relies on a highly structured “block” design with different sections, all identified with
explicit, self-documenting keywords. PL/SQL is a procedural extension of SQL and is closely
related to it. PL/SQL is a programming language in its own right; it has its own syntax,
its own rules, and its own compiler.
Oracle’s PL/SQL language has several defining characteristics:
• It is a highly structured, readable, and accessible language.
• It is a standard and portable language for Oracle development.
• It is an embedded language. PL/SQL was not designed to be used as a standalone
language, but instead to be invoked from within a host environment.
• It is a high-performance, highly integrated database language.
8.1.1 Advantages of PL/SQL
PL/SQL is tightly integrated with SQL. With PL/SQL, you can use all SQL data
manipulation, cursor control, and transaction control statements, and all SQL functions,
operators, and pseudocolumns. PL/SQL fully supports SQL data types. You need not
convert between PL/SQL and SQL data types. For example, if your PL/SQL program
retrieves a value from a database column of the SQL type VARCHAR2, it can store that
value in a PL/SQL variable of the type VARCHAR2.
Branching, conditional checking and looping are possible in PL/SQL.
An entire block of a PL/SQL program is sent to the RDBMS engine in a single step. This can
drastically reduce network traffic between the database and an application. A user can use
PL/SQL blocks and subprograms (procedures and functions) to group SQL statements
before sending them to the database for execution.
PL/SQL also deals with errors generated during execution of code. With PL/SQL,
user-friendly messages can be displayed.
We can declare variables in PL/SQL blocks of code. These variables are used to store the
results of a query.
PL/SQL can also perform different kinds of calculations in procedural code, without
issuing SQL statements.
PL/SQL stored subprograms move application code from the client to the server, where
user can protect it from tampering, hide the internal details, and restrict who has access.
Applications written in PL/SQL can run on any operating system and platform where the
database runs.

8.1.2 SQL*Plus
SQL*Plus is an interactive program that allows you to type in and execute SQL
statements. It also enables you to type in PL/SQL code and send it to the server to be
executed. SQL*Plus is one of the most common front ends used to develop and create
stored PL/SQL procedures and functions.
What happens when you run SQL*Plus and type in a SQL statement? Where does the
processing take place? What exactly does SQL*Plus do, and what does the database do? If
you are in a Windows environment and you have a database server somewhere on the
network, the following things happen:
1. SQL*Plus transmits your SQL query over the network to the database server.
2. SQL*Plus waits for a reply from the database server.
3. The database server executes the query and transmits the results back to SQL*Plus.
4. SQL*Plus displays the query results on your computer screen.
The important thing is that SQL*Plus does not execute your SQL queries. SQL*Plus also
does not execute your PL/SQL code. SQL*Plus simply serves as your window into the
Oracle database, which is where the real action takes place.
8.2 PL/SQL Block Structure
The smallest meaningful grouping of code is known as a block. A block is a unit of code
that provides execution and scoping boundaries for variable declarations and exception
handling. PL/SQL allows us to create anonymous (unnamed) blocks and named blocks, which
are either procedures or functions. A PL/SQL block has up to four different sections.
Header: Used only for named blocks. The header determines the way the named block or
program must be called. It is optional.
Declaration section: Identifiers such as variables, cursors, and other objects that are
referenced in the execution and exception sections are declared in this section. It is
optional.
Example:
DECLARE
fname VARCHAR2(30);
lname VARCHAR2(30);
cost NUMBER := 0;
Execution section: This section contains executable statements (SQL and PL/SQL) that
allow the user to manipulate the variables that have been declared in the declaration
section. It is a compulsory section.
Example 1:
BEGIN
SELECT firstname, lastname
INTO fname, lname
FROM student
WHERE studentID = 163;
DBMS_OUTPUT.PUT_LINE ('Student name: '||fname|| ' '||lname);
END;
Exception section: This section contains statements that are executed when a runtime
error occurs within the block. Runtime errors can occur due to faulty logic, missing data
or the violation of a validation rule. It is an optional section.
Example 2:
BEGIN
SELECT firstname, lastname
INTO fname, lname
FROM Student
WHERE studentID = 163;
DBMS_OUTPUT.PUT_LINE ('Student name: '||fname|| ' '||lname);
EXCEPTION
WHEN NO_DATA_FOUND THEN
DBMS_OUTPUT.PUT_LINE ('No student found with '|| 'Student ID 163');
END;
8.3 The PL/SQL Character Set
A PL/SQL program is made up of several statements, each built from characters. The precise
characters available to us will depend on what database character set we are using. For
example, the following table illustrates the available characters:
Type Characters
Letters A-Z, a-z
Digits 0-9
Symbols ~ ! @ # $ % * ( ) _ - + = | : ; “ ‘ < > , . ? / ^
Whitespace Tab, space, newline, carriage return
Characters are grouped together into lexical units, also called the atoms of the language
because they are the smallest individual components. A lexical unit in PL/SQL is any of
the following:
Identifier
Literal
Delimiter
Comment
8.4 PL/SQL Terminology
PL/SQL deals both with the database and with the procedural world. Here are a few concepts
and terms we need to know.
Keyword: This book uses the term keyword to mean a word that the language recognizes.
In PL/SQL, keywords include BEGIN, END, IF, and RETURN.
Identifier: An identifier is a name for a PL/SQL object, including any of the following:
Constant or variable
Exception
Cursor
Program name: procedure, function, package, object type, trigger, etc.
Reserved word
Label
Default properties of PL/SQL identifiers are summarized below:
Up to 30 characters in length
Must start with a letter
Can include $ (dollar sign), _ (underscore), and # (pound sign)
Cannot contain any “whitespace” characters
Some examples of invented identifiers: Total_balance, Total_Cost.
Datatype : A name for a class of values. PL/SQL’s built-in datatypes include NUMBER,
DATE, and VARCHAR2 .
Variable: Variables are named temporary storage locations that support a particular data
type in your PL/SQL program. Some variables can hold only a single thing, like the
number of people who live in Portugal, and some can hold a list of things, like the birth
dates of my family members.
Variable Naming
Like a SQL or database data type, PL/SQL variables must follow the identifier naming
rules:
• A variable name must be less than 31 characters in length.
• A variable name must start with an uppercase or lowercase ASCII letter: A–Z or a–z.
PL/SQL is not case-sensitive.
• A variable name may be composed of 1 letter, followed by up to 29 letters, numbers, or
the underscore (_) character. You can also use the number (#) and dollar sign ($)
characters.
Declaring, declaration, declaration section: Declaring a variable means naming it and
defining its datatype. With few exceptions, variables must be declared prior to use. In
PL/SQL, these designations most often occur in a separate section of the program called
the declaration section. Declarations are not, strictly speaking, “statements” themselves.
String: Some amount of textual data—that is, characters, words, spaces, punctuation, and
sometimes numerals. A string can contain zero, one, or more individual characters. String
values can be stored in variables with the appropriate datatype, such as VARCHAR2.
String values are bounded by single quotes, as in ‘Hello Everybody’.
NULL: A special value that represents the absence of a real value. Let’s further define
NULL:
• NULL is not equal to anything, not even NULL.
• NULL is not less than or greater than anything else, not even NULL.
• NULL means that the value is unknown, so any comparison with NULL is itself unknown.
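These rules about NULL hold in SQL as well as in PL/SQL and can be checked with a few comparisons. The sketch assumes SQLite via Python's sqlite3 module, where an unknown result is returned to Python as None and a true result as 1; the two result rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparisons with NULL yield NULL (unknown); only IS NULL gives a definite answer.
null_checks = conn.execute(
    "SELECT NULL = NULL, NULL <> NULL, NULL < 1, NULL IS NULL"
).fetchone()

conn.executescript("""
    CREATE TABLE Student_result (RollNo INTEGER, Remarks TEXT);
    INSERT INTO Student_result VALUES (1, NULL), (2, 'Pass');
""")

# '= NULL' never matches a row; 'IS NULL' finds the row with the missing value.
eq_null = conn.execute("SELECT RollNo FROM Student_result WHERE Remarks = NULL").fetchall()
is_null = conn.execute("SELECT RollNo FROM Student_result WHERE Remarks IS NULL").fetchall()

print(null_checks, eq_null, is_null)
```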
Boolean : A class of variables and commands for working with the “truth values” of true
and false. Oracle Booleans, which are available in PL/SQL but not SQL, actually have
three possible values, TRUE, FALSE, and NULL.
Literal: A literal is a value that is not represented by an identifier; it is simply a value.
Literals may be string, numeric, or Boolean values. Examples: 106700, TRUE, 'Ajit
Singh'
Expression : A formula that evaluates some value at runtime based on one or more other
values. Examples: b + c, NOT done.
Operator : A character or phrase that the language uses to represent some particular
arithmetic, logical, or other function. Examples: +, -, AND, BETWEEN, :=.
Statement: A programmatic instruction to the computer to do something. Every statement
is composed of up to five main elements: literal values, keywords, programmer-supplied
identifiers, operators, and a mandatory terminator. Some statements such as
IF-THEN-ELSE incorporate other statements inside them.
Terminator: A special character that you must put after each complete statement and each
declaration. In PL/SQL, the terminator is the semi-colon (;). The terminator announces
“okay, I’m through with this part.” It’s important to realize that the terminator goes only at
the very end of the entire statement, and that the statement may span several lines in the
file.
Block: A sequence of code that includes executable statements and that is bounded by
certain keywords. Virtually all PL/SQL programs incorporate one or more blocks, and
every block encloses one or more statements. Blocks can even be nested inside one
another.
8.5 PL/SQL Input and Output
Most PL/SQL input and output (I/O) is through SQL statements that store data in database
tables or query those tables. All other PL/SQL I/O is done through APIs, such as the
PL/SQL package DBMS_OUTPUT.
DBMS_OUTPUT: A collection of procedures and functions used to collect data in a
system buffer. This data can be retrieved later on. The DBMS_OUTPUT package enables
us to send messages from stored procedures, packages, and triggers.
PUT_LINE: The PUT_LINE procedure in this package enables a user to place information in a
buffer that can be read by another trigger, procedure, or package.
Writing first PL/SQL Program
BEGIN
DBMS_OUTPUT.PUT_LINE('Hello Everybody');
END;
This is called an anonymous block, a block with no name. It contains only one executable
statement, a call to the procedure PUT_LINE, supplied in Oracle’s built-in package
named DBMS_OUTPUT. PUT_LINE is used to print a message or data.
Entering PL/SQL Statements into SQL*Plus
In SQL*Plus the buffered output is displayed only after SET SERVEROUTPUT ON has been
issued in the session.
SQL> SET SERVEROUTPUT ON
SQL> BEGIN
2 DBMS_OUTPUT.PUT_LINE('Hello Everybody');
3 END;
4 /
Hello Everybody
PL/SQL procedure successfully completed.
8.6 Common Operators
An operator is a symbol or keyword that the language provides to perform an arithmetic,
logical, or other function. The following table shows common PL/SQL operators.
Category                    Notation      Meaning                  Example
Assignment                  :=            Store the value          c := b;
Arithmetic                  +             Addition                 c := a + b;
                            -             Subtraction              c := a - b;
                            *             Multiplication           c := a * b;
                            /             Division                 c := a / b;
                            **            Exponentiation           c := a ** b;
Logical                     AND           Conjunction              a AND b
                            OR            Disjunction              a OR b
                            NOT           Negation                 NOT a
Comparison (of non-nulls)   =             Equality                 b = c
                            !=            Inequality               b != c
                            <             Less than                b < c
                            >             Greater than             b > c
                            <=            Less than or equal       b <= c
                            >=            Greater than or equal    b >= c
                            IN            Equality disjunction     b IN (a, c [, d, … ])
                            BETWEEN       Range test               b BETWEEN a AND c
Comparison (of nulls)       IS NULL       Nullity test             a IS NULL
                            IS NOT NULL   Non-nullity test         a IS NOT NULL
String                      LIKE          Matching wildcard        student_name LIKE 'Su%'
                            ||            Concatenation            name := 'Ajit ' || 'Singh';
8.7 Conditional and Sequential Control
8.7.1 IF Statements
The IF statement allows us to design conditional logic into programs. For example, take the
following case, where we have to decide which one of several actions is to be executed.
If the salary is between ten and twenty thousand, then apply a bonus of Rs1,500. If the
salary is between twenty and forty thousand, apply a bonus of Rs1,000. If the salary is
over forty thousand, give the employee a bonus of Rs500.
Some constructs are required in our programming language so that they can respond to all
sorts of situations and requirements, including conditional behavior, such as: “if a is true,
then do b, otherwise do c.”
PL/SQL supports conditional logic with the IF statement:
IF condition THEN
Statement 1;
……………
………….….
Statement N;
END IF;
When an IF-THEN statement is executed, the condition is evaluated. If the value of the
condition is TRUE, control is transferred to the first executable statement of the IF-THEN
construct. If the value of the condition is FALSE or NULL, control is passed to the first
executable statement after the END IF statement.
Example 3:
DECLARE
num1 NUMBER := 15;
num2 NUMBER := 10;
temp NUMBER;
BEGIN
IF num1 > num2 THEN
temp := num1;
num1 := num2;
num2 := temp;
END IF;
DBMS_OUTPUT.PUT_LINE ('num1 = '||num1);
DBMS_OUTPUT.PUT_LINE ('num2 = '||num2);
END;
8.7.2 IF-THEN-ELSE Statement
IF-THEN-ELSE format is used when we want to choose between two mutually exclusive
actions. An IF-THEN-ELSE statement allows us to specify two groups of statements.
When the value of condition is TRUE, first group of statements is executed. When the
condition evaluates to FALSE another group of statements is executed. This is shown as
follows:
IF condition THEN
statement 1;
ELSE
statement 2;
END IF;
statement 3;
When the value of condition is TRUE, control is passed to statement 1; when the value of
condition is FALSE, control is transferred to statement 2. After the IF-THEN-ELSE
construct has completed, statement 3 is executed.
Example 4:
DECLARE
num1 NUMBER := 15;
num2 NUMBER := 10;
BEGIN
IF num1 > num2 THEN
DBMS_OUTPUT.PUT_LINE ('Greater = '||num1);
ELSE
DBMS_OUTPUT.PUT_LINE ('Greater = '||num2);
END IF;
END;
8.7.3 ELSIF Statements
An ELSIF statement has the following structure:
IF condition1 THEN
statement 1;
ELSIF condition2 THEN
statement 2;
ELSIF condition3 THEN
statement 3;
……………
ELSE
last_statement;
END IF;
Example 5:
DECLARE
num NUMBER := &inputnum;
BEGIN
IF num < 0 THEN
DBMS_OUTPUT.PUT_LINE (num||' is a negative number');
ELSIF num = 0 THEN
DBMS_OUTPUT.PUT_LINE (num||' is equal to zero');
ELSE
DBMS_OUTPUT.PUT_LINE (num||' is a positive number');
END IF;
END;
8.7.4 CASE Statements
The CASE statement is a readable and efficient alternative to a long series of
IF tests on the same expression. There are two forms of the CASE statement: simple and
searched. The syntax of the so-called simple CASE statement is:
CASE selector
WHEN expression 1 THEN statement 1;
WHEN expression 2 THEN statement 2;
……………
WHEN expression m THEN statement m;
ELSE statement m+1;
END CASE;
Example 6:
DECLARE
num NUMBER := &input_num;
num2 NUMBER;
BEGIN
num2 := MOD(num,2);
CASE num2
WHEN 0 THEN
DBMS_OUTPUT.PUT_LINE (num||' is an even number');
ELSE
DBMS_OUTPUT.PUT_LINE (num||' is an odd number');
END CASE;
DBMS_OUTPUT.PUT_LINE ('Good Bye');
END;
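The searched form of CASE has no selector; each WHEN carries its own Boolean condition, and the first condition that is true decides the result. The sketch below applies this to the bonus rule from section 8.7.1. It is an illustration under stated assumptions: it uses the SQL CASE expression (not the PL/SQL CASE statement) in SQLite via Python's sqlite3 module, a hypothetical Salary column in Staff, and a bonus of 0 for salaries below ten thousand, a case the rule does not mention.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (StaffID INTEGER, Salary INTEGER)")  # Salary column is assumed
salaries = [5000, 15000, 30000, 50000]
conn.executemany("INSERT INTO Staff VALUES (?, ?)", list(enumerate(salaries, start=101)))

# Conditions are tested from top to bottom; the first true one supplies the value.
bonus_rows = conn.execute("""
    SELECT Salary,
           CASE
               WHEN Salary BETWEEN 10000 AND 20000 THEN 1500
               WHEN Salary > 20000 AND Salary <= 40000 THEN 1000
               WHEN Salary > 40000 THEN 500
               ELSE 0
           END AS Bonus
    FROM Staff
    ORDER BY Salary
""").fetchall()

print(bonus_rows)
```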
8.8 Looping using PL/SQL
When a statement or group of statements is executed several times, it is called looping. In
PL/SQL, there are four types of loops: simple loops, WHILE loops, numeric FOR loops,
and cursor FOR loops. While there are differences among these loop constructs, every
loop has two parts: the loop boundary and the loop body:
Loop boundary: This is composed of the reserved words that initiate the loop, the
condition that causes the loop to terminate, and the END LOOP statement that ends the
loop.
Loop body: This is the sequence of executable statements inside the loop boundary that
execute on each iteration of the loop.
8.8.1 Simple (Infinite) Loop
This loop has the simplest loop structure. Without an exit condition it never stops, so it
is also called an infinite loop. It has the following syntax:
LOOP
statement 1;
statement 2;
…………….
………….
statement n;
END LOOP;
The EXIT statement is used to terminate a loop when the EXIT condition evaluates to
TRUE. An IF statement is used to evaluate the EXIT condition. Now it will not be an
infinite loop. It has the following syntax:
LOOP
Statement1;
Statement2;
IF Condition THEN
EXIT;
END IF;
END LOOP;
8.8.2 WHILE Loops
The WHILE loop executes as long as the mentioned Boolean condition evaluates to
TRUE. It has the following syntax:
WHILE condition LOOP
statement 1;
statement 2;
………………
statement n;
END LOOP;
The reserved word WHILE is used to start a loop construct. Then condition is evaluated to
TRUE or FALSE. The result of evaluation decides whether the loop is executed or not.
Statements 1 through n, are executed repeatedly. The END LOOP is a reserved phrase
used to terminate loop construct.
Example 7: Print numbers from 1 to 10
DECLARE
counter NUMBER := 1;
BEGIN
WHILE counter <= 10 LOOP
DBMS_OUTPUT.PUT_LINE (' '||counter);
counter := counter + 1;
END LOOP;
END;
8.8.3 FOR Loop
A numeric FOR loop is called numeric because it iterates over a range of integer values.
The numeric FOR loop is the traditional and familiar “counted” loop. The number of
iterations of the FOR loop is known when the loop starts; it is specified in the range
scheme found between the FOR and LOOP keywords in the boundary. Do not declare the loop
index. PL/SQL automatically and implicitly declares it as a local variable with datatype
INTEGER. It has the following syntax:
FOR loop_counter IN [REVERSE] lowest_number..highest_number LOOP
statement 1;
statement 2;
…………….
statement n;
END LOOP;
The word FOR indicates the beginning of a FOR loop. loop_counter is the index variable;
it is defined implicitly by the loop construct. lowest_number and highest_number are two
integer values that define the number of repetitions of the loop. They are evaluated once,
before the first execution of the loop, and this determines how many times the loop will be
repeated. Statements 1 through n are a sequence of statements that is executed repeatedly.
END LOOP is a reserved phrase that indicates the end of the loop.
Example 8: Print numbers from 1 to 10
BEGIN
FOR counter IN 1..10 LOOP
DBMS_OUTPUT.PUT_LINE (' '||counter);
END LOOP;
END;
8.9 SQL in PL/SQL
A transaction in Oracle is a series of SQL statements that have been bundled together into
a logical unit. PL/SQL is tightly coupled with the Oracle database via the SQL language.
From within PL/SQL, a user can execute any Data Manipulation Language statement, such as
INSERT, UPDATE, DELETE, and, of course, queries.
INSERT statement
There are two basic types of INSERT statement.
Inserting a single row with a list of values:
INSERT INTO table [(column 1, column 2, …, column n)]
VALUES (value 1, value 2, …, value n);
Inserting one or more rows into a table as returned by a SELECT statement against one or
more other tables:
INSERT INTO table [(column 1, column 2, …, column n)]
SELECT ……..;
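The second form, which fills a table from the result of a query, can be sketched as follows. The example assumes SQLite via Python's sqlite3 module, invented student rows, and a hypothetical target table FirstSemStudent that is not part of the book's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (RollNo INTEGER, Name TEXT, Semester INTEGER);
    CREATE TABLE FirstSemStudent (RollNo INTEGER, Name TEXT);  -- hypothetical target table
    INSERT INTO Student VALUES (101, 'Ajit', 1), (102, 'Vikas', 2), (103, 'Rajiv', 1);

    -- One INSERT statement copies every row returned by the SELECT.
    INSERT INTO FirstSemStudent (RollNo, Name)
        SELECT RollNo, Name FROM Student WHERE Semester = 1;
""")

copied = conn.execute("SELECT RollNo, Name FROM FirstSemStudent ORDER BY RollNo").fetchall()
print(copied)
```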
Example:
BEGIN
INSERT INTO Staff VALUES (289, 'Rajiv', '7-May-78', 'Lab Astt.', 'ECE', 'H. No 75 JP Nag.', 'Amritsar', '');
END;
UPDATE statement
One or more columns can be updated in one or more rows using UPDATE. It
has the following syntax:
UPDATE tablename
SET column = val1
[,column2 = val2, … columnN = valN]
[WHERE where clause];
DELETE statement
DELETE is used to remove one, some, or all the rows in a table. It has the
following basic syntax:
DELETE FROM table
[WHERE where-clause];
Example 9:
DECLARE
Rollnum NUMBER := 110;
BEGIN
DELETE FROM Student WHERE RollNo = Rollnum;
END;
Variables Initialization with SELECT INTO
The SELECT INTO statement is used to initialize variables. The query must return exactly
one row. It has the following syntax:
SELECT itemname INTO variablename
FROM tablename;
Example 10:
DECLARE
studentname VARCHAR2(20);
BEGIN
SELECT Name INTO studentname
FROM student WHERE RollNo = 101;
DBMS_OUTPUT.PUT_LINE('The name of student is '|| studentname );
END;
8.10 Stored Procedure
SQL acts as the interface to the database. A client program, whether it exists on the same
computer or on another, makes a connection to the database, sends a request in the form of
SQL to the server, and in return gets back structured data.
A stored procedure is a program that resides inside an Oracle database and manipulates
data in the database before the data is used outside the database. A procedure is a module
performing one or more actions; it does not need to return any values.
Procedure is made up of:
1. A declarative part
2. An executable part
3. An optional exception handling part
Declarative part: It may include the declarations of variables, constants, cursors,
subprograms and exceptions. These are local to the procedure and do not exist outside
it.
Executable part: This part includes SQL and PL/SQL statements. This block manipulates
data, and the result is returned to the calling portion.
Exception handling part: PL/SQL provides a feature, known as exception handling, to handle
the exceptions that occur in a PL/SQL block. Using exception handling we can test
the code and keep it from exiting abruptly. When an exception occurs, a message that
explains its cause is received. When an exception is raised, Oracle searches for an
appropriate exception handler in the exception section.
Use of Stored Procedures
Here is a list of reasons to use stored procedures:
They eliminate network round trips, because the code runs inside the database.
They allow us to more accurately model the real world in the database.
They provide us with access to functionality that is not available through the standard
database interface: SQL.
Using PL/SQL, you can write stored procedures for the following:
Data processing
Data migration
Entity behavior, including so-called business rules
Interfaces
Reports
The general format of a PL/SQL procedure is as follows:
CREATE OR REPLACE PROCEDURE name
[(parameter[, parameter, …])]
AS
local declarations
BEGIN
executable statements
[EXCEPTION
exception handlers]
END [name];
8.11 Functions
Functions are another type of stored code. They are very similar to procedures.
A function is a PL/SQL block that returns a single value. Because a function returns a
value, it is said to have a datatype, and the datatype of the return value must be
declared in the header of the function. A function need not have any parameters, but it
must have a RETURN datatype declared in the header.
A function is made up of:
1. A declarative part
2. An executable part
3. An optional exception handling part
Here are a few more differences between a procedure and a function:
A function MUST return a value through its RETURN statement.
A procedure cannot return a value through a RETURN statement.
Procedures and functions can both return data in OUT and IN OUT parameters.
The RETURN statement in a function returns control to the calling program and returns
the result of the function.
The RETURN statement in a procedure returns control to the calling program and cannot
return a value.
Functions can be called from SQL statements; procedures cannot.
Functions are considered expressions; procedures are not.
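The differences listed above can be sketched by writing the same lookup both ways. The names get_staff_name and fetch_staff_name and their parameters are hypothetical; the Staff table with its StaffID and Name columns is assumed from the examples in this chapter.

```sql
-- Function: the result comes back through RETURN
CREATE OR REPLACE FUNCTION get_staff_name (p_staff_id IN Staff.StaffID%TYPE)
RETURN VARCHAR2
IS
   v_name Staff.Name%TYPE;
BEGIN
   SELECT Name INTO v_name FROM Staff WHERE StaffID = p_staff_id;
   RETURN v_name;
END get_staff_name;
/

-- Procedure: the result comes back through an OUT parameter
CREATE OR REPLACE PROCEDURE fetch_staff_name (
   p_staff_id IN  Staff.StaffID%TYPE,
   p_name     OUT Staff.Name%TYPE
)
IS
BEGIN
   SELECT Name INTO p_name FROM Staff WHERE StaffID = p_staff_id;
END fetch_staff_name;
/

DECLARE
   v_name Staff.Name%TYPE;
BEGIN
   v_name := get_staff_name('101');   -- a function call is an expression
   fetch_staff_name('101', v_name);   -- a procedure call is a statement
   DBMS_OUTPUT.PUT_LINE(v_name);
END;
/
```

Only the function form can appear inside a query, for example SELECT get_staff_name('101') FROM dual.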
The syntax for creating a function is:
CREATE [OR REPLACE] FUNCTION function_name [(parameter list)]
RETURN datatype
IS
local declarations
BEGIN
<body of the function>
RETURN (return value);
END;
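Example 11:
CREATE OR REPLACE FUNCTION staff_func
RETURN VARCHAR2 -- the return datatype takes no length and no semicolon
IS
staff_name VARCHAR2(20); -- local variable that holds the fetched name
BEGIN
-- Fetch the name of the staff member whose ID is 101
SELECT Name INTO staff_name
FROM Staff WHERE StaffID = '101';
RETURN staff_name;
END;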
Execution of PL/SQL Function
A function can be executed in the following ways.
1) Since a function returns a value, we can assign it to a variable.
employee_name := staff_func;
If ‘employee_name’ is of datatype VARCHAR2, we can store the name of the employee by
assigning the return value of the function to it.
2) As a part of a SELECT statement
SELECT staff_func FROM dual;
3) In a PL/SQL statement such as
dbms_output.put_line(staff_func);
This line displays the value returned by the function.
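The three ways of calling a function can be tried together in one anonymous block. This sketch assumes that the function staff_func from Example 11 has already been created and that the block is run from a client such as SQL*Plus, where SET SERVEROUTPUT ON makes the output of DBMS_OUTPUT visible. Inside PL/SQL, a SELECT needs an INTO clause to receive its result.

```sql
SET SERVEROUTPUT ON

DECLARE
   employee_name VARCHAR2(20);
BEGIN
   -- 1) Assign the return value to a variable
   employee_name := staff_func;

   -- 2) Use the function inside a SELECT statement
   SELECT staff_func INTO employee_name FROM dual;

   -- 3) Pass the function call directly to another PL/SQL statement
   DBMS_OUTPUT.PUT_LINE(staff_func);
END;
/
```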
Short Answer Type Questions
Fill in the blanks
1. PL/SQL stands for ………….
2. …………. is used to display messages to the user.
3. The EXIT statement causes a loop to …………. when the EXIT condition evaluates to
TRUE.
4. The reserved word WHILE marks the …………. of a loop.
5. DBMS_OUTPUT.PUT_LINE is used to ………….
State True or False
1. SQL statements combined into PL/SQL blocks cause an increase in the network traffic.
2. PUT_LINE is one of the procedures from the DBMS_OUTPUT package.
3. DBMS_OUTPUT.PUT_LINE writes information to the buffer for storage before it is
displayed on the screen.
4. When a condition of the IF-THEN-ELSE construct is evaluated to NULL, control is
passed to the first executable statement after END IF.
5. CASE statements cannot be nested one inside the other.
6. What value must the test condition evaluate to in order for the loop to terminate?
Multiple Choice Questions
1. Which of the following sections is mandatory for a PL/SQL block?
a) Exception-handling section
b) Executable section
c) Declaration section
d) All
2. How many actions can you specify in an IF-THEN-ELSE statement?
a) One
b) Two
c) Four
d) As many as you require
3. A CASE construct is a control statement for which of the following?
a) Sequence structure
b) Iteration structure
c) Selection structure
d) All
Long Answer Type Questions
1. Why does PL/SQL have so many different types of characters? What are they used for?
2. What is the difference between SQL and PL/SQL?
3. What are the components of a PL/SQL code block?
4. Write a PL/SQL block that will insert a new student in the student table.
5. Explain different types of loop in PL/SQL.
6. Explain the IF, IF-THEN-ELSE and CASE statements.
7. How can you use SQL within PL/SQL?
8. What is the difference between a procedure and a function?
