Dbms-3bcom - UG - New


Database Management System

Data and Information


Data are raw facts that constitute the building blocks of information.
Data are the heart of the DBMS. Note that not all data convey useful
information. Useful information is obtained from processed data. In other
words, data have to be interpreted in order to obtain information. Good,
timely, relevant information is the key to decision making. Data are a
representation of facts, concepts, or instructions in a formalized manner
suitable for communication, interpretation, or processing by humans or
automatic means. The data in a DBMS can be broadly classified into two
types: the collection of information needed by the organization, and
“metadata”, which is the information about the database. The term
“metadata” will be discussed in detail later in this chapter. A company
needs to save information about employees, departments, and salaries.
These pieces of information are called data. Data kept in permanent
storage are referred to as persistent data.
Information is data that has been processed in such a way as to
be meaningful to the person who receives it. Information is data that
has been converted into a more useful or intelligible form. It is the set
of data that has been organized for direct use by people, as information
helps human beings in their decision-making process. Examples are
timetables, merit lists, report cards, headed tables, printed documents,
pay slips, receipts, and reports. Information is obtained by assembling
items of data into a meaningful form. For example, the marks obtained by
students and their roll numbers form the data, and the report card or
sheet is the information. Other forms of information are schedules,
worksheets, bar charts, invoices, and account returns.
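The step from data to information can be sketched in a few lines. This is only an illustration: the marks, the pass mark of 40, and the function name report_card are assumptions made up for the example.

```python
# Raw data: roll numbers with marks (values are made up for illustration).
marks = {
    60012: [72, 65, 80],
    60013: [55, 61, 58],
}

def report_card(roll_no, pass_mark=40):
    """Process raw marks into information a reader can act on."""
    scores = marks[roll_no]
    total = sum(scores)
    average = round(total / len(scores), 1)
    result = "PASS" if min(scores) >= pass_mark else "FAIL"
    return {"roll_no": roll_no, "total": total, "average": average, "result": result}

print(report_card(60012))
```

The dictionary of marks is the data; the report card returned by the function is the information.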
Database

A database is an organized collection of information that can easily be
accessed, managed, and updated. A database is a well-organized collection
of data that are related in a meaningful way and can be accessed in
different logical orders. Database systems are systems in which the
interpretation and storage of information are of primary importance. The
database should contain all the data needed by the organization. As a
result, a huge volume of data, the need for long-term storage of the data,
and access to the data by a large number of users generally characterize
database systems.
Database Management System
A database management system (DBMS) consists of a collection of
interrelated data and a set of programs to access that data. It is software that
is helpful in maintaining and utilizing a database. A DBMS consists of:
• A collection of interrelated and persistent data. This part of the DBMS is
referred to as the database (DB).
• A set of application programs used to access, update, and manage data.
This part constitutes the data management system (MS).
A DBMS is general-purpose software, i.e., it is not application specific. The
same DBMS (e.g., Oracle, Sybase, etc.) can be used in a railway reservation
system, library management, a university, etc.
A DBMS takes care of storing and accessing data, leaving only
application-specific tasks to application programs. A DBMS allows users to
input, share, edit, manipulate, and display the data in the database. Because
a DBMS allows more than one user to share the data, the complexity extends
to its design and implementation.
Objectives of DBMS
The main objectives of database management system are data availability, data integrity,
data security, and data independence.

• Data Availability: Data availability refers to the fact that the data are made available to
a wide variety of users in a meaningful format at reasonable cost so that the users can
easily access the data.
• Data Integrity: Data integrity refers to the correctness of the data in the database. In
other words, the data available in the database are reliable.
• Data Security: Data security refers to the fact that only authorized users can access
the data. Data security can be enforced by passwords. If two separate users are
accessing a particular data item at the same time, the DBMS must not allow them to make
conflicting changes.
• Data Independence: The DBMS allows the user to store, update, and retrieve data in an
efficient manner. The DBMS provides an “abstract view” of how the data are stored in the
database. In order to store the information efficiently, complex data structures are used
to represent the data. The system hides certain details of how the data are stored and
maintained.
Evolution of Database Management Systems
The file-based system was the predecessor to the database management
system. The Apollo moon-landing project was started in 1960. At
that time, there was no system available to handle and manage large
amounts of information. As a result, North American Aviation, later
known as Rockwell International, developed software known as
Generalized Update Access Method (GUAM).
In the mid-1960s, IBM joined North American Aviation to develop
GUAM into the Information Management System (IMS). IMS was based
on the hierarchical data model. In the mid-1960s, General Electric
released Integrated Data Store (IDS).
IDS was based on the network data model. Charles Bachman was
mainly responsible for the development of IDS. The network database
was developed to fulfill the need to represent more complex data
relationships than could be modeled with hierarchical structures.
The Conference on Data Systems Languages (CODASYL) formed the Data Base Task Group (DBTG)
in 1967. The DBTG specified three distinct languages for standardization: a Data Definition
Language (DDL), which would enable the Database Administrator to define the schema; a
subschema DDL, which would allow the application programs to define the parts of the
database they require; and a Data Manipulation Language (DML) to manipulate the data.
The network and hierarchical data models developed during that time
had the drawbacks of minimal data independence, minimal theoretical foundation, and
complex data access.
To overcome these drawbacks, in 1970, Codd of IBM published a paper titled “A Relational
Model of Data for Large Shared Data Banks” in Communications of the ACM, vol. 13, No. 6,
pp. 377–387, June 1970.
Following Codd’s paper, the System R project was developed during the late 1970s by the IBM
San Jose Research Laboratory in California. The project was developed to prove that the
relational data model was implementable. The outcome of the System R project was the
development of Structured Query Language (SQL), which is the standard language for
relational database management systems.
In the 1980s IBM released two commercial relational database management systems known as
DB2 and SQL/DS, and Oracle Corporation released Oracle.
Codd himself attempted to address some of the failings in his original work with
extended versions of the relational model called RM/T in 1979 and RM/V2 in 1990.
In recent years, two approaches to DBMS are more popular, which are Object-
Oriented DBMS (OODBMS) and Object Relational DBMS (ORDBMS). The
chronological order of the development of DBMS is as follows:
1. Flat files – 1960s–1980s
2. Hierarchical – 1970s–1990s
3. Network – 1970s–1990s
4. Relational – 1980s–present
5. Object-oriented – 1990s–present
6. Object-relational – 1990s–present
7. Data warehousing – 1980s–present
8. Web-enabled – 1990s–present
• Early 1960s. Charles Bachman at GE created the first general-purpose DBMS,
Integrated Data Store. It created the basis for the network model, which was
standardized by CODASYL (Conference on Data Systems Languages).
• Late 1960s. IBM developed the Information Management System (IMS). IMS
used an alternate model, called the hierarchical data model.
• 1970. Edgar Codd, from IBM, created the relational data model.
• 1976. Peter Chen presented the Entity-Relationship model, which is widely used
in database design.
• 1980. SQL, developed by IBM, became the standard query language for
databases. SQL was standardized by ISO.
• 1981. Codd received the Turing Award for his contributions to database
theory. Codd passed away in April 2003.
• 1980s and 1990s. IBM, Oracle, Informix and others developed powerful
DBMSs.
Classification of Database Management System
Database management systems can be broadly classified into two categories:
(1) Passive Database Management System
(2) Active Database Management System
1. Passive Database Management System.
Passive Database Management Systems are program-driven. In a passive
database management system, the users query the current state of the database and
retrieve the information currently available in it. Traditional DBMSs
are passive in the sense that they are explicitly and synchronously invoked by
operations initiated by a user or an application program. Applications send requests for
operations to be performed by the DBMS and wait for the DBMS to confirm and
return any possible answers. The operations can be definitions and updates of
the schema, as well as queries and updates of the data.
2. Active Database Management System.
Active Database Management Systems are data-driven or event-driven
systems. In an active database management system, the users
specify to the DBMS the information they need. If the information of
interest is not yet available, the DBMS actively monitors the arrival of
the desired information and provides it to the relevant users. The scope
of a query in a passive DBMS is limited to past and present data,
whereas the scope of a query in an active DBMS additionally includes
future data. An active DBMS reverses the control flow between
applications and the DBMS: instead of only applications calling the
DBMS, the DBMS may also call applications.
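The difference between passive and active behaviour can be sketched with a trigger. This is a minimal illustration using Python's built-in sqlite3 module; the stock and alerts tables, the low_stock trigger, and the threshold of 10 are all hypothetical, and a trigger is only one simple form of active behaviour.

```python
import sqlite3

# In-memory database; table and trigger names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER NOT NULL);
    CREATE TABLE alerts (item TEXT, message TEXT);

    -- When an update (event) leaves the quantity below 10 (condition),
    -- the DBMS records an alert (action) on its own.
    CREATE TRIGGER low_stock AFTER UPDATE OF qty ON stock
    WHEN NEW.qty < 10
    BEGIN
        INSERT INTO alerts VALUES (NEW.item, 'reorder needed');
    END;
""")

conn.execute("INSERT INTO stock VALUES ('desk', 25)")

# Passive use: the application asks about the current state.
current_qty = conn.execute(
    "SELECT qty FROM stock WHERE item = 'desk'").fetchone()[0]

# Active behaviour: the application only updates the data; the DBMS
# reacts to the event and fills the alerts table.
conn.execute("UPDATE stock SET qty = 4 WHERE item = 'desk'")
alerts = conn.execute("SELECT item, message FROM alerts").fetchall()
print(current_qty, alerts)
```

The application never inserts into alerts itself; the rule stored in the database does it when the monitored event occurs.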
UNIT-II
Historical Roots of File and File System
In earlier days, records were maintained in traditional file systems. A file
system is an organization of files.
A file is a collection (or group) of records. A record is a collection of fields,
where each field contains the actual data. For example, a student file maintains
all student records, each consisting of roll number, name, group, marks, and average.
Early file systems maintained all the files in a flat manner (flat
files/text files). A flat file permits searching for a record by sequential access
only, which was cumbersome and slow. To overcome the slowness, indexed
file systems were introduced, which were faster because they allowed random
access. However, an indexed file occupies extra memory to maintain the index table.
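The trade-off between sequential and indexed access can be sketched as follows. The records stand in for lines of a flat file, and the function names are hypothetical; a real indexed file system would keep the index on disk rather than in a Python dictionary.

```python
# Student records, as they might be read line by line from a flat file.
records = [
    ("60012", "Gopal", "B.Sc"),
    ("60013", "Sainath", "B.Com"),
    ("60014", "Surekha", "B.Sc"),
    ("60015", "Ramya", "BBA"),
]

def sequential_search(roll_no):
    """Scan records in order; return the record and how many records were read."""
    for reads, record in enumerate(records, start=1):
        if record[0] == roll_no:
            return record, reads
    return None, len(records)

# The index table maps each key to the record's position. It costs extra
# memory (one entry per record) but avoids scanning.
index = {record[0]: position for position, record in enumerate(records)}

def indexed_search(roll_no):
    """Look up the position in the index, then read that one record."""
    position = index.get(roll_no)
    return None if position is None else records[position]

print(sequential_search("60015"))
print(indexed_search("60015"))
```

The sequential search has to read every record up to the match, while the indexed search reads one index entry and one record.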
In general, in a file system all the data have to be stored in the
corresponding folders (or directories). For example, a college can
maintain the admission details of its students in this way. Suppose the
director wants to know today’s admission status group-wise. In that case
the file manager (or clerk) has to open each folder to answer the
director’s question. That is why the approach is time consuming, memory
consuming, and error prone.
To overcome this, the database system evolved. It uses a 4GL
(fourth-generation language), namely SQL, which allows a wide range of
queries to be answered.
The DBMS maintains all the records in the form of tables by
means of rows and columns. In a DBMS, however, the schema has to be
created with the help of DDL before the data are stored.
File system

• Assume the first-year students’ data are maintained using a file system, with one
folder per course:

Folder name   Fields
BBA           Roll No, name, fees
B.Com         Roll No, name, fees

Every folder contains all the relevant fields, which occupies extra memory and makes
calculation slow. Constraints cannot be imposed in file systems.
DBMS
Roll No   Name      Branch   Fees
60012     Gopal     B.Sc     20000
60013     Sainath   B.Com    19000
60014     Surekha   B.Sc     20000
60015     Ramya     BBA      30000

In a DBMS all the students’ information is available in a centralized
place, i.e., a table, so it is easy to retrieve any data using a query.
Constraints (for example, a primary key) can be imposed in a DBMS. Data
redundancy can be removed easily.
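The student table above can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the table name student and the column names are chosen for the example, and SQLite stands in for whichever DBMS is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: the schema is created before any data are stored.
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        branch  TEXT NOT NULL,
        fees    INTEGER NOT NULL
    )
""")

rows = [
    (60012, "Gopal", "B.Sc", 20000),
    (60013, "Sainath", "B.Com", 19000),
    (60014, "Surekha", "B.Sc", 20000),
    (60015, "Ramya", "BBA", 30000),
]
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", rows)

# One query answers a branch-wise question without opening any folders.
per_branch = conn.execute(
    "SELECT branch, COUNT(*), SUM(fees) FROM student "
    "GROUP BY branch ORDER BY branch"
).fetchall()
print(per_branch)

# The primary key constraint rejects a second record with the same roll number.
try:
    conn.execute("INSERT INTO student VALUES (60012, 'Duplicate', 'BBA', 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

The group-wise question that required opening every folder in the file system becomes a single GROUP BY query here.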
File processing system

File processing systems at Pine Valley Furniture Company
Pine Valley Furniture Company manufactures high-quality all-wood furniture
and distributes it to retail stores nationwide. Among the firm’s several product
lines are computer desks, entertainment centers, dinette sets, bookcases, and
wall units. Customers submit orders to Pine Valley Furniture by any of several
means, i.e., telephone, mail, fax, or electronic forms via the Internet.
Early computer applications at Pine Valley Furniture used the traditional
file processing approach.
Three of the computer applications based on the file processing
approach are shown in the figure. The systems illustrated are order filling,
invoicing, and payroll. The figure also shows the major data files associated
with each application. For example, the order filling system has three files:
customer master, inventory master, and back order.
Disadvantages of file processing systems
Several disadvantages are associated with conventional file processing
systems. These disadvantages are
1. Program-Data Dependence: File descriptions are stored within each
application program that accesses a given file. For example, in the
invoicing system in the above figure, program A accesses both the
inventory pricing file and the customer master file. Therefore, this
program contains a detailed file description for both of these files. As a
consequence, any change to a file structure requires changes to the file
description in all programs that access the file.
2. Duplication of Data: Since applications are often developed independently
in file processing systems, unplanned duplicate data files are the rule rather
than the exception. For example, in the above figure the order filling system
contains an inventory master file, while the invoicing system contains an
inventory pricing file. These files undoubtedly both contain data describing
Pine Valley Furniture Company’s products, such as product description,
unit price, and quantity on hand. This duplication is wasteful since it requires
additional storage space and increased effort to keep all files up to date.
Unfortunately, duplicate data files often result in loss of data integrity.
3. Limited Data Sharing: With the traditional file processing approach, each
application has its own private files, and users have little opportunity to
share data outside their own applications. Notice in the above figure, for
example, that users in the accounting department have access to the
invoicing system and its files, but they probably do not have access to the
order filling system or the payroll system and their files. It is often
frustrating to managers to find that a requested report will require a major
programming effort to obtain data from several incompatible files in
separate systems.
4. Lengthy Development Times: With traditional file processing
systems, there is little opportunity to leverage previous development
efforts. Each new application requires that the developer essentially
start from scratch by designing new file formats and descriptions and
then writing the file access logic for each new program. The lengthy
development times required are often inconsistent with today’s fast
paced business environment.
5. Excessive Program Maintenance: The preceding factors all combine
to create a heavy program maintenance load in organizations that rely
on traditional file processing systems. In fact, as much as 80 percent of
the total information systems development budget may be devoted to
program maintenance in such organizations. This of course leaves little
opportunity for developing new applications.
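Program-data dependence (disadvantage 1 above) can be made concrete with a small sketch. The fixed-width layout, the field widths, and the sample product are hypothetical; the point is only that the file description lives inside the program.

```python
# Hypothetical fixed-width layout of an inventory pricing file:
# product id in columns 0-3, description in 4-13, unit price in 14-19.
OLD_LAYOUT = {"product_id": (0, 4), "description": (4, 14), "unit_price": (14, 20)}

def parse(line, layout):
    """Cut a fixed-width line into fields using the layout the program carries."""
    return {name: line[start:end].strip() for name, (start, end) in layout.items()}

old_line = "P001" + "Desk".ljust(10) + "175.00".rjust(6)
print(parse(old_line, OLD_LAYOUT))

# The file structure changes: the description grows from 10 to 15 characters.
NEW_LAYOUT = {"product_id": (0, 4), "description": (4, 19), "unit_price": (19, 25)}
new_line = "P001" + "Computer desk".ljust(15) + "175.00".rjust(6)

# A program still carrying the old description silently misreads the new file,
stale = parse(new_line, OLD_LAYOUT)
# so every program that accesses the file must be given the new description.
fixed = parse(new_line, NEW_LAYOUT)
print(stale)
print(fixed)
```

With the stale layout the price field contains part of the description, which is the kind of error that forces changes in every program after a file structure change.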
Advantages and disadvantages:

In a typical file-processing system, records are stored in various files. A
number of different application programs are written to extract records
from and add records to the appropriate files. A file-processing system
has a number of major disadvantages, such as data redundancy, data
inconsistency, unsharable data, unstandardized data, insecure data,
and incorrect data.
A database management system answers all these problems, as it
provides centralized control over data.
The advantages of DBMS are as follows:
1. Reduces the data redundancy to a large extent.
Data redundancy means duplication of data. Non-database
systems maintain separate copy of data for each application. The
database systems do not maintain separate copies of the same data.
Rather, all the data are kept at one place and all the applications that
require data refer to the centrally maintained database.
2. Databases can control data inconsistency to a large extent
When the redundancy is not controlled, there may be occasions
on which the two entries about the same data do not agree. At such
times, database is said to be inconsistent. Obviously, an inconsistent
database will provide incorrect or conflicting information.
3. Databases facilitate sharing of data
Sharing of data means that individual pieces of data in the database
may be shared among several different users, in the sense that each of
those users may have access to the same piece of data and each of them
may use it for different purposes.
4. Databases enforce standards
The database management system can ensure that all the data
follow the applicable standards. There may be certain standards laid by
the company or organization using the database. Similarly, there may be
national or international standards.
5. Databases can ensure data security
A database management system ensures data security and privacy
by ensuring that the only means of access to the database is through the
proper channel and also by carrying out authorization checks whenever
access to sensitive data is attempted.
6. Program-Data Independence
The separation of data description (metadata) from the
application programs that use the data is called data independence.
With the database approach, data descriptions are stored in a central
location called the repository.
7. Increased Productivity of Application Development
A major advantage of the database approach is that it greatly
reduces the cost and time for developing new business applications.
8. Improved Data Quality
The database approach provides a number of tools and
processes to improve data quality.
9. Improved Data Accessibility and Responsiveness
With a relational database, end users without programming
experience can often retrieve and display data, even when it crosses
traditional departmental boundaries.
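Program-data independence (advantage 6 above) can be sketched with Python's built-in sqlite3 module. The product table, its columns, and the function price_of are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_id TEXT PRIMARY KEY, unit_price REAL)")
conn.execute("INSERT INTO product VALUES ('P001', 175.0)")

def price_of(product_id):
    """An application query that names only the columns it needs."""
    row = conn.execute(
        "SELECT unit_price FROM product WHERE product_id = ?", (product_id,)
    ).fetchone()
    return row[0]

before = price_of("P001")

# The data description changes in one central place.
conn.execute("ALTER TABLE product ADD COLUMN description TEXT")
conn.execute(
    "UPDATE product SET description = 'Computer desk' WHERE product_id = 'P001'")

# The application code above keeps working unchanged.
after = price_of("P001")
print(before, after)
```

Because the description of the data is held by the DBMS rather than inside the program, adding a column does not require touching price_of.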
Disadvantages of DBMS

In spite of the advantages of using a DBMS, there are a few situations
in which such a system may involve unnecessary overhead costs that
would not be incurred in traditional file processing. The overhead
costs of using a DBMS are due to the following:
• High initial investment in hardware, software and training.
• The generality that a DBMS provides for defining and processing data.
• Overhead for providing security, concurrency control, recovery, and
integrity functions.
• Additional problems may arise if the database designers and DBA do
not properly design the database or if the database system
applications are not implemented properly.
Functions of the DBMS
A DBMS performs several functions that guarantee the integrity and
consistency of the data in the database. Most of those functions are
transparent to end users, and most can be achieved only through the
use of a DBMS.
1. Data dictionary management: The DBMS stores definitions of the
data elements and their relationships (metadata) in a data dictionary.
In turn, all programs that access the data in the database work
through the DBMS, which uses the data dictionary to look up the required
structures and relationships, thus relieving you from having to code such
complex relationships in each program. In other words, the
DBMS provides data abstraction, and it removes structural and data
dependency from the system.
2. Data storage management: The DBMS creates and manages the
complex structures required for data storage, thus relieving you from the
difficult task of defining and programming the physical data
characteristics. Data storage management is also important for database
performance tuning. Performance tuning relates to the activities that
make the database perform more efficiently in terms of storage and
access speed.
3. Data transformation and presentation: The DBMS transforms entered
data to conform to required data structures. The DBMS relieves you of
the chore of making a distinction between the logical data format and the
physical data format. That is, the DBMS formats the physically retrieved
data to make it conform to the user’s logical expectations.
4. Security management: The DBMS creates a security system that
enforces user security and data privacy. Security rules determine which
users can access the database, which data items each user can access,
and which data operations the user can perform.
5. Multiuser access control: To provide data integrity and data
consistency, the DBMS uses sophisticated algorithms to ensure that
multiple users can access the database concurrently without
compromising the integrity of the database.
6. Backup and recovery management: The DBMS provides backup and
data recovery to ensure data safety and integrity. Current DBMS systems
provide special utilities that allow the DBA to perform routine and special
backup and restore procedures. Recovery management deals with the
recovery of the database after a failure, such as a bad sector in the disk
or a power failure.
7. Data integrity management: The DBMS promotes and enforces
integrity rules, thus minimizing data redundancy and maximizing data
consistency. The data relationships stored in the data dictionary are
used to enforce data integrity. Ensuring data integrity is especially
important in transaction-oriented database systems.
8. Database access languages and application programming interfaces:
The DBMS provides data access through a query language. A query
language is a nonprocedural language, one that lets the user specify
what must be done without having to specify how it is to be done.
9. Database communication interfaces: Current-generation DBMSs
accept end-user requests via multiple, different network environments.
For example, the DBMS might provide access to the database via the
Internet through the use of Web browsers such as Mozilla Firefox or
Microsoft Internet Explorer.
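Two of the functions above, data dictionary management and data integrity management, can be observed directly in a small sketch. It uses Python's built-in sqlite3 module; the department and employee tables and their contents are hypothetical, and sqlite_master is SQLite's own catalog table (other DBMSs expose their data dictionaries differently).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
""")

# Data dictionary: the definitions are themselves stored as data (metadata).
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]
print(tables, columns)

# Integrity management: the stored relationship is enforced by the DBMS,
# not by each application program.
conn.execute("INSERT INTO department VALUES (1, 'Accounting')")
conn.execute("INSERT INTO employee VALUES (100, 'Asha', 1)")
try:
    conn.execute("INSERT INTO employee VALUES (101, 'Ravi', 99)")  # no such department
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

The queries are also nonprocedural in the sense of function 8: they state which rows are wanted and leave the access path to the DBMS.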
Data models
Data model: A model is an abstraction of a more complex real-world
object or event. A data model is a relatively simple representation,
usually graphical, of complex real-world data structures. The data
model’s main function is to help us understand the complexities of the
real-world environment.
• Within the database environment, a data model represents data
structures and their characteristics, relations, constraints, and
transformations. Good database design uses an appropriate data
model as its foundation.
• A data model provides a blueprint of the data that is required for a
functional system.
Database models
Data modeling, or database modeling, is a technique that records the inventory,
shape, size, contents, and rules of data elements used in the scope of a business
process. The business process scope may be as large as a multidiscipline global
corporation or as small as a receiving dock. Simply put, a data model is a
model of the data for an organization.
Types of Database models:
1. Flat file database model
2. Hierarchical database model
3. Network database model
4. Relational database model
5. The E-R model.
6. Object oriented model
7. Object relational Database model
Flat file database model
A flat file database consists of one or more readable files, normally
stored in a text format. Information in these files is stored as fields, the
fields having either a constant length or a variable length.
Every flat file database system is different because companies store
different data and have different needs.
• Hierarchical database model:
• The architecture of a hierarchical model is based on the concept of
parent/child relationships. It was developed to overcome the problems of the
flat file model. In a hierarchical database, a root table, or parent table, resides
at the top of the structure and points to child tables containing related
data. The structure of a hierarchical database model appears as an inverted
tree.

Figure (hierarchical model): Publishers at the root; Authors and Book Store on the
second level; Titles, Inventory, and Orders on the third level.
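The inverted tree can be sketched as a mapping from each parent table to its children. The table names come from the figure; which child hangs under which parent is an assumed reading of the figure, and path_from_root is a hypothetical helper.

```python
# Parent -> children links. The edges are an assumed reading of the figure;
# in a hierarchy each child has exactly one parent.
children = {
    "Publishers": ["Authors", "Book Store"],
    "Authors": ["Titles"],
    "Book Store": ["Inventory", "Orders"],
}

def path_from_root(target, node="Publishers", path=()):
    """Return the single top-down path from the root to target, or None."""
    path = path + (node,)
    if node == target:
        return list(path)
    for child in children.get(node, []):
        found = path_from_root(target, child, path)
        if found:
            return found
    return None

print(path_from_root("Orders"))
```

Every access starts at the root table and follows parent-to-child links downward, which is why there is exactly one path to each table.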


Network Database model:
Improvements were made to the hierarchical model in
order to derive the network model. One of the main advantages
of the network model is the capability of parent tables to share relationships
with child tables. This means that a child table can have multiple parent
tables. Additionally, a user can access data by starting with any table in
the structure, navigating either up or down in the tree. The user is not
required to access a root table first to get to child tables.

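Figure (network model): the same tables (Publishers; Authors and Book Store; Titles,
Inventory, and Orders), with child tables able to link to more than one parent.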
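The network model's two properties, multiple parents and navigation from any table, can be sketched as follows. The table names come from the figure, but the specific links are assumed for illustration, and ancestors is a hypothetical helper.

```python
# Child -> parents links; the specific edges are assumed for illustration.
parents = {
    "Authors": ["Publishers"],
    "Book Store": ["Publishers"],
    "Titles": ["Publishers", "Authors"],      # a child with two parents
    "Inventory": ["Titles", "Book Store"],    # another child with two parents
    "Orders": ["Book Store"],
}

def ancestors(table):
    """Start at any table and navigate upward through every parent link."""
    seen = set()
    stack = [table]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return sorted(seen)

print(ancestors("Inventory"))
```

Navigation begins at Inventory rather than at the root, and the structure is a graph rather than a tree because some tables have two parents.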


The Relational Database Model

• The relational model was introduced in 1970 by E. F. Codd in his
landmark paper “A Relational Model of Data for Large Shared
Data Banks”. The relational model represented a major breakthrough
for both users and designers. To use an analogy, the relational model
produced an “automatic transmission” database to replace the
“standard transmission” databases that preceded it. Its conceptual
simplicity set the stage for a genuine database revolution.
The Entity Relationship (E-R) model

• The entity relationship data model is based on a perception of a real
world that consists of a collection of basic objects called entities and
relationships among these objects. Entities are described in a
database by a set of attributes. For example, the attributes acc_no
and balance may describe one particular account in a bank, and they
form attributes of the account entity set. Similarly, the attributes
cust_name, cust_street and cust_city may describe a customer entity.
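The account and customer entities can be sketched with Python dataclasses. The attribute names are taken from the text; the accounts field used to hold the relationship and the sample values are assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Account:
    """Entity described by the attributes acc_no and balance."""
    acc_no: str
    balance: float

@dataclass
class Customer:
    """Entity described by cust_name, cust_street and cust_city."""
    cust_name: str
    cust_street: str
    cust_city: str
    # Relationship to accounts; the field name "accounts" is hypothetical.
    accounts: list = field(default_factory=list)

savings = Account(acc_no="A-101", balance=500.0)
customer = Customer("Gopal", "Main Street", "Hyderabad")
customer.accounts.append(savings)  # one relationship instance: customer holds account

total = sum(account.balance for account in customer.accounts)
print(customer.cust_name, total)
```

Each dataclass plays the role of an entity set, each object is one entity, and the list of accounts records the relationship between the two.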
Object oriented data base model
• An OO programming language allows the programmer to work with
objects to define an application that interacts with a relational
database. During the last few years, object-oriented programming has
become popular with languages such as C++, VB and Java. For example,
elements within a program or database application are usually
represented as objects. These objects have properties, which can be
modified and can also be inherited from other objects. Related types
of objects are assigned various properties that can be adjusted to
define the particular object and determine how the object will act.
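Properties that can be modified and inherited can be sketched in a few lines of Python. The Shape and Circle classes are hypothetical and serve only to show inheritance.

```python
class Shape:
    """Base object with properties that can be modified."""
    def __init__(self, name, color="black"):
        self.name = name
        self.color = color

    def describe(self):
        return f"{self.color} {self.name}"

class Circle(Shape):
    """Inherits name and color from Shape and adds its own property."""
    def __init__(self, radius, color="black"):
        super().__init__("circle", color)
        self.radius = radius

    def describe(self):
        return f"{super().describe()} of radius {self.radius}"

circle = Circle(2)
circle.color = "red"          # modify an inherited property
description = circle.describe()
print(description)
```

Circle does not redefine color; it receives that property from Shape, and adjusting it changes how the object describes itself.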
Object Relational Database Model
• Although some major seams exist between the object-oriented and relational
models, the object-relational model was developed with the objective of combining the
concepts of the relational database model with the object-oriented programming
style. The OR model is supposed to represent the best of both worlds
(relational and OO), although the OR model is still early in development.

Eno
E_inf        person
Address_inf  address
Phone        number

Person
fname    varchar
lname    varchar
initial  varchar

Address
Street  varchar
City    varchar
State   varchar
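The structure above, a row whose columns are themselves structured types, can be approximated with nested dataclasses. This is a sketch under assumptions: the row type name Employee, the field name lname (the source name is garbled), and the sample values are all hypothetical.

```python
from dataclasses import dataclass, asdict

@dataclass
class Person:
    fname: str
    lname: str      # assumed field name
    initial: str

@dataclass
class Address:
    street: str
    city: str
    state: str

@dataclass
class Employee:     # hypothetical name for the row type
    eno: int
    e_inf: Person           # a column whose type is Person
    address_inf: Address    # a column whose type is Address
    phone: int

row = Employee(
    1,
    Person("Ramya", "Rao", "K"),
    Address("MG Road", "Hyderabad", "Telangana"),
    5550100,
)
city = row.address_inf.city   # object-style navigation into a structured column
flat = asdict(row)            # nested dictionaries, one per structured column
print(city, flat["e_inf"]["fname"])
```

The row is still a single record with four columns, as in the relational model, but two of those columns hold objects with their own attributes.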
Components and Interfaces of Database Management System

A database management system involves five major components:


• Hardware
• Software
• Data
• Users/People
• Procedures
The interfaces between the components are shown in the figure
• Hardware:
The hardware can range from a single personal computer to a
mainframe or to a network of computers. The particular hardware depends
on the requirements of the organization and the DBMS used. A DBMS
requires a minimum amount of main memory and disk space to run, but this
minimum configuration may not necessarily give acceptable performance.
• Software:
The software includes the DBMS software and the application programs,
together with the operating system, including the network software if the
DBMS is being used over a network. The application programs are written in
third-generation programming languages like C, COBOL, etc., or in a
fourth-generation language such as SQL embedded in a third-generation language.
The target DBMS may have its own fourth-generation tools, which
allow development of applications through the provision of nonprocedural
query languages, report generators, graphics generators and application
generators.
• Data:
Database is an organized collection of logically related data, usually designed to meet
the information needs of multiple users in an organization. It is important to distinguish
between the database and the repository. The repository contains definitions of data,
whereas the database contains occurrences of data.
The data in the database is integrated, shared and persistent.
• Integrated Data: A database can be considered a unification of several distinct data files,
with any redundancy among those files eliminated.
• Shared Data: A database contains data that can be shared by different users for different
applications simultaneously.
• Persistent Data: Data that cannot be removed from the database as a side effect of
some other process. Persistent data have a life span that is not limited to a single execution
of the programs that use them.

Users/people interacting with database:

Procedure: Procedures are the rules that govern the design and the use of database.
Components of Database Environment:
The major components of a typical database environment
and their relationships are shown below
• Computer-aided software engineering (CASE) tools: CASE tools are automated
tools used to design databases and application programs.
• Repository: The repository is a centralized knowledge base for all data definitions,
data relationships, screen and report formats, and other system components. A
repository contains an extended set of metadata important for managing
databases as well as other components of an information system.
• Database management system (DBMS): A DBMS is a commercial software (and
occasionally, hardware and firmware) system that is used to define, create,
maintain, and provide controlled access to the database and also to the
repository. In other words, a DBMS is a collection of logically related data and a set of
programs to operate on the data.
• Database: Database is an organized collection of logically related data, usually
designed to meet the information needs of multiple users in an organization. It is
important to distinguish between the database and the repository. The
repository contains definitions of data, whereas the database contains
occurrences of data.
• Application Programs: Application programs are computer programs that are
used to create and maintain the database and provide information to users.
• User interface: Languages, menus, and other facilities by which users
interact with various system components, such as CASE tools, application
programs, the DBMS, and the repository.
• Data administrators: Persons who are responsible for the overall
information resources of an organization. Data administrators use CASE
tools to improve the productivity of database planning and design.
• System developers: Persons such as systems analysts and programmers
who design new application programs. System developers often use
CASE tools for system requirements analysis and program design.
• End users: Persons throughout the organization who add, delete, and
modify data in the database and who request or receive information
from it. All user interactions with the database must be routed through
the DBMS.
Ranges of Database Applications
The range of database applications can be divided into five categories: Personal
databases, workgroup databases, department databases, enterprise databases,
and Internet, Intranet, and Extranet databases.
Personal Databases:
Personal databases are designed to support one user. Personal databases
have long resided on personal computers (PCs), including laptops. Recently the
introduction of personal digital assistants (PDAs) has incorporated personal
databases into handheld devices that function not only as computing devices but
also as cellular phones, fax senders, and Web browsers.
• Personal databases are widely used because they can often improve personal
productivity. However, they entail a risk: The data cannot easily be shared with
other users. For this reason, personal databases should be limited to those
rather special situations (such as a very small organization) where the need to
share the data among users of the personal database is unlikely to arise.
Workgroup Database:
A workgroup is a relatively small team of people who collaborate
on the same project or application or on a group of similar projects or
applications. A workgroup typically comprises fewer than 25 persons.
A workgroup database is designed to support the collaborative efforts
of such a team.
• The method of sharing the data in this database is shown in the figure
below. Each member of the workgroup has a desktop computer and
the computers are linked by means of a local area network (LAN). The
database is stored on a central device called the database server,
which is also connected to the network. Thus each member of the
workgroup has access to the shared data.
• Figure: Workgroup database with local area network (labels: Project Manager,
Developer 1 to Developer n, Librarian, Local area network, Database server,
Workgroup database)

Department Databases:
A department is a functional unit within an organization. Typical examples
of department are personnel, marketing, manufacturing, and accounting.
A department is generally larger than a workgroup (typically between 25
and 100 persons) and is responsible for a more diverse range of functions.
• Department databases are designed to support the various functions and
activities of a department.
Enterprise Databases
An enterprise database is one whose scope is the entire
organization or enterprise (or, at least, many different departments).
Such databases are intended to support organization-wide operations
and decision making. An enterprise database does, however, support
information needs from many departments. Over the last decade, the
evolution of enterprise databases has resulted in two major
developments:
1. Enterprise resource planning (ERP) systems
2. Data warehousing implementations.
• Figure: An enterprise data warehouse (labels: Branch Office 1 to Branch
Office 5, Corporate Office, Data warehouse)
Internet, Intranet and Extranet Databases

Internet: The most recent change that affects the database
environment is the ascendance of the Internet, a worldwide network
that connects users of multiple platforms easily through an interface
known as a Web browser.
Extranet: Use of Internet protocols to establish limited access to
company data and information by the company’s customers and
suppliers.
Intranet: Use of Internet protocols to establish access to company data
and information that is limited to the organization.
Database Architecture
Database architecture essentially describes the location of all the pieces of
information that make up the database application. The database
architecture can be broadly classified into the following categories
• Two-Tier Architecture
• Three-tier Architecture
• Multitier Architecture
or
• 1-Tier Architecture
• 2-Tier Architecture
• 3-Tier Architecture
1-Tier Architecture
1-Tier architecture is the simplest database architecture: the client,
server, and database all reside on the same machine. A simple example of
one-tier architecture is installing a database on your own system and
accessing it directly to practice SQL queries.
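As a minimal sketch of the one-tier case, the Python snippet below uses the standard-library sqlite3 module, where the application code and the database engine run in the same process on the same machine. The table name student and its columns are made up for illustration.

```python
import sqlite3

# One-tier sketch: the "client" (this script) and the database engine
# (SQLite, embedded in the process) live on the same machine.
conn = sqlite3.connect(":memory:")  # in-memory database, nothing leaves this process
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO student (roll_no, name) VALUES (?, ?)",
    [(1, "Raju"), (2, "Kiran")],
)

# Practice query, executed locally without any network hop.
rows = conn.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()
print(rows)
```

In a two-tier setup the same SQL would instead be sent over a network connection to a separate database server.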

Two-Tier Architecture
The two-tier architecture is a client–server architecture in which the
client contains the presentation code and the SQL statements for data
access. The database server processes the SQL statements and sends
query results back to the client. Two-tier client/server provides a basic
separation of tasks. The client, or first tier, is primarily responsible for
the presentation of data to the user and the “server,” or second tier, is
primarily responsible for supplying data services to the client.
Presentation Services
“Presentation services” refers to the portion of the application which presents data
to the user. In addition, it provides the mechanisms through which the user interacts
with the data. More simply put, presentation logic defines and interacts with the user
interface. The presentation of the data should generally not contain any validation rules.
Application Services
“Application services” provide other functions necessary for the application.
Business Services/objects
“Business services” are a category of application services. Business services encapsulate
an organization's business processes and requirements. These rules are derived from the
steps necessary to carry out day-to-day business in an organization. These rules can be
validation rules, used to be sure that the incoming information is of a valid type and
format, or they can be process rules, which ensure that the proper business process is
followed in order to complete an operation.
Data Services
“Data services” provide access to data independent of their location. The data can come
from legacy mainframe, SQL RDBMS, or proprietary data access systems.
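The division of labour among these services can be sketched in a few lines of Python. The function names, the accounts dictionary standing in for a data store, and the withdrawal rule are all hypothetical; the sketch only illustrates that validation and process rules sit in the business service rather than in the presentation code.

```python
# Data service: hides where the data lives (here a dict stands in for a database).
accounts = {"A-100": 500}  # hypothetical account balances

def get_balance(account_id):
    return accounts[account_id]

def set_balance(account_id, amount):
    accounts[account_id] = amount

# Business service: a validation rule (valid type and format) plus a process rule.
def withdraw(account_id, amount):
    if not isinstance(amount, int) or amount <= 0:
        raise ValueError("amount must be a positive integer")  # validation rule
    balance = get_balance(account_id)
    if amount > balance:
        raise ValueError("insufficient funds")                  # process rule
    set_balance(account_id, balance - amount)
    return balance - amount

# Presentation service: formats results for the user, contains no validation rules.
def show_withdrawal(account_id, amount):
    try:
        return f"New balance: {withdraw(account_id, amount)}"
    except ValueError as err:
        return f"Rejected: {err}"

print(show_withdrawal("A-100", 200))
print(show_withdrawal("A-100", 1000))
```

Because the rules live in one place, changing a business rule would not require touching the presentation function.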
Advantages of Two-tier Architecture
• The two-tier architecture is a good approach for systems with stable
requirements and a moderate number of clients.
• The two-tier architecture is the simplest to implement, due to the
number of good commercial development environments.
Drawbacks of Two-tier Architecture
• Software maintenance can be difficult because PC clients contain a
mixture of presentation, validation, and business logic code.
• To make a significant change in the business logic, code must be
modified on many PC clients.
• Moreover the performance of two-tier architecture can be poor when a
large number of clients submit requests because the database server
may be overwhelmed with managing messages.
• With a large number of simultaneous clients, three-tier architecture may
be necessary.
Three-tier Architecture

Three-tier architecture offers a technology-neutral method of
building client/server applications with vendors who employ standard
interfaces which provide services for each logical “tier.” Through
standard tiered interfaces, services are made available to the
application. A single application can employ many different services
which may reside on dissimilar platforms or be developed and
maintained with different tools. This approach allows a developer to
leverage investments in existing systems while creating new applications
which can utilize existing resources.
Multitier Architecture
• A multi-tier, three-tier, or N-tier implementation employs a three-tier
logical architecture superimposed on a distributed physical model.
Application Servers can access other application servers in order to
supply services to the client application as well as to other Application
Servers. The multiple-tier architecture is the most general client–
server architecture.
• It can be most difficult to implement because of its generality.
However, a good design and implementation of multiple-tier
architecture can provide the most benefits in terms of scalability,
interoperability, and flexibility. In the above example, the client
application looks to Application Server #1 to supply data from a
mainframe-based application.
• Application Server #1 has no direct access to the mainframe application, but it
does know, through the development of application services, that Application
Server #2 provides a service to access the data from the mainframe application
which satisfies the client request. Application Server #1 then invokes the
appropriate service on Application Server #2 and receives the requested data
which is then passed on to the client. Application Servers can take many forms.
An Application Server may be anything from custom application services,
Transaction Processing Monitors, Database Middleware, and Message Queues to a
CORBA/COM-based solution.
DBMS Vendors and their Products

Some of the popular DBMS vendors and their corresponding products are as follows:
Vendor        Products
IBM           DB2/MVS, DB2/UDB, DB2/400, Informix Dynamic Server (IDS)
Microsoft     Access, SQL Server, Desktop Edition (MSDE)
Open Source   MySQL, PostgreSQL
Oracle        Oracle DBMS, RDB
Sybase        Adaptive Server Enterprise (ASE), Adaptive Server Anywhere (ASA), Watcom
Different views of Database/Abstraction

A collection of interrelated files and a set of programs that allow users
to access and modify these files is known as a database management
system.
• A major purpose of a database system is to provide the users
only that much information that is required by them. This means that
the system does not disclose all the details of data, rather it hides
certain details of how the data is stored and maintained.
• Since the requirements of different users differ from one
another, the complexity of the database is hidden from them, if
needed, through several levels of abstraction in order to simplify their
interaction with the system.
Various Levels of Database Implementation
A database is implemented through three general levels: internal,
conceptual, and external, so as to cater to the needs of its users.
1. Internal Level (Physical Level). The lowest level of abstraction, the internal
level, is the one closest to physical storage. This level is also sometimes
termed as physical level. It describes how the data are actually stored on the
storage medium. At this level, complex low-level data structures are
described in detail.
2. Conceptual Level. This level of abstraction describes what data are actually
stored in the database. It also describes the relationships existing among
data. At this level, the database is described logically in terms of simple data-
structures. The users of this level are not concerned with how these logical
data structures will be implemented at the physical level. Rather, they just
are concerned about what information is to be kept in the database.
3. External Level (View Level). This is the level closest to the users and is
concerned with the way in which the data are viewed by individual users.
Most of the users of the database are not concerned with all the information
contained in the database. Instead, they need only a part of the database
relevant to them. For example, even though the bank database stores a great
deal of information, an account holder (a user) is interested only in his account
details and not in the rest of the information stored in the database. To
simplify such users’ interaction with the system, this level of abstraction is
defined. The system thus provides many views for the same database. The
figure below illustrates the interrelationship among these three levels of
abstraction.
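The bank example can be sketched with Python's standard sqlite3 module. The account table, its columns, and the view name hema_account are hypothetical. The table plays the role of the conceptual level, the view plays the role of one external view, and the way SQLite lays the rows out on storage (the internal level) stays hidden from both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: what data are stored (a hypothetical bank table).
conn.execute("CREATE TABLE account (acc_no TEXT PRIMARY KEY, holder TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("A-100", "Hema", 500), ("A-200", "Suman", 900)],
)
# External level: a view exposing only one account holder's details.
conn.execute(
    "CREATE VIEW hema_account AS "
    "SELECT acc_no, balance FROM account WHERE holder = 'Hema'"
)

view_rows = conn.execute("SELECT * FROM hema_account").fetchall()
print(view_rows)
```

Other users could be given other views over the same table without the stored data changing.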
UNIT – III
Building Blocks of an Entity–Relationship Diagram

An ER diagram is a graphical modeling tool to standardize ER modeling. The
modeling can be carried out with the help of pictorial representation of
entities, attributes, and relationships. The basic building blocks of an
Entity-Relationship diagram are Entity, Attribute, and Relationship.
Entity
An entity is an object that exists and is distinguishable from other objects.
• Ex: person, place, department etc
Entity Type
An entity type or entity set is a collection of similar entities. Some
examples of entity types are:
• All students in EDC, say STUDENT.
• All courses in EDC, say COURSE.
• All departments in EDC, say DEPARTMENT.
• An entity may belong to more than one entity type. For example, a
staff member working in a particular department can pursue higher
education part-time. Hence the same person is a LECTURER in one instance
and a STUDENT in another instance.
Relationship
A relationship is an association of entities where the association
includes one entity from each participating entity type, whereas a
relationship type is a meaningful association between entity types.
• Teaches is the relationship type between LECTURER and STUDENT.
• Buying is the relationship between VENDOR and CUSTOMER.
• Treatment is the relationship between DOCTOR and PATIENT.
Attributes
Attributes are properties of entity types. In other words, entities are
described in a database by a set of attributes.
• Brand, cost, and weight are the attributes of CELLPHONE.
• Roll number, name, and grade are the attributes of STUDENT.
ER Diagram
The ER diagram is used to represent database schema.
In ER diagram:
• A rectangle represents an entity set.
• An ellipse represents an attribute.
• A diamond represents a relationship.
• Lines represent linking of attributes to entity sets and of entity sets to
relationship sets.
Classification of Entity Sets

• Entity sets can be broadly classified into:
• Strong entity
• Weak entity
• Associative entity
Strong Entity
A strong entity is one whose existence does not depend on any other entity.
Example
• Consider the example “student takes course”. Here student is a strong
entity.
• Figure: Student, Takes, Course


Associative entity
An associative entity is a term used in relational and entity–relationship
theory. A relational database requires the implementation of a base
relation (or base table) to resolve many-to-many relationships. A base
relation representing this kind of entity is called, informally, an associative
table.
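A minimal sketch of such an associative table, using Python's sqlite3 module and the student-takes-course example, is given below. The table names, column names, and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);
-- Associative table: one row per (student, course) pairing.
CREATE TABLE takes (
    student_id INTEGER REFERENCES student(student_id),
    course_id  INTEGER REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Raju'), (2, 'Kiran');
INSERT INTO course  VALUES (10, 'DBMS'), (20, 'Java');
INSERT INTO takes   VALUES (1, 10), (1, 20), (2, 10);
""")

# Join through the associative table to list who takes what.
pairs = conn.execute("""
    SELECT s.name, c.title
    FROM takes t
    JOIN student s ON s.student_id = t.student_id
    JOIN course  c ON c.course_id  = t.course_id
    ORDER BY s.name, c.title
""").fetchall()
print(pairs)
```

The many-to-many relationship is thus stored as two one-to-many relationships that meet in the takes table.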
Attribute Classification
An attribute is used to describe the properties of an entity. Attributes can be
broadly classified based on value and structure. Based on value, an attribute
can be classified as a single value, multivalued, derived, or null value
attribute. Based on structure, an attribute can be classified as a simple or
composite attribute.
Symbols Used in ER Diagram

The elements in an ER diagram are Entity, Attribute, and Relationship. The
different types of entities (strong, weak, and associative), the different
types of attributes (such as multivalued and derived attributes), and the
identifying relationship each have a corresponding symbol.
Single Value Attribute
A single value attribute has only one value associated
with it.
Multivalued Attribute
A multivalued attribute has more than one value
associated with it.
Derived Attribute
The value of a derived attribute can be derived from the values
of other related attributes or entities. In an ER diagram, the derived
attribute is represented by a dotted ellipse.
Null Value Attribute
In some cases, a particular entity may not have any applicable
value for an attribute. For such situations, a special value called the
null value is used.
Composite Attribute
A composite attribute is one which can be further subdivided into simple
attributes.
Relationship Degree
Relationship degree refers to the number of associated entity types. The
relationship degree can be broadly classified into unary, binary, and ternary
relationships.
Unary Relationship
The unary relationship is otherwise known as a recursive relationship. In a
unary relationship the number of associated entity types is one: an entity
type related to itself forms a recursive relationship.
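One common way to picture a unary relationship is a table whose foreign key points back at its own primary key. The sketch below assumes a hypothetical employee table in which manager_id refers to another row of the same table; the names and rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Unary (recursive) relationship: manager_id refers back to the same table.
conn.execute("""
    CREATE TABLE employee (
        emp_id     INTEGER PRIMARY KEY,
        name       TEXT,
        manager_id INTEGER REFERENCES employee(emp_id)
    )
""")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Hema", None), (2, "Raju", 1), (3, "Kiran", 1)],
)
# Self-join: pair each employee with the employee who manages them.
managed = conn.execute("""
    SELECT e.name, m.name
    FROM employee e JOIN employee m ON e.manager_id = m.emp_id
    ORDER BY e.emp_id
""").fetchall()
print(managed)
```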
Binary Relationship
In a binary relationship, two entities are involved. For example, each
staff member is assigned to a particular department. Here the two entities
are STAFF and DEPARTMENT.

Ternary Relationship
In a ternary relationship, three entities are simultaneously involved. Ternary
relationships are required when binary relationships are not sufficient to
accurately describe the semantics of an association among three entities.
Specialization and Generalization (or)
Characteristics of Super type/Subtypes

There are two processes, “Specialization” and “Generalization”, that
serve as mental models in developing super type/subtype relationships.
Generalization: In data modeling, “Generalization is the process of defining
a more general entity type from a set of more specialized entity types”.
Thus generalization is a bottom-up process. An example of generalization is
shown in the following figure.
• The above figure represents three entity types: car, truck, and motorcycle.
At this stage the data modeler intends to represent these separately on the
E-R diagram. However, on close examination, we see that the three entity
types have a number of attributes in common: vehicle-identifier,
vehicle-name, price, and engine displacement.
• This fact suggests that each of the three entity types is really a version of a
more general entity type. This more general entity type, vehicle, together
with the resulting super type/subtype relationship, is shown below.
• The entity CAR has the specific attribute no. of passengers, while truck
has two specific attributes capacity and cab-type. Thus, generalization has
allowed us to group entity types, along with their common attributes and
at the same time preserve specific attributes that are unique to each
subtype.
• The entity type motorcycle is not included in the relationship because it
doesn’t satisfy the subtype conditions.
• Referring to the figure (d) you will notice that the only attributes of
motorcycle are those that are common to all vehicles, there are no
attributes specific to motorcycles. Motorcycle does not have a
relationship to another entity type. Thus, there is no need to create a
motorcycle subtype. The fact that there is no motorcycle subtype
suggests that it must be possible to have an instance of vehicle that is not
a member of any of its subtypes.
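The vehicle example maps naturally onto class inheritance. The following Python sketch is only an analogy for the super type/subtype structure, not a database schema; the class layout follows the attributes named above, while the default values and sample data are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Vehicle:                 # supertype: attributes common to all vehicles
    vehicle_identifier: str
    vehicle_name: str
    price: float
    engine_displacement: float

@dataclass
class Car(Vehicle):            # subtype with one specific attribute
    no_of_passengers: int = 0

@dataclass
class Truck(Vehicle):          # subtype with two specific attributes
    capacity: float = 0.0
    cab_type: str = ""

# A motorcycle has no attributes of its own, so it is kept as a plain Vehicle:
# an instance of the supertype that belongs to none of the subtypes.
motorcycle = Vehicle("V3", "Roadster", 1500.0, 0.15)
car = Car("V1", "Sedan", 20000.0, 1.6, no_of_passengers=5)
print(isinstance(car, Vehicle), type(motorcycle) is Vehicle)
```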
Normalization
Normalization is a design technique that is widely used as a guide
in designing relational databases. Normalization is essentially a
two step process that puts data into tabular form by removing
repeating groups and then removes duplicated data from the
relational tables. Normalization theory is based on the concepts
of normal forms. A relational table is said to be in a particular
normal form if it satisfies a certain set of constraints.
Advantages of Normalization
• Avoids data modification (INSERT/DELETE/UPDATE) anomalies, as each data item lives in one place
• Greater flexibility in getting the expected data in atomic, granular form
• Normalization is conceptually cleaner and easier to maintain and change as your needs change
• Fewer null values and less opportunity for inconsistency
• A better handle on database security
• Increased storage efficiency
• The normalization process helps maximize the use of clustered indexes, which are a powerful
and useful type of index. As more data is separated into multiple tables because of
normalization, more clustered indexes become available to help speed up data access
Disadvantages of Normalization
1. Requires more CPU, memory, and I/O to process, so normalized data can give reduced
database performance
2. Requires more joins to get the desired result. A poorly written query can bring the database
down
3. Maintenance overhead. The higher the level of normalization, the greater the number of tables
in the database.
Basic and Higher Normal forms
Normal forms are classified into two main categories:
1. Basic Normal Forms
2. Higher Normal Forms.
1) Basic Normal Forms:
The basic normal forms are
• First normal form
• Second normal form
• Third normal form
First Normal Form:
A relation is in first normal form if it contains no multivalued attribute.
• For example, the relation EMPLOYEE contains the attributes Emp-No,
Emp-Name, Sal, and Gender. There is no multivalued attribute here, so the
relation EMPLOYEE is in first normal form.
• EMPLOYEE (Emp-No, Emp-Name, Sal, Gender)

Second Normal Form (2NF):


A relation is in second normal form if it is in first normal form and every non-key
attribute is fully functionally dependent on the primary key. That means there is no
partial functional dependency.
• A relation in first normal form is in second normal form if any one of the following
conditions applies.
• The primary key of the relation consists of only one attribute.
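A partial dependency can be spotted on sample data with a small helper that tests whether a functional dependency holds. The helper name fd_holds and the sample relation below are hypothetical, and a dependency that holds in sample rows is only evidence, since functional dependencies are really statements about the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; any later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical relation with composite key (emp_id, course).
rows = [
    {"emp_id": 1, "course": "C++",  "emp_name": "Raju",  "date_completed": "2020-01-10"},
    {"emp_id": 1, "course": "Java", "emp_name": "Raju",  "date_completed": "2020-03-05"},
    {"emp_id": 2, "course": "C++",  "emp_name": "Kiran", "date_completed": "2020-01-10"},
]

partial = fd_holds(rows, ["emp_id"], ["emp_name"])       # part of the key determines emp_name
full = fd_holds(rows, ["emp_id"], ["date_completed"])    # date_completed needs the whole key
print(partial, full)
```

Here emp_name depends on only part of the key, which is the partial dependency that second normal form rules out.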
Third Normal Form (3NF):
A relation is in third normal form if it is in second normal form and has no
transitive dependency. A transitive dependency in a relation is a functional
dependency between two or more non-key attributes. For example, consider
the following relation SALES.
• SALES (Cust-Id, Cust-Name, Salesperson-Id, Region)
• In the above relation Cust-Id is the primary key, so all of the remaining
attributes are functionally dependent on this attribute. However, there is a
transitive dependency: the attribute Region is functionally dependent on the
attribute Salesperson-Id. So we decompose the above relation into new
relations that satisfy third normal form.
CUSTOMER (Cust-Id, Cust-Name, Salesperson-Id)
SALES (Salesperson-Id, Region)
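The decomposition can be checked on a few rows in plain Python. The sample values below are made up; the sketch projects the original SALES rows onto the two new relations and then joins them back on Salesperson-Id to confirm that no information is lost.

```python
# Sample SALES rows: (Cust-Id, Cust-Name, Salesperson-Id, Region); values are made up.
sales = [
    (1, "Anil", "S1", "South"),
    (2, "Bala", "S1", "South"),
    (3, "Chitra", "S2", "North"),
]

# Decompose: CUSTOMER keeps the key and its direct dependents,
# while the second relation keeps Salesperson-Id -> Region.
customer = sorted({(cid, name, sp) for cid, name, sp, _ in sales})
salesperson_region = sorted({(sp, region) for _, _, sp, region in sales})

# Joining the two relations on Salesperson-Id rebuilds the original rows.
region_of = dict(salesperson_region)
rejoined = [(cid, name, sp, region_of[sp]) for cid, name, sp in customer]

print(salesperson_region)
print(rejoined == sorted(sales))
```

Each region is now stored once per salesperson instead of once per customer, so updating a salesperson's region touches a single row.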

Advanced Normal Forms/Higher Normal Forms:


• Advanced Normal Forms are
• Boyce-Codd normal form (BCNF)
• Fourth normal form
• Fifth normal form
1) Boyce-Codd Normal Form:
• A relation is in Boyce-Codd normal form if it is in third normal form and every determinant is a
candidate key. For example, consider a relation STUDENT with the attributes St-No, Subject, and Adviser.
STUDENT (St-No, Subject, Adviser)
• In the above relation the primary key is the composite key (St-No, Subject). Here two functional
dependencies occur.
St-No, Subject → Adviser
Adviser → Subject
In the second functional dependency the determinant is Adviser, which is not a
candidate key. So the above relation is not in Boyce-Codd normal form, and we
decompose it into new relations.
STUDENT (St-No, Adviser)
ADVISER (Subject, Adviser)
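The BCNF test can be sketched with an attribute-closure routine. The code below assumes the two dependencies are (St-No, Subject) → Adviser and Adviser → Subject, as the text describes; the function names closure and bcnf_violations are hypothetical, and the check treats "determinant is a candidate key" as "determinant is a superkey", which is the usual formulation.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is already covered and it adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Return the FDs whose determinant does not determine every attribute."""
    return [(lhs, rhs) for lhs, rhs in fds if closure(lhs, fds) != set(all_attrs)]

attrs = {"St-No", "Subject", "Adviser"}
fds = [
    (frozenset({"St-No", "Subject"}), frozenset({"Adviser"})),
    (frozenset({"Adviser"}), frozenset({"Subject"})),
]
violations = bcnf_violations(attrs, fds)
print(violations)
```

Only the dependency with determinant Adviser is reported, which matches the reason given for decomposing STUDENT.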
2) Fourth Normal Form:
A relation is in fourth normal form if it is in Boyce-Codd normal form and has no
multi-valued dependency. Here a multi-valued dependency is a dependency in which a
non-key attribute is functionally dependent on two or more sets of primary key
attributes.
• For example, consider a relation STUDENT that contains the attributes St-No,
St-Name, Course-Id, and Grade. Here the primary key is the composite key (St-No,
St-Name, Course-Id). In this example the non-key attribute Grade is functionally
dependent on both (St-No, Course-Id) and (St-Name, Course-Id).
STUDENT (St-No, St-Name, Course-Id, Grade)
St-No, Course-Id → Grade
St-Name, Course-Id → Grade
The above relation contains a multi-valued dependency, so it is not in fourth
normal form. To avoid the multi-valued dependency we decompose the above
relation into new relations.
(St-No, St-Name)
(St-No, Course-Id, Grade)
Fifth Normal Form (5NF) (Project-Join Normal Form):
A relation is in fifth normal form if it is in fourth normal form and
contains no join dependency. Here join dependency means that a
relation contains a minimum of 3 attributes and every attribute may be
functionally dependent on the remaining attributes. For example, consider
a relation CLASS with the attributes Subject, Teacher, and Text-book. Here the
primary key is the composite key (Subject, Teacher, Text-book). The
relation CLASS is not in fifth normal form because it satisfies a join
dependency. So we decompose the above relation into new relations.
CLASS (Subject, Teacher, Text-book)
Subject, Teacher → Text-book
Text-book, Subject → Teacher
Text-book, Teacher → Subject
Decomposed relations:
(Subject, Teacher)
(Teacher, Text-book)
(Text-book, Subject)
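Whether a three-way decomposition of this kind loses information can be tested by projecting the relation onto the three attribute pairs and joining the projections back together. The sketch below does this in plain Python; the function names and the CLASS rows are made up for illustration.

```python
def project(rows, idx):
    """Project a set of tuples onto the attribute positions in idx."""
    return {tuple(r[i] for i in idx) for r in rows}

def join_of_projections(rows):
    """Natural join of the (Subject, Teacher), (Teacher, Text-book),
    and (Text-book, Subject) projections of rows."""
    st = project(rows, (0, 1))
    tb = project(rows, (1, 2))
    bs = project(rows, (2, 0))
    return {
        (s, t, b)
        for (s, t) in st
        for (t2, b) in tb
        if t == t2 and (b, s) in bs
    }

# Hypothetical CLASS rows: (Subject, Teacher, Text-book).
class_rows = {
    ("DBMS", "Rao", "Korth"),
    ("DBMS", "Rao", "Date"),
    ("DBMS", "Devi", "Korth"),
    ("DBMS", "Devi", "Date"),
}
lossless = join_of_projections(class_rows) == class_rows
print(lossless)
```

When the join returns exactly the original rows, the relation satisfies the join dependency and can be stored as the three smaller relations; if the join produced extra tuples, the decomposition would not be safe for that data.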
Aggregation and Composition

• Relationships among relationships are not supported by the ER
model. Groups of entities and relationships can be abstracted into
higher level entities using aggregation. Aggregation represents a
“HAS-A” or “IS-PART-OF” relationship between entity types. One
entity type is the whole, the other is the part. Aggregation allows us
to indicate that a relationship set participates in another relationship
set. For example, a car has various components such as tires, doors, engine,
and seats, which vary from one car to another. The relationship “drives” is
insufficient to model the complexity of this system. Part-of
relationships allow abstraction into higher level entities. In this
example engine, tires, doors, and seats are aggregated into car.
• Composition is a stronger form of aggregation where the part cannot exist
without its containing whole entity type and the part can only be part of one
entity type. Consider the example of DEPARTMENT has PROJECT. Each
project is associated with a particular DEPARTMENT. There cannot be a
PROJECT without DEPARTMENT. Hence DEPARTMENT has PROJECT is an
example of composition.
Relationships within the relational database / Mapping cardinalities
Mapping cardinalities (or) cardinality ratios express the number of
entities to which another entity can be associated via a relationship set.
Mapping cardinalities are most useful in describing binary relationship
sets although occasionally they contribute to the description of
relationship sets that involve more than two entity sets.
• For a binary relationship set R between entity sets A and B, the
mapping cardinality must be one of the following.
(i) One – to – One: An entity in A is associated with at most one entity in B,
and an entity in B is associated with at most one entity in A.
• Figure: One – to – One mapping between entity sets A and B
• Example: Employee Is-assigned Parking Place
(ii) One – to – Many: An entity in A is associated with any number of entities
in B. An entity in B, however, can be associated with at most one entity in
A.
• Figure: One – to – Many mapping between entity sets A (a1, a2) and B (b1 to b5)
(iii) Many – to – One: An entity in A is associated with at most one entity
in B. An entity in B, however, can be associated with any number of
entities in A.
• Figure: Many – to – One mapping between entity sets A (a1 to a5) and B (b1 to b3)
• Example: PRODUCT and Product_Line, related by Contains
(iv) Many – to – Many: An entity in A is associated with any number of entities in
B, and an entity in B is associated with any number of entities in A.
• Figure: Many – to – Many mapping between entity sets A (a1 to a4) and B (b1 to b4)
• Example: STUDENT Registers-for Course
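The four cases can be told apart mechanically from a set of relationship instances. The sketch below classifies a set of (a, b) pairs; the function name mapping_cardinality is hypothetical, and it only describes the instances it is given, whereas a real mapping cardinality is a constraint declared on the schema.

```python
from collections import defaultdict

def mapping_cardinality(pairs):
    """Classify a set of (a, b) relationship instances from A to B."""
    b_per_a = defaultdict(set)
    a_per_b = defaultdict(set)
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)
    a_has_many = any(len(bs) > 1 for bs in b_per_a.values())    # some a maps to several b
    b_has_many = any(len(as_) > 1 for as_ in a_per_b.values())  # some b maps to several a
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"

print(mapping_cardinality({("a1", "b1"), ("a2", "b2")}))
print(mapping_cardinality({("a1", "b1"), ("a1", "b2"), ("a2", "b3")}))
print(mapping_cardinality({("a1", "b1"), ("a2", "b1")}))
print(mapping_cardinality({("a1", "b1"), ("a1", "b2"), ("a2", "b1")}))
```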
Entity Relationship model constructs(ER Model)

The basic constructs of the entity relationship model are entities,
relationships and attributes. The richness of the E-R Model allows
designers to model real world situations accurately and expressively,
which helps account for the popularity of the model.
Entity:
• An entity is an object that may represent a person, place, object, event,
or concept in the user environment about which the organization
wishes to maintain data.
• Eg: Employee, Student, Store, warehouse, Machine, Registration,
Account, course.
Entity type versus Entity Instance:
Entity Type: An entity type “is a collection of entities that share
common properties or characteristics”. Each type in an E-R model is given a
name. Since the name represents a collection of items, it is always
singular. We use capital letters for names of entity types. In an E-R diagram
the entity name is placed inside the box representing the entity type.
Entity Instance: An entity instance “is a single occurrence of an entity
type”. An entity type is described just once (using metadata) in a database,
while many instances of that entity type may be represented by data stored
in the database.
Strong Vs weak entity type
Strong entity: An entity type that exists independently of other entity types is
called a strong entity type; equivalently, an entity set that has a primary key is
termed a strong entity set. A strong entity is denoted by a single-lined box.

• Weak entity: An entity set that does not have sufficient attributes to form a
primary key is termed a weak entity set; equivalently, an entity type whose
existence depends on some other entity type is called a weak entity type. A weak
entity type has no meaning in the E-R diagram without the entity on which it
depends. The entity type on which the weak entity type depends is called “the
identifying owner” or simply the “owner”. A weak entity is denoted by a
double-lined box.
Identifying Relationship: The relationship that associates the weak entity set
with an owner is the identifying relationship.
Attributes: A property or characteristic of an entity type that is of interest to
the organization is called an attribute. An attribute is denoted by an ellipse.
• Ex: Figure: FACULTY entity with attributes F_Id, Name (components F_Name,
M_Name, L_Name), Dob, Age, Skill, and Qualification
• In the above example F_Id, Name, Dob, Age, Skill, and Qualification are
attributes of the FACULTY entity.
Types of attributes
1. Simple Vs Composite Attribute:
Simple attribute:
A “simple” or atomic attribute is an attribute that cannot be broken down
into smaller components. In the above example F_Id, Dob, Skill, Age, and
Qualification are simple attributes.
Composite attribute: A “composite” attribute is an attribute that can be broken
down into component parts. In the above example Name is a composite attribute
because it is broken into 3 parts (components), namely F_Name, M_Name,
and L_Name.
2. Single valued Vs Multivalued Attribute:
Single valued attribute: A “single valued” attribute is an attribute that takes
only one value for a given entity instance.
• Eg: The F_Id attribute of the entity FACULTY takes only one value for each
entity instance.
Multivalued Attribute: A “multivalued” attribute is an attribute that may
take more than one value for a given entity instance. We indicate a
multivalued attribute with a double-lined ellipse.
• Ex: Consider a FACULTY entity set with the attribute Skill. Any particular
faculty member may have Skill in more than one topic.
3. Null attributes: A null value is used when an entity does not have a value for
an attribute.
4. Stored Vs Derived Attributes :
Stored attribute: A stored attribute is one whose value is stored directly in the
database. Some other attribute values are calculated or derived from these stored values.
• Ex : In FACULTY entity type F_Id, Dob are stored attributes.
• Derived Attribute: A “derived” attribute is an attribute whose values can be
calculated from related attribute values. We indicate a derived attribute in
an E-R Diagram by using an ellipse with a dashed line.
• Ex : The FACULTY entity type has Age attribute. If the users need to know
Age, that value must be derived from the attribute Dob.
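The stored versus derived distinction can be sketched with a small Python class. The class name Faculty and its fields follow the FACULTY example; the age method and the sample dates are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Faculty:
    f_id: int   # stored attribute (identifier)
    dob: date   # stored attribute

    def age(self, today):
        """Derived attribute: computed from Dob rather than stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)

f = Faculty(f_id=1, dob=date(1980, 6, 15))
print(f.age(date(2024, 6, 14)), f.age(date(2024, 6, 15)))
```

Storing Dob and deriving Age avoids having to update every row as time passes.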
5. Identifier attribute or Primary Key: An identifier is an attribute that
uniquely identifies individual instances of an entity type. The identifier for the
FACULTY entity type is F_Id. Each entity instance must have a single value for
the identifier, and the identifier must be associated with the entity. We underline
the identifier's name on the E-R diagram.
Primary Key: A Primary key is a set of one or more attributes that can uniquely
identify tuples within the relation.
Composite identifier or Composite key attribute: A composite identifier is an
identifier that consists of a composite attribute.
• The following figure shows the entity set FLIGHT with the composite
identifier Flight-id. The Flight-id composite attribute in turn has the component
attributes flight-no and date. This combination is required to uniquely
identify individual occurrences of flight.
Relationships:
A relationship is an association among the instances of one (or) more
entity types that is of interest to the organization.
• Eg: Consider the entity types EMPLOYEE and COURSE, where COURSE represents
training courses that may be taken by employees. To track the courses that have
been completed by particular employees, we define a relationship called
“Completes” between the two entity types, as shown in the following figure.
• Figure: Relationship type (Completes): Employee (Employee-id, Employee-name,
Birthdate) Completes Course (Course-id, Course-title, Topic)
• Figure: Relationship instances linking the employees Raju, Kiran, Suman, and
Hema to the courses C++, Java, and COBOL
• This is a many-to-many relationship, since each employee may complete any number
of courses, while a given course may be completed by any number of employees.
• Associative Entities: An associative entity is an entity type that associates the
instances of one or more entity types and contains attributes that are peculiar to
the relationships between those entity instances.
• In the E-R model associative entities are represented with the diamond
relationship symbol enclosed within entity box. The purpose of this symbol is to
preserve the information that the entity was initially specified as a relationship on
E-R Model.
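The many-to-many Completes relationship and its associative-entity form can be sketched in a few lines of Python. This is only an illustration: the employee/course pairings and the date_completed attribute are assumptions made up for the example, not data from the figure.

```python
# Hypothetical Completes instances as (employee, course) pairs.
completes = {
    ("Raju", "C++"),
    ("Raju", "Java"),
    ("Kiran", "Java"),
    ("Suman", "COBOL"),
}

def courses_of(employee):
    """Courses completed by one employee (possibly none)."""
    return {c for e, c in completes if e == employee}

def employees_of(course):
    """Employees who have completed one course."""
    return {e for e, c in completes if c == course}

# Associative entity: each relationship instance carries its own attributes.
# The date_completed attribute and its values are invented for illustration.
certificate = {
    ("Raju", "C++"): {"date_completed": "2023-05-10"},
    ("Kiran", "Java"): {"date_completed": "2023-06-02"},
}
```

Here one employee maps to several courses and one course to several employees, which is what makes the relationship many-to-many. An employee with no pairs (such as Hema) simply yields an empty set.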
Different Keys
Keys: It is important to be able to specify how rows in a relation are
distinguished. Conceptually, rows are distinct from one another, but from a
database perspective the difference among them must be expressed in terms
of their attributes. Keys come to the rescue here.
Primary key: A primary key is a set of one or more attributes that can
uniquely identify a tuple within the relation.
• Ex: In our sample database, supp # is the primary key of the suppliers relation
(table), as it contains a unique value for each tuple in the relation.
Composite Primary Key: In some tables, only a combination of more than one
attribute provides a unique value for each row. In such tables, the group of
these attributes is declared as the primary key. Because the primary key then
consists of more than one attribute, it is called a composite primary key.
• Ex: The combination of supp # and item # is the primary key of the shipments relation (table).
Candidate Key: All attribute combinations inside a relation that can serve as primary
key are candidate keys as they are candidates for the primary key position.
• Ex: In our sample database, there are two candidate keys supp # and sup-name in
the suppliers relation. Both of these attributes contain unique values for each
tuple.
Super Key
A super key is a column or set of columns that uniquely identifies a row
within a table.
• Ex:
Given table: Employees { Employee_Id, First_Name, Surname, Sal }
Possible super keys are:
• { Employee_Id }
• { Employee_Id, First_Name }
• { Employee_Id, First_Name, Surname }
• { Employee_Id, First_Name, Surname, Sal }
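The difference between a super key and a candidate key (a minimal super key) can be checked mechanically on sample data. The sketch below is an illustration under assumptions: the rows are invented, and a sample can only show that a column set is not a key. Whether something really is a key is a design decision about all possible rows.

```python
from itertools import combinations

def is_super_key(rows, columns):
    """True if no two rows agree on all of the given columns."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, all_columns):
    """Minimal super keys for this particular set of rows."""
    found = []
    for size in range(1, len(all_columns) + 1):
        for combo in combinations(all_columns, size):
            # Skip any combination that contains a smaller key already found.
            if any(set(k) <= set(combo) for k in found):
                continue
            if is_super_key(rows, combo):
                found.append(combo)
    return found

# Hypothetical rows: employees 1 and 4 differ only in Employee_Id.
employees = [
    {"Employee_Id": 1, "First_Name": "Asha", "Surname": "Rao", "Sal": 30000},
    {"Employee_Id": 2, "First_Name": "Asha", "Surname": "Nair", "Sal": 30000},
    {"Employee_Id": 3, "First_Name": "Ravi", "Surname": "Rao", "Sal": 45000},
    {"Employee_Id": 4, "First_Name": "Asha", "Surname": "Rao", "Sal": 30000},
]
columns = ["Employee_Id", "First_Name", "Surname", "Sal"]
```

Every column set that includes Employee_Id passes is_super_key, but only { Employee_Id } is minimal, so it is the only candidate key for these rows.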
Secondary Key: A secondary key is a key that is used strictly for data retrieval
purposes.
Ex: Suppose customer data are stored in a CUSTOMER table in which the
customer number is the primary key. What if some of the customers
forget their number? Data retrieval for a customer can still be facilitated when the
customer’s last name and phone number are used. In that case, the primary
key is the customer number, and the secondary key is the combination of the
customer’s last name and phone number.
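The CUSTOMER example can be tried with Python's built-in sqlite3 module. This is a minimal sketch, assuming column names (cust_num, last_name, phone) and sample rows that are not in the original. The secondary key is implemented here as an ordinary, non-unique index, which is one common way to support such lookups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " cust_num INTEGER PRIMARY KEY,"   # primary key: the customer number
    " last_name TEXT NOT NULL,"
    " phone TEXT NOT NULL)"
)
# Secondary key: an index on (last_name, phone), used only for retrieval.
conn.execute("CREATE INDEX idx_customer_name_phone ON customer (last_name, phone)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1001, "Rao", "555-0101"), (1002, "Nair", "555-0102"), (1003, "Rao", "555-0103")],
)

def find_customer_numbers(last_name, phone):
    """Recover customer numbers from last name and phone number."""
    rows = conn.execute(
        "SELECT cust_num FROM customer"
        " WHERE last_name = ? AND phone = ? ORDER BY cust_num",
        (last_name, phone),
    ).fetchall()
    return [r[0] for r in rows]
```

Unlike the primary key, a secondary key need not be unique, so the lookup returns a list that may hold more than one customer number.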
Foreign Key: A non-key attribute whose values are derived from the primary
key of some other table is known as a foreign key in its current table.
Data dictionary and System catalog
Data Dictionary:
An integral part of an RDBMS is the data dictionary, which stores
metadata (information about the database), including attribute names
and definitions for each table in the database. The data dictionary is usually
a part of the system catalog that is generated for each database.
• The system catalog describes all database objects, including
table-related data such as table names, table creators (or) owners, column
names and data types, foreign keys and primary keys, index files,
authorized users, user access privileges and so forth. The system catalog is
created by the DBMS and the information is stored in system tables, which
may be queried in the same manner as any other data table, if the user has
sufficient access privileges.
• The system catalog automatically produces database documentation. As
new tables are added to the database, that documentation also allows
the RDBMS to check for and eliminate homonyms and synonyms.
• Homonyms are similar-sounding words with different
meanings, such as boar and bore, or identically spelled words with
different meanings, such as fair (meaning “just”) and fair (meaning “festival”).
• In a database context, the word homonym indicates the
use of the same attribute name to label different attributes. To lessen
confusion, you should avoid database homonyms; the data dictionary is
very useful in this regard.
• A synonym is the opposite of a homonym and indicates the use of
different names to describe the same attribute. For example, car and auto
refer to the same object. Synonyms must be avoided.
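As a concrete illustration of a system catalog stored in queryable tables, SQLite keeps its catalog in a table named sqlite_master and exposes column details through PRAGMA table_info. The sketch below uses both to list column names that occur in more than one table. The two sample tables are assumptions made up for the example, and a shared name is only a candidate homonym: a person still has to decide whether the two columns mean different things.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product  (product_id  INTEGER PRIMARY KEY, name TEXT);
""")

def table_names():
    """Read the table names from SQLite's catalog table."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [r[0] for r in rows]

def shared_column_names():
    """Map each column name used in more than one table to those tables."""
    usage = defaultdict(list)
    for table in table_names():
        # table_info returns one row per column; position 1 is the column name.
        # The table name comes from the catalog itself, not from user input.
        for col in conn.execute(f"PRAGMA table_info({table})"):
            usage[col[1]].append(table)
    return {name: tables for name, tables in usage.items() if len(tables) > 1}
```

In this sample, name labels a customer's name in one table and a product's name in the other, which is the kind of homonym the data dictionary helps to spot.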
Integrity constraints
Integrity Rules:
Relational database integrity rules are very important to good
database design. Many RDBMSs enforce integrity rules automatically.
However, it is much safer to make sure that your application design
conforms to the entity and referential integrity rules.
Entity Integrity:
All primary key entries are unique, and no part of a primary key may be
null. Each row will have a unique identity, and foreign key values can
properly reference primary key values.
Referential Integrity:
A foreign key may have either a null entry, as long as it is not a part of its table’s
primary key, or an entry that matches the primary key value in a table to which it is
related. (Every non-null foreign key value must reference an existing primary key
value.)
Integrity Constraints: The relational data model includes several types of integrity
constraints. The purpose of integrity constraints is to implement the business rules
in the database. The main types of integrity constraints in the relational data model are
described below.
Domain Integrity Constraints:
A domain is a set of values that may be assigned to an attribute. All of the
values that may appear in a column of a table must be taken from the same
domain. A domain definition consists of components such as the name, data type, size and
allowable values. The domain integrity constraints are 1) Not Null and 2) Check. The
Not Null constraint is used to avoid null values. The Check constraint is used to
specify a condition for an attribute. In the relational data model a NULL value is not
the same as zero or an empty string, and one null value is not equal to another null value.
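The Not Null and Check constraints, and the behaviour of NULL in comparisons, can be observed with Python's built-in sqlite3 module. This is a sketch under assumptions: the employee table, its columns, and the rule that a salary may not be negative are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT    NOT NULL,          -- Not Null constraint
        sal    INTEGER CHECK (sal >= 0)   -- Check constraint
    )
""")

def try_insert(emp_id, name, sal):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", (emp_id, name, sal))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert(1, "Raju", 30000)
rejected_null = try_insert(2, None, 30000)    # violates NOT NULL on name
rejected_check = try_insert(3, "Kiran", -5)   # violates CHECK (sal >= 0)

# Comparing NULL with anything, even another NULL, yields NULL, not true.
# sqlite3 returns SQL NULL to Python as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]
null_equals_zero = conn.execute("SELECT NULL = 0").fetchone()[0]
```

Because a comparison with NULL is never true, queries test for missing values with IS NULL rather than with the = operator.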
Entity Integrity Constraints:
Entity integrity constraints are mainly of two types: (i) Primary
key and (ii) Unique.
• Every primary key attribute is non-null and contains unique values.
In some cases a particular attribute cannot be assigned a data value. There
are two situations where this is likely to occur: either there is no
applicable data value, or the applicable data value is not known. In this
case we use the entity integrity constraint Unique, which, unlike a primary key, does not require every row to have a value.
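The contrast between Primary key and Unique can be sketched with sqlite3. The supplier table and the email column are assumptions for illustration. The behaviour shown for NULL in a Unique column is SQLite's; some other database systems treat it differently. The primary key is declared NOT NULL explicitly because SQLite does not always reject NULL in a primary key column by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE supplier (
        supp_no   TEXT PRIMARY KEY NOT NULL,  -- unique and never null
        supp_name TEXT NOT NULL,
        email     TEXT UNIQUE                 -- unique when present, may be null
    )
""")

def try_insert(supp_no, supp_name, email):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO supplier VALUES (?, ?, ?)",
                     (supp_no, supp_name, email))
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert("S1", "Acme", "a@example.com"),   # accepted
    try_insert("S1", "Other", None),             # duplicate primary key
    try_insert(None, "NoKey", None),             # null primary key
    try_insert("S2", "Beta", None),              # email unknown: accepted
    try_insert("S3", "Gamma", None),             # a second unknown email: accepted
    try_insert("S4", "Delta", "a@example.com"),  # duplicate email
]
```

Only rows S1, S2 and S3 end up in the table: the primary key rejects both duplicates and nulls, while the Unique column rejects only duplicate values that are actually present.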
Referential Integrity Constraints:
In the relational data model, associations between tables are
defined with the help of referential integrity constraints (foreign keys). For
example, the association between the CUSTOMER and ORDER tables is
identified by including the Customer-Id attribute as a foreign key in the ORDER
table.
CUSTOMER (Customer-Id, Cust-Name, Add)
Primary key: Customer-Id
ORDER (Order-Id, Order description, Customer-Id)
Foreign key: Customer-Id, referencing CUSTOMER

• A referential integrity constraint is a rule that maintains consistency
among the rows of two relations. The rule says that if there is a foreign key in
one relation, each foreign key value must either match a primary key value in
the other relation or be null.
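The CUSTOMER and ORDER association can be reproduced with sqlite3 to see the referential integrity rule being enforced. This is a sketch with assumed details: the order table is named orders because ORDER is a reserved word in SQL, the column names are adapted to valid identifiers, and the sample rows are invented. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        cust_name   TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        description TEXT,
        customer_id INTEGER REFERENCES customer (customer_id)  -- foreign key
    );
    INSERT INTO customer VALUES (1, 'Raju');
""")

def try_order(order_id, customer_id):
    """Return True if the order row is accepted, False if it is rejected."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?, ?)",
                     (order_id, "sample order", customer_id))
        return True
    except sqlite3.IntegrityError:
        return False

matching = try_order(10, 1)      # customer 1 exists
dangling = try_order(11, 99)     # no customer 99
unknown = try_order(12, None)    # null foreign key
```

The order for customer 99 is refused because no such primary key value exists, while the null foreign key is accepted, matching the rule that every non-null foreign key value must reference an existing primary key value.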
Logical view of Data
Logical view of data: A collection of interrelated files and a set of programs that allow
users to access and modify these files is known as a database management system.
The database stores and manages both data and metadata. The DBMS manages
and controls access to the data and the database structure. Such an arrangement, i.e.
placing the DBMS between the application and the database, eliminates most of the
file system’s inherent limitations.
• The relational data model allows the designer to focus on the logical
representation of the data and its relationships, rather than on physical storage
details. The relational model enables you to view data logically rather than physically.
• The use of a table has the advantages of structural and data independence. A
table does resemble a file from a conceptual point of view. Because you can think of
related records as being stored in independent tables, the relational database model
is much easier to understand than the hierarchical and network models. Logical
simplicity tends to yield simple and effective database design methodologies. The
table plays a prominent role in the relational model.
Tables and their characteristics

• The logical view of the relational database is facilitated by the creation of
data relationships based on a logical construct known as a relation. Because
a relation is a mathematical construct, end users find it much easier to think
of a relation as a table. A table is perceived as a two-dimensional structure
composed of rows and columns. A table is also called a relation because the
relational model’s creator, E.F. Codd, used the term relation as a synonym
for table.
Characteristics of a Relational Table

1. A table is perceived as a two-dimensional structure composed of rows and
columns.
2. Each table row (tuple) represents a single entity occurrence within the
entity set.
3. Each table column represents an attribute, and each column has a distinct
name.
4. Each row/column intersection represents a single data value.
5. All values in a column must conform to the same data format.
6. Each column has a specific range of values known as the attribute domain.
7. The order of the rows and columns is immaterial to the DBMS.
8. Each table must have an attribute or a combination of attributes that
uniquely identifies each row
Relational Model Concepts in DBMS
• Attribute: Each column in a table. Attributes are the properties that define a relation, e.g.,
Student_Rollno, NAME, etc.
• Tables – In the relational model, relations are saved in table format. A table is stored along
with its entities and has two properties: rows and columns. Rows represent records and
columns represent attributes.
• Tuple – A single row of a table, which contains a single record.
• Relation Schema: A relation schema represents the name of the relation with its attributes.
• Degree: The total number of attributes in the relation is called the degree of the
relation.
• Cardinality: The total number of rows present in the table.
• Column: The column represents the set of values for a specific attribute.
• Relation instance – A relation instance is a finite set of tuples in the RDBMS. Relation
instances never have duplicate tuples.
• Relation key – Every row has one or more attributes that identify it uniquely; these are called the relation key.
• Attribute domain – Every attribute has a predefined set of permitted values and scope, which is known as its
attribute domain.
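Several of these terms can be made concrete with a small Python sketch that models a relation as a schema (a tuple of attribute names) plus an instance (a set of tuples). The CITY attribute and all the sample rows are assumptions made up for the example.

```python
# Relation schema: the attribute names, in a fixed order.
schema = ("Student_Rollno", "NAME", "CITY")

# Relation instance: a finite set of tuples. A Python set cannot hold
# duplicates, mirroring the rule that a relation has no duplicate tuples.
instance = {
    (1, "Raju", "Hyderabad"),
    (2, "Kiran", "Chennai"),
    (3, "Hema", "Hyderabad"),
}
instance.add((1, "Raju", "Hyderabad"))  # adding a duplicate changes nothing

def degree(schema):
    """Number of attributes in the relation."""
    return len(schema)

def cardinality(instance):
    """Number of tuples (rows) in the relation."""
    return len(instance)

def column(schema, instance, attribute):
    """Values of one attribute, listed in tuple-sorted order for readability."""
    position = schema.index(attribute)
    return [row[position] for row in sorted(instance)]
```

For this sample the degree is 3 and the cardinality is 3, and the NAME column holds the values Raju, Kiran and Hema.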
E.F. CODD’S RELATIONAL DATABASE RULES
• In 1985, Dr. E.F. Codd published a list of 12 rules to define a relational database
system. The reason Dr. Codd published the list was his concern that many
vendors were marketing products as “relational” even though those products
did not meet the minimum relational standards. Dr. Codd’s list below serves as
a frame of reference for what a truly relational database should be.
1. Information Representation: All information in a relational database must
be logically represented as column values in rows within tables.
2. Guaranteed Access: Every value in a table is guaranteed to be accessible
through a combination of table name, primary key value, and column name.
3. Systematic Treatment of Nulls: Nulls must be represented and treated in a
systematic way, independent of data type.
4. Dynamic On-Line Catalog Based on the Relational Model: The metadata
must be stored and managed as ordinary data, that is, in tables within the
database. Such data must be available to authorized users using the
database’s relational language.
5. Comprehensive Data Sublanguage: The relational database may support
many languages. However, it must support one well-defined, declarative language
with support for data definition, view definition, data manipulation (interactive
and by program), integrity constraints, authorization, and transaction
management (begin, commit, and rollback).
6. View Updating: Any view that is theoretically updatable must be updatable
through the system.
7. High-Level Insert, Update and Delete: The database must support set-level inserts,
updates, and deletes.
8. Physical Data Independence: Application programs and ad hoc facilities are
logically unaffected when physical access methods or storage structures are changed.
9. Logical Data Independence: Application programs and ad hoc facilities are logically
unaffected when changes are made to the table structures that preserve the original
table values (changing the order of columns or inserting columns).
10. Integrity Independence: All relational integrity constraints must be definable in
the relational language and stored in the system catalog, not at the application level.
11. Distribution Independence: The end users and application programs are unaware
and unaffected by the data location (distributed vs. local databases).
12. Nonsubversion: If the system supports low-level access to the data, there must
not be a way to bypass the integrity rules of the database.
Rule Zero: All preceding rules are based on the notion that in order for a database
to be considered relational, it must use its relational facilities exclusively to manage
the database.
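Two of the rules above lend themselves to a short demonstration with sqlite3: Guaranteed Access (rule 2) and High-Level Insert, Update and Delete (rule 7). This is only a sketch; the employee table, its columns and its rows are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT,
        dept   TEXT,
        sal    INTEGER
    );
    INSERT INTO employee VALUES
        (1, 'Raju',  'Sales', 30000),
        (2, 'Kiran', 'Sales', 32000),
        (3, 'Hema',  'HR',    28000);
""")

# Rule 2: one value is reached by table name (employee),
# primary key value (2) and column name (sal).
kiran_sal = conn.execute(
    "SELECT sal FROM employee WHERE emp_id = ?", (2,)
).fetchone()[0]

# Rule 7: a single statement updates a whole set of rows at once,
# with no row-by-row loop in the application.
cursor = conn.execute("UPDATE employee SET sal = sal + 1000 WHERE dept = 'Sales'")
rows_changed = cursor.rowcount

sales_salaries = [r[0] for r in conn.execute(
    "SELECT sal FROM employee WHERE dept = 'Sales' ORDER BY emp_id"
)]
```

The update touches both Sales rows in one statement, while the HR row is left unchanged.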
