Dbms-3bcom - UG - New
Data Availability: Data availability means that the data are made available to a
wide variety of users in a meaningful format and at reasonable cost, so that the users can
easily access the data.
Data Integrity: Data integrity refers to the correctness of the data in the database. In
other words, the data available in the database are reliable.
Data Security: Data security means that only authorized users can access
the data. Data security can be enforced by passwords. In addition, if two users are
accessing the same data at the same time, the DBMS must not allow them to make
conflicting changes.
Data Independence: The DBMS allows the user to store, update, and retrieve data in an
efficient manner. It provides an “abstract view” of how the data are stored in the
database. Complex data structures are used to represent the data so that the information
can be stored efficiently, but the system hides the details of how the data are stored and
maintained.
Evolution of Database Management Systems
The file-based system was the predecessor of the database management
system. The Apollo moon-landing project was started in 1960. At
that time, no system was available to handle and manage large
amounts of information. As a result, North American Aviation, which is
now known as Rockwell International, developed software
known as Generalized Update Access Method (GUAM).
In the mid-1960s, IBM joined North American Aviation to develop
GUAM into the Information Management System (IMS). IMS was based
on the hierarchical data model. Also in the mid-1960s, General Electric
released Integrated Data Store (IDS).
IDS was based on the network data model. Charles Bachman was
mainly responsible for the development of IDS. The network database
was developed to represent more complex data
relationships than could be modeled with hierarchical structures.
The Conference on Data Systems Languages (CODASYL) formed the Data Base Task Group (DBTG) in 1967.
The DBTG specified three distinct languages for standardization: a Data Definition Language
(DDL), which enables the database administrator to define the schema; a subschema
DDL, which allows application programs to define the parts of the database they use; and a
Data Manipulation Language (DML) to manipulate the data. The network and hierarchical
data models developed during that time
had the drawbacks of minimal data independence, minimal theoretical foundation, and
complex data access.
To overcome these drawbacks, E. F. Codd of IBM published a paper titled “A Relational
Model of Data for Large Shared Data Banks” in Communications of the ACM, vol. 13, no. 6,
pp. 377–387, June 1970.
Following Codd’s paper, the System R project was developed during the late 1970s at the IBM
San Jose Research Laboratory in California. The project was intended to prove that the
relational data model was implementable. One outcome of the System R project was the
development of Structured Query Language (SQL), which is the standard language for
relational database management systems.
In the 1980s, IBM released two commercial relational database management systems,
DB2 and SQL/DS, and Oracle Corporation released Oracle.
Codd himself attempted to address some of the failings in his original work with
extended versions of the relational model: RM/T in 1979 and RM/V2 in 1990.
In recent years, two approaches to DBMS have become popular: Object-Oriented
DBMS (OODBMS) and Object-Relational DBMS (ORDBMS). The
chronological order of the development of DBMS is as follows:
1. Flat files – 1960s–1980s
2. Hierarchical – 1970s–1990s
3. Network – 1970s–1990s
4. Relational – 1980s–present
5. Object-oriented – 1990s–present
6. Object-relational – 1990s–present
7. Data warehousing – 1980s–present
8. Web-enabled – 1990s–present
• Early 1960s. Charles Bachman at GE created the first general-purpose DBMS,
Integrated Data Store. It formed the basis for the network model, which was
standardized by CODASYL (Conference on Data Systems Languages).
• Late 1960s. IBM developed the Information Management System (IMS). IMS
used an alternative model, called the hierarchical data model.
• 1970. Edgar Codd of IBM created the relational data model.
• 1976. Peter Chen presented the Entity-Relationship model, which is widely used
in database design.
• 1980. SQL, developed by IBM, became the standard query
language for databases. SQL was standardized by ISO.
• 1981. Codd received the Turing Award for his contributions to database
theory. Codd passed away in April 2003.
• 1980s and 1990s. IBM, Oracle, Informix and others developed powerful
DBMS.
Classification of Database Management System
Database management systems can be broadly classified into two categories:
(1) Passive Database Management System
(2) Active Database Management System
1. Passive Database Management System.
Passive Database Management Systems are program-driven. In a passive
database management system, users query the current state of the database and
retrieve the information currently available in it. Traditional DBMS
are passive in the sense that they are explicitly and synchronously invoked by
operations initiated by a user or an application program. Applications send requests for
operations to be performed by the DBMS and wait for the DBMS to confirm and
return any answers. The operations can be definitions and updates of
the schema, as well as queries and updates of the data.
2. Active Database Management System.
Active Database Management Systems are data-driven or event-driven
systems. In an active database management system, the users
specify to the DBMS the information they need. The DBMS then
actively monitors the arrival of the desired information and provides
it to the relevant users as soon as it becomes available. The scope
of a query in a passive DBMS is limited to past and present data,
whereas the scope of a query in an active DBMS additionally includes
future data. An active DBMS reverses the control flow between
applications and the DBMS: instead of only applications calling the
DBMS, the DBMS may also call applications.
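The event-driven behaviour described above can be illustrated with a database trigger, which is one common way such rules are expressed. This is a minimal sketch, assuming Python's built-in sqlite3 module; the tables `stock` and `alerts`, the trigger name `low_stock` and the threshold of 10 are hypothetical and are not taken from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER);
    CREATE TABLE alerts (item TEXT, qty INTEGER);

    -- Hypothetical rule: record an alert whenever a quantity drops below 10.
    CREATE TRIGGER low_stock AFTER UPDATE OF qty ON stock
    WHEN NEW.qty < 10
    BEGIN
        INSERT INTO alerts (item, qty) VALUES (NEW.item, NEW.qty);
    END;
""")

conn.execute("INSERT INTO stock VALUES ('pen', 50)")
# The application only updates the data; it never asks for alerts.
conn.execute("UPDATE stock SET qty = 5 WHERE item = 'pen'")

# The database itself reacted to the event and produced the alert row.
alerts = conn.execute("SELECT item, qty FROM alerts").fetchall()
print(alerts)
```

Here the application issues an ordinary update, and the reaction to the event is carried out by the database rather than by the calling program.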
UNIT-II
Historical Roots of File and File System
In earlier days, records were maintained in traditional file systems. A file
system is an organization of files.
A file is a collection (or group) of records. A record is a collection of fields,
and a field contains the actual data. For example, a student file holds
all student records, each consisting of roll number, name, group, marks, and average.
Early file systems maintained all files in a flat manner (flat
files or text files). A flat file permits records to be searched by sequential access
only, which was cumbersome and slow. To overcome this slowness, indexed
file systems were introduced, which allow faster random access but
occupy extra memory to maintain the index table.
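The difference between sequential access and indexed access can be sketched as follows. This is an illustrative sketch only: the sample records and the function names `sequential_search` and `indexed_search` are hypothetical, and an in-memory list stands in for the file.

```python
# Hypothetical student records, stored in arrival order as in a flat file.
records = [
    {"roll_no": 101, "name": "Asha", "group": "BBA"},
    {"roll_no": 102, "name": "Ravi", "group": "B.Com"},
    {"roll_no": 103, "name": "Sita", "group": "BBA"},
]

def sequential_search(records, roll_no):
    """Scan the records one by one, as a flat file would be read."""
    for position, record in enumerate(records):
        if record["roll_no"] == roll_no:
            return position
    return None

# The index table maps each key to its record position.
# This is the extra memory that an indexed file has to maintain.
index_table = {record["roll_no"]: pos for pos, record in enumerate(records)}

def indexed_search(index_table, roll_no):
    """Jump straight to the record position through the index table."""
    return index_table.get(roll_no)

print(sequential_search(records, 103), indexed_search(index_table, 103))
```

Both functions find the same record; the sequential version examines each record in turn, while the indexed version pays for its speed with the additional `index_table` structure.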
In general, in a file system all the data have to be stored in the
corresponding folders (or directories). For example, a college may
maintain the admission details of its students. Suppose the director wants to
know today’s admission status group-wise. In that case the file manager
(or clerk) has to open each folder to answer the director’s question.
This is time-consuming, memory-consuming and error-prone.
To overcome this, the database system evolved. It uses
a fourth-generation language (4GL), namely SQL, which allows a wide range of queries to be answered.
The DBMS maintains all the records in the form of tables made up
of rows and columns. In a DBMS, however, the
schema has to be created with the help of DDL before the data are stored.
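The director's question about group-wise admissions can be answered with a single query once the schema exists. This is a minimal sketch, assuming Python's built-in sqlite3 module; the table `admission`, its columns, the sample rows and the date are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: the schema must exist before any data are stored.
conn.execute("""
    CREATE TABLE admission (
        roll_no  INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        grp      TEXT NOT NULL,
        admitted TEXT NOT NULL
    )
""")

# Hypothetical sample rows; dates are stored as ISO text.
conn.executemany(
    "INSERT INTO admission VALUES (?, ?, ?, ?)",
    [
        (1, "Gopal", "B.Sc", "2024-06-01"),
        (2, "Ravi", "BBA", "2024-06-01"),
        (3, "Sita", "BBA", "2024-06-01"),
        (4, "Anil", "B.Com", "2024-05-31"),
    ],
)

# One query replaces opening every folder by hand.
status = conn.execute(
    """
    SELECT grp, COUNT(*) FROM admission
    WHERE admitted = ?
    GROUP BY grp
    ORDER BY grp
    """,
    ("2024-06-01",),
).fetchall()
print(status)
```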
File system
• Assume the first-year students’ data are maintained using a file system.
• Folder names: BBA, B.Com
• Each folder holds: Roll No, name, fees
Every folder contains all the relevant fields, which occupies extra memory, and
calculation is slow. Constraints cannot be imposed in file systems.
DBMS
Roll No   Name    Branch   Fees
60012     Gopal   B.Sc     20000
[Figure: nested type definitions. Recoverable labels: Person (Eno, E_inf person, Address_inf address, Phone number); name fields (fname varchar, &name varchar, initial varchar); Address (Street varchar, City varchar, State varchar).]
Components and Interfaces of Database Management System
Procedure: Procedures are the rules that govern the design and the use of database.
Components of Database Environment:
The major components of a typical database environment
and their relationships are shown below
• Computer-aided software engineering (CASE) tools: CASE tools are automated
tools used to design databases and application programs.
• Repository: The repository is a centralized knowledge base for all data definitions,
data relationships, screen and report formats, and other system components. A
repository contains an extended set of metadata important for managing
databases as well as other components of an information system.
• Database management system (DBMS): A DBMS is a commercial software (and
occasionally hardware and firmware) system that is used to define, create,
maintain, and provide controlled access to the database and also to the
repository. In other words, a DBMS is a collection of logically related data together with a set of
programs to operate on the data.
• Database: A database is an organized collection of logically related data, usually
designed to meet the information needs of multiple users in an organization. It is
important to distinguish between the database and the repository. The
repository contains definitions of data, whereas the database contains
occurrences of data.
• Application programs: Application programs are computer programs that are
used to create and maintain the database and provide information to users.
• User interface: Languages, menus, and other facilities by which users
interact with various system components, such as CASE tools, application
programs, the DBMS, and the repository.
• Data administrators: Persons who are responsible for the overall
information resources of an organization. Data administrators use CASE
tools to improve the productivity of database planning and design.
• System developers: Persons such as systems analysts and programmers
who design new application programs. System developers often use
CASE tools for system requirements analysis and program design.
• End users: Persons throughout the organization who add, delete, and
modify data in the database and who request or receive information
from it. All user interactions with the database must be routed through
the DBMS.
Ranges of Database Applications
The range of database applications can be divided into five categories: Personal
databases, workgroup databases, department databases, enterprise databases,
and Internet, Intranet, and Extranet databases.
Personal Databases:
Personal databases are designed to support one user. Personal databases
have long resided on personal computers (PCs), including laptops. Recently the
introduction of personal digital assistants (PDAs) has incorporated personal
databases into handheld devices that not only function as computing devices but
also as cellular phones, fax senders, and Web browsers.
• Personal databases are widely used because they can often improve personal
productivity. However, they entail a risk: The data cannot easily be shared with
other users. For this reason, personal databases should be limited to those
rather special situations (such as in a very small organization) where the need to
share the data among users of the personal database is unlikely to arise.
Workgroup Database:
A workgroup is a relatively small team of people who collaborate
on the same project or application or on a group of similar projects or
applications. A workgroup typically comprises fewer than 25 persons.
A workgroup database is designed to support the collaborative efforts
of such a team.
• The data in this database are shared as follows.
Each member of the workgroup has a desktop computer and
the computers are linked by means of a local area network (LAN). The
database is stored on a central device called the database server,
which is also connected to the network. Thus each member of the
workgroup has access to the shared data.
[Figure: Workgroup database with local area network. Labels: Project Manager, Developer 1 to Developer n, Librarian, Database server, Workgroup database.]
Department Databases:
A department is a functional unit within an organization. Typical examples
of department are personnel, marketing, manufacturing, and accounting.
A department is generally larger than a workgroup (typically between 25
and 100 persons) and is responsible for a more diverse range of functions.
• Department databases are designed to support the various functions and
activities of a department.
Enterprise Databases
An enterprise database is one whose scope is the entire
organization or enterprise (or, at least, many different departments).
Such databases are intended to support organization-wide operations
and decision making. An enterprise database does, however, support
information needs from many departments. Over the last decade, the
evolution of enterprise databases has resulted in two major
developments:
1. Enterprise resource planning (ERP) systems
2. Data warehousing implementations.
[Figure: An enterprise data warehouse. Labels: Branch Office-1, Branch Office-2, Branch Office-5.]
Internet, Intranet and Extranet Databases
Two-Tier Architecture
The two-tier architecture is a client–server architecture in which the
client contains the presentation code and the SQL statements for data
access. The database server processes the SQL statements and sends
query results back to the client. Two-tier client/server provides a basic
separation of tasks. The client, or first tier, is primarily responsible for
the presentation of data to the user and the “server,” or second tier, is
primarily responsible for supplying data services to the client.
Presentation Services
“Presentation services” refers to the portion of the application that presents data
to the user. It also provides the mechanisms through which the user interacts
with the data. Put more simply, presentation logic defines and interacts with the user
interface. The presentation of the data should generally not contain any validation rules.
Application Services
“Application services” provide other functions necessary for the application.
Business Services/objects
“Business services” are a category of application services. Business services encapsulate
an organization’s business processes and requirements. These rules are derived from the
steps necessary to carry out day-to-day business in an organization. They can be
validation rules, used to make sure that the incoming information is of a valid type and
format, or they can be process rules, which ensure that the proper business process is
followed in order to complete an operation.
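The two kinds of business rule can be sketched as small functions. This is a minimal illustration in Python; the order fields, the status names and the allowed transitions are hypothetical and are not taken from the text.

```python
# Hypothetical process: an order must move through these states in sequence.
ALLOWED_TRANSITIONS = {
    "new": {"approved"},
    "approved": {"shipped"},
    "shipped": set(),
}

def validate_order(order):
    """Validation rule: incoming data must have a valid type and format."""
    errors = []
    if not isinstance(order.get("quantity"), int) or order["quantity"] <= 0:
        errors.append("quantity must be a positive integer")
    if not str(order.get("customer_id", "")).isdigit():
        errors.append("customer_id must be numeric")
    return errors

def can_advance(current_status, next_status):
    """Process rule: the business process must be followed in order."""
    return next_status in ALLOWED_TRANSITIONS.get(current_status, set())

print(validate_order({"quantity": 3, "customer_id": "42"}))
print(can_advance("new", "shipped"))
```

Keeping such rules in one place, rather than in the presentation code, is what allows them to be changed without touching every client.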
Data Services
“Data services” provide access to data independent of their location. The data can come
from a legacy mainframe, an SQL RDBMS, or a proprietary data access system.
Advantages of Two-tier Architecture
• The two-tier architecture is a good approach for systems with stable
requirements and a moderate number of clients.
• The two-tier architecture is the simplest to implement, due to the
number of good commercial development environments.
Drawbacks of Two-tier Architecture
• Software maintenance can be difficult because PC clients contain a
mixture of presentation, validation, and business logic code.
• To make a significant change in the business logic, code must be
modified on many PC clients.
• Moreover the performance of two-tier architecture can be poor when a
large number of clients submit requests because the database server
may be overwhelmed with managing messages.
• With a large number of simultaneous clients, three-tier architecture may
be necessary.
Three-tier Architecture
Some of the popular DBMS vendors and their corresponding products are as follows:
Vendor        Product
IBM           DB2/MVS, DB2/UDB, DB2/400, Informix Dynamic Server (IDS)
Microsoft     Access, SQL Server, Desktop Edition (MSDE)
Open Source   MySQL, PostgreSQL
Oracle        Oracle DBMS, RDB
Sybase        Adaptive Server Enterprise (ASE), Adaptive Server Anywhere (ASA), Watcom
Different views of Database/Abstraction
• Attribute
• Relationship
Classification of Entity Sets
Entity Set
Strong Entity
A strong entity is one whose existence does not depend on any other entity.
Example
• Consider the example "student takes course". Here student is a strong
entity.
Ternary Relationship
In a ternary relationship, three entities are simultaneously involved. Ternary
relationships are required when binary relationships are not sufficient to
accurately describe the semantics of an association among three entities.
Specialization and Generalization (or) Characteristics of Supertypes/Subtypes
• In the above relation, Cust-Id is the primary key, so all of the remaining
attributes are functionally dependent on it. However, there is a
transitive dependency: the attribute Region is functionally dependent on the
attribute Salesperson-Id. We therefore decompose the relation into new
relations that satisfy third normal form.
CUSTOMER
Cust-Id Cust-Name Salesperson-Id
SALES
Salesperson-Id Region
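The decomposition into CUSTOMER and SALES can be checked with a short sketch: the region is stored once per salesperson, and a join rebuilds the original rows. This assumes Python's built-in sqlite3 module; the column spellings and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Region depends only on the salesperson, so it lives in its own table.
    CREATE TABLE sales (
        salesperson_id INTEGER PRIMARY KEY,
        region         TEXT NOT NULL
    );
    CREATE TABLE customer (
        cust_id        INTEGER PRIMARY KEY,
        cust_name      TEXT NOT NULL,
        salesperson_id INTEGER REFERENCES sales(salesperson_id)
    );
    INSERT INTO sales VALUES (10, 'South'), (20, 'North');
    INSERT INTO customer VALUES (1, 'Gopal', 10), (2, 'Ravi', 10), (3, 'Sita', 20);
""")

# Joining the two relations recovers the original, unnormalized rows.
rows = conn.execute("""
    SELECT c.cust_id, c.cust_name, c.salesperson_id, s.region
    FROM customer AS c
    JOIN sales AS s ON c.salesperson_id = s.salesperson_id
    ORDER BY c.cust_id
""").fetchall()
print(rows)
```

Because each region appears only once in `sales`, changing a salesperson's region is a single-row update instead of one update per customer.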
ADVISER
Subject Adviser
2) Fourth Normal Form:
A relation is in fourth normal form if it is in Boyce-Codd normal form and contains
no multi-valued dependency. Here a multi-valued dependency is taken to exist when
a non-key attribute is functionally dependent on two or more sets of primary key
attributes. (Formally, a multi-valued dependency X →→ Y holds when each value of X
determines a set of Y values independently of the remaining attributes.)
• For example, consider a relation STUDENT with the attributes St-No, St-Name,
Course-Id and Grade. Here the primary key is a composite key of St-No, St-Name and
Course-Id. The non-key attribute Grade is functionally dependent on
(St-No, Course-Id) and on (St-Name, Course-Id).
STUDENT
St-No   St-Name   Course-Id   Grade
St-No, Course-Id → Grade
St-Name, Course-Id → Grade
The above relation contains a multi-valued dependency, so it is not in fourth
normal form. To avoid the multi-valued dependency, we decompose the relation
into new relations.
St-No   St-Name
St-No   Course-Id   Grade
Fifth Normal Form (5NF), also called Projection-Join Normal Form
A relation is in fifth normal form if it is in fourth normal form and
contains no join dependency. Here a join dependency means that the
relation has a minimum of three attributes and every attribute may be
functionally dependent on the remaining attributes. For example, consider
a relation CLASS with the attributes Subject, Teacher and Text-book. Here the
primary key is a composite key of Subject, Teacher and Text-book. The
relation CLASS is not in fifth normal form because it contains a join
dependency, so we decompose it into new relations.
Subject Teacher Text-book
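(i) One-to-One
[Figure: One-to-One mapping between entity sets A (a1 to a4) and B]
• Example:
[Figure: mapping diagram with A entities a1, a2 and B entities b2 to b5, labelled One-to-One in the source]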
(iii) Many – to – One An entity in A is associated with at most one entity
in B. An entity in B however, can be associated with any number of
entities in A.
[Figure: Many-to-One mapping from entity set A (a1 to a5) to entity set B (b1 to b3)]
• Example: PRODUCT and Product_Line, related by Contains (Many-to-One)
(iv) Many-to-Many An entity in A is associated with any number of entities in
B and an entity in B is associated with any number of entities in A.
[Figure: Many-to-Many mapping between entity sets A (a1 to a4) and B (b1 to b4)]
Example: STUDENT Registers-for COURSE
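In a relational schema, a many-to-many relationship such as STUDENT registers for COURSE is usually stored in a third table that pairs the two keys. This is a minimal sketch, assuming Python's built-in sqlite3 module; the table `registers_for`, the column names and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (st_no INTEGER PRIMARY KEY, st_name TEXT);
    CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT);

    -- One row per (student, course) pair captures the many-to-many link.
    CREATE TABLE registers_for (
        st_no     INTEGER REFERENCES student(st_no),
        course_id TEXT REFERENCES course(course_id),
        PRIMARY KEY (st_no, course_id)
    );

    INSERT INTO student VALUES (1, 'Gopal'), (2, 'Sita');
    INSERT INTO course VALUES ('C1', 'DBMS'), ('C2', 'Accounting');
    INSERT INTO registers_for VALUES (1, 'C1'), (1, 'C2'), (2, 'C1');
""")

# One student is associated with many courses ...
courses_of_student_1 = [
    row[0]
    for row in conn.execute(
        "SELECT course_id FROM registers_for WHERE st_no = 1 ORDER BY course_id"
    )
]
# ... and one course is associated with many students.
students_in_c1 = [
    row[0]
    for row in conn.execute(
        "SELECT st_no FROM registers_for WHERE course_id = 'C1' ORDER BY st_no"
    )
]
print(courses_of_student_1, students_in_c1)
```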
Entity Relationship model constructs(ER Model)
• Weak entity: An entity set that does not have sufficient attributes to form a primary
key is termed a weak entity set. Equivalently, an entity type whose
existence depends on some other entity type is called a weak entity type. A weak
entity type has no meaning in the E-R diagram without the entity on which it
depends. The entity type on which the weak entity type depends is called the
“identifying owner”, or simply the “owner”. A weak entity is denoted by a
double-lined box.
Identifying Relationship: The relationship that associates the weak entity set
with its owner is the identifying relationship.
Attributes: A property or characteristic of an entity type that is of interest to
the organization is called an attribute. An attribute is denoted by an ellipse.
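• Ex:
[Figure: FACULTY entity with attributes F_Id, Name (F_Name, M_Name, L_Name), Qualification, Dob and Skill]
• In the above example F_Id, Name, Dob, Age and Skill are attributes of FACULTY.
[Figure: E-R diagram relating employees and courses; attribute labels shown: Employee-id, Employee-name, Birthdate, Course-id, Course-title, Topic]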
• This is a many-to-many relationship, since each employee may complete any number
of courses, while a given course may be completed by any number of employees.
• Associative Entities: An associative entity is an entity type that associates the
instances of one or more entity types and contains attributes that are peculiar to
the relationship between those entity instances.
• In the E-R model, associative entities are represented with the diamond
relationship symbol enclosed within an entity box. The purpose of this symbol is to
preserve the information that the entity was initially specified as a relationship in the
E-R model.
Different Keys
Keys: It is important to be able to specify how rows in a relation are
distinguished. Conceptually, rows are distinct from one another, but from a
database perspective the difference among them must be expressed in terms
of their attributes. Keys serve this purpose.
Primary key: A primary key is a set of one or more attributes that can
uniquely identify a tuple within the relation.
• Ex: In our sample database, supp # is the primary key for the suppliers relation
(table), as it contains a unique value for each tuple in the relation.
Composite Primary Key: In some tables, a combination of more than one
attribute provides a unique value for each row. In such tables, the group of
these attributes is declared as the primary key. Because the primary key then
consists of more than one attribute, it is called a composite primary key.
• Ex: supp # and item # together form the primary key for the shipments relation (table).
Candidate Key: All attribute combinations inside a relation that can serve as the primary
key are candidate keys, as they are candidates for the primary key position.
• Ex: In our sample database, there are two candidate keys, supp # and sup-name, in
the suppliers relation. Both of these attributes contain unique values for each
tuple.
Super Key
A super key is a column or set of columns that uniquely identifies a row
within a table.
• Ex:
Given table: Employees { Employee_Id, First_Name, Surname, Sal }
Possible super keys are:
• { Employee_Id }
• { Employee_Id, First_Name }
• { Employee_Id, First_Name, Surname }
• { Employee_Id, First_Name, Surname, Sal }
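The relationship between super keys and candidate keys can be explored on sample data. The sketch below is illustrative only: the rows are hypothetical, and uniqueness in a sample can rule a key out but cannot prove that it holds for every possible row.

```python
from itertools import combinations

columns = ("employee_id", "first_name", "surname", "sal")
# Hypothetical rows; two employees share name, surname and salary.
rows = [
    (1, "Asha", "Rao", 30000),
    (2, "Asha", "Rao", 30000),
    (3, "Ravi", "Nair", 45000),
]

def is_unique(cols):
    """True if the given columns distinguish every row in the sample."""
    positions = [columns.index(c) for c in cols]
    seen = {tuple(row[i] for i in positions) for row in rows}
    return len(seen) == len(rows)

# Every column combination that identifies rows uniquely is a super key.
super_keys = [
    set(cols)
    for size in range(1, len(columns) + 1)
    for cols in combinations(columns, size)
    if is_unique(cols)
]

# A candidate key is a super key with no smaller super key inside it.
candidate_keys = [
    key for key in super_keys
    if not any(other < key for other in super_keys)
]
print(len(super_keys), candidate_keys)
```

On this sample every super key contains `employee_id`, including the four sets listed in the text, and the only minimal one is `{employee_id}` itself.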
Secondary Key: It is defined as a key that is used strictly for data retrieval
purposes.
Ex: Suppose customer data are stored in a CUSTOMER table in which the
customer number is the primary key, and suppose that some customers
forget their number. Data retrieval for a customer can then be facilitated by
using the customer’s last name and phone number. In that case, the primary
key is the customer number, and the secondary key is the combination of the
customer’s last name and phone number.
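A secondary key is often supported by an index on the retrieval columns. This is a minimal sketch, assuming Python's built-in sqlite3 module; the column names, the index name `idx_customer_name_phone` and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        cust_no   INTEGER PRIMARY KEY,
        last_name TEXT,
        phone     TEXT
    );
    -- Secondary key: a non-unique index used only to speed up retrieval.
    CREATE INDEX idx_customer_name_phone ON customer (last_name, phone);
    INSERT INTO customer VALUES
        (1001, 'Rao', '555-0101'),
        (1002, 'Rao', '555-0102'),
        (1003, 'Nair', '555-0103');
""")

def find_customer_numbers(last_name, phone):
    """Look up customer numbers by the secondary key (last name + phone)."""
    rows = conn.execute(
        "SELECT cust_no FROM customer WHERE last_name = ? AND phone = ?",
        (last_name, phone),
    )
    return [row[0] for row in rows]

print(find_customer_numbers("Rao", "555-0102"))
```

Unlike the primary key, the secondary key is not required to be unique, which is why the lookup returns a list.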
Foreign Key: A non-key attribute whose values are derived from the primary
key of some other table is known as a foreign key in its current table.
Data dictionary and System catalog
Data Dictionary:
An integral part of an RDBMS is the data dictionary, which stores
metadata (information about the database), including attribute names
and definitions for each table in the database. The data dictionary is usually
a part of the system catalog that is generated for each database.
• The system catalog describes all database objects, including
table-related data such as table names, table creators (or owners), column
names and data types, foreign keys and primary keys, index files,
authorized users, user access privileges and so forth. The system catalog is
created by the DBMS and the information is stored in system tables, which
may be queried in the same manner as any other data table, if the user has
sufficient access privileges.
• The system catalog automatically produces database documentation. As
new tables are added to the database, that documentation also allows
the RDBMS to check for and eliminate homonyms and synonyms.
• Homonyms are similar-sounding words with different
meanings, such as boar and bore, or identically spelled words with
different meanings, such as fair (meaning “just”) and fair (meaning “festival”).
• In a database context, the word homonym indicates the
use of the same attribute name to label different attributes. To lessen
confusion, you should avoid database homonyms; the data dictionary is
very useful in this regard.
• A synonym is the opposite of a homonym and indicates the use of
different names to describe the same attribute. For example, car and auto
refer to the same object. Synonyms must be avoided.
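The idea that the system catalog can be queried like any other table can be tried out in SQLite, whose catalog table is `sqlite_master`. This is a sketch under that assumption, using Python's built-in sqlite3 module; other DBMS products expose their catalogs under different names, and the `supplier` table here is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE supplier (supp_no INTEGER PRIMARY KEY, supp_name TEXT NOT NULL)"
)

# The catalog is itself a table and is queried with ordinary SQL.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, default, pk).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(supplier)")
]
print(tables, columns)
```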
Integrity constraints
Integrity Rules:
Relational database integrity rules are very important to good
database design. Many RDBMSs enforce integrity rules automatically.
However, it is much safer to make sure that your application design
conforms to the entity and referential integrity rules.
Entity Integrity:
All primary key entries are unique, and no part of a primary key may be
null. Each row will have a unique identity, and foreign key values can
properly reference primary key values.
Referential Integrity:
A foreign key may have either a null entry, as long as it is not a part of its table’s
primary key, or an entry that matches the primary key value in a table to which it is
related. (Every non-null foreign key value must reference an existing primary key
value.)
Integrity Constraints: The relational data model includes several types of integrity
constraints. The purpose of integrity constraints is to implement the business rules
in the database. The main types of integrity constraints are described below.
1. Domain Integrity Constraints:
A domain is the set of values that may be assigned to an attribute. All of the
values that appear in a column of a table must be taken from the same
domain. A domain definition consists of items such as column name, data type, size and
allowable values. The domain integrity constraints are 1) Not Null and 2) Check. The
Not Null constraint is used to avoid null values. The Check constraint is used to
specify a condition for an attribute. In the relational data model a NULL value is not
equal to zero or to an empty string, and one null value is not equal to another null value.
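The Not Null and Check constraints, and the behaviour of NULL in comparisons, can be demonstrated briefly. This is a minimal sketch, assuming Python's built-in sqlite3 module; the `student` table, its rule that fees may not be negative, and the helper `try_insert` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        fees    INTEGER CHECK (fees >= 0)
    )
""")

def try_insert(values):
    """Return 'ok' if the row is accepted, 'rejected' if a constraint fails."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", values)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert((1, "Gopal", 20000)),  # satisfies both constraints
    try_insert((2, None, 20000)),     # violates NOT NULL on name
    try_insert((3, "Ravi", -5)),      # violates CHECK on fees
]

# Comparing NULL with NULL yields NULL (unknown), which Python sees as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]
null_equals_zero = conn.execute("SELECT NULL = 0").fetchone()[0]
print(results, null_equals_null, null_equals_zero)
```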
Entity Integrity Constraints:
Entity integrity constraints are mainly of two types: (i) Primary
key and (ii) Unique.
• Every primary key attribute is non-null and contains unique values.
In some cases a particular attribute cannot be assigned a data value. There
are two situations where this is likely to occur: either there is no
applicable data value, or the applicable data value is not known. In this
case we use the entity integrity constraint Unique.
Referential Integrity Constraints:
In the relational data model, associations between tables are
defined with the help of referential integrity constraints (foreign keys). For
example, the association between the CUSTOMER and ORDER tables is
identified by including the Customer-Id attribute as a foreign key in the ORDER
table.
CUSTOMER
Customer-Id (primary key)   Cust-Name   Add
ORDER
Order-Id   Order description   Customer-Id (foreign key)
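The CUSTOMER and ORDER association can be tried out as follows. This is a minimal sketch, assuming Python's built-in sqlite3 module; the table is named `orders` because ORDER is a reserved word in SQL, and the column spellings, sample rows and helper `place_order` are hypothetical. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        cust_name   TEXT,
        addr        TEXT
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        description TEXT,
        customer_id INTEGER REFERENCES customer(customer_id)
    );
    INSERT INTO customer VALUES (1, 'Gopal', 'Hyderabad');
""")

def place_order(order_id, description, customer_id):
    """Return True if the order row is accepted by the database."""
    try:
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)",
            (order_id, description, customer_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

ok_existing = place_order(100, "Books", 1)    # matches an existing customer
ok_missing = place_order(101, "Pens", 99)     # no customer 99 exists
ok_null = place_order(102, "Paper", None)     # a null foreign key is allowed
print(ok_existing, ok_missing, ok_null)
```

The three calls correspond to the referential integrity rule stated earlier: a foreign key value must either match an existing primary key value or be null.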