
UNIT1DBMS

A database management system (DBMS) is designed to efficiently store, retrieve, and manage interrelated data relevant to an enterprise while ensuring data integrity and security. It consists of two main components: the database itself and the management system that governs data access and manipulation. DBMS applications span various sectors including banking, education, and manufacturing, addressing issues like data redundancy, access difficulties, and security concerns that arise in traditional file-processing systems.

Uploaded by

naveneetha27

UNIT - I

Introduction to database systems:

A database-management system (DBMS) is a collection of
interrelated data and a set of programs to access those data.
The collection of data, usually referred to as the database,
contains information relevant to an enterprise. The primary goal
of a DBMS is to provide a way to store and retrieve database
information that is both convenient and efficient.
Database systems are designed to manage large bodies of
information. Management of data involves both defining
structures for storage of information and providing mechanisms
for the manipulation of information. In addition, the database
system must ensure the safety of the information stored,
despite system crashes or attempts at unauthorized access.

Introduction to Database Management System


As the name suggests, a database management system
consists of two parts:
1. Database
2. Management System
What is a Database?
To find out what a database is, we have to start from data, which
is the basic building block of any DBMS.
Data: Facts, figures, statistics etc. that have no particular
meaning on their own (e.g. 1, ABC, 19).
Record: Collection of related data items. In the above
example the three data items had no meaning on their own, but
if we organize them in the following way, then they collectively
represent meaningful information.
Roll  Name  Age
1     ABC   19
Table or Relation: Collection of related records.
Roll  Name  Age
1     ABC   19
2     DEF   22
3     XYZ   28
The columns of this relation are called Fields or Attributes;
the set of values that an attribute may take is called its
Domain. The rows are called Tuples or Records.
Database: Collection of related relations. Consider the
following collection of tables:
T1
Roll  Name  Age
1     ABC   19
2     DEF   22
3     XYZ   28
T2
Roll  Address
1     KOL
2     DEL
3     MUM
T3
Roll  Year
1     I
2     II
3     I
T4
Year  Hostel
I     H1
II    H2
We now have a collection of 4 tables. They can be called a
“related collection” because selected pairs of tables share
common attributes (Roll in T1, T2 and T3; Year in T3 and T4).
Because of these common attributes we may combine
the data of two or more tables to find out the complete
details of a student, even though attributes such as Age and
Hostel are in different tables.
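
Combining the four tables through their common attributes can be sketched as a join. The example below is only an illustration using Python's built-in sqlite3 module; the table names student, address, study_year and hostel are assumptions standing in for T1 to T4.

```python
import sqlite3

# In-memory database holding the four example tables (T1 to T4).
# The table names are illustrative assumptions, not part of the notes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student    (roll INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE address    (roll INTEGER, address TEXT);
    CREATE TABLE study_year (roll INTEGER, year TEXT);
    CREATE TABLE hostel     (year TEXT, hostel TEXT);
    INSERT INTO student    VALUES (1, 'ABC', 19), (2, 'DEF', 22), (3, 'XYZ', 28);
    INSERT INTO address    VALUES (1, 'KOL'), (2, 'DEL'), (3, 'MUM');
    INSERT INTO study_year VALUES (1, 'I'), (2, 'II'), (3, 'I');
    INSERT INTO hostel     VALUES ('I', 'H1'), ('II', 'H2');
""")

def student_details(roll):
    """Join all four tables on their common attributes (Roll and Year)."""
    return conn.execute(
        """
        SELECT s.roll, s.name, s.age, a.address, y.year, h.hostel
        FROM student s
        JOIN address a    ON a.roll = s.roll
        JOIN study_year y ON y.roll = s.roll
        JOIN hostel h     ON h.year = y.year
        WHERE s.roll = ?
        """,
        (roll,),
    ).fetchone()

print(student_details(2))
```

For roll 2 the query brings Age (from T1) and Hostel (from T4) together in one row, although no single table stores both.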
A database in a DBMS can be viewed by many different
people with different responsibilities.
Figure 1.1: Employees accessing data through a DBMS

For example, within a company there are different departments,
as well as customers, who each need to see different kinds of
data. Each employee in the company will have a different level
of access to the database, with their own customized front-end
application.
In a relational database, data is organized strictly in row and
column format. The rows are called Tuples or Records. The
data items within one row may belong to different data
types. The columns, on the other hand, are called Attributes or
Fields, and all the data items within a single attribute are of
the same data type.
What is a Management System?

A database-management system (DBMS) is a collection of
interrelated data and a set of programs to access those data.
By data, we mean known facts that can be recorded
and that have implicit meaning; a collection of related data with
an implicit meaning is a database. The collection of data,
usually referred to as the database, contains information
relevant to an enterprise. The primary goal of a DBMS is to
provide a way to store and retrieve database information that is
both convenient and efficient.

The management system is important because without
some kind of rules and regulations it is not
possible to maintain the database. We have to decide which
attributes should be included in a particular
table, which common attributes create the relationship between
two tables, and which tables have to be handled when a record
is inserted or deleted. These issues must be
resolved by having rules to follow in order to
maintain the integrity of the database.

Database systems are designed to manage large bodies of
information. Management of data involves both defining
structures for storage of information and providing mechanisms
for the manipulation of information. In addition, the database
system must ensure the safety of the information stored,
despite system crashes or attempts at unauthorized access. If
data are to be shared among several users, the system must
avoid possible anomalous results.

Because information is so important in most organizations,
computer scientists have developed a large body of concepts
and techniques for managing data.

Applications of DBMS:
Database Management System (DBMS) and Its
Applications:
A database management system is a computerized
record-keeping system. It is a repository or container for a
collection of computerized data files. The overall purpose of a
DBMS is to allow the users to define, store, retrieve and update
the information contained in the database on demand.
Information can be anything that is of significance to an
individual or organization.

Databases touch all aspects of our lives. Some of the
major areas of application are as follows:
1. Banking
2. Airlines
3. Universities
4. Manufacturing and selling
5. Human resources

Enterprise Information

◦ Sales: For customer, product, and purchase information.
◦ Accounting: For payments, receipts, account balances, assets
and other accounting information.
◦ Human resources: For information about employees, salaries,
payroll taxes, and benefits, and for generation of paychecks.
◦ Manufacturing: For management of the supply chain and for
tracking production of items in factories, inventories of items
in warehouses and stores, and orders for items.
◦ Online retailers: For sales data noted above plus online order
tracking, generation of recommendation lists, and maintenance
of online product evaluations.
Banking and Finance

◦ Banking: For customer information, accounts, loans, and
banking transactions.
◦ Credit card transactions: For purchases on credit cards and
generation of monthly statements.
◦ Finance: For storing information about holdings, sales, and
purchases of financial instruments such as stocks and bonds;
also for storing real-time market data to enable online trading
by customers and automated trading by the firm.
• Universities: For student information, course registrations, and
grades (in addition to standard enterprise information such as
human resources and accounting).
• Airlines: For reservations and schedule information. Airlines
were among the first to use databases in a geographically
distributed manner.
• Telecommunication: For keeping records of calls made,
generating monthly bills, maintaining balances on prepaid
calling cards, and storing information about the communication
networks.

Purpose of Database Systems


Database systems arose in response to early methods of
computerized management of commercial data. As an example
of such methods, typical of the 1960s, consider part of a
university organization that, among other data, keeps
information about all instructors, students, departments, and
course offerings. One way to keep the information on a
computer is to store it in operating system files. To allow users
to manipulate the information, the system has a number of
application programs that manipulate the files, including
programs to:
• Add new students, instructors, and courses
• Register students for courses and generate class rosters
• Assign grades to students, compute grade point averages
(GPA), and generate transcripts
System programmers wrote these application programs to meet
the needs of the university.
New application programs are added to the system as the need
arises. For example, suppose that a university decides to
create a new major (say, computer science). As a result, the
university creates a new department and creates new
permanent files (or adds information to existing files) to record
information about all the instructors in the department, students
in that major, course offerings, degree requirements, etc. The
university may have to write new application programs to deal
with rules specific to the new major. New application programs
may also have to be written to handle new rules in the
university. Thus, as time goes by, the system acquires more
files and more application programs.
This typical file-processing system is supported by a
conventional operating system. The system stores permanent
records in various files, and it needs different application
programs to extract records from, and add records to, the
appropriate files. Before database management systems
(DBMSs) were introduced, organizations usually stored
information in such systems. Keeping organizational
information in a file-processing system has a number of major
disadvantages:

Data redundancy and inconsistency. Since different
programmers create the files and application programs over a
long period, the various files are likely to have different
structures and the programs may be written in several
programming languages. Moreover, the same information may
be duplicated in several places (files).

For example, if a student has a double major (say, music and
mathematics) the address and telephone number of that
student may appear in a file that consists of student records of
students in the Music department and in a file that consists of
student records of students in the Mathematics department.
This redundancy leads to higher storage and access cost.
In addition, it may lead to data inconsistency; that is, the
various copies of the same data may no longer agree. For
example, a changed student address may be reflected in the
Music department records but not elsewhere in the system.

Difficulty in accessing data. Suppose that one of the
university clerks needs to find out the names of all students
who live within a particular postal-code area. The clerk asks the
data-processing department to generate such a list. Because
the designers of the original system did not anticipate this
request, there is no application program on hand to meet it.
There is, however, an application program to generate the list
of all students.

The university clerk now has two choices: either obtain the list
of all students and extract the needed information manually, or
ask a programmer to write the
necessary application program. Both alternatives are obviously
unsatisfactory. Suppose that such a program is written, and
that, several days later, the same clerk needs to trim that list to
include only those students who have taken at least 60 credit
hours. As expected, a program to generate such a list does not
exist. Again, the clerk has the preceding two options, neither of
which is satisfactory. The point here is that conventional file-
processing environments do not allow needed data to be
retrieved in a convenient and efficient manner. More responsive
data-retrieval systems are required for general use.
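
With a database query language, both of the clerk's requests become small variations of one declarative query rather than new application programs. The following is a minimal sketch using Python's built-in sqlite3 module; the column name postal_code and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# postal_code is a hypothetical column; the notes do not name it.
conn.execute(
    "CREATE TABLE student (ID TEXT, name TEXT, postal_code TEXT, tot_cred INTEGER)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        ("S1", "Asha", "700001", 72),   # illustrative sample data
        ("S2", "Ravi", "700001", 45),
        ("S3", "Meena", "110001", 90),
    ],
)

def students_in_area(postal_code, min_credits=0):
    """Names of students in a postal-code area with at least min_credits."""
    rows = conn.execute(
        "SELECT name FROM student "
        "WHERE postal_code = ? AND tot_cred >= ? ORDER BY name",
        (postal_code, min_credits),
    )
    return [name for (name,) in rows]

print(students_in_area("700001"))        # first request: everyone in the area
print(students_in_area("700001", 60))    # later request: at least 60 credit hours
```

The second request only changes a condition in the query; no new program has to be written by a programmer.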

Data isolation. Because data are scattered in various files, and
files may be in different formats, writing new application
programs to retrieve the appropriate data is difficult.

Integrity problems. The data values stored in the database
must satisfy certain types of consistency constraints.
Suppose the university maintains an account for each
department, and records the balance amount in each account.
Suppose also that the university requires that the account
balance of a department may never fall below zero. Developers
enforce these constraints in the system by adding appropriate
code in the various application programs. However, when new
constraints are added, it is difficult to change the programs to
enforce them. The problem is compounded when constraints
involve several data items from different files.
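
In a database system the same rule can be declared once, with the data, instead of being coded into every application program. A minimal sketch, assuming SQLite through Python's sqlite3 module and a department table with dept_name and balance columns (the helper names withdraw and balance_of are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK clause states the consistency constraint once, in the schema.
conn.execute("""
    CREATE TABLE department (
        dept_name TEXT PRIMARY KEY,
        balance   NUMERIC NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO department VALUES ('A', 1000)")
conn.commit()

def withdraw(dept_name, amount):
    """Try to reduce a balance; the DBMS rejects any update below zero."""
    try:
        conn.execute(
            "UPDATE department SET balance = balance - ? WHERE dept_name = ?",
            (amount, dept_name),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()   # constraint violated: leave the data unchanged
        return False

def balance_of(dept_name):
    return conn.execute(
        "SELECT balance FROM department WHERE dept_name = ?", (dept_name,)
    ).fetchone()[0]
```

Here withdraw('A', 400) succeeds, while a following withdraw('A', 700) is refused by the database itself, whichever program issues it.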

Atomicity problems. A computer system, like any other
device, is subject to failure. In many applications, it is crucial
that, if a failure occurs, the data be restored to the consistent
state that existed prior to the failure.

Consider a program to transfer $500 from the account balance
of department A to the account balance of department B. If a
system failure occurs during the execution of the program, it is
possible that the $500 was removed from the balance of
department A but was not credited to the balance of
department B, resulting in an inconsistent database state.
Clearly, it is essential to database consistency that either both
the credit and debit occur, or that neither occur.
That is, the funds transfer must be atomic—it must happen in
its entirety or not at all. It is difficult to ensure atomicity in a
conventional file-processing system.
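
A database transaction gives exactly this all-or-nothing behaviour. The sketch below uses Python's sqlite3 module and simulates the failure with an exception raised between the debit and the credit; a real crash is handled by the DBMS recovery mechanism, which this example does not show. The function names and the fail_midway flag are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY, balance NUMERIC)")
conn.executemany("INSERT INTO department VALUES (?, ?)", [("A", 1000), ("B", 200)])
conn.commit()

def transfer(src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one transaction (both updates or neither)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE department SET balance = balance - ? WHERE dept_name = ?",
                (amount, src),
            )
            if fail_midway:
                # Simulated failure after the debit but before the credit.
                raise RuntimeError("simulated failure")
            conn.execute(
                "UPDATE department SET balance = balance + ? WHERE dept_name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False

def balances():
    return dict(conn.execute("SELECT dept_name, balance FROM department"))
```

After transfer('A', 'B', 500, fail_midway=True) the balances are unchanged because the debit is rolled back; a normal transfer('A', 'B', 500) applies both updates together.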

Security problems. Not every user of the database system
should be able to access all the data. For example, in a
university, payroll
personnel need to see only that part of the database that has
financial information. They do not need access to information
about academic records. But, since application programs are
added to the file-processing system in an ad hoc manner,
enforcing such security constraints is difficult.
These difficulties, among others, prompted the development of
database systems. In what follows, we shall see the concepts
and algorithms that enable database systems to solve the
problems with file-processing systems.

Advantages of DBMS:
Controlling Redundancy: Data redundancy refers to the
duplication of data (i.e. storing the same data multiple times).
In a database system, having a centralized database and
centralized control of data by the DBA avoids unnecessary
duplication of data. This also eliminates the extra time
needed to process the duplicated data, and it saves
storage space.
Improved Data Sharing: A DBMS allows data to be shared
among any number of users and application programs.

Data Integrity: Integrity means that the data in the database is
accurate. Centralized control of the data permits the
administrator to define integrity constraints on the data in the
database. For example, in a customer database we can
enforce a constraint that customers are accepted only from
the cities of Noida and Meerut.

Security: Having complete authority over the operational data
enables the DBA to ensure that the only means of access to
the database is through proper channels. The DBA can define
authorization checks to be carried out whenever access to
sensitive data is attempted.

Data Consistency: By eliminating data redundancy, we
greatly reduce the opportunities for inconsistency. For example,
if a customer address is stored only once, we cannot have
disagreement on the stored values. Also, updating data values
is greatly simplified when each value is stored in one place
only. Finally, we avoid the wasted storage that results from
redundant data storage.

Efficient Data Access: In a database system, the data is
managed by the DBMS and all access to the data is through
the DBMS, which is a key to effective data processing.

Enforcement of Standards: With centralized control of data,
the DBA can establish and enforce data standards, which may
include naming conventions, data quality standards etc.

Data Independence: In a database system, the database
management system provides the interface between the
application programs and the data. When changes are made to
the data representation, the metadata maintained by the DBMS
is changed, but the DBMS continues to provide the data to
application programs in the previously used way. The DBMS
handles the task of transforming the data wherever necessary.

Reduced Application Development and Maintenance Time:
A DBMS supports many important functions that are common to
many applications accessing data stored in the DBMS, which
facilitates the quick development of applications.

Disadvantages of DBMS
1) It is complex. Because it supports a wide range of
functionality, the underlying software has become
complex. Designers and developers need thorough
knowledge of the software to get the most out of it.
2) Because of its complexity and functionality, it takes up a
large amount of storage, and it also needs a large amount of
memory to run efficiently.
3) A DBMS typically works as a centralized system, i.e. all the
users access the same database. Hence any failure of the
DBMS will impact all the users.
4) A DBMS is generalized software, i.e. it is written to work
across many systems rather than for one specific application.
Hence some applications may run slowly.

View of Data
A database system is a collection of interrelated data and a set
of programs that allow users to access and modify these data.
A major purpose of a database system is to provide users with
an abstract view of the data. That is, the system hides certain
details of how the data are stored and maintained.
Data Abstraction
For the system to be usable, it must retrieve data efficiently.
The need for efficiency has led designers to use complex data
structures to represent data in the database. Since many
database-system users are not computer trained, developers
hide the complexity from users through several levels of
abstraction, to simplify users’ interactions with the system:
Figure 1.2 : Levels of Abstraction in a DBMS

• Physical level (or Internal View / Schema): The lowest level
of abstraction describes how the data are actually stored. The
physical level describes complex low-level data structures in
detail.

• Logical level (or Conceptual View / Schema): The
next-higher level of abstraction describes what data are stored
in the
database, and what relationships exist among those data. The
logical level thus describes the entire database in terms of a
small number of relatively simple structures. Although
implementation of the simple structures at the logical level may
involve complex physical-level structures, the user of the logical
level does not need to be aware of this complexity. This is
referred to as physical data independence. Database
administrators, who must decide what information to keep in
the database, use the logical level of abstraction.
• View level (or External View / Schema): The highest level of
abstraction describes only part of the entire database. Even
though the logical level uses simpler structures, complexity
remains because of the variety of information stored in a large
database. Many users of the database system do not need all
this information; instead, they need to access only a part of the
database. The view level of abstraction exists to simplify their
interaction with the system. The system may provide many
views for the same database. Figure 1.2 shows the relationship
among the three levels of abstraction.
An analogy to the concept of data types in programming
languages may clarify the distinction among levels of
abstraction. Many high-level programming languages support
the notion of a structured type. For example, we may describe
a record as follows:
type instructor = record
    ID : char (5);
    name : char (20);
    dept_name : char (20);
    salary : numeric (8,2);
end;

This code defines a new record type called instructor with four
fields. Each field has a name and a type associated with it. A
university organization may have several such record types,
including

• department, with fields dept_name, building, and budget
• course, with fields course_id, title, dept_name, and credits
• student, with fields ID, name, dept_name, and tot_cred

At the physical level, an instructor, department, or student
record can be described as a block of consecutive storage
locations. The compiler hides this level of detail from
programmers. Similarly, the database system hides many of
the lowest-level storage details from database programmers.
Database administrators, on the other hand, may be aware of
certain details of the physical organization of the data.
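
To make the idea of "a block of consecutive storage locations" concrete, the sketch below packs one instructor record into a fixed-length run of bytes with Python's struct module. The layout is an assumption chosen for illustration (fixed-width character fields followed by an 8-byte float for salary); an actual DBMS uses its own storage format, and the sample values are invented.

```python
import struct

# Assumed layout: ID char(5), name char(20), dept_name char(20),
# salary stored as an 8-byte float; "<" means no padding between fields.
INSTRUCTOR = struct.Struct("<5s20s20sd")

def pack_instructor(ID, name, dept_name, salary):
    """Physical level: the record as one block of consecutive bytes."""
    return INSTRUCTOR.pack(ID.encode(), name.encode(), dept_name.encode(), salary)

def unpack_instructor(block):
    """Logical level: the same record seen as four named fields."""
    ID, name, dept_name, salary = INSTRUCTOR.unpack(block)
    text = lambda raw: raw.rstrip(b"\x00").decode()  # drop the zero padding
    return text(ID), text(name), text(dept_name), salary

block = pack_instructor("10101", "Srinivasan", "Comp. Sci.", 65000.0)
print(INSTRUCTOR.size, len(block))
print(unpack_instructor(block))
```

A program that calls unpack_instructor works with field names only; the byte offsets stay hidden inside the two helper functions, much as the database system hides storage details from programmers.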

At the logical level, each such record is described by a type
definition, as in the previous code segment, and the
interrelationship of these record types is defined as well.
Programmers using a programming language work at this level
of abstraction. Similarly, database administrators usually work
at this level of abstraction.

Finally, at the view level, computer users see a set of
application programs that hide details of the data types. At the
view level, several views of the database are defined, and a
database user sees some or all of these views. In addition to
hiding details of the logical level of the database, the views also
provide a security mechanism to prevent users from accessing
certain parts of the database. For example, clerks in the
university registrar office can see only that part of the database
that has information about students; they cannot access
information about salaries of instructors.
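
A view that leaves out sensitive columns can be sketched as follows with Python's sqlite3 module. The view name instructor_public and the sample row are hypothetical. Note also that SQLite has no per-user permissions, so this shows only how a view hides data; in a multi-user DBMS the administrator would additionally restrict access to the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary NUMERIC);
    INSERT INTO instructor VALUES ('10101', 'Srinivasan', 'Comp. Sci.', 65000);

    -- View-level schema: the same data, without the salary column.
    CREATE VIEW instructor_public AS
        SELECT ID, name, dept_name FROM instructor;
""")

def column_names(relation):
    """Column names of a table or view (relation is a trusted name here)."""
    cursor = conn.execute(f"SELECT * FROM {relation}")
    return [column[0] for column in cursor.description]

print(column_names("instructor"))
print(column_names("instructor_public"))
print(conn.execute("SELECT * FROM instructor_public").fetchall())
```

Users who are given only instructor_public can look up instructors and their departments, but the salary column does not exist from their point of view.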

Components and structures:

A database system provides a data-definition language (DDL)
to specify the database schema and a data-manipulation
language (DML) to express database queries and updates. In
practice, the data-definition and data-manipulation languages
are not two separate languages; instead they simply form parts
of a single database language, such as the widely used SQL
language.

Data-Manipulation Language

A data-manipulation language (DML) is a language that
enables users to access data as organized by the appropriate
data model. The types of access are:
• Retrieval of information stored in the database
• Insertion of new information into the database
• Deletion of information from the database
• Modification of information stored in the database
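
The four types of access correspond to the SQL statements INSERT, UPDATE, DELETE and SELECT. A minimal sketch using Python's sqlite3 module and the Roll/Name/Age table from earlier in these notes (the table name student is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema.
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# DML, insertion of new information.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "ABC", 19), (2, "DEF", 22), (3, "XYZ", 28)],
)

# DML, modification of stored information.
conn.execute("UPDATE student SET age = 23 WHERE roll = 2")

# DML, deletion of information.
conn.execute("DELETE FROM student WHERE roll = 3")

# DML, retrieval of information (a query).
rows = conn.execute("SELECT roll, name, age FROM student ORDER BY roll").fetchall()
print(rows)
```

The CREATE TABLE statement belongs to the DDL part of SQL and the other four statements to the DML part, which illustrates how both form parts of one language.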

Structure of Database Management System

A database management system is software
that allows access to data stored in a database and provides
an easy and effective method of:
• Defining the information.
• Storing the information.
• Manipulating the information.
• Protecting the information from system crashes or data
theft.
• Differentiating access permissions for different users.

Please note that the Structure of Database Management
System is also referred to as Overall System
Structure or Database Architecture, but it is different from
the tier architecture of a database.

The database system is divided into three components: Query
Processor, Storage Manager, and Disk Storage. These are
explained below.

Architecture of DBMS

1. Query Processor:
It interprets the requests (queries) received from the end user
via an application program into instructions. It also executes
the user request that is received from the DML compiler.
The Query Processor contains the following components –
• DML Compiler –
It translates DML statements into low-level instructions
that the query-evaluation engine can execute.

• DDL Interpreter –
It processes DDL statements into a set of tables
containing metadata (data about data).

• Embedded DML Pre-compiler –
It converts DML statements embedded in an application
program into procedure calls.

• Query Optimizer –
It chooses an efficient way of evaluating the query from
among the alternatives; the chosen plan is then executed.

2. Storage Manager:
The Storage Manager is a program that provides an interface
between the data stored in the database and the queries
received. It is also known as the Database Control System. It
maintains the consistency and integrity of the database by
applying the constraints, and it executes the DCL statements.
It is responsible for updating, storing, deleting, and retrieving
data in the database.
It contains the following components –
• Authorization Manager –
It ensures role-based access control, i.e. it checks whether
a particular person is privileged to perform the requested
operation.

• Integrity Manager –
It checks the integrity constraints when the database is
modified.

• Transaction Manager –
It controls concurrent access by scheduling the operations
of the transactions it receives. Thus, it
ensures that the database remains in a consistent state
before and after the execution of a transaction.

• File Manager –
It manages the file space and the data structures used to
represent information in the database.

• Buffer Manager –
It is responsible for the buffer (cache) in main memory and
for the transfer of data between secondary storage and main
memory.

3. Disk Storage: It contains the following components –
• Data Files –
They store the data itself.

• Data Dictionary –
It contains information about the structure of every
database object. It is the repository of metadata.

• Indices –
They provide faster retrieval of data items.
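
The data dictionary and the indices can be observed directly in a small system. The sketch below uses SQLite through Python's sqlite3 module as an example: SQLite keeps its catalog in a table called sqlite_master, and EXPLAIN QUERY PLAN reports the evaluation plan the optimizer picked. Other DBMSs expose the same ideas under different names, and the index name idx_student_name is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE INDEX idx_student_name ON student (name)")

# Data dictionary: DDL statements end up as rows of metadata in the catalog.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()
print(catalog)

# Query optimizer and indices: ask which plan was chosen for a lookup by name.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT roll FROM student WHERE name = 'ABC'"
).fetchall()
for row in plan:
    print(row[-1])  # the last column is a human-readable plan description
```

The catalog lists both the table and its index, and the plan description should mention idx_student_name, showing that the lookup goes through the index instead of scanning every row.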

Explain the components of DBMS?

The database management system (DBMS) software is divided
into several components. Each component performs a
specific operation. Some of the functions of the DBMS are
supported by the operating system.
The DBMS accepts SQL commands generated
from a variety of user interfaces, produces query
evaluation plans, executes these plans against the database,
and returns the answers.
Let’s have a look at the major software components of a DBMS −
Components

The components of the DBMS are as follows −


• DBA − The Database Administrator (DBA) is responsible
for creating the DBMS structure and has the ability to
control that structure.
• Application Programs − They are used to create, change
and update the records. They are mainly useful in
designing the interface.
• DML processor − It handles Data Manipulation Language
statements: it updates and manipulates data based on user
requests and checks the statements against the syntax of SQL.
• DDL processor − It handles Data Definition Language
statements, which define the structure of the database. It
rejects improper statements by checking their syntax
according to SQL.
• Data Dictionary − It stores the metadata describing the
database objects. Queries are checked against it; if a query
is valid it proceeds, otherwise an error is generated.
• Integrity Checker − It checks the constraints defined
by the database administrator, such as primary or unique
keys.
• Authentication control − It checks
whether a user is valid or not.
• Command Processor − It processes the query as SQL.
For example, SQL -> Oracle -> optimize -> generate file.
• Query optimizer − It rewrites the query into a more
efficient form to reduce the response time.
• Transaction manager − It manages
the changes made by transactions.
• Scheduler − When a number of requests arrive at the same
time, it forms a queue according to arrival time.
• Buffer manager − It performs storage
management operations.
• Recovery manager − It restores the
data after a failure and manages the log files or
recovery files.
• Query processor − It processes the queries
coming from the user side. Its responsibility is to manage
DML and DDL commands.

Database Users and Administrators

A primary goal of a database system is to retrieve information
from and store new information into the database. People who
work with a database can be categorized as database users or
database administrators.

Database Users and User Interfaces


There are four different types of database-system users,
differentiated by the way they expect to interact with the
system. Different types of user interfaces have been designed
for the different types of users.
• Naïve users are unsophisticated users who interact with the
system by invoking one of the application programs that have
been written previously.

For example, a clerk in the university who needs to add a new
instructor to department A invokes a program called new_hire.
This program asks the clerk for the name of the new instructor,
her new ID, the name of the department (that is, A), and the
salary.

The typical user interface for naïve users is a forms interface,
where the user can fill in appropriate fields of the form. Naïve
users may also simply read reports generated from the
database.

As another example, consider a student who, during the class
registration period, wishes to register for a class by using a
Web interface. Such a user connects to a Web application
program that runs at a Web server. The application first verifies
the identity of the user, and allows her to access a form where
she enters the desired information. The form information is sent
back to the Web application at the server, which then
determines if there is room in the class (by retrieving
information from the database) and if so adds the
student information to the class roster in the database.
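
The database part of the registration example (check for room, then add the student to the roster) can be sketched as below with Python's sqlite3 module. The tables section and roster, their columns and the function register are hypothetical, and identity verification and the Web form are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE section (course_id TEXT PRIMARY KEY, capacity INTEGER);
    CREATE TABLE roster  (course_id TEXT, student_id TEXT,
                          PRIMARY KEY (course_id, student_id));
    INSERT INTO section VALUES ('CS-101', 2);
""")

def register(student_id, course_id):
    """Add the student to the class roster only if the class has room."""
    with conn:  # the check and the insert are committed together
        section = conn.execute(
            "SELECT capacity FROM section WHERE course_id = ?", (course_id,)
        ).fetchone()
        if section is None:
            return False            # no such class
        (enrolled,) = conn.execute(
            "SELECT COUNT(*) FROM roster WHERE course_id = ?", (course_id,)
        ).fetchone()
        if enrolled >= section[0]:
            return False            # class is full
        conn.execute("INSERT INTO roster VALUES (?, ?)", (course_id, student_id))
        return True
```

With a capacity of 2, the first two calls to register for 'CS-101' succeed and a third is refused. The student never writes a query; the application program does so on her behalf.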

• Application programmers are computer professionals who
write application programs. Application programmers can
choose from many tools to develop user interfaces. Rapid
application development (RAD) tools are tools that enable an
application programmer to construct forms and reports with
minimal programming effort.

• Sophisticated users interact with the system without writing
programs. Instead, they form their requests either using a
database query language or by using tools such as data
analysis software. Analysts who submit queries to explore data
in the database fall in this category.

• Specialized users are sophisticated users who write
specialized database applications that do not fit into the
traditional data-processing framework.

Among these applications are computer-aided design systems,
knowledge-base and expert systems, systems that store data
with complex data types (for example, graphics data and audio
data), and environment-modeling systems.

Database Administrator

One of the main reasons for using DBMSs is to have central
control of both the data and the programs that access those
data. A person who has such central control over the system is
called a database administrator (DBA). The functions of a
DBA include:

• Schema definition. The DBA creates the original database
schema by executing a set of data definition statements in the
DDL.

• Storage structure and access-method definition.

• Schema and physical-organization modification.
The DBA carries out changes to the schema and physical
organization to reflect the changing needs of the organization,
or to alter the physical organization to improve performance.
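
These DBA functions map onto DDL statements. The sketch below, using SQLite through Python's sqlite3 module, defines a schema, later modifies it, and adds an index as a change to the physical organization. The added column phone and the index name are hypothetical, and PRAGMA table_info is SQLite's own way of listing columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema definition: the original schema, written in the DDL.
conn.execute(
    "CREATE TABLE department (dept_name TEXT PRIMARY KEY, building TEXT, budget NUMERIC)"
)

# Schema modification: the organization now needs one more attribute.
conn.execute("ALTER TABLE department ADD COLUMN phone TEXT")

# Physical-organization modification: an index to speed up lookups by building.
conn.execute("CREATE INDEX idx_department_building ON department (building)")

def columns(table):
    """Column names as recorded by SQLite (table is a trusted name here)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

print(columns("department"))
```

Existing queries that use only dept_name, building and budget keep working after both changes, which is the practical benefit of data independence.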

History of database systems:

Databases are a foundational element of the modern world. We
interact with them even without knowing it: any time we buy
something online, log in to a service, access our bank
accounts, and so on.

The concept of a database existed long before computers. In
those times, data was stored in journals, in libraries, and in
hundreds of filing cabinets. Everything was recorded on paper,
which meant it took up space, was hard to find, and was
difficult to back up.

Then computers became available, and with them, the
opportunity for better data management.

The 1960s – beginnings

The history of databases begins with the two earliest


computerised examples. Charles Bachman designed the first
computerised database in the early 1960s. This first database
was known as the Integrated Data Store, or IDS. This was
shortly followed by the Information Management System, a
database created by IBM.
Both databases were forerunners of the ‘navigational
database’.

Navigational databases required users to navigate through the


entire database to find the information they wanted. There were
two main models of this: the hierarchical model, and the
network model.

The hierarchical model was developed by IBM. In it, data is


organised like a family tree. Each data entry has a parent
record, starting with a root record.

The network model, meanwhile, was specified by the
Conference on Data Systems Languages (CODASYL). It
differed from the hierarchical model in that it allowed a record to
have more than one parent and child record.

The 1970s – relational databases

Perhaps one of the most influential events in the history of


databases came in the 1970s. It was in this decade that E. F.
Codd would release his paper “A Relational Model of Data for
Large Shared Data Banks”. This paper coined the term
‘relational database’ at the start of the decade, and sparked
development of this new way to store and access data.

A relational database is one that shows the relationship


between different data records. Unlike their navigational
counterparts, relational databases would be searchable. They
would also be more space-efficient, meaning reduced data
storage costs.

What followed was the creation of INGRES by Michael
Stonebraker and Eugene Wong at the University of California,
Berkeley. INGRES, short for Interactive Graphics and Retrieval
System, was a relational database system that proved the
viability of Codd’s ideas. INGRES used a query language called
QUEL. IBM then released their take on a relational database.
Known as System R, it was the first in the history of databases
to use structured query language (SQL).

The 1980s – growth and standardisation

The 1980s in the history of databases marked a time of growth.


Particularly, it was the time of growth for the relational database
model. Earlier navigational models faded, while the
commercialisation of relational systems saw this type of
database rise in use and popularity.

The 1980s also saw SQL become the standard language used
for databases, which we still use today.

Another noteworthy event for the history of databases was the


emergence of Object-oriented database management systems
(OODBMS). This concept appeared in the mid-80s. Object
databases would view data as ‘objects’. They would work
with programming languages that supported the ‘object-
oriented’ approach.

The 1990s – the internet

The early days of object-oriented database management did
not see the idea as a popular one. This was partially due to the
cost and time it would take to rewrite existing databases to
support the approach. However, object-oriented database
systems grew more popular in the 90s.

Another key event impacting the history of databases in the 90s


was the creation of the World Wide Web. High investments in
online businesses fuelled demand for client-server database
systems. As such, the internet helped to power exponential
growth of the database industry in the 1990s.
A notable outcome of this was the creation of MySQL in
1995, which was open source. This meant that it provided
an alternative to the database systems offered by big
companies like Oracle and Microsoft. MySQL is still used by
many today.

The 2000s – NoSQL

In 1998, the term NoSQL (not only structured query language)
was coined. It refers to databases that use query languages
other than SQL to store and retrieve data. NoSQL databases
are useful for unstructured data, and they saw growth in the
2000s.

This is a notable development in the history of databases
because NoSQL allowed for faster processing of larger, more
varied datasets. NoSQL databases are more flexible than the
traditional relational databases that had risen in earlier
decades.

The 2010s – distributed databases and cybersecurity

The 2010s were a decade of increased data awareness, with
the rise of big data and an increased emphasis on data
protection. These trends naturally inform the history of
databases.

Having earned its name the decade before, big data was a
major buzzword of the 2010s, and big data meant big
databases to house it. With the need to collect, organise and
make use of such huge reams of data, automation software
has grown into a popular tool for interacting with databases.
This is the decade where the value of data truly hit the
public consciousness and, with it, the importance of
keeping data safe. Legislation like GDPR and the NIS
directive only served to further highlight the importance of
keeping data, and so databases, well protected and secure.

The history of databases

The history of databases is a rich one, stretching as far back as
the advent of the computer as we know it today. Databases
have grown alongside computers, and changed immensely
since their inception in the 1960s.

Now, we can only wait to see what the future holds for the
evolution of databases.

Data Models

A data model is a way of modeling the data description, data
semantics, and consistency constraints of the data. It provides
the conceptual tools for describing the design of a database at
each level of data abstraction. The following four data models
are used for understanding the structure of a database:

1) Relational Data Model: This type of model organizes data in
the form of rows and columns within a table. Thus, a relational
model uses tables to represent both the data and the
relationships among the data. Tables are also called relations.
This model was initially described by Edgar F. Codd in 1969. The
relational data model is widely used, primarily by commercial
data processing applications.

2) Entity-Relationship Data Model: An ER model is the logical
representation of data as objects and relationships among
them. These objects are known as entities, and a relationship is
an association among these entities. This model was designed
by Peter Chen and published in a 1976 paper. It is widely
used in database design. A set of attributes describes an
entity. For example, student_name and student_id describe the
'student' entity. A set of entities of the same type is known as
an 'entity set', and a set of relationships of the same type is
known as a 'relationship set'.

3) Object-based Data Model: An extension of the ER model
with notions of functions, encapsulation, and object identity.
This model supports a rich type system that includes
structured and collection types. In the 1980s, various
database systems following the object-oriented approach were
developed. Here, objects are data items that carry their own
properties.

4) Semistructured Data Model: This type of data model is
different from the other three data models explained above.
The semistructured data model permits data specifications in
which individual data items of the same type may
have different sets of attributes. The Extensible Markup
Language, also known as XML, is widely used for representing
semistructured data. Although XML was initially designed
for adding markup information to text documents, it
has gained importance because of its application in the
exchange of data.
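To make the semistructured idea concrete, the following is a minimal sketch in Python using the standard library's XML parser. The element names (students, student, name, phone, email) and the sample values are assumptions chosen for illustration; they do not come from any particular system. It shows two items of the same type that carry different sets of attributes.

```python
import xml.etree.ElementTree as ET

# Two <student> items of the same type with different attribute sets:
# the first has a phone, the second has an email instead.
# All element names and values here are illustrative assumptions.
doc = """
<students>
  <student id="1"><name>Asha</name><phone>555-0101</phone></student>
  <student id="2"><name>Ravi</name><email>ravi@example.com</email></student>
</students>
""".strip()

root = ET.fromstring(doc)

# Map each student id to the sorted list of fields that student carries.
fields = {
    s.get("id"): sorted(child.tag for child in s)
    for s in root.findall("student")
}
print(fields)
```

A relational table would need a fixed set of columns for both students; here each item simply lists the fields it has.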

Introduction of ER Model

Introduction

ER Model stands for Entity-Relationship Model. It is
a high-level data model that shows the relationships among
entity sets. The ER model is used to define the entities and the
relationships between them.

It helps developers produce the conceptual design, or you can
say the logical design, of the system from a data perspective.
The ER model describes the structure of a database with the
help of a diagram, which is known as the Entity-Relationship
Diagram (ER Diagram).

What is an ER Diagram?
ER diagrams are used to sketch out the design of a database.
By defining the entities and their attributes, and showing the
relationships between them, an ER diagram illustrates the
logical structure of a database. ER diagrams are based
on three basic concepts: entities, attributes, and relationships.

An ER diagram has three main components:

• Entity
• Attribute
• Relationship

Entity

Any object that physically exists or is logically constructed in
the real world is called an entity. It is a real-world object that
can be easily identified. An entity is represented as a
rectangle in an ER diagram.
Example – In an organization, employees, managers, and
assigned projects can be considered entities. All these entities
have some attributes or properties that give them their identity.

Here, in the above example, employee and project are entities.


Entities are of two types:

• Strong Entity – Strong entities are those entity types that
have a key attribute. The primary key helps in identifying
each entity uniquely. Unlike a unique key, a primary key
cannot accept null values. A strong entity is represented by a
rectangle.

Example – In the organization example, emp_id identifies
each employee of the organization uniquely and hence, we can
say that employee is a strong entity type.
• Weak Entity – Weak entity type doesn’t have a key
attribute. Weak entity types can’t be identified on their
own. It depends upon some other strong entity for its
distinct identity. A weak entity is represented by a double
outlined rectangle. The relationship between a weak entity
type and strong entity type is shown with a double outlined
diamond instead of a single outlined diamond. This
representation can be seen in the example given below.

Here we cannot identify an address uniquely, as there can be
many employees from the same locality. So we need
an attribute of the strong entity type, i.e. ‘employee’ here, to
uniquely identify entities of the ‘Address’ entity type.

The ER model is used to model the logical view of the system
from a data perspective, and consists of the following
components.

Entity, Entity Type, Entity Set –

An entity may be an object with a physical existence (a
particular person, car, house, or employee) or it may be an
object with a conceptual existence (a company, a job, or a
university course).
An entity is an object of an entity type, and the set of all
entities of that type is called an entity set. For example, E1 is
an entity having entity type Student, and the set of all students
is called an entity set. In an ER diagram, an entity type is
represented by a rectangle.

Attribute(s):
Attributes are the properties which define the entity type.
For example, Roll_No, Name, DOB, Age, Address, Mobile_No
are the attributes which define entity type Student. In an ER
diagram, an attribute is represented by an oval.

1. Key Attribute –
The attribute which uniquely identifies each entity in the
entity set is called the key attribute. For example, Roll_No will
be unique for each student. In an ER diagram, a key attribute is
represented by an oval with the attribute name underlined.

2. Composite Attribute –
An attribute composed of several other attributes is called a
composite attribute. For example, the Address attribute of the
Student entity type consists of Street, City, State, and Country.
In an ER diagram, a composite attribute is represented by an
oval connected to further ovals.

3. Multivalued Attribute –
An attribute that can hold more than one value for a given
entity. For example, Phone_No (can be more than one for a given
student). In an ER diagram, a multivalued attribute is
represented by a double oval.

4. Derived Attribute –
An attribute which can be derived from other attributes of
the entity type is known as a derived attribute, e.g. Age (can be
derived from DOB). In an ER diagram, a derived attribute is
represented by a dashed oval.

The complete entity type Student with its attributes can be


represented as:
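As a rough code analogue of the Student entity type, the sketch below models the four kinds of attribute in Python. The class layout, the field names, and the sample values are assumptions made for illustration only; an ER diagram itself does not prescribe any implementation.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    """Composite attribute: made up of several simpler attributes."""
    street: str
    city: str
    state: str
    country: str


@dataclass
class Student:
    roll_no: int                    # key attribute: unique per student
    name: str
    dob: date
    address: Address                # composite attribute
    phone_nos: list[str] = field(default_factory=list)  # multivalued

    def age(self, today: date) -> int:
        """Derived attribute: computed from dob instead of being stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)


# Hypothetical sample entity.
s = Student(
    roll_no=1,
    name="Asha",
    dob=date(2004, 5, 20),
    address=Address("MG Road", "Chennai", "Tamil Nadu", "India"),
    phone_nos=["555-0101", "555-0102"],
)
print(s.age(date(2024, 5, 20)))
```

Keeping Age as a method rather than a stored field mirrors the idea that a derived attribute can always be recomputed from the attributes it depends on.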
Relationship Type and Relationship Set:

A relationship type represents the association between
entity types. For example, ‘Enrolled in’ is a relationship type
that exists between entity types Student and Course. In an ER
diagram, a relationship type is represented by a diamond
connected to the participating entities with lines.

A set of relationships of the same type is known as a
relationship set. For example, a relationship set may record
that S1 is enrolled in C2, S2 is enrolled in C1 and S3 is
enrolled in C3.
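One simple way to picture this relationship set is as a set of (student, course) pairs. The sketch below assumes that representation; the helper names courses_of and students_in are hypothetical and exist only for this example.

```python
# Relationship set 'Enrolled in' as a set of (student, course) pairs,
# using the S1/C2, S2/C1, S3/C3 example from the text.
enrolled_in = {("S1", "C2"), ("S2", "C1"), ("S3", "C3")}


def courses_of(student):
    """Return the courses the given student is enrolled in, sorted."""
    return sorted(c for s, c in enrolled_in if s == student)


def students_in(course):
    """Return the students enrolled in the given course, sorted."""
    return sorted(s for s, c in enrolled_in if c == course)


print(courses_of("S1"))
print(students_in("C1"))
```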

Degree of a relationship set:

The number of different entity sets participating in a
relationship set is called the degree of the relationship set.

1. Unary Relationship –
When there is only ONE entity set participating in a
relationship, it is called a unary relationship. For
example, one person is married to only one person.
2. Binary Relationship –
When there are TWO entity sets participating in a relationship,
it is called a binary relationship. For example,
Student is enrolled in Course.

3. n-ary Relationship –
When there are n entity sets participating in a relationship, it
is called an n-ary relationship.
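The definition above can be sketched as a small lookup, following the notes' convention that degree counts the different entity sets involved. The relationship names and the ternary 'supplies' example are assumptions added for illustration.

```python
# Each relationship type lists the entity set filling each of its roles.
# 'supplies' is a hypothetical ternary relationship added for illustration.
relationship_types = {
    "married_to": ["Person", "Person"],            # same set in both roles
    "enrolled_in": ["Student", "Course"],
    "supplies": ["Supplier", "Part", "Project"],
}


def degree(name):
    """Number of different entity sets participating in the relationship."""
    return len(set(relationship_types[name]))


for rel in relationship_types:
    print(rel, degree(rel))
```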

Cardinality:
The number of times an entity of an entity set participates
in a relationship set is known as cardinality. Cardinality can
be of different types:
1. One to one – When each entity in each entity set
can take part only once in the relationship, the
cardinality is one to one. Let us assume that a male
can marry one female and a female can marry
one male. So the relationship will be one to one.
Using sets, it can be represented as:

2. Many to one – When entities in one entity set can take
part only once in the relationship set and entities in the other
entity set can take part more than once in the relationship
set, the cardinality is many to one. Let us assume that a student
can take only one course but one course can be taken by
many students. So the cardinality will be n to 1. It means that
for one course there can be n students but for one student,
there will be only one course.

Using sets, it can be represented as:

In this case, each student is taking only 1 course but 1 course
has been taken by many students.
3. Many to many – When entities in all entity sets can take
part more than once in the relationship, the cardinality is many
to many. Let us assume that a student can take more than one
course and one course can be taken by many students. So the
relationship will be many to many.

Using sets, it can be represented as:

In this example, student S1 is enrolled in C1 and C3, and
course C3 has students S1, S3 and S4 enrolled in it. So it is a
many to many relationship.
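The three cases can be illustrated with a small function that inspects a relationship set stored as (left, right) pairs. This is only a sketch: in a real design, cardinality is a constraint declared on the schema, whereas this hypothetical cardinality function merely reports what a given sample of pairs exhibits.

```python
from collections import Counter


def cardinality(pairs):
    """Classify the pattern shown by a set of (left, right) pairs."""
    left_counts = Counter(a for a, _ in pairs)
    right_counts = Counter(b for _, b in pairs)
    left_many = any(n > 1 for n in left_counts.values())    # a left entity repeats
    right_many = any(n > 1 for n in right_counts.values())  # a right entity repeats
    if left_many and right_many:
        return "many-to-many"
    if right_many:
        return "many-to-one"   # many left entities share one right entity
    if left_many:
        return "one-to-many"
    return "one-to-one"


# Sample data is illustrative; the last set follows the S1/C1/C3 example.
married = {("M1", "F1"), ("M2", "F2")}
one_course_each = {("S1", "C1"), ("S2", "C1"), ("S3", "C2")}
free_choice = {("S1", "C1"), ("S1", "C3"), ("S3", "C3"), ("S4", "C3")}

print(cardinality(married))
print(cardinality(one_course_each))
print(cardinality(free_choice))
```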
Participation Constraint:
A participation constraint is applied to an entity set
participating in a relationship set.
1. Total Participation – Each entity in the entity set must
participate in the relationship. If each student must enroll in a
course, the participation of Student will be total. Total
participation is shown by a double line in an ER diagram.
2. Partial Participation – An entity in the entity set may or
may NOT participate in the relationship. If some courses have
no students enrolled, the participation of Course
will be partial.
The diagram depicts the ‘Enrolled in’ relationship set with the
Student entity set having total participation and the Course
entity set having partial participation.
Using sets, it can be represented as:

Every student in the Student entity set participates in the
relationship, but there exists a course C4 which does not take
part in the relationship.
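A minimal sketch of this check, assuming entity sets are Python sets and the relationship set is a set of pairs. The sample members and the participation helper are hypothetical; only the unenrolled course C4 is taken from the text.

```python
# Entity sets and the 'Enrolled in' relationship set (sample data).
students = {"S1", "S2", "S3"}
courses = {"C1", "C2", "C3", "C4"}
enrolled_in = {("S1", "C1"), ("S2", "C2"), ("S3", "C3")}


def participation(entity_set, participants):
    """'total' if every entity takes part in the relationship, else 'partial'."""
    return "total" if entity_set <= participants else "partial"


student_participation = participation(students, {s for s, _ in enrolled_in})
course_participation = participation(courses, {c for _, c in enrolled_in})

print(student_participation)   # every student is enrolled somewhere
print(course_participation)    # C4 has no students
```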
Weak Entity Type and Identifying Relationship:

As discussed before, an entity type has a key attribute which
uniquely identifies each entity in the entity set. But there
exist some entity types for which a key attribute can’t be
defined. These are called weak entity types.
For example, a company may store the information of
dependents (parents, children, spouse) of an employee. But
the dependents don’t have existence without the employee. So
Dependent will be a weak entity type and Employee will be the
identifying entity type for Dependent.
A weak entity type is represented by a double rectangle. The
participation of a weak entity type is always total. The
relationship between a weak entity type and its identifying
strong entity type is called the identifying relationship and it is
represented by a double diamond.
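When such a design is mapped to tables, a weak entity is commonly given a composite key that includes the key of its identifying entity. The sketch below uses Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL
    );
    -- Weak entity: a dependent is identified only together with its employee.
    CREATE TABLE dependent (
        emp_id   INTEGER NOT NULL REFERENCES employee(emp_id) ON DELETE CASCADE,
        dep_name TEXT NOT NULL,
        relation TEXT,
        PRIMARY KEY (emp_id, dep_name)
    );
    INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO dependent VALUES (1, 'Kiran', 'Child'), (2, 'Kiran', 'Spouse');
""")

# Two dependents share the name 'Kiran'; only (emp_id, dep_name) tells them apart.
# Deleting employee 1 also removes that employee's dependents.
conn.execute("DELETE FROM employee WHERE emp_id = 1")
remaining = conn.execute("SELECT emp_id, dep_name FROM dependent").fetchall()
print(remaining)
```

The ON DELETE CASCADE clause is one way to express that a dependent has no existence without its employee.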
Additional Features of ER model/Conceptual design with
ER model:

Although the basic E-R concepts can model most database
features, some aspects of a database may be more aptly
expressed by certain extensions to the basic E-R model. The
extended E-R features are specialization, generalization,
higher- and lower-level entity sets, attribute inheritance, and
aggregation.

Specialization – An entity set may be broken down into
sub-entities that are distinct in some way from other entities in
the set. For instance, a subset of entities within an entity set
may have attributes that are not shared by all the entities in the
entity set. The E-R model provides a means for representing
these distinctive entity groupings.

Specialization is a “top-down approach” where a high-level
entity is specialized into two or more lower-level entities.

Example – Consider an entity set vehicle, with attributes color
and no. of tires. A vehicle may be further classified as one of
the following:

• Car
• Bike
• Bus
Each of these vehicle types is described by a set of attributes
that includes all the attributes of the entity set vehicle plus
possibly additional attributes. For example, car entities may be
described further by the attribute gear, whereas bike entities
may be described further by the attribute automatic brake. The
process of designating subgroupings within an entity set is
called specialization. The specialization of vehicles allows us to
distinguish among vehicles according to whether they are cars,
buses, or bikes.

Generalization – It is a process of extracting common
properties from a set of entities and creating a generalized
entity from them. Generalization is a “bottom-up approach”, in
which two or more entities can be combined to form a higher-
level entity if they have some attributes in common.

In generalization, subclasses are combined to make a
superclass.

Example: There are three entities given: car, bus, and bike.
They all have some common attributes: all cars, buses, and
bikes have a number of tires and a color. So they
can all be grouped to make a superclass named vehicle.

Inheritance – An entity that is a member of a subclass inherits
all the attributes of the superclass. The entity also inherits all
the relationships that the superclass participates in.
Inheritance is an important feature of generalization and
specialization. It allows lower-level entities
to inherit the attributes of higher-level entities.

Example – Cars, bikes, and buses inherit the attributes of a
vehicle. Thus, a car is described by its color and no. of tires,
and additionally a gear attribute; a bike is described by its color
and no. of tires attributes, and additionally an automatic brake
attribute.
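Attribute inheritance in the vehicle example maps naturally onto class inheritance. The following is a minimal sketch in Python; the field types and sample values are assumptions, since the notes name the attributes but not their types.

```python
from dataclasses import dataclass, fields


@dataclass
class Vehicle:
    """Higher-level entity set (superclass)."""
    color: str
    no_of_tires: int


@dataclass
class Car(Vehicle):
    """Lower-level entity set: inherits color and no_of_tires, adds gear."""
    gear: str


@dataclass
class Bike(Vehicle):
    """Lower-level entity set: inherits color and no_of_tires, adds a brake flag."""
    automatic_brake: bool


car = Car(color="red", no_of_tires=4, gear="manual")
bike = Bike(color="black", no_of_tires=2, automatic_brake=True)

# Inherited attributes come first, followed by the subclass's own attribute.
car_attrs = [f.name for f in fields(Car)]
bike_attrs = [f.name for f in fields(Bike)]
print(car_attrs)
print(bike_attrs)
```

Reading the classes from Vehicle down to Car and Bike corresponds to specialization; designing Car and Bike first and then factoring out Vehicle corresponds to generalization.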

Aggregation – In aggregation, the relationship between two
entities is treated as a single entity. The
relationship, together with its corresponding entities, is
aggregated into a higher-level entity.
Example – phone numbers on your mobile phone. You can refer
to them individually: your mother’s number, your best friend’s
number, etc. But it’s easier to think of them collectively, as your
phone number list. It is also important to realize that each
member of the aggregation keeps its own properties.
In other words, each phone number in the list remains a phone
number. The process of combining them has not altered them
in any way.

How To Create ER diagram in DBMS

Following are the steps to create an ER Diagram


CONCEPTUAL DESIGN WITH E-R MODEL
An E-R diagram can express the overall logical structure of a
database graphically. E-R diagrams are simple and clear,
qualities that may well account in large part for the widespread
use of the E-R model. Such a diagram consists of the following
major components:

• Rectangles, which represent entity sets
• Ellipses, which represent attributes
• Diamonds, which represent relationship sets
• Lines, which link attributes to entity sets and entity sets to
relationship sets
• Double ellipses, which represent multivalued attributes
• Dashed ellipses, which denote derived attributes
• Double lines, which indicate total participation of an entity in a
relationship set
• Double rectangles, which represent weak entity sets

Let’s study them with an Entity Relationship Diagram Example:

In an organization, an employee is assigned to projects. An
employee must be assigned to at least one project.
Each project is managed by a single manager. To maintain
quality, a manager can control only one project.

Step 1) Entity Identification: We have three entities

• Employee
• Project
• Manager
Step 2) Relationship Identification: We have the following
two relationships

• An employee is assigned a project
• A manager controls a project

Step 3) Cardinality Identification: From the problem
statement we know that,

• An employee can be assigned multiple projects
• A manager can manage only one project

Step 4) Identify Attributes – Initially, it’s important to identify
the attributes without mapping them to a particular entity. Once
you have a list of attributes, you need to map them to the
identified entities. Ensure each attribute is paired with
exactly one entity. If you think an attribute should belong to
more than one entity, use a modifier to make it unique.

Once the mapping is done, identify the primary keys. If a
unique key is not readily available, create one.

Entity      Primary Key    Attribute
Employee    Employee_ID    EmployeeName
Manager     Manager_ID     ManagerName
Project     Project_ID     ProjectName

For the sake of ease, we have considered just one attribute.

Step 5) Create the ER Diagram – A more modern
representation of the Entity Relationship Diagram example

Why use ER Diagrams?

Here are the prime reasons for using an ER diagram:

• It helps you to define terms related to entity relationship
modeling.
• It provides a preview of how all your tables should
connect and what fields are going to be in each table.
• It helps to describe entities, attributes, and relationships.
• ER diagrams are translatable into relational tables, which
allows you to build databases quickly.
• ER diagrams can be used by database designers as a
blueprint for implementing data in specific software
applications.
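As an illustration of translating an ER design into relational tables, the sketch below maps the Employee, Project, and Manager example onto SQLite tables using Python's built-in sqlite3 module. The lowercase table and column names, the assigned_to table, and the sample rows are assumptions. The UNIQUE constraint is one possible way to express "a manager can control only one project"; the rule that every employee has at least one project is not enforced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
    CREATE TABLE manager (
        manager_id   INTEGER PRIMARY KEY,
        manager_name TEXT NOT NULL
    );
    -- Each project has exactly one manager; UNIQUE keeps a manager to one project.
    CREATE TABLE project (
        project_id   INTEGER PRIMARY KEY,
        project_name TEXT NOT NULL,
        manager_id   INTEGER NOT NULL UNIQUE REFERENCES manager(manager_id)
    );
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,
        employee_name TEXT NOT NULL
    );
    -- An employee may be assigned to several projects: one row per assignment.
    CREATE TABLE assigned_to (
        employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
        project_id  INTEGER NOT NULL REFERENCES project(project_id),
        PRIMARY KEY (employee_id, project_id)
    );
    INSERT INTO manager  VALUES (1, 'Meena');
    INSERT INTO project  VALUES (100, 'Alpha', 1);
    INSERT INTO employee VALUES (7, 'Asha');
    INSERT INTO assigned_to VALUES (7, 100);
""")

projects_of_asha = conn.execute("""
    SELECT p.project_name
    FROM assigned_to a
    JOIN project p ON p.project_id = a.project_id
    WHERE a.employee_id = 7
""").fetchall()

# Giving manager 1 a second project violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO project VALUES (101, 'Beta', 1)")
    second_project_allowed = True
except sqlite3.IntegrityError:
    second_project_allowed = False

print(projects_of_asha, second_project_allowed)
```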

Conclusion

ER diagrams are widely used in DBMS to describe the conceptual
design of databases. They help both users and database
developers to preview the structure of the database. With the
help of an ER diagram, we can create the required database
and perform queries. For example, in the case of an airline
reservation system, we can make queries to find the
scheduled time of a flight, the number of booked seats on a
flight, flight fares, etc. We can convert the ER design into a
relational design, or you can say table format.

Enhanced Entity Relationship Model (EER Model)

EER Model

EER is a high-level data model that incorporates extensions
to the original ER model.

It is a diagrammatic technique for displaying the following
concepts:
• Sub Class and Super Class
• Specialization and Generalization
• Union or Category
• Aggregation
A schema that uses these concepts is called an EER schema, and
the resulting schema diagrams are called EER diagrams.

Features of EER Model

• EER creates a design that reflects database schemas more
accurately.
• It reflects the data properties and constraints more precisely.
• It includes all modeling concepts of the ER model.
• The diagrammatic technique helps in displaying the EER schema.
• It includes the concepts of specialization and generalization.
• It is used to represent a collection of objects that is the union
of objects of different entity types.
A. Sub Class and Super Class
• Sub class and Super class relationship leads the concept of
Inheritance.
• The relationship between sub class and super class is denoted
with symbol.
1. Super Class
• A super class is an entity type that has a relationship with one
or more subtypes.
• An entity cannot exist in the database merely by being a member
of a sub class; it must also be a member of the super class.
For example: the Shape super class has the sub groups
Square, Circle, and Triangle.
2. Sub Class
• A sub class is a group of entities with unique attributes.
• A sub class inherits properties and attributes from its super
class.
For example: Square, Circle, and Triangle are the sub classes of
the Shape super class.
B. Specialization and Generalization

1. Generalization
• Generalization is the process of combining entities into a
general entity that contains the properties common to all of them.
• It is a bottom-up approach, in which two or more lower-level
entities combine to form a higher-level entity.
• Generalization is the reverse process of specialization.
• It defines a general entity type from a set of specialized entity
types.
• It minimizes the difference between the entities by identifying
the common features.
For example, Tiger, Lion, and Elephant can all be
generalized as Animals.
2. Specialization
• Specialization is a process in which a group of entities
is divided into sub groups based on their characteristics.
• It is a top-down approach, in which one higher-level entity can
be broken down into two or more lower-level entities.
• It maximizes the difference between the members of an entity
by identifying the unique characteristics or attributes of each
member.
• It defines one or more sub classes for the super class and also
forms the superclass/subclass relationship.
For example, Employee can be specialized as
Developer or Tester, based on what role they play in an
organization.

C. Category or Union
• A category represents a single sub class that has a
relationship with more than one super class.
• It can have total or partial participation.
For example, in car booking, a car owner can be a person, a bank
(which holds possession of a car), or a company. The category
(sub class) Owner is a subset of the union of the three super
classes Company, Bank, and Person. A category member
must exist in at least one of its super classes.
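A loose code analogue of a category is a union type: an Owner value is always a member of one of the super classes. The sketch below is in Python; the class fields, the owner_kind helper, and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Person:
    name: str


@dataclass
class Bank:
    name: str


@dataclass
class Company:
    name: str


# Category 'Owner': a subset of the union of the three super classes.
Owner = Union[Person, Bank, Company]


def owner_kind(owner: Owner) -> str:
    """Report which super class a given owner belongs to."""
    if not isinstance(owner, (Person, Bank, Company)):
        raise TypeError("an owner must belong to one of the super classes")
    return type(owner).__name__


# Hypothetical registry mapping a car to its owner.
car_owners = {"CAR-1": Person("Asha"), "CAR-2": Bank("City Bank")}
kinds = {car: owner_kind(o) for car, o in car_owners.items()}
print(kinds)
```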
D. Aggregation
• Aggregation is a process that represents a relationship between
a whole object and its component parts.
• It abstracts a relationship between objects and views the
relationship as an object.
• It is a process in which a relationship between two entities is
treated as a single entity.

For example, the relationship between College and
Course can act as an entity in a relationship with Student.
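The College and Course example can be sketched with plain tuples: each (college, course) pair from one relationship is treated as a single unit that a student then relates to. The relationship names offers and enrolls and the sample values are assumptions for illustration.

```python
# Relationship between College and Course (sample data).
offers = {("ABC College", "DBMS"), ("ABC College", "Networks")}

# Aggregation: each (college, course) pair is treated as a single entity
# that Student relates to.
enrolls = {("S1", ("ABC College", "DBMS"))}

# Every enrolment must refer to a pair that exists in 'offers'.
all_valid = all(offering in offers for _, offering in enrolls)
print(all_valid)
```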
