Unit 1 ADBMS
BHILAI
Course/Semester: MCA-I
UNIT-1
In traditional file processing, each user defines and implements the files needed for a
specific software application as part of programming the application.
For example, one user, the grade reporting office, may keep files on students and their grades.
Programs to print a student’s transcript and to enter new grades are implemented as part of the
application. A second user, the accounting office, may keep track of students’ fees and their
payments. Although both users are interested in data about students, each user maintains
separate files— and programs to manipulate these files—because each requires some data not
available from the other user’s files. This redundancy in defining and storing data results in
wasted storage space and in redundant efforts to maintain common up-to-date data.
In the database approach, a single repository maintains data that is defined once and then
accessed by various users. In file systems, each application is free to name data elements
independently. In contrast, in a database, the names or labels of data are defined once, and
used repeatedly by queries, transactions, and applications.
For example, an application program written in C++ may have struct or class declarations, and
a COBOL program has data division statements to define its files. Whereas file-processing
software can access only specific databases, DBMS software can access diverse databases by
extracting the database definitions from the catalog and using these definitions. The DBMS
catalog will store the definitions of all the files shown. Figure 1.3 shows some sample entries in
a database catalog. These definitions are specified by the database designer prior to creating the
actual database and are stored in the catalog. Whenever a request is made to access, say, the
Name of a STUDENT record, the DBMS software refers to the catalog to determine the structure
of the STUDENT file and the position and size of the Name data item within a STUDENT
record. By contrast, in a typical file-processing application, the file structure and, in the extreme
case, the exact location of Name within a STUDENT record are already coded within each
program that accesses this data item.
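The catalog lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CATALOG dictionary, the get_field helper, the extra field names, and all byte offsets are hypothetical and are not the entries of Figure 1.3; a real DBMS catalog holds far more information.

```python
# Hypothetical catalog: file name -> field name -> (start offset, length in bytes).
# These offsets are illustrative assumptions, not the values from the figures.
CATALOG = {
    "STUDENT": {
        "Name": (0, 10),
        "Student_number": (10, 4),
        "Class": (14, 1),
        "Major": (15, 4),
    }
}


def get_field(catalog, file_name, record, field):
    """Find the field's position and size in the catalog, then slice the record."""
    start, length = catalog[file_name][field]
    return record[start:start + length].rstrip()


# A fixed-length STUDENT record laid out according to the catalog above.
record = "Smith".ljust(10) + "0017" + "1" + "CS".ljust(4)

name = get_field(CATALOG, "STUDENT", record, "Name")
major = get_field(CATALOG, "STUDENT", record, "Major")
```

The access code never mentions an offset itself. If the layout of STUDENT changed, only the catalog entry would need to change, whereas a file-processing program with the offsets coded into it would have to be edited.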
2. Insulation between Programs and Data, and Data Abstraction
In traditional file processing, the structure of data files is embedded in the application
programs, so any changes to the structure of a file may require changing all programs that
access that file. By contrast, DBMS access programs do not require such changes in most
cases. The structure of data files is stored in the DBMS catalog separately from the access
programs. We call this property program-data independence.
For example, a file access program may be written in such a way that it can access only
STUDENT records of the structure shown in Figure 1.4. If we want to add another piece of
data to each STUDENT record, say the Birth_date , such a program will no longer work and
must be changed. By contrast, in a DBMS environment, we only need to change the
description of STUDENT records in the catalog (Figure 1.3) to reflect the inclusion of the
new data item Birth_date ; no programs are changed.
The next time a DBMS program refers to the catalog, the new structure of STUDENT
records will be accessed and used. In some types of database systems, such as object-oriented
and object-relational systems, users can define operations on data as part of the database
definitions. An operation (also called a function or method) is specified in two parts. The
interface (or signature) of an operation includes the operation name and the data types of its
arguments (or parameters). The implementation (or method) of the operation is specified
separately and can be changed without affecting the interface. User application programs can
operate on the data by invoking these operations through their names and arguments,
regardless of how the operations are implemented. This may be termed program-operation
independence.
The characteristic that allows program-data independence and program-operation
independence is called data abstraction. A DBMS provides users with a conceptual
representation of data that does not include many of the details of how the data is stored or
how the operations are implemented. Informally, a data model is a type of data abstraction
that is used to provide this conceptual representation. The data model uses logical concepts,
such as objects, their properties, and their interrelationships, that may be easier for most users
to understand than computer storage concepts. Hence, the data model hides storage and
implementation details that are not of interest to most database users.
For example, the internal implementation of a file may be defined by its record length (the
number of characters, or bytes, in each record), and each data item may be specified by its
starting byte within a record and its length in bytes. The STUDENT record would thus be
represented as shown in Figure 1.4. But a typical database user is not concerned with the
location of each data item within a record or its length; rather, the user is concerned that when
a reference is made to Name of STUDENT , the correct value is returned. Many other details
of file storage organization—such as the access paths specified on a file—can be
hidden from database users by the DBMS.
In the database approach, the detailed structure and organization of each file are stored in
the catalog. Database users and application programs refer to the conceptual representation of
the files, and the DBMS extracts the details of file storage from the catalog when these are
needed by the DBMS file access modules. Many data models can be used to provide this data
abstraction to database users. A major part of this book is devoted to presenting various data
models and the concepts they use to abstract the representation of data. In object-oriented
and object-relational databases, the abstraction process includes not only the data structure
but also the operations on the data. These operations provide an abstraction of miniworld
activities commonly understood by the users.
For example, an operation CALCULATE_GPA can be applied to a STUDENT object to
calculate the grade point average. Such operations can be invoked by the user queries or
application programs without having to know the details of how the operations are
implemented. In that sense, an abstraction of the miniworld activity is made available to the
user as an abstract operation.
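As a rough sketch of program-operation independence, the CALCULATE_GPA operation could look like the Python method below. The Student class, the (grade, credit hours) representation, and the 4-point GRADE_POINTS scale are assumptions made for illustration; the source does not specify how the GPA is computed.

```python
from dataclasses import dataclass, field

# Assumed letter-grade scale; the source does not define one.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


@dataclass
class Student:
    name: str
    # Each entry is a (letter grade, credit hours) pair.
    grades: list = field(default_factory=list)

    def calculate_gpa(self) -> float:
        """Interface: no arguments, returns a float. Callers depend only on this."""
        total_hours = sum(hours for _, hours in self.grades)
        if total_hours == 0:
            return 0.0
        points = sum(GRADE_POINTS[grade] * hours for grade, hours in self.grades)
        return points / total_hours


student = Student("Smith", [("A", 3), ("B", 3)])
gpa = student.calculate_gpa()
```

A caller only needs the operation name and its signature. The body of calculate_gpa could be replaced, for instance by a different weighting rule, without changing any program that invokes it.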
3. Support of Multiple Views of the Data
A database typically has many users, each of whom may require a different perspective or view of
the database. A view may be a subset of the database or it may contain virtual data that is derived from
the database files but is not explicitly stored. Some users may not need to be aware of whether the data
they refer to is stored or derived. A multiuser DBMS whose users have a variety of distinct applications
must provide facilities for defining multiple views. For example, one user of the database may be
interested only in accessing and printing the transcript of each student; the view for this user is shown
in Figure 1.5(a). A second user, who is interested only in checking that students have taken all the
prerequisites of each course for which they register, may require the view shown in Figure 1.5(b).
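A view holding derived rather than stored data can be illustrated with Python's built-in sqlite3 module. The table names, column names, and sample rows below are hypothetical and are not the contents of Figure 1.5; the sketch only shows that a view is recomputed from the base tables each time it is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE grade_report (student_number INTEGER, course TEXT, grade TEXT);
    INSERT INTO student VALUES (17, 'Smith'), (8, 'Brown');
    INSERT INTO grade_report VALUES (17, 'CS1310', 'C'), (8, 'MATH2410', 'A');

    -- The view stores no rows of its own; it is derived on each query.
    CREATE VIEW transcript AS
        SELECT s.name, g.course, g.grade
        FROM student s JOIN grade_report g USING (student_number);
""")

rows_before = conn.execute("SELECT * FROM transcript ORDER BY name, course").fetchall()

# A change to a base table is visible through the view without redefining it.
conn.execute("INSERT INTO grade_report VALUES (17, 'MATH2410', 'B')")
rows_after = conn.execute("SELECT * FROM transcript ORDER BY name, course").fetchall()
```

A user who only prints transcripts can query transcript without knowing whether it is a stored table or a derived one.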
4. Sharing of Data and Multiuser Transaction Processing
A multiuser DBMS, as its name implies, must allow multiple users to access the database at
the same time. This is essential if data for multiple applications is to be integrated and
maintained in a single database. The DBMS must include concurrency control software to
ensure that several users trying to update the same data do so in a controlled manner so that
the result of the updates is correct.
For example, when several reservation agents try to assign a seat on an airline flight,
the DBMS should ensure that each seat can be accessed by only one agent at a time for
assignment to a passenger. These types of applications are generally called online transaction
processing (OLTP) applications. A fundamental role of multiuser DBMS software is to
ensure that concurrent transactions operate correctly and efficiently.
The concept of a transaction has become central to many database applications. A
transaction is an executing program or process that includes one or more database accesses,
such as reading or updating of database records. Each transaction is supposed to execute a
logically correct database access if executed in its entirety without interference from other
transactions.
The DBMS must enforce several transaction properties. The isolation property ensures
that each transaction appears to execute in isolation from other transactions, even though
hundreds of transactions may be executing concurrently. The atomicity property ensures that
either all the database operations in a transaction are executed or none are. The preceding
characteristics are important in distinguishing a DBMS from traditional file-processing
software.
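The atomicity property can be sketched with Python's sqlite3 module. The account table, the transfer helper, and the amounts are hypothetical examples, not taken from the source; the point is only that when one statement of a transaction fails, the statements already executed in it are undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction: both updates or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)        # both updates succeed and are committed
failed = transfer(conn, 1, 2, 500)   # the debit violates the CHECK; the credit is undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After the failed transfer the balances are the same as before it started, which is the "all or none" behaviour the atomicity property requires.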
A Database Management System (DBMS) is a collection of interrelated data together with a set of
software tools and programs that access, process, and manipulate that data. It allows access to, retrieval
of, and use of the data under appropriate security measures. A DBMS is particularly useful for better
data integration and security.
Advantages of Database Management System (DBMS):
Some of them are given below.
1. Better Data Transferring: Database management creates a place where users have the
advantage of more and better-managed data, making it possible for end-users to
get a quick overview and to respond fast to any changes made in their environment.
2. Better Data Security: The more accessible and usable the database, the more it is
prone to security issues. As the number of users increases, the data transferring or data
sharing rate also increases thus increasing the risk of data security. It is widely used in
the corporate world where companies invest money, time, and effort in large amounts to
ensure data is secure and is used properly. A Database Management System (DBMS)
provides a better platform for data privacy and security policies thus, helping
companies to improve Data Security.
3. Better data integration: Due to the Database Management System we have an access
to well managed and synchronized form of data thus it makes data handling very easy
and gives an integrated view of how a particular organization is working and also helps
to keep a track of how one segment of the company affects another segment.
4. Minimized Data Inconsistency: Data inconsistency occurs between files when
different versions of the same data appear in different places. For Example, data
inconsistency occurs when a student’s name is saved as “John Wayne” on a main
computer of the school but on the teacher registered system same student name is
“William J. Wayne”, or when the price of a product is $86.95 in the local system of the
company and its National sales office system shows the same product price as $84.95.
If a database is properly designed, data inconsistency can be greatly reduced.
5. Faster data Access: The Database management system (DBMS) helps to produce
quick answers to database queries, whether they read or update data, thus making data
access faster and more accurate. For example, end-users dealing with
large amounts of sales data will have enhanced access to the data, enabling a faster sales
cycle. Some queries may be like:
What is the increase in sales in the last three months?
What is the bonus given to each of the salespeople in the last five months?
How many customers have a credit score of 850 or more?
6. Better decision making: Because a DBMS gives us better-managed data and
improved data access, we can generate better-quality information,
and on this basis better decisions can be made. Better data quality improves
the accuracy and validity of the data and the time it takes to read it. A DBMS does not
guarantee data quality; it provides a framework that makes it easier to improve data quality.
7. Increased end-user productivity: The available data, together with a
combination of tools that transform data into useful information, helps end-users to
make quick, informed, and better decisions that can make the difference between success
and failure in the global economy.
8. Simple: Database management system (DBMS) gives a simple and clear logical view
of data. Many operations like insertion, deletion, or creation of files or data are easy to
implement.
9. Data abstraction: The major purpose of a database system is to provide users with an
abstract view of the data. Developers use many complex algorithms to
increase the efficiency of databases; these are hidden from the users through various
data abstraction levels so that users can easily interact with the system.
10. Reduction in data Redundancy: When working with a structured database, a DBMS
provides features to prevent the input of duplicate items into the database. For example, if
the same student is entered twice in different rows, a key constraint lets the DBMS
reject the duplicate row.
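How a DBMS can refuse duplicate input (point 10 above) is sketched below with Python's sqlite3 module. The student table and its columns are hypothetical, and the primary key is only one possible way to declare which rows count as duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Declaring roll_no as the primary key tells the DBMS that no two rows may share it.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO student VALUES (1, 'John Wayne')")

try:
    # Entering the same student a second time violates the key constraint.
    conn.execute("INSERT INTO student VALUES (1, 'John Wayne')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

The duplicate never enters the table, so no later clean-up step is needed.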
Data Models
Data models define how the logical structure of a database is modeled. Data Models are fundamental
entities to introduce abstraction in a DBMS. Data models define how data is connected to each other
and how they are processed and stored inside the system.
The earliest data models were flat data models, in which all the data was kept in the same
plane. These early data models were not so scientific, hence they were prone to introducing a lot of
duplication and update anomalies.
Entity-Relationship Model
Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships among
them. While formulating a real-world scenario into the database model, the ER Model creates entity sets,
relationship sets, general attributes, and constraints.
The ER Model is based on entities and their attributes, and on the relationships among entities.
Relational Model
The most popular data model in DBMS is the Relational Model. It is a more scientific model than
the others. This model is based on first-order predicate logic and defines a table as an n-ary relation.
1. Instances :
An instance is the collection of information stored in the database at a particular moment. An instance
can be changed by CRUD operations such as the addition or deletion of data. Note that a
search query does not make any changes to the instance.
Example –
Say our database, named School, has a table teacher. Suppose the table has 50 records, so
the instance of the database has 50 records for now. If we add another fifty records tomorrow,
the instance will then have 100 records in total. The data held at each of these moments is an instance.
2. Schema :
Schema is the overall description of the database. The basic structure of how the data will be stored
in the database is called schema.
Schema is of three types: Logical Schema, Physical Schema and view Schema.
Example –
Say our database, named school, has a table teacher. The teacher table requires the fields name, dob,
and doj, so we design its structure as:
Teacher table
name: String
doj: date
dob: date
Above given is the schema of the table teacher.
Schema: Defines the basic structure of the database, i.e. how the data will be stored in the database.
Instance: It is the set of information stored at a particular time.
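The difference between schema and instance can be checked with a small sqlite3 session that follows the teacher example. The sample names and dates are made up; PRAGMA table_info is SQLite's way of reading a table's column definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teacher (name TEXT, doj DATE, dob DATE)")


def schema(conn):
    # PRAGMA table_info yields one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(teacher)")]


def instance_size(conn):
    return conn.execute("SELECT COUNT(*) FROM teacher").fetchone()[0]


def add_teachers(conn, start, stop):
    rows = [(f"teacher{i}", "2020-01-01", "1985-01-01") for i in range(start, stop)]
    conn.executemany("INSERT INTO teacher VALUES (?, ?, ?)", rows)


schema_before = schema(conn)
add_teachers(conn, 0, 50)
size_today = instance_size(conn)       # the instance "for now"
add_teachers(conn, 50, 100)
size_tomorrow = instance_size(conn)    # the instance after fifty more records
schema_after = schema(conn)
```

The instance grows from 50 to 100 records while the schema returned by schema() stays the same.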
DBMS Architecture
o The DBMS design depends upon its architecture. The basic client/server architecture is used to
deal with a large number of PCs, web servers, database servers and other components that are
connected with networks.
o The client/server architecture consists of many PCs and a workstation which are connected via
the network.
o DBMS architecture depends upon how users are connected to the database to get their request
done.
1-Tier Architecture
o In this architecture, the database is directly available to the user. It means the user can work
directly on the DBMS and use it.
o Any changes done here will directly be done on the database itself. It doesn't provide a handy
tool for end users.
o The 1-Tier architecture is used for development of the local application, where programmers
can directly communicate with the database for the quick response.
2-Tier Architecture
o The 2-Tier architecture is the same as the basic client-server architecture. In the two-tier
architecture, applications on the client end can directly communicate with the database at the
server side. For this interaction, APIs such as ODBC and JDBC are used.
o The user interfaces and application programs are run on the client-side.
o The server side is responsible for providing functionalities such as query processing and
transaction management.
o To communicate with the DBMS, client-side application establishes a connection with the
server side.
Fig: 2-tier Architecture
3-Tier Architecture
o The 3-Tier architecture contains another layer between the client and server. In this architecture,
the client can't directly communicate with the server.
o The application on the client-end interacts with an application server which further
communicates with the database system.
o The end user has no idea about the existence of the database beyond the application server. The
database also has no idea about any other user beyond the application.
o Data independence refers to the characteristic of being able to modify the schema at one level of the
database system without altering the schema at the next higher level.
o Logical data independence refers to the characteristic of being able to change the conceptual schema
without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual view.
o If we make any changes in the conceptual view of the data, then the user view of the data would
not be affected.
o Physical data independence can be defined as the capacity to change the internal schema
without having to change the conceptual schema.
o If we do any changes in the storage size of the database system server, then the Conceptual
structure of the database will not be affected.
o Physical data independence is used to separate conceptual levels from the internal levels.
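Physical data independence can be illustrated, in a very reduced form, by changing an access path while the query text stays untouched. The student table and the index name below are hypothetical; adding an index stands in here for any change at the internal level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_number INTEGER, name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(17, "Smith"), (8, "Brown")])

# The query is written once, against the conceptual structure only.
QUERY = "SELECT name FROM student WHERE student_number = 17"
before = conn.execute(QUERY).fetchall()

# Change at the internal level: add an access path on student_number.
conn.execute("CREATE INDEX idx_student_number ON student (student_number)")
after = conn.execute(QUERY).fetchall()
```

The same query returns the same rows before and after the index is created; only the way the DBMS finds them may differ.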
Data Dictionary
A data dictionary contains metadata, i.e. data about the database. The data dictionary is very important
as it contains information such as what is in the database, who is allowed to access it, and where the
database is physically stored. The users of the database normally don't interact with the data
dictionary; it is handled only by the database administrators.
Field Name Data Type Field Size for display Description Example
If the structure of the database or its specifications change at any point of time, it should be reflected
in the data dictionary. This is the responsibility of the database management system in which the data
dictionary resides.
So, the data dictionary is automatically updated by the database management system when any changes
are made in the database. This is known as an active data dictionary as it is self updating.
A passive data dictionary is not as useful or easy to handle as an active data dictionary. A passive data
dictionary is maintained separately from the database whose contents are described in the dictionary.
That means that if the database is modified, the data dictionary is not automatically updated as in the
case of an active data dictionary.
So, the passive data dictionary has to be manually updated to match the database. This needs careful
handling, or else the database and the data dictionary will be out of sync.
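An active data dictionary can be observed in SQLite, whose sqlite_master table is maintained by the system itself. The sketch below uses a hypothetical student table and a helper named catalog_tables; it is an illustration of the self-updating behaviour, not a description of how every DBMS stores its dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def catalog_tables(conn):
    """List the table names currently recorded in SQLite's own catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


before = catalog_tables(conn)
conn.execute("CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT)")
after_create = catalog_tables(conn)   # the dictionary gained an entry by itself
conn.execute("DROP TABLE student")
after_drop = catalog_tables(conn)     # and lost it again, with no manual update
```

With a passive data dictionary, the equivalent of catalog_tables would keep returning the old list until someone edited the dictionary by hand.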
Database Language
o A DBMS has appropriate languages and interfaces to express database queries and updates.
o Database languages can be used to read, store and update the data in the database.
o DDL stands for Data Definition Language. It is used to define database structure or pattern.
o It is used to create schema, tables, indexes, constraints, etc. in the database.
o Using the DDL statements, you can create the skeleton of the database.
o Data definition language is used to store the information of metadata like the number of tables
and schemas, their names, indexes, columns in each table, constraints, etc.
These commands are used to update the database schema, which is why they come under Data definition
language.
DML stands for Data Manipulation Language. It is used for accessing and manipulating data in a
database. It handles user requests.
o DCL stands for Data Control Language. It is used to control access to the stored data by granting and revoking user privileges.
(But in Oracle database, the execution of data control language does not have the feature of
rolling back.)
There are the following operations which have the authorization of Revoke:
TCL (Transaction Control Language) is used to manage the changes made by DML statements. These
statements can be grouped into a logical transaction.
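The language categories above can be seen side by side in a short sqlite3 session. The course table and its rows are hypothetical. DCL statements are not shown because SQLite has no GRANT or REVOKE; in a client/server DBMS they would be issued in the same way as the other statements.

```python
import sqlite3

# isolation_level=None: the module opens no transactions itself,
# so the BEGIN / COMMIT / ROLLBACK statements below are fully explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)

# DDL: defines the structure of the database.
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")

# DML grouped into a transaction and made permanent by TCL (COMMIT).
conn.execute("BEGIN")
conn.execute("INSERT INTO course VALUES ('CS1310', 'Intro to CS')")
conn.execute("COMMIT")

# DML grouped into a transaction and undone by TCL (ROLLBACK).
conn.execute("BEGIN")
conn.execute("UPDATE course SET title = 'Changed' WHERE code = 'CS1310'")
conn.execute("DELETE FROM course")
conn.execute("ROLLBACK")

rows = conn.execute("SELECT code, title FROM course").fetchall()
```

Only the committed INSERT survives; the UPDATE and DELETE in the second transaction leave no trace.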
A database management system (DBMS) interface is a user interface that allows for the ability to
input queries to a database without using the query language itself.
2. Forms-Based Interfaces –
A forms-based interface displays a form to each user. Users can fill out all of the form
entries to insert new data, or they can fill out only certain entries, in which case the
DBMS will retrieve matching data for the remaining entries. These types of forms
are usually designed and programmed for users who have no expertise in
operating the system. Many DBMSs have forms specification languages, which are special
languages that help specify such forms.
Example: SQL*Forms is a form-based language that specifies queries using a form
designed in conjunction with the relational database schema.
ER model
o ER model stands for an Entity-Relationship model. It is a high-level data model. This model is
used to define the data elements and relationship for a specified system.
o It develops a conceptual design for the database. It also develops a very simple and easy to
design view of data.
o In ER modeling, the database structure is portrayed as a diagram called an entity-relationship
diagram.
For example, suppose we design a school database. In this database, the student will be an entity with
attributes like address, name, id, age, etc. The address can be another entity with attributes like city,
street name, pin code, etc., and there will be a relationship between them.
Component of ER Diagram
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity is represented as a
rectangle.
Consider an organization as an example: manager, product, employee, department, etc. can be taken as
entities.
a. Weak Entity
An entity that depends on another entity is called a weak entity. The weak entity doesn't contain any key
attribute of its own. The weak entity is represented by a double rectangle.
2. Attribute
The attribute is used to describe the property of an entity. An ellipse is used to represent an attribute.
For example, id, age, contact number, name, etc. can be attributes of a student.
a. Key Attribute
The key attribute is used to represent the main characteristics of an entity. It represents a primary key.
The key attribute is represented by an ellipse with the text underlined.
b. Composite Attribute
An attribute that is composed of many other attributes is known as a composite attribute. The composite
attribute is represented by an ellipse, and the ellipses of its component attributes are connected to it.
c. Multivalued Attribute
An attribute that can have more than one value is known as a multivalued attribute. A
double oval is used to represent a multivalued attribute.
For example, a student can have more than one phone number.
d. Derived Attribute
An attribute that can be derived from other attributes is known as a derived attribute. It is represented
by a dashed ellipse.
For example, a person's age changes over time and can be derived from another attribute like date of
birth.
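The derived attribute in the example above can be computed rather than stored. The function name age_on and the sample dates are illustrative choices, not part of the source.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in completed years from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not been reached yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


day_before_birthday = age_on(date(2000, 6, 15), date(2024, 6, 14))
on_birthday = age_on(date(2000, 6, 15), date(2024, 6, 15))
```

Because the value is recomputed from the date of birth whenever it is needed, it never goes out of date the way a stored age column would.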
3. Relationship
A relationship is used to describe the relation between entities. Diamond or rhombus is used to represent
the relationship.
a. One-to-One relationship
When only one instance of an entity is associated with the relationship, then it is known as a one-to-one
relationship.
For example, a female can marry one male, and a male can marry one female.
b. One-to-many relationship
When only one instance of the entity on the left, and more than one instance of an entity on the right
associates with the relationship then this is known as a one-to-many relationship.
For example, a scientist can invent many inventions, but each invention is made by only one specific
scientist.
c. Many-to-one relationship
When more than one instance of the entity on the left, and only one instance of an entity on the right
associates with the relationship then it is known as a many-to-one relationship.
For example, a student enrolls for only one course, but a course can have many students.
d. Many-to-many relationship
When more than one instance of the entity on the left, and more than one instance of an entity on the
right associates with the relationship then it is known as a many-to-many relationship.
For example, an employee can be assigned to many projects, and a project can have many employees.
Data Modeling
Data modeling is a technique to document a software system using diagrams and symbols. It is used
to represent communication of data.
The highest level of abstraction for the data model is called the Entity Relationship Diagram (ERD). It
is a graphical representation of data requirements for a database.
Attributes: Information such as property, facts you need to describe each table.
Relationships: How tables are linked together.
Entity
Entities are the basic objects of ERDs. These are the tables of your database. Entities are nouns and the
types usually fall into five classes: concepts, locations, roles, events or things.
For example: students, courses, books, campus, employees, payment, projects.
A specific example of an entity is called an instance. Each instance becomes a record or a row in a
table.
For example: the student John Smith is a record in a table called students.
Relationships
Relationships are the associations between the entities. Verbs often describe relationships between
entities. We will use Crow's Foot Symbols to represent the relationships. Three types of relationships
are discussed in this lab. If you read or hear the term cardinality ratio, it also refers to the type of relationship.
One to one, for example:
Each student fills one seat and one seat is assigned to only one student.
One to many, for example:
One instructor can teach many courses, but one course can only be taught by one instructor.
One instructor may teach many students in one class, but all the students have one instructor
for that class.
Many to many, for example:
Each student can take many classes, and each class can be taken by many students.
Each consumer can buy many products, and each product can be bought by many consumers.
The detailed Crow's Foot relationship symbols can be found in the Crow's Foot Relationship Symbols reference.
Many to many relationships are difficult to represent. We need to decompose a many to many (M:M)
relationship into two one-to-many (1:M) relationships.
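The decomposition of a many-to-many relationship can be sketched with a linking table. The table names students, classes, and enrollment, and the sample rows, are hypothetical; the enrollment table is what turns one M:M relationship into two 1:M relationships.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE classes (class_id INTEGER PRIMARY KEY, title TEXT);

    -- Linking table: one student has many enrollment rows (1:M),
    -- and one class has many enrollment rows (1:M).
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES students(student_id),
        class_id INTEGER REFERENCES classes(class_id),
        PRIMARY KEY (student_id, class_id)
    );

    INSERT INTO students VALUES (1, 'John Smith'), (2, 'Riya');
    INSERT INTO classes VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

classes_of_john = [row[0] for row in conn.execute(
    "SELECT c.title FROM classes c JOIN enrollment e ON e.class_id = c.class_id "
    "WHERE e.student_id = 1 ORDER BY c.title")]

students_in_databases = [row[0] for row in conn.execute(
    "SELECT s.name FROM students s JOIN enrollment e ON e.student_id = s.student_id "
    "WHERE e.class_id = 10 ORDER BY s.name")]
```

Each row of enrollment records one pairing, so a student can appear with many classes and a class with many students without repeating any student or class data.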
Attributes
Attributes are facts or descriptions of entities. They are also often nouns and become the columns of
the table. For example, for entity student, the attributes can be first name, last name, email, address
and phone numbers.
Primary Key
A primary key or identifier is an attribute or a set of attributes that uniquely identifies an instance of
the entity. For example, for a student entity, student number is the primary key since no two students
have the same student number. We can have only one primary key in a table. It uniquely identifies
every row and it cannot be null.
Foreign key
A foreign key (sometimes called a referencing key) is a key used to link two tables together.
Typically you take the primary key field from one table and insert it into the other table where it
becomes a foreign key (it remains a primary key in the original table). We can have more than one
foreign key in a table.
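The link between a primary key and a foreign key can be tried out with sqlite3. The student and payment tables are hypothetical. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued; other systems usually enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT);

    -- student_number is the primary key of student and a foreign key here.
    CREATE TABLE payment (
        payment_id INTEGER PRIMARY KEY,
        student_number INTEGER NOT NULL REFERENCES student(student_number),
        amount INTEGER
    );

    INSERT INTO student VALUES (17, 'John Smith');
    INSERT INTO payment VALUES (1, 17, 500);
""")

try:
    # Student 99 does not exist, so this payment would point at nothing.
    conn.execute("INSERT INTO payment VALUES (2, 99, 100)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

payment_count = conn.execute("SELECT COUNT(*) FROM payment").fetchone()[0]
```

The payment for the existing student is accepted, while the one that references a missing student is refused.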
An Example
Here's a sample crowsfoot diagram from a past offering of CS270 taught here at the University of
Regina. We've redrawn the diagrams using more modern diagramming tools, but the content is
unchanged. It uses a lot of ERD symbols, so you might want to use Vivek Chawla's quick guide while
you read it.
The ER model is a very important concept in DBMS, and it is used for the modeling of the logical
view of the system from a data perspective. The entity, Entity Set, and Entity Type all these terms
are very important concepts of ER Model. In this article, we will understand the difference between
them.
1. Entity : An entity is a thing in the real world with independent existence. An entity can exist
independently and is distinguishable from other objects. It can be identified uniquely.
An entity can be of two types :
Tangible Entity : Entities that exist in the real world physically. Example: Person, car,
etc.
Intangible Entity : Entities that exist only logically and have no physical existence.
Example: Bank Account, etc.
Example :
A student with a particular roll number is an entity.
A company with a particular registration number is an entity.
Note :
An entity may be concrete like a student, a book, or abstract like a holiday or a
particular concept.
An entity is represented by a set of attributes.
In a particular relation in RDBMS, a particular record is called an entity.
2. Entity Type : It refers to the category that a particular entity belongs to.
Example :
A table named student in a university database.
A table named employee in a company database.
Note :
The category of a particular entity in the relation in RDBMS is called the entity type.
It is represented by the name of the table and its schema.
3. Entity Set : An entity set is a collection or set of all entities of a particular entity type at any
point in time. The type of all the entities should be the same.
Example :
The collection of all the students from the student table at a particular instant of time is
an example of an entity set.
The collection of all the employees from the employee table at a particular instant of
time is an example of an entity set.
Note :
Entity sets need not be disjoint. For example, the entity set of Article Writer (all content
creators for GeeksforGeeks) and the entity set of Article Reader (all students who read
the article of GeeksforGeeks) may have members in common.
The collection of all the entities in the relation of RDBMS is called an entity set.
Relation With Table :
Consider a table student as follows :
1 Avi 19 M
2 Ayush 23 M
3 Nikhil 21 M
4 Riya 16 F
Entity : Each row of the table is an entity. For example, the row 1 Avi 19 M is an entity.
Entity Type : Each entity belongs to the student type. Hence, the type of entity here is a student.
Entity Set : The complete data set of all entities is called entity set. For the above table, the records
with student id 1, 2, 3, 4 are the entity set.
Difference Table :
Entity: A thing in the real world with independent existence.
Entity Type: A category of a particular entity.
Entity Set: Set of all entities of a particular entity type.
Integrity Constraints
o Integrity constraints are a set of rules. They are used to maintain the quality of information.
o Integrity constraints ensure that data insertion, updating, and other processes are
performed in such a way that data integrity is not affected.
o Thus, integrity constraints are used to guard against accidental damage to the database.
1. Domain constraints
o Domain constraints can be defined as the definition of a valid set of values for an
attribute.
o The data type of domain includes string, character, integer, time, date, currency, etc.
The value of the attribute must be available in the corresponding domain.
Example:
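A domain constraint can be sketched as follows. The student table, the age range of 1 to 120, and the try_insert helper are assumptions made for illustration. SQLite does not strictly enforce declared column types, so the domain is spelled out with a CHECK clause; many other systems would reject the wrong type from the column declaration alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        -- Domain of age: integers from 1 to 120 (an assumed range).
        age INTEGER CHECK (typeof(age) = 'integer' AND age BETWEEN 1 AND 120)
    )
""")


def try_insert(row):
    """Return True if the row satisfies the constraints, False if it is rejected."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


valid = try_insert((1, "Avi", 19))
out_of_range = try_insert((2, "Riya", 500))   # integer, but outside the domain
wrong_type = try_insert((3, "Nikhil", "A"))   # not an integer at all
```

Only the first row is stored; the other two values are not available in the domain of age and are refused.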
2. Entity integrity constraints
o The entity integrity constraint states that a primary key value can't be null.
o This is because the primary key value is used to identify individual rows in a relation, and
if the primary key has a null value, then we can't identify those rows.
o A table can contain null values in fields other than the primary key field.
Example:
3. Referential Integrity Constraints
Example:
4. Key constraints
o A key is an attribute, or set of attributes, that is used to identify an entity within its entity set uniquely.
o An entity set can have multiple keys, out of which one key will be the primary key.
A primary key value must be unique and cannot be null in the relational table.
Example: