Dbms Unit1


Introduction to Database Management System

Unit One

Attiuttama Mishra
Asst. Prof.
DBMS - Overview

A database is a collection of related data, and data is a collection of facts and figures that can be processed to produce information.

A database management system (DBMS) is software used to manage a database. For example, MySQL and Oracle are very popular DBMSs used in many different applications.

A DBMS provides an interface for operations such as creating a database, creating tables in it, storing data, updating data and a lot more.

It provides protection and security to the database. In the case of multiple users, it also maintains data consistency.
Drawbacks of File system:

Data isolation: Because data are scattered in various files, and files may be in different formats, writing new application programs to retrieve the appropriate data is difficult.
Duplication of data: The same data is stored redundantly in several files.
Dependency on application programs: Changing files leads to changes in the application programs.
Single-file access: A file management system (FMS) allows access to only a single file or table at a time. FMSs accommodate flat files that have no relation to other files.
Advantage of DBMS over file system

A database management system has several advantages over a file system:

•No redundant data: redundancy is removed by data normalization
•Data consistency and integrity: data normalization takes care of this too
•Secure: each user has a different set of access rights
•Privacy: limited access
•Data sharing
•Easy access to data
•Easy recovery
•Flexible
Disadvantages of DBMS

Cost of hardware and software: Running DBMS software requires a high-speed processor and a large amount of memory.
Size: It occupies a large amount of disk space and needs a large amount of memory to run efficiently.
Complexity: A database system creates additional complexity and requirements.
Higher impact of failure: A failure has a large impact because in most organizations all the data is stored in a single database; if that database is damaged by a power failure or database corruption, the data may be lost forever.
Characteristics of Database Management System

•Reduced redundancy
•Data stored in tables
•Security
•Data consistency
•Support for multiple users and concurrent access
•Query language


DBMS Architecture

The DBMS design depends upon its architecture. The basic client/server
architecture is used to deal with a large number of PCs, web servers, database
servers and other components that are connected with networks.
The client/server architecture consists of many PCs and a workstation which
are connected via the network.

Fig: DBMS Architecture


1-Tier Architecture-
In this architecture, the database is directly available to the user: the user works directly on the DBMS.
Any changes made here are applied directly to the database itself. It doesn't provide a handy tool for end users.
The 1-Tier architecture is used for developing local applications, where programmers can communicate directly with the database for a quick response.

2-Tier Architecture-
The 2-Tier architecture is the same as the basic client-server architecture. In the two-tier architecture, applications on the client end communicate directly with the database on the server side. For this interaction, APIs such as ODBC and JDBC are used.
The user interfaces and application programs run on the client side.
The server side is responsible for providing functionality such as query processing and transaction management.
To communicate with the DBMS, the client-side application establishes a connection with the server side.
Fig: 2-tier Architecture
3-Tier Architecture-

The 3-Tier architecture contains another layer between the client and the server. In this architecture, the client can't directly communicate with the server.

The application on the client end interacts with an application server, which in turn communicates with the database system.

The end user has no idea about the existence of the database beyond the application server. The database also has no idea about any other user beyond the application.
The 3-Tier architecture is used for large web applications.
Fig: 3-tier Architecture
Three schema Architecture

The three schema architecture is also called the ANSI/SPARC architecture or three-level architecture.

This framework is used to describe the structure of a specific database system.

The three schema architecture is also used to separate the user applications from the physical database.

The three schema architecture contains three levels. It breaks the database down into three different categories.
Fig: Three schema Architecture
1. Internal Level
•The internal level has an internal schema which describes the physical storage
structure of the database.
•The internal schema is also known as a physical schema.
•It uses the physical data model. It defines how the data will be stored in blocks.
•The physical level is used to describe complex low-level data structures in detail.

2. Conceptual Level
•The conceptual schema describes the design of a database at the
conceptual level. Conceptual level is also known as logical level.
•The conceptual schema describes the structure of the whole database.
•The conceptual level describes what data are to be stored in the database
and also describes what relationship exists among those data.
•In the conceptual level, internal details such as an implementation of the
data structure are hidden.
•Programmers and database administrators work at this level.
3. External Level
•At the external level, a database contains several schemas, sometimes called subschemas. A subschema is used to describe a particular view of the database.
•An external schema is also known as a view schema.
•Each view schema describes the part of the database that a particular user group is interested in and hides the remaining database from that user group.
•The view schema describes the end user's interaction with the database system.

Data Independence

•Data independence can be explained using the three-schema architecture.

•Data independence refers to the ability to modify the schema at one level of the database system without altering the schema at the next higher level.
There are two types of data independence:

1. Logical Data Independence


•Logical data independence refers to the ability to change the conceptual schema without having to change the external schema.

•Logical data independence is used to separate the external level from the conceptual view.

•If we make any changes in the conceptual view of the data, the user view of the data is not affected.

•Logical data independence occurs at the user interface level.
2. Physical Data Independence
•Physical data independence can be defined as the capacity to change the internal schema without having to change the conceptual schema.

•If we make any changes in the storage size of the database system server, the conceptual structure of the database is not affected.

•Physical data independence is used to separate the conceptual level from the internal level.

•Physical data independence occurs at the logical interface level.
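
Logical data independence can be illustrated with a view acting as an external schema. The sketch below is only an illustration: it assumes SQLite through Python's standard sqlite3 module, and the table, view and column names are hypothetical. The conceptual schema changes (a column is added), while the result seen through the view stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, stu_name TEXT, stu_age INTEGER)"
)
conn.execute("INSERT INTO student VALUES (1, 'Asha', 20)")

# External schema: a view that exposes only part of the conceptual schema.
conn.execute("CREATE VIEW student_names AS SELECT roll_no, stu_name FROM student")
before = conn.execute("SELECT * FROM student_names").fetchall()

# Change the conceptual schema by adding a column to the base table.
conn.execute("ALTER TABLE student ADD COLUMN stu_address TEXT")
after = conn.execute("SELECT * FROM student_names").fetchall()

# The user view is unaffected by the change.
print(before == after)
```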
Data model Schema and Instance

•The data which is stored in the database at a particular moment of time is called
an instance of the database.
•The overall design of a database is called schema.
•A database schema is the skeleton structure of the database. It represents the
logical view of the entire database.
•A schema contains schema objects like table, foreign key, primary key, views,
columns, data types, stored procedure, etc.
•A database schema can be represented by using the visual diagram. That
diagram shows the database objects and relationship with each other.
For example: In the following diagram, we have a schema that shows the
relationship between three tables: Course, Student and Grade. It does not
show all aspects of database.
Example of Instance

For example, let's say we have a single table student in the database. Today the table has 100 records, so today the instance of the database has 100 records. If we add another 100 records to this table by tomorrow, the instance of the database tomorrow will have 200 records in the table. In short, the data stored in the database at a particular moment is called the instance, and it changes over time as we add or delete data.
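
The difference between schema and instance in this example can be sketched in code. This is a minimal illustration assuming SQLite through Python's standard sqlite3 module; the column names and the helper functions schema_sql and instance_size are hypothetical. The schema text stays the same while the instance grows from 100 to 200 records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, stu_name TEXT)")


def schema_sql(c):
    # The schema: the stored definition of the table.
    return c.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'student'"
    ).fetchone()[0]


def instance_size(c):
    # The instance: the rows stored at this moment.
    return c.execute("SELECT COUNT(*) FROM student").fetchone()[0]


def add_students(c, first, last):
    rows = [(i, f"student{i}") for i in range(first, last + 1)]
    c.executemany("INSERT INTO student VALUES (?, ?)", rows)


add_students(conn, 1, 100)
schema_today = schema_sql(conn)
size_today = instance_size(conn)

add_students(conn, 101, 200)
schema_tomorrow = schema_sql(conn)
size_tomorrow = instance_size(conn)

print(size_today, size_tomorrow, schema_today == schema_tomorrow)
```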
View of Data in DBMS/Data Abstraction

Data abstraction is one of the main features of database systems. Hiding irrelevant details from users and providing them with an abstract view of the data makes user-database interaction easy and efficient.
The view level provides the "view of data" to the users and hides irrelevant details such as data relationships, the database schema, constraints, security, etc. from the user.
Database Language
•A DBMS has appropriate languages and interfaces to express database
queries and updates.
•Database languages can be used to read, store and update the data in the
database.

1. Data Definition Language


•DDL stands for Data Definition Language. It is used to define database
structure or pattern.
•It is used to create schema, tables, indexes, constraints, etc. in the
database.
•Using the DDL statements, you can create the skeleton of the database.
Here are some tasks that come under DDL:
•Create: It is used to create objects in the database.
•Alter: It is used to alter the structure of the database.
•Drop: It is used to delete objects from the database.
•Truncate: It is used to remove all records from a table.
•Rename: It is used to rename an object.
•Comment: It is used to comment on the data dictionary.

CREATE TABLE table_name (
column1 datatype,
column2 datatype,
column3 datatype,
....
);
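
The DDL tasks listed above can be tried out as follows. This is a sketch under stated assumptions: it uses SQLite through Python's standard sqlite3 module, and the table and column names are hypothetical. SQLite has no TRUNCATE or COMMENT statement, so only create, alter, rename and drop are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names(c):
    # List the tables currently defined in the database.
    query = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [row[0] for row in c.execute(query)]


# Create: define a new table.
conn.execute("CREATE TABLE course (course_id INTEGER, title TEXT)")

# Alter: change the structure by adding a column.
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")
columns = [row[1] for row in conn.execute("PRAGMA table_info(course)")]

# Rename: give the table a new name.
conn.execute("ALTER TABLE course RENAME TO subject")
after_rename = table_names(conn)

# Drop: remove the table from the database.
conn.execute("DROP TABLE subject")
after_drop = table_names(conn)

print(columns, after_rename, after_drop)
```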
2. Data Manipulation Language
DML stands for Data Manipulation Language. It is used for accessing and
manipulating data in a database. It handles user requests.
Here are some tasks that come under DML:
•Select: It is used to retrieve data from a database.
•Insert: It is used to insert data into a table.
•Update: It is used to update existing data within a table.
•Delete: It is used to delete records from a table (all records, if no condition is given).
•Merge: It performs UPSERT operation, i.e., insert or update operations.

SELECT column1, column2, ...
FROM table_name;
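
The main DML statements can be combined into one short run. This is an illustrative sketch assuming SQLite through Python's standard sqlite3 module; the table contents are made up for the example. MERGE is not available in SQLite and is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, stu_name TEXT, stu_age INTEGER)"
)

# Insert: add rows to the table.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", 20), (2, "Ravi", 22), (3, "Meena", 21)],
)

# Update: change existing data in the rows that match the condition.
conn.execute("UPDATE student SET stu_age = 23 WHERE roll_no = 2")

# Delete: remove only the rows that match the condition.
conn.execute("DELETE FROM student WHERE roll_no = 3")

# Select: retrieve the remaining data.
rows = conn.execute(
    "SELECT roll_no, stu_name, stu_age FROM student ORDER BY roll_no"
).fetchall()
print(rows)
```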
3. Data Control Language
•DCL stands for Data Control Language. It is used to control access to the data stored in the database.
•Whether DCL statements are transactional, and can therefore be rolled back, depends on the DBMS.

Here are some tasks that come under DCL:


•Grant: It is used to give user access privileges to a database.
•Revoke: It is used to take back permissions from the user.

4. Transaction Control Language


TCL is used to manage the changes made by DML statements. It allows statements to be grouped into logical transactions.
Here are some tasks that come under TCL:
•Commit: It is used to save the transaction to the database.
•Rollback: It is used to restore the database to its state as of the last Commit.
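
Commit and rollback can be observed directly. The sketch below assumes SQLite through Python's standard sqlite3 module with its default transaction handling, where a transaction is opened implicitly before a DML statement; the account table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 500)")
conn.commit()  # Commit: the inserted row is now saved.


def balance(c):
    return c.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]


# A DML change that has not been committed yet.
conn.execute("UPDATE account SET balance = balance - 200 WHERE id = 1")
uncommitted = balance(conn)

# Rollback: return to the state as of the last commit.
conn.rollback()
restored = balance(conn)

print(uncommitted, restored)
```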
DBMS Database Models

A Database model defines the logical design and structure of a database and
defines how data will be stored, accessed and updated in a database
management system. While the Relational Model is the most widely used
database model, there are other models too:

•Hierarchical Model
•Network Model
•Entity-relationship Model
•Relational Model
Hierarchical Model

This database model organizes data into a tree-like structure with a single root, to which all the other data is linked. The hierarchy starts from the root data and expands like a tree, adding child nodes to the parent nodes.
This model efficiently describes many real-world relationships, such as the index of a book or recipes. In the hierarchical model, data is organized into a tree-like structure with a one-to-many relationship between two different types of data; for example, one department can have many courses.
Network Model

This is an extension of the hierarchical model. In this model, data is organized more like a graph, and a node is allowed to have more than one parent node.
Because more relationships are established in this model, the data is more closely related, which can make accessing the data easier and faster. This database model was used to map many-to-many data relationships.
Relational Model

In this model, data is organized in two-dimensional tables, and relationships are maintained by storing a common field.
This model was introduced by E. F. Codd in 1970, and since then it has been the most widely used database model.
Tables are also known as relations in the relational model.
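
Maintaining a relationship through a common field can be sketched as follows. This is only an illustration, assuming SQLite through Python's standard sqlite3 module; the department and course tables and their contents are hypothetical. The column dept_id appears in both tables and is what links a course to its department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute(
    "CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT, dept_id INTEGER)"
)
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [(1, "Computer Science"), (2, "Physics")],
)
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [(10, "DBMS", 1), (11, "Networks", 1), (20, "Optics", 2)],
)

# The common field dept_id relates each course row to one department row.
rows = conn.execute(
    """
    SELECT d.dept_name, c.title
    FROM course AS c
    JOIN department AS d ON c.dept_id = d.dept_id
    ORDER BY c.course_id
    """
).fetchall()
print(rows)
```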
Entity-relationship Model
An ER model is a design or blueprint of a database that can later be implemented as a database. The main components of the E-R model are the entity set and the relationship set.
An ER diagram shows the relationships among entity sets. An entity set is a group of similar entities, and these entities can have attributes. In DBMS terms, an entity set typically corresponds to a table and its attributes to the columns of that table, so by showing the relationships among tables and their attributes, an ER diagram shows the complete logical structure of a database. For example, Student is an entity with attributes name, age, address, etc.
Entities are represented by means of rectangles. Rectangles are named with the entity set they represent.

Attributes are the properties of entities. Attributes are represented by means of ellipses. Every ellipse represents one attribute and is directly connected to its entity.

If the attributes are composite, they are further divided in a tree-like structure. Every node is then connected to its attribute. That is, composite attributes are represented by ellipses that are connected with an ellipse.
Multivalued attributes are depicted by a double ellipse.

Derived attribute: An attribute derived from other stored attributes. For example, age can be derived from date of birth and today's date.
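
A derived attribute is computed rather than stored. As a small sketch of the age example, the hypothetical function age_on below derives age in completed years from a stored date of birth and a given "today" date.

```python
from datetime import date


def age_on(date_of_birth, today):
    # Start from the difference in calendar years.
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not been reached yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


print(age_on(date(2000, 6, 15), date(2024, 6, 14)))
print(age_on(date(2000, 6, 15), date(2024, 6, 15)))
```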
Relationship

Relationships are represented by a diamond-shaped box. The name of the relationship is written inside the diamond. All the entities (rectangles) participating in a relationship are connected to it by a line.

One-to-one − When only one instance of the entity on each side is associated with the relationship, it is marked as '1:1'.

Example??
One-to-many − When one instance of the entity on the left can be associated with more than one instance of the entity on the right, it is marked as '1:N'.

Many-to-one − When more than one instance of the entity on the left can be associated with one instance of the entity on the right, it is marked as 'N:1'.
Many-to-many − More than one instance of the entity on the left and more than one instance of the entity on the right can be associated with the relationship.

Examples??
ER diagram
Weak Entity And Strong Entity:

An entity that cannot be uniquely identified by its own attributes and relies on a relationship with another entity is called a weak entity. A weak entity is represented by a double rectangle.
A strong entity is one whose existence does not depend on the existence of any other entity in the schema. It is denoted by a single rectangle. A strong entity always has a primary key.
For example, in the ER diagram above, for each loan there should be at least one borrower; otherwise that loan would not be listed in the Loan entity set. But even if a customer does not borrow any loan, the customer would still be listed in the Customer entity set. So we can conclude that a customer entity does not depend on a loan entity.
E R notations
Data Generalization

Generalization is a process in which the common attributes of more than one entity form a new entity. This newly formed entity is called a generalized entity.

Generalization Example
Let's say we have two entities, Student and Teacher.
Attributes of entity Student are: Name, Address & Grade
Attributes of entity Teacher are: Name, Address & Salary
These two entities have two common attributes, Name and Address, so we can make a generalized entity with these common attributes.

Note:
1. Generalization uses bottom-up approach where two or more lower level
entities combine together to form a higher level new entity.
2. The new generalized entity can further combine together with lower level
entity to create a further higher level generalized entity.
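
The Student and Teacher example can be sketched with class inheritance, which is one common way to express generalization in code. The name Person for the generalized entity is an assumption made for this illustration; the source does not name it.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    # Generalized entity: holds the attributes common to both lower-level entities.
    name: str
    address: str


@dataclass
class Student(Person):
    grade: str


@dataclass
class Teacher(Person):
    salary: int


# The attributes shared by Student and Teacher are exactly those of Person.
student_attrs = {f.name for f in fields(Student)}
teacher_attrs = {f.name for f in fields(Teacher)}
common = student_attrs & teacher_attrs
print(sorted(common))
```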
Data Specialization

Specialization is a process in which an entity is divided into sub-entities. You can think of it as the reverse process of generalization. Specialization is a top-down process.
The idea behind specialization is to find the subsets of entities that have a few distinguishing attributes. For example, consider an entity Employee, which can be further classified into the sub-entities Technician, Engineer & Accountant because these sub-entities have some distinguishing attributes.
Data Aggregation
Aggregation is a process in which the relationship between two entities is treated as a single entity.

For example, the relationship between Center and Course, taken together, acts as an entity which is in a relationship with another entity, Visitor. In the real world, if a visitor or a student visits a coaching center, he/she will never enquire about the center only or just about the course; rather, he/she will enquire about both.
Constraints in DBMS
Constraints enforce limits to the data or type of data that can be
inserted/updated/deleted from a table. The whole purpose of constraints is to
maintain the data integrity during an update/delete/insert into a table.

Types of constraints
NOT NULL
UNIQUE
DEFAULT
CHECK
Key Constraints – PRIMARY KEY, FOREIGN KEY
Domain constraints
Mapping constraints
NOT NULL:
The NOT NULL constraint makes sure that a column does not hold a NULL value. When we don't provide a value for a particular column while inserting a record into a table, it takes the NULL value by default. By specifying the NOT NULL constraint, we can be sure that a particular column (or columns) cannot have NULL values.

Example:
CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (235),
PRIMARY KEY (ROLL_NO) );
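
The effect of NOT NULL can be checked by running the table definition above. This sketch assumes SQLite through Python's standard sqlite3 module; other DBMSs report the violation with their own error types. The inserted values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE STUDENT (
        ROLL_NO INT NOT NULL,
        STU_NAME VARCHAR (35) NOT NULL,
        STU_AGE INT NOT NULL,
        STU_ADDRESS VARCHAR (235),
        PRIMARY KEY (ROLL_NO)
    )
    """
)

# STU_ADDRESS has no NOT NULL constraint, so leaving it out stores NULL.
conn.execute("INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (1, 'Asha', 20)")
address = conn.execute(
    "SELECT STU_ADDRESS FROM STUDENT WHERE ROLL_NO = 1"
).fetchone()[0]

# STU_NAME is NOT NULL, so leaving it out is rejected.
try:
    conn.execute("INSERT INTO STUDENT (ROLL_NO, STU_AGE) VALUES (2, 21)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(address, rejected)
```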
UNIQUE:
UNIQUE Constraint enforces a column or set of columns to have unique values. If a
column has a unique constraint, it means that particular column cannot have
duplicate values in a table.

CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL UNIQUE,
STU_AGE INT NOT NULL,
STU_ADDRESS VARCHAR (35) UNIQUE,
PRIMARY KEY (ROLL_NO) );
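
A duplicate value in a UNIQUE column is rejected at insert time. The sketch below runs the table definition above, assuming SQLite through Python's standard sqlite3 module; the inserted values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE STUDENT (
        ROLL_NO INT NOT NULL,
        STU_NAME VARCHAR (35) NOT NULL UNIQUE,
        STU_AGE INT NOT NULL,
        STU_ADDRESS VARCHAR (35) UNIQUE,
        PRIMARY KEY (ROLL_NO)
    )
    """
)
conn.execute("INSERT INTO STUDENT VALUES (1, 'Asha', 20, 'Pune')")

# A second row with the same STU_NAME violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO STUDENT VALUES (2, 'Asha', 21, 'Delhi')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM STUDENT").fetchone()[0]
print(duplicate_rejected, row_count)
```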

DEFAULT:
The DEFAULT constraint provides a default value to a column when there is no value
provided while inserting a record into a table.

CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL,
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
STU_ADDRESS VARCHAR (35),
PRIMARY KEY (ROLL_NO) );
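
The default value is used only when the insert does not supply one. This sketch runs the table definition above, assuming SQLite through Python's standard sqlite3 module; the inserted values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE STUDENT (
        ROLL_NO INT NOT NULL,
        STU_NAME VARCHAR (35) NOT NULL,
        STU_AGE INT NOT NULL,
        EXAM_FEE INT DEFAULT 10000,
        STU_ADDRESS VARCHAR (35),
        PRIMARY KEY (ROLL_NO)
    )
    """
)

# No EXAM_FEE given: the DEFAULT value is stored.
conn.execute("INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (1, 'Asha', 20)")
# EXAM_FEE given explicitly: the supplied value is stored.
conn.execute(
    "INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE, EXAM_FEE) VALUES (2, 'Ravi', 22, 12000)"
)

fees = conn.execute("SELECT ROLL_NO, EXAM_FEE FROM STUDENT ORDER BY ROLL_NO").fetchall()
print(fees)
```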
CHECK:
This constraint is used for specifying range of values for a particular column of a
table. When this constraint is being set on a column, it ensures that the specified
column must have the value falling in the specified range.

CREATE TABLE STUDENT(
ROLL_NO INT NOT NULL CHECK(ROLL_NO>1000),
STU_NAME VARCHAR (35) NOT NULL,
STU_AGE INT NOT NULL,
EXAM_FEE INT DEFAULT 10000,
PRIMARY KEY (ROLL_NO) );
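
A value outside the range given in CHECK is rejected. The sketch below runs the table definition above, assuming SQLite through Python's standard sqlite3 module; the helper try_insert and the inserted values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE STUDENT (
        ROLL_NO INT NOT NULL CHECK (ROLL_NO > 1000),
        STU_NAME VARCHAR (35) NOT NULL,
        STU_AGE INT NOT NULL,
        EXAM_FEE INT DEFAULT 10000,
        PRIMARY KEY (ROLL_NO)
    )
    """
)


def try_insert(roll_no):
    # Return True if the row is accepted, False if a constraint rejects it.
    try:
        conn.execute(
            "INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (?, 'Asha', 20)",
            (roll_no,),
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted_in_range = try_insert(1001)
accepted_out_of_range = try_insert(500)
print(accepted_in_range, accepted_out_of_range)
```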

Key Constraints – PRIMARY KEY, FOREIGN KEY

A primary key uniquely identifies each record in a table. It must have unique values and cannot contain nulls. In the examples above, the ROLL_NO field is marked as the primary key, which means the ROLL_NO field cannot have duplicate or null values.

FOREIGN KEY:
Foreign keys are the columns of a table that point to the primary key of another table. They act as a cross-reference between tables.
Domain constraints:
Each table has a certain set of columns, and each column allows only one type of data, based on its data type. The column does not accept values of any other data type.
A domain constraint can be seen as a user-defined data type, which we can define like this:
Domain Constraint = data type + Constraints (NOT NULL / UNIQUE / PRIMARY KEY / FOREIGN KEY / CHECK / DEFAULT)

Mapping constraints:
Mapping Cardinality:
One to One: An entity of entity-set A can be associated with at most one entity of
entity-set B and an entity in entity-set B can be associated with at most one entity of
entity-set A.
One to Many: An entity of entity-set A can be associated with any number of entities of
entity-set B and an entity in entity-set B can be associated with at most one entity of
entity-set A.
Many to One: An entity of entity-set A can be associated with at most one entity of
entity-set B and an entity in entity-set B can be associated with any number of entities
of entity-set A.
Many to Many: An entity of entity-set A can be associated with any number of entities
of entity-set B and an entity in entity-set B can be associated with any number of
entities of entity-set A.
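
The four definitions above can be turned into a small check over a set of associations. This is a sketch only: the function mapping_cardinality is hypothetical, and it reports what a given set of (A, B) pairs exhibits, whereas a real mapping constraint is a rule on the design that every future instance must obey.

```python
def mapping_cardinality(pairs):
    """Classify associations between entity-set A and entity-set B.

    pairs is an iterable of (a, b) tuples, one per association.
    """
    a_to_b = {}
    b_to_a = {}
    for a, b in pairs:
        a_to_b.setdefault(a, set()).add(b)
        b_to_a.setdefault(b, set()).add(a)

    # Some entity of A is associated with more than one entity of B.
    a_has_many = any(len(partners) > 1 for partners in a_to_b.values())
    # Some entity of B is associated with more than one entity of A.
    b_has_many = any(len(partners) > 1 for partners in b_to_a.values())

    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


# One department (A) offers several courses (B).
print(mapping_cardinality([("dept1", "course1"), ("dept1", "course2")]))
```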
How to draw a basic ER diagram

Purpose and scope: Define the purpose and scope of what you’re analyzing or
modeling.
Entities: Identify the entities that are involved. When you’re ready, start drawing them
in rectangles (or your system’s choice of shape) and labeling them as nouns.
Relationships: Determine how the entities are all related. Draw lines between them to
signify the relationships and label them. Some entities may not be related, and that’s fine.
In different notation systems, the relationship could be labeled in a diamond, another
rectangle or directly on top of the connecting line.
Attributes: Layer in more detail by adding key attributes of entities. Attributes are often
shown as ovals.
Cardinality: Show whether each relationship is one-to-one, one-to-many or many-to-many.
Some examples of ER-Diagrams
Limitations of ER diagrams and models

Only for relational data: Understand that the purpose is to show relationships. ER
diagrams show only that relational structure.
Not for unstructured data: Unless the data is cleanly delineated into different fields,
rows or columns, ER diagrams are probably of limited use. The same is true of semi-
structured data, because only some of the data will be useful.
Difficulty integrating with an existing database: Using ER Models to integrate with
an existing database can be a challenge because of the different architectures.
No representation of data manipulation: It is difficult to show data manipulation in
ER model.
KEYS
•Keys play an important role in the relational database.

•They are used to uniquely identify any record or row of data in a table. They are also used to establish and identify relationships between tables.
1. Primary key

•It is the key chosen to identify one and only one instance of an entity uniquely. An entity can contain multiple keys, as we saw in the PERSON table. The key which is most suitable from that list becomes the primary key.
2. Candidate key
•A candidate key is an attribute or set of attributes that can uniquely identify
a tuple.
•The primary key is chosen from the candidate keys. The remaining candidate keys are as strong as the primary key.

•For example: In the EMPLOYEE table, id is best suited for the primary key. Other attributes that are also unique for each employee, like SSN, Passport_Number, License_Number, etc., are considered candidate keys.

3. Super Key
A super key is a set of attributes that can uniquely identify a tuple. A super key is a superset of a candidate key.

•Super keys would be EMPLOYEE_ID, (EMPLOYEE_ID, EMPLOYEE_NAME), etc.
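
The relation between super keys and candidate keys can be sketched as a uniqueness test over sample rows. The functions is_super_key and candidate_keys and the sample data are hypothetical. Note the limitation: a key is a property of the schema, so sample rows can only show that an attribute set fails to be a key, never prove that it will stay unique.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    # A super key has no two rows that agree on all of its attributes.
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    # A candidate key is a super key with no smaller super key inside it.
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            contains_smaller_key = any(set(key) <= set(attrs) for key in found)
            if is_super_key(rows, attrs) and not contains_smaller_key:
                found.append(attrs)
    return found


employees = [
    {"EMPLOYEE_ID": 1, "EMPLOYEE_NAME": "Asha", "SSN": "111"},
    {"EMPLOYEE_ID": 2, "EMPLOYEE_NAME": "Ravi", "SSN": "222"},
    {"EMPLOYEE_ID": 3, "EMPLOYEE_NAME": "Asha", "SSN": "333"},
]

print(is_super_key(employees, ("EMPLOYEE_ID", "EMPLOYEE_NAME")))
print(candidate_keys(employees, ("EMPLOYEE_ID", "EMPLOYEE_NAME", "SSN")))
```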


4. Foreign key
•Foreign keys are the columns of a table used to point to the primary key of another table.
•Every employee works in a specific department in a company, and employee and department are two different entities. So we don't store the department's information in the employee table. Instead, we link the two tables through the primary key of one of them.
•We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute in the EMPLOYEE table.
•In the EMPLOYEE table, Department_Id is the foreign key, and the two tables are related through it.
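
The EMPLOYEE and DEPARTMENT link can be sketched as follows. This assumes SQLite through Python's standard sqlite3 module, where foreign key enforcement has to be switched on per connection with PRAGMA foreign_keys; the columns other than Department_Id and the inserted values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE DEPARTMENT (Department_Id INTEGER PRIMARY KEY, Name TEXT)")
conn.execute(
    """
    CREATE TABLE EMPLOYEE (
        Employee_Id INTEGER PRIMARY KEY,
        Name TEXT,
        Department_Id INTEGER,
        FOREIGN KEY (Department_Id) REFERENCES DEPARTMENT (Department_Id)
    )
    """
)
conn.execute("INSERT INTO DEPARTMENT VALUES (10, 'Sales')")

# Department 10 exists, so this employee row is accepted.
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Asha', 10)")

# Department 99 does not exist, so the foreign key rejects this row.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (2, 'Ravi', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
print(orphan_rejected, employee_count)
```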
