Database Management System

The document provides an overview of Database Management Systems (DBMS), detailing their function, popular software, applications across various sectors, and types of data models. It outlines the characteristics of DBMS, including support for ACID properties, data integrity, and security, as well as the different types of database languages such as DDL, DML, DCL, and TCL. Additionally, it describes DBMS architecture and the various types of database users, emphasizing the importance of DBMS in managing and manipulating data effectively.

Uploaded by

Sujeet
Copyright
© © All Rights Reserved

Unit-1

Basic Concept:

• Database Management System (DBMS)

A Database Management System (DBMS) is software for storing and retrieving users' data while
applying appropriate security measures. It consists of a group of programs that manipulate
the database. The DBMS accepts a request for data from an application and instructs the
operating system to provide the specific data. In large systems, a DBMS helps users and
third-party software to store and retrieve data.

A DBMS allows users to create their own databases according to their requirements. In broad
usage, the term "DBMS" also covers the users of the database and the application programs that
access it. It provides an interface between the data and the software application.

To define a database system:

• We need to specify the structure of the records of each file by defining the different types of
data elements to be stored in each record.
• We can also use a coding scheme to represent the values of a data item.
• A database typically consists of several tables (for example, five) that are related to one
another through foreign keys.

Popular DBMS Software

Here is a list of some popular DBMS software:

• MySQL
• Microsoft Access
• Oracle
• PostgreSQL
• dBASE
• FoxPro
• SQLite
• IBM DB2
• LibreOffice Base
• MariaDB
• Microsoft SQL Server

Application of DBMS

Sector              Use of DBMS
Banking             Customer information, account activities, payments, deposits, loans, etc.
Airlines            Reservations and schedule information.
Universities        Student information, course registrations, colleges and grades.
Telecommunication   Call records, monthly bills, maintaining balances, etc.
Finance             Information about stock, sales, and purchases of financial instruments like stocks and bonds.
Sales               Customer, product and sales information.
Manufacturing       Supply chain management, tracking production of items, and inventory status in warehouses.
HR Management       Information about employees, salaries, payroll, deductions, generation of paychecks, etc.

• Types of DBMS/Data Models


The main types of DBMS data models are:
• Flat Data Model
• Entity-Relationship Model
• Relational Model
• Record-based Model
• Network Model
• Hierarchical Model
• Object-oriented Data Model
• Object-relational Model
• Semi-structured Model
• Associative Model

1. Flat Data Model


The flat data model is the earliest traditional data model. All data is kept in a single
plane, for example in one file or table with no links between records. It is a very old model with
little formal basis.

2. Entity-Relationship Data Model


The entity-relationship data model is based on the notion of real-world entities and
the relationships that exist between them. When a real-world scenario is mapped onto the
database model, the entity sets are created first. The model then depends on
two things: the entities, which consist of attributes, and the relationships that
exist among the entities.

An entity has real-world properties called attributes. Each attribute is defined by a set of
permitted values known as its domain. For example, in an office database the employee is an
entity, and the employee ID and name are its attributes. The logical association between
different entities is known as the relationship among them.
3. Relational Data Model
The relational data model is the most popular and most widely used data model. It stores data
in tables called relations. The relations are normalized, so every value held in a relation is
atomic (indivisible). Each row of a relation is called a tuple, and each tuple is unique. Each
column is an attribute, and all values in a column are drawn from the same domain.
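
As a rough illustration of these terms, the sketch below builds one relation with Python's built-in sqlite3 module. The table name, column names and sample rows are invented for this example and do not come from the notes.

```python
import sqlite3

# In-memory database; the "employee" relation is a made-up example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"  # key attribute: unique for every tuple
    " name   TEXT NOT NULL,"        # attribute whose domain is text
    " salary REAL)"                 # attribute whose domain is numeric
)
conn.executemany(
    "INSERT INTO employee (emp_id, name, salary) VALUES (?, ?, ?)",
    [(1, "Asha", 52000.0), (2, "Ravi", 48000.0)],
)

# Each row that comes back is one tuple of the relation.
rows = conn.execute(
    "SELECT emp_id, name, salary FROM employee ORDER BY emp_id"
).fetchall()
print(rows)
```

Every value stored here is atomic: a single number or a single string, never a list or a nested record.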

4. Network Data Model


In the network data model, the entities are organized as a graph. There may
be several paths in the graph through which an entity can be accessed.

5. Hierarchical Data Model


The hierarchical model is based on the parent-child relationship. In this model, one
parent entity can have several child entities. At the top, there is exactly one entity, which is
called the root. For example, an organization is the root (parent) entity and it has several child
entities such as clerk, officer, and many more.

6. Object-oriented Data Model


The object-oriented data model is one of the most developed data models and can hold video,
graphics files, and audio. Each object consists of a piece of data together with the methods that
operate on it, expressed as database management system instructions.

7. Record-based Data Model


The record-based data model is used to determine the overall design of the database. This data
model contains different kinds of record types. Each record type has a fixed length and a
fixed number of fields.

8. Object-relational Data Model


The object-relational data model is powerful, but designing an object-relational
database is complex. Because the model gives efficient results and is widely used in large
applications, part of that complexity is usually accepted. It also
offers the ability to work with other data models. In particular, using the object-relational data
model we can also work with the relational model.

9. Semi-structured Data Model


The semi-structured data model is a self-describing data model. The data stored in this model is
generally associated with a schema that is contained within the data itself. This is known as the
self-describing property.

10. Associative Data Model


The associative data model follows the principle of dividing data into two kinds: entities
and associations. The model therefore divides the data of every real-world scenario into entities
and associations.
• Characteristics of Database Management System

1. Real World Entity


A modern DBMS is realistic: it uses real-world entities to design its architecture, and it also
uses their behavior and attributes. For example, in an
organization database an employee is an entity and the employee ID is an attribute.
2. Self-Describing Nature
Before DBMS, traditional file management systems were used for storing information and data.
Traditional file management systems had no concept of a data definition like the one a DBMS has. A
DBMS is self-describing because it contains not only the database itself but also the
metadata. Metadata (data about data) defines and describes not only the extent, type, structure
and format of all data but also the relationships between data. In this way the data itself
describes what actions can be taken on it.
3. Support ACID Properties
A DBMS supports the ACID (Atomicity, Consistency, Isolation, and Durability) properties.
Every DBMS ensures that the real purpose of the data is not lost while performing
transactions such as delete, insert and update. For example, if an employee name is
updated, the DBMS should ensure that there is no duplicate data and no mismatch in the employee
information.
4. Concurrent Use of Database
Many users are likely to access the data at the same time, and they may
need to alter the database concurrently. A DBMS allows them to
use the database concurrently without any problem. Concurrency improves the efficiency of the
system. For example, employees of a railway reservation system can book and
access tickets for passengers concurrently. Every employee can see on their own interface how
many seats are available or whether a coach is fully booked.

5. Insulation Between Data and Program


Program-data independence is a big relief to database users. In a traditional file management
system, the structure of the data files was defined in the application programs, so whenever that
structure changed the user had to change all the programs that used that particular data file.
In a DBMS, the structure of the data files is not stored in the program but in the system
catalogue. As a result, internal improvements in data efficiency or other changes to the data
do not affect the application software.
6. Transactions
A transaction is a group of actions that takes the database from one consistent state to
a new consistent state. Traditional file-based systems did not have this feature. A transaction is
always atomic, which means it cannot be divided further: it is either completed in full or not at all.
For example, a person wants to transfer money from their account to another person's account. The
transaction is complete only if the sender's account is debited and the receiver's account is credited.
Any other outcome leaves the database in an inconsistent state.
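
The money-transfer example can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the account table, the balances and the transfer helper are hypothetical, and a CHECK constraint stands in for the rule that a balance may not go negative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst: both updates apply, or neither does."""
    try:
        # "with conn" commits on success and rolls back if an error is raised.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative; the credit is undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # fails and is rolled back
print(ok, failed, balances(conn))
```

The credit is deliberately issued before the debit, so the failed transfer shows the rollback undoing a change that had already been applied inside the transaction.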
7. Data Persistence
Persistence means that data is maintained in the DBMS until it is removed explicitly.
The life span of data stored in the DBMS is decided by
the users, directly or indirectly, and not by system failures. Committed data is not lost when
the system fails. If a failure happens in the middle of a transaction, the transaction is either
rolled back or fully completed, so the data is not left in an inconsistent state.
8. Backup and Recovery
The whole database can fail. If that happens and there is no way to get the
database back, the company will suffer a big loss. The solution is to take backups of
the database so that it can be restored whenever needed. A database must have this
characteristic to be dependable.
9. Data Integrity
This is one of the most important characteristics of a database management system. Integrity
ensures the quality and reliability of the database system. It protects the database against
unauthorized access, which makes it more secure, and it allows only consistent and accurate data
into the database.
10. Multiple Views
Users can have multiple views of the database depending on their department and interest. A DBMS
supports multiple views of the database for its users. For example, a user in the teaching
department has one view and a user in the hostel department has a different one. This feature also
gives users some security, because users of other departments cannot access their files.
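
A small sketch of the teaching and hostel example, using views in Python's built-in sqlite3 module. The student table, its columns and the view names are assumptions made for illustration. SQLite has no user accounts, so the sketch shows only the different views, not the access restriction a server DBMS would add by granting each department rights on its own view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY, name TEXT, grade TEXT, hostel_room TEXT)"
)
conn.execute("INSERT INTO student VALUES (1, 'Meera', 'A', 'H-12')")

# Each department sees only the columns that concern it.
conn.execute(
    "CREATE VIEW teaching_view AS SELECT id, name, grade FROM student"
)
conn.execute(
    "CREATE VIEW hostel_view AS SELECT id, name, hostel_room FROM student"
)

teaching = conn.execute("SELECT * FROM teaching_view").fetchall()
hostel = conn.execute("SELECT * FROM hostel_view").fetchall()
print(teaching)
print(hostel)
```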

11. Stores Any Kind of Data


A database management system should be able to store any kind of data. It should not be
restricted to employee name, salary and address. Any kind of data that exists in the real world can
be stored in a DBMS, because we need to work with all the kinds of data that are present around us.
12. Security
A DBMS provides security for the data stored in it because different users have different rights to
access the database. Some users can access the whole database, while others can access only a small
part of it. For example, a computer network lecturer can access only the files that are related to
computer subjects, but the HOD of the department can access the files of all subjects related to
the department.
13. Represents Complex Relationship Between Data
Data stored in a database is interconnected, and relationships are defined between data
items. A DBMS should be able to represent complex relationships between data to make efficient
and accurate use of the data.
14. Query Language
Queries are used to retrieve and manipulate data, and a DBMS provides a strong query language
that makes this effective and efficient. Users can retrieve any kind of data they
want from the database by applying different sets of queries. A file-based system does not have
the luxury of a query language.

Database Language
o A DBMS has appropriate languages and interfaces to express database queries and
updates.
o Database languages can be used to read, store and update the data in the database.

Types of Database Language


1. Data Definition Language

o DDL stands for Data Definition Language. It is used to define database structure or pattern.
o It is used to create schema, tables, indexes, constraints, etc. in the database.
o Using the DDL statements, you can create the skeleton of the database.
o Data definition language is used to store the information of metadata like the number of
tables and schemas, their names, indexes, columns in each table, constraints, etc.

Here are some tasks that come under DDL:

o Create: It is used to create objects in the database.


o Alter: It is used to alter the structure of the database.
o Drop: It is used to delete objects from the database.
o Truncate: It is used to remove all records from a table.
o Rename: It is used to rename an object.
o Comment: It is used to comment on the data dictionary.

These commands are used to update the database schema, which is why they come under Data
Definition Language.
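
A minimal sketch of the Create, Alter, Rename and Drop tasks, run through Python's built-in sqlite3 module. The dept table and its columns are hypothetical. SQLite is used only because it needs no server; it has no TRUNCATE or COMMENT statement, so those two tasks are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names(conn):
    """List the tables currently defined in the schema."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [r[0] for r in rows]


# Create: define a new object in the database.
conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, title TEXT)")
# Alter: change the structure of an existing table.
conn.execute("ALTER TABLE dept ADD COLUMN location TEXT")
# Rename: give the object a new name.
conn.execute("ALTER TABLE dept RENAME TO department")

after_rename = table_names(conn)
columns = [row[1] for row in conn.execute("PRAGMA table_info(department)")]

# Drop: delete the object from the database.
conn.execute("DROP TABLE department")
after_drop = table_names(conn)
print(after_rename, columns, after_drop)
```

None of these statements touches the rows themselves; they only change the schema, which is what separates DDL from the other language groups.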

2. Data Manipulation Language


DML stands for Data Manipulation Language. It is used for accessing and manipulating data in a
database. It handles user requests.

Here are some tasks that come under DML:

o Select: It is used to retrieve data from a database.
o Insert: It is used to insert data into a table.
o Update: It is used to update existing data within a table.
o Delete: It is used to delete records from a table.
o Merge: It performs an UPSERT operation, i.e., an insert or an update.
o Call: It is used to call a PL/SQL or Java subprogram.
o Explain Plan: It describes the access path the DBMS will use to reach the data.
o Lock Table: It controls concurrency.
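
The four core DML tasks can be tried out with Python's built-in sqlite3 module, as in the sketch below. The product table and its rows are invented for illustration. Merge, Call, Explain Plan and Lock Table are left out because their syntax differs from one DBMS to another.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
)

# Insert: add new rows to the table.
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "pen", 10.0), (2, "ink", 25.0), (3, "pad", 40.0)],
)
# Update: change existing data in the rows that match the condition.
conn.execute("UPDATE product SET price = 12.0 WHERE id = 1")
# Delete: remove only the rows that match the condition.
conn.execute("DELETE FROM product WHERE id = 3")
# Select: retrieve data.
remaining = conn.execute(
    "SELECT id, name, price FROM product ORDER BY id"
).fetchall()
print(remaining)
```

Without a WHERE clause, the UPDATE and DELETE statements would affect every row in the table.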

3. Data Control Language


o DCL stands for Data Control Language. It is used to control access to the data stored in the
database by granting and revoking user privileges.
o In some systems DCL execution is transactional, so it can be rolled back.

(In Oracle Database, however, the execution of data control language cannot be rolled back.)

Here are some tasks that come under DCL:

o Grant: It is used to give a user access privileges to a database.
o Revoke: It is used to take back permissions from the user.

The following privileges can be granted and revoked:

CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.

4. Transaction Control Language


TCL stands for Transaction Control Language. It is used to manage the changes made by DML
statements, which TCL allows to be grouped into logical transactions.

Here are some tasks that come under TCL:

o Commit: It is used to save the transaction permanently in the database.
o Rollback: It is used to restore the database to its state at the last Commit.

DBMS Architecture
The architecture of a DBMS depends on the computer system on which it runs. For example, in a
client-server DBMS architecture, the database system on the server machine can serve several
requests made by client machines. We will understand this communication with the help of
diagrams.

Types of DBMS Architecture


There are three types of DBMS architecture:

1. Single tier architecture


2. Two tier architecture
3. Three tier architecture
1. Single tier architecture
In this type of architecture, the database is available directly on the client machine, so a request
made by the client does not need a network connection to act on the database.

For example, suppose you want to fetch employee records from a database that is
stored on your own computer. The request to fetch the employee details is
made by your computer, and the records are fetched from the database by your computer as
well. This type of system is generally referred to as a local database system.

2. Two tier architecture

In two-tier architecture, the database system is present on the server machine and the DBMS
application is present on the client machine. The two machines are connected to each other
through a reliable network.

Whenever the client machine makes a request to access the database on the server using a query
language like SQL, the server performs the request on the database and returns the result to
the client. Application connection interfaces such as JDBC and ODBC are used for the interaction
between server and client.

3. Three tier architecture


In three-tier architecture, another layer is present between the client machine and the server machine.
In this architecture, the client application does not communicate directly with the database system
on the server machine. Instead, the client application communicates with a server application,
and the server application communicates internally with the database system on the
server.

• Database Users
Database users are the people who actually use the database and benefit from it. There are
different types of users, depending on their needs and their way of accessing the database.

1. Application Programmers – They are the developers who interact with the database
by means of DML queries. These DML queries are written inside application programs in languages
like C, C++, Java, Pascal etc. The queries are converted into object code to communicate
with the database. For example, a C program that generates a report of the employees
working in a particular department will involve a query to fetch the data from the
database. It will include an embedded SQL query in the C program.
2. Sophisticated Users – They are database developers who write SQL queries to
select/insert/delete/update data. They do not use any application program to send requests to
the database. They interact with the database directly by means of a query language like
SQL. These users are scientists, engineers and analysts who study SQL and
DBMS thoroughly to apply the concepts to their requirements. In short, this category
includes designers and developers of DBMS and SQL.
3. Specialized Users – These are also sophisticated users, but they write special database
application programs. They are the developers who build complex programs to meet a specific
requirement.
4. Stand-alone Users – These users have a stand-alone database for their personal
use. Such databases come as ready-made database packages with
menus and graphical interfaces.
5. Naive Users – These are the users who use an existing application to interact with the
database. Examples are online library systems, ticket booking systems and ATMs, where users
work through an existing application to interact with the database and fulfill their requests.

Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the ability to modify the schema at one level of
the database system without altering the schema at the next higher level.

There are two types of data independence:

1. Logical Data Independence


o Logical data independence refers to the ability to change the conceptual
schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual view.
o If we make any changes in the conceptual view of the data, the user view of the data
is not affected.
o Logical data independence occurs at the user interface level.

2. Physical Data Independence


o Physical data independence can be defined as the capacity to change the internal schema
without having to change the conceptual schema.
o If we make any changes in the storage size of the database system server, the
conceptual structure of the database is not affected.
o Physical data independence is used to separate the conceptual level from the internal level.
o Physical data independence occurs at the logical interface level.

Fig: Data Independence


Unit-2
Database Design using ER Model:

• ER model
o ER model stands for Entity-Relationship model. It is a high-level data model. This model
is used to define the data elements and relationships for a specified system.
o It develops a conceptual design for the database. It also gives a very simple,
easy-to-design view of the data.
o In ER modeling, the database structure is portrayed as a diagram called an entity-relationship
diagram.

For example, suppose we design a school database. In this database, the student is an entity with
attributes like address, name, id, age, etc. The address can be another entity with attributes like city, street
name, pin code, etc., and there is a relationship between the two.

Components of ER Diagram

1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity is
represented as a rectangle.
Consider an organization as an example: manager, product, employee, department etc. can each be
taken as an entity.

a. Weak Entity

An entity that depends on another entity is called a weak entity. A weak entity does not have a
key attribute of its own. A weak entity is represented by a double rectangle.

2. Attribute
An attribute describes a property of an entity. An ellipse is used to represent an attribute.
For example, id, age, contact number, name, etc. can be attributes of a student.

a. Key Attribute

The key attribute represents the main characteristic of an entity. It represents a primary
key. The key attribute is represented by an ellipse with the text underlined.
b. Composite Attribute

An attribute that is composed of several other attributes is known as a composite attribute. A
composite attribute is represented by an ellipse, and the ellipses of its component attributes are
connected to that ellipse.

c. Multivalued Attribute

An attribute that can have more than one value is known as a multivalued attribute.
A double oval is used to represent a multivalued attribute.

For example, a student can have more than one phone number.

d. Derived Attribute

An attribute that can be derived from another attribute is known as a derived attribute. It is
represented by a dashed ellipse.

For example, a person's age changes over time and can be derived from another attribute such as
date of birth.
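
The derived-attribute idea can be shown with a few lines of Python: only the date of birth is stored, and the age is computed whenever it is needed. The function name age_on and the sample dates are illustrative assumptions.

```python
from datetime import date


def age_on(date_of_birth, today):
    """Derive age in whole years from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


dob = date(2000, 6, 15)
age_before_birthday = age_on(dob, date(2024, 6, 14))
age_on_birthday = age_on(dob, date(2024, 6, 15))
print(age_before_birthday, age_on_birthday)
```

Because the age is never stored, it cannot become stale or disagree with the date of birth.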
3. Relationship
A relationship describes the relation between entities. A diamond or rhombus is used to represent a
relationship.

Types of relationship are as follows:

a. One-to-One Relationship

When only one instance of each entity is associated with the relationship, it is known as a
one-to-one relationship.

For example, a woman can marry one man, and a man can marry one woman.

b. One-to-many relationship

When only one instance of the entity on the left and more than one instance of the entity on the
right are associated with the relationship, it is known as a one-to-many relationship.

For example, a scientist can make many inventions, but each invention is made by one specific
scientist.
c. Many-to-one relationship

When more than one instance of the entity on the left and only one instance of the entity on the
right are associated with the relationship, it is known as a many-to-one relationship.

For example, a student enrolls in only one course, but a course can have many students.

d. Many-to-many relationship

When more than one instance of the entity on the left and more than one instance of the entity on
the right are associated with the relationship, it is known as a many-to-many relationship.

For example, an employee can be assigned to many projects, and a project can have many employees.

• Types of relationships
Between tables there are three types of relationships: One to One, One to Many, and Many to Many.

• One to One: One record in the first table relates to only one record in the second
table and vice versa. You may ask why, if the relationship is one to one, we do
not store the data in a single table rather than in two separate tables. The answer
is that we design it that way for security purposes. Suppose we want to store a
name, email, address, contact, and password. The password is
very sensitive, so we can store
it in a separate table that only certain people have access
to.
• One to Many: This is the most common type of relationship. One record in the first table
relates to many records in the second table, but one record of the second table can
relate to only one record of the first table. For example, there may be a one to many relationship
between a person and bank accounts, where one person can have many bank accounts
but a bank account can have only one specific owner (assuming joint bank accounts are
not allowed).

• Many to Many: One record in the first table relates to many records in the second table
and vice versa. Generally, in logical design we break one many to many relation down into two one
to many relations, and the intermediary table is referred to as a junction table. An
example is student and course, where one student can take many courses and
each course can be taken by many students.
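
The student and course example can be sketched with a junction table, using Python's built-in sqlite3 module. The table name enrollment, the column names and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);
-- Junction table: each row links one student to one course.
CREATE TABLE enrollment (
    student_id INTEGER REFERENCES student(student_id),
    course_id  INTEGER REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Anu'), (2, 'Bala');
INSERT INTO course  VALUES (10, 'DBMS'), (20, 'Networks');
INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many courses.
courses_of_anu = [r[0] for r in conn.execute(
    "SELECT c.title FROM course c "
    "JOIN enrollment e ON e.course_id = c.course_id "
    "WHERE e.student_id = 1 ORDER BY c.title")]

# One course, many students.
students_in_dbms = [r[0] for r in conn.execute(
    "SELECT s.name FROM student s "
    "JOIN enrollment e ON e.student_id = s.student_id "
    "WHERE e.course_id = 10 ORDER BY s.name")]
print(courses_of_anu, students_in_dbms)
```

The junction table takes part in two one to many relationships, one with student and one with course, which together express the many to many relationship.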

Roles of Database Management System
Database management systems are crucial links in the creation and
management of data. They are needed for the effective running and management of data, and they
help companies to move that data through their entire systems. Some of the reasons why
database management systems are important include the following:
1. A database management system is needed for data access within the company
Modern database management systems rely on a language called
Structured Query Language (SQL). This language is used to access, update and delete the data
held in their tables. Database systems such as Microsoft SQL Server and the open-source
MySQL also enable outside programs to access their data
through SQL queries. For example, a web page can display
product data and descriptions, photographs and prices. This information is easily available to the
user when the web server software is connected to the relational database management
system.
2. It is needed to maintain strong relationships between data
One of the most important functions of a relational database management system is that
it allows different data tables to relate to one another. When a database holds
product sales data in one table and sales employee data in
another, a relational database can manage the relationship between them
in a simple and systematic way. This in turn can help brand managers to understand
important statistics, such as which salesperson sells the most or which products are sold by
a particular salesperson.
3. This system allows newer and better updates
A useful database management system allows brand managers not just to enter
new information but also to update current information and delete information that they no
longer require. For example, when a salesperson sells 1,000 units, that person can
enter the transaction in the relational database management system, including
details such as the salesperson's name, the customer information, the product and the number of
units sold. The relational database management system will enter the new records
and update all the required information, allowing brands to track and sell their products
effectively.
4. It helps brand managers to search data in a better manner
A relational database management system also allows brand managers to build and maintain
their data over successive years. The tables in the relational database management
system allow brand managers to search through their entire system for a particular piece of information.
A company manager can easily find any information they need using particular criteria.
The same is available to customers, who can search by any feature they want, including price,
colour and brand. Storing information in a predictable and sequential format enables users to
find the information they need with ease.

With so much information available to companies, investing in a database management system
is of critical importance for brands across all sectors and groups. Today, virtually all companies
and brands run on database systems. These storehouses of organised information help brands
to store information of all kinds, which they can not only sort but also make available at the click of
a mouse. In short, database management systems help brands to track every part of their
business faster, more effectively and more efficiently than ever before.

• Structural Constraints:
Structural constraints are also called the structural properties of a database management system
(DBMS). Cardinality ratios and participation constraints taken together are called structural
constraints. The name "constraints" refers to the fact that these limitations must be imposed on the
data for the DBMS to be consistent with the requirements.

Structural constraints are represented by min-max notation. This is a pair of numbers (m, n)
that appears on the connecting line between an entity and a relationship. The minimum
number of times an entity can appear in the relationship is represented by m, whereas the maximum
number of times is denoted by n. If m is 0, the entity participates in the relationship
partially, whereas if m is greater than or equal to 1, the entity participates totally.
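
To make the (m, n) notation concrete, the sketch below checks a small set of relationship instances against a min-max pair in plain Python. The function satisfies_min_max and the employee and department data are hypothetical and exist only for this illustration.

```python
from collections import Counter


def satisfies_min_max(entities, relationship_pairs, position, m, n=None):
    """Return True if every entity appears between m and n times.

    position selects which side of each pair holds the entity;
    n=None means there is no upper limit.
    """
    counts = Counter(pair[position] for pair in relationship_pairs)
    for entity in entities:
        c = counts[entity]
        if c < m or (n is not None and c > n):
            return False
    return True


employees = ["e1", "e2", "e3"]
departments = ["d1", "d2"]
works_in = [("e1", "d1"), ("e2", "d1"), ("e3", "d2")]

# (1, 1): every employee works in exactly one department (total participation).
employee_ok = satisfies_min_max(employees, works_in, 0, 1, 1)
# (1, n): every department has at least one employee.
department_ok = satisfies_min_max(departments, works_in, 1, 1)
# An employee with no department violates m = 1 ...
unassigned_ok = satisfies_min_max(employees + ["e4"], works_in, 0, 1, 1)
# ... but is allowed under (0, 1), which is partial participation.
partial_ok = satisfies_min_max(employees + ["e4"], works_in, 0, 0, 1)
print(employee_ok, department_ok, unassigned_ok, partial_ok)
```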
• Extended ER Modeling Features
• Specialization – The process of designating subgroupings within an entity set is called
specialization. In the figure above, "person" is specialized into "employee"
and "customer".

Formally, in the figure above, specialization is depicted by a triangle component labelled ISA
("is a"), meaning for example that a customer is a person.

The ISA ("is a") relationship is sometimes referred to as a superclass-subclass relationship. It is
also used to emphasize the creation of distinct lower-level entity sets.

• Generalization – Generalization is a relationship that exists between a higher-level entity set and
one or more lower-level entity sets. Generalization synthesizes these entity sets into a single
entity set.

• Higher-level and lower-level entity sets – This property is created by specialization and
generalization. The attributes of higher-level entity sets are inherited by lower-level entity sets.

For example, in the figure above "customer" and "employee" inherit the attributes of "person".

• Attribute inheritance: When a given entity set is involved as a lower-level entity set in only one ISA
("is a") relationship, it is referred to as single attribute inheritance. If a lower-level entity set is
involved in more than one ISA ("is a") relationship, it is referred to as multiple attribute inheritance.

• Aggregation: One limitation of the E-R model is that it cannot express relationships
among relationships. Aggregation is an abstraction through which a relationship is treated
as a higher-level entity.

Design of an ER Database Schema


 The data which is stored in the database at a particular moment of time is called an instance of the
database.
 The overall design of a database is called schema.
 A database schema is the skeleton structure of the database. It represents the logical view of the
entire database.
 A schema contains schema objects like table, foreign key, primary key, views, columns, data
types, stored procedure, etc.
 A database schema can be represented by using the visual diagram. That diagram shows the
database objects and relationship with each other.
 A database schema is designed by the database designers to help programmers whose software
will interact with the database. The process of database creation is called data modeling.

A schema diagram can display only some aspects of a schema, such as the names of record types,
data items, and some constraints. Other aspects can’t be specified through the schema diagram.
For example, the given figure shows neither the data type of each data item nor the relationships
among the various files.
In the database, the actual data changes quite frequently. For example, in the given figure, the
database changes whenever we add a new grade or add a student. The data at a particular
moment of time is called the instance of the database.

 Reduction of an E-R Schema to Tables

A database that conforms to an E-R database schema can be represented by a collection of
tables. For each entity set and for each relationship set, there is a unique table. A table is a chart
with rows and columns. The set of all possible rows is the Cartesian product of the domains of all columns.
A row is also known as a tuple or a record. A table may contain any number of rows.
Each column is also known as a field.
Strong Entity Sets
It is common practice for the table to have the same name as the entity set. There is one column
for each attribute.
Weak Entity Sets
There is one column for each attribute, plus the attribute(s) that form the primary key of the strong
entity set that the weak entity set depends upon.
Relationship Sets
We represent a relationship set with a table that includes the primary-key attributes of each
participating entity set, plus any descriptive attributes of the relationship.
If one of the entity sets in the relationship is a weak entity set, the relationship table holds no
information that is not already in the table of the weak entity set, and it may therefore be omitted.
Another special case occurs if there is an existence dependency. In that case, you can combine the
two tables.
Multivalued Attributes
When an attribute is multivalued, remove the attribute from the table and create a new table with
the primary key and the attribute; each value is stored as a separate row.
Generalization
Create a table for the higher-level entity set. For each lower-level entity set, create a table with the
attributes for that specialization and include the primary key from the higher-level entity set.
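
The reduction rules above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the entity sets (person, employee, loan, payment), the multivalued phone attribute and all column names are assumptions made for the example and do not come from the figure.

```python
import sqlite3

# Hypothetical schema: names and columns are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Strong entity set: one column per attribute.
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
-- Multivalued attribute: its own table, one row per value.
CREATE TABLE person_phone (
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    phone     TEXT NOT NULL,
    PRIMARY KEY (person_id, phone)
);
-- Generalization: the lower-level table carries the higher-level primary key.
CREATE TABLE employee (
    person_id INTEGER PRIMARY KEY REFERENCES person(person_id),
    salary    INTEGER
);
-- Strong entity set on which a weak entity set depends.
CREATE TABLE loan (
    loan_no INTEGER PRIMARY KEY,
    amount  INTEGER
);
-- Weak entity set: its own attributes plus the owner's primary key.
CREATE TABLE payment (
    loan_no    INTEGER NOT NULL REFERENCES loan(loan_no),
    payment_no INTEGER NOT NULL,
    paid       INTEGER,
    PRIMARY KEY (loan_no, payment_no)
);
""")

conn.execute("INSERT INTO person VALUES (1, 'Asha')")
conn.executemany(
    "INSERT INTO person_phone VALUES (?, ?)",
    [(1, "555-0100"), (1, "555-0101")],
)
conn.execute("INSERT INTO employee VALUES (1, 50000)")

# Each phone value is a separate row of the multivalued-attribute table.
phones = [
    row[0]
    for row in conn.execute(
        "SELECT phone FROM person_phone WHERE person_id = 1 ORDER BY phone"
    )
]
# The inherited attribute (name) is reached through the shared primary key.
employee_row = conn.execute(
    "SELECT p.name, e.salary FROM person p "
    "JOIN employee e ON e.person_id = p.person_id"
).fetchone()
print(phones, employee_row)
```

The composite primary key of payment (owner key plus its own discriminator) is what makes rows of the weak entity set unique.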

 Relational Model
Relational Model (RM) represents the database as a collection of relations. A relation is nothing
but a table of values. Every row in the table represents a collection of related data values. These
rows in the table denote a real-world entity or relationship.

The table name and column names are helpful to interpret the meaning of values in each row. The
data are represented as a set of relations. In the relational model, data are stored as tables.
However, the physical storage of the data is independent of the way the data are logically
organized.

Some popular Relational Database management systems are:

 DB2 and Informix Dynamic Server - IBM


 Oracle and RDB – Oracle
 SQL Server and Access - Microsoft

Codd's Rule for Relational DBMS


E. F. Codd was a computer scientist who invented the relational model for database
management. The relational database was created on the basis of this model. Codd proposed
13 rules (numbered 0 to 12), popularly known as Codd's 12 rules, to test a DBMS against his
relational model. Codd's rules define the qualities a DBMS requires in order to become a Relational
Database Management System (RDBMS). To date, there is hardly any commercial product that
follows all 13 of Codd's rules.

Rule zero
This rule states that for a system to qualify as an RDBMS, it must be able to manage databases
entirely through its relational capabilities.

Rule 1: Information rule


All information (including metadata) is to be represented as stored data in cells of tables. The rows
and columns have to be strictly unordered.

Rule 2: Guaranteed Access

Each unique piece of data (atomic value) should be accessible by: Table Name + Primary Key (Row) +
Attribute (Column).
NOTE: The ability to access data directly via a pointer is a violation of this rule.

Rule 3: Systematic treatment of NULL


Null has several meanings: it can mean missing data, not applicable, or no value. It should be
handled consistently. Also, a primary key must never be null. An expression on NULL must give NULL.

Rule 4: Active Online Catalog


Database dictionary(catalog) is the structure description of the complete Database and it must be
stored online. The Catalog must be governed by same rules as rest of the database. The same
query language should be used on catalog as used to query database.

Rule 5: Powerful and Well-Structured Language


One well structured language must be there to provide all manners of access to the data stored in
the database. Example: SQL, etc. If the database allows access to the data without the use of this
language, then that is a violation.

Rule 6: View Updating Rule

All views that are theoretically updatable should also be updatable by the system.

Rule 7: Relational Level Operation


There must be Insert, Delete, Update operations at each level of relations. Set operation like
Union, Intersection and minus should also be supported.

Rule 8: Physical Data Independence


The physical storage of data should not matter to the system. If, say, a file supporting a table is
renamed or moved from one disk to another, it should not affect the application.

Rule 9: Logical Data Independence


If there is a change in the logical structure (table structures) of the database, the user's view of the
data should not change. Say a table is split into two tables: a new view should give the same result
as the join of the two tables. This rule is the most difficult to satisfy.

Rule 10: Integrity Independence


The database should be able to enforce its own integrity rather than relying on other programs. Key
and check constraints, triggers, etc. should be stored in the data dictionary. This also
makes the RDBMS independent of the front end.

Rule 11: Distribution Independence


A database should work properly regardless of its distribution across a network. Even if a
database is geographically distributed, with data stored in pieces, the end user should get the
impression that it is stored in one place. This lays the foundation of distributed databases.

Rule 12: Nonsubversion Rule


If low-level access to a system is allowed, it should not be able to subvert or bypass the integrity
rules to change the data. This can be achieved by some sort of locking or encryption.

 Relational Model Concepts


1. Attribute: Each column in a table. Attributes are the properties which define a relation,
e.g., Student_Rollno, NAME, etc.
2. Tables – In the relational model, relations are saved in table format. A table is stored
along with its entities. A table has two properties: rows and columns. Rows represent
records and columns represent attributes.
3. Tuple – A single row of a table, which contains a single record.
4. Relation Schema: A relation schema represents the name of the relation with its
attributes.
5. Degree: The total number of attributes in the relation is called the degree of the
relation.
6. Cardinality: The total number of rows present in the table.
7. Column: The column represents the set of values for a specific attribute.
8. Relation instance – A relation instance is a finite set of tuples in the RDBMS.
Relation instances never have duplicate tuples.
9. Relation key – An attribute, or a combination of attributes, whose values uniquely
identify each row is called a relation key.
10. Attribute domain – Every attribute has a pre-defined set of permitted values, which is
known as its attribute domain.

 Relational Algebra
Relational algebra is a widely used procedural query language. It takes instances of
relations as input and gives instances of relations as output. It uses various operations to
perform this action. Relational algebra operations can be applied recursively: the output of
each operation is a new relation, which might be formed from one or more input relations and
can itself be used as the input of another operation.

Basic Relational Algebra Operations

Relational algebra operations are divided into various groups:

Unary Relational Operations


 SELECT (symbol: σ)
 PROJECT (symbol: π)
 RENAME (symbol: ρ)

Relational Algebra Operations From Set Theory


 UNION (∪)
 INTERSECTION (∩)
 DIFFERENCE (-)
 CARTESIAN PRODUCT (x)

Binary Relational Operations


 JOIN
 DIVISION
Let's study them in detail with solutions:

SELECT (σ)
The SELECT operation is used for selecting a subset of the tuples according to a given selection
condition. It is denoted by the sigma (σ) symbol. The select operator keeps only the tuples that
satisfy a given predicate.

σp(r)

σ is the selection operator

r stands for the relation, which is the name of the table

p is the predicate, written as a formula of propositional logic
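
As a minimal sketch of σp(r), a relation can be modelled in Python as a list of dictionaries and the predicate p as a function. The helper name select and the sample rows are assumptions made for this illustration; the rows mirror the Customers table used in the projection example.

```python
# A relation modelled as a list of tuples (dicts); sample data for illustration.
customers = [
    {"CustomerID": 1, "CustomerName": "Google", "Status": "Active"},
    {"CustomerID": 2, "CustomerName": "Amazon", "Status": "Active"},
    {"CustomerID": 3, "CustomerName": "Apple", "Status": "Inactive"},
    {"CustomerID": 4, "CustomerName": "Alibaba", "Status": "Active"},
]


def select(relation, predicate):
    """sigma_p(r): keep the tuples of r for which the predicate p holds."""
    return [t for t in relation if predicate(t)]


# sigma Status = 'Active' (Customers)
active = select(customers, lambda t: t["Status"] == "Active")
active_names = [t["CustomerName"] for t in active]
print(active_names)
```

Selection never changes the columns; it only reduces the number of rows.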

Projection(π)
The projection eliminates all attributes of the input relation except those mentioned in the
projection list. The projection operation defines a relation that contains a vertical subset of the
input relation.

It extracts the values of the specified attributes and eliminates duplicate rows from the result.
The pi (π) symbol is used to choose attributes from a relation. This operator keeps specific
columns of a relation and discards the other columns.

Example of Projection:

Consider the following table

CustomerID CustomerName Status

1 Google Active

2 Amazon Active

3 Apple Inactive

4 Alibaba Active

Here, the projection of CustomerName and Status is written as

Π CustomerName, Status (Customers)

and gives a relation that contains only the CustomerName and Status columns of every customer.
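
The projection can be sketched in Python as follows. The helper name project is an assumption made for this illustration; the rows are the ones from the Customers table above.

```python
customers = [
    {"CustomerID": 1, "CustomerName": "Google", "Status": "Active"},
    {"CustomerID": 2, "CustomerName": "Amazon", "Status": "Active"},
    {"CustomerID": 3, "CustomerName": "Apple", "Status": "Inactive"},
    {"CustomerID": 4, "CustomerName": "Alibaba", "Status": "Active"},
]


def project(relation, attributes):
    """pi_attributes(r): keep only the listed columns and drop duplicate rows."""
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in result:  # a relation is a set, so duplicates disappear
            result.append(row)
    return result


name_status = project(customers, ["CustomerName", "Status"])
status_only = project(customers, ["Status"])
print(name_status)
print(status_only)
```

Projecting on Status alone shows the duplicate elimination: four input rows give only two result rows.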


Rename (ρ)
Rename is a unary operation used for renaming the attributes of a relation.

ρ (a/b)R will rename the attribute 'b' of relation R to 'a'.

Union operation (∪)

UNION is symbolized by the ∪ symbol. It includes all tuples that are in table A or in table B. It
also eliminates duplicate tuples. So, set A UNION set B would be expressed as:

Result <- A ∪ B

For a union operation to be valid, the following conditions must hold -

 A and B must have the same number of attributes.
 Attribute domains need to be compatible.
 Duplicate tuples should be automatically removed.

Set Difference (-)


It is denoted by the - symbol. The result of A - B is a relation which includes all tuples that are
in A but not in B.

 The attribute names of A have to match the attribute names of B.
 The two operand relations A and B must be union-compatible.
 The result is a relation consisting of the tuples that are in relation A but not in B.

Intersection
An intersection is denoted by the symbol ∩

A ∩ B

It defines a relation consisting of the set of all tuples that are in both A and B. However, A and B
must be union-compatible.
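
Because relations are sets of tuples, the three set operations map directly onto Python's built-in set type. The two relations A and B below are sample data assumed for illustration; both have the same two attributes, so they are union-compatible.

```python
# Two union-compatible relations (same number and type of attributes).
A = {(1, "Google"), (2, "Amazon"), (3, "Apple")}
B = {(3, "Apple"), (4, "Alibaba")}

union = A | B          # A ∪ B: tuples in A or in B, duplicates removed
difference = A - B     # A - B: tuples in A but not in B
intersection = A & B   # A ∩ B: tuples in both A and B

print(sorted(union))
print(sorted(difference))
print(sorted(intersection))
```

Intersection is not a basic operation, since it can be derived from difference: A ∩ B = A - (A - B).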

 Relational Calculus
In contrast to relational algebra, which is a procedural query language that also explains how the
data is fetched, relational calculus is a non-procedural query language and gives no description of
how the query will work or how the data will be fetched. It focuses only on what to do, and not on
how to do it.
Relational Calculus exists in two forms:

1. Tuple Relational Calculus (TRC)


2. Domain Relational Calculus (DRC)
Tuple Relational Calculus (TRC)
In tuple relational calculus, we work on filtering tuples based on the given condition.
Syntax: { T | Condition }
In this form of relational calculus, we define a tuple variable and specify the table (relation) name
in which the tuple is to be searched for, along with a condition.
We can also specify a column name using the . (dot) operator with the tuple variable to get only a
certain attribute (column) in the result.
A lot of information, right! Give it some time to sink in.
A tuple variable is nothing but a name. It can be anything; generally we use a single letter for
this, so let's say T is a tuple variable.
To specify the name of the relation (table) in which we want to look for data, we do the following:
Relation(T), where T is our tuple variable.
For example, if our table is Student, we would put it as Student(T)
Then comes the condition part. To specify a condition applicable to a particular attribute (column),
we can use the . (dot) operator with the tuple variable. For example, in table Student, if we want
to get data for students with age greater than 17, we can write it as
T.age > 17, where T is our tuple variable.
Putting it all together, if we want to use Tuple Relational Calculus to fetch names of students, from
table Student, with age greater than 17, then, for T being our tuple variable,
{ T.name | Student(T) AND T.age > 17 }

Domain Relational Calculus (DRC)


In domain relational calculus, filtering is done based on the domain of the attributes and not based
on the tuple values.
Syntax: { c1, c2, c3, ..., cn | F(c1, c2, c3, ..., cn) }
where c1, c2, ... etc. are domain variables that range over the domains of the attributes (columns)
and F is the formula, including the condition for fetching the data.
For example,
{ < name, age > | < name, age > ∈ Student ∧ age > 17 }
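
Both forms of relational calculus say what is wanted rather than how to compute it, which is close in spirit to a Python comprehension. The sketch below is only an analogy; the Student rows are sample data assumed for illustration.

```python
# Hypothetical Student relation.
students = [
    {"name": "Ravi", "age": 19},
    {"name": "Meena", "age": 17},
    {"name": "John", "age": 21},
]

# TRC: { T.name | Student(T) AND T.age > 17 }
# T ranges over whole tuples of Student.
trc_result = [T["name"] for T in students if T["age"] > 17]

# DRC: { <name, age> | <name, age> ∈ Student ∧ age > 17 }
# name and age are domain variables that range over column values.
drc_result = [
    (name, age)
    for (name, age) in ((s["name"], s["age"]) for s in students)
    if age > 17
]

print(trc_result)
print(drc_result)
```

The only difference between the two is the kind of variable: TRC binds one variable to a whole row, while DRC binds one variable per column.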
Unit-3.
Introduction to SQL:

 SQL | Datatypes
1. Binary Datatypes :
There are four subtypes of this datatype which are given below :

2. Exact Numeric Datatype :


There are nine subtypes which are given below in the table. The table contains the range of data in
a particular type.
3. Approximate Numeric Datatype :
The subtypes of this datatype are given in the table with the range.

4. Character String Datatype :


The subtypes are given in below table –

5. Unicode Character String Datatype :


The details are given in below table –

6. Date and Time Datatype :
The details are given in below table.

 SQL: Literals
String Literals
String literals are always surrounded by single quotes (').
For example:

'TechOnTheNet.com'
'This is a literal'
'XYZ'
'123'

These string literal examples consist of strings enclosed in single quotes.

Integer Literals
Integer literals can be either positive numbers or negative numbers, but do not contain decimals. If
you do not specify a sign, then a positive number is assumed. Here are some examples of valid
integer literals:

536
+536
-536

Decimal Literals
Decimal literals can be either positive numbers or negative numbers and contain decimals. If you
do not specify a sign, then a positive number is assumed. Here are some examples of valid
decimal literals:

24.7
+24.7
-24.7

Datetime Literals
Datetime literals are character representations of datetime values that are enclosed in single
quotes. Here are some examples of valid datetime literals:

'April 30, 2015'


'2015/04/30'
'2015/04/30 08:34:25'

 Types Of SQL Commands


 SQL Categorizes its commands on the basis of functionalities performed by them. There
are five types of SQL Commands which can be classified as:
o DDL(Data Definition Language).
o DML(Data Manipulation Language).
o DQL(Data Query Language).
o DCL(Data Control Language).
o TCL(Transaction Control Language).

Types Of SQL Commands : Data Definition Language(DDL)

 In order to make changes to the structure of any table residing inside a database, DDL is
used. In many systems these commands are auto-committed when executed, so all the changes
to the table are reflected and saved immediately. DDL commands include CREATE, ALTER,
DROP, TRUNCATE and RENAME.

Types Of SQL Commands : Data Manipulation Language(DML)

 Once the tables are created and the database is generated using DDL commands,
manipulation inside those tables and databases is done using DML commands. The
advantage of using DML commands is that, if any wrong changes are made or wrong values
are entered, they can be corrected or rolled back easily. DML commands include INSERT,
UPDATE and DELETE.

Types Of SQL Commands : Data Control Language(DCL)

 DCL commands, as the name suggests, manage the matters and issues related to data
control in a database. DCL commands mainly grant privileges to users and are also used
to specify the roles of users accordingly. There are two commonly used DCL commands:
GRANT and REVOKE.

Types Of SQL Commands : Data Query Language(DQL)


 Data query language consists of only one command over which data selection in SQL
relies. SELECT command in combination with other SQL clauses is used to retrieve and
fetch data from database/tables on the basis of certain conditions applied by user.

Types Of SQL Commands : Transaction Control Language(TCL)


 Transaction Control Language, as the name suggests, manages the issues and matters
related to the transactions in a database. TCL commands are used to roll back or commit
the changes in the database.
 Rollback means “undoing” the changes and commit means “applying” the changes. There
are three major TCL commands: COMMIT, ROLLBACK and SAVEPOINT.
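
The command categories can be tried out with Python's built-in sqlite3 module, as in the sketch below. The account table is a hypothetical example. SQLite has no user accounts, so DCL commands (GRANT, REVOKE) are not supported there and are left out; the TCL commands are issued through the connection's commit() and rollback() methods.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of a table (hypothetical table for illustration).
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")

# DML: insert a row, then make the change permanent.
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()      # TCL: COMMIT applies the change

# DML: a wrong update that we decide to undo.
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.rollback()    # TCL: ROLLBACK undoes the uncommitted update

# DQL: SELECT retrieves the data.
balance = conn.execute(
    "SELECT balance FROM account WHERE id = 1"
).fetchone()[0]
print(balance)
```

The balance read at the end is the committed value, because the update was rolled back before it was committed.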

 SQL Operators
SQL statements generally contain some reserved words or characters that are used to perform
operations such as comparison and arithmetical operations etc. These reserved words or
characters are known as operators.

Generally there are three types of operators in SQL:

1. SQL Arithmetic Operators


2. SQL Comparison Operators
3. SQL Logical Operators

SQL Arithmetic Operators:


Let's assume two variables "a" and "b". Here "a" is valued 50 and "b" valued 100.

Example:

Operator Description Example

+ Adds the values of both operands a+b will give 150

- Subtracts the right-hand operand from the left-hand operand a-b will give -50

* Multiplies the values of both operands a*b will give 5000

/ Divides the left-hand operand by the right-hand operand b/a will give 2

% Divides the left-hand operand by the right-hand operand and returns the remainder b%a will give 0

SQL Comparison Operators:


Let's take two variables "a" and "b" that are valued 50 and 100.

Operator Description Example

= Checks whether the values of both operands are equal; if yes, the condition becomes true. (a=b) is not true

!= Checks whether the values of both operands are equal; if not, the condition becomes true. (a!=b) is true

<> Checks whether the values of both operands are equal; if not, the condition becomes true. (a<>b) is true

> Checks whether the left operand is greater than the right operand; if yes, the condition becomes true. (a>b) is not true

< Checks whether the left operand is less than the right operand; if yes, the condition becomes true. (a<b) is true

>= Checks whether the left operand is greater than or equal to the right operand; if yes, the condition becomes true. (a>=b) is not true

<= Checks whether the left operand is less than or equal to the right operand; if yes, the condition becomes true. (a<=b) is true

!< Checks whether the left operand is not less than the right operand; if yes, the condition becomes true. (a!<b) is not true

!> Checks whether the left operand is not greater than the right operand; if yes, the condition becomes true. (a!>b) is true

SQL Logical Operators:


This is the list of logical operators used in SQL.

Operator Description

ALL this operator is used to compare a value to all values in another value set.

AND this operator combines multiple conditions in an SQL statement; all of them must be true.

ANY this operator is used to compare a value to any applicable value in a list, according to the condition.

BETWEEN this operator is used to search for values that are within a range, given the minimum and maximum values.

IN this operator is used to compare a value to a specified list of values.

NOT the NOT operator reverses the meaning of the logical operator with which it is used.

OR this operator combines multiple conditions in an SQL statement; at least one of them must be true.

EXISTS the EXISTS operator is used to search for the presence of a row in a specified table.

LIKE this operator is used to compare a value to similar values using wildcard operators.
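
The operator tables above can be checked by letting a database evaluate the expressions. The sketch below uses Python's built-in sqlite3 module with a = 50 and b = 100, as in the text. Two assumptions apply: SQLite reports true and false as 1 and 0, and it does not support the !< and !> operators, so those two are not included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
a, b = 50, 100

# Arithmetic operators: a+b, a-b, a*b, b/a, b%a
arithmetic = conn.execute(
    "SELECT ? + ?, ? - ?, ? * ?, ? / ?, ? % ?",
    (a, b, a, b, a, b, b, a, b, a),
).fetchone()

# Comparison operators: a=b, a!=b, a<>b, a>b, a<b, a>=b, a<=b
comparison = conn.execute(
    "SELECT ? = ?, ? != ?, ? <> ?, ? > ?, ? < ?, ? >= ?, ? <= ?",
    (a, b) * 7,
).fetchone()

# A few logical operators on literal values.
logical = conn.execute(
    "SELECT 75 BETWEEN 50 AND 100, 50 IN (10, 50), "
    "'Kristen' LIKE 'K%', NOT (50 > 100)"
).fetchone()

print(arithmetic)
print(comparison)
print(logical)
```

Both operands are integers here, so b/a is an integer division.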

 SQL Table
o SQL Table is a collection of data which is organized in terms of rows and columns. In
DBMS, the table is known as relation and row as a tuple.
o Table is a simple form of data storage. A table is also considered as a convenient
representation of relations.

Let's see an example of the EMPLOYEE table:

EMP_ID EMP_NAME CITY PHONE_NO

1 Kristen Washington 7289201223

2 Anna Franklin 9378282882

3 Jackson Bristol 9264783838

4 Kellan California 7254728346

5 Ashley Hawaii 9638482678

In the above table, "EMPLOYEE" is the table name, "EMP_ID", "EMP_NAME", "CITY",
"PHONE_NO" are the column names. The combination of data of multiple columns forms a row,
e.g., 1, "Kristen", "Washington" and 7289201223 are the data of one row.

Operation on Table
1. Create table
2. Drop table
3. Delete table
4. Rename table

SQL Create Table


SQL create table is used to create a table in the database. To define the table, you should define the name of the
table and also define its columns and column's data type.

Syntax

CREATE TABLE "table_name"
("column1" "data type",
"column2" "data type",
"column3" "data type",
...
"columnN" "data type");

Example

SQL> CREATE TABLE EMPLOYEE (
EMP_ID INT NOT NULL,
EMP_NAME VARCHAR (25) NOT NULL,
PHONE_NO INT NOT NULL,
ADDRESS CHAR (30),
PRIMARY KEY (EMP_ID)
);

Drop table
The SQL DROP TABLE statement is used to delete a table definition and all the data in the table. When this command
is executed, all the information available in the table is lost forever, so you have to be very careful while using it.

Syntax

DROP TABLE "table_name";

Firstly, you need to verify the EMPLOYEE table using the following command:

SQL> DESC EMPLOYEE;

Field Type Null Key Default Extra

EMP_ID int(11) NO PRI NULL

EMP_NAME varchar(25) NO NULL

PHONE_NO int(11) NO NULL

ADDRESS char(30) YES NULL

4 rows in set (0.35 sec)

This output shows that the EMPLOYEE table is available in the database, so we can drop it as follows:

SQL> DROP TABLE EMPLOYEE;

Query OK, 0 rows affected (0.01 sec)

This message shows that the table has been dropped, so checking for the EMPLOYEE table again no longer displays it.

SQL DELETE table


In SQL, the DELETE statement is used to delete rows from a table. We can use a WHERE condition to delete specific rows
from a table. If you want to delete all the records from the table, omit the WHERE clause.

Syntax

DELETE FROM table_name WHERE condition;

Example

Suppose the EMPLOYEE table has the following records:

EMP_ID EMP_NAME CITY PHONE_NO SALARY

1 Kristen Chicago 9737287378 150000

2 Russell Austin 9262738271 200000

3 Denzel Boston 7353662627 100000

4 Angelina Denver 9232673822 600000

5 Robert Washington 9367238263 350000

6 Christian Los Angeles 7253847382 260000

The following query will DELETE the employee whose ID is 2.

SQL> DELETE FROM EMPLOYEE
WHERE EMP_ID = 2;
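
The difference between DELETE with a WHERE clause, DELETE without one, and DROP TABLE can be sketched with Python's built-in sqlite3 module. The table below is a shortened, assumed version of the EMPLOYEE table (three rows, no phone numbers), used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        EMP_ID   INTEGER NOT NULL PRIMARY KEY,
        EMP_NAME VARCHAR(25) NOT NULL,
        CITY     VARCHAR(30),
        SALARY   INTEGER
    )
""")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [
        (1, "Kristen", "Chicago", 150000),
        (2, "Russell", "Austin", 200000),
        (3, "Denzel", "Boston", 100000),
    ],
)

# DELETE with WHERE removes only the matching row.
conn.execute("DELETE FROM EMPLOYEE WHERE EMP_ID = 2")
remaining_ids = [
    row[0] for row in conn.execute("SELECT EMP_ID FROM EMPLOYEE ORDER BY EMP_ID")
]

# DELETE without WHERE removes every row, but the table itself stays.
conn.execute("DELETE FROM EMPLOYEE")
rows_left = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchall()[0][0]

# DROP TABLE removes the table definition as well.
conn.execute("DROP TABLE EMPLOYEE")
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'EMPLOYEE'"
).fetchall()

print(remaining_ids, rows_left, tables)
```

After DELETE the empty table can still be queried; after DROP TABLE it no longer appears in the catalog.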

Views and Indexes

A view is simply any SELECT query that has been given a name and saved in the database. For
this reason, a view is sometimes called a named query or a stored query. To create a view, you
use the SQL syntax:

CREATE OR REPLACE VIEW <view_name> AS
SELECT <any valid select query>;
 The view query itself is saved in the database, but it is not actually run until it is called with
another SELECT statement. For this reason, the view does not take up any disk space for data
storage, and it does not create any redundant copies of data that is already stored in the tables
that it references (which are sometimes called the base tables of the view).
 Although it is not required, many database developers identify views with names such as
v_Customers or Customers_view. This not only avoids name conflicts with base tables, it helps in
reading any query that uses a view.
 The keywords OR REPLACE in the syntax shown above are optional. Although you don’t need
to use them the first time that you create a view, including them will overwrite an older version of
the view with your latest one, without giving you an error message.
 The syntax to remove a view from your schema is exactly what you would expect:
DROP VIEW <view_name>;
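
The sketch below, using Python's built-in sqlite3 module, shows that a view stores the query and not the data. The Customers table and the view name v_ActiveCustomers are assumptions made for illustration. SQLite does not accept OR REPLACE for views, so the example uses DROP VIEW IF EXISTS followed by CREATE VIEW to get the same effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [(1, "Google", "Active"), (2, "Apple", "Inactive")],
)

# Equivalent of CREATE OR REPLACE VIEW in SQLite.
conn.execute("DROP VIEW IF EXISTS v_ActiveCustomers")
conn.execute(
    "CREATE VIEW v_ActiveCustomers AS "
    "SELECT id, name FROM Customers WHERE status = 'Active'"
)

before = conn.execute("SELECT name FROM v_ActiveCustomers ORDER BY id").fetchall()

# A new row in the base table is visible through the view immediately,
# because the view query runs each time the view is called.
conn.execute("INSERT INTO Customers VALUES (3, 'Amazon', 'Active')")
after = conn.execute("SELECT name FROM v_ActiveCustomers ORDER BY id").fetchall()

conn.execute("DROP VIEW v_ActiveCustomers")
print(before, after)
```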
 Indexes
An index, as you would expect, is a data structure that the database uses to find records within a
table more quickly. Indexes are built on one or more columns of a table; each index maintains a
list of values within that field that are sorted in ascending or descending order. Rather than sorting
records on the field or fields during query execution, the system can simply access the rows in
order of the index.

Unique and non-unique indexes: When you create an index, you may allow the indexed columns
to contain duplicate values; the index will still list all of the rows with duplicates. You may also
specify that values in the indexed columns must be unique, just as they must be with a primary
key. In fact, when you create a primary key constraint on a table, Oracle and most other systems
will automatically create a unique index on the primary key columns, as well as not allowing null
values in those columns. One good reason for you to create a unique index on non-primary key
fields is to enforce the integrity of a candidate key, which otherwise might end up having
(nonsense) duplicate values in different rows.

Queries versus insertion/update: It might seem as if you should create an index on every column
or group of columns that will ever be used in an ORDER BY clause (for example: lastName,
firstName). However, each index will have to be updated every time that a row is inserted or a
value in that column is updated. Although index structures such as B or B+ trees allow this to
happen very quickly, there still might be circumstances where too many indexes would detract
from overall system performance. This and similar issues are often covered in more advanced
courses.

Syntax: As you would expect by now, the SQL to create an index is:

CREATE INDEX <indexname> ON <tablename> (<column>, <column>...);

To enforce unique values, add the UNIQUE keyword:

CREATE UNIQUE INDEX <indexname> ON <tablename> (<column>, <column>...);

To specify sort order, add the keyword ASC or DESC after each column name, just as you would
do in an ORDER BY clause.

To remove an index, simply enter:

DROP INDEX <indexname>;
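
The index statements above can be exercised with Python's built-in sqlite3 module, as sketched below. The Employee table, its Email column and the index names are assumptions made for illustration. The example shows a non-unique index with a sort order, a unique index that rejects a duplicate value, and DROP INDEX.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (Eid TEXT PRIMARY KEY, Ename TEXT, Email TEXT, City TEXT)"
)

# Non-unique index on two columns, each with its own sort order.
conn.execute("CREATE INDEX idx_employee_city ON Employee (City ASC, Ename DESC)")
# Unique index that enforces a candidate key.
conn.execute("CREATE UNIQUE INDEX idx_employee_email ON Employee (Email)")

conn.execute("INSERT INTO Employee VALUES ('E001', 'ABC', 'abc@example.com', 'Pune')")
try:
    # Same Email as E001: the unique index must reject this row.
    conn.execute(
        "INSERT INTO Employee VALUES ('E002', 'PQR', 'abc@example.com', 'Pune')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

conn.execute("DROP INDEX idx_employee_email")
index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'idx%'"
    )
]
print(duplicate_rejected, index_names)
```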

 What is QUERY?

 A query is an operation that retrieves data from one or more tables or views.
 SELECT statement can be used for retrieving the data from various tables in a database.
Example:

<Employee> Table

Eid Ename Age City Salary


E001 ABC 29 Pune 20000

E002 PQR 30 Pune 30000

E003 LMN 25 Mumbai 5000

E004 XYZ 24 Mumbai 4000

E005 STU 32 Bangalore 25000

1. Selecting all columns (SELECT *)


SELECT * FROM Employee;

2. Displaying particular record with condition (WHERE)


SELECT Ename FROM Employee
WHERE City = 'Pune';

Output:

Ename

ABC

PQR

3. SELECT using DISTINCT


The DISTINCT clause is used to eliminate duplicate values from the result.

Example:
SELECT DISTINCT city FROM Employee;

Output:

City

Bangalore

Mumbai

Pune

4. SELECT using IN
'IN' determines whether a specified value matches any value in a sub-query or a list.

Example:
SELECT Eid, Ename FROM Employee
WHERE Salary IN (5000, 20000);

Output:

Eid Ename
E001 ABC

E003 LMN

5. SELECT using BETWEEN


'BETWEEN' is used to get those values that fall within a range; both end values are included.

Example:
SELECT Eid, Ename, Salary FROM Employee
WHERE Salary BETWEEN 5000 AND 30000;

Output:

Eid Ename Salary

E001 ABC 20000

E002 PQR 30000

E003 LMN 5000

E005 STU 25000

NOT BETWEEN

Example:
SELECT Eid, Ename, Age FROM Employee
WHERE Age NOT BETWEEN 24 AND 25;

Output:

Eid Ename Age

E001 ABC 29

E002 PQR 30

E005 STU 32

6. SELECT using LIKE

 The LIKE clause is used for comparing a value with similar values using the wildcard operators (% and _).
 Suppose you want the employee names that start with 'S'; then use the LIKE clause as follows.
Example:
SELECT Eid, Ename, City, Salary FROM Employee
WHERE Ename LIKE 'S%';

Output:

Eid Ename City Salary

E005 STU Bangalore 25000
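
The queries in this section can be run against the Employee table with Python's built-in sqlite3 module, as sketched below. The rows are the five rows of the table above; the helper q and the ORDER BY clauses (added only to make the row order predictable) are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee "
    "(Eid TEXT, Ename TEXT, Age INTEGER, City TEXT, Salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?, ?)",
    [
        ("E001", "ABC", 29, "Pune", 20000),
        ("E002", "PQR", 30, "Pune", 30000),
        ("E003", "LMN", 25, "Mumbai", 5000),
        ("E004", "XYZ", 24, "Mumbai", 4000),
        ("E005", "STU", 32, "Bangalore", 25000),
    ],
)


def q(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


pune = q("SELECT Ename FROM Employee WHERE City = 'Pune' ORDER BY Eid")
cities = q("SELECT DISTINCT City FROM Employee ORDER BY City")
in_list = q("SELECT Eid FROM Employee WHERE Salary IN (5000, 20000) ORDER BY Eid")
between = q(
    "SELECT Eid FROM Employee WHERE Salary BETWEEN 5000 AND 30000 ORDER BY Eid"
)
not_between = q(
    "SELECT Eid FROM Employee WHERE Age NOT BETWEEN 24 AND 25 ORDER BY Eid"
)
like_s = q("SELECT Eid, Ename FROM Employee WHERE Ename LIKE 'S%'")

for name, rows in [("pune", pune), ("cities", cities), ("in_list", in_list),
                   ("between", between), ("not_between", not_between),
                   ("like_s", like_s)]:
    print(name, rows)
```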


 SUB-QUERY

 A sub-query is an inner query within another query. It returns data that the main query uses
as a condition to retrieve its own data.
 Sub-queries are nested SELECT statements.
 It is a query within a query.
 Sub-queries mostly appear within the WHERE or HAVING clause of another SQL statement.
 A sub-query is defined as another SELECT statement with a FROM clause and optional WHERE,
GROUP BY and HAVING clauses.
 When used in a comparison, it produces a single column of data as its result.
 In a sub-query, the ORDER BY clause is generally not specified. ORDER BY is specified in the
main query.
 A sub-query is always enclosed in parentheses.
 It cannot be a UNION; only a single SELECT statement is allowed.
 In a sub-query, 'SELECT *' cannot be used unless the referenced table has only one column.
 A nested query that does not refer to the main query is evaluated first, and its result is then
used by the main query.
Example:
SELECT Ename, Salary FROM Employee
WHERE Salary IN
(SELECT MAX (Salary) FROM Employee);

Output:

Ename Salary

PQR 30000

Following are the comparison operators where sub-queries are expressed as one SELECT
statement connected to another,

Comparison Operator

Operator Description

= Equal to

< > or != Not equal to

> Greater than

< Less than

>= Greater than Equal to

<= Less than Equal to

Multiple-row Comparison Operator

Operator Description
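IN Equal to any value retrieved in an inner query.

NOT IN Equal to none of the values retrieved in an inner query.

= ANY Equal to at least one value retrieved in an inner query – Logical OR.

> ANY, >= ANY Greater than (or equal to) at least one value retrieved in an inner query, i.e. compared with the smallest value.

< ANY, <= ANY Less than (or equal to) at least one value retrieved in an inner query, i.e. compared with the largest value.

= ALL Equal to all values retrieved in an inner query – Logical AND.

> ALL, >= ALL Greater than (or equal to) every value retrieved in an inner query, i.e. compared with the largest value.

< ALL, <= ALL Less than (or equal to) every value retrieved in an inner query, i.e. compared with the smallest value.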
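
The meaning of ANY and ALL is easy to confuse, so the sketch below models them with Python's any() and all() functions. It is an analogy, not SQL: the list inner stands for the values returned by an inner query, and both it and the compared value are assumptions made for illustration.

```python
# Values returned by a hypothetical inner query, and the value compared with them.
inner = [4000, 5000]
value = 4500

gt_any = any(value > v for v in inner)   # value > ANY (inner query)
gt_all = all(value > v for v in inner)   # value > ALL (inner query)
lt_any = any(value < v for v in inner)   # value < ANY (inner query)
lt_all = all(value < v for v in inner)   # value < ALL (inner query)
eq_any = value in inner                  # value = ANY (inner query), same as IN

print(gt_any, gt_all, lt_any, lt_all, eq_any)

# ANY/ALL comparisons reduce to a comparison with the smallest or largest value.
assert gt_any == (value > min(inner))
assert gt_all == (value > max(inner))
assert lt_any == (value < max(inner))
assert lt_all == (value < min(inner))
```

In other words, "> ANY" means "greater than the minimum" and "> ALL" means "greater than the maximum" of the inner result.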

SQL Aggregate Functions


o An SQL aggregate function performs a calculation on multiple rows of a single
column of a table and returns a single value.
o It is also used to summarize the data.

Following are the Aggregate functions:


1. AVG
2. MAX
3. MIN
4. SUM
5. COUNT()
6. COUNT(*)

1. COUNT FUNCTION
o The COUNT function is used to count the number of rows in a database table. It can work on
both numeric and non-numeric data types.
o COUNT(*) returns the count of all the rows in a specified table. COUNT(*) also counts
duplicate rows and rows that contain NULL values.

Syntax

COUNT(*)
or
COUNT( [ALL|DISTINCT] expression )

2. SUM Function
The SUM function is used to calculate the sum of the values in the selected column. It works on numeric fields only.

Syntax
SUM()
or
SUM( [ALL|DISTINCT] expression )
Example: SUM()
SELECT SUM(COST)
FROM PRODUCT_MAST;
Output:
670
Example: SUM() with WHERE
SELECT SUM(COST)
FROM PRODUCT_MAST
WHERE QTY>3;
Output:
320
Example: SUM() with GROUP BY
SELECT COMPANY, SUM(COST)
FROM PRODUCT_MAST
WHERE QTY>3
GROUP BY COMPANY;
Output:
Com1 150
Com2 170
Example: SUM() with HAVING
SELECT COMPANY, SUM(COST)
FROM PRODUCT_MAST
GROUP BY COMPANY
HAVING SUM(COST)>=170;
Output:
Com1 335
Com3 170

3. AVG function
The AVG function is used to calculate the average value of a numeric column. It returns the average of all
non-NULL values.
Syntax
AVG()
or
AVG( [ALL|DISTINCT] expression )
Example:
SELECT AVG(COST)
FROM PRODUCT_MAST;
Output:
67.00

4. MAX Function
MAX function is used to find the maximum value of a certain column. This function determines the largest value of all
selected values of a column.
Syntax
MAX()
or
MAX( [ALL|DISTINCT] expression )
Example:
SELECT MAX(RATE)
FROM PRODUCT_MAST;
Output:
30

5. MIN Function
The MIN function finds the minimum value of a column. It returns the smallest of all the selected values
of that column.
Syntax
MIN()
or
MIN( [ALL|DISTINCT] expression )
Example:
SELECT MIN(RATE)
FROM PRODUCT_MAST;
Output:
10
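
The aggregate functions described above can be tried out with a short script. The following is a minimal sketch using Python's built-in sqlite3 module. The rows loaded into PRODUCT_MAST and the column types are assumptions made for illustration, since the table itself is not shown in this section.

```python
import sqlite3

# Assumed sample rows: (PRODUCT, COMPANY, QTY, RATE, COST).
rows = [
    ("Item1", "Com1", 2, 10, 20), ("Item2", "Com2", 3, 25, 75),
    ("Item3", "Com1", 2, 30, 60), ("Item4", "Com3", 5, 10, 50),
    ("Item5", "Com2", 2, 20, 40), ("Item6", "Com1", 3, 25, 75),
    ("Item7", "Com1", 5, 30, 150), ("Item8", "Com1", 3, 10, 30),
    ("Item9", "Com2", 2, 25, 50), ("Item10", "Com3", 4, 30, 120),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT_MAST "
    "(PRODUCT TEXT, COMPANY TEXT, QTY INT, RATE INT, COST INT)"
)
conn.executemany("INSERT INTO PRODUCT_MAST VALUES (?, ?, ?, ?, ?)", rows)


def query(sql):
    """Run a query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


total_cost = query("SELECT SUM(COST) FROM PRODUCT_MAST")[0][0]
row_count = query("SELECT COUNT(*) FROM PRODUCT_MAST")[0][0]
distinct_companies = query("SELECT COUNT(DISTINCT COMPANY) FROM PRODUCT_MAST")[0][0]
max_rate, min_rate = query("SELECT MAX(RATE), MIN(RATE) FROM PRODUCT_MAST")[0]
cost_by_company = query(
    "SELECT COMPANY, SUM(COST) FROM PRODUCT_MAST "
    "GROUP BY COMPANY HAVING SUM(COST) >= 170 ORDER BY COMPANY"
)
print(total_cost, row_count, distinct_companies, max_rate, min_rate, cost_by_company)
```

Each aggregate collapses many rows into one value per group, which is why the HAVING filter is applied to the groups and not to individual rows.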

What is Cursor in SQL ?


A cursor is a temporary work area in memory. It is allocated by the database server when a user
performs DML operations on a table. A cursor holds the rows returned by a query so that they can be
processed one at a time. There are 2 types of cursors: implicit cursors and explicit cursors. These
are explained below.
1. Implicit Cursors:
Implicit cursors are also known as the default cursors of SQL Server. These cursors are
allocated by SQL Server when the user performs DML operations.
2. Explicit Cursors:
Explicit cursors are created by users whenever they require them. Explicit cursors are
used for fetching data from a table row by row.
How to create Explicit Cursor:
1. Declare Cursor Object.
Syntax : DECLARE cursor_name CURSOR FOR SELECT * FROM table_name
DECLARE s1 CURSOR FOR SELECT * FROM studDetails
2. Open Cursor Connection.
Syntax : OPEN cursor_name
OPEN s1
3. Fetch Data from cursor.
There are 6 ways to access data from a cursor. They are as follows :
FIRST fetches only the first row from the cursor.
LAST fetches only the last row from the cursor.
NEXT fetches rows in the forward direction from the cursor.
PRIOR fetches rows in the backward direction from the cursor.
ABSOLUTE n fetches exactly the nth row from the cursor.
RELATIVE n fetches the row n positions after the current row (or before it, if n is negative).
Syntax : FETCH NEXT/FIRST/LAST/PRIOR/ABSOLUTE n/RELATIVE n FROM cursor_name
FETCH FIRST FROM s1
FETCH LAST FROM s1
FETCH NEXT FROM s1
FETCH PRIOR FROM s1
FETCH ABSOLUTE 7 FROM s1
FETCH RELATIVE -2 FROM s1
In SQL Server, fetch options other than NEXT are available only when the cursor is declared as a
scrollable cursor, for example DECLARE s1 SCROLL CURSOR FOR SELECT * FROM studDetails.
4. Close cursor connection.
Syntax : CLOSE cursor_name
CLOSE s1
5. Deallocate cursor memory.
Syntax : DEALLOCATE cursor_name
DEALLOCATE s1
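
The declare, open, fetch, close life cycle can also be seen from a client program. The sketch below is only an analogy: it uses the cursor object of Python's built-in sqlite3 module, not a SQL Server cursor, and the columns and rows of studDetails are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE studDetails (id INTEGER, name TEXT)")  # assumed columns
conn.executemany(
    "INSERT INTO studDetails VALUES (?, ?)",
    [(1, "Asha"), (2, "Ben"), (3, "Chen")],  # assumed rows
)

cur = conn.cursor()                                   # roughly: DECLARE
cur.execute("SELECT * FROM studDetails ORDER BY id")  # roughly: OPEN
fetched = []
while True:
    row = cur.fetchone()                              # roughly: FETCH NEXT
    if row is None:                                   # no more rows to fetch
        break
    fetched.append(row)
cur.close()                                           # roughly: CLOSE and DEALLOCATE
conn.close()
print(fetched)
```

This kind of cursor only moves forward, so it corresponds to FETCH NEXT; the other fetch options have no direct counterpart here.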
Unit-4.
Relational Database Design:

 Functional Dependency?
A functional dependency (FD) describes how one attribute is determined by another attribute in a
database management system (DBMS). Functional dependencies help you to maintain the
quality of data in the database. A functional dependency is denoted by an arrow →. The functional
dependency of Y on X is represented by X → Y: each value of X is associated with exactly one
value of Y. Functional dependencies play a vital role in telling a good database design from a bad one.

Example:

Employee number   Employee Name   Salary   City
1                 Dana            50000    San Francisco
2                 Francis         38000    London
3                 Andrew          25000    Tokyo

In this example, if we know the value of Employee number, we can obtain Employee Name, City,
Salary, etc. Therefore, we can say that City, Employee Name, and Salary are functionally
dependent on Employee number.
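
Whether a set of rows is consistent with a functional dependency can be checked mechanically. The following is a minimal sketch; the function name holds, the dictionary keys, and the extra "Paris" row are hypothetical and introduced only for illustration. Note that a dependency is a rule about every possible state of a table, so a check like this can only show that the rows at hand do not violate it.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same lhs value must always map to the same rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


employees = [
    {"emp_no": 1, "name": "Dana", "salary": 50000, "city": "San Francisco"},
    {"emp_no": 2, "name": "Francis", "salary": 38000, "city": "London"},
    {"emp_no": 3, "name": "Andrew", "salary": 25000, "city": "Tokyo"},
]
fd_holds = holds(employees, ["emp_no"], ["name", "salary", "city"])

# A hypothetical extra row that reuses employee number 1 with another city.
broken = employees + [{"emp_no": 1, "name": "Dana", "salary": 50000, "city": "Paris"}]
fd_broken = holds(broken, ["emp_no"], ["city"])
print(fd_holds, fd_broken)
```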

Rules of Functional Dependencies


The three most important rules for functional dependencies are given below:

 Reflexive rule: If X is a set of attributes and Y is a subset of X, then X -> Y holds.


 Augmentation rule: If x -> y holds and c is a set of attributes, then xc -> yc also holds. That
is, adding the same attributes to both sides does not change the basic dependency.
 Transitivity rule: This rule is very similar to the transitive rule in algebra: if x -> y holds
and y -> z holds, then x -> z also holds. x -> y is read as "x functionally determines y".
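
A common way to apply these rules is to compute the closure of an attribute set, that is, every attribute it functionally determines. The sketch below is one simple way to do this, not the only one; the function name closure and the representation of each dependency as a pair of Python sets are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)  # reflexivity: attrs determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything in lhs is already determined, rhs is too (transitivity).
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


fds = [({"x"}, {"y"}), ({"y"}, {"z"})]
closure_x = closure({"x"}, fds)
closure_y = closure({"y"}, fds)
print(sorted(closure_x), sorted(closure_y))
```

Here x reaches z only through y, which is exactly the transitivity rule at work.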

Types of Functional Dependencies


 Multivalued dependency:
 Trivial functional dependency:
 Non-trivial functional dependency:
 Transitive dependency:

Multivalued dependency in DBMS


A multivalued dependency occurs when there are multiple independent multivalued
attributes in a single table. A multivalued dependency is a full constraint between two sets of
attributes in a relation. It requires that certain tuples be present in the relation.
Example:

Car_model   Maf_year   Color
H001        2017       Metallic
H001        2017       Green
H005        2018       Metallic
H005        2018       Blue
H010        2015       Metallic
H033        2012       Gray

In this example, maf_year and color are independent of each other but both depend on car_model.
These two columns are therefore said to be multivalued dependent on car_model.

This dependence can be represented like this (the double arrow denotes a multivalued dependency):

car_model ->> maf_year

car_model ->> color

Trivial Functional dependency:


A functional dependency is called trivial if the attributes on its right-hand side are already
included in the attributes on its left-hand side.

So, X -> Y is a trivial functional dependency if Y is a subset of X.

For example:

Emp_id   Emp_name
AS555    Harry
AS811    George
AS999    Kevin

Consider this table with two columns Emp_id and Emp_name.

{Emp_id, Emp_name} -> Emp_id is a trivial functional dependency as Emp_id is a subset of
{Emp_id, Emp_name}.
Non trivial functional dependency in DBMS
A functional dependency A -> B is non-trivial when it holds and B is not a subset of A.

Company     CEO             Age
Microsoft   Satya Nadella   51
Google      Sundar Pichai   46
Apple       Tim Cook        57

Example:

{Company} -> {CEO} (if we know the Company, we know the CEO's name)

But CEO is not a subset of Company, and hence it is a non-trivial functional dependency.

Transitive dependency:
A transitive dependency is a type of functional dependency that is formed indirectly by two
functional dependencies.

Example:

Company     CEO             Age
Microsoft   Satya Nadella   51
Google      Sundar Pichai   46
Alibaba     Jack Ma         54

{Company} -> {CEO} (if we know the company, we know its CEO's name)

{CEO} -> {Age} (if we know the CEO, we know the age)

Therefore, according to the rule of transitive dependency:

{Company} -> {Age} should hold. That makes sense, because if we know the company name, we
can know its CEO's age.
Properties of Decomposition-
The following two properties must be followed when decomposing a given relation-

1. Lossless decomposition-
Lossless decomposition ensures-
 No information is lost from the original relation during decomposition.
 When the sub relations are joined back, the same relation is obtained that was decomposed.
Every decomposition must always be lossless.

2. Dependency Preservation-
Dependency preservation ensures-
 None of the functional dependencies that hold on the original relation are lost.
 The sub relations still hold or satisfy the functional dependencies of the original relation.
Types of Decomposition-
Decomposition of a relation can be completed in the following two ways-

1. Lossless Join Decomposition-


2. Lossy Join Decomposition-
1. Lossless Join Decomposition-

 Consider there is a relation R which is decomposed into sub relations R1 , R2 , …. , Rn.


 This decomposition is called lossless join decomposition when the join of the sub relations
results in the same relation R that was decomposed.
 For lossless join decomposition, we always have-

R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R

where ⋈ is a natural join operator

2. Lossy Join Decomposition-

 Consider there is a relation R which is decomposed into sub relations R1 , R2 , …. , Rn.


 This decomposition is called lossy join decomposition when the join of the sub relations does not
result in the same relation R that was decomposed.
 The natural join of the sub relations is always found to have some extraneous tuples.
 For lossy join decomposition, we always have-

R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R

where ⋈ is a natural join operator
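
The difference between the two kinds of decomposition can be checked on a small relation by projecting it onto the sub relations and joining them back. The sketch below is only an illustration: the relation R(A, B, C), its two rows, and the helper names freeze, project and natural_join are all assumptions.

```python
def freeze(row):
    """Turn a dict row into a hashable tuple that ignores key order."""
    return tuple(sorted(row.items()))


def project(relation, attrs):
    """Project a relation onto attrs, dropping duplicate tuples."""
    return {freeze({a: dict(t)[a] for a in attrs}) for t in relation}


def natural_join(r1, r2):
    """Join two relations on all attributes they have in common."""
    out = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(freeze({**d1, **d2}))
    return out


R = {freeze(r) for r in [
    {"A": 1, "B": "x", "C": 10},
    {"A": 2, "B": "x", "C": 20},
]}

# Splitting on A (which identifies each row here) gives R back unchanged.
lossless = natural_join(project(R, ["A", "B"]), project(R, ["A", "C"]))
# Splitting on B (shared by both rows) produces extraneous tuples.
lossy = natural_join(project(R, ["A", "B"]), project(R, ["B", "C"]))
print(lossless == R, len(lossy))
```

In the lossy case, every original tuple is still present, but it can no longer be told apart from the extra ones, which is why information is considered lost.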

Normalization up to 5NF
NORMALIZATION is a database design technique that reduces data redundancy and eliminates
undesirable characteristics like Insertion, Update and Deletion Anomalies. Normalization rules
divide larger tables into smaller tables and link them using relationships. The purpose of
Normalization in SQL is to eliminate redundant (repetitive) data and ensure data is stored logically.

The inventor of the relational model Edgar Codd proposed the theory of normalization with the
introduction of the First Normal Form, and he continued to extend theory with Second and Third
Normal Form. Later he joined Raymond F. Boyce to develop the theory of Boyce-Codd Normal
Form.

Database Normal Forms

Here is a list of Normal Forms

 1NF (First Normal Form)


 2NF (Second Normal Form)
 3NF (Third Normal Form)
 BCNF (Boyce-Codd Normal Form)
 4NF (Fourth Normal Form)
 5NF (Fifth Normal Form)

1NF (First Normal Form) Rules


 Each table cell should contain a single value.
 Each record needs to be unique.

1NF Example

Table 1: In 1NF Form


 2NF (Second Normal Form) Rules
 Rule 1- Be in 1NF
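 Rule 2- No non-key attribute depends on only part of the primary key (a table with a single-column primary key satisfies this automatically)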

It is clear that we can't move forward to bring our simple database into 2nd Normal Form
unless we partition the table above.

Table 1

Table 2

We have divided our 1NF table into two tables, viz. Table 1 and Table 2. Table 1 contains member
information. Table 2 contains information on movies rented.

We have introduced a new column called Membership_id, which is the primary key for Table 1.
Records can be uniquely identified in Table 1 using the membership id.

3NF (Third Normal Form) Rules


 Rule 1- Be in 2NF
 Rule 2- Has no transitive functional dependencies

To move our 2NF table into 3NF, we need to divide our table again.

3NF Example

TABLE 1
Table 2

Table 3

We have again divided our tables and created a new table which stores salutations.

There are no transitive functional dependencies, and hence our table is in 3NF.

In Table 3, Salutation ID is the primary key, and in Table 1, Salutation ID is a foreign key that
references the primary key of Table 3.

Now our little example is at a level that cannot be decomposed further to attain higher normal
forms. In fact, it is already in the higher normal forms. Separate efforts to move to the next
levels of normalization are normally needed only in complex databases. However,
we will briefly discuss the next levels of normalization in the following.

BCNF (Boyce-Codd Normal Form)


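Even when a database is in 3rd Normal Form, anomalies can still arise if a table has more
than one candidate key. A table is in BCNF if, for every non-trivial functional dependency
X -> Y that holds on it, X is a superkey.

BCNF is sometimes also referred to as 3.5 Normal Form.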
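
A BCNF check can be sketched with the usual textbook condition that the left-hand side of every non-trivial functional dependency must be a superkey. The example below is an illustration under assumptions: the relation (student, course, teacher) and its two dependencies are hypothetical, as are the helper names closure and bcnf_violations.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the non-trivial dependencies whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != schema
    ]


schema = {"student", "course", "teacher"}
fds = [
    ({"student", "course"}, {"teacher"}),  # a student has one teacher per course
    ({"teacher"}, {"course"}),             # each teacher teaches one course
]
violations = bcnf_violations(schema, fds)
print(violations)
```

This hypothetical table has two overlapping candidate keys, {student, course} and {student, teacher}, which is the situation in which a table can satisfy 3NF and still fail BCNF.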

 4NF (Fourth Normal Form) Rules


If no table instance contains two or more independent, multivalued facts describing
the relevant entity, then the table is in 4th Normal Form.

 5NF (Fifth Normal Form) Rules


A table is in 5th Normal Form only if it is in 4NF and it cannot be decomposed into any number of
smaller tables without loss of data.

Unit-5.
Selected Database Issues:

Database Security In DBMS


Given the vast increase in the volume and speed of threats to databases and other information assets,
research efforts need to be devoted to issues
such as data quality, intellectual property rights, and database survivability.
Let's discuss them one by one.
1. Data quality –
 The database community needs techniques and organizational solutions to
assess and attest the quality of data. These techniques may include simple mechanisms
such as quality stamps posted on websites. We also need techniques that
provide more effective integrity semantics verification, and tools for assessing data
quality based on techniques such as record linkage.
 We also need application-level recovery techniques to automatically repair incorrect data.
 ETL (extract, transform, load) tools, which are widely used for loading data into
data warehouses, are presently grappling with these issues.
2. Intellectual property rights –
As the use of the Internet and intranets increases day by day, the legal and informational aspects of data
are becoming major concerns for many organizations. To address these concerns, watermarking
techniques are used, which help protect content from unauthorized duplication and distribution
by providing provable ownership of the content.
Traditionally, these techniques depend on the availability of a large domain within which the objects can
be altered while retaining their essential properties.
However, research is needed to assess the robustness of such techniques and to
investigate different approaches aimed at preventing intellectual property rights
violations.
3. Database survivability –
Database systems need to operate and continue their functions, even with reduced capabilities,
despite disruptive events such as information warfare attacks.
A DBMS, in addition to making every effort to prevent an attack and to detect one if it
occurs, should be able to do the following:
 Confinement:
Take immediate action to eliminate the attacker’s access to the system and to
isolate or contain the problem to prevent further spread.
 Damage assessment:
Determine the extent of the problem, including failed functions and corrupted data.
 Recovery:
Recover corrupted or lost data and repair or reinstall failed functions to reestablish a normal
level of operation.
 Reconfiguration:
Reconfigure to allow the operation to continue in a degraded mode while recovery proceeds.
 Fault treatment:
To the extent possible, identify the weaknesses exploited in the attack and take steps to
prevent a recurrence.

Transaction Management in DBMS


A transaction is a set of logically related operations. For example, if you are transferring money from
your bank account to your friend’s account, the set of operations would be like this:

Simple Transaction Example


1. Read your account balance
2. Deduct the amount from your balance
3. Write the remaining balance to your account
4. Read your friend’s account balance
5. Add the amount to his account balance
6. Write the new updated balance to his account

This whole set of operations can be called a transaction. Although the example above shows only
read, write and update operations, a transaction can have operations like read,
write, insert, update, and delete.

In DBMS, we write the above 6-step transaction like this:


Let's say your account is A and your friend’s account is B, and you are transferring 10000 from A to B.
The steps of the transaction are:

1. R(A);
2. A = A - 10000;
3. W(A);
4. R(B);
5. B = B + 10000;
6. W(B);
In the above transaction R refers to the Read operation and W refers to the write operation.

Transaction failure in between the operations


Now that we understand what a transaction is, we should understand the problems
associated with it.

The main problem that can happen during a transaction is that it can fail before
finishing all the operations in the set. This can happen due to a power failure, a system crash, etc.
This is a serious problem that can leave the database in an inconsistent state. Assume that the
transaction fails after the third operation (see the example above): the amount would be deducted
from your account, but your friend would not receive it.

To solve this problem, we have the following two operations:

Commit: If all the operations in a transaction are completed successfully, then commit those
changes to the database permanently.

Rollback: If any of the operations fails, then roll back all the changes done by the previous operations.
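
Commit and rollback can be tried out on the money transfer example. The following is a minimal sketch using Python's built-in sqlite3 module; the account table, the starting balances, and the fail_midway switch that simulates a crash are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 50000), ("B", 20000)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst; undo everything if any step fails."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        if fail_midway:  # simulated failure after the money has left src
            raise RuntimeError("simulated failure")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.commit()    # make both updates permanent
        return True
    except Exception:
        conn.rollback()  # undo the partial update
        return False


def balances(conn):
    """Return the current balances as a dict."""
    return dict(conn.execute("SELECT name, balance FROM account"))


failed = transfer(conn, "A", "B", 10000, fail_midway=True)
after_failure = balances(conn)
succeeded = transfer(conn, "A", "B", 10000)
after_success = balances(conn)
print(failed, after_failure, succeeded, after_success)
```

After the simulated failure both balances are unchanged, so the deduction from A never becomes visible without the matching credit to B.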

Query Processing in DBMS


Query processing is the set of activities involved in extracting data from the database. It takes
several steps to fetch the data from the database. The steps involved are:

1. Parsing and translation


2. Optimization
3. Evaluation

The query processing works in the following way:


1. Parsing and Translation
Query processing includes certain activities for data retrieval. The user writes a query in a
high-level database language such as SQL. Before the query can be processed, the system has to
translate it into expressions that can be used at the physical level of the file system. After this, a
variety of query-optimizing transformations and the actual evaluation of the query take
place. SQL (Structured Query Language) is well suited to humans because it is readable and
understandable, but it is not well suited to the internal
representation of the query within the system. Relational algebra is well suited for the internal
representation of a query. The translation step in query processing works like a parser.
When a user executes a query, the parser in the system checks the syntax of the query and
verifies the names of the relations in the database, the tuples, and the required attribute values
in order to generate the internal form of the query. The parser creates a tree of the
query, known as a 'parse tree', and then translates it into relational algebra. In doing so, it
also replaces every use of a view in the query with the view's definition.
The working of query processing is shown in the below-described diagram:

Suppose a user executes a query. As we have learned, there are various methods of
extracting data from the database. Suppose the user wants to fetch the salaries of the employees
whose salary is greater than 10000. The following SQL query does this:

select salary from Employee where salary>10000;

To make the system understand the user query, it needs to be translated into
relational algebra. This query can be written in relational algebra in either of two equivalent forms:

o σ_salary>10000 (π_salary (Employee))


o π_salary (σ_salary>10000 (Employee))

After translating the given query, we can execute each relational algebra operation by using different
algorithms. So, in this way, a query processing begins its working.
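
That the two relational algebra expressions above give the same result can be checked on a small table. The sketch below implements selection (σ) and projection (π) as plain Python functions; the function names and the three Employee rows are assumptions made for illustration.

```python
def select(rows, predicate):
    """σ: keep the rows that satisfy predicate."""
    return [r for r in rows if predicate(r)]


def project(rows, attrs):
    """π: keep only attrs, dropping duplicate tuples as relational algebra does."""
    seen, out = set(), []
    for r in rows:
        t = tuple(r[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out


employee = [  # assumed sample rows
    {"emp_name": "Dana", "salary": 50000},
    {"emp_name": "Lee", "salary": 9000},
    {"emp_name": "Omar", "salary": 12000},
]


def high_salary(row):
    return row["salary"] > 10000


plan_a = select(project(employee, ["salary"]), high_salary)   # σ(π(Employee))
plan_b = project(select(employee, high_salary), ["salary"])   # π(σ(Employee))
print(plan_a, plan_b)
```

Both orders are valid only because the selection condition uses salary, which the projection keeps. The two plans can still differ in cost, which is what the optimizer weighs.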

2. Evaluation
In addition to translating the query into relational algebra, the system has to annotate the translated relational
algebra expression with instructions that specify how to evaluate each operation. After
translating the user query, the system then executes a query evaluation plan.

Query Evaluation Plan


o In order to fully evaluate a query, the system needs to construct a query evaluation plan.
o The annotations in the evaluation plan may specify the algorithm to be used for a specific operation
or the particular index to use.
o Such relational algebra with annotations is referred to as Evaluation Primitives. The evaluation
primitives carry the instructions needed for the evaluation of the operation.
o Thus, a query evaluation plan defines a sequence of primitive operations used for evaluating a query. The
query evaluation plan is also referred to as the query execution plan.
o A query execution engine is responsible for generating the output of the given query. It takes the query
execution plan, executes it, and finally makes the output for the user query.

 Query Optimization
Query: A query is a request for information from a database.
Query Plans: A query plan (or query execution plan) is an ordered set of steps used to access data
in a SQL relational database management system.
Query Optimization: A single query can be executed through different algorithms or re-written in
different forms and structures. Hence, the question of query optimization comes into the picture –
Which of these forms or pathways is the most optimal? The query optimizer attempts to determine
the most efficient way to execute a given query by considering the possible query plans.
Importance: The goal of query optimization is to reduce the system resources required to fulfill a
query, and ultimately provide the user with the correct result set faster.

 First, it provides the user with faster results, which makes the application seem faster to the
user.
 Secondly, it allows the system to service more queries in the same amount of time, because
each request takes less time than unoptimized queries.
 Thirdly, query optimization ultimately reduces the amount of wear on the hardware (e.g. disk
drives), and allows the server to run more efficiently (e.g. lower power consumption, less
memory usage).
There are broadly two ways a query can be optimized:
1. Analyze and transform equivalent relational expressions: Try to minimize the tuple and
column counts of the intermediate and final query processes.
2. Using different algorithms for each operation: These underlying algorithms determine how
tuples are accessed from the data structures they are stored in, indexing, hashing, data
retrieval and hence influence the number of disk and block accesses.
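
Many database systems can show the plan their optimizer chose. The sketch below uses SQLite's EXPLAIN QUERY PLAN statement through Python's built-in sqlite3 module to compare the plan for the same query before and after an index is created. The Employee table, its generated rows and the index name are assumptions made for illustration, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (emp_id INTEGER, emp_name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(i, f"emp{i}", f"dept{i % 10}") for i in range(1000)],  # assumed rows
)


def plan(sql):
    """Return the plan SQLite chose for sql, as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[-1] for row in rows)  # last column is the description


sql = "SELECT emp_name FROM Employee WHERE dept = 'dept3'"
plan_without_index = plan(sql)   # expected to be a full table scan
conn.execute("CREATE INDEX idx_employee_dept ON Employee (dept)")
plan_with_index = plan(sql)      # expected to use the new index
print(plan_without_index)
print(plan_with_index)
```

The query text is identical in both runs; only the access path changes, which is the second kind of optimization listed above.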
Analyze and transform equivalent relational expressions

 Concurrency Control?
Concurrency control is the procedure in DBMS for managing simultaneous operations without
them conflicting with one another. Concurrent access is quite easy if all users are just reading data,
since there is no way they can interfere with one another. However, any practical database
has a mix of READ and WRITE operations, and hence concurrency is a challenge.

Concurrency control is used to address such conflicts, which mostly occur in multi-user
systems. It helps you to make sure that database transactions are performed concurrently without
violating the data integrity of the respective databases.

Therefore, concurrency control is an essential element for the proper functioning of a system
where two or more database transactions that require access to the same data are executed
simultaneously.

Concurrency Control Protocols


Different concurrency control protocols offer different trade-offs between the amount of concurrency
they allow and the amount of overhead that they impose.

 Lock-Based Protocols
 Two-Phase Locking Protocol
 Timestamp-Based Protocols
 Validation-Based Protocols

A. Lock-based Protocols
A lock is a data variable which is associated with a data item. The lock signifies which operations
can be performed on the data item. Locks help synchronize access to the database items by
concurrent transactions.

All lock requests are made to the concurrency-control manager. Transactions proceed only once
the lock request is granted.

Binary Locks: A binary lock on a data item can be in either the locked or the unlocked state.

Shared/exclusive: This type of locking mechanism separates the locks based on their uses. If a
lock is acquired on a data item to perform a read operation, it is called a shared lock; if it is
acquired to perform a write operation, it is called an exclusive lock.

1. Shared Lock (S):

A shared lock is also called a read-only lock. With a shared lock, the data item can be shared
between transactions, because a transaction holding only a shared lock never has permission to
update the data item.

For example, consider a case where two transactions are reading the account balance of a
person. The database will let them read by placing a shared lock. However, if another transaction
wants to update that account's balance, the shared lock prevents it until the reading process is over.

2. Exclusive Lock (X):

With an exclusive lock, a data item can be read as well as written. The lock is exclusive and can't be
held concurrently with any other lock on the same data item. An X-lock is requested using the lock-x
instruction. Transactions may unlock the data item after finishing the 'write' operation.
For example, suppose a transaction needs to update the account balance of a person. The database
allows this transaction by placing an X lock on the item. Therefore, when a second transaction wants to
read or write, the exclusive lock prevents this operation.
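
The compatibility of shared and exclusive locks can be sketched as a tiny lock manager. This is an illustration only: the class and method names are hypothetical, a denied request simply returns False instead of waiting, and lock upgrades and repeated requests by the same transaction are not handled.

```python
class LockManager:
    """Grants shared (S) and exclusive (X) locks on data items."""

    def __init__(self):
        self.locks = {}  # item -> (mode, set of transactions holding the lock)

    def acquire(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":  # only S is compatible with S
            holders.add(txn)
            return True
        return False

    def release(self, txn, item):
        """Release txn's lock on item and free the item when nobody holds it."""
        _mode, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]


lm = LockManager()
read_1 = lm.acquire("T1", "balance", "S")    # first reader
read_2 = lm.acquire("T2", "balance", "S")    # second reader shares the lock
write_3 = lm.acquire("T3", "balance", "X")   # writer is blocked by the readers
lm.release("T1", "balance")
lm.release("T2", "balance")
write_3_retry = lm.acquire("T3", "balance", "X")  # readers are done
read_1_again = lm.acquire("T1", "balance", "S")   # blocked by the writer
print(read_1, read_2, write_3, write_3_retry, read_1_again)
```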

3. Simplistic Lock Protocol

This type of lock-based protocols allows transactions to obtain a lock on every object before
beginning operation. Transactions may unlock the data item after finishing the 'write' operation.

4. Pre-claiming Locking

The pre-claiming lock protocol evaluates the operations of a transaction and creates a list of the
data items it needs before execution begins. If all the locks are granted, the
transaction executes and releases all its locks when all of its operations are over.

B. Two Phase Locking (2PL) Protocol


The Two-Phase Locking protocol is also known as the 2PL protocol. It is also called P2L. In this
type of locking protocol, a transaction must not acquire a new lock after it has released one of its locks.

This locking protocol divides the execution of a transaction into three parts.

 In the first part, when the transaction begins to execute, it requests permission for the
locks it needs.
 In the second part, the transaction obtains all the locks. This is the growing phase. When the
transaction releases its first lock, the third part starts.
 In the third part, the transaction cannot demand any new locks. Instead, it only releases
the acquired locks. This is the shrinking phase.
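
Whether a transaction follows two-phase locking can be checked from the order of its lock and unlock operations: once it has released any lock, it must not acquire another one. The sketch below assumes a hypothetical representation of a transaction as a list of (action, item) pairs.

```python
def follows_two_phase_locking(operations):
    """Return True if no lock is acquired after the first unlock."""
    releasing = False
    for action, _item in operations:
        if action == "unlock":
            releasing = True           # the transaction has started releasing locks
        elif action == "lock" and releasing:
            return False               # new lock requested after a release
    return True


two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
print(follows_two_phase_locking(two_phase), follows_two_phase_locking(not_two_phase))
```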

C. Timestamp-based Protocols
The timestamp-based algorithm uses a timestamp to serialize the execution of
concurrent transactions. This protocol ensures that all conflicting read and write
operations are executed in timestamp order. The protocol uses the system time or a
logical counter as the timestamp.

The older transaction is always given priority in this method. It uses system time to determine the
timestamp of the transaction. It is a commonly used concurrency protocol.

Lock-based protocols manage the order between conflicting transactions at the time
they execute. Timestamp-based protocols manage conflicts as soon as an operation is
created.

Example:

Suppose there are three transactions T1, T2, and T3.


T1 has entered the system at time 0010
T2 has entered the system at time 0020
T3 has entered the system at time 0030
Priority will be given to transaction T1, then transaction T2, and lastly transaction T3.

Advantages:
 Schedules are serializable just like 2PL protocols
 No waiting for the transaction, which eliminates the possibility of deadlocks!

Disadvantages:

Starvation is possible if the same transaction is repeatedly aborted and restarted.
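
The text does not spell out the rules the protocol applies to individual operations. The sketch below assumes the basic timestamp-ordering rules found in textbooks: every data item remembers the largest timestamp that has read it and the largest that has written it, and an operation that arrives too late is rejected so that its transaction can be rolled back and restarted. The class and method names are hypothetical.

```python
class TimestampOrdering:
    """Basic timestamp ordering over data items (textbook rules, assumed here)."""

    def __init__(self):
        self.read_ts = {}   # item -> largest timestamp that has read it
        self.write_ts = {}  # item -> largest timestamp that has written it

    def read(self, ts, item):
        """Return True if the read is allowed, False if it must be rejected."""
        if ts < self.write_ts.get(item, 0):
            return False  # a younger transaction has already written the item
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        """Return True if the write is allowed, False if it must be rejected."""
        if ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0):
            return False  # a younger transaction has already read or written it
        self.write_ts[item] = ts
        return True


# Timestamps 10, 20 and 30 stand for T1, T2 and T3 from the example above.
scheduler = TimestampOrdering()
t2_write = scheduler.write(20, "X")        # allowed: nobody has touched X yet
t1_read = scheduler.read(10, "X")          # rejected: T2 (younger) already wrote X
t3_read = scheduler.read(30, "X")          # allowed
t2_write_again = scheduler.write(20, "X")  # rejected: T3 (younger) already read X
print(t2_write, t1_read, t3_read, t2_write_again)
```

No operation ever waits in this scheme, which is why deadlocks cannot occur, while the rejected transactions illustrate how repeated restarts can lead to starvation.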

Database Recovery Techniques


Database recovery is the process of restoring the data in a database when data is lost
or deleted because of a system crash, hacking, transaction errors, accidental damage,
viruses, catastrophic failure, incorrectly executed commands, etc. Data
loss and failures happen in databases as in other systems, but the data stored in the database should
be available whenever it is required. For fast restoration of data, the database must
have tools that recover the data efficiently. It should preserve atomicity: either the effects of a
successfully completed transaction are reflected permanently in the database, or the
transaction leaves no trace in the database.

For any failure scenario, there are both automatic and manual ways of
backing up and recovering data. Recovery techniques based on deferred update
and immediate update, or on backing up data, can be used to prevent loss of data in the database.

Crash recovery
Crash recovery is the operation through which the database is brought back to a consistent
and usable state. In DBMS, this is performed by rolling back incomplete transactions and
completing committed transactions that were still in memory when the crash took place.

With many transactions being executed every second, a DBMS is a
tremendously complex system. Its robustness and durability depend on its complex design and on
the underlying hardware and system software. It is expected that
the system follows some method or technique to restore lost data when it fails
or crashes in the middle of transactions.

Classification of failure

Failures are generalized into the following classes in order to examine
the source of a problem:

1. Transaction failure: A transaction has to terminate when it reaches a point from
where it can't go any further or when it fails to execute an operation.
The reasons for a transaction failure could be:
o Logical errors: Errors in the code or an internal
error condition because of which a transaction cannot complete.
o System errors: Errors that occur when the database management
system is not able to execute an active transaction, or has to terminate it
because of some system condition.
2. System crash: Problems external to the system may stop it unexpectedly
and cause it to crash. For example, an interruption
in the power supply may cause the underlying
hardware or software to fail.
3. Disk failure: Disk failures include the formation of bad sectors, disk
inaccessibility, a head crash, and other failures that destroy all or part of
the disk storage.

Storage structure

The storage structure can be classified into the following two categories:

 Volatile storage: Volatile storage cannot survive system crashes. These
devices are located close to the CPU. Examples of volatile storage are main
memory and cache memory.
 Non-volatile storage: Non-volatile storage is built to survive system
crashes. These devices are huge in data storage capacity but slower
to access. Examples of non-volatile storage are hard disks, magnetic tapes,
flash memory, and non-volatile (battery-backed) RAM.

Recovery and Atomicity

To recover and also to maintain the atomicity of transactions, there are two types of
techniques:

 Maintaining a log of each transaction and writing it to stable storage before
actually modifying the database.
 Maintaining shadow paging, in which the changes are made in volatile memory
and the actual database is updated afterward.

Log-based Recovery
The log is a sequence of records that keeps track of the operations
performed by a transaction in the database. It is essential that the log records are
written to a stable, failsafe storage medium before the actual changes
are applied to the database.

Log-based recovery works as follows:

The log file is kept on a stable storage medium.

When a transaction enters the system and starts execution, log-based recovery writes a
log record about it.
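
How a log is used after a crash can be sketched as follows. The record layout is an assumption made for illustration: ("start", T), ("write", T, item, old value, new value) and ("commit", T). The recovery routine redoes the writes of committed transactions and undoes the writes of transactions that never committed; the function name recover and the sample values are hypothetical.

```python
def recover(log, database):
    """Redo committed transactions and undo uncommitted ones."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Redo pass: replay the writes of committed transactions in log order.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _kind, _txn, item, _old, new = rec
            database[item] = new
    # Undo pass: revert the writes of uncommitted transactions, newest first.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _kind, _txn, item, old, _new = rec
            database[item] = old
    return database


log = [
    ("start", "T1"),
    ("write", "T1", "A", 50000, 40000),
    ("write", "T1", "B", 20000, 30000),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "A", 40000, 35000),  # T2 never commits before the crash
]
# Assumed disk state at the crash: T2's write to A reached the disk,
# while T1's committed write to B did not.
disk = {"A": 35000, "B": 20000}
recovered = recover(log, dict(disk))
print(recovered)
```

The routine works only because each write was logged before the database was changed, which is the reason the log has to reach stable storage first.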

Recovery with Concurrent Transactions

When multiple transactions are being executed in parallel, their logs are interleaved. At the
time of recovery, it would be difficult for the recovery system to backtrack through all the logs
and then start recovering. Most modern database systems use
the concept of 'checkpoints' to make this situation simpler.

Checkpoint
A checkpoint is a mechanism by which all the previous logs are
removed from the system and stored permanently on a storage disk. A checkpoint marks a
point before which the DBMS was in a consistent state and all the transactions were
committed.
