DBMS Unit1

Uploaded by

sirisha
Copyright
© © All Rights Reserved

DSC–3D Database Management Systems

Unit – I

The Database Environment, Advantages and Disadvantages of DBMSs, The Three-Level ANSI-SPARC
Architecture, Database Languages, Data Models, Functions of a DBMS, Components of
a DBMS. Relational Model: Introduction, Terminology, Integrity Constraints, Views. The
Relational Algebra: Unary Operations, Set Operations, Join Operations, Division Operation,
Aggregation and Grouping Operations.

Unit – II

SQL: Introduction, Data Manipulation–Simple Queries, Sorting Results, Using the SQL
Aggregate Functions, Grouping Results, Sub-queries, ANY and ALL, Multi-table Queries,
EXISTS and NOT EXISTS, Combining Result Tables, Database Updates. SQL: The ISO SQL Data
Types, Integrity Enhancement Feature–Domain Constraints, Entity Integrity, Referential
Integrity, General Constraints, Data Definition–Creating a Database, Creating a Table, Changing
a Table Definition, Removing a Table, Creating an Index, Removing an Index, Views–Creating a
View, Removing a View, View Resolution, Restrictions on Views, View Updatability, WITH
CHECK OPTION, Advantages and Disadvantages of Views, View Materialization,
Transactions, Discretionary Access Control–Granting
Privileges to Other Users, Revoking Privileges from Users.
Advanced SQL: The SQL Programming Language–Declarations, Assignments, Control
Statements, Exceptions, Cursors, Subprograms, Stored Procedures, Functions, and Packages,
Triggers, Recursion.

Unit – III

Entity–Relationship Modeling: Entity Types, Relationship Types, Attributes, Keys, Strong and
Weak Entity Types, Attributes on Relationships, Structural Constraints, Problems with ER
Models–Fan Traps, Chasm Traps. Enhanced Entity–Relationship Modeling:
Specialization/Generalization, Aggregation, Composition. Functional Dependencies: Anomalies,
Partial Functional Dependency, Transitive Functional Dependency, Multi-Valued Dependency,
Join Dependency.

Normalization: The Purpose of Normalization, How Normalization Supports Database Design,
Data Redundancy and Update Anomalies, Functional Dependencies in brief, The Process of
Normalization, 1NF, 2NF, 3NF, BCNF. The Database Design Methodology for Relational
Databases (Appendix–D).

Unit – IV

Transaction Management: Transaction Support–Properties of Transactions, Database
Architecture, Concurrency Control–The Need for Concurrency Control, Serializability and
Recoverability, Locking Methods, Deadlock, Time Stamping Methods, Multi-version Timestamp
Ordering, Optimistic Techniques, Granularity of Data Items, Database Recovery–The Need for
Recovery, Transactions and Recovery, Recovery Facilities, Recovery Techniques, Nested
Transaction Model. Security: Database Security–Threats, Computer-Based Controls–Authorization,
Access Controls, Views, Backup and Recovery, Integrity, Encryption, RAID.

Database Management System

Introduction:

Database: A database is a collection of inter-related data that supports efficient retrieval,
insertion and deletion of data, and organizes the data in the form of tables, views,
schemas, reports, etc. For example, a university database organizes the data about students, faculty,
admin staff, etc., which helps in efficient retrieval, insertion and deletion of data from it.

Database Management System: The software used to manage a database is called a
Database Management System (DBMS). For example, MySQL and Oracle are popular
DBMSs used in different applications. A DBMS allows users to perform the following tasks:

Data Definition: It helps in creation, modification and removal of definitions that define the
organization of data in database.
Data Update: It helps in insertion, modification and deletion of the actual data in the database.
Data Retrieval: It helps in retrieval of data from the database which can be used by applications
for various purposes.
User Administration: It helps in registering and monitoring users, enforcing data security,
monitoring performance, maintaining data integrity, dealing with concurrency control and
recovering information corrupted by unexpected failure.

Paradigm Shift from File System to DBMS

A file system manages data using files on the hard disk. Users are allowed to create, delete, and
update the files according to their requirements. Let us consider the example of a file-based
University Management System. Data about students is available to their respective Departments,
Academics Section, Result Section, Accounts Section, Hostel Office, etc. Some of the data is
common to all sections, such as Roll No, Name, Father Name, Address and Phone number of
students, but some data is available to a particular section only, such as the Hostel allotment
number, which belongs to the hostel office. Let us discuss the issues with this system:
• Redundancy of data: Data is said to be redundant if the same data is copied at many places. If
a student wants to change a phone number, it has to be updated in every section.
Similarly, old records must be deleted from all sections that hold data about that student.
• Inconsistency of data: Data is said to be inconsistent if multiple copies of the same data do
not match each other. If the phone number is different in the Accounts Section and the
Academics Section, the data is inconsistent. Inconsistency may be caused by typing errors
or by not updating all copies of the same data.
• Difficult data access: A user must know the exact location of a file to access its data, so the
process is cumbersome and tedious. Consider how difficult it is to find the hostel allotment
number of one student among 10000 unsorted student records.
• Unauthorized access: A file system may lead to unauthorized access to data. If a student
gets access to the file holding his marks, he can change them in an unauthorized way.
• No concurrent access: The access of the same data by multiple users at the same time is known
as concurrency. A file system offers no built-in control over concurrent access, so in practice
the data can be safely updated by only one user at a time.
• No backup and recovery: A file system does not incorporate any backup and recovery of
data if a file is lost or corrupted.
These are the main reasons for the shift from file systems to DBMS.

Database Management System (DBMS) Applications


• Airlines: reservations, schedules, etc.
• Telecom: calls made, customer details, network usage, etc.
• Universities: registration, results, grades, etc.
• Sales: products, purchases, customers, etc.
• Banking: all transactions, etc.

Important characteristics of DBMS
A modern DBMS has the following characteristics −

1. Represent Some Aspects of real world applications

A database represents some features of real-world applications. Any change in the real world is
reflected in the database. For example, consider a railway reservation system: for each train,
the database maintains records of attendance, the waiting list, arrival and departure times,
the days on which the train runs, etc.

2. Self Describing nature

A database is self-describing: along with the data, it contains a description of the whole
data structure, the constraints and the variables (the metadata). This distinguishes it
from a traditional file management system, in which the data definition is embedded in the
application programs. These definitions are used by the users and the DBMS software when needed.

3. Logical relationship between records and data


A database gives a logical relationship between its records and data. So a user can access various
records depending upon the logical conditions by a single query from the database.

4. Control Data Redundancy

DBMS follows the rules of normalization, which splits a relation when any of its attributes is
having redundancy in values. Normalization is a mathematically rich and scientific process that
reduces data redundancy.

5. Query Language

A DBMS is equipped with a query language, which makes it more efficient to retrieve and
manipulate data. A user can apply as many and as varied filtering options as required to
retrieve a set of data. This was not possible with traditional file-processing systems.

6. Multiuser and Concurrent Access

A DBMS supports a multi-user environment and allows users to access and manipulate data in
parallel. There are restrictions on transactions when users attempt to handle the same data
item, but users are generally unaware of them.

7. Multiple views of database

Basically, a view is a subset of the database. A view is defined for and dedicated to a particular
user of the system. Different users of the system may have different views of the same system. Every
view contains only the data of interest to a user or a group of users. Users do not need to be
aware of how and where the data of interest to them is stored.

8. Security

Features like multiple views offer security to some extent, because users are unable to access data
of other users and departments. A DBMS offers methods to impose constraints while entering data
into the database and retrieving it at a later stage. A DBMS offers many different levels of
security features, which enable multiple users to have different views with different features.
For example, a user in the Sales department cannot see the data that belongs to the Purchase
department. Additionally, it is possible to control how much data of the Sales department is
displayed to the user. Because all access goes through the DBMS rather than directly to plain
files on disk, it is harder for miscreants to read or tamper with the data.

9. ACID Properties

DBMS follows the concepts of Atomicity, Consistency, Isolation, and Durability (normally
shortened as ACID) in order to ensure accuracy, completeness, and data integrity. These
concepts are applied on transactions, which manipulate data in a database.

• Atomicity

The atomicity property identifies that the transaction is atomic. An atomic transaction is either
fully completed, or is not begun at all. Any updates that a transaction might effect on a system
are completed in their entirety. If for any reason an error occurs and the transaction is unable to
complete all of its steps, then the system is returned to the state it was in before the transaction
was started. An example of an atomic transaction is an account transfer transaction. The money
is removed from account A and then placed into account B. If the system fails after removing the
money from account A, then the transaction processing system will put the money back into
account A, thus returning the system to its original state. This is known as a rollback.

• Isolation
When a transaction runs in isolation, it appears to be the only action that the system is carrying
out at one time. If there are two transactions that are both performing the same function and are
running at the same time, transaction isolation will ensure that each transaction thinks it has
exclusive use of the system. This is important in that as the transaction is being executed, the
state of the system may not be consistent. The transaction ensures that the system remains
consistent after the transaction ends, but during an individual transaction, this may not be the
case. If a transaction was not running in isolation, it could access data from the system that may
not be consistent. By providing transaction isolation, this is prevented from happening.

• Consistency
A transaction enforces consistency in the system state by ensuring that at the end of any
transaction the system is in a valid state. If the transaction completes successfully, then all
changes to the system will have been properly made, and the system will be in a valid state. If
any error occurs in a transaction, then any changes already made will be automatically rolled
back. This will return the system to its state before the transaction was started. Since the system
was in a consistent state when the transaction was started, it will once again be in a consistent
state. In the account transfer example, the system is consistent if the total of all accounts is constant.
If an error occurs and the money is removed from account A and not added to account B, then
the total in all accounts would have changed. The system would no longer be consistent. By
rolling back the removal from account A, the total will again be what it should be, and the
system will be back in a consistent state.

• Durability
A transaction is durable in that once it has been successfully completed, all of the changes it
made to the system are permanent. There are safeguards that will prevent the loss of information,
even in the case of system failure. By logging the steps that the transaction performs, the state of
the system can be recreated even if the hardware itself has failed. The concept of durability
allows the developer to know that a completed transaction is a permanent part of the system,
regardless of what happens to the system later on.
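
The account transfer example can be sketched in a few lines of Python with the standard sqlite3 module. This is only an illustration under stated assumptions: the table name account, its columns, the function name transfer and the "no negative balance" rule are all made up for the example and are not part of these notes.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one atomic transaction."""
    try:
        # "with conn" commits if the block succeeds and rolls back on an exception.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            # Assumed business rule: a negative balance aborts the transfer.
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

ok = transfer(conn, "A", "B", 30)       # succeeds and is committed
failed = transfer(conn, "A", "B", 500)  # fails midway and is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

The failed transfer leaves no trace: the money already removed from account A is put back by the rollback, so the total across both accounts stays the same.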

Components of DBMS

• Users: Users may be of any kind, such as database administrators, system developers or database
end users.
• Database application: A database application may be departmental, personal, organizational and/
or internal.
• DBMS: Software that allows users to create, access and manipulate the database.
• Database: A collection of logically related data treated as a single unit.

Database Environment

A database environment is a system of components that regulate the collection, management and
use of data. It includes software, hardware, people, procedures and the data itself.

The hardware in a database environment includes computers and computer peripherals. The
software is everything from the operating system to the application programs, and it includes
database management software such as Microsoft Access or SQL Server.

The people in a database environment include everyone who administrates and uses the system.
The procedures are the rules and instructions given to both the people and the software and the
data is the collection of facts and information located within the database environment.

A database environment is typically set up for the following reasons:

• To develop software applications in less time.
• For data independence and efficient use of data.
• For uniform data administration.
• For data integrity and security.
• For concurrent access to data, and data recovery from crashes.
• To use a user-friendly declarative query language.

Roles in Database Environment


People who participate in the DBMS environment are:
Data administrator (DA): responsible for the management of data resources, e.g. planning,
policies, standards, procedures, and conceptual/logical database design.
Database administrator (DBA): responsible for the physical realization of the database, e.g. physical
database design, implementation, performance, security, integrity, maintenance, and users/groups.
Database designer:
a. Logical designer: identifies the data (entities and attributes), the relationships between the data,
and the constraints on the data.
b. Physical designer: maps the logical data model into a set of tables and integrity constraints,
selects specific storage structures and access methods, and designs security measures.
Application programmer: writes application programs using a third- or fourth-generation
programming language.
End users:
a. Naive users: unaware of the DBMS.
b. Sophisticated users: familiar with the structure and facilities of the DBMS.

Advantages of Database Management System (DBMS)

1. Improved data sharing


An advantage of the database management approach is that the DBMS helps to create an
environment in which end users have better access to more and better-managed data.
Such access makes it possible for end users to respond quickly to changes in their environment.

2. Improved data security


The more users access the data, the greater the risks of data security breaches. Corporations
invest considerable amounts of time, effort, and money to ensure that corporate data are used
properly. A DBMS provides a framework for better enforcement of data privacy and security
policies.

3. Better data integration


Wider access to well-managed data promotes an integrated view of the organization’s operations
and a clearer view of the big picture. It becomes much easier to see how actions in one segment
of the company affect other segments.
4. Minimized data inconsistency
Data inconsistency exists when different versions of the same data appear in different places. For
example, data inconsistency exists when a company’s sales department stores a sales
representative’s name as “Bill Brown” and the company’s personnel department stores that same
person’s name as “William G. Brown,” or when the company’s regional sales office shows the
price of a product as $45.95 and its national sales office shows the same product’s price as
$43.95. The probability of data inconsistency is greatly reduced in a properly designed database.
5. Improved data access

The DBMS makes it possible to produce quick answers to ad hoc queries. From a database
perspective, a query is a specific request issued to the DBMS for data manipulation—for
example, to read or update the data. Simply put, a query is a question, and an ad hoc query is a
spur-of-the-moment question. The DBMS sends back an answer (called the query result set) to
the application. For example, end users, when dealing with large amounts of sales data, might
want quick answers to questions (ad hoc queries) such as:
- What was the dollar volume of sales by product during the past six months?
- What is the sales bonus figure for each of our salespeople during the past three months?
- How many of our customers have credit balances of 3,000 or more?
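
As a rough illustration, the last question can be answered with a single ad hoc query. The sketch below uses Python's built-in sqlite3 module; the customer table, its columns and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, credit_balance REAL)"
)
conn.executemany(
    "INSERT INTO customer (name, credit_balance) VALUES (?, ?)",
    [("Asha", 4500.0), ("Ravi", 1200.0), ("Meena", 3000.0)],
)

# Ad hoc query: how many customers have a credit balance of 3,000 or more?
(count,) = conn.execute(
    "SELECT COUNT(*) FROM customer WHERE credit_balance >= ?", (3000,)
).fetchone()
```

The single value returned here is the query result set that the DBMS sends back to the application.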

6. Improved decision making

Better-managed data and improved data access make it possible to generate better-quality
information, on which better decisions are based. The quality of the information generated
depends on the quality of the underlying data. Data quality is a comprehensive approach to
promoting the accuracy, validity, and timeliness of the data. While the DBMS does not guarantee
data quality, it provides a framework to facilitate data quality initiatives.

7. Increased end-user productivity


The availability of data, combined with the tools that transform data into usable information,
empowers end users to make quick, informed decisions that can make the difference between
success and failure in the global economy.
So far we have seen the benefits of database management systems. They also have certain
limitations or disadvantages.

Disadvantages of Database Management System (DBMS):


Although the database system yields considerable advantages over previous data management
approaches, database systems do carry significant disadvantages. For example:
1. Increased costs

Database systems require sophisticated hardware and
software and highly skilled personnel. The cost of maintaining the hardware, software, and
personnel required to operate and manage a database system can be substantial. Training,
licensing, and regulation compliance costs are often overlooked when database systems are
implemented.
2. Management complexity
Database systems interface with many different technologies and have a significant impact on a
company’s resources and culture. The changes introduced by the adoption of a database system
must be properly managed to ensure that they help advance the company’s objectives. Given the
fact that database systems hold crucial company data that are accessed from multiple sources,
security issues must be assessed constantly.

3. Maintaining currency
To maximize the efficiency of the database system, you must keep your system current.
Therefore, you must perform frequent updates and apply the latest patches and security measures
to all components.
Because database technology advances rapidly, personnel training costs tend to be significant.

4. Vendor dependence
Given the heavy investment in technology and personnel training,
companies might be reluctant to change database vendors.
As a consequence, vendors are less likely to offer pricing point advantages to existing customers,
and those customers might be limited in their choice of database system components.

5. Frequent upgrade/replacement cycles

DBMS vendors frequently upgrade their products by adding new functionality. Such new
features often come bundled in new upgrade versions of the software. Some of these versions
require hardware upgrades. Not only do the upgrades themselves cost money, but it also costs
money to train database users and administrators to properly use and manage the new features.

The ANSI-SPARC Architecture


The ANSI-SPARC Architecture, where ANSI-SPARC stands for American National Standards
Institute, Standards Planning And Requirements Committee, is an abstract design standard for a
Database Management System (DBMS), first proposed in 1975.

All users should be able to access the same data but have a different, customized view of the data.
These views are independent: changes to one view should not affect the others.

Users should not need to know physical database storage details. Database storage structures
can be changed without affecting the users' views.

The internal (storage) structure of the database should be unaffected by changes to the physical
aspects of storage. For example, moving the database to another hard disk should not affect the
structure of the database.

Levels are

External level – User’s view of the database.

Conceptual level – Describes what data is stored in the database and the relationships among the
data.

Internal level – Describes how the data is stored in the database.

Internal Level
The physical representation of the database on the computer, designed to achieve optimal runtime
performance and storage space utilization.

It covers the data structures and file organizations used to store data on the storage device,
and the storage space allocation for data and indexes.

Conceptual Level
This level contains the logical structure of the entire database. Provides a complete view of the
data requirements of the organization that is independent of any storage considerations.

The conceptual level represents:


· All entities, their attributes and their relationships
· The constraints on the data
· Security and integrity information

External Level
Describes the part of the database that is relevant to the user.

The external view includes only those entities, attributes and relationships in the ‘real world’ that the
user is interested in.

The mappings between these three levels are what provide data independence at each level.

Logical Data Independence


Refers to immunity of external schemas to changes in conceptual schema.

Conceptual schema changes (e.g. addition/removal of entities) should not require changes to
external schema or rewrites of application programs.

Physical Data Independence


Refers to immunity of conceptual schema to changes in the internal schema.

Internal schema changes (e.g. using different file organizations, storage structures/devices)
should not require change to conceptual or external schemas.
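
A small sketch of both kinds of independence, using Python's built-in sqlite3 module. It assumes a hypothetical student table and a view called student_directory playing the role of an external schema; these names are illustrative only. A new attribute is added to the table (a conceptual change) and an index is created (an internal change), and the query against the view gives the same answer before and after.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Anil', '555-0101')")

# External schema: a view exposing only what one user group needs.
conn.execute("CREATE VIEW student_directory AS SELECT roll_no, name FROM student")

def directory(conn):
    """Application query written against the external view only."""
    return conn.execute(
        "SELECT roll_no, name FROM student_directory ORDER BY roll_no"
    ).fetchall()

before = directory(conn)

# Conceptual schema change: a new attribute is added to the entity.
conn.execute("ALTER TABLE student ADD COLUMN hostel_no TEXT")
# Internal schema change: a new access structure is added.
conn.execute("CREATE INDEX idx_student_name ON student (name)")

after = directory(conn)
```

The function directory is never edited, which is the practical meaning of "should not require rewrites of application programs".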

Database Language
o DBMS has appropriate languages and interfaces to express database queries and updates.
o Database languages can be used to read, store and update the data in the database.

Types of Database Language

1. Data Definition Language

o DDL stands for Data Definition Language. It is used to define the database structure or
pattern.
o It is used to create schemas, tables, indexes, constraints, etc. in the database.
o Using the DDL statements, you can create the skeleton of the database.
o DDL statements record metadata in the data dictionary, such as the number of
tables and schemas, their names, indexes, the columns in each table, constraints, etc.
Here are some tasks that come under DDL:
o Create: It is used to create objects in the database.
o Alter: It is used to alter the structure of the database.
o Drop: It is used to delete objects from the database.
o Truncate: It is used to remove all records from a table.
o Rename: It is used to rename an object.
o Comment: It is used to comment on the data dictionary.
These commands are used to update the database schema; that is why they come under Data
Definition Language.
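
A minimal sketch of the Create, Alter, Rename and Drop tasks, run through Python's built-in sqlite3 module. The table and index names are invented for the example. SQLite is used only because it ships with Python; it has no TRUNCATE or COMMENT statement, so those two tasks are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create: define a new table.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# Alter: change the structure of an existing table.
conn.execute("ALTER TABLE student ADD COLUMN phone TEXT")
# Rename: give the table a new name.
conn.execute("ALTER TABLE student RENAME TO learner")
# Create: an index is also a schema object.
conn.execute("CREATE INDEX idx_learner_name ON learner (name)")

def schema_objects(conn):
    """List table and index names from SQLite's catalog table."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type IN ('table', 'index')")
    return sorted(name for (name,) in rows)

created = schema_objects(conn)

# Drop: remove the table (its index is removed with it).
conn.execute("DROP TABLE learner")
remaining = schema_objects(conn)
```

None of these statements touches a row of data; they only change the definitions that describe the data.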

2. Data Manipulation Language


DML stands for Data Manipulation Language. It is used for accessing and manipulating data in
a database. It handles user requests.
Here are some tasks that come under DML:
o Select: It is used to retrieve data from a database.
o Insert: It is used to insert data into a table.
o Update: It is used to update existing data within a table.
o Delete: It is used to delete records from a table (all records, or only those that match a condition).
o Merge: It performs an UPSERT operation, i.e., an insert or an update.
o Call: It is used to call a PL/SQL or a Java subprogram.
o Explain Plan: It shows the access path (execution plan) chosen for a statement.
o Lock Table: It controls concurrency.
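
The four core DML tasks (Insert, Update, Delete, Select) can be sketched as follows with Python's built-in sqlite3 module. The student table and its rows are hypothetical. Merge, Call, Explain Plan and Lock Table are product-specific and are left out of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, phone TEXT)")

# Insert: add new rows.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Anil", "555-0101"), (2, "Bina", "555-0102"), (3, "Chitra", "555-0103")],
)
# Update: change existing data in the rows that match the condition.
conn.execute("UPDATE student SET phone = ? WHERE roll_no = ?", ("555-0199", 2))
# Delete: remove only the rows that match the condition.
conn.execute("DELETE FROM student WHERE roll_no = ?", (3,))
# Select: retrieve data.
rows = conn.execute(
    "SELECT roll_no, name, phone FROM student ORDER BY roll_no"
).fetchall()
```

Without the WHERE clause, the Update and Delete statements would apply to every row of the table.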

3. Data Control Language


o DCL stands for Data Control Language. It is used to control access to the data stored in the
database.
o Whether DCL execution is transactional depends on the DBMS. In some systems a DCL statement
can be rolled back, but in an Oracle database DCL statements are committed implicitly and
cannot be rolled back.
Here are some tasks that come under DCL:
o Grant: It is used to give user access privileges to a database.
o Revoke: It is used to take back permissions from the user.
The privileges that can be granted or revoked include:
CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.

4. Transaction Control Language


TCL is used to manage the changes made by DML statements. It allows statements to be grouped
into logical transactions.
Here are some tasks that come under TCL:
o Commit: It is used to save the transaction on the database.
o Rollback: It is used to restore the database to its state as of the last Commit.

DBMS - Data Models


Data models define how the logical structure of a database is modeled. Data models are
fundamental entities for introducing abstraction in a DBMS. They define how data items are
connected to each other and how they are processed and stored inside the system.

The very first data models were flat data models, in which all the data was kept in the
same plane. These early data models were not very scientific; hence they were prone to introducing
a lot of duplication and update anomalies.

Types of Data Model

Following are the types of Data Model,

1. Entity-Relationship (ER) Model

2. Relational Model

3. Hierarchical Model

4. Network Database Model

5. Object Model

1. Entity-Relationship Model
The Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships
among them. When a real-world scenario is formulated as a database model, the ER Model
creates entity sets, relationship sets, general attributes and constraints.

The ER Model is best used for the conceptual design of a database.

The ER Model is based on −

• Entities and their attributes.
• Relationships among entities.

These concepts are explained below.

• Entity − An entity in an ER Model is a real-world entity having properties
called attributes. Every attribute is defined by its set of values, called its domain. For
example, in a school database, a student is considered as an entity. A student has various
attributes like name, age, class, etc.
• Relationship − The logical association among entities is called a relationship.
Relationships are mapped with entities in various ways. Mapping cardinalities define the
number of associations between two entities.

Mapping cardinalities −

o one to one
o one to many
o many to one
o many to many
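
The ER Model itself is a conceptual tool, but the cardinalities become concrete once the design is mapped to tables. The sketch below shows one common mapping, using Python's built-in sqlite3 module: a foreign key for one-to-many and a junction table for many-to-many. All table names, column names and rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);

-- one-to-many: each student row refers to exactly one department,
-- while a department may be referred to by many students
CREATE TABLE student (
    roll_no INTEGER PRIMARY KEY,
    name    TEXT,
    dept_id INTEGER REFERENCES department(dept_id)
);

CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT);

-- many-to-many: a junction table pairs students with courses
CREATE TABLE enrolment (
    roll_no   INTEGER REFERENCES student(roll_no),
    course_id INTEGER REFERENCES course(course_id),
    PRIMARY KEY (roll_no, course_id)
);

INSERT INTO department VALUES (1, 'Computer Science');
INSERT INTO student VALUES (1, 'Anil', 1), (2, 'Bina', 1);
INSERT INTO course VALUES (10, 'DBMS'), (20, 'Networks');
INSERT INTO enrolment VALUES (1, 10), (1, 20), (2, 10);
""")

def count(sql):
    """Run a COUNT(*) query and return the single number."""
    return conn.execute(sql).fetchone()[0]

students_in_dept = count("SELECT COUNT(*) FROM student WHERE dept_id = 1")
courses_of_anil = count("SELECT COUNT(*) FROM enrolment WHERE roll_no = 1")
students_in_dbms = count("SELECT COUNT(*) FROM enrolment WHERE course_id = 10")
```

A one-to-one relationship can be mapped the same way as one-to-many, with a uniqueness constraint added on the foreign key column.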

2. Relational Model
The most popular data model in DBMS is the Relational Model. It is a more scientific model
than the others. This model is based on first-order predicate logic and defines a table as an n-ary
relation.

The main highlights of this model are −

• Data is stored in tables called relations.
• Relations can be normalized.
• In normalized relations, values saved are atomic values.
• Each row in a relation is unique.
• Each column in a relation contains values from the same domain.
• It is easy to maintain and modify the existing

Relational data model is the primary data model, which is used widely around the world for data
storage and processing. This model is simple and it has all the properties and capabilities
required to process data with storage efficiency.

Concepts
Tables − In the relational data model, relations are saved in the format of tables. This format stores
the relation among entities. A table has rows and columns, where rows represent records and
columns represent the attributes.

Tuple − A single row of a table, which contains a single record for that relation, is called a tuple.

Relation instance − A finite set of tuples in the relational database system represents a relation
instance. Relation instances do not have duplicate tuples.

Relation schema − A relation schema describes the relation name (table name) and the names of
its attributes.

Relation key − Each row has one or more attributes, known as the relation key, which can identify
the row in the relation (table) uniquely.

Attribute domain − Every attribute has some pre-defined value scope, known as the attribute
domain.
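
The relation key and attribute domain concepts can be tried out with Python's built-in sqlite3 module. This is only a sketch: the student table, the helper try_insert and the age range of 5 to 100 are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student (
    roll_no INTEGER PRIMARY KEY,                   -- relation key
    name    TEXT NOT NULL,
    age     INTEGER CHECK (age BETWEEN 5 AND 100)  -- attribute domain (assumed range)
)""")
conn.execute("INSERT INTO student VALUES (1, 'Anil', 17)")

def try_insert(row):
    """Insert one tuple and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

duplicate_key = try_insert((1, "Bina", 18))     # roll_no 1 already exists
out_of_domain = try_insert((2, "Chitra", 150))  # age is outside the domain
valid_tuple = try_insert((3, "Dev", 16))
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

The DBMS itself refuses the duplicate key and the out-of-domain value, so the relation instance never contains two tuples with the same key.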

3. Hierarchical Model

• The best-known implementation of the hierarchical model is the Information Management
System (IMS), developed by IBM and North American Rockwell.
• It represents the data in a hierarchical tree structure.
• This model is one of the earliest DBMS models.
• In this model, the data is organized hierarchically.
• It uses pointers to navigate between the stored data.

4. Network Database Model

• The Network Database Model is similar to the Hierarchical Model; the only difference is that it
allows a record to have more than one parent.
• A record is therefore not restricted to a single parent-to-child association, as it is in the
hierarchical model.
• It replaces the hierarchical tree with a graph.
• It represents the data as record types and one-to-many relationships.
• This model is easy to design and understand.

5. Object Model

• Object model stores the data in the form of objects, classes and inheritance.
• This model handles more complex applications, such as Geographic Information System (GIS),
scientific experiments, engineering design and manufacturing.
• It is used in File Management System.
• It represents real world objects, attributes and behaviors.
• It provides a clear modular structure.

Functions of Database Management System (DBMS)


A DBMS performs several important functions that guarantee the integrity and consistency of the
data in the database. The most important functions of a Database Management System are:

Data Dictionary Management,
Data Storage Management,
Data Transformation and Presentation,
Security Management,
Multi-user Access Control,
Backup and Recovery Management,
Data Integrity Management,
Database Access Languages and Application Programming Interfaces, and
Database Communication Interfaces.

1. Data Dictionary Management

Data Dictionary Management is one of the most important functions of a database management
system.
The DBMS stores definitions of the data elements and their relationships (metadata) in a data
dictionary.
All programs that access the data in the database work through the DBMS.
The DBMS uses the data dictionary to look up the required data component structures and
relationships, which relieves you from coding such complex relationships in each program.
Additionally, any changes made in a database structure are automatically recorded in the data
dictionary, thereby freeing you from having to modify all of the programs that access the
changed structure.
In other words, the DBMS provides data abstraction, and it removes structural and data
dependence from the system.

2. Data Storage Management


One function of the DBMS is to create and manage the complex structures required for data storage, thus relieving you of the difficult task of defining and programming the physical data characteristics.

A modern DBMS system provides storage not only for the data, but also for related data entry
forms or screen definitions, report definitions, data validation rules, procedural code, structures
to handle video and picture formats, and so on.
Data storage management is also important for database performance tuning. Performance tuning
relates to the activities that make the database perform more efficiently in terms of storage and
access speed. So, the data storage management is another important function of Database
Management System.

3. Data transformation and presentation


The DBMS transforms entered data into the required data structures. The DBMS relieves you of
the chore of making a distinction between the logical data format and the physical data format.
That is, the DBMS formats the physically retrieved data to make it conform to the user’s logical
expectations.
For example, imagine an enterprise database used by a multinational company. An end user in
England would expect to enter data such as July 11, 2009, as “11/07/2009.” In contrast, the same
date would be entered in the United States as “07/11/2009.” Regardless of the data presentation
format, the DBMS system must manage the date in the proper format for each country.
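
The date example above can be sketched in a few lines of Python. This is only an illustration of the idea (one stored value, several presentation formats), not how any particular DBMS implements it; the names DISPLAY_FORMATS, present and parse are hypothetical.

```python
from datetime import date, datetime

# Hypothetical per-country display formats (illustrative names only).
DISPLAY_FORMATS = {
    "UK": "%d/%m/%Y",  # day/month/year
    "US": "%m/%d/%Y",  # month/day/year
}

def present(stored: date, country: str) -> str:
    """Format the single stored date according to the user's local convention."""
    return stored.strftime(DISPLAY_FORMATS[country])

def parse(entered: str, country: str) -> date:
    """Convert locally formatted input back into the single stored value."""
    return datetime.strptime(entered, DISPLAY_FORMATS[country]).date()

stored = date(2009, 7, 11)       # July 11, 2009, stored once
uk_text = present(stored, "UK")  # what a user in England sees
us_text = present(stored, "US")  # what a user in the United States sees
```

Both users enter and see the date in their own format, while the database keeps one unambiguous value.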
4. Security Management
Security management is another important function of a database management system (DBMS).

The DBMS creates a security system that enforces user security and data privacy. Security rules
determine which users can access the database, which data items each user can access, and which
data operations (read, add, delete, or modify) the user can perform. This is especially important
in multiuser database systems.

5. Multi User Access Control


Multiuser access control is another important DBMS Function. To provide data integrity and
data consistency, the DBMS uses sophisticated algorithms to ensure that multiple users can
access the database concurrently without compromising the integrity of the database.

6. Backup and Recovery Management


The DBMS provides backup and data recovery to ensure data safety and integrity.
Current DBMS systems provide special utilities that allow the DBA to perform routine and
special backup and restore procedures. Recovery management deals with the recovery of the
database after a failure, such as a bad sector in the disk or a power failure. Such capability is
critical to preserving the database’s integrity.

7. Data Integrity Management


Data integrity management is another important DBMS function.
The DBMS promotes and enforces integrity rules, thus minimizing data redundancy and
maximizing data consistency.
The data relationships stored in the data dictionary are used to enforce data integrity. Ensuring
data integrity is an especially important DBMS function in transaction-oriented database systems.
8. Database Access Languages and Application Programming Interfaces

The DBMS provides data access through a query language. A query language is a non-procedural language, that is, one that lets the user specify what must be done without having to specify how it is to be done.
Structured Query Language (SQL) is the de facto query language and data access standard
supported by the majority of DBMS vendors.

9. Database Communication Interfaces

Current-generation DBMSs accept end-user requests via multiple, different network environments. For example, the DBMS might provide access to the database via the Internet through the use of Web browsers such as Mozilla Firefox or Microsoft Internet Explorer. In this environment, communications can be accomplished in several ways:

- End users can generate answers to queries by filling in screen forms through their preferred
Web browser.

- The DBMS can automatically publish predefined reports on a Website.

- The DBMS can connect to third-party systems to distribute information via e-mail or other
productivity applications.

Constraints in DBMS

Constraints enforce limits to the data or type of data that can be inserted/updated/deleted from a
table. The whole purpose of constraints is to maintain the data integrity during an
update/delete/insert into a table. In this tutorial we will learn several types of constraints that can
be created in RDBMS.

Types of constraints

NOT NULL

UNIQUE

DEFAULT

CHECK

Key Constraints – PRIMARY KEY, FOREIGN KEY

Domain constraints

Mapping constraints

NOT NULL:

The NOT NULL constraint makes sure that a column does not hold a NULL value. When we do not provide a value for a particular column while inserting a record into a table, the column takes the NULL value by default. By specifying the NOT NULL constraint, we can be sure that the column cannot contain NULL values.

Example:

CREATE TABLE STUDENT (
    ROLL_NO INT NOT NULL,
    STU_NAME VARCHAR(35) NOT NULL,
    STU_AGE INT NOT NULL,
    STU_ADDRESS VARCHAR(235),
    PRIMARY KEY (ROLL_NO)
);

UNIQUE:

UNIQUE Constraint enforces a column or set of columns to have unique values. If a column has
a unique constraint, it means that particular column cannot have duplicate values in a table.

CREATE TABLE STUDENT (
    ROLL_NO INT NOT NULL,
    STU_NAME VARCHAR(35) NOT NULL UNIQUE,
    STU_AGE INT NOT NULL,
    STU_ADDRESS VARCHAR(35) UNIQUE,
    PRIMARY KEY (ROLL_NO)
);

DEFAULT:

The DEFAULT constraint provides a default value to a column when there is no value provided
while inserting a record into a table.

CREATE TABLE STUDENT (
    ROLL_NO INT NOT NULL,
    STU_NAME VARCHAR(35) NOT NULL,
    STU_AGE INT NOT NULL,
    EXAM_FEE INT DEFAULT 10000,
    STU_ADDRESS VARCHAR(35),
    PRIMARY KEY (ROLL_NO)
);

CHECK:

This constraint specifies a condition, typically a range of values, that every value in a particular column must satisfy. When this constraint is set on a column, the DBMS rejects any value that falls outside the specified range.

CREATE TABLE STUDENT (
    ROLL_NO INT NOT NULL CHECK (ROLL_NO > 1000),
    STU_NAME VARCHAR(35) NOT NULL,
    STU_AGE INT NOT NULL,
    EXAM_FEE INT DEFAULT 10000,
    STU_ADDRESS VARCHAR(35),
    PRIMARY KEY (ROLL_NO)
);
In the above example we have set the check constraint on ROLL_NO column of STUDENT
table. Now, the ROLL_NO field must have the value greater than 1000.
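
The column constraints described so far can be tried out with a short script. The sketch below uses Python's built-in sqlite3 module and an in-memory database purely for convenience; the student names are made-up sample data and try_insert is a hypothetical helper. Other DBMSs report violations with different error types and messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE STUDENT (
        ROLL_NO     INT NOT NULL CHECK (ROLL_NO > 1000),
        STU_NAME    VARCHAR(35) NOT NULL UNIQUE,
        STU_AGE     INT NOT NULL,
        EXAM_FEE    INT DEFAULT 10000,
        STU_ADDRESS VARCHAR(35),
        PRIMARY KEY (ROLL_NO)
    )
""")

def try_insert(sql: str) -> bool:
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Valid row: EXAM_FEE is omitted, so the DEFAULT value is used.
accepted = try_insert(
    "INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (1001, 'Asha', 20)")
default_fee = conn.execute(
    "SELECT EXAM_FEE FROM STUDENT WHERE ROLL_NO = 1001").fetchone()[0]

# Each of the following rows breaks exactly one constraint.
check_ok = try_insert(   # CHECK: ROLL_NO must be greater than 1000
    "INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (999, 'Ravi', 21)")
not_null_ok = try_insert(  # NOT NULL: STU_NAME is missing
    "INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (1002, NULL, 21)")
unique_ok = try_insert(  # UNIQUE: the name 'Asha' already exists
    "INSERT INTO STUDENT (ROLL_NO, STU_NAME, STU_AGE) VALUES (1003, 'Asha', 22)")
```

Only the first row is stored; the other three are rejected before they reach the table.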

Key constraints:

PRIMARY KEY:

Primary key uniquely identifies each record in a table. It must have unique values and cannot
contain nulls. In the below example the ROLL_NO field is marked as primary key, that means
the ROLL_NO field cannot have duplicate and null values.

CREATE TABLE STUDENT (
    ROLL_NO INT NOT NULL,
    STU_NAME VARCHAR(35) NOT NULL UNIQUE,
    STU_AGE INT NOT NULL,
    STU_ADDRESS VARCHAR(35) UNIQUE,
    PRIMARY KEY (ROLL_NO)
);

FOREIGN KEY:

Foreign keys are the columns of a table that point to the primary key of another table. They act as a cross-reference between tables.

For example, the Stu_Id column in the Course_enrollment table below is a foreign key because it points to the primary key of the Student table.

Course_enrollment table:

Course_Id   Stu_Id
C01         101
C02         102
C03         101
C05         102
C06         103
C07         102

Student table:

Stu_Id   Stu_Name    Stu_Age
101      Chaitanya   22
102      Arya        26
103      Bran        25
104      Jon         21

Note: In practice, a foreign key does not have to reference the primary key of another table. If it points to a unique column (not necessarily a primary key) of another table, it is still a foreign key.
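
A minimal sketch of how a foreign key is enforced, using Python's sqlite3 module with the Student and Course_enrollment tables from the example. The column types and the extra course C09 are assumptions made for illustration. SQLite checks foreign keys only after PRAGMA foreign_keys = ON; most other DBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
    CREATE TABLE Student (
        Stu_Id   INT PRIMARY KEY,
        Stu_Name VARCHAR(30),
        Stu_Age  INT
    );
    CREATE TABLE Course_enrollment (
        Course_Id VARCHAR(10) PRIMARY KEY,
        Stu_Id    INT,
        FOREIGN KEY (Stu_Id) REFERENCES Student (Stu_Id)
    );
    INSERT INTO Student VALUES (101, 'Chaitanya', 22), (102, 'Arya', 26);
    INSERT INTO Course_enrollment VALUES ('C01', 101), ('C02', 102);
""")

# Student 999 does not exist, so this enrollment must be rejected.
try:
    conn.execute("INSERT INTO Course_enrollment VALUES ('C09', 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

enrollments = conn.execute("SELECT COUNT(*) FROM Course_enrollment").fetchone()[0]
```

The two valid enrollments are stored, while the row that refers to a non-existent student is refused.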

Domain constraints in DBMS

A table in a DBMS is a set of rows and columns that contain data. Each column in a table has a unique name; columns are often referred to as attributes. A domain is the set of values permitted for an attribute. For example, a month-of-year domain accepts January, February, ..., December as possible values, and an integer domain accepts whole numbers that are negative, positive or zero.

Definition: A domain constraint is a user-defined data type, which we can define like this:
Domain Constraint = data type + Constraints (NOT NULL / UNIQUE / PRIMARY KEY /
FOREIGN KEY / CHECK / DEFAULT)

Example:
For example, to create a table “student_info” whose “stu_id” field must have a value greater than 100, I can create a domain and a table like this:

create domain id_value int
    constraint id_test
    check (value > 100);

create table student_info (
    stu_id id_value PRIMARY KEY,
    stu_name varchar(30),
    stu_age int
);
Another example:
I want to create a table “bank_account” with an “account_type” field whose value is either “Checking” or “Saving”:

create domain account_type char(12)
    constraint acc_type_test
    check (value in ('Checking', 'Saving'));

create table bank_account (
    account_nbr int PRIMARY KEY,
    account_holder_name varchar(30),
    account_type account_type
);

Mapping constraints in DBMS

Mapping constraints can be explained in terms of mapping cardinality:

Mapping Cardinality:
One to One: An entity of entity-set A can be associated with at most one entity of entity-set B, and an entity in entity-set B can be associated with at most one entity of entity-set A.

One to Many: An entity of entity-set A can be associated with any number of entities of entity-set B, and an entity in entity-set B can be associated with at most one entity of entity-set A.

Many to One: An entity of entity-set A can be associated with at most one entity of entity-set B, and an entity in entity-set B can be associated with any number of entities of entity-set A.

Many to Many: An entity of entity-set A can be associated with any number of entities of entity-set B, and an entity in entity-set B can be associated with any number of entities of entity-set A.

We can have these constraints in place while creating tables in database.

Example:

CREATE TABLE Customer (
    customer_id int PRIMARY KEY,
    first_name varchar(20),
    last_name varchar(20)
);

CREATE TABLE Orders (
    order_id int PRIMARY KEY,
    customer_id int,
    order_details varchar(50),
    constraint fk_Customers foreign key (customer_id)
        references Customer (customer_id)
);

The second table is named Orders because ORDER is a reserved word in SQL. Assuming that a customer can place more than one order, the tables above represent a one-to-many relationship. Similarly, we can achieve the other mapping constraints based on the requirements.

View of Data in DBMS

Abstraction is one of the main features of database systems. Hiding irrelevant details from users and providing them with an abstract view of the data makes user-database interaction easier and more efficient. In the three-level DBMS architecture, the top level is the “view level”. The view level provides the “view of data” to the users and hides irrelevant details such as data relationships, database schema, constraints and security from them.

Data Abstraction in DBMS

Database systems are made up of complex data structures. To ease user interaction with the
database, the developers hide irrelevant internal details from users. This process of hiding
irrelevant details from the user is called data abstraction.

We have three levels of abstraction:


Physical level: This is the lowest level of data abstraction. It describes how data is actually
stored in database. You can get the complex data structure details at this level.

Logical level: This is the middle level of 3-level data abstraction architecture. It describes what
data is stored in database.

View level: Highest level of data abstraction. This level describes the user interaction with
database system.

Example: Let’s say we are storing customer information in a customer table. At the physical level, these records can be described as blocks of storage (bytes, gigabytes, terabytes, etc.) in memory. These details are often hidden from the programmers.

At the logical level, these records can be described as fields and attributes along with their data types, and the relationships among them can be logically implemented. Programmers generally work at this level because they are aware of such details of database systems.

At the view level, users just interact with the system through a GUI and enter details on the screen. They are not aware of how the data is stored or what data is stored; such details are hidden from them.

Relational Algebra
Relational database systems are expected to be equipped with a query language that can assist
its users to query the database instances. There are two kinds of query languages − relational
algebra and relational calculus.

Relational Algebra
Relational algebra is a procedural query language, which takes instances of relations as input
and yields instances of relations as output. It uses operators to perform queries. An operator can
be either unary or binary. They accept relations as their input and yield relations as their
output. Relational algebra is performed recursively on a relation and intermediate results are
also considered relations.

The fundamental operations of relational algebra are as follows −

• Select
• Project
• Union
• Set difference
• Cartesian product
• Rename

Select Operation (σ)
It selects tuples that satisfy the given predicate from a relation.

Notation − σ p (r)

Where σ is the selection operator, p is the selection predicate and r stands for the relation. p is a propositional logic formula which may use connectors like and, or, and not. Its terms may use relational operators like =, ≠, ≥, <, >, ≤.

For example −

σ subject = "database" (Books)

Output − Selects tuples from Books where subject is 'database'.

σ subject = "database" and price = 450 (Books)

Output − Selects tuples from Books where subject is 'database' and price is 450.

σ subject = "database" and price = 450 or year > 2010 (Books)

Output − Selects tuples from Books where subject is 'database' and price is 450, or those books
published after 2010.
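
As a rough illustration, selection can be modelled in Python as a filter over a list of rows. The Books rows and the helper name select are invented for this sketch; only the three predicates come from the examples above.

```python
# A relation modelled as a list of dicts; the rows are made-up sample data.
Books = [
    {"title": "DB Basics", "subject": "database", "price": 450, "year": 2008},
    {"title": "SQL Guide", "subject": "database", "price": 600, "year": 2012},
    {"title": "Nets 101",  "subject": "networks", "price": 450, "year": 2015},
]

def select(relation, predicate):
    """Selection: keep only the tuples for which the predicate is true."""
    return [t for t in relation if predicate(t)]

db_books = select(Books, lambda t: t["subject"] == "database")
cheap_db = select(Books, lambda t: t["subject"] == "database" and t["price"] == 450)
# 'and' binds more tightly than 'or', as in the third example.
mixed = select(
    Books,
    lambda t: (t["subject"] == "database" and t["price"] == 450) or t["year"] > 2010,
)
```

The result of every call is again a list of rows with the same attributes, which mirrors the fact that selection returns a relation.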

Project Operation (∏)


It projects the listed column(s) of a relation and discards all other columns.

Notation − ∏ A1, A2, ..., An (r)

Where A1, A2, ..., An are attribute names of relation r.

Duplicate rows are automatically eliminated, as a relation is a set.

For example −

∏ subject, author (Books)

Projects the columns named subject and author from the relation Books.
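
A small Python sketch of projection, including the duplicate elimination mentioned above. The sample rows, author names and the helper name project are assumptions for illustration.

```python
def project(relation, attributes):
    """Projection: keep the named columns and drop duplicate rows."""
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in result:  # a relation is a set, so duplicates are dropped
            result.append(row)
    return result

# Made-up sample data: two books share the same subject and author.
Books = [
    {"title": "DB Basics", "subject": "database", "author": "Rao"},
    {"title": "SQL Guide", "subject": "database", "author": "Rao"},
    {"title": "Nets 101",  "subject": "networks", "author": "Iyer"},
]

subject_author = project(Books, ["subject", "author"])
```

Three input rows produce only two output rows, because the pair (database, Rao) appears twice.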

Union Operation (∪)


It performs a binary union between two given relations and is defined as −

r ∪ s = { t | t ∈ r or t ∈ s }

Notation − r ∪ s

Where r and s are either database relations or relation result sets (temporary relations).

For a union operation to be valid, the following conditions must hold −

• r and s must have the same number of attributes.
• Attribute domains must be compatible.

Duplicate tuples are automatically eliminated from the result.

∏ author (Books) ∪ ∏ author (Articles)

Output − Projects the names of the authors who have written a book, an article, or both.

Set Difference (−)


The result of set difference operation is tuples, which are present in one relation but are not in
the second relation.

Notation − r − s

Finds all the tuples that are present in r but not in s.

∏ author (Books) − ∏ author (Articles)


Output − Provides the name of authors who have written books but not articles.
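
Union and set difference map directly onto Python's built-in set type. The author names below are made-up sample data standing in for the projected author columns of Books and Articles.

```python
# Each relation is a set of one-attribute tuples (author,).
book_authors = {("Rao",), ("Iyer",)}
article_authors = {("Iyer",), ("Das",)}

# Union: authors of a book, an article, or both (duplicates collapse).
either = book_authors | article_authors

# Set difference: authors who have written books but not articles.
books_only = book_authors - article_authors
```

Both operands have the same number of attributes with compatible domains, which is the condition stated for union above.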

Cartesian Product (×)

Combines information of two different relations into one.

Notation − r × s

Where r and s are relations and their output will be defined as −

r × s = { q t | q ∈ r and t ∈ s }

σ author = 'tutorialspoint' (Books × Articles)

Output − Yields a relation, which shows all the books and articles written by tutorialspoint.
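
The definition of the Cartesian product can be sketched with itertools.product, which pairs every tuple of one relation with every tuple of the other. The two small relations and their (id, author) layout are assumptions for illustration.

```python
from itertools import product

# Made-up relations with attributes (id, author).
books = [("b1", "Rao"), ("b2", "Iyer")]
articles = [("a1", "Rao")]

# Cartesian product: concatenate every tuple q of books with every tuple t of articles.
books_x_articles = [q + t for q, t in product(books, articles)]

# A selection on top of the product keeps rows where both author columns are 'Rao'.
by_rao = [row for row in books_x_articles if row[1] == "Rao" and row[3] == "Rao"]
```

The product has len(books) * len(articles) rows, which is why it is normally followed by a selection.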

Rename Operation (ρ)


The results of relational algebra are also relations but without any name. The rename operation
allows us to rename the output relation. 'rename' operation is denoted with small Greek
letter rho ρ.

Notation − ρ x (E)

Where the result of expression E is saved with name of x.

Additional operations are −

• Set intersection
• Assignment
• Natural join

SET Operations in SQL


SQL supports a few set operations that can be performed on table data. They are used to combine the results of two or more SELECT statements under different conditions.
In this tutorial, we will cover four types of set operations, along with examples:

1. UNION
2. UNION ALL
3. INTERSECT
4. MINUS

UNION Operation
UNION is used to combine the results of two or more SELECT statements. However it will
eliminate duplicate rows from its resultset. In case of union, number of columns and datatype
must be same in both the tables, on which UNION operation is being applied.

Example of UNION
The First table,

ID   Name
1    abhi
2    adam

The Second table,

ID   Name
2    adam
3    Chester

Union SQL query will be,

SELECT * FROM First
UNION
SELECT * FROM Second;

The resultset table will look like,

ID   NAME
1    abhi
2    adam
3    Chester

UNION ALL
This operation is similar to Union. But it also shows the duplicate rows.

Example of Union All


The First table,

ID   NAME
1    abhi
2    adam

The Second table,

ID   NAME
2    adam
3    Chester

Union All query will be like,

SELECT * FROM First
UNION ALL
SELECT * FROM Second;

The resultset table will look like,

ID   NAME
1    abhi
2    adam
2    adam
3    Chester

INTERSECT
The INTERSECT operation combines two SELECT statements, but it returns only the records that are common to both SELECT statements. As with UNION, the number of columns and their data types must be the same.

NOTE: MySQL supports the INTERSECT operator only from version 8.0.31 onward; earlier versions do not support it.

Example of Intersect
The First table,

ID   NAME
1    abhi
2    adam

The Second table,

ID   NAME
2    adam
3    Chester

Intersect query will be,

SELECT * FROM First
INTERSECT
SELECT * FROM Second;

The resultset table will look like

ID   NAME
2    adam

MINUS
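The MINUS operation combines the results of two SELECT statements and returns only those rows that belong to the first result set but not to the second. MINUS is the Oracle keyword; standard SQL and several other systems call the same operation EXCEPT.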

Example of Minus
The First table,

ID   NAME
1    abhi
2    adam

The Second table,

ID   NAME
2    adam
3    Chester

Minus query will be,

SELECT * FROM First
MINUS
SELECT * FROM Second;

The resultset table will look like,

ID   NAME
1    abhi
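
The four set operations can be checked against the sample data with Python's sqlite3 module. This is a sketch under two assumptions: the tables are named first_table and second_table here (hypothetical names chosen to avoid any keyword clash), and SQLite spells the MINUS operation as EXCEPT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_table  (ID INT, Name VARCHAR(20));
    CREATE TABLE second_table (ID INT, Name VARCHAR(20));
    INSERT INTO first_table  VALUES (1, 'abhi'), (2, 'adam');
    INSERT INTO second_table VALUES (2, 'adam'), (3, 'Chester');
""")

def run(operator: str):
    """Combine the two tables with the given set operator, sorted by ID."""
    sql = (f"SELECT * FROM first_table {operator} "
           f"SELECT * FROM second_table ORDER BY ID")
    return conn.execute(sql).fetchall()

union_rows = run("UNION")          # duplicates removed
union_all_rows = run("UNION ALL")  # duplicates kept
intersect_rows = run("INTERSECT")  # rows common to both tables
minus_rows = run("EXCEPT")         # rows in the first table only (MINUS in Oracle)
```

The four results match the result tables shown for UNION, UNION ALL, INTERSECT and MINUS.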

Division Operator in SQL


The division operator is used when we have to evaluate queries which contain the keyword ALL.
Some instances where division operator is used are:

1. Which person has account in all the banks of a particular city?


2. Which students have taken all the courses required to graduate?
In the problem statements above, the description after the keyword 'all' defines a set of elements, and the final result contains those units that satisfy the requirement for every element of that set.
Another way to recognize the use of the division operator is through the logical implication if...then. In the context of the two examples above, the queries mean that:

1. If there is a bank in that particular city, that person must have an account in that bank.
2. If there is a course in the list of courses required to graduate, that person must have taken that course.

Do not worry if all of this is not clear right away; we will explain it as we move on with this tutorial.
We shall see the second example, mentioned above, in detail.
Table 1: Course_Taken → It consists of the names of students against the courses that they have taken.

Student_Name   Course
Robert         Databases
Robert         Programming Languages
David          Databases
David          Operating Systems
Hannah         Programming Languages
Hannah         Machine Learning
Tom            Operating Systems

Table 2: Course_Required → It consists of the courses that one is required to take in order to graduate.

Course
Databases
Programming Languages

Using Division Operator


So now, let's try to find the correct SQL query for the second example, which is:
Query: Find all the students who can graduate (i.e. who have taken all the subjects required for one to graduate).

Unfortunately, SQL has no direct way to express the division operator. Let's walk through the steps to write the query for the division operator.

1. Find all the students


Create a set of all students who have taken courses. This can be done easily using the following command.

CREATE TABLE AllStudents AS SELECT DISTINCT Student_Name FROM Course_Taken;

This command creates the table AllStudents with the following rows:

Student_Name
Robert
David
Hannah
Tom

2. Find all the students and the courses required to graduate


Next, we will create a set of students and the courses they need to graduate. We can express this as the Cartesian product of AllStudents and Course_Required using the following command.

CREATE TABLE StudentsAndRequired AS
    SELECT AllStudents.Student_Name, Course_Required.Course
    FROM AllStudents, Course_Required;

The new table StudentsAndRequired will be:

Student_Name   Course
Robert         Databases
Robert         Programming Languages
David          Databases
David          Programming Languages
Hannah         Databases
Hannah         Programming Languages
Tom            Databases
Tom            Programming Languages

3. Find all the students and the required courses they have not taken
Here, we take the first step towards finding the students who cannot graduate. The idea is simply to find the students who have not taken certain courses that are required for graduation and who therefore will not be able to graduate. These are all the tuples/rows that are present in StudentsAndRequired and not present in Course_Taken.

CREATE TABLE StudentsAndNotTaken AS
    SELECT * FROM StudentsAndRequired WHERE NOT EXISTS
        (SELECT * FROM Course_Taken
         WHERE StudentsAndRequired.Student_Name = Course_Taken.Student_Name
         AND StudentsAndRequired.Course = Course_Taken.Course);

The table StudentsAndNotTaken comes out to be:

Student_Name   Course
David          Programming Languages
Hannah         Databases
Tom            Databases
Tom            Programming Languages

4. Find all students who cannot graduate


All the students who are present in the table StudentsAndNotTaken are the ones who cannot graduate. Therefore, we can find the students who cannot graduate as:

CREATE TABLE CannotGraduate AS
    SELECT DISTINCT Student_Name FROM StudentsAndNotTaken;

Student_Name
David
Hannah
Tom

5. Find all students who can graduate


The students who can graduate are simply those who are present in AllStudents but not in CannotGraduate. This can be done by the following query:

CREATE TABLE CanGraduate AS
    SELECT * FROM AllStudents
    WHERE NOT EXISTS
        (SELECT * FROM CannotGraduate
         WHERE CannotGraduate.Student_Name = AllStudents.Student_Name);

The results will be as follows:

Student_Name
Robert

Hence we have seen how the different steps lead us to the final answer. Now let us see how to write all five steps as one single query so that we do not have to create so many tables.

SELECT DISTINCT x.Student_Name FROM Course_Taken AS x WHERE NOT EXISTS
    (SELECT * FROM Course_Required AS y WHERE NOT EXISTS
        (SELECT * FROM Course_Taken AS z
         WHERE z.Student_Name = x.Student_Name
         AND z.Course = y.Course));

Student_Name
Robert

This gives us the same result as the five steps above.
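
The single-query form can be run against the sample tables with Python's sqlite3 module, as sketched below. The column types are assumptions. The second query is a different, commonly used formulation (counting the matched required courses) and is shown only for comparison; it is not part of the steps above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Course_Taken (Student_Name VARCHAR(30), Course VARCHAR(40));
    CREATE TABLE Course_Required (Course VARCHAR(40));
    INSERT INTO Course_Taken VALUES
        ('Robert', 'Databases'), ('Robert', 'Programming Languages'),
        ('David', 'Databases'), ('David', 'Operating Systems'),
        ('Hannah', 'Programming Languages'), ('Hannah', 'Machine Learning'),
        ('Tom', 'Operating Systems');
    INSERT INTO Course_Required VALUES ('Databases'), ('Programming Languages');
""")

# Division via double NOT EXISTS: students for whom no required course is missing.
double_not_exists = """
    SELECT DISTINCT x.Student_Name FROM Course_Taken AS x
    WHERE NOT EXISTS (
        SELECT * FROM Course_Required AS y
        WHERE NOT EXISTS (
            SELECT * FROM Course_Taken AS z
            WHERE z.Student_Name = x.Student_Name AND z.Course = y.Course))
"""
can_graduate = [row[0] for row in conn.execute(double_not_exists)]

# Alternative formulation: a student qualifies when the number of required
# courses taken equals the total number of required courses.
counting = """
    SELECT t.Student_Name
    FROM Course_Taken AS t JOIN Course_Required AS r ON t.Course = r.Course
    GROUP BY t.Student_Name
    HAVING COUNT(DISTINCT t.Course) = (SELECT COUNT(*) FROM Course_Required)
"""
can_graduate_by_count = [row[0] for row in conn.execute(counting)]
```

Both queries return only Robert, the one student who has taken every required course.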
SQL | Join (Inner, Left, Right and Full Joins)

A SQL Join statement is used to combine data or rows from two or more tables based on a
common field between them. Different types of Joins are:
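• INNER JOIN
• LEFT JOIN
• RIGHT JOIN
• FULL JOIN

Consider two tables, Student and StudentCourse. The queries in this section use the columns ROLL_NO, NAME and AGE of Student and the columns ROLL_NO and COURSE_ID of StudentCourse.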

The simplest Join is INNER JOIN.


1. INNER JOIN: The INNER JOIN keyword selects the rows from both tables for which the join condition is satisfied. It creates the result set by combining each pair of rows from the two tables where the condition holds, i.e. where the value of the common field is the same.
Syntax:

SELECT table1.column1, table1.column2, table2.column1, ....
FROM table1
INNER JOIN table2
ON table1.matching_column = table2.matching_column;

table1: First table.
table2: Second table.
matching_column: Column common to both the tables.

Note: We can also write JOIN instead of INNER JOIN. JOIN is the same as INNER JOIN.

Example Queries(INNER JOIN)


This query shows the names and ages of students enrolled in different courses.

SELECT StudentCourse.COURSE_ID, Student.NAME, Student.AGE FROM Student
INNER JOIN StudentCourse
ON Student.ROLL_NO = StudentCourse.ROLL_NO;

2. LEFT JOIN: This join returns all the rows of the table on the left side of the join and the matching rows of the table on the right side of the join. For rows that have no matching row on the right side, the result set contains NULL. LEFT JOIN is also known as LEFT OUTER JOIN.
Syntax:

SELECT table1.column1, table1.column2, table2.column1, ....
FROM table1
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;

table1: First table.
table2: Second table.
matching_column: Column common to both the tables.

Note: We can also use LEFT OUTER JOIN instead of LEFT JOIN; both are the same.

Example Queries (LEFT JOIN):

SELECT Student.NAME, StudentCourse.COURSE_ID
FROM Student
LEFT JOIN StudentCourse
ON StudentCourse.ROLL_NO = Student.ROLL_NO;

3. RIGHT JOIN: RIGHT JOIN is similar to LEFT JOIN. This join returns all the rows of the table on the right side of the join and the matching rows of the table on the left side of the join. For rows that have no matching row on the left side, the result set contains NULL. RIGHT JOIN is also known as RIGHT OUTER JOIN.
Syntax:

SELECT table1.column1, table1.column2, table2.column1, ....
FROM table1
RIGHT JOIN table2
ON table1.matching_column = table2.matching_column;

table1: First table.
table2: Second table.
matching_column: Column common to both the tables.

Note: We can also use RIGHT OUTER JOIN instead of RIGHT JOIN; both are the same.

Example Queries (RIGHT JOIN):

SELECT Student.NAME, StudentCourse.COURSE_ID
FROM Student
RIGHT JOIN StudentCourse
ON StudentCourse.ROLL_NO = Student.ROLL_NO;

4. FULL JOIN: FULL JOIN creates the result set by combining the results of both LEFT JOIN and RIGHT JOIN. The result set contains all the rows from both tables. For rows that have no match in the other table, the result set contains NULL values.
Syntax:

SELECT table1.column1, table1.column2, table2.column1, ....
FROM table1
FULL JOIN table2
ON table1.matching_column = table2.matching_column;

table1: First table.
table2: Second table.
matching_column: Column common to both the tables.

Example Queries (FULL JOIN):

SELECT Student.NAME, StudentCourse.COURSE_ID
FROM Student
FULL JOIN StudentCourse
ON StudentCourse.ROLL_NO = Student.ROLL_NO;
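
The difference between the join types is easiest to see on a small data set. The sketch below uses Python's sqlite3 module; the rows are hypothetical sample data, not the original Student and StudentCourse contents. Because older SQLite versions lack RIGHT JOIN, the right join is written as a LEFT JOIN with the two tables swapped, which returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (ROLL_NO INT PRIMARY KEY, NAME VARCHAR(30), AGE INT);
    CREATE TABLE StudentCourse (COURSE_ID INT, ROLL_NO INT);
    -- Meena (roll 3) has no course; course 3 belongs to an unknown roll number 9.
    INSERT INTO Student VALUES (1, 'Asha', 18), (2, 'Ravi', 19), (3, 'Meena', 20);
    INSERT INTO StudentCourse VALUES (1, 1), (2, 2), (3, 9);
""")

inner_rows = conn.execute("""
    SELECT Student.NAME, StudentCourse.COURSE_ID
    FROM Student
    INNER JOIN StudentCourse ON Student.ROLL_NO = StudentCourse.ROLL_NO
    ORDER BY Student.ROLL_NO
""").fetchall()

left_rows = conn.execute("""
    SELECT Student.NAME, StudentCourse.COURSE_ID
    FROM Student
    LEFT JOIN StudentCourse ON Student.ROLL_NO = StudentCourse.ROLL_NO
    ORDER BY Student.ROLL_NO
""").fetchall()

# Equivalent of Student RIGHT JOIN StudentCourse, with the tables swapped.
right_rows = conn.execute("""
    SELECT Student.NAME, StudentCourse.COURSE_ID
    FROM StudentCourse
    LEFT JOIN Student ON Student.ROLL_NO = StudentCourse.ROLL_NO
    ORDER BY StudentCourse.COURSE_ID
""").fetchall()
```

The inner join keeps only matched pairs, the left join adds Meena with a NULL course, and the right join adds course 3 with a NULL name. A full join would contain all four rows.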

SQL Aggregate Functions

• Aggregate functions perform a calculation on a set of values and return a single value.
• Aggregate functions ignore NULL values, except for COUNT(*), which counts every row.
• They are often used with the GROUP BY clause of the SELECT statement.
Following are the aggregate functions:
1. AVG
2. MAX
3. MIN
4. SUM
5. COUNT()
6. COUNT(*)

Example

<Employee> Table

Eid    Ename   Age   City        Salary
E001   ABC     29    Pune        20000
E002   PQR     30    Pune        30000
E003   LMN     25    Mumbai      5000
E004   XYZ     24    Mumbai      4000
E005   STU     32    Bangalore   25000



Function | Description | Syntax | Example | Output
AVG | Returns the average of the data values. | SELECT AVG(<column_name>) FROM <table_name>; | SELECT AVG(Salary) FROM Employee; | 16800
MAX | Returns the maximum value for a column. | SELECT MAX(<column_name>) FROM <table_name>; | SELECT MAX(Salary) FROM Employee; | 30000
MIN | Returns the minimum value for a column. | SELECT MIN(<column_name>) FROM <table_name>; | SELECT MIN(Salary) FROM Employee; | 4000
SUM | Returns the sum (addition) of the data values. | SELECT SUM(<column_name>) FROM <table_name>; | SELECT SUM(Salary) FROM Employee WHERE City='Pune'; | 50000
COUNT() | Returns the total number of values in a given column. | SELECT COUNT(<column_name>) FROM <table_name>; | SELECT COUNT(Eid) FROM Employee; | 5
COUNT(*) | Returns the number of rows in a table. | SELECT COUNT(*) FROM <table_name>; | SELECT COUNT(*) FROM Employee; | 5
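
The values in the table can be reproduced with Python's sqlite3 module using the Employee rows above. The column types are assumptions, and the final GROUP BY query is an extra illustration of how aggregates combine with grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (
        Eid VARCHAR(4) PRIMARY KEY, Ename VARCHAR(20),
        Age INT, City VARCHAR(20), Salary INT
    );
    INSERT INTO Employee VALUES
        ('E001', 'ABC', 29, 'Pune', 20000),
        ('E002', 'PQR', 30, 'Pune', 30000),
        ('E003', 'LMN', 25, 'Mumbai', 5000),
        ('E004', 'XYZ', 24, 'Mumbai', 4000),
        ('E005', 'STU', 32, 'Bangalore', 25000);
""")

# Several aggregates can be computed in one SELECT.
avg_salary, max_salary, min_salary, row_count = conn.execute(
    "SELECT AVG(Salary), MAX(Salary), MIN(Salary), COUNT(*) FROM Employee"
).fetchone()

# An aggregate restricted by a WHERE clause.
pune_total = conn.execute(
    "SELECT SUM(Salary) FROM Employee WHERE City = 'Pune'"
).fetchone()[0]

# With GROUP BY, the aggregate is computed once per city.
per_city = conn.execute(
    "SELECT City, COUNT(*), SUM(Salary) FROM Employee GROUP BY City ORDER BY City"
).fetchall()
```

Without GROUP BY each aggregate returns one value for the whole table; with GROUP BY it returns one value per group.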
