Unit 1 AND 2
Introduction of DBMS
A database is a collection of interrelated data organized so that the data can be retrieved,
inserted, and deleted efficiently. The data is organized in the form of tables, schemas, views,
reports, etc.
For example, a college database organizes the data about the admin, staff, students,
faculty, etc.
What is Data?
Data is a collection of distinct small units of information. It can take a variety of forms,
such as text, numbers, media, and bytes, and it can be stored on paper or in electronic memory.
The word 'data' originates from the word 'datum', which means 'a single piece of information'.
'Data' is the plural of 'datum'.
What is Database?
A database is an organized collection of data, so that it can be easily accessed and managed.
You can organize data into tables, rows, columns, and index it to make it easier to find
relevant information.
There are many databases available like MySQL, Sybase, Oracle, MongoDB, Informix,
PostgreSQL, SQL Server, etc.
SQL (Structured Query Language) is used to operate on the data stored in a relational
database. SQL is based on relational algebra and tuple relational calculus.
The Evolution
File-Based
File-based systems were introduced in 1968. In a file-based system, data is maintained in
flat files. Files have advantages, but they also have several limitations.
One of the major advantages is that the file system offers various access methods, e.g.,
sequential, indexed, and random.
Hierarchical Database
1968-1980 was the era of the hierarchical database. The most prominent hierarchical database
was IMS (Information Management System), IBM's first DBMS.
Relational Database
1970 - Present: This is the era of the relational database and database management. In 1970,
the relational model was proposed by E.F. Codd.
The relational model has two main terms: instance and schema.
A schema specifies the structure of a relation: its name and the name and type of each column.
An instance is the set of rows stored in the relation at a given moment.
Cloud database
A cloud database lets you store, manage, and retrieve structured and unstructured data via a
cloud platform. The data is accessible over the Internet. Cloud databases are also called
database as a service (DBaaS) because they are offered as a managed service.
Database Management System
o A database management system (DBMS) is software used to manage a database. For
example, MySQL and Oracle are popular DBMSs used in many different applications.
o A DBMS provides an interface to perform operations such as creating a database,
creating tables in it, storing data, updating data, and more.
o It provides protection and security for the database. When there are multiple users, it
also maintains data consistency.
A DBMS supports the following tasks:
o Data Definition: Creating, modifying, and removing the definitions that describe the
organization of data in the database.
o Data Update: Inserting, modifying, and deleting the actual data in the database.
o Data Retrieval: Retrieving data from the database so that applications can use it for
various purposes.
o User Administration: Registering and monitoring users, maintaining data integrity,
enforcing data security, handling concurrency control, monitoring performance, and
recovering information corrupted by unexpected failures.
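The first three tasks above can be tried out with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database; the student table and its rows are made up for illustration. SQLite has no user accounts, so user administration is not shown.

```python
import sqlite3

# Hypothetical college database, kept in memory for the example.
conn = sqlite3.connect(":memory:")

# Data definition: create the structure that will hold the data.
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, course TEXT)"
)

# Data update: insert, modify, and delete actual rows.
conn.executemany(
    "INSERT INTO student (roll_no, name, course) VALUES (?, ?, ?)",
    [(1, "Asha", "BCA"), (2, "Ravi", "BSc"), (3, "Meena", "BCA")],
)
conn.execute("UPDATE student SET course = 'MCA' WHERE roll_no = 2")
conn.execute("DELETE FROM student WHERE roll_no = 3")

# Data retrieval: read the data back for use by an application.
rows = conn.execute(
    "SELECT roll_no, name, course FROM student ORDER BY roll_no"
).fetchall()
print(rows)
```

After these statements only two rows remain, and the second one carries the updated course.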
Characteristics of DBMS
Advantages of DBMS
o Controls data redundancy: A DBMS can control data redundancy because all the data is
stored in a single, centralized database rather than duplicated across separate files.
o Data sharing: Authorized users of an organization can share the data among
themselves.
o Easy maintenance: The system is easy to maintain because of the centralized nature of
the database.
o Reduced time: It reduces development time and maintenance needs.
o Backup: It provides backup and recovery subsystems that automatically back up the
data to protect against hardware and software failures and restore it when required.
o Multiple user interfaces: It provides different types of user interfaces, such as graphical
user interfaces and application program interfaces.
Disadvantages of DBMS
o Cost of hardware and software: Running DBMS software requires a high-speed processor
and a large amount of memory.
o Size: It occupies a large amount of disk space and memory in order to run efficiently.
o Complexity: A database system creates additional complexity and requirements.
o Higher impact of failure: A failure has a large impact because in most organizations
all the data is stored in a single database. If that database is damaged by a power
failure or corruption, the data may be lost forever.
Applications of DBMS
without going to the bank. This is all possible just because of DBMS that
manages all the bank transactions.
Universities and colleges − Nowadays examinations are conducted online, so
universities and colleges maintain a DBMS to store students' registration
details, results, courses, and grades.
Telecommunications − No telecommunication company can operate without a
DBMS. These companies use a DBMS to store call details and monthly
postpaid bills.
Credit card transactions − Purchases and credit card transactions are made
possible by DBMS. Credit card holders' information is kept secure through the
DBMS.
Social Media Sites − We access social media platforms by filling in the
required details. Many users sign up daily on social websites such as
Facebook, Pinterest, and Instagram. All the information related to these users
is stored and maintained with the help of a DBMS.
Finance − Finance involves tasks such as storing sales data, holding
information, and managing financial statements. All of these can be done with
database systems.
Military − DBMS plays a vital role in the military. The military keeps records
of soldiers and has many files that must be kept safe and secure. A DBMS
provides a high level of security for military information.
Online Shopping − Nowadays we shop online instead of spending time going
to shops. Products are added and sold with the help of a DBMS, which stores
purchase information, invoices, and payments.
Human Resource Management − The management keeps records of each
employee’s salary, tax, and work through a DBMS.
Manufacturing − Manufacturing companies make and sell products on a daily
basis. A DBMS is used to keep records of all those details.
Airline Reservation system − Just like the railway reservation system, airlines
need a DBMS to keep records of flight arrivals, departures, and delays.
Different types of Database Users
1. Database Administrator (DBA): The Database Administrator (DBA) is a
person or team who defines the schema and controls the three levels of the database.
The DBA creates a new account ID and password for any user who needs to
access the database. The DBA is also responsible for the security of the database
and allows only authorized users to access or modify it. The DBA is responsible
for handling problems such as security breaches and poor system response time.
The DBA also monitors backup and recovery and provides technical
support.
The DBA has a DBA account in the DBMS, which is called a system or
superuser account.
The DBA repairs damage caused by hardware and/or software failures.
The DBA has the privileges to perform DCL (Data Control Language)
operations such as GRANT and REVOKE, which allow or restrict a
particular user's access to the database.
2. Simple / Parametric End Users: Parametric end users are unsophisticated users who
have no DBMS knowledge but frequently use database applications in their daily
life to get the desired results. For example, people who book railway tickets are
naive users. Bank clerks are naive users too, because they have no DBMS
knowledge but still use the database to perform their given tasks.
3. System Analyst:
System Analyst is a user who analyses the requirements of parametric end users.
They check whether all the requirements of end users are satisfied.
4. Sophisticated Users: Sophisticated users can be engineers, scientists, or business
analysts who are familiar with the database. They can develop their own database
applications according to their requirements. They don’t write program code; instead,
they interact with the database by writing SQL queries directly through the query processor.
5. Database Designers: Database designers are the users who design the structure of the
database, which includes tables, indexes, views, triggers, stored procedures, and
constraints; these are usually defined before the database is created or populated
with data. They decide what data must be stored and how the data items are
related. It is the responsibility of database designers to understand the requirements
of different user groups and then create a design that satisfies the needs of all of
them.
6. Application Programmers: Application programmers, also referred to as system
analysts or simply software engineers, are the back-end programmers who write
the code for application programs. They are computer professionals. These
programs can be written in programming languages such as Visual Basic,
Developer, C, FORTRAN, COBOL, etc. Application programmers design, debug,
test, and maintain sets of programs called “canned transactions”, which naive
(parametric) users use to interact with the database.
7. Casual Users / Temporary Users: Casual users access the database only
occasionally, but each time they do, they require new information. An example is
a middle- or higher-level manager.
The Database Management System (DBMS) architecture shows how data in the database is
viewed by the users. It is not concerned with how the data is handled and processed by the
DBMS.
It helps in the design, implementation, and maintenance of a database that stores and
organizes information for companies. The concept of a DBMS depends upon its architecture.
The architecture can be designed as centralized, decentralized, or hierarchical.
The architecture of DBMS can be defined at three levels as follows −
External level.
Conceptual level.
Internal level.
The main objective of the three-level architecture is to separate each user's view of the data
from the way the database is physically represented. The internal structure of the database
should be unaffected by changes to the physical aspects of storage.
The DBA should be able to change the conceptual structure of the database without affecting
all users.
External level/ View level
The external level describes the part of the database that is relevant to each user. This level
insulates the users from the details of the conceptual and internal levels.
Conceptual level/ logic level
The conceptual level describes what data is stored in the database and the relationships
among the data.
It represents the following −
All the entities, attributes and their relationships.
The constraints on the data.
Security and integrity information.
Internal level/ storage level
Internal level is the physical representation of the database on the computer. This level
describes how the data is stored in the database. It covers the data structure and file
organization used to store the data on storage devices.
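One way to picture the three levels is the rough sketch below, which uses Python's built-in sqlite3 module. The mapping is only an approximation made for illustration: a table definition stands in for the conceptual level, a view for the external level, and an index for the internal level. The employee table and all names in it are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: what data is stored and which constraints apply to it.
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT NOT NULL,
        salary REAL NOT NULL CHECK (salary >= 0)
    )
    """
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 30000.0), (2, "Ravi", "HR", 28000.0)],
)

# External level: a view that exposes only the part relevant to one user group
# (here, a staff directory that hides the salary column).
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name, dept FROM employee"
)

# Internal level: a storage structure (an index) chosen for faster access.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
view_columns = [column[0] for column in cursor.description]
directory = cursor.fetchall()
print(view_columns, directory)
```

Users of the view never see the salary column or the index; they only see the three columns of their own external view.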
The levels in the architecture of DBMS are shown below in diagram form −
Data Models
A data model describes the data, its semantics, and its consistency constraints. It provides
the conceptual tools for describing the design of a database at each level of data
abstraction. The following four data models are used for understanding the structure of the
database:
Data model
Data models describe how a database's logical structure is represented. In a database
management system, data models are essential for introducing abstraction. Data models
specify how data is linked to one another, as well as how it is handled and stored within the
system.
1) Relational Data Model: This type of model organizes the data in the form of rows and
columns within a table. Thus, a relational model uses tables to represent both the data and
the relationships among the data. Tables are also called relations. This model was initially
described by Edgar F. Codd in 1969. The relational data model is the most widely used model
and is primarily used by commercial data processing applications.
4) Semi-structured Data Model: This type of data model differs from the other three
data models. The semi-structured data model permits data specifications in which
individual data items of the same type may have different sets of attributes.
The Extensible Markup Language, also known as XML, is widely used for representing
semi-structured data. Although XML was initially designed for adding markup
information to text documents, it gained importance because of its application in the
exchange of data.
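The idea that items of the same type may carry different attribute sets can be seen in a short sketch. It parses a made-up XML fragment with Python's standard xml.etree.ElementTree module; the element names and values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two <student> items of the same type with different sets of fields.
xml_text = """
<students>
  <student id="1"><name>Asha</name><email>asha@example.com</email></student>
  <student id="2"><name>Ravi</name><phone>555-0100</phone><hostel>B</hostel></student>
</students>
"""

root = ET.fromstring(xml_text)

# Collect the field names each student actually has.
fields_per_student = {
    student.get("id"): sorted(child.tag for child in student)
    for student in root.findall("student")
}
print(fields_per_student)
```

A relational table would force both students into the same set of columns, whereas here each item simply lists the fields it has.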
Database Schema
A database schema is a structure that represents the logical storage of the data in a
database. It represents the organization of data and provides information about the
relationships between the tables in a given database.
A schema is the “blueprint” of a database: it describes how the data may relate to other
tables or other data models.
o A database schema is the logical representation of a database, which shows how the
data is stored logically in the entire database. It contains a list of attributes and
instructions that tell the database engine how the data is organized and how
the elements are related to each other.
o A database schema contains schema objects that may include tables, fields,
packages, views, relationships, primary keys, and foreign keys.
o In practice, the data is physically stored in files that may be in unstructured form, but
to retrieve and use it, we need to put it in a structured form. A database schema is
used to do this. It describes how the data is organized in a database and how it is
associated with other data.
1. Logical Schema
2. Physical Schema
3. View Schema
A physical database schema specifies how the data is stored physically on a storage system
or disk in the form of files and indices. The design of a database at the physical level is
called the physical schema.
The logical database schema specifies all the logical constraints that need to be applied to
the stored data. It defines the tables, views, and integrity constraints. Here, integrity
constraints are the set of rules that the DBMS (Database Management System) uses to
maintain the quality of the data when it is inserted or updated. The logical schema
represents how the data is stored in the form of tables and how the attributes of a table are
linked together.
Programmers and administrators work at this level, and the implementation of the data
structures is hidden at this level.
Various tools are used to create a logical database schema. These tools show the
relationships between the components of your data; this process is called ER modelling.
ER modelling stands for entity-relationship modelling, which specifies the relationships
between different entities.
We can understand it with an example of a basic commerce application. Below is the schema
diagram, a simple ER model representing the logical flow of a transaction in a commerce
application.
In the given example, an Id is given in each circle, and these Ids are primary keys and
foreign keys.
The primary key is used to uniquely identify a record in a table. The Ids of the upper three
circles are the primary keys.
A foreign key is a column in one table that refers to the primary key of another table. FK
represents the foreign key in the diagram. It relates one table to another table.
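The behaviour of primary and foreign keys can be checked with a minimal sketch using Python's built-in sqlite3 module. The customer and orders tables are hypothetical stand-ins for the commerce example and are not taken from the diagram. Note that SQLite enforces foreign keys only after the PRAGMA shown below is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
        amount      REAL NOT NULL
    )
    """
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha')")
conn.execute("INSERT INTO orders VALUES (100, 1, 250.0)")  # refers to an existing customer


def try_insert(sql):
    """Run an INSERT and report whether the DBMS accepted or rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


# A second customer with the same primary key value is not allowed.
duplicate_primary_key = try_insert("INSERT INTO customer VALUES (1, 'Ravi')")
# An order that points at a customer that does not exist is not allowed.
dangling_foreign_key = try_insert("INSERT INTO orders VALUES (101, 99, 80.0)")
print(duplicate_primary_key, dangling_foreign_key)
```

Both inserts are refused: the first violates the uniqueness of the primary key, and the second refers to a customer_id that is not present in the customer table.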
3. View Schema
The view level design of a database is known as view schema. This schema generally
describes the end-user interaction with the database systems.
It contains both primary & secondary Keys. It also contains both primary & secondary keys.
It contains the column names and their data types. It does not contain any column name or datatype.
DBMS Architecture
o The design of a DBMS depends upon its architecture. The basic client/server
architecture is used to deal with a large number of PCs, web servers, database servers,
and other components that are connected by networks.
o The client/server architecture consists of many PCs and workstations that are
connected via the network.
o DBMS architecture depends upon how users are connected to the database to get their
requests done.
1-Tier Architecture
o In this architecture, the database is directly available to the user. It means the user can
work directly on the DBMS.
o Any changes made here are applied directly to the database itself. It doesn't provide
a handy tool for end users.
o The 1-Tier architecture is used for the development of local applications, where
programmers can communicate directly with the database for a quick response.
2-Tier Architecture
o The 2-Tier architecture is the same as the basic client-server architecture. In the
two-tier architecture, applications on the client end can communicate directly with the
database on the server side. APIs such as ODBC and JDBC are used for this interaction.
o The user interfaces and application programs run on the client side.
o The server side is responsible for providing functionality such as query processing and
transaction management.
o To communicate with the DBMS, the client-side application establishes a connection
with the server side.
3-Tier Architecture
o The 3-Tier architecture contains another layer between the client and the server. In
this architecture, the client can't communicate directly with the server.
o The application on the client end interacts with an application server, which in turn
communicates with the database system.
o The end user has no idea about the existence of the database beyond the application
server. The database also has no idea about any user beyond the application server.
o The 3-Tier architecture is used for large web applications.
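The separation between the three tiers can be sketched in a few lines of Python. This is only an in-process illustration under simplifying assumptions: in a real system each tier would run on a different machine and communicate over a network, and the class and function names here (DatabaseTier, ApplicationServer, client) are invented for the example.

```python
import sqlite3


class DatabaseTier:
    """Data tier: the only layer that talks to the DBMS."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
        )
        self.conn.executemany(
            "INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)]
        )

    def balance(self, account_id):
        row = self.conn.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)
        ).fetchone()
        return None if row is None else row[0]


class ApplicationServer:
    """Middle tier: applies the rules and hides the database from the client."""

    def __init__(self, database):
        self._database = database

    def handle_balance_request(self, account_id):
        balance = self._database.balance(account_id)
        if balance is None:
            return {"status": "error", "message": "unknown account"}
        return {"status": "ok", "balance": balance}


def client(server, account_id):
    """Client tier: knows only the application server, never the database."""
    reply = server.handle_balance_request(account_id)
    if reply["status"] == "ok":
        return f"Balance: {reply['balance']}"
    return f"Error: {reply['message']}"


server = ApplicationServer(DatabaseTier())
print(client(server, "A"))
print(client(server, "Z"))
```

The client function never issues SQL and holds no database connection, which mirrors the point that the end user has no knowledge of the database behind the application server.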
Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the ability to modify the schema at one level of the
database system without altering the schema at the next higher level.
o Logical data independence refers to the ability to change the conceptual schema
without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual
view.
o If we make any changes to the conceptual view of the data, the user view of the data
is not affected.
o Logical data independence occurs at the user interface level.
o Physical data independence can be defined as the capacity to change the internal
schema without having to change the conceptual schema.
o If we make any changes to the storage size of the database system server, the
conceptual structure of the database is not affected.
o Physical data independence is used to separate the conceptual level from the internal
level.
o Physical data independence occurs at the logical interface level.
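Both kinds of data independence can be observed in a small experiment. The sketch below uses Python's built-in sqlite3 module and treats a view as the external schema, the table as the conceptual schema, and an index as part of the internal schema; this mapping and all names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Mysuru"), (2, "Ravi", "Hubli")],
)
# External schema: the user's view of the data.
conn.execute("CREATE VIEW student_names AS SELECT roll_no, name FROM student")


def user_view():
    """The query an application runs against its external view."""
    return conn.execute(
        "SELECT roll_no, name FROM student_names ORDER BY roll_no"
    ).fetchall()


before = user_view()

# Logical data independence: the conceptual schema gains a column,
# but the external view and the application query stay the same.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_logical_change = user_view()

# Physical data independence: a new index changes how data is accessed,
# but neither the table definition nor the view changes.
conn.execute("CREATE INDEX idx_student_name ON student (name)")
after_physical_change = user_view()

print(before == after_logical_change == after_physical_change)
```

The application query returns the same rows before and after each change, which is what the two forms of independence promise.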
Sophisticated users:
o Sophisticated users interact with the system without writing programs. Instead,
they form their requests in a database query language.
o They submit each such query to a query processor, whose function is to break
down DML statements into instructions that the storage manager understands.
Specialized users :
o Specialized users are sophisticated users who write specialized database
applications that do not fit into the traditional data-processing framework.
o Among these applications are computer-aided design systems, knowledge base
and expert systems, systems that store data with complex data types (for
example, graphics data and audio data), and environment-modeling systems.
Naïve users :
o Naive users are unsophisticated users who interact with the system by
invoking one of the application programs that have been written previously.
o For example, a bank teller who needs to transfer $50 from account A to
account B invokes a program called transfer. This program asks the teller for
the amount of money to be transferred, the account from which the money is
to be transferred, and the account to which the money is to be transferred.
Database Administrator:
Coordinates all the activities of the database system. The database administrator has a
good understanding of the enterprise’s information resources and needs.
Database administrator's duties include:
o Schema definition: The DBA creates the original database schema by
executing a set of data definition statements in the DDL.
o Storage structure and access method definition.
o Schema and physical organization modification: The DBA carries out
changes to the schema and physical organization to reflect the changing needs
of the organization, or to alter the physical organization to improve
performance.
o Granting user authority to access the database: By granting different types
of authorization, the database administrator can regulate which parts of the
database various users can access.
o Specifying integrity constraints.
o Monitoring performance and responding to changes in requirements.
Query Processor:
The query processor accepts a query from the user and answers it by accessing the database.
Parts of the query processor:
DDL interpreter
This interprets DDL statements and records the definitions in the data dictionary.
DML compiler
a. This translates DML statements in a query language into low-level instructions
that the query evaluation engine understands.
b. A query can usually be translated into any of a number of alternative evaluation
plans that give the same result. The DML compiler also performs query
optimization: it selects the best plan from among the alternatives.
Query evaluation engine
This engine executes the low-level instructions generated by the DML compiler.
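The choice between alternative evaluation plans can be observed in SQLite, which reports its chosen plan through the EXPLAIN QUERY PLAN statement. The sketch below uses Python's built-in sqlite3 module with a made-up student table; the exact wording of the plan text differs between SQLite versions, so it is printed rather than assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Mysuru"), (2, "Ravi", "Hubli"), (3, "Meena", "Mysuru")],
)


def plan(sql):
    """Return the plan description SQLite chose for the given query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


query = "SELECT roll_no FROM student WHERE name = 'Asha'"

# Without an index on name, the only available plan is to scan the table.
plan_without_index = plan(query)

# Once an index exists, the optimizer has an alternative plan and picks it.
conn.execute("CREATE INDEX idx_student_name ON student (name)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

The query text is identical in both cases; only the plan selected for it changes, and the result of the query stays the same.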
Storage Manager/Storage Management:
A storage manager is a program module that acts as the interface between the data
stored in the database and the application programs and queries submitted to the system.
Thus, the storage manager is responsible for storing, retrieving, and updating data in
the database.
The storage manager components include:
o Authorization and integrity manager: Checks that integrity constraints are
satisfied and that users are authorized to access data.
o Transaction manager: Ensures that the database remains in a consistent state
despite system failures.
o File manager: Manages the allocation of space on disk storage and the data
structures used to represent information stored on disk.
o Buffer manager: It is responsible for retrieving data from disk storage into
main memory. It enables the database to handle data sizes that are much larger
than the size of main memory.
The storage manager implements the following data structures:
o Data files: Store the database itself.
o Data dictionary: Stores metadata about the structure of the database.
o Indices: Provide fast access to data items.
o Database languages can be used to read, store and update the data in the database.
o DDL stands for Data Definition Language. It is used to define database structure or
pattern.
o It is used to create schema, tables, indexes, constraints, etc. in the database.
o Using the DDL statements, you can create the skeleton of the database.
o Data definition language is used to store the information of metadata like the number
of tables and schemas, their names, indexes, columns in each table, constraints, etc.
DDL commands change the database schema, which is why they come under the data
definition language.
DML stands for Data Manipulation Language. It is used for accessing and manipulating data
in a database. It handles user requests.
o DCL stands for Data Control Language. It is used to control access to the data stored
in the database by granting and revoking privileges.
o The DCL execution is transactional. It also has rollback parameters.
(But in Oracle database, the execution of data control language does not have the
feature of rolling back.)
There are the following operations which have the authorization of Revoke:
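TCL stands for Transaction Control Language. It is used to make permanent or to undo the
changes made by DML statements, for example with COMMIT and ROLLBACK. DML statements
can be grouped into a logical transaction.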
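The effect of transaction control can be seen in a short sketch using Python's built-in sqlite3 module. The account table and the amounts are made up; the connection is opened with isolation_level=None so that the explicit BEGIN, COMMIT, and ROLLBACK statements are the only transaction control in play.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / COMMIT / ROLLBACK statements below fully control each transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 20)")


def balances():
    return dict(conn.execute("SELECT id, balance FROM account"))


# Transaction 1: a DML change that is undone with ROLLBACK.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 50 WHERE id = 'A'")
conn.execute("ROLLBACK")
after_rollback = balances()

# Transaction 2: two DML changes grouped together and made permanent with COMMIT.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 50 WHERE id = 'A'")
conn.execute("UPDATE account SET balance = balance + 50 WHERE id = 'B'")
conn.execute("COMMIT")
after_commit = balances()

print(after_rollback, after_commit)
```

After the rollback the balances are unchanged; after the commit the transfer of 50 from A to B is visible as a single unit.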
Classification of DBMS
1. Relational DBMS
Relational DBMS is the most widely used type of DBMS. It uses a table-based structure to
store and organize data, with each table representing a specific entity. Data is stored in
rows and columns, with each row representing a unique record and each column
representing a specific attribute of that record. The relational DBMS uses SQL
(Structured Query Language) to manipulate and retrieve data from the database.
2. Object-Oriented DBMS
Object-Oriented DBMS stores data in the form of objects, which are instances of classes. It
allows users to store complex data structures such as images, videos, and audio files and
supports object-oriented programming languages like Java and C++. Object-oriented DBMS
is commonly used in computer-aided design (CAD) and multimedia systems applications.
3. Document-Oriented DBMS
4. Key-Value Store DBMS
Key-Value Store DBMS stores data as key-value pairs, where each key is associated with a
corresponding value. Key-value store DBMS is typically used when high availability and
scalability are critical. It is a simple and scalable DBMS type commonly used in caching and
session management applications.
5. Columnar DBMS
Columnar DBMS stores data in a column-oriented format, where data is organized and stored
in columns rather than rows. It provides faster query performance and efficient data
compression, making it ideal for data warehousing and analytics applications. Columnar
DBMS is commonly used in finance, healthcare, and retail industries.
6. Graph DBMS
Graph DBMS stores data in the form of nodes and edges, where each node represents an
entity and each edge represents a relationship between two entities. It allows users to
store and retrieve complex data structures such as social networks and recommendation
systems and supports graph query languages like Cypher and Gremlin.
7. NoSQL DBMS
NoSQL stands for "not only SQL", and a NoSQL DBMS is a non-relational type of DBMS. It
provides flexible data models that allow users to store and retrieve unstructured data and is
designed to handle large amounts of data and high transaction rates. NoSQL DBMS is
commonly used in applications such as big data, social networking, and e-commerce.
In addition to the types of DBMSs mentioned above, hybrid DBMSs combine two or more
types of DBMSs, taking advantage of their strengths and overcoming their limitations. For
example, a hybrid DBMS could combine a relational DBMS with a graph DBMS to store and
retrieve structured and unstructured data.
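The difference between the row-oriented layout of a relational table and the column-oriented layout described under Columnar DBMS can be illustrated with plain Python data structures. This is only a toy picture of the two layouts with made-up sales records, not how any particular product stores data.

```python
# Hypothetical sales records.
records = [
    {"id": 1, "region": "South", "amount": 120},
    {"id": 2, "region": "North", "amount": 80},
    {"id": 3, "region": "South", "amount": 200},
]

# Row-oriented layout: the values of one record are kept together.
row_store = [tuple(record.values()) for record in records]

# Column-oriented layout: the values of one attribute are kept together.
column_store = {key: [record[key] for record in records] for key in records[0]}

# An analytic query such as "total amount" touches a single column
# in the column-oriented layout instead of every full record.
total_amount = sum(column_store["amount"])

print(row_store)
print(column_store)
print(total_amount)
```

Keeping similar values next to each other is also what makes column-oriented data compress well.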
For example, suppose we design a school database. In this database, the student will be an
entity with attributes like address, name, id, age, etc. The address can be another entity with
attributes like city, street name, pin code, etc and there will be a relationship between them.
ER diagrams are used to represent the E-R model of a database, which makes
them easy to convert into relations (tables).
ER diagrams model real-world objects, which makes them very useful.
ER diagrams require no technical knowledge and no hardware support.
These diagrams are very easy to understand and easy to create, even for a naive
user.
They give a standard solution for visualizing the data logically.
Component of ER Diagram
Entity
An Entity may be an object with a physical existence – a particular person, car, house, or
employee – or it may be an object with a conceptual existence – a company, a job, or a
university course.
Entity Set
An entity is an object of an entity type, and the set of all entities of that type is called an
entity set. For example, E1 is an entity having entity type Student, and the set of all
students is called an entity set. In an ER diagram, an entity type is represented as:
Prashanth C Patel Page 26
For example, a company may store information about the dependents (parents, children) of
an employee. But the dependents cannot exist without the employee. So Dependent is a
weak entity type, and Employee is the identifying entity type for Dependent, which means
it is a strong entity type.
A weak entity type is represented by a double rectangle. The participation of a weak entity
type is always total. The relationship between the weak entity type and its identifying
strong entity type is called an identifying relationship, and it is represented by a double
diamond.
1. Strong Entity Type: It is an entity that has its own existence and is independent.
The entity relationship diagram represents a strong entity type with the help of a single
rectangle. Below is the ERD of the strong entity type:
In the above example, the "Customer" is the entity type with attributes such as ID, Name,
Gender, and Phone Number. Customer is a strong entity type as it has a unique ID for each
customer.
Attributes
Attributes are the properties that define the entity type. For example, Roll_No, Name,
DOB, Age, Address, and Mobile_No are the attributes that define entity type Student. In
ER diagram, the attribute is represented by an oval.
1. Key Attribute
The attribute that uniquely identifies each entity in the entity set is called the key
attribute. For example, Roll_No will be unique for each student. In an ER diagram, the key
attribute is represented by an oval with the attribute name underlined.
2. Composite Attribute
An attribute composed of several other attributes is called a composite attribute. For
example, the Address attribute of the Student entity type consists of Street, City, State, and
Country. In an ER diagram, the composite attribute is represented by an oval connected to
the ovals of its component attributes.
3. Multivalued Attribute
An attribute that can hold more than one value for a given entity is called a multivalued
attribute. For example, Phone_No (a given student can have more than one). In an ER
diagram, a multivalued attribute is represented by a double oval.
4. Derived Attribute
An attribute that can be derived from other attributes of the entity type is known as a
derived attribute. e.g.; Age (can be derived from DOB). In ER diagram, the derived
attribute is represented by a dashed oval.
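The four kinds of attributes can also be pictured in code. The sketch below models the Student entity type as Python dataclasses; the class layout and the sample values are assumptions made for illustration, with the attribute names borrowed from the examples above.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    """Composite attribute: made up of several simpler attributes."""
    street: str
    city: str
    state: str
    country: str


@dataclass
class Student:
    roll_no: int                      # key attribute: unique for each student
    name: str
    dob: date
    address: Address                  # composite attribute
    phone_nos: list[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from dob instead of being stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)


student = Student(
    roll_no=1,
    name="Asha",
    dob=date(2004, 6, 15),
    address=Address("MG Road", "Mysuru", "Karnataka", "India"),
    phone_nos=["555-0100", "555-0101"],
)
print(student.age(date(2024, 6, 14)), student.age(date(2024, 6, 15)))
```

Age is not stored anywhere; it is recomputed from DOB whenever it is needed, which is why it never goes out of date.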
The Complete Entity Type Student with its Attributes can be represented as:
What is a Domain
Domain refers to the set of values we can assign to an attribute. Each attribute has a domain.
For example, a Name should be a string and cannot be a numeric value, while Age has to be a
positive number.
Definition
An attribute is a descriptive property owned by each entity of an entity set, while a
domain is the set of values allowed for an attribute. This is the main difference between an
attribute and a domain.
Usage
Attributes help to describe an entity, while domains define the range of values that suit a
specific attribute.
Example
Name and age are two examples of attributes. The requirement that a name be alphabetic
and that an age be positive describes their domains.
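In SQL, a domain can be approximated with a data type plus a CHECK constraint. The sketch below uses Python's built-in sqlite3 module; the person table and the particular rules (no digits in the name, age greater than zero) are one possible reading of the example above, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person (
        -- Domain of name: non-empty text that contains no digits.
        name TEXT NOT NULL CHECK (name <> '' AND name NOT GLOB '*[0-9]*'),
        -- Domain of age: positive whole numbers.
        age  INTEGER NOT NULL CHECK (age > 0)
    )
    """
)


def try_insert(name, age):
    """Insert a row and report whether its values fall inside the domains."""
    try:
        conn.execute("INSERT INTO person VALUES (?, ?)", (name, age))
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


valid_row = try_insert("Asha", 20)
negative_age = try_insert("Ravi", -5)
numeric_name = try_insert("R2D2", 30)
print(valid_row, negative_age, numeric_name)
```

Only the first row is stored; the other two carry values outside the domain of Age and Name respectively.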
A set of relationships of the same type is known as a relationship set. The following
relationship set depicts S1 as enrolled in C2, S2 as enrolled in C1, and S3 as enrolled in
C3.
The number of different entity sets participating in a relationship set is called the degree of
a relationship set.
1. Unary Relationship: When only ONE entity set participates in a relationship, the
relationship is called a unary relationship. For example, an employee may manage other
employees, so the Employee entity set takes part on both sides of the relationship.
2. Binary Relationship: When TWO entity sets participate in a relationship, the
relationship is called a binary relationship. For example, a Student is enrolled in a
Course.
3. n-ary Relationship: When n entity sets participate in a relationship, the
relationship is called an n-ary relationship.
Cardinality
The number of times an entity of an entity set participates in a relationship set is known
as cardinality. Cardinality can be of different types:
Types of Cardinality: There are 4 types of cardinality.
1. One-to-One: When each entity in each entity set can take part only once in the
relationship, the cardinality is one-to-one.
2. One-to-Many: When an entity in one entity set can be related to more than one entity in
the other entity set, while each entity in the other set is related to at most one entity in
the first, the cardinality is one-to-many. Such a relationship can be represented using 2 tables.
Using sets, one-to-many cardinality can be represented as:
3. Many-to-One: When entities in one entity set can take part only once in the
relationship set and entities in the other entity set can take part more than once in the
relationship set, the cardinality is many-to-one. Let us assume that a student can take only
one course but one course can be taken by many students. So the cardinality will be n to
1. It means that for one course there can be n students but for one student, there will be
only one course.
4. Many-to-Many: When entities in all entity sets can take part more than once in the
relationship, the cardinality is many-to-many. Let us assume that a student can take more
than one course and one course can be taken by many students. So the relationship will
be many-to-many.
There are numbers (represented by M and N) written above the lines which connect
relationships and entities. These are called cardinality ratios. They represent the
maximum number of entities that can be associated with each other through the
relationship R.
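The four cardinality types can be checked mechanically on a concrete relationship set. The sketch below is an illustration only: the function name `cardinality` is hypothetical, and it reports which type a given set of (left, right) pairs is consistent with, whereas in a real design cardinality is a rule stated on the schema rather than something read off the data.

```python
from collections import Counter

def cardinality(pairs):
    """Classify a binary relationship set given as (left, right) pairs."""
    pairs = set(pairs)  # a relationship set holds no duplicate pairs
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    left_once = all(n == 1 for n in left_counts.values())
    right_once = all(n == 1 for n in right_counts.values())
    if left_once and right_once:
        return "one-to-one"
    if left_once:
        return "many-to-one"   # many left entities share one right entity
    if right_once:
        return "one-to-many"   # one left entity relates to many right entities
    return "many-to-many"

# Student-to-course enrolments written as (student, course) pairs.
print(cardinality({("S1", "C2"), ("S2", "C1"), ("S3", "C3")}))  # one-to-one
print(cardinality({("S1", "C1"), ("S2", "C1")}))                # many-to-one
print(cardinality({("S1", "C1"), ("S1", "C2")}))                # one-to-many
print(cardinality({("S1", "C1"), ("S1", "C2"), ("S2", "C1")}))  # many-to-many
```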
E-R Diagram
An Entity Relationship (ER) Diagram is a type of flowchart that illustrates how “entities”
such as people, objects or concepts relate to each other within a system.
ER diagrams are related to data structure diagrams (DSDs), which focus on the relationships
of elements within entities instead of relationships between entities themselves. ER diagrams
also are often used in combination with data flow diagrams (DFDs), which map out the flow
of information for processes or systems.
ER Model is used to model the logical view of the system from a data perspective which
consists of these symbols:
Rectangles: Rectangles represent Entities in ER Model.
Ellipses: Ellipses represent Attributes in ER Model.
Diamond: Diamonds represent Relationships among Entities.
Lines: Lines link attributes to entities and entity sets to
relationship types.
Double Ellipse: Double Ellipses represent Multi-Valued Attributes.
Double Rectangle: Double Rectangle represents a Weak Entity.
Notation of ER diagram
A database can be represented using these notations. In an ER diagram, several notations are
used to express cardinality. These notations are as follows:
What is UML?
Unified Modeling Language (UML) is a general-purpose modeling language. The main aim
of UML is to define a standard way to visualize the way a system has been designed.
Relational Model.
What is the Relational Model?
The relational model represents how data is stored in Relational Databases. A relational
database consists of a collection of tables, and each relation in the model can be represented
as a table with columns and rows.
Advantages of the Relational Model
Flexible: The relational model is more flexible than other data models.
Secure: The relational model is generally considered more secure than other data models.
Data Accuracy: Data is kept more accurate in the relational data model.
Data Integrity: The integrity of the data is maintained in the relational model.
Operations can be Applied Easily: Operations are easier to perform in the
relational model.
Relational Constraints
These are the restrictions or sets of rules imposed on the database contents. They validate the
quality of the database and check operations such as data insertion and updating, which have
to be performed without affecting the integrity of the data. They
protect against threats or damage to the database. Constraints on a relational
database are mainly of 4 types:
1. Domain constraints
2. Key constraints or Uniqueness Constraints
3. Entity Integrity constraints
4. Referential integrity constraints
Domain Constraints
In a database table, domain constraints are guidelines that specify the acceptable values for a
certain property or field. These restrictions guarantee data consistency and aid in preventing
the entry of inaccurate or inconsistent data into the database. The following are some
instances of domain restrictions in a Relational Database Model −
Data type constraints − These limitations define the kinds of data that can be
kept in a column. A column created as VARCHAR can take string values, but
a column specified as INTEGER can only accept integer values.
Length Constraints − These limitations define the largest amount of data that
may be put in a column. For instance, a column with the definition
VARCHAR(10) may only take strings that are up to 10 characters long.
Range constraints − The allowed range of values for a column is specified by
range restrictions. A column designated as DECIMAL(5,2), for example, may
only take decimal values up to 5 digits long, including 2 decimal places.
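Domain constraints such as these can be tried out with Python's built-in sqlite3 module. The sketch below is illustrative only: the `student` table and the helper `try_insert` are hypothetical, and because SQLite does not enforce declared lengths such as VARCHAR(10), the length and range rules are written as explicit CHECK constraints. Other database systems enforce declared types and lengths directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL CHECK (length(name) <= 10),                -- length constraint
        age     INTEGER CHECK (typeof(age) = 'integer' AND age > 0)      -- type and range
    )
""")
conn.execute("INSERT INTO student VALUES (1, 'Ram', 18)")  # inside every domain

def try_insert(row):
    """Return 'accepted' or 'rejected' depending on the domain checks."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert((2, "Mohan", -5)),             # age outside the allowed range
    try_insert((3, "A very long name", 20)),  # name longer than 10 characters
    try_insert((4, "Geeta", "abc")),          # age is not an integer
]
print(results)  # ['rejected', 'rejected', 'rejected']
```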
Key Constraints
Key constraints are rules that a Relational Database Model uses to ensure data
accuracy and consistency in a database. They define which columns identify the rows of a table
and how values in one table relate to values in other tables, making sure that the data remains correct.
In the Relational Database Model, there are several kinds of key constraint, including −
Primary Key Constraint − A primary key constraint defines a unique
identifier for each record in a table. It guarantees that each row
contains a distinct value, or combination of values, that cannot be null and that
serves as its method of identification.
Foreign Key Constraint − A foreign key constraint is a reference to the primary key of
another table. It ensures that the values of a column or set of columns
in one table correspond to the primary key column(s) in another table.
Unique Constraint − A unique constraint ensures that no two
rows have the same value in a column or collection of columns.
Entity Integrity Constraints
o The entity integrity constraint states that a primary key value can't be null.
o This is because the primary key value is used to identify individual rows in a relation,
and if the primary key has a null value, then we can't identify those rows.
o A table can contain null values in fields other than the primary key.
NULL Values
A NULL value indicates that a particular field in the table holds no data value.
Importance of NULL Value
It is important to understand that a NULL value differs from a zero value.
A NULL value is used to represent a missing value, but it usually has one of
three different interpretations:
The value unknown (value exists but is not known)
Value not available (exists but is purposely withheld)
Attribute not applicable (undefined for this tuple)
Principles of NULL values
Setting a NULL value is appropriate when the actual value is unknown, or when
a value is not meaningful.
A NULL value is not equivalent to a value of ZERO if the data type is a number
and is not equivalent to spaces if the data type is a character.
A NULL value can be inserted into columns of any data type.
A NULL value makes most expressions (such as arithmetic and comparisons) evaluate to NULL.
If a column has a NULL value, UNIQUE, FOREIGN KEY, and
CHECK constraints are generally not applied to it by SQL.
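These principles can be observed with Python's built-in sqlite3 module, where SQL NULL appears as Python `None`. The table `t` below is hypothetical, and the behaviour shown is SQLite's; some other systems treat NULLs in UNIQUE columns differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Arithmetic and comparisons involving NULL yield NULL; only IS NULL tests for it.
cur.execute("SELECT NULL + 1, NULL = NULL, NULL = 0, NULL IS NULL")
nulls = cur.fetchone()
print(nulls)  # (None, None, None, 1)

# A UNIQUE column still accepts several NULLs, because NULLs are not
# considered equal to each other.
cur.execute("CREATE TABLE t (phone TEXT UNIQUE)")
cur.executemany("INSERT INTO t VALUES (?)", [(None,), (None,), ("123",)])

# COUNT(*) counts rows; COUNT(phone) skips rows where phone is NULL.
cur.execute("SELECT COUNT(*), COUNT(phone) FROM t")
counts = cur.fetchone()
print(counts)  # (3, 1)
```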
The primary key is a unique and non-null key that uniquely identifies every record in a table
or relation. Each table needs a unique identifier for every row, and the primary
key plays a vital role in identifying rows in the table uniquely. The primary key column can't
store duplicate values. It is a minimal super key chosen from the candidate keys, and we
cannot specify more than one primary key in any relation.
For example, we have a table named customer with attributes such as ID, Name, and City.
Only the ID column can never contain duplicate and NULL values because each customer
has a unique identification number. This feature helps to identify each record in the database
uniquely. Hence, we can make the ID attribute a primary key.
The foreign key is a group of one or more columns in a table that identifies a
record in some other table in order to maintain referential integrity. It is also known as
the referencing key that establishes a relationship between two different tables in a
database. A foreign key value must match a value of the referenced column, most often the
primary key column, in the other table. A foreign
key is beneficial in relational database normalization, especially when we need to access
records from other tables.
The following points explain the differences between primary and foreign keys:
o A primary key constraint in the relational database acts as a unique identifier for every
row in the table. In contrast, a foreign key constraint establishes a relationship
between two tables (or between rows of the same table) by referring to a row of the
referenced table.
o The primary key column does not store NULL values, whereas a foreign key column can
contain NULL values in more than one row.
o Each table in a relational database can't define more than one primary key, while we
can specify multiple foreign keys in a table.
o By default, we can't remove a parent table's primary key value that is referenced by a
foreign key column in the child table. In contrast, we can delete rows from the child table
even though they refer to the parent table's primary key.
o A primary key is a unique and non-null constraint, so no two rows can have identical
values for a primary key attribute, whereas foreign key fields can store duplicate
values.
o Values inserted into the primary key column only need to be unique and non-null. In
contrast, a value inserted into a foreign key column must already be present in the
referenced primary key column.
o In some database systems, such as SQL Server, we can define a primary key constraint on
temporary tables, whereas foreign key constraints are not enforced on temporary tables.
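Several of these differences can be seen in a small experiment with Python's built-in sqlite3 module. This is a sketch under assumptions: the `customer` table follows the example in the notes, while the `orders` table, the sample rows, and the helper `attempt` are hypothetical. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # foreign key checks are off by default in SQLite
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("""
    CREATE TABLE orders (
        order_no    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id)   -- foreign key to the parent table
    )
""")
conn.execute("INSERT INTO customer VALUES (1, 'Ram', 'Delhi')")
conn.execute("INSERT INTO orders VALUES (10, 1)")
conn.execute("INSERT INTO orders VALUES (11, 1)")     # duplicate foreign key values are allowed
conn.execute("INSERT INTO orders VALUES (12, NULL)")  # a NULL foreign key is allowed

def attempt(sql):
    """Run one statement and report whether a key constraint blocked it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

outcomes = [
    attempt("INSERT INTO customer VALUES (1, 'Mohan', 'Pune')"),  # duplicate primary key
    attempt("INSERT INTO orders VALUES (13, 99)"),                # no customer with id 99
    attempt("DELETE FROM customer WHERE id = 1"),                 # still referenced by orders
    attempt("DELETE FROM orders WHERE customer_id = 1"),          # child rows can be deleted
    attempt("DELETE FROM customer WHERE id = 1"),                 # no longer referenced
]
print(outcomes)  # ['rejected', 'rejected', 'rejected', 'ok', 'ok']
```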
OR
A relation in the relational model can be represented as a table with columns and rows. Each
row is known as a tuple. Each column of the table has a name, or attribute.
Table Student
ROLL_NO  NAME    ADDRESS     PHONE  AGE
4        SURESH  DAVANAGERE         18
Important Terminologies
Attribute: Attributes are the properties that define an entity.
e.g.; ROLL_NO, NAME, ADDRESS
Relation Schema: A relation schema defines the structure of the relation and
represents the name of the relation with its attributes. e.g.; STUDENT
(ROLL_NO, NAME, ADDRESS, PHONE, and AGE) is the relation schema for
STUDENT. If a schema has more than 1 relation, it is called Relational Schema.
Tuple: Each row in the relation is known as a tuple. The above relation contains
4 tuples, one of which (ROLL_NO 4) is shown in the table.
Relation Instance: The set of tuples of a relation at a particular instance of time
is called a relation instance. Table 1 shows the relation instance of STUDENT at
a particular time. It can change whenever there is an insertion, deletion, or
update in the database.
Degree: The number of attributes in the relation is known as the degree of the
relation. The STUDENT relation defined above has degree 5.
Cardinality: The number of tuples in a relation is known as cardinality.
The STUDENT relation defined above has cardinality 4.
Column: The column represents the set of values for a particular attribute. The
column ROLL_NO is extracted from the relation STUDENT.
NULL Values: The value which is not known or unavailable is called a NULL
value. It is represented by blank space. e.g.; PHONE of STUDENT having
ROLL_NO 4 is NULL.
Relation Key: These are basically the keys that are used to identify the rows
uniquely or also help in identifying tables. These are of the following types.
Primary Key
Candidate Key
Super Key
Foreign Key
Alternate Key
Composite Key
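The terminology above can be made concrete by holding a relation in plain Python structures. This is only a sketch: the schema and the ROLL_NO 4 row come from the notes, while the other two rows are hypothetical, so the cardinality printed here is 3 rather than the 4 tuples of the original table.

```python
# Attribute names of the STUDENT relation schema, taken from the notes.
schema = ("ROLL_NO", "NAME", "ADDRESS", "PHONE", "AGE")

# A relation instance. The ROLL_NO 4 row is from the notes (its PHONE is NULL,
# written as None); the other two rows are hypothetical sample data.
instance = [
    (1, "RAM", "DELHI", "9000000001", 18),
    (2, "RAMESH", "GURGAON", "9000000002", 19),
    (4, "SURESH", "DAVANAGERE", None, 18),
]

degree = len(schema)         # number of attributes
cardinality = len(instance)  # number of tuples

# A column is the list of values of one attribute across all tuples.
roll_no_column = [row[schema.index("ROLL_NO")] for row in instance]

# ROLL_NO values of the tuples whose PHONE is NULL.
null_phones = [row[0] for row in instance if row[schema.index("PHONE")] is None]

print(degree, cardinality)  # 5 3
print(roll_no_column)       # [1, 2, 4]
print(null_phones)          # [4]
```

Inserting or deleting a tuple changes the instance and its cardinality, but the schema and the degree stay the same.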
Advantages of Using the Relational Model
Simplicity: A Relational data model in DBMS is simpler than the hierarchical and
network model.
Structural Independence: The relational database is only concerned with data and
not with a structure. This can improve the performance of the model.
Easy to use: The Relational model in DBMS is easy to use, as tables consisting of rows and
columns are quite natural and simple to understand.
Query capability: It makes it possible for a high-level query language like SQL to
avoid complex database navigation.
Data independence: The structure of a Relational database can be changed without
having to change any application.
Scalable: A database can be enlarged, in both the number of records (rows) and the
number of fields, to enhance its usability.
Disadvantages of Using the Relational Model
A few relational databases have limits on field lengths which can't be exceeded.
Relational databases can sometimes become complex as the amount of data grows,
and the relations between pieces of data become more complicated.
Complex relational database systems may lead to isolated databases where the
information cannot be shared from one system to another.
1. Domain constraints
o A domain constraint defines the valid set of values for an
attribute.
o The data type of a domain may be string, character, integer, time, date, currency, etc.
The value of the attribute must belong to the corresponding domain.
2. Key constraints
o A key is an attribute, or set of attributes, used to identify an entity within its entity
set uniquely.
o An entity set can have multiple keys, out of which one key will be the primary
key. A primary key must contain unique values and cannot contain a null value in the relational table.
A primary key is used to ensure that data in a specific column is unique. The column cannot
have NULL values. It is either an existing table column or a column that is specifically
generated by the database according to a defined sequence.
Example: STUD_NO, as well as STUD_PHONE both, are candidate keys for relation
STUDENT but STUD_NO can be chosen as the primary key (only one out of many
candidate keys).
Table STUDENT
STUD_NO  STUD_NAME  STUD_PHONE  STUD_STATE  STUD_COUNT  STUD_AGE
A foreign key is a column or group of columns in a relational database table that provides a
link between data in two tables. It is a column (or columns) that references a column (most
often the primary key) of another table.
Example: STUD_NO in STUDENT_COURSE is a foreign key to STUD_NO in
STUDENT relation.
Table STUDENT_COURSE
STUD_NO COURSE_NO COURSE_NAME
1 C1 DBMS
2 C2 Computer Networks
1 C2 Computer Networks
Integrity Constraints
o Integrity constraints are a set of rules used to maintain the quality of information.
o Integrity constraints ensure that data insertion, updating, and other processes are
performed in such a way that data integrity is not affected.
o Thus, integrity constraints are used to guard against accidental damage to the database.
Null values
If a field in a table is optional, it is possible to insert a new record or update a record without
adding a value to this field. Then, the field will be saved with a NULL value.
or
Null values are special values in a DBMS that represent unknown values and are
always different from a zero value. SQL supports these values and provides ways to deal with them.
Relational Algebra
Relational algebra is a procedural query language. It gives a step by step process to obtain the
result of the query. It uses operators to perform queries.
SELECT (symbol: σ)
PROJECT (symbol: π)
RENAME (symbol: ρ)
UNION (∪)
INTERSECTION (∩)
DIFFERENCE (-)
CARTESIAN PRODUCT ( x )
JOIN
DIVISION
1. Select Operation:
o The select operation selects tuples that satisfy a given predicate.
o It is denoted by sigma (σ).
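o Notation: σ p(r)
Where:
σ denotes the selection, p is the selection predicate (a condition on the attributes), and r
is the relation.
Relation R
A  B  C
1  2  4
2  2  3
3  2  3
4  3  4
For the above relation, σ(C>3)R will select the tuples in which C is more than 3:
A  B  C
1  2  4
4  3  4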
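As a rough illustration, the select operation can be mimicked in Python by filtering a list of rows with a predicate. The function name `select` and the dictionary representation of tuples are choices made for this sketch, not standard notation.

```python
# Relation R from the example, one dictionary per tuple.
R = [
    {"A": 1, "B": 2, "C": 4},
    {"A": 2, "B": 2, "C": 3},
    {"A": 3, "B": 2, "C": 3},
    {"A": 4, "B": 3, "C": 4},
]

def select(relation, predicate):
    """Keep only the tuples of `relation` for which `predicate` is true."""
    return [row for row in relation if predicate(row)]

# Corresponds to sigma(C>3)R.
result = select(R, lambda row: row["C"] > 3)
print(result)  # [{'A': 1, 'B': 2, 'C': 4}, {'A': 4, 'B': 3, 'C': 4}]
```

Selection never changes the columns of a relation; it only reduces the number of rows.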
2. Project Operation:
Projection (π): It is used to project the required columns from a relation. Duplicate rows in
the result are eliminated.
o Example: Consider relation R below. Suppose we want columns B and C from relation R.
Relation R
A  B  C
1  2  4
2  2  3
3  2  3
4  3  4
o π(B,C)R will show the following columns:
B  C
2  4
2  3
3  4
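The project operation can be sketched in Python as picking the requested columns from every row and dropping repeated rows. The function name `project` is hypothetical; the data is relation R from the example.

```python
# Relation R from the example, one dictionary per tuple.
R = [
    {"A": 1, "B": 2, "C": 4},
    {"A": 2, "B": 2, "C": 3},
    {"A": 3, "B": 2, "C": 3},
    {"A": 4, "B": 3, "C": 4},
]

def project(relation, attributes):
    """Return the requested columns of each tuple, without duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[name] for name in attributes)
        if values not in seen:  # a relation is a set, so repeats are dropped
            seen.add(values)
            result.append(values)
    return result

# Corresponds to pi(B,C)R; the row (2, 3) occurs twice in R but once in the result.
bc = project(R, ["B", "C"])
print(bc)  # [(2, 4), (2, 3), (3, 4)]
```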
3. Union Operation:
o Suppose there are two relations R and S. The union operation contains all the tuples that
are either in R or S or both in R & S.
o It eliminates the duplicate tuples. It is denoted by ∪.
o Notation: R ∪ S
o Consider the following tables of students having different optional subjects in their course.
R
Student_Name  Roll_Number
Ram           01
Mohan         02
Vivek         13
Geeta         17
S
Student_Name  Roll_Number
Vivek         13
Geeta         17
Ravi          21
Rohan         25
o π(Student_Name)R ∪ π(Student_Name)S
Student_Name
Ram
Mohan
Vivek
Geeta
Ravi
Rohan
4. Set Intersection:
o Suppose there are two relations R and S. The set intersection operation contains all
tuples that are in both R & S.
o It is denoted by intersection ∩.
π(Student_Name)R ∩ π(Student_Name)S
Student_Name
Vivek
Geeta
5. Set Difference:
o Suppose there are two relations R and S. The set difference operation contains all
tuples that are in R but not in S.
o It is denoted by minus (-).
o Notation: R - S
π(Student_Name)R - π(Student_Name)S
Student_Name
Ram
Mohan
Note: As with union and intersection, the only constraint on the Set Difference between two
relations is that both relations must have the same set of attributes.
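Union, intersection, and difference map directly onto Python's set operators, which makes the three examples easy to verify. This is a sketch using the R and S student tables from the examples, with each tuple written as a (name, roll number) pair; `sorted` is used only to print the names in a fixed order.

```python
# Relations R and S from the examples, as sets of (Student_Name, Roll_Number).
R = {("Ram", "01"), ("Mohan", "02"), ("Vivek", "13"), ("Geeta", "17")}
S = {("Vivek", "13"), ("Geeta", "17"), ("Ravi", "21"), ("Rohan", "25")}

# Projection on Student_Name; building a set removes duplicates automatically.
names_R = {name for name, _ in R}
names_S = {name for name, _ in S}

union = sorted(names_R | names_S)         # in R or S or both
intersection = sorted(names_R & names_S)  # in both R and S
difference = sorted(names_R - names_S)    # in R but not in S

print(union)         # ['Geeta', 'Mohan', 'Ram', 'Ravi', 'Rohan', 'Vivek']
print(intersection)  # ['Geeta', 'Vivek']
print(difference)    # ['Mohan', 'Ram']
```

Unlike union and intersection, difference is not symmetric: `names_S - names_R` would give Ravi and Rohan instead.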
6. Cartesian product
o The Cartesian product is used to combine each row in one table with each row in the
other table. It is also known as a cross product.
o It is denoted by X.
A
Name  Age  Sex
Ram   14   M
Sona  15   F
Kim   20   M
B
ID  Course
1   DS
2   DBMS
A X B
Name Age Sex ID Course
Ram 14 M 1 DS
Ram 14 M 2 DBMS
Sona 15 F 1 DS
Sona 15 F 2 DBMS
Kim 20 M 1 DS
Kim 20 M 2 DBMS
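The Cartesian product can be reproduced with `itertools.product` from Python's standard library. This sketch uses the A and B tables from the example, with each row written as a tuple; the variable names are chosen for illustration.

```python
from itertools import product

# Tables A (Name, Age, Sex) and B (ID, Course) from the example.
A = [("Ram", 14, "M"), ("Sona", 15, "F"), ("Kim", 20, "M")]
B = [(1, "DS"), (2, "DBMS")]

# Pair every row of A with every row of B and concatenate the two rows.
AxB = [a + b for a, b in product(A, B)]

for row in AxB:
    print(row)

print(len(AxB))  # 6, that is 3 rows of A times 2 rows of B
```

The result has as many columns as A and B together, and its row count is the product of the two row counts.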
7. Rename Operation:
The rename operation is used to rename the output relation. It is denoted by rho (ρ).
Example: We can use the rename operator to rename STUDENT relation to STUDENT1.
ρ(STUDENT1, STUDENT)
In SQL, the term NULL represents a missing or undefined value in a table. A NULL value is
different from a zero or an empty string; it signifies that no data exists for that field.
Understanding how to work with NULL values is crucial for accurate data retrieval and
manipulation.