Best DBMS

The document provides an overview of Database Management Systems (DBMS), highlighting its importance in data organization, storage, retrieval, and security. It discusses various types of DBMS, database users, database models, SQL commands, and transaction control, as well as advanced topics like NoSQL and data warehousing. Additionally, it covers concepts such as ACID properties, indexing, and RAID configurations.

Uploaded by

sharama445

DBMS

Why is it important?
An Overview of Database Management Systems
• DBMS stands for Database Management System.
• A DBMS is a collection of interrelated data and a set of programs to
access this data in a convenient and efficient way.
• It controls the organization, storage, retrieval, security and integrity of
data in a database.
• The first type of DBMS, the hierarchical DBMS, emerged in the 1960s and
1970s. IBM had the first model, developed on the IBM 360; it was called IMS
and was originally written for the Apollo program. This type of DBMS was
based on a tree structure, so relationships were limited to those between
parent and child records.
File System vs DBMS
• Data accessibility is easy
• Transaction support
• Concurrency control with Recovery services
• Authorization services
• Data consistency: the value of a data item is the same in all places.
• Allows multiple users to share a file at the same time
• Protection & Security
Types of Data
• Structured data (relational databases). Examples: Microsoft SQL Server,
Oracle Database, MySQL, PostgreSQL and IBM Db2
• Unstructured data (NoSQL databases). Examples: Apache Cassandra, MongoDB,
CouchDB, and Couchbase.
Types of Database Users
• Administrators − Administrators maintain the DBMS and are responsible for
administering the database. They decide how the database is used and by whom. They
create access profiles for users and apply limitations to maintain
isolation and enforce security. Administrators also look after DBMS resources such as
system licenses, required tools, and other software and hardware maintenance.

• Designers − Designers are the people who work on the design of the database. They
decide what data should be kept and in what format. They identify and design the
whole set of entities, relations, constraints, and views.

• End Users − End users are those who actually benefit from having a DBMS. End
users range from simple viewers who follow logs or market rates to
sophisticated users such as business analysts.
Database Models
A database model shows the logical structure of a database, including
the relationships and constraints that determine how data can be
stored and accessed.
• Hierarchical Database Model
• Network Database Model
• Relational Database Model
• Object-Oriented Database Model
• Entity-Relationship Database Model
SQL Introduction
• SQL stands for Structured Query Language
• Declarative: a query states what data is wanted, not how to fetch it
• Keywords are not case sensitive
Types of Commands
DDL
• CREATE TABLE EMPLOYEE(Name VARCHAR2(20),
Email VARCHAR2(100), DOB DATE);
• DROP TABLE EMPLOYEE;
• ALTER is used to add a column, remove a column, rename a column, or
change a column's datatype or datatype length
• ALTER TABLE table_name ADD column_name column_definition;
• TRUNCATE TABLE table_name;
• RENAME old_name TO new_name;
DML
• INSERT INTO table_name VALUES (value1, value2, value3, .... valueN);
• UPDATE table_name SET column_name1 = value1, ... column_nameN
= valueN [WHERE condition];
• DELETE FROM table_name [WHERE condition];
DCL
• GRANT SELECT, UPDATE ON MY_TABLE TO SOME_USER, ANOTHER_USER;
• REVOKE SELECT, UPDATE ON MY_TABLE FROM USER1, USER2;
Transaction Control Language
• COMMIT;
• ROLLBACK;
• SAVEPOINT SAVEPOINT_NAME;
DQL
• SELECT expressions FROM TABLES WHERE conditions;
• SELECT DISTINCT expressions FROM tables WHERE conditions;
Order by
• The ORDER BY keyword is used to sort the result-set in ascending or
descending order.
• By default ascending
• SELECT * FROM employee
ORDER BY Salary DESC;
Examples
• Employee table
ID   NAME   SALARY   DEPARTMENT
1    A      10000    IT
2    B      20000    HR
3    C      30000    IT
4    A      40000    SALES
5    D      50000    IT
Aggregate Functions
• Sum
• Avg
• Count
• Max
• Min
Group by & Having
• The GROUP BY statement groups rows that have the same values into
summary rows, like "find the number of customers in each country".
• The GROUP BY statement is often used with aggregate functions
(COUNT, MAX, MIN, SUM, AVG) to group the result-set by one or
more columns.
• The HAVING clause was added to SQL because the WHERE keyword
could not be used with aggregate functions.
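The aggregate functions, GROUP BY and HAVING described above can be tried on the Employee table from the Examples slide. The following is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in DBMS; the lowercase table and column names are choices made for this illustration.

```python
import sqlite3

# In-memory database holding the Employee table from the Examples slide.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER, name TEXT, salary INTEGER, department TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "A", 10000, "IT"), (2, "B", 20000, "HR"), (3, "C", 30000, "IT"),
     (4, "A", 40000, "SALES"), (5, "D", 50000, "IT")],
)

# Aggregate functions computed over the whole table.
totals = conn.execute(
    "SELECT SUM(salary), AVG(salary), COUNT(*), MAX(salary), MIN(salary) FROM employee"
).fetchone()

# GROUP BY produces one summary row per department.
per_department = conn.execute(
    "SELECT department, COUNT(*), SUM(salary) FROM employee "
    "GROUP BY department ORDER BY department"
).fetchall()

# HAVING filters the groups after aggregation, which WHERE cannot do.
large_departments = conn.execute(
    "SELECT department FROM employee GROUP BY department HAVING COUNT(*) > 1"
).fetchall()

print(totals)
print(per_department)
print(large_departments)
```

Only IT has more than one employee, so it is the only department that passes the HAVING filter.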
Operators
• AND, OR, NOT
• SELECT * FROM EMPLOYEE
WHERE DEPARTMENT='IT' OR DEPARTMENT='HR';
• SELECT * FROM EMPLOYEE
WHERE DEPARTMENT='IT' AND NAME='A';
• SELECT * FROM EMPLOYEE
WHERE NOT DEPARTMENT='IT';
• >, <, <=, >=, <>
• Between, Not between
Operators
• In, Not In
• SELECT column_name(s)
FROM table_name
WHERE column_name IN (value1, value2, ...);
Joins and Nested Queries
• Joins were discussed in the relational algebra video; practice questions
will be discussed in the live class.
Command   Syntax with example (SQL is not case sensitive)

Create    Create table Student (id int, name varchar (10)); /* the table is named Student and has two
          columns: id with datatype integer and name with varchar of size 10 */

Alter (it has many variations)
          Adding a new column-> Alter table Student add age int;
          Removing a column-> Alter table Student drop column age;
          Changing a column datatype-> Alter table Student modify id varchar (10);
          Renaming a column-> Alter table Student rename column id to new_id;
          Changing the size of a datatype-> Alter table Student modify id varchar (25);

Drop      Drop table Student;

Rename    Rename table Student to Student_new;

Desc      Desc Student;

Command   Syntax with variations of example (SQL is not case sensitive)

Select    Select * from Student; /* Student is a table with two columns, id and name; this
          selects all records */
          Select * from Student where id = 1; /* selects a particular row based on a condition */

Insert    Insert into Student values (1, 'Ram'); /* values are inserted in column-sequence order */

Delete    Delete from Student; /* deletes all rows in one go */
          Delete from Student where id = 1; /* deletes a particular row based on a
          condition; delete does not remove the table structure */

Update    Update Student set name = 'Varun'; /* updates the name in every row */
          Update Student set name = 'Varun' where id = 1; /* updates the name based on a
          condition, i.e. id = 1 */
Command   Syntax with example

Grant     GRANT privilege_name ON object_name TO {user_name | PUBLIC | role_name} [WITH
          GRANT OPTION];
          Some examples:
          Grant select on Student to user1;
          Grant create table to user1;
          Grant update on Student to user1;
          Grant create session to user1;

Revoke    REVOKE privilege_name ON object_name FROM {user_name | PUBLIC | role_name};
          Some examples:
          Revoke create table from user1;
          Revoke select on Student from user1;
          Revoke update on Student from user1;

Command     Syntax with example

Commit      Commit; /* commit is written at the end, after all commands; it permanently saves
            the values into the database */

Savepoint   Savepoint savepoint_name;
            For example:
            Savepoint S1;
            Savepoint S2;

Rollback    Rollback to savepoint_name;
            For example:
            Rollback to S1;
            Rollback to S2;
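The interaction of SAVEPOINT, ROLLBACK TO and COMMIT can be sketched as follows. This is an illustration that assumes SQLite (through Python's sqlite3 module) as the DBMS; the Student rows are made-up sample data, and other systems may differ in detail.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction is controlled explicitly with BEGIN / COMMIT below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE student (id INTEGER, name TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO student VALUES (1, 'Ram')")
conn.execute("SAVEPOINT s1")
conn.execute("INSERT INTO student VALUES (2, 'Varun')")
conn.execute("SAVEPOINT s2")
conn.execute("INSERT INTO student VALUES (3, 'Aman')")

# Undo everything done after savepoint s1 (rows 2 and 3); row 1 is kept.
conn.execute("ROLLBACK TO s1")

# Make the remaining work permanent.
conn.execute("COMMIT")

ids = [row[0] for row in conn.execute("SELECT id FROM student ORDER BY id")]
print(ids)
```

Rolling back to S1 also discards the later savepoint S2, because S2 was created after S1.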
Introduction to Constraints in SQL
Constraints are rules that we can apply to the data in a table.

Constraint    Function

NOT NULL      Prevents null values from being stored in the column.
UNIQUE        Ensures that all values in the column are unique.
PRIMARY KEY   Ensures that all values in the column are unique as well as not null.
FOREIGN KEY   A field that refers to the primary key of another table. It is used for
              referential integrity.
CHECK         Validates that the values of a column meet a particular condition. For example,
              CHECK is used if an age column should only contain values greater than 18.
DEFAULT       Specifies a default value for the column when the user specifies none. For
              example, DEFAULT is used if the date should default to 01/01/2000.

During the table creation

CREATE TABLE Student (Id int Not null, Phone_no int Not null, Name varchar (10), Salary int);

After the table is created

Alter table Student MODIFY Salary Numeric (18, 2) NOT NULL;
During the table creation

CREATE TABLE Student (
Id int Unique,
Phone_no int,
Name varchar (10));

After the table is created

Alter table Student Add Unique (Phone_no);

During the table creation

CREATE TABLE Student (
Id int Primary key,
Phone_no int,
Name varchar (10));

After the table is created

ALTER TABLE name
ADD CONSTRAINT constraint_name
PRIMARY KEY (column_name);

Parent table (Referenced table)

CREATE TABLE Student (
id int,
student_name varchar (20),
PRIMARY KEY (id)
);
The Student table is the parent table because it contains the primary key and gives a reference to
the child table (Course).

Child table (Referencing table)

CREATE TABLE Course (
Course_id int NOT NULL,
Student_id int,
PRIMARY KEY (Course_id),
FOREIGN KEY (Student_id) REFERENCES Student (id)
);
Course is the child table; it takes its reference from the primary key (id) of the Student table.
Check Constraint with Example

CREATE TABLE Employee
(
id int,
address varchar (50),
CONSTRAINT ck1 CHECK (id BETWEEN 1 and 999)
);

After the table is created

ALTER TABLE Employee
ADD CONSTRAINT ck2
CHECK (address IN ('MUMBAI', 'DELHI', 'BANGLORE'));
/* ck2 is the constraint name */

Default Constraint with Syntax

CREATE TABLE Persons (
ID int NOT NULL,
Age int,
City varchar (255) DEFAULT 'Jaipur'
);

After the table is created

ALTER TABLE Persons
MODIFY City DEFAULT 'Jaipur';
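To see the constraints above being enforced, the sketch below creates two small tables and tries one violating statement per constraint. It assumes SQLite through Python's sqlite3 module, so the column types and the PRAGMA line are SQLite-specific; the sample rows and the helper name violates are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("""
    CREATE TABLE student (
        id INTEGER PRIMARY KEY,
        phone_no INTEGER UNIQUE,
        name TEXT NOT NULL,
        age INTEGER CHECK (age > 18),
        city TEXT DEFAULT 'Jaipur'
    )""")
conn.execute("""
    CREATE TABLE course (
        course_id INTEGER PRIMARY KEY,
        student_id INTEGER,
        FOREIGN KEY (student_id) REFERENCES student (id)
    )""")
conn.execute("INSERT INTO student (id, phone_no, name, age) VALUES (1, 111, 'Ram', 20)")


def violates(sql):
    """Return True if the statement is rejected by a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


insert = "INSERT INTO student (id, phone_no, name, age) VALUES "
results = {
    "NOT NULL": violates(insert + "(2, 222, NULL, 20)"),      # name is missing
    "UNIQUE": violates(insert + "(3, 111, 'Sam', 20)"),       # phone_no 111 already used
    "PRIMARY KEY": violates(insert + "(1, 333, 'Tom', 20)"),  # id 1 already used
    "CHECK": violates(insert + "(4, 444, 'Uma', 15)"),        # age is not above 18
    "FOREIGN KEY": violates("INSERT INTO course VALUES (10, 99)"),  # no student 99
}
# No city was given for student 1, so the DEFAULT value is stored.
default_city = conn.execute("SELECT city FROM student WHERE id = 1").fetchone()[0]
print(results, default_city)
```

Every violating statement is rejected, and the row inserted without a city receives the default value.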
Definition
• A transaction is a set of logically related operations. It contains a
group of tasks.
• Read(X): reads the value of X from the database and stores it in a buffer
in main memory.
• Write(X): writes the value of X from the buffer back to the database.
• Commit: saves the work done permanently.
• Rollback: undoes the work done.
ACID Properties
• Atomicity (either all operations of the transaction take effect or none do)
• Consistency (the database must be in a consistent state before and after the
transaction; for example, a transfer between accounts must leave the total
amount unchanged)
• Isolation (if transaction T1 is executing and using data item X, that data
item can't be accessed by any other transaction T2 until T1 ends)
• Durability (once a transaction commits, its changes are permanent)
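Atomicity and consistency can be illustrated with a funds transfer. This is a minimal sketch assuming SQLite through Python's sqlite3 module; the account table, the balances and the transfer helper are hypothetical and serve only the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 500), ("Y", 200)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst; either both updates persist or neither does."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.commit()    # Commit: both writes become permanent together
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # Rollback: the credit already applied to dst is undone
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer("X", "Y", 100)       # succeeds
failed = transfer("X", "Y", 1000)  # the debit would make X negative, so all is undone
print(ok, failed, balances())
```

The failed transfer credits Y first and only then hits the CHECK constraint on X. The rollback removes that credit again, so the total of 700 is the same before and after both calls.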
Types of problems in concurrency
• Dirty read
• Incorrect summary
• Lost update
• Unrepeatable read
• Phantom read
Dirty Read or Uncommitted Read or RAW
Incorrect summary
Lost Update
Unrepeatable read
Phantom read
Irrecoverable
Recoverable
Conflict Serializable
• Conflict Serializable: A schedule is called conflict serializable if it can
be transformed into a serial schedule by swapping non-conflicting
operations.
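Swapping operations by hand quickly becomes tedious. A standard equivalent check, used here in place of the swapping definition, builds a precedence graph: an edge Ti → Tj is added whenever an operation of Ti conflicts with a later operation of Tj (same data item, different transactions, at least one write). The schedule is conflict serializable exactly when this graph has no cycle. The schedule encoding below is an assumption made for this sketch.

```python
def is_conflict_serializable(schedule):
    """schedule: list of (transaction, operation, item) with operation "R" or "W"."""
    # Build the precedence graph from all pairs of conflicting operations.
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                edges.add((t1, t2))

    # Cycle check by repeatedly removing nodes without incoming edges
    # (Kahn's algorithm). If every node can be removed, the graph is acyclic.
    nodes = {t for t, _, _ in schedule}
    indegree = {n: 0 for n in nodes}
    for _, v in edges:
        indegree[v] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for u, v in edges:
            if u == n:
                indegree[v] -= 1
                if indegree[v] == 0:
                    ready.append(v)
    return removed == len(nodes)


# T1 finishes with A before T2 touches it: only the edge T1 -> T2 exists.
serial_like = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
# T2 writes A between T1's read and write: edges T1 -> T2 and T2 -> T1 form a cycle.
interleaved = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]

print(is_conflict_serializable(serial_like))
print(is_conflict_serializable(interleaved))
```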
Locks
• Shared Lock
• Exclusive lock
Introduction
• Indexing is used to optimize the performance of a database by
minimizing the number of disk accesses required when a query is
processed.
Index structure
• Key and data pointer
Example
• On a hard disk every block is 1000 bytes and the record size is 100 bytes.
The key field takes 12 bytes and a pointer takes 8 bytes. There is a file of
10,000 records. What will the I/O cost be with and without indexing?
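One way to work through this example is sketched below. The slide does not state the search strategy, so the sketch assumes unspanned records (a record never crosses a block boundary), a data file sorted on the key, binary search over blocks, and one extra block access to fetch the record once the index entry is found.

```python
import math

BLOCK_SIZE = 1000     # bytes per block
RECORD_SIZE = 100     # bytes per record
KEY_SIZE = 12         # bytes per key field
POINTER_SIZE = 8      # bytes per pointer
NUM_RECORDS = 10_000

# Data file: 10 records fit in a block, so the file occupies 1000 blocks.
records_per_block = BLOCK_SIZE // RECORD_SIZE
data_blocks = math.ceil(NUM_RECORDS / records_per_block)

# Without an index.
linear_scan_worst = data_blocks                         # read every block
binary_search_data = math.ceil(math.log2(data_blocks))  # file sorted on the key

# With an index: each entry is a key plus a pointer (20 bytes), 50 entries per block.
entries_per_block = BLOCK_SIZE // (KEY_SIZE + POINTER_SIZE)
dense_index_blocks = math.ceil(NUM_RECORDS / entries_per_block)   # one entry per record
sparse_index_blocks = math.ceil(data_blocks / entries_per_block)  # one entry per block

# Binary search on the index blocks, plus one access for the data block.
dense_cost = math.ceil(math.log2(dense_index_blocks)) + 1
sparse_cost = math.ceil(math.log2(sparse_index_blocks)) + 1

print(linear_scan_worst, binary_search_data, dense_cost, sparse_cost)
```

Under these assumptions a search costs up to 1000 block accesses with a linear scan, 10 with binary search on the sorted data file, 9 with a dense index and 6 with a sparse index.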
Dense & sparse
Types
• Primary
• Clustered
• Secondary
• Multilevel( B & B+ Tree)
              Key                 Non Key

Ordered       Primary (sparse)    Clustered (sparse)

Non-Ordered   Secondary (dense)   Secondary (dense)


Primary Index
• If the index is created on the basis of the primary key of the table,
then it is known as primary indexing. Primary key values are unique
to each record, so index entries and records have a 1:1 relation.
• As primary keys are stored in sorted order, the search operation is
quite efficient.
• The primary index can be classified into two types: Dense index and
Sparse index.
Clustering Index
• A clustered index can be defined as an ordered data file. Sometimes
the index is created on non-primary key columns which may not be
unique for each record.
• In this case, to identify the record faster, we will group two or more
columns to get the unique value and create index out of them. This
method is called a clustering index.
• Records which have similar characteristics are grouped, and
indexes are created for these groups.

Secondary Index
B Tree
• Index will be stored in tree structure
• It is balanced always
• Terminologies ( Block pointer, Keys, Data Pointer, Order)
• Order of a B tree is maximum no of Block pointers(Children)
Imp Points
• Keys are distributed all over the tree. The left child contains smaller
values than the parent key and the right child contains larger values.
• Order of leaf and non-leaf nodes is the same
B+ tree
• Data pointers are only present in the leaf node
• Searching is faster and deletion is easy
• Leaf nodes are linked together like a linked list
• Order of leaf and non leaf node is different
• Leaf node order is maximum key and data pointer pair
• Non leaf order is maximum children possible
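The order definitions above can be turned into numbers once a node layout is fixed. The sketch below assumes the usual textbook layout in which one node fills one disk block, and reuses the sizes from the earlier indexing example (1000-byte block, 12-byte key). Treating both block pointers and data pointers as 8 bytes is an assumption of this sketch.

```python
BLOCK_SIZE = 1000    # bytes available in one node (one disk block)
KEY_SIZE = 12
BLOCK_POINTER = 8    # pointer to a child node
DATA_POINTER = 8     # pointer to a record


def b_tree_order():
    # A B tree node with p children holds p block pointers and p - 1
    # (key, data pointer) pairs:
    #   p * BLOCK_POINTER + (p - 1) * (KEY_SIZE + DATA_POINTER) <= BLOCK_SIZE
    return (BLOCK_SIZE + KEY_SIZE + DATA_POINTER) // (
        BLOCK_POINTER + KEY_SIZE + DATA_POINTER
    )


def b_plus_internal_order():
    # A B+ tree non-leaf node with p children holds p block pointers and p - 1 keys:
    #   p * BLOCK_POINTER + (p - 1) * KEY_SIZE <= BLOCK_SIZE
    return (BLOCK_SIZE + KEY_SIZE) // (BLOCK_POINTER + KEY_SIZE)


def b_plus_leaf_order():
    # A B+ tree leaf holds q (key, data pointer) pairs plus one pointer to the next leaf:
    #   q * (KEY_SIZE + DATA_POINTER) + BLOCK_POINTER <= BLOCK_SIZE
    return (BLOCK_SIZE - BLOCK_POINTER) // (KEY_SIZE + DATA_POINTER)


print(b_tree_order(), b_plus_internal_order(), b_plus_leaf_order())
```

Because B+ tree non-leaf nodes store no data pointers, they fit more children per block (50 here, against 36 for the B tree), which keeps the tree shallower.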
Advanced Topics
DBMS
Topics to be covered
• BIG Data(Structured vs Unstructured data, 3V’s, Hadoop, Map
Reduce)
• NoSQL
• Datawarehouse, Data Mining
Definition
• HBase is a data model similar to Google’s Bigtable,
designed to provide quick random access to huge amounts of
structured data.
• NoSQL databases (aka "not only SQL") are non tabular, and store data
differently than relational tables. NoSQL databases come in a variety
of types based on their data model. The main types are document,
key-value, wide-column, and graph. They provide flexible schemas
and scale easily with large amounts of data and high user loads.
Definition
• A data warehouse is a large collection of business data
used to help an organization make decisions.
• A core component of business intelligence
• Extract, transform, load (ETL) and extract, load, transform (ELT) are
the two main approaches used to build a data warehouse system.
• A data mart is a simple form of a data warehouse that is
focused on a single subject (or functional area)
OLAP vs OLTP
• Online analytical processing (OLAP) is characterized by a
relatively low volume of transactions.
• OLAP applications are widely used by Data Mining techniques
• The three basic operations in OLAP are: Roll-up
(Consolidation), Drill-down and Slicing & Dicing.
• Online transaction processing (OLTP) is characterized by a
large number of short on-line transactions (INSERT, UPDATE,
DELETE).
Star Schema
Snowflake schema
• A snowflake schema is a logical arrangement of tables in a
multidimensional database such that the entity relationship diagram
resembles a snowflake shape.
RAID
DBMS
RAID
• Redundant Array of Independent Disks
• Redundant Array of Inexpensive Disks
Level    Description                                           Minimum no. of drives   Space efficiency   Fault tolerance

RAID 0   Block-level striping without parity or mirroring      2                       1                  None
RAID 1   Mirroring without parity or striping                  2                       1/n                n − 1 drive failures
RAID 3   Byte-level striping with dedicated parity             3                       1 − 1/n            One drive failure
RAID 4   Block-level striping with dedicated parity            3                       1 − 1/n            One drive failure
RAID 5   Block-level striping with distributed parity          3                       1 − 1/n            One drive failure
RAID 6   Block-level striping with double distributed parity   4                       1 − 2/n            Two drive failures
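The space-efficiency column can be used to estimate usable capacity. In the sketch below n is the number of drives in the array and all drives are assumed to be the same size; the dictionary layout and function name are choices made for this illustration.

```python
from fractions import Fraction

# level -> (minimum drives, space efficiency for n drives, tolerated failures for n drives)
RAID_LEVELS = {
    "RAID 0": (2, lambda n: Fraction(1), lambda n: 0),
    "RAID 1": (2, lambda n: Fraction(1, n), lambda n: n - 1),
    "RAID 3": (3, lambda n: 1 - Fraction(1, n), lambda n: 1),
    "RAID 4": (3, lambda n: 1 - Fraction(1, n), lambda n: 1),
    "RAID 5": (3, lambda n: 1 - Fraction(1, n), lambda n: 1),
    "RAID 6": (4, lambda n: 1 - Fraction(2, n), lambda n: 2),
}


def usable_capacity(level, n_drives, drive_size):
    """Usable capacity of an array of n_drives equal drives, in the unit of drive_size."""
    minimum, efficiency, _ = RAID_LEVELS[level]
    if n_drives < minimum:
        raise ValueError(f"{level} needs at least {minimum} drives")
    return n_drives * drive_size * efficiency(n_drives)


# Four 2 TB drives under different levels.
for level in ("RAID 0", "RAID 1", "RAID 5", "RAID 6"):
    print(level, usable_capacity(level, 4, 2), "TB")
```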
DBMS Architecture
2-Tier Architecture
The 2-Tier architecture is the same as basic client-server. In the two-tier architecture, applications on the client end
communicate directly with the database on the server side. APIs such as ODBC and JDBC are used for this
interaction.
The user interfaces and application programs run on the client side.
The server side is responsible for providing functionality such as query processing and transaction management.
To communicate with the DBMS, the client-side application establishes a connection with the server side.
3-Tier Architecture
The 3-Tier architecture contains another layer between the client and the server. In this architecture, the client can't
communicate directly with the server.
The application on the client end interacts with an application server, which in turn communicates with the
database system.
The end user has no idea about the existence of the database beyond the application server. The database also has
no idea about any user beyond the application.
The 3-Tier architecture is used for large web applications.
3 Schema Architecture: Data abstraction.

Physical Level: The physical level holds the information about the location of database objects in the data store.
Users of the DBMS are unaware of the locations of these objects. In simple terms, the physical level of a
database describes how the data is stored on secondary storage devices such as disks and tapes.
Conceptual Level: At the conceptual level, data is represented in the form of database tables. For example, a
STUDENT database may contain STUDENT and COURSE tables, which are visible to users, but users are unaware
of how they are stored. This level defines tables, views, and integrity constraints. Also referred to as the logical
schema, it describes what kind of data is to be stored in the database.
External Level: The external level specifies a view of the data in terms of conceptual-level tables. Each external-level
view caters to the needs of a particular category of users. For example, the FACULTY of a university are
interested in the course details of students, while STUDENTS are interested in all details related to
academics, accounts, courses and hostels. So, different views can be generated for different users.
Data Independence
Data independence means that a change at one level should not affect another level. Two types of data
independence are present in this architecture:
Physical Data Independence: Any change in the physical location of tables and indexes should not affect the
conceptual level or external view of data. This data independence is easy to achieve and is implemented by most
DBMSs.
Conceptual Data Independence: The conceptual-level schema and external-level schema must be
independent. This means a change in the conceptual schema should not affect the external schema. For example, adding or
deleting attributes of a table should not affect the user’s view of the table. This type of independence is
more difficult to achieve than physical data independence because changes in the conceptual schema are
often reflected in the user’s view.

Database Schema
Database schema is the skeleton of database that represents the logical view of the entire database.
A database schema does not contain any data or information.
A database schema defines its entities and the relationship among them.

Database Instance
It contains a snapshot of the database.
Advantages of DBMS
• Minimized redundancy and data inconsistency
• Simplified Data Access
• Multiple data views
• Data Security
• Concurrent access to data
• Backup and Recovery mechanism
Database model

A Database model defines the logical design and structure of a database and defines how data will
be stored, accessed and updated in a database management system. While the Relational Model is
the most widely used database model, there are other models too:
• Hierarchical Model
• Network Model
• Entity-relationship Model
• Relational Model
Hierarchical Model
This database model organizes data into a tree-like structure, with a single root, to which all the other data is
linked. The hierarchy starts from the root data and expands like a tree, adding child nodes to the parent nodes.
In this model, a child node has only a single parent node.
This model efficiently describes many real-world relationships such as the index of a book, recipes etc.
In the hierarchical model, data is organized into a tree-like structure with a one-to-many relationship between two
different types of data. For example, one department can have many courses and many professors, and one course
can have many students.
Network Model
This is an extension of the hierarchical model. In this model data is organized more like a graph, and child nodes are
allowed to have more than one parent node.
In this database model data is more related, as more relationships are established. Because
the data is more related, accessing it is also easier and faster. This database model was used to
map many-to-many data relationships.
This was the most widely used database model before the relational model was introduced.
Entity-relationship Model
In this database model, objects of interest are modeled as entities and their characteristics
as attributes.
Different entities are related using relationships.
E-R models represent these relationships in pictorial form to make them easier for different
stakeholders to understand.
This model is good for designing a database, which can then be turned into tables in the relational model.
For example, if we have to design a school database, then Student will be an entity with attributes
name, age, address etc. As Address is generally complex, it can be another entity with attributes street name,
pincode, city etc., and there will be a relationship between them.
Relational Model
In this model, data is organized in two-dimensional tables and relationships are maintained by storing a
common field.
This model was introduced by E. F. Codd in 1970, and since then it has been the most widely used database model.
The basic structure of data in the relational model is the table. All the information related to a particular type is
stored in rows of that table.
Hence, tables are also known as relations in the relational model.
In the coming tutorials we will learn how to design tables, normalize them to reduce data redundancy, and how
to use Structured Query Language to access data from tables.
Schema
• Schema can be defined as the design of a database. The overall
description of the database is called the database schema.
• You can relate it as something like Functions, Comments,
Preprocessor, statements, types and variables in programming
languages.
• A subschema is a subset of the schema which allows users to view only
the part they are authorized for.
Types
1. Physical Schema: The design of a database at the physical level is called the
physical schema; how the data is stored in blocks of storage is described
at this level.
2. Logical Schema: The logical schema can be defined as the design of the
database at the logical level. Programmers as well as the
database administrator (DBA) work at this level.
3. View Schema: The view schema can be defined as the design of the
database at the view level, which generally describes end-user interaction
with database systems.
What is an Instance?
• Databases change over time as information is inserted and deleted.
The collection of information stored in the database at a particular
moment is called an instance.
Data Independence/Transparency
• There are two types of data independence: physical and
logical data independence.
• Physical data independence is the ability to modify the
physical schema without causing application programs to
be rewritten.
• Logical data independence is the ability to modify the
logical schema without causing application programs to be
rewritten.
• Logical data independence is more difficult to achieve than
physical data independence.
What is Key
• A DBMS key is an attribute or set of attributes which helps you to
identify a row (tuple) in a relation (table).
• Simple vs Composite

NAME AGE
Ravi 20
Aman 21
Ravi 20
Types of Keys
• Candidate Key
• Primary Key
• Alternate Key
• Super Key
• Foreign Key
Candidate Key
• A CANDIDATE KEY is a minimal set of attributes that uniquely identifies tuples in a
table
• The primary key should be selected from the candidate keys.
• Every table must have at least one candidate key.
• A table can have multiple candidate keys but only a single primary
key.
Primary Key
• Unique + Not Null
• Only one in a table
• Can be composite
• Qus. Can a table exist without primary key attribute??
Alternate key
• Alternate keys = candidate keys − primary key (the candidate keys not chosen
as the primary key)
Super Key
• Super Set of Candidate Key
• Must contain Candidate key + Anything
• R(ABCD) contain four attributes and given that A is candidate key.
Find all super keys?
Qus. R(ABCD) contain four attributes and given that A and B are two
candidate keys. Find all super keys?
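Both questions above can be answered by brute force: a set of attributes is a super key exactly when it contains at least one candidate key. The sketch below enumerates every non-empty subset of R's attributes; writing attribute sets as strings such as "ABD" is a convention chosen for this illustration.

```python
from itertools import combinations


def super_keys(attributes, candidate_keys):
    """Return every attribute subset that contains at least one candidate key."""
    result = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if any(set(ck) <= set(combo) for ck in candidate_keys):
                result.append("".join(combo))
    return result


one_key = super_keys("ABCD", ["A"])        # A is the only candidate key
two_keys = super_keys("ABCD", ["A", "B"])  # A and B are both candidate keys

print(len(one_key), one_key)
print(len(two_keys), two_keys)
```

With A as the only candidate key, A can be combined with any subset of {B, C, D}, giving 2^3 = 8 super keys. With A and B both candidate keys, the 4 subsets that contain B but not A are added, giving 12.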
Foreign Key
• FOREIGN KEY is a column that creates a relationship between two
tables
• Can have duplicate values
• Maintain Referential Integrity
Qus. Can a table contain many foreign keys????
Qus. Can foreign key have Null values????
Referenced table vs Referencing table
(Insertion & deletion)
Integrity Constraints
• Integrity constraints are a set of rules. It is used to maintain the
quality of information.
• Unique key constraint
• Primary key Constraint
• Referential integrity constraint
• Domain Constraint ( Check constraint)
Questions
• Which of the following is True?
A) All the candidate keys can be called as super keys
B) All the Super keys can be called as candidate keys
Topics to be covered
• Definition with examples
• Notations
• What is Entity
• Attributes and their types
• Relationship and their types
• Minimization
• Generalization
• Specialization
• Aggregation
• Conversion of ER to tables
ER model (Entity-Relationship Model)

• Conceptual design for the database


• simple and easy to design
Entity
• An entity may be any object, class, person or place (anything which
has some characteristics or attributes)
• Notation
• Strong entity
• Weak entity
Attributes
• The attribute is used to describe the property of an entity
• Simple vs Composite
• Single vs Multivalued
• Key vs Non Key
• Derived vs Non derived
• Notations
Relationship
• 1 to 1
• 1 to many
• Many to 1
• Many to many (M-N)
• Total participation vs Partial Participation
One to One Relationship:
One to Many Relationship:
Many to Many Relationship:
Weak Entity Set
• Weak entities do not have a primary key of their own.
• They cannot be identified on their own, so they depend on some
other entity (known as the owner entity).
• Weak entities have a total participation constraint (existence
dependency) in their identifying relationship with the owner entity.
• Weak entity types have partial keys. Partial keys are sets of attributes
with the help of which the tuples of the weak entities can be
distinguished and identified.
Weak entity set
• Weak entities are represented with double rectangular box in the ER
Diagram
• the identifying relationships are represented with double diamond.
• Partial Key attributes are represented with dotted lines.
Weak entity set
Participation
• Total Participation – Each entity in the entity set must participate in
the relationship. If each student must enroll in a course, the
participation of Student is total. Total participation is shown by a
double line in the ER diagram.
• Partial Participation – An entity in the entity set may or may not
participate in the relationship. If some courses have no students enrolled,
the participation of Course is partial. The
diagram depicts the ‘Enrolled in’ relationship set with the Student entity
set having total participation and the Course entity set having partial
participation.
Generalization
• Generalization is like a bottom-up approach
• Two or more entities of lower level combine to form a higher level
entity if they have some attributes in common.
Specialization
• Specialization is a top-down approach
• It is the opposite of generalization. In specialization, one higher level
entity is broken down into two or more lower level entities.
Aggregation
• Relation between two entities is treated as a single entity
Conversion of ER to Tables
• Entity type becomes a table.
• Each single-valued attribute becomes a column of the table.
• The key attribute of the entity type becomes the primary key.
• A multivalued attribute is represented by a separate table (first
normal form).
• A composite attribute is represented by its components.
• Derived attributes are not considered in the table.
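The conversion rules above can be sketched for a hypothetical Student entity with a key attribute (roll number), a composite attribute (name), a multivalued attribute (phone) and a derived attribute (age). All table and column names here are invented for the illustration, and SQLite is assumed as the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The entity type becomes a table and the key attribute its primary key.
    -- The composite attribute Name is stored as its components.
    -- The derived attribute Age is not stored; it can be computed from date_of_birth.
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name TEXT,
        date_of_birth TEXT
    );

    -- The multivalued attribute Phone gets a separate table, one row per value.
    CREATE TABLE student_phone (
        roll_no INTEGER REFERENCES student (roll_no),
        phone TEXT,
        PRIMARY KEY (roll_no, phone)
    );

    INSERT INTO student VALUES (1, 'Ravi', 'Kumar', '2004-05-01');
    INSERT INTO student_phone VALUES (1, '111'), (1, '222');
""")

phones = [
    phone
    for (phone,) in conn.execute(
        "SELECT phone FROM student_phone WHERE roll_no = 1 ORDER BY phone"
    )
]
print(phones)
```

Each phone number occupies its own row, so every column holds a single atomic value.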
Participation Constraints
• Total Participation − Each entity is involved in the relationship. Total participation is represented by
double lines.
• Partial participation − Not all entities are involved in the relationship. Partial participation is
represented by single lines.
1. Strong Entity Set:
• A strong entity set is an entity set that contains sufficient attributes to uniquely identify all its entities.
• In other words, a primary key exists for a strong entity set.
• Primary key of a strong entity set is represented by underlining it.

Symbols Used:
• A single rectangle is used for representing a strong entity set.
• A diamond symbol is used for representing the relationship that exists between two strong entity sets.
• A single line is used for representing the connection of the strong entity set with the relationship set.
• A double line is used for representing the total participation of an entity set with the relationship set.
• Total participation may or may not exist in the relationship.
2. Weak Entity Set:
• A weak entity set is an entity set that does not contain sufficient attributes to uniquely identify its entities.
• In other words, a primary key does not exist for a weak entity set.
• However, it contains a partial key called as a discriminator.
• Discriminator can identify a group of entities from the entity set.
• Discriminator is represented by underlining with a dashed line.
• The combination of discriminator and primary key of the strong entity set makes it possible to uniquely
identify all entities of the weak entity set.
• Thus, this combination serves as a primary key for the weak entity set.
• Clearly, this primary key is not formed by the weak entity set completely.
Symbols Used
• A double rectangle is used for representing a weak entity set.
• A double diamond symbol is used for representing the relationship that exists between the strong and
weak entity sets and this relationship is known as identifying relationship.
• A double line is used for representing the connection of the weak entity set with the relationship set.
• Total participation always exists in the identifying relationship.
In ER diagram, weak entity set is always present in total participation with the identifying relationship set.
So, the Primary key of Apartment= Building number + Door number
Closure Property
• The set of all those attributes which can be functionally determined from
an attribute set is called as a closure of that attribute set.
• Closure of attribute set {X} is denoted as {X}+.
Consider a relation R(A , B , C , D , E , F , G ) with the functional dependencies
A → BC, BC → DE, D → F, CF → G
Find Candidate Key from Closure
• A minimal set of attributes whose closure contains all the attributes of
the relation is called a candidate key of that relation.
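The closure computation and the candidate-key search can be sketched for the relation R(A, B, C, D, E, F, G) given above. Functional dependencies are written as (left side, right side) string pairs, a convention chosen for this sketch, and the key search simply tries attribute subsets from smallest to largest.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already in the closure, add the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(relation, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(relation, size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(relation):
                keys.append("".join(combo))
    return keys


FDS = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]

print(sorted(closure("A", FDS)))
print(sorted(closure("D", FDS)))
print(candidate_keys("ABCDEFG", FDS))
```

A determines B and C, which give D and E, then F, then G, so the closure of A is the whole relation and A is a candidate key. No dependency has A on its right side, so every key must contain A, which makes A the only candidate key.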
Minimal Cover/Canonical Cover/Irreducible
• R(W, X, Y, Z) with X → W, WZ → XY, Y → WXZ
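A minimal cover for this example can be derived with the usual three steps: split every right side into single attributes, drop extraneous attributes from left sides, and drop dependencies that follow from the rest. The sketch below is one straightforward implementation. A minimal cover is not always unique, so the result can depend on the order in which dependencies are examined.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: split right sides so that each dependency determines one attribute.
    cover = [(lhs, attr) for lhs, rhs in fds for attr in rhs]

    # Step 2: remove extraneous attributes from left sides.
    for i in range(len(cover)):
        lhs, attr = cover[i]
        for candidate in lhs:
            shorter = lhs.replace(candidate, "")
            if shorter and attr in closure(shorter, cover):
                lhs = shorter
                cover[i] = (lhs, attr)

    # Step 3: remove dependencies that are implied by the remaining ones.
    result = list(dict.fromkeys(cover))  # drop exact duplicates, keep order
    for fd in list(result):
        rest = [other for other in result if other != fd]
        if fd[1] in closure(fd[0], rest):
            result = rest
    return result


FDS = [("X", "W"), ("WZ", "XY"), ("Y", "WXZ")]
cover = minimal_cover(FDS)
print(cover)
```

For this input the sketch keeps X → W, WZ → Y, Y → X and Y → Z, which can be written as X → W, WZ → Y, Y → XZ. WZ → X is dropped because WZ → Y and Y → X already imply it, and Y → W is dropped because Y → X and X → W imply it.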
What is a Functional Dependency?

• A functional dependency (FD) describes how one attribute (or set of attributes)
determines another attribute within a relation in a database management system (DBMS).
• X → Y (if two tuples have the same value of X, they must have the same value of Y)
• Example: Emp_id → Emp_name
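Whether a dependency such as Emp_id → Emp_name holds in a particular table can be checked directly from the definition: no two rows may agree on the left side and differ on the right. The sketch below assumes rows are Python dictionaries; the function name fd_holds, the Dept column, and the sample rows are made up for illustration. A check like this only shows that the data does not contradict the FD, not that the FD is a rule of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is satisfied by every pair of rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first row with this key fixes the value every later row must match.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data.
employees = [
    {"Emp_id": 1, "Emp_name": "Asha", "Dept": "HR"},
    {"Emp_id": 2, "Emp_name": "Ravi", "Dept": "IT"},
    {"Emp_id": 3, "Emp_name": "Asha", "Dept": "IT"},
]

print(fd_holds(employees, ["Emp_id"], ["Emp_name"]))  # True
print(fd_holds(employees, ["Emp_name"], ["Dept"]))    # False
```

The second check fails because two rows share the name Asha but list different departments.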
Types of Functional Dependencies
• Trivial functional dependency
X → Y is a trivial functional dependency if Y is a subset of X.
For example: X → X, XY → X, XY → Y
Question: L.H.S ∩ R.H.S = ?
• Non-trivial functional dependency
• X → Y is a non-trivial functional dependency if Y is not a subset of X.
Rules of Functional Dependencies X->Y
• Reflexive rule: If X is a set of attributes and Y is a subset of X, then
X → Y holds.
• Augmentation rule: If X → Y holds and C is a set of attributes, then
XC → YC also holds. That is, adding the same attributes to both sides does not
change the basic dependency.
• Transitivity rule: Much like the transitive rule in algebra,
if X → Y holds and Y → Z holds, then X → Z also holds. In X → Y,
X is said to functionally determine Y.
Rules of Functional Dependencies X->Y
• Union: If X → Y and X → Z then X → YZ
• Decomposition: If X → YZ then X → Y and X → Z
Question: If XY → Z holds, can we write X → Z and Y → Z?
Normalization
• Normalization is the process of organizing the data in the database.
• Normalization is used to minimize redundancy in a relation or
set of relations. It is also used to eliminate undesirable
characteristics such as insertion, update, and deletion anomalies.
• Normalization divides a larger table into smaller tables and links
them using relationships.
• Normal forms are used to reduce redundancy in database
tables.
First Normal Form (1NF)
• Each attribute of a relation contains only an atomic value.
• No multivalued attributes are allowed; every attribute must be single-valued.

Student_id   Name     Subject
101          AK       Computer Network, JAVA
102          VK       DBMS, C++, JAVA
103          Amrita   Software Engineering, Compiler Design
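In the table above the Subject column holds several values per row, so it is not atomic. A minimal sketch of bringing it into 1NF is to emit one row per student and subject; the function name to_1nf and the tuple layout are assumptions made for illustration.

```python
students = [
    (101, "AK", "Computer Network, JAVA"),
    (102, "VK", "DBMS, C++, JAVA"),
    (103, "Amrita", "Software Engineering, Compiler Design"),
]


def to_1nf(rows):
    """Split the comma-separated Subject cell into one row per subject."""
    flat = []
    for student_id, name, subjects in rows:
        for subject in subjects.split(","):
            flat.append((student_id, name, subject.strip()))
    return flat


for row in to_1nf(students):
    print(row)
# (101, 'AK', 'Computer Network')
# (101, 'AK', 'JAVA')
# (102, 'VK', 'DBMS')
# (102, 'VK', 'C++')
# (102, 'VK', 'JAVA')
# (103, 'Amrita', 'Software Engineering')
# (103, 'Amrita', 'Compiler Design')
```

After the split, Student_id alone no longer identifies a row; the pair (Student_id, Subject) does.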
Second Normal Form (2NF)
• A relation must be in 1NF and
• No partial dependency should exist in the relation.
• Partial Dependency: If a non-prime attribute can be determined by
a part of a candidate key in a relation, it is known as a partial
dependency.
• If the L.H.S is a proper subset of a candidate key and the R.H.S is a
non-prime attribute, then the FD is a partial dependency.
Partial Dependency &
Prime, Non-Prime attributes
• Prime attribute: an attribute that is part of some candidate key.
Every other attribute is a non-prime attribute.
• Example: R(A, B, C, D) with FDs AB → C, C → D, B → D and candidate key AB.
Here B → D is a partial dependency: B is a proper subset of the candidate key AB
and D is a non-prime attribute.
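The proper-subset test for partial dependencies can be applied mechanically to the example R(ABCD). The sketch below is illustrative only: the function name partial_dependencies is hypothetical, FDs are encoded as pairs of strings, the candidate keys are supplied by hand, and only the listed FDs are examined (not every FD they imply).

```python
def partial_dependencies(fds, candidate_keys):
    """Return the listed FDs whose left side is a proper subset of a
    candidate key and whose right side contains a non-prime attribute."""
    prime = set().union(*candidate_keys)  # attributes in some candidate key
    found = []
    for lhs, rhs in fds:
        is_proper_part = any(set(lhs) < set(key) for key in candidate_keys)
        non_prime_rhs = set(rhs) - prime
        if is_proper_part and non_prime_rhs:
            found.append((lhs, "".join(sorted(non_prime_rhs))))
    return found


fds = [("AB", "C"), ("C", "D"), ("B", "D")]
print(partial_dependencies(fds, ["AB"]))  # [('B', 'D')]
```

C → D is not reported because C is not part of the candidate key; it is a transitive dependency rather than a partial one.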
Third Normal Form (3NF)
• A relation must be in second normal form (2NF), and there must be
no transitive functional dependency for non-prime attributes in
the relation.
Rollno   State           City
1        Punjab          Chandigarh
2        Haryana         Ambala
3        Punjab          Chandigarh
4        Haryana         Ambala
5        Uttar Pradesh   Ghaziabad
Check whether a table is in 3NF or not
• For every non-trivial FD X → Y, at least one of the following must hold:
• X is a super key (or candidate key), or Y is a prime attribute, i.e., Y is
part of a candidate key.
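The 3NF condition can be turned into a small checker: for each FD, test whether the left side is a super key or the right side is prime. The sketch below is one possible illustration, with hypothetical function names, FDs encoded as pairs of strings, and candidate keys passed in by hand; it reports the listed FDs that break the condition.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violates_3nf(attributes, fds, candidate_keys):
    """Return the FDs X -> Y where X is not a super key and Y is not prime."""
    prime = set().union(*candidate_keys)
    bad = []
    for lhs, rhs in fds:
        extra = set(rhs) - set(lhs)  # ignore the trivial part of the FD
        is_super_key = closure(lhs, fds) >= set(attributes)
        if extra and not is_super_key and not extra <= prime:
            bad.append((lhs, rhs))
    return bad


# R(ABCD) with candidate key AB: not in 3NF.
print(violates_3nf("ABCD", [("AB", "C"), ("C", "D"), ("B", "D")], ["AB"]))
# [('C', 'D'), ('B', 'D')]

# R(ABC) with candidate keys AB and AC: in 3NF, because B is prime.
print(violates_3nf("ABC", [("AB", "C"), ("C", "B")], ["AB", "AC"]))
# []
```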
Boyce-Codd Normal Form (BCNF)
• A relation is in BCNF if it is in 3NF and, for every non-trivial functional dependency X → Y,
the L.H.S (X) is a super key of the table.
• Example: R(A, B, C, D) with FDs A → B, B → C, C → D, D → A
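For BCNF, only the left side of each dependency matters, so the check reduces to a closure test. The sketch below is a hedged illustration (function names are hypothetical, FDs are pairs of strings) applied to the R(ABCD) example and to a contrasting relation that is in 3NF but not in BCNF.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose left side is not a super key."""
    all_attrs = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not closure(lhs, fds) >= all_attrs
    ]


# Every left side determines the whole relation, so R is in BCNF.
cyclic = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
print(bcnf_violations("ABCD", cyclic))  # []

# Assumed contrast example: R(ABC) with AB -> C and C -> B.
print(bcnf_violations("ABC", [("AB", "C"), ("C", "B")]))  # [('C', 'B')]
```

In the first relation each of A, B, C and D is a candidate key on its own, which is why no dependency violates BCNF.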
Fourth Normal Form
• The relation must be in BCNF.
• And there must be no non-trivial multivalued dependency whose L.H.S is not a super key.
• Multivalued dependency: For a dependency X → Y, if for a single
value of X, multiple values of Y exist, then the relation may have a
multivalued dependency. It is represented by the double arrow
sign (→→).
Fourth Normal Form
STU_ID   COURSE      HOBBY
21       Computer    Dancing
21       Math        Singing
34       Chemistry   Dancing
74       Biology     Cricket
59       Physics     Hockey
Fifth normal form (5NF)
• A relation is in 5NF if it is in 4NF, does not contain any join
dependency, and can be joined back losslessly.
• No spurious tuples
• 5NF is also known as Project-join normal form (PJ/NF).
What is Relational Algebra?
• Relational algebra is a widely used procedural query
language (also a formal one).
• Queries are written as mathematical expressions.
• It is the building block of SQL.
• SQL itself is a declarative (non-procedural) language.
Basic Relational Algebra Operations
• Unary Relational Operations
• SELECT (symbol: σ)
• PROJECT (symbol: π)
• RENAME (symbol: ρ, rho)
• Relational Algebra Operations From Set Theory
• UNION (∪)
• INTERSECTION (∩)
• DIFFERENCE (−)
• CARTESIAN PRODUCT (×)
• Binary Relational Operations or Derived
• JOIN
• DIVISION
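The unary operations can be mimicked in a few lines of Python to see what each one does to a relation. This is only a sketch: relations are modelled as lists of dictionaries, the function names select, project and rename are chosen for illustration, and the sample rows reuse the STU_ID / COURSE / HOBBY table from the Fourth Normal Form slide.

```python
students = [
    {"STU_ID": 21, "COURSE": "Computer", "HOBBY": "Dancing"},
    {"STU_ID": 21, "COURSE": "Math", "HOBBY": "Singing"},
    {"STU_ID": 34, "COURSE": "Chemistry", "HOBBY": "Dancing"},
    {"STU_ID": 74, "COURSE": "Biology", "HOBBY": "Cricket"},
    {"STU_ID": 59, "COURSE": "Physics", "HOBBY": "Hockey"},
]


def select(relation, predicate):
    """SELECT (sigma): keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """PROJECT (pi): keep the named attributes and drop duplicate tuples."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def rename(relation, mapping):
    """RENAME (rho): rename attributes according to `mapping`."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in relation]


dancers = select(students, lambda row: row["HOBBY"] == "Dancing")
print([row["STU_ID"] for row in dancers])                      # [21, 34]
print([row["STU_ID"] for row in project(students, ["STU_ID"])])  # [21, 34, 74, 59]
print(rename(dancers, {"STU_ID": "ID"})[0])
# {'ID': 21, 'COURSE': 'Computer', 'HOBBY': 'Dancing'}
```

Note that project removes the duplicate STU_ID 21, because a relation is a set of tuples.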
Examples
Union, Intersection, Difference
• Both tables must have the same number of attributes.
• Attribute domains need to be compatible.
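Because relations are sets of tuples, the three set operations map directly onto Python's set operators. The sketch below uses two hypothetical relations with schema (Student_id, Name) and a helper, same_arity, that checks only the first compatibility condition; domain compatibility is not checked here.

```python
def same_arity(r, s):
    """True if both relations have the same number of attributes.

    Assumes both relations are non-empty sets of tuples.
    """
    return {len(t) for t in r} == {len(t) for t in s}


# Hypothetical sample relations.
batch_a = {(101, "AK"), (102, "VK")}
batch_b = {(102, "VK"), (103, "Amrita")}

if same_arity(batch_a, batch_b):
    print(sorted(batch_a | batch_b))  # [(101, 'AK'), (102, 'VK'), (103, 'Amrita')]
    print(sorted(batch_a & batch_b))  # [(102, 'VK')]
    print(sorted(batch_a - batch_b))  # [(101, 'AK')]
```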
Cross Product (M*N)
Types of JOIN
• Various forms of join operation are:
• Inner joins:
EQUI join: A ⋈ (A.column = B.column) B
Natural join: A ⋈ B
• Outer joins:
Left Outer Join
Right Outer Join
Full Outer Join
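A join can be viewed as a cross product followed by a selection on the join condition. The sketch below illustrates that idea for the cross product, the natural join and the left outer join; relations are lists of dictionaries, the function names are hypothetical, and the students and marks rows are made-up sample data. Both relations are assumed to be non-empty.

```python
from itertools import product

# Hypothetical sample relations sharing the column "sid".
students = [{"sid": 1, "name": "AK"}, {"sid": 2, "name": "VK"}, {"sid": 3, "name": "Amrita"}]
marks = [{"sid": 1, "score": 80}, {"sid": 2, "score": 65}, {"sid": 4, "score": 90}]


def cross_product(r, s):
    """Every pairing of a tuple from r with a tuple from s (M * N pairs)."""
    return list(product(r, s))


def natural_join(r, s):
    """Inner join on all column names the two relations have in common."""
    common = set(r[0]) & set(s[0])
    return [{**a, **b} for a, b in product(r, s)
            if all(a[c] == b[c] for c in common)]


def left_outer_join(r, s):
    """Natural join that keeps unmatched left tuples, padded with None."""
    common = set(r[0]) & set(s[0])
    s_only = [c for c in s[0] if c not in common]
    result = []
    for a in r:
        matches = [b for b in s if all(a[c] == b[c] for c in common)]
        if matches:
            result.extend({**a, **b} for b in matches)
        else:
            result.append({**a, **{c: None for c in s_only}})
    return result


print(len(cross_product(students, marks)))  # 9
print(natural_join(students, marks))
# [{'sid': 1, 'name': 'AK', 'score': 80}, {'sid': 2, 'name': 'VK', 'score': 65}]
print(left_outer_join(students, marks)[-1])
# {'sid': 3, 'name': 'Amrita', 'score': None}
```

A right outer join is the same operation with the arguments swapped, and a full outer join keeps the unmatched tuples from both sides.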
Division
• When a query asks for "every" or "all", we use division.
• Example: find the students who have completed all the tasks.
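The "completed all the tasks" query can be sketched as follows. The relation of (student, task) pairs, the task list and the function name divide are all hypothetical, chosen only to illustrate what division returns.

```python
def divide(r, s):
    """Return the students in r that are paired with every task in s.

    r is a set of (student, task) pairs and s is a set of tasks.
    """
    students = {student for student, _ in r}
    return {st for st in students if all((st, task) in r for task in s)}


# Hypothetical sample data.
completed = {
    ("Asha", "T1"), ("Asha", "T2"),
    ("Ravi", "T1"),
    ("Meera", "T1"), ("Meera", "T2"), ("Meera", "T3"),
}
tasks = {"T1", "T2"}

print(sorted(divide(completed, tasks)))  # ['Asha', 'Meera']
```

Ravi is excluded because he has not completed T2; Meera qualifies even though she completed an extra task.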
