
DBMS Assignment

The document provides a comprehensive overview of databases, including their structure, abstraction levels, instances, schemas, and database languages such as SQL. It discusses the advantages and disadvantages of database systems, key concepts like keys in RDBMS, and various SQL operations for data management. Additionally, it introduces relational algebra as a procedural query language for efficient data manipulation.

Uploaded by

bhaveshtupe06

Detailed Key Points on Data Views, Abstraction, Instances, Schemas, and Database Languages

1. Database Overview:

o A database is a collection of interrelated data along with a set of programs to access or modify the data.

o It provides an abstract view of data storage and maintenance, allowing users to interact
without worrying about the underlying complexity.

o Purpose: Efficient data retrieval and modification while maintaining simplicity for the user.

2. Data Abstraction:

o Data Abstraction: Simplifies user interactions by retrieving only the required information
and hiding the technical details of storage and maintenance.

o There are three levels of abstraction:

1. Physical Level:

▪ Lowest level of abstraction.

▪ Describes how data is physically stored in the database, including complex low-level data structures like indexing, file organization, etc.

2. Logical Level:

▪ Middle level of abstraction.

▪ Describes what data is stored and the relationships among the data.

▪ Example: Type employee = record (empID, empname, dept_no, salary) defines the structure of an employee record.

▪ Used by database administrators to decide which data to store and how to organize it.

3. View Level:

▪ Highest level of abstraction.

▪ Describes only part of the database relevant to specific users or tasks.

▪ Provides multiple user-specific views, e.g.,:

▪ A reservation clerk views only passenger details, not entire database records.

▪ Customers can view employee names or IDs but not sensitive details like salaries.

3. Instances and Schemas:


o Schema:

▪ The overall design of a database, analogous to declaring variables in a program.

▪ Example: Schema for a student record = (studentID, studentName, DOB).

o Instance:

▪ The database's state at a particular moment, reflecting the data inserted, updated, or
deleted.

▪ Example: Instance of student record = {101, "John", "2003-04-15"}.

o Types of Schema:

1. Physical Schema: Design of the database at the physical storage level.

2. Logical Schema: Design at the logical level, defining the database's structure.

3. Subschema: User-specific views or parts of the database designed at the view level.

4. Database Languages:

o Two main types of languages in database systems:

1. Data Definition Language (DDL):

▪ Used to define and modify database schema.

▪ Common commands:

▪ CREATE: Creates tables, views, and structures.

▪ ALTER: Modifies existing structures (e.g., adding or deleting columns).

▪ DROP: Deletes tables, views, or structures.

▪ Example: CREATE TABLE employee (empID INT, empname VARCHAR(20));.

2. Data Manipulation Language (DML):

▪ Used for accessing and manipulating data within the database.

▪ Common operations:

▪ Retrieval of information.

▪ Insertion of new records.

▪ Deletion of records.

▪ Modification of existing records.

▪ Two types of DML:

▪ Procedural DML: Specifies both what data is needed and how to get
it.
▪ Declarative DML: Specifies what data is needed without specifying
how to get it.

▪ Example Query: SELECT empname FROM employee WHERE dept_no = 10;.

5. Key Examples:

o Physical Level: Employee, department, and customer records stored as blocks of consecutive
storage locations.

o Logical Level: Relationships like department linked to employees by dept_no.

o View Level: A customer service agent sees employee names and IDs but not sensitive
information like salaries.
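
The view-level example above can be sketched in a few lines of Python using the standard sqlite3 module. This is only an illustration under assumptions: the view name employee_public and the two sample rows are made up, and SQLite stands in for whatever DBMS is actually used.

```python
import sqlite3

# In-memory database; the sample rows and the view name are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (empID INTEGER, empname TEXT, dept_no INTEGER, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", 10, 52000.0), (2, "Ravi", 20, 48000.0)],
)

# View level: expose only the non-sensitive columns of the logical-level table.
conn.execute("CREATE VIEW employee_public AS SELECT empID, empname FROM employee")

cursor = conn.execute("SELECT * FROM employee_public ORDER BY empID")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)  # the salary column is not part of the view
print(view_rows)
```

A user who is given access only to employee_public can query names and IDs without ever seeing the salary column of the underlying table.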

Advantages of Database Systems:

1. Data Redundancy Control:

o Reduces duplication of data, saving storage and improving data consistency.

2. Data Sharing:

o Allows multiple users to access and share data simultaneously while maintaining integrity.

3. Backup and Recovery:

o Provides automatic backup and recovery mechanisms to safeguard data from loss or
corruption.

4. Integrity and Security:

o Enforces integrity constraints (e.g., primary keys, foreign keys) and restricts unauthorized
access to sensitive data.

Disadvantages of Database Systems:

1. High Initial Investment:

o Setting up database systems requires significant costs for software, hardware, and skilled
personnel.

2. Complexity:

o Managing and maintaining a database system is complex and requires specialized expertise.

3. Performance Overhead:

o Handling large volumes of data or complex queries may lead to slower performance if not
optimized properly.

4. Vulnerability to Attacks:
o Centralized data storage makes databases a target for cyberattacks if proper security
measures are not in place.

5. Simple Differences Between Database System and File System

Aspect             Database System                                        File System

Data Storage       Data is stored in structured tables.                   Data is stored in plain files or folders.
Redundancy         Redundancy is minimized.                               High redundancy due to duplicate data.
Data Access        Provides easy access through queries (e.g., SQL).      Requires writing custom programs for access.
Security           Offers strong security features (e.g., permissions).   Limited or no built-in security.
Concurrent Access  Handles multiple users simultaneously.                 Struggles with multiple users accessing files.

1. What is SQL?

• Definition: SQL (Structured Query Language) is used for managing and manipulating relational
databases.
• Key Components of SQL:
o DDL (Data Definition Language): Defines database structure (CREATE, ALTER, DROP).
o DML (Data Manipulation Language): Retrieves and modifies data (SELECT, INSERT, UPDATE, DELETE).
o DCL (Data Control Language): Manages user permissions (GRANT, REVOKE).
o TCL (Transaction Control Language): Manages transactions (COMMIT, ROLLBACK).
o View Definition: Creates virtual tables using views.
o Integrity Constraints: Ensures data validity.
o Embedded SQL: SQL can be integrated into programming languages like C, Java.

2. Basic Data Types in SQL

Data Type Description Example


CHAR(n) Fixed-length string CHAR(10)
VARCHAR(n) Variable-length string VARCHAR(50)
INT Integer values INT
NUMERIC(m, d) Fixed-point numbers NUMERIC(5,2) (e.g., 99.99)
SMALLINT Small integer values SMALLINT
REAL Floating-point numbers REAL
FLOAT(n) Floating-point with precision FLOAT(10)

3. Basic SQL Operations

1. Creating a Database

• Syntax:

CREATE DATABASE database_name;

• Example:

CREATE DATABASE Person_DB;


2. Creating a Table

• Syntax:

CREATE TABLE table_name (
column1 datatype,
column2 datatype,
...
);

• Example:

CREATE TABLE person_details (
AdharNo INT,
FirstName VARCHAR(20),
MiddleName VARCHAR(20),
LastName VARCHAR(20),
Address VARCHAR(30),
City VARCHAR(10)
);

Resulting empty table:

AdharNo FirstName MiddleName LastName Address City
(Blank) (Blank) (Blank) (Blank) (Blank) (Blank)

3. Inserting Data

• Syntax:

INSERT INTO table_name (col1, col2, ...)
VALUES (value1, value2, ...);

• Example:

INSERT INTO person_details (AdharNo, FirstName, MiddleName, LastName, Address, City)
VALUES (111, 'AAA', 'BBB', 'CCC', 'M.G. Road', 'Pune');

Table after the insert:

AdharNo FirstName MiddleName LastName Address City
111 AAA BBB CCC M.G. Road Pune

4. Selecting Data

• Syntax:

SELECT column1, column2 FROM table_name;

• Example:

SELECT AdharNo, FirstName, Address, City FROM person_details;

• Select all records:

SELECT * FROM person_details;

5. Filtering Data with WHERE


• Syntax:

SELECT column1, column2 FROM table_name WHERE condition;

• Example:

SELECT * FROM person_details WHERE City = 'Pune';

6. Updating Data

• Syntax:

UPDATE table_name
SET column1 = value1, column2 = value2
WHERE condition;

• Example:

UPDATE person_details
SET City = 'Chennai'
WHERE AdharNo = 111;

7. Deleting Data

• Syntax:

DELETE FROM table_name WHERE condition;

• Example:

DELETE FROM person_details WHERE AdharNo = 111;

• Delete all records:

DELETE FROM person_details;
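
The operations above (create, insert, select, update, delete) can be tried end to end with Python's built-in sqlite3 module. This is a rough sketch, not the only way to run them: SQLite is assumed as the engine, and the second row (AdharNo 222) is a made-up record added so that the WHERE filter has something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Creating a table (same columns as the notes' example).
conn.execute(
    """CREATE TABLE person_details (
           AdharNo INT, FirstName VARCHAR(20), MiddleName VARCHAR(20),
           LastName VARCHAR(20), Address VARCHAR(30), City VARCHAR(10))"""
)

# Inserting data; the second row is hypothetical sample data.
conn.executemany(
    "INSERT INTO person_details VALUES (?, ?, ?, ?, ?, ?)",
    [
        (111, "AAA", "BBB", "CCC", "M.G. Road", "Pune"),
        (222, "DDD", "EEE", "FFF", "Link Road", "Mumbai"),
    ],
)

# Selecting with a WHERE filter.
pune_rows = conn.execute(
    "SELECT AdharNo, City FROM person_details WHERE City = ?", ("Pune",)
).fetchall()

# Updating one row, then reading the new value back.
conn.execute("UPDATE person_details SET City = 'Chennai' WHERE AdharNo = 111")
city_after_update = conn.execute(
    "SELECT City FROM person_details WHERE AdharNo = 111"
).fetchone()[0]

# Deleting one row; the other row stays.
conn.execute("DELETE FROM person_details WHERE AdharNo = 111")
remaining = conn.execute("SELECT COUNT(*) FROM person_details").fetchone()[0]

print(pune_rows, city_after_update, remaining)
```

Passing values through ? placeholders instead of building the SQL string by hand is the usual way to avoid quoting mistakes.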

4. Logical Operators

Operator  Description                                            Example

AND       Returns records where both conditions are true         SELECT * FROM person_details WHERE City='Pune' AND AdharNo=111;
OR        Returns records where at least one condition is true   SELECT * FROM person_details WHERE City='Pune' OR City='Mumbai';
NOT       Returns records where the condition is false           SELECT * FROM person_details WHERE NOT City='Pune';

5. Sorting Data with ORDER BY

• Syntax:

SELECT column1, column2 FROM table_name ORDER BY column_name ASC | DESC;


• Example (Descending Order by AdharNo):

SELECT * FROM person_details ORDER BY AdharNo DESC;

6. Altering a Table

• Add a column:

ALTER TABLE table_name ADD column_name datatype;

• Example:

ALTER TABLE person_details ADD Email VARCHAR(30);

• Delete a column:

ALTER TABLE table_name DROP COLUMN column_name;

• Example:

ALTER TABLE person_details DROP COLUMN Address;

Conclusion

SQL is a powerful language used to manage relational databases efficiently. It includes:

• DDL (CREATE, ALTER, DROP)
• DML (SELECT, INSERT, UPDATE, DELETE)
• DCL (GRANT, REVOKE)
• TCL (COMMIT, ROLLBACK)

Keys in Relational Database Management System (RDBMS)

Keys are crucial elements in Relational Database Management Systems (RDBMS) that ensure
uniqueness, integrity, and relationships among tables. They help identify records uniquely and maintain
data consistency.

Types of Keys in SQL

1. Super Key (SK)


2. Candidate Key (CK)
3. Primary Key (PK)
4. Alternate Key (AK)
5. Foreign Key (FK)

1. Super Key (SK)


Definition

A super key is a set of one or more attributes that can uniquely identify each tuple (row) in a table. A super
key may contain extra attributes that are not necessary for unique identification.

Example

Consider a Student table:

RegNo RollNo Name Phone Marks


101 A01 John 9999999999 85
102 A02 Alice 8888888888 90

Possible Super Keys:

• {RegNo}
• {RollNo, Phone}
• {RollNo, Name, Phone}
• {RegNo, Name, Phone}

Note: {Name, Marks} is NOT a super key because two students can have the same name and marks,
meaning it does not guarantee uniqueness.

2. Candidate Key (CK)


Definition

A candidate key is a minimal super key: a set of attributes that uniquely identifies each tuple and from
which no attribute can be removed without losing uniqueness. A candidate key cannot have redundant attributes.

Example

From the super keys, the minimal sets that uniquely identify each record are:

• {RegNo}
• {RollNo, Phone}

These are candidate keys because:

• RegNo alone is enough to identify a student.
• RollNo and Phone together uniquely identify a student (this assumes neither RollNo nor Phone is unique on its own; otherwise the pair would not be minimal).

Every candidate key is a super key, but not every super key is a candidate key.
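
As a rough illustration of "unique" versus "minimal", the Python sketch below searches a small table for attribute sets with no duplicate values and keeps only the minimal ones. The first two rows come from the Student table above; the last two rows are hypothetical additions so that RollNo and Phone repeat individually. A sample of rows can only rule keys out; the real keys are a design decision about all possible data.

```python
from itertools import combinations

ATTRS = ("RegNo", "RollNo", "Name", "Phone", "Marks")
rows = [
    (101, "A01", "John", "9999999999", 85),
    (102, "A02", "Alice", "8888888888", 90),
    # Hypothetical extra rows (not in the notes) so single attributes repeat.
    (103, "A01", "John", "7777777777", 85),
    (104, "A03", "Alice", "8888888888", 90),
]


def is_unique(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    idx = [ATTRS.index(a) for a in attrs]
    projected = {tuple(row[i] for i in idx) for row in rows}
    return len(projected) == len(rows)


def minimal_unique_sets(rows):
    """Attribute sets that are unique in this sample and have no unique proper subset."""
    found = []
    for size in range(1, len(ATTRS) + 1):
        for combo in combinations(ATTRS, size):
            if any(set(key) <= set(combo) for key in found):
                continue  # contains a smaller unique set, so it is not minimal
            if is_unique(rows, combo):
                found.append(combo)
    return found


print(is_unique(rows, ("Name", "Marks")))  # False: (John, 85) appears twice
print(minimal_unique_sets(rows))
```

On this sample the minimal unique sets are {RegNo} and {RollNo, Phone}, matching the candidate keys listed above, while {RegNo, Name, Phone} is unique but not minimal.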

3. Primary Key (PK)


Definition

A primary key is a candidate key selected to uniquely identify tuples in a table.


Properties of Primary Key

Uniqueness – The primary key uniquely identifies each row.


Non-nullability – It cannot contain NULL values.
Single per Table – Only one primary key is allowed per table.

Example

Let’s define a Student table with a primary key:

CREATE TABLE Student (


RegNo INT PRIMARY KEY,
RollNo VARCHAR(10),
Name VARCHAR(50),
Phone VARCHAR(15),
Marks INT
);

Here, RegNo is the primary key.

Other possible primary keys:

• {RollNo, Phone} could also serve as a primary key.


• However, only one primary key is selected by the database designer.

4. Alternate Key (AK)


Definition

An alternate key is a candidate key that is not chosen as the primary key.

Example

If we select RegNo as the primary key, then {RollNo, Phone} becomes the alternate key.

Implementation in SQL
ALTER TABLE Student
ADD CONSTRAINT unique_roll_phone UNIQUE (RollNo, Phone);

Here, we define {RollNo, Phone} as an alternate key by enforcing uniqueness.
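
The effect of the UNIQUE constraint can be checked with a small Python sketch using sqlite3. One assumption to note: SQLite does not accept ALTER TABLE ... ADD CONSTRAINT, so the constraint is declared inside CREATE TABLE here; the rows for student 103 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Student (
           RegNo INT PRIMARY KEY,
           RollNo VARCHAR(10),
           Name VARCHAR(50),
           Phone VARCHAR(15),
           Marks INT,
           CONSTRAINT unique_roll_phone UNIQUE (RollNo, Phone))"""
)
conn.execute("INSERT INTO Student VALUES (101, 'A01', 'John', '9999999999', 85)")

# Same (RollNo, Phone) pair as student 101: the alternate key must reject it.
try:
    conn.execute("INSERT INTO Student VALUES (103, 'A01', 'Bob', '9999999999', 70)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Same RollNo but a different Phone: the pair is new, so the row is accepted.
conn.execute("INSERT INTO Student VALUES (103, 'A01', 'Bob', '7777777777', 70)")
student_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]

print(duplicate_rejected, student_count)
```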

5. Foreign Key (FK)


Definition

A foreign key is an attribute (or a group of attributes) in one table that references the primary key
(or another unique key) in another table.

• It establishes a relationship between two tables.


• The table containing the foreign key is called the child table.
• The table containing the primary key is called the parent table.

Example

Let’s say we have two tables:

Student Table (Parent Table)

RegNo (PK) Name


101 John
102 Alice

Course Table (Child Table)

CourseID (PK) CourseName StudentRegNo (FK)


CS101 ComputerSci 101
CS102 Math 102

Here, StudentRegNo in the Course table is a foreign key referencing RegNo in the Student table.

SQL Code for Foreign Key


CREATE TABLE Course (
CourseID VARCHAR(10) PRIMARY KEY,
CourseName VARCHAR(50),
StudentRegNo INT,
FOREIGN KEY (StudentRegNo) REFERENCES Student(RegNo)
);

If a foreign key is used, we cannot insert a StudentRegNo in the Course table unless it already
exists in the Student table.
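
The rule above can be observed with Python's sqlite3 module. This is a sketch under one important assumption: SQLite only enforces foreign keys after PRAGMA foreign_keys = ON is issued on the connection (other systems enforce them by default). The course CS999 and student 999 are made-up values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement

conn.execute("CREATE TABLE Student (RegNo INT PRIMARY KEY, Name VARCHAR(50))")
conn.execute(
    """CREATE TABLE Course (
           CourseID VARCHAR(10) PRIMARY KEY,
           CourseName VARCHAR(50),
           StudentRegNo INT,
           FOREIGN KEY (StudentRegNo) REFERENCES Student(RegNo))"""
)
conn.execute("INSERT INTO Student VALUES (101, 'John')")

# Parent row 101 exists, so this child row is accepted.
conn.execute("INSERT INTO Course VALUES ('CS101', 'ComputerSci', 101)")

# No student 999 exists, so the insert violates the foreign key.
try:
    conn.execute("INSERT INTO Course VALUES ('CS999', 'Orphan', 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

course_count = conn.execute("SELECT COUNT(*) FROM Course").fetchone()[0]
print(orphan_rejected, course_count)
```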

Data Control Language (DCL) is used to manage user access and permissions in a database. It is
primarily used for granting and revoking privileges on database objects such as tables, views, and
procedures.

DCL Commands

1. GRANT – Assigns privileges to users.


2. REVOKE – Removes previously granted privileges.

1. GRANT Command
The GRANT statement allows the database administrator to give permissions to users or roles.

Syntax
GRANT privilege(s) ON table_name TO user_name;

Example
Let's say we have a "Students" table:

StudentID Name Age Marks


101 John 20 85
102 Alice 22 90

Now, if we want to allow user 'bhavesh' to SELECT and INSERT data into the "Students" table, we use:

GRANT SELECT, INSERT ON Students TO 'bhavesh';

This means:

• Bhavesh can read (SELECT) the data from the "Students" table.
• Bhavesh can insert (INSERT) new records into the "Students" table.

2. REVOKE Command
The REVOKE statement is used to take back privileges from a user or role.

Syntax
REVOKE privilege(s) ON table_name FROM user_name;

Example

If we want to remove the INSERT permission from Bhavesh, we use:

REVOKE INSERT ON Students FROM 'bhavesh';

Now, Bhavesh:

• Can still SELECT data.


• Cannot INSERT new records anymore.

Relational Algebra in DBMS

Seminar Document

1. Introduction

Relational algebra is a procedural query language used in relational databases. It provides a set of
operations to retrieve and manipulate data efficiently.

2. Types of Relational Algebra Operations

2.1 Selection Operation (σ)

• The selection operation retrieves rows (tuples) that satisfy a given condition.

• Notation: σ condition (Relation)

• Example:
Given LOAN table:

BRANCH_NAME LOAN_NO AMOUNT

Downtown L-17 1000

Redwood L-23 2000

Perryride L-15 1500

Downtown L-14 1500

Perryride L-16 1300

Query: Select all loans from the "Perryride" branch.

Input:

σ BRANCH_NAME="Perryride" (LOAN)

Output:

BRANCH_NAME LOAN_NO AMOUNT

Perryride L-15 1500

Perryride L-16 1300

2.2 Projection Operation (∏)

• The projection operation selects specific columns (attributes) from a table.

• Notation: ∏ column1, column2 (Relation)

• Example:

Given CUSTOMER table:

NAME STREET CITY

Jones Main Harrison

Smith North Rye

Hays Main Harrison

Curry North Rye

Johnson Alma Brooklyn

Query: Show only "NAME" and "CITY" columns.


Input:

∏ NAME, CITY (CUSTOMER)

Output:

NAME CITY

Jones Harrison

Smith Rye

Hays Harrison

Curry Rye

Johnson Brooklyn
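
Selection and projection are easy to mimic in plain Python, which may help make the notation concrete. The sketch below represents each relation as a list of dictionaries; the helper names select and project are illustrative and not part of any library.

```python
LOAN = [
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-17", "AMOUNT": 1000},
    {"BRANCH_NAME": "Redwood", "LOAN_NO": "L-23", "AMOUNT": 2000},
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-15", "AMOUNT": 1500},
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-14", "AMOUNT": 1500},
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-16", "AMOUNT": 1300},
]
CUSTOMER = [
    {"NAME": "Jones", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Smith", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Hays", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Curry", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Johnson", "STREET": "Alma", "CITY": "Brooklyn"},
]


def select(relation, predicate):
    """Selection: keep the rows for which the predicate is true."""
    return [row for row in relation if predicate(row)]


def project(relation, attrs):
    """Projection: keep only the named columns and drop duplicate rows."""
    result = []
    for row in relation:
        narrowed = {a: row[a] for a in attrs}
        if narrowed not in result:
            result.append(narrowed)
    return result


perryride_loans = select(LOAN, lambda row: row["BRANCH_NAME"] == "Perryride")
name_city = project(CUSTOMER, ["NAME", "CITY"])
cities = project(CUSTOMER, ["CITY"])

print([row["LOAN_NO"] for row in perryride_loans])
print(cities)
```

Projecting onto CITY alone returns three rows rather than five, because a relation is a set and duplicates are removed.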

2.3 Union Operation (∪)

• Union merges two relations that have the same attributes (union-compatible relations) and removes duplicates.

• Notation: R ∪ S

• Example:

Given DEPOSITOR table:

CUSTOMER_NAME ACCOUNT_NO

Johnson A-101

Smith A-121

Mayes A-321

Given BORROW table:

CUSTOMER_NAME LOAN_NO

Jones L-17

Smith L-23

Query: Find all unique customers who have either a loan or an account.

Input:

∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)

Output:

CUSTOMER_NAME

Johnson

Smith

Mayes

Jones

2.4 Set Intersection (∩)

• Returns only the common tuples from two relations.

• Notation: R ∩ S

• Example:

Query: Find customers who have both loans and accounts.

Input:

∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR)

Output:

CUSTOMER_NAME

Smith

2.5 Set Difference (-)

• Returns tuples present in one table but not in another.

• Notation: R - S

• Example:

Query: Find customers who have loans but do not have accounts.

Input:

∏ CUSTOMER_NAME (BORROW) - ∏ CUSTOMER_NAME (DEPOSITOR)


Output:

CUSTOMER_NAME

Jones
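
Because relations are sets of tuples, the three set operations map directly onto Python's built-in set type. The sketch below first projects CUSTOMER_NAME out of each table and then applies union, intersection, and difference; the variable names are illustrative.

```python
DEPOSITOR = [("Johnson", "A-101"), ("Smith", "A-121"), ("Mayes", "A-321")]
BORROW = [("Jones", "L-17"), ("Smith", "L-23")]

# Projection onto CUSTOMER_NAME (the first column of each tuple).
depositor_names = {name for name, _account in DEPOSITOR}
borrower_names = {name for name, _loan in BORROW}

either = borrower_names | depositor_names       # union
both = borrower_names & depositor_names         # intersection
loan_only = borrower_names - depositor_names    # difference

print(sorted(either))
print(sorted(both))
print(sorted(loan_only))
```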

2.6 Cartesian Product (X)

• Combines every row of one table with every row of another.

• Notation: R X S

• Example:

Given EMPLOYEE table:

EMP_ID EMP_NAME

1 Smith

2 Harry

Given DEPARTMENT table:

DEPT_NO DEPT_NAME

A Marketing

B Sales

Query: Get all possible employee-department combinations.

Input:

EMPLOYEE X DEPARTMENT

Output:

EMP_ID EMP_NAME DEPT_NO DEPT_NAME

1 Smith A Marketing

1 Smith B Sales

2 Harry A Marketing

2 Harry B Sales

2.7 Rename Operation (ρ)

• Renames a relation for better readability.


• Notation: ρ(new_name, relation)

• Example:

Input:

ρ (STUDENT1, STUDENT)

3. Join Operations in DBMS

3.1 Natural Join (⋈)

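• Combines tuples from two relations that have equal values on every attribute the relations share; each shared attribute appears once in the result.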

• Example:

Given EMPLOYEE table:

EMP_CODE EMP_NAME

101 Stephan

102 Jack

Given SALARY table:

EMP_CODE SALARY

101 50000

102 30000

Query: Find employee names along with their salaries.

Input:

EMPLOYEE ⋈ SALARY

Output:

EMP_CODE EMP_NAME SALARY

101 Stephan 50000

102 Jack 30000
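
A natural join can be sketched in Python as a nested loop that keeps pairs of rows agreeing on every shared column. The helper name natural_join is illustrative; the sketch assumes every row in a relation has the same columns. When the two relations share no column, the same function returns every combination of rows, i.e. the Cartesian product.

```python
EMPLOYEE = [
    {"EMP_CODE": 101, "EMP_NAME": "Stephan"},
    {"EMP_CODE": 102, "EMP_NAME": "Jack"},
]
SALARY = [
    {"EMP_CODE": 101, "SALARY": 50000},
    {"EMP_CODE": 102, "SALARY": 30000},
]
DEPARTMENT = [
    {"DEPT_NO": "A", "DEPT_NAME": "Marketing"},
    {"DEPT_NO": "B", "DEPT_NAME": "Sales"},
]


def natural_join(left, right):
    """Join rows that agree on all column names the two relations share."""
    if not left or not right:
        return []
    common = [col for col in left[0] if col in right[0]]
    return [
        {**l, **r}  # shared columns are equal, so merging keeps one copy
        for l in left
        for r in right
        if all(l[col] == r[col] for col in common)
    ]


joined = natural_join(EMPLOYEE, SALARY)
all_pairs = natural_join(EMPLOYEE, DEPARTMENT)  # no shared column: Cartesian product

print(joined)
print(len(all_pairs))
```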


3.2 Outer Join (⟕, ⟖, ⟗)

Outer joins help retain unmatched records from one or both tables.

a. Left Outer Join (⟕)

• Includes all records from the left table and matching records from the right.

• If there is no match, NULL is placed in the right table's columns.

Example:

Given EMPLOYEE table:

EMP_NAME CITY

Ram Mumbai

Shyam Kolkata

Ravi Delhi

Given FACTORY_WORKERS table:

EMP_NAME FACTORY SALARY

Ram Infosys 10000

Shyam Wipro 20000

Hari TCS 50000

Query: Perform a left outer join between EMPLOYEE and FACTORY_WORKERS.

Input:

EMPLOYEE ⟕ FACTORY_WORKERS

Output:

EMP_NAME CITY FACTORY SALARY

Ram Mumbai Infosys 10000

Shyam Kolkata Wipro 20000

Ravi Delhi NULL NULL

Note: "Ravi" does not exist in FACTORY_WORKERS, so NULL is placed in the missing columns.

b. Right Outer Join (⟖)


• Includes all records from the right table and matching records from the left.

• If there is no match, NULL is placed in the left table's columns.

Example:

Query: Perform a right outer join between EMPLOYEE and FACTORY_WORKERS.

Input:

EMPLOYEE ⟖ FACTORY_WORKERS

Output:

EMP_NAME CITY FACTORY SALARY

Ram Mumbai Infosys 10000

Shyam Kolkata Wipro 20000

Hari NULL TCS 50000

Note: "Hari" exists in FACTORY_WORKERS but not in EMPLOYEE, so NULL is placed in the missing
column.

c. Full Outer Join (⟗)

• Includes all records from both tables.

• If there is no match, NULL is placed in the missing columns.

Example:

Query: Perform a full outer join between EMPLOYEE and FACTORY_WORKERS.

Input:

EMPLOYEE ⟗ FACTORY_WORKERS

Output:

EMP_NAME CITY FACTORY SALARY

Ram Mumbai Infosys 10000

Shyam Kolkata Wipro 20000

Ravi Delhi NULL NULL

Hari NULL TCS 50000

Note: Both "Ravi" and "Hari" had missing values, so NULL is placed accordingly.
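
The three outer joins can be reproduced with a short Python sketch in which None plays the role of NULL. The function names and the explicit lists of columns to pad are illustrative choices, not a standard API.

```python
EMPLOYEE = [
    {"EMP_NAME": "Ram", "CITY": "Mumbai"},
    {"EMP_NAME": "Shyam", "CITY": "Kolkata"},
    {"EMP_NAME": "Ravi", "CITY": "Delhi"},
]
FACTORY_WORKERS = [
    {"EMP_NAME": "Ram", "FACTORY": "Infosys", "SALARY": 10000},
    {"EMP_NAME": "Shyam", "FACTORY": "Wipro", "SALARY": 20000},
    {"EMP_NAME": "Hari", "FACTORY": "TCS", "SALARY": 50000},
]


def left_outer_join(left, right, key, right_attrs):
    """Keep every left row; pad with None when the right side has no match."""
    result = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        if matches:
            result.extend({**l, **m} for m in matches)
        else:
            result.append({**l, **{a: None for a in right_attrs}})
    return result


def full_outer_join(left, right, key, left_attrs, right_attrs):
    """Left outer join plus the right rows that matched nothing on the left."""
    result = left_outer_join(left, right, key, right_attrs)
    left_keys = {l[key] for l in left}
    for r in right:
        if r[key] not in left_keys:
            result.append({**{a: None for a in left_attrs}, **r})
    return result


left_result = left_outer_join(EMPLOYEE, FACTORY_WORKERS, "EMP_NAME", ["FACTORY", "SALARY"])
# A right outer join is a left outer join with the operands swapped.
right_result = left_outer_join(FACTORY_WORKERS, EMPLOYEE, "EMP_NAME", ["CITY"])
full_result = full_outer_join(
    EMPLOYEE, FACTORY_WORKERS, "EMP_NAME", ["CITY"], ["FACTORY", "SALARY"]
)

for row in full_result:
    print(row)
```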
4. Conclusion

• Relational algebra forms the foundation of SQL queries.

• Understanding these operations helps in query optimization and database manipulation.

Functional Dependency in DBMS

What is Functional Dependency?

Functional dependency is a key concept in relational database management systems (RDBMS) that defines
a relationship between two attributes, or two sets of attributes, in a table. It indicates that the value of
one attribute (or set) uniquely determines the value of another.

Notation:

If an attribute X uniquely determines another attribute Y, it is written as:


X→Y
Here:

• X is called the Determinant.

• Y is called the Dependent.

Functional dependencies are crucial for designing efficient database schemas, ensuring data integrity, and
eliminating redundancy.

Example of Functional Dependency

Consider the following Student table:

roll_no name dept_name dept_building

42 abc CO A4

43 pqr IT A3

44 xyz CO A4

45 xyz IT A3

46 mno EC B2

47 jkl ME B2

From this table, we can determine the following valid functional dependencies:

1. roll_no → {name, dept_name, dept_building}

o Each student has a unique roll_no that determines their name, dept_name, and
dept_building.
2. roll_no → dept_name

o Since roll_no determines the entire set {name, dept_name, dept_building}, it also
determines its subset, dept_name.

3. dept_name → dept_building

o Each department is associated with a specific building, so dept_name determines dept_building.

Invalid Functional Dependencies

• name → dept_name

o Students with the same name can be in different departments, so this dependency is not
valid.

• dept_building → dept_name

o Multiple departments can be in the same building (ME and EC are in B2), making this
dependency invalid.
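
Whether a functional dependency is consistent with a given table can be checked mechanically: X → Y fails as soon as two rows agree on X but differ on Y. The Python sketch below applies that test to the Student table above; the function name fd_holds is illustrative. Passing the test on sample data does not prove the dependency holds in general, it only shows the sample does not contradict it.

```python
COLUMNS = ("roll_no", "name", "dept_name", "dept_building")
student = [
    dict(zip(COLUMNS, values))
    for values in [
        (42, "abc", "CO", "A4"),
        (43, "pqr", "IT", "A3"),
        (44, "xyz", "CO", "A4"),
        (45, "xyz", "IT", "A3"),
        (46, "mno", "EC", "B2"),
        (47, "jkl", "ME", "B2"),
    ]
]


def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs while differing on rhs."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


print(fd_holds(student, ["roll_no"], ["name", "dept_name", "dept_building"]))
print(fd_holds(student, ["dept_name"], ["dept_building"]))
print(fd_holds(student, ["name"], ["dept_name"]))           # xyz is in CO and IT
print(fd_holds(student, ["dept_building"], ["dept_name"]))  # B2 hosts EC and ME
```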

Types of Functional Dependencies in DBMS

1. Trivial Functional Dependency

A functional dependency X → Y is trivial if Y is a subset of X.

Example:

• ABC → AB

• roll_no, name → name (Since name is already part of {roll_no, name}, it's trivial)

2. Non-Trivial Functional Dependency

A functional dependency X → Y is non-trivial if Y is not a subset of X.

Example:

• roll_no → name (name is not a subset of roll_no)

• name → DOB (non-trivial in form, because DOB is not a subset of name; whether it actually holds depends on the data)

3. Semi Non-Trivial Functional Dependency

A dependency X → Y is semi non-trivial when part of Y is included in X, but not all.

Example:

Student_ID Course_ID Course_Name

101 CSE101 Computer Science

102 CSE102 Data Structures

103 CSE101 Computer Science


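{Student_ID, Course_ID} → {Course_ID, Course_Name} is semi non-trivial because:

• Course_ID is already part of the determinant {Student_ID, Course_ID}, but Course_Name is not. (By contrast, {Student_ID, Course_ID} → Course_ID on its own would be a trivial dependency.)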

4. Multivalued Functional Dependency

A dependency X →→ Y is multivalued if, for each value of X, the set of associated Y values is independent of the values of the remaining attributes.

Example:

bike_model manuf_year color

tu1001 2007 Black

tu1001 2007 Red

tu2012 2008 Black

tu2012 2008 Red

• bike_model →→ color (Independent from manuf_year)

• bike_model →→ manuf_year (Independent from color)

5. Transitive Functional Dependency

A dependency X → Z is transitive if X → Y and Y → Z exist.

Example:

enrol_no name dept building_no

42 abc CO 4

43 pqr EC 2

44 xyz IT 1

• enrol_no → dept

• dept → building_no

• Thus, enrol_no → building_no (Transitive Dependency)
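
Transitive dependencies like this one are usually derived by computing the closure of an attribute set: keep adding the right-hand side of every dependency whose left-hand side is already covered. The Python sketch below does this for the two dependencies listed above; the function name closure and the tuple encoding of dependencies are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the determinant is already covered, its dependents follow.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The two dependencies stated above: enrol_no -> dept and dept -> building_no.
fds = [
    (("enrol_no",), ("dept",)),
    (("dept",), ("building_no",)),
]

enrol_closure = closure({"enrol_no"}, fds)
dept_closure = closure({"dept"}, fds)

print(sorted(enrol_closure))
print(sorted(dept_closure))
```

Since building_no appears in the closure of enrol_no only by way of dept, enrol_no → building_no is a transitive dependency.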

6. Fully Functional Dependency

A dependency X → Y is fully functional if removing any part of X makes it invalid.

Example:

Emp_ID Project_ID Project_Name

101 P1 Alpha

102 P2 Beta

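• {Emp_ID, Project_ID} → Project_Name is fully functional only if neither Emp_ID nor Project_ID alone determines Project_Name. In many real schemas Project_ID → Project_Name holds on its own, which would make this a partial dependency instead; an attribute such as the hours an employee logs on a project (not shown in the table) is a more typical fully dependent attribute.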

7. Partial Functional Dependency


A partial functional dependency occurs when a non-key attribute depends on part of a composite key.

Example:

Student_ID Course_ID Instructor

1 CS101 Prof. A

2 CS102 Prof. B

• {Student_ID, Course_ID} → Instructor holds for the composite key {Student_ID, Course_ID}.

• If Course_ID → Instructor also holds, the dependency is partial, because Instructor depends on only part of the key (Course_ID).

Conclusion

Functional dependencies play a crucial role in database normalization and ensure data consistency and
accuracy. They help identify redundancies and anomalies, leading to better database design.

This topic is fundamental for database normalization (1NF, 2NF, 3NF, BCNF, etc.), ensuring minimal
redundancy and efficient querying.

ER Diagrams in DBMS
An Entity-Relationship (ER) Diagram represents the logical structure of a database graphically. It is used
to model real-world objects (entities) and their relationships in a structured manner.

Features of ER Model
1. Graphical Representation – ER diagrams visually represent database relationships.
2. Real-World Modeling – They model objects like persons, companies, projects, etc.
3. No Technical Knowledge Required – Even a non-technical user can understand ER diagrams.
4. Ease of Conversion – ER diagrams can be easily converted into relational tables.
5. Standard Notation – ER diagrams provide a structured way to represent logical relationships.

Components of an ER Diagram
1. Entity

An entity represents a real-world object that has attributes.

• Strong Entity: Has a primary key that uniquely identifies records.


• Weak Entity: Does not have enough attributes of its own to form a primary key and depends on a strong entity.

Example of Entities

+-------------+
| Student |
+-------------+
| Student_ID |
| Name |
| Age |
+-------------+

+------------+
| Course |
+------------+
| Course_ID |
| Course_Name|
+------------+

2. Relationship

A relationship represents an association between two or more entities.

Example of Relationship

(Student) ---- enrolls in ----> (Course)

Mapping Cardinality in ER Diagrams


Cardinality defines the number of instances of an entity that can be associated with instances of another
entity.

1. One-to-One (1:1)

• Definition: Each entity in set A is related to at most one entity in set B.


• Example: A project manager manages only one project.

(Project_Manager) ---- (Manages) ----> (Project)

ER Diagram:

+-----------------+ +-------------+
| Project_Manager | | Project |
+-----------------+ +-------------+
| 1 | 1
---------------------

2. One-to-Many (1:M)

• Definition: A single entity in set A can be related to multiple entities in set B.


• Example: A customer can place multiple orders.

(Customer) ---- places ----> (Order)

ER Diagram:

+-----------+ +--------+
| Customer | | Order |
+-----------+ +--------+
| 1 | M
-------------------

3. Many-to-One (M:1)
• Definition: Multiple entities in set A can be related to one entity in set B.
• Example: Many students enroll in one Computer Science course.

(Student) ---- enrolls in ----> (ComputerSciCourse)

ER Diagram:

+----------+ +--------------------+
| Student | | ComputerSciCourse |
+----------+ +--------------------+
| M | 1
------------------------

4. Many-to-Many (M:N)

• Definition: Multiple entities in set A can be related to multiple entities in set B.


• Example: Many teachers can teach many students.

(Teacher) ---- teaches ----> (Student)

ER Diagram:

+----------+ +--------+
| Teacher | | Student|
+----------+ +--------+
| M | N
-------------------

Ternary Relationship (3-ary)


• A ternary relationship connects three entities.
• Example: A customer buys a product from a supplier, where price depends on both the product
and supplier.

ER Diagram for Ternary Relationship:

+-----------+     +---------+     +----------+
| Customer  |     | Product |     | Supplier |
+-----------+     +---------+     +----------+
      |                |                |
      +------------ (buys) -------------+
                       |
                    [Price]

Weak Entity Set


A weak entity:

• Cannot be uniquely identified by its own attributes.


• Requires a strong entity (with a primary key) for identification.

Example:
A Player needs a Team to be identified uniquely.

(Player) ---- belongs to ----> (Team)


ER Diagram for Weak Entity:

+--------+ +-------+
| Player | | Team |
+--------+ +-------+
| Name | | Team_ID |
| Number | | Name |
+--------+ +-------+
| | 1
-------------------
| M (Weak Entity)

Weak Entity Rules

1. A weak entity set has one or more many-one relationships to a supporting entity set.
2. A weak entity’s key is formed using its own attributes + supporting entity’s primary key.
o Example: The key for Player = (Player Number, Team_ID), i.e. the player's own number combined with the primary key of Team.

Conclusion
• ER diagrams visually represent the logical structure of a database.
• Different types of relationships (1:1, 1:M, M:1, M:N, ternary) define how entities interact.
• Weak entities depend on strong entities for unique identification.

Normalization

Normalization is the process of organizing data in a database to reduce redundancy and ensure data
integrity. It involves dividing large tables into smaller ones and defining relationships between them.

Need for Normalization

1. Eliminates redundancy – Prevents storing the same data multiple times.

2. Reduces data errors – Ensures consistency in the database.

3. Saves storage space – Optimizes data storage.

4. Improves performance – Inserts, updates, and deletes touch less duplicated data, although some read queries may need extra joins.

5. Enhances data integrity – Maintains logical relationships between data.

Example: Normalization on Student Data

Unnormalized Table (UNF)

Consider the following Student table:

StudentID Name Course Instructor Phone Numbers

101 Alex Math Mr. John 9876543210, 9123456789



102 Ben Science Ms. Rose 9765432101

103 Alex Physics Mr. Smith 9876543210, 9123456789

Issues:

• Multiple values in the "Phone Numbers" column (violates atomicity).

• Repetition of student details.

First Normal Form (1NF)

Rule: No repeating or multi-valued attributes (each field must contain atomic values).

StudentID Name Course Instructor Phone Number

101 Alex Math Mr. John 9876543210

101 Alex Math Mr. John 9123456789

102 Ben Science Ms. Rose 9765432101

103 Alex Physics Mr. Smith 9876543210

103 Alex Physics Mr. Smith 9123456789

Fixed multi-valued attribute issue by creating separate rows for each phone number.

Second Normal Form (2NF)

Rule: Must be in 1NF, and no partial dependencies (all non-key attributes must be fully dependent on the
entire primary key).

• Here, (StudentID, Course) is treated as the composite primary key.

• Name depends only on StudentID, and Instructor depends only on Course, not on the whole key → Partial dependencies.

Decomposed Tables:

Student Table

StudentID Name

101 Alex

102 Ben

103 Alex

Course Table
Course Instructor

Math Mr. John

Science Ms. Rose

Physics Mr. Smith

Student_Course Table

StudentID Course

101 Math

102 Science

103 Physics

Fixed the partial dependencies by moving student details and course details into their own tables.

Third Normal Form (3NF)

Rule: Must be in 2NF, and no transitive dependencies (non-key attributes should not depend on another
non-key attribute).

• Phone Number depends on StudentID.

• Instructor depends on Course (already handled in 2NF).

Decomposed Tables:

Student Table (Same as 2NF)

StudentID Name

101 Alex

102 Ben

103 Alex

Course Table (Same as 2NF)

Course Instructor

Math Mr. John

Science Ms. Rose

Physics Mr. Smith

Student_Course Table (Same as 2NF)


StudentID Course

101 Math

102 Science

103 Physics

Phone Table

StudentID Phone Number

101 9876543210

101 9123456789

102 9765432101

103 9876543210

103 9123456789

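Moved phone numbers into a separate table keyed by StudentID. Strictly speaking, repeated phone numbers are a multi-valued fact about a student rather than a transitive dependency; none of the tables above has a non-key attribute that depends on another non-key attribute, so they satisfy 3NF.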

Boyce-Codd Normal Form (BCNF)

Rule: Must be in 3NF, and for every non-trivial functional dependency (X → Y), X should be a superkey.

In our case every determinant in the 3NF tables is a key, so the tables are already in BCNF.

Fourth Normal Form (4NF)

Rule: Must be in BCNF and have no non-trivial multi-valued dependencies (MVDs) other than those whose determinant is a superkey.

Issue: A student can enroll in multiple courses and also have multiple phone numbers.

Decomposed Tables:

Student Table (Same as before)

StudentID Name

101 Alex

102 Ben

103 Alex

Student_Course Table (Same as before)


StudentID Course

101 Math

102 Science

103 Physics

Student_Phone Table (Now separate to remove MVDs)

StudentID Phone Number

101 9876543210

101 9123456789

102 9765432101

103 9876543210

103 9123456789

Fixed multi-valued dependency issue.

Fifth Normal Form (5NF)

• A table is in 5NF (Projection-Join Normal Form) if it is already in 4NF and cannot be decomposed any
further without losing information.

• It addresses join dependencies: if a table can be split into smaller tables and rejoined without losing
rows or creating spurious ones, it should be decomposed.

Identifying a 5NF Issue in Our 4NF Structure

In 4NF, we have:

1. Student Table → (StudentID, Name)

2. Course Table → (Course, Instructor)

3. Student_Course Table → (StudentID, Course)

4. Student_Phone Table → (StudentID, Phone Number)

Issue in 4NF Structure

If students are also assigned projects based on their courses, we may store:
StudentID Course Project

101 Math P1

101 Math P2

102 Science P3

103 Physics P4

Here, a student can have multiple courses and multiple projects, so the same student-course, course-project, and
student-project facts are repeated in one table; this is a join dependency between StudentID, Course, and Project.

Breaking into 5NF

We decompose this into:

1. Student_Course Table (StudentID, Course)

2. Course_Project Table (Course, Project)

3. Student_Project Table (StudentID, Project)

Now, instead of storing Student-Course-Project in one table, we store them separately and join them when
needed.

Final 5NF Tables

1. Student Table

StudentID Name

101 Alex

102 Ben

103 Alex

2. Course Table

Course Instructor

Math Mr. John

Science Ms. Rose

Physics Mr. Smith

3. Student_Course Table

StudentID Course

101 Math
102 Science

103 Physics

4. Student_Phone Table

StudentID Phone Number

101 9876543210

101 9123456789

102 9765432101

103 9876543210

103 9123456789

5. Course_Project Table

Course Project

Math P1

Math P2

Science P3

Physics P4

6. Student_Project Table

StudentID Project

101 P1

101 P2

102 P3

103 P4

Why is this 5NF?

• Now, StudentID, Course, and Project are managed in separate tables, eliminating redundancy.

• The original Student-Course-Project table is now broken down into three smaller tables without
losing data.

• If we join Student_Course, Course_Project, and Student_Project, we can reconstruct the original relationships.

This ensures minimal redundancy and maximum data integrity.
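
The claim that the decomposition loses no data can be checked on the sample rows with a few lines of Python. The sketch below projects the Student-Course-Project table onto the three pairs of columns and joins the projections back together; the variable names are illustrative.

```python
# Original Student-Course-Project rows from the example above.
original = {
    (101, "Math", "P1"),
    (101, "Math", "P2"),
    (102, "Science", "P3"),
    (103, "Physics", "P4"),
}

# The three projections used in the 5NF decomposition.
student_course = {(s, c) for s, c, p in original}
course_project = {(c, p) for s, c, p in original}
student_project = {(s, p) for s, c, p in original}

# Join all three projections: a (student, course, project) row is kept only
# when each of its three pairs appears in the matching projection.
rejoined = {
    (s, c, p)
    for (s, c) in student_course
    for (c2, p) in course_project
    if c == c2 and (s, p) in student_project
}

print(rejoined == original)
```

For this sample the join returns exactly the original four rows, with no missing and no spurious tuples, which is what a lossless decomposition requires.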

Aggregate Functions in SQL

An aggregate function in SQL performs a calculation on a set of values and returns a single scalar value.
These functions are commonly used with the GROUP BY clause to summarize data in a database.
Built-in Aggregate Functions in SQL

SQL provides five built-in aggregate functions:

1. AVG (Average) – Returns the average value of a numeric column.

2. MIN (Minimum) – Returns the smallest value in a column.

3. MAX (Maximum) – Returns the largest value in a column.

4. SUM (Total) – Returns the sum of all values in a column.

5. COUNT (Count) – Returns the number of rows: COUNT(*) counts every row, while COUNT(column) counts only
the rows where that column is not NULL.

Example Table: Employees

Emp_ID Name Salary Dept

1 Alice 50000 HR

2 Bob 60000 IT

3 Charlie 55000 IT

4 David 70000 HR

5 Eve 65000 IT

SQL Queries with Aggregate Functions

1. Average Salary of Employees:

SELECT AVG(Salary) FROM Employees;

Output: 60000

2. Minimum Salary:

SELECT MIN(Salary) FROM Employees;

Output: 50000

3. Maximum Salary:

SELECT MAX(Salary) FROM Employees;


Output: 70000

4. Total Salary Paid:

SELECT SUM(Salary) FROM Employees;

Output: 300000

5. Total Number of Employees:

SELECT COUNT(*) FROM Employees;

Output: 5
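
The queries above summarize the whole table. Since aggregate functions are commonly combined with GROUP BY, the sketch below runs one grouped query per department using Python's sqlite3 module. SQLite is assumed as the engine, and the column aliases in the query are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Emp_ID INT, Name VARCHAR(20), Salary INT, Dept VARCHAR(10))")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?)",
    [
        (1, "Alice", 50000, "HR"),
        (2, "Bob", 60000, "IT"),
        (3, "Charlie", 55000, "IT"),
        (4, "David", 70000, "HR"),
        (5, "Eve", 65000, "IT"),
    ],
)

# One result row per department instead of one row for the whole table.
per_dept = conn.execute(
    """SELECT Dept,
              COUNT(*)    AS headcount,
              AVG(Salary) AS avg_salary,
              SUM(Salary) AS total_salary
       FROM Employees
       GROUP BY Dept
       ORDER BY Dept"""
).fetchall()

for dept, headcount, avg_salary, total_salary in per_dept:
    print(dept, headcount, avg_salary, total_salary)
```

Both departments happen to average 60000 in this sample, while the headcounts (2 and 3) and totals (120000 and 180000) differ.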
