DBMS
Unit 1 :
1. Explain the evolution of database systems from traditional file-based systems to modern
DBMS. Highlight the advantages gained over time.
A. The application of database systems has evolved significantly over time, driven by technological
advancements, changing business requirements, and the increasing demand for efficient data storage
and retrieval. Here's a historical perspective on their development and applications:
• Before Database Systems: Data was stored in flat files or custom file systems. Each
application managed its own data, leading to data redundancy and inconsistency.
• Limitations:
o Lack of data independence.
o Difficult to manage and query large datasets.
o Redundancy and inconsistency.
• Introduction of the Relational Model: Proposed by Edgar F. Codd in 1970, the relational
model laid the foundation for modern databases.
• Applications:
o Banking and finance: Transaction processing systems.
o Government: Census and demographic data management.
• Key Systems: IBM developed System R, and Oracle introduced its commercial relational
database.
• Decentralized Access: With the rise of client-server computing, databases became more
accessible across networks.
• Applications:
o E-commerce: Online shopping platforms began using databases for product catalogs
and order management.
o Telecommunications: Call data records and billing systems.
o Healthcare: Patient record management systems.
• Web-Based Databases: The growth of the internet led to the proliferation of web
applications requiring scalable and accessible databases.
• Applications:
o Social media: Handling user-generated content and large-scale interactions.
o Online services: Booking systems for travel and hospitality.
o Content management systems: Powering blogs and websites.
• Challenges of Big Data: Traditional RDBMSs struggled with unstructured data and large-
scale distributed systems.
• Introduction of NoSQL Databases:
o Document databases (e.g., MongoDB).
o Key-value stores (e.g., Redis).
o Wide-column stores (e.g., Cassandra).
o Graph databases (e.g., Neo4j).
• Applications:
o Real-time analytics: Recommendation engines and fraud detection.
o IoT: Sensor data management.
o Cloud services: Scalable storage for diverse applications.
• Cloud Databases: Platforms like AWS, Azure, and Google Cloud provide scalable database
services.
• AI and Machine Learning:
o Integration with databases to support predictive analytics.
o Automated database tuning and optimization.
• Applications:
o Autonomous vehicles: Managing and processing sensor data.
o Financial technology: Blockchain and distributed ledger systems.
o Healthcare: Precision medicine with large-scale genomic data.
Key Takeaways
• Database systems have transformed from simple file storage solutions to complex,
distributed systems supporting global-scale applications.
• The evolution has been driven by the need to handle more data, faster queries, and greater
complexity in applications.
• Future advancements may involve more AI-powered database management, tighter
integration with IoT, and quantum databases.
2. Compare and contrast file systems and database management systems (DBMS). Provide
examples to support your answer
A. https://www.geeksforgeeks.org/difference-between-file-system-and-dbms/
3. Define a data model and discuss the main types of data models used in database systems.
Include examples for each type.
A. Data Abstraction in DBMS Schema: The design of the database is called a schema. This tells us
about the structural view of the database. It gives us an overall description of the database. A database
schema defines how the data is organised using the schema diagram. A schema diagram is a diagram
which contains entities and the attributes that will define that schema. A schema diagram only shows
us the database design. It does not show the actual data of the database. Schema can be a single table
or it can have more than one table which is related. The schema represents the relationship between
these tables. Example: Let us suppose we have three tables Employee, Department and Project. So,
we can represent the schema of these three tables using the schema diagram as follows. In this schema
diagram, Employee and Department are related and the Employee and Project table are related.
Data Abstraction is the process of hiding unwanted or irrelevant details from the end user. Developers keep the complex internal details away from the user so that the user can comfortably work with the database and see only the data they need. The main purpose of data abstraction is to hide irrelevant details and provide an abstract view of the data: users work with a simplified picture of the database, and the system can still operate efficiently underneath.
In a DBMS, data abstraction is organised in layers, which means there are levels of data abstraction; the database management system is designed around these levels.
Levels of abstraction in a DBMS (also called the three levels of schema, or the three-level schema architecture):
Database systems use complex internal data structures. To simplify data retrieval, improve usability, and make the system efficient, developers use levels of abstraction that hide irrelevant details from users; these levels also simplify database design.
There are three levels of abstraction in a DBMS:
● Physical or Internal Level
● Logical or Conceptual Level
● View or External Level
1. Physical or Internal Level (physical schema):
The physical or internal level is the lowest level of data abstraction in the database management system. It defines how data is actually stored in the database and the methods used to access it. Because it describes complex data structures in detail, it is hard to understand and is therefore kept hidden from the end user.
The Database Administrator (DBA) decides how to arrange the data and where to store it; the DBA manages the data at this physical or internal level, where the raw data is stored securely on disk.
2. Logical or Conceptual Level (logical schema):
In database design, the conceptual level refers to the highest level of abstraction, where the focus is on understanding the overall structure and meaning of the data being stored rather than the specific implementation details of the database system.
At the conceptual level, the database designer creates a data model, which defines the entities, attributes, and relationships of the data. This schema is independent of any specific database management system (DBMS) and serves as a blueprint for the design and implementation of the database system. At the logical level, the data model is transformed into a logical schema, which includes the specific tables, columns, and constraints needed to implement the database.
It describes the structure of the entire data in the form of tables and is less complex than the physical level. In software companies, the conceptual level of the database is typically handled by a team of database designers or data architects. These individuals are responsible for understanding the data requirements of the organization and creating a conceptual schema that reflects those requirements. The database designers may continue to be involved in the ongoing maintenance and evolution of the database, to ensure that it continues to meet the needs of the organization over time.
After the data model has been created, the logical database design phase typically follows, where the data model is translated into a physical schema that includes specific tables, columns, and constraints, along with any necessary optimization and indexing for performance and scalability. This phase is typically handled by database developers rather than database architects.
Example: If we define an employee schema, it will have attributes like Employee_id, Name, Age, Salary, and Phone_no, and the data types for these attributes are defined at this level. If the schema has more than one table, how the tables are related is also defined here; for example, if there are additional Department and Project tables, their relationships to Employee are specified at this level.
3. View or External Level or view schema :
View Schema defines the design of the database at the view level of the data abstraction. It
defines how an end-user will interact with the database system. There are many view schema
for a database system. Each view schema defines the view of data for a particular group of
people. It shows only those data to a view group in which they are interested and hides the
remaining details from them.
Example: A website has different views depending on the user's authorization. A college website has different views for students, faculty, and the dean. Similarly, a company's website would have different views for an employee, an accountant, and a manager.
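To make the three levels concrete, here is a minimal SQL sketch (it reuses the Employee attributes from the example above; the view name and the decision to hide Age and Phone_no are illustrative assumptions): the base table corresponds to the logical/conceptual level, the role-specific view to the external level, and the on-disk layout of the table is the physical level handled by the DBMS.

-- Logical/conceptual level: the full Employee relation seen by designers.
CREATE TABLE Employee (
    Employee_id INT PRIMARY KEY,
    Name        VARCHAR(50),
    Age         INT,
    Salary      DECIMAL(10, 2),
    Phone_no    VARCHAR(15)
);

-- External/view level: an accountant sees only what payroll needs;
-- personal details such as Age and Phone_no stay hidden.
CREATE VIEW PayrollView AS
SELECT Employee_id, Name, Salary
FROM Employee;

-- Physical/internal level: how Employee rows are stored on disk (files,
-- pages, indexes) is decided by the DBA and the storage engine, not by SQL users.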
Advantages of data abstraction in DBMS:
o Users can easily access the data based on their queries.
o It provides security to the data stored in the database.
o Database systems work efficiently because of data abstraction.
Features of data abstraction in DBMS:
1.Multiple levels:
Data abstraction provides multiple levels of abstraction, including external, conceptual, and
internal, which enable different users to view the database system at different levels of
complexity.
2.Separation of concerns:
Data abstraction separates the concerns of different users and stakeholders, such as users,
designers, and administrators, by providing a clear separation between the logical and
physical
implementation of the database system.
3.Mapping between levels:
Data abstraction provides a mapping between the different levels of abstraction, which
ensures
that changes made at one level do not affect the other levels.
4. Transparency:
Data abstraction provides transparency to users by hiding the implementation details and complexity of the database system and presenting a consistent view of the data.
5. What is data independence in a DBMS? Differentiate between logical and physical data independence with examples.
6. Explain the components of a DBMS and their roles in ensuring efficient data management.
A. https://www.dataentryoutsourced.com/blog/components-of-a-database-management-system/
7. What are ER diagrams? Explain their significance in database design and outline the main
elements of an ER diagram.
1. Simplifies Complex Systems: ER diagrams help break down complex systems by visually
representing data structures and their relationships, making it easier to understand and design
the database.
2. Clear Representation of Data: It provides a clear and structured way to define the data
requirements and relationships, serving as a blueprint for developers and stakeholders.
3. Prevents Redundancy: By defining entities and their relationships explicitly, ER diagrams
help identify and eliminate redundancy in the data storage.
4. Facilitates Communication: The diagrams serve as a communication tool between database
designers, developers, and clients, ensuring a common understanding of the system.
5. Foundation for Database Creation: ER diagrams are translated into relational schemas
during the physical database design, guiding the actual implementation.
1. Entities:
a. Definition: An entity represents a distinct object or concept that is of interest to the
organization and needs to be stored in the database.
b. Representation: Typically shown as rectangles.
c. Example: A "Customer" or "Product" could be entities.
2. Attributes:
a. Definition: Attributes are properties or characteristics that describe an entity.
b. Representation: Shown as ellipses.
c. Example: For a "Customer" entity, attributes could include "Name", "Address", and
"Phone Number".
3. Primary Key:
a. Definition: A primary key is an attribute or set of attributes that uniquely identifies
an instance of an entity.
b. Representation: Underlined in the diagram.
c. Example: For a "Customer" entity, "Customer_ID" could be a primary key.
4. Relationships:
a. Definition: Relationships represent the associations between two or more entities.
b. Representation: Shown as diamonds.
c. Example: A "Customer" might have a relationship called "Places Order" with an
"Order" entity.
5. Cardinality:
a. Definition: Cardinality defines the number of instances of one entity that can be
associated with another entity.
b. Representation: Indicated using numbers (1, M, N) or symbols (crow's foot).
c. Example: A "Customer" can place many "Orders", but each "Order" is placed by one
"Customer" (One-to-Many relationship).
6. Participation Constraints:
a. Definition: This specifies whether all or only some instances of an entity participate
in a relationship.
b. Types:
i. Total Participation: Every instance of an entity must participate in a
relationship.
ii. Partial Participation: Some instances of an entity may not participate in a
relationship.
c. Representation: Shown using double lines for total participation and single lines for
partial participation.
7. Weak Entities:
a. Definition: A weak entity depends on another entity (called a "strong" entity) for its
identification and cannot exist independently.
b. Representation: A weak entity is shown as a double rectangle, and its relationship
to the strong entity is a double diamond.
c. Example: A "Dependent" entity might be a weak entity related to a "Employee"
entity, where a dependent's existence is tied to an employee.
Summary:
• ER Diagrams are used for database design and depict entities, their attributes, and
relationships.
• They play a significant role in simplifying the database structure, ensuring proper
relationships, and guiding the implementation of the database.
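As a rough illustration of how an ER design is translated into a relational schema during implementation, here is a hedged SQL sketch based on the Customer/Order example used above (the table and column names are assumptions added for illustration): entities become tables, attributes become columns, the underlined attribute becomes the primary key, and the one-to-many "Places Order" relationship becomes a foreign key.

-- Entity "Customer" with its attributes; Customer_ID is the primary key.
CREATE TABLE Customer (
    Customer_ID  INT PRIMARY KEY,
    Name         VARCHAR(50),
    Address      VARCHAR(100),
    Phone_Number VARCHAR(15)
);

-- Entity "Order"; the 1:N "Places Order" relationship is captured by the
-- Customer_ID foreign key: one customer can place many orders.
CREATE TABLE Orders (
    Order_ID    INT PRIMARY KEY,
    Order_Date  DATE,
    Customer_ID INT NOT NULL,
    FOREIGN KEY (Customer_ID) REFERENCES Customer(Customer_ID)
);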
8. Differentiate between entities, attributes, and entity sets in an ER model. Provide examples
for each concept.
1. Entities:
a. Definition: An entity represents a distinct real-world object or concept that can be
identified and stored in a database.
b. Characteristics:
i. It has properties (attributes) that describe it.
ii. It can be tangible (e.g., a person, car) or intangible (e.g., a course, event).
c. Example:
i. Entity: A "Student".
ii. Attributes: Name, Roll Number, and Date of Birth describe the student.
2. Attributes:
a. Definition: Attributes are the descriptive properties or characteristics of an entity.
b. Characteristics:
i. They hold the actual data values.
ii. Attributes can be:
1. Simple: Cannot be divided further (e.g., Roll Number).
2. Composite: Can be divided into sub-parts (e.g., Full Name into First
Name and Last Name).
3. Derived: Computed from other attributes (e.g., Age derived from
Date of Birth).
4. Multivalued: Can have multiple values (e.g., Phone Numbers).
c. Example:
i. For the entity "Student", attributes include Roll Number (unique identifier),
Name, and Course.
3. Entity Sets:
a. Definition: An entity set is a collection of similar entities that share the same
attributes.
b. Characteristics:
i. It represents all entities of a particular type in the database.
ii. Think of it as a "table" in the database context, where each entity is a "row".
c. Example:
i. Entity Set: All "Students" enrolled in a university form the entity set
"Student".
ii. Each "Student" (individual entity) is a member of this entity set.
Example (Car):
• Entity: A specific car (e.g., a red Tesla Model 3 with registration number "AB12345").
• Attributes: Model, Color, Registration Number.
• Entity Set: The collection of all cars recorded in the database, each described by the same attributes.
9. What are relationships and relationship sets in an ER model? Discuss the types of
relationships with examples.
1. Relationships:
a. Definition: A relationship represents an association between two or more entities in
an ER model.
b. Characteristics:
i. It connects entities that share a meaningful interaction or association.
ii. Represented by a diamond shape in ER diagrams.
c. Example:
i. A "Student" enrolls in a "Course". Here, "enrolls in" is the relationship
connecting the "Student" and "Course" entities.
2. Relationship Sets:
a. Definition: A relationship set is a collection of similar relationships involving entities
from specific entity sets.
b. Characteristics:
i. It represents all instances of a relationship type in the database.
ii. For example, all "enrollments" between students and courses form a
relationship set.
c. Example:
i. The relationship set "Enrolls" contains all individual "Student-Course"
associations in a university.
1. One-to-One (1:1):
a. Definition: Each entity in one entity set is associated with exactly one entity in
another entity set, and vice versa.
b. Example:
i. Relationship: "Manages".
ii. Entity Sets: "Employee" and "Department".
iii. Example Association: Each "Employee" manages exactly one "Department",
and each "Department" is managed by exactly one "Employee".
2. One-to-Many (1:N):
a. Definition: An entity in one entity set is associated with many entities in another
entity set, but each entity in the second set is associated with at most one entity in the
first set.
b. Example:
i. Relationship: "Owns".
ii. Entity Sets: "Customer" and "Car".
iii. Example Association: A "Customer" can own multiple "Cars", but each "Car"
is owned by only one "Customer".
3. Many-to-Many (M:N):
a. Definition: Entities in one entity set can be associated with multiple entities in
another entity set, and vice versa.
b. Example:
i. Relationship: "Enrolls".
ii. Entity Sets: "Student" and "Course".
iii. Example Association: A "Student" can enroll in multiple "Courses", and a
"Course" can have multiple "Students" enrolled.
4. Self-Referencing Relationships:
a. Definition: An entity set forms a relationship with itself.
b. Example:
i. Relationship: "Reports To".
ii. Entity Set: "Employee".
iii. Example Association: An "Employee" reports to another "Employee" (e.g., a
manager).
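In a relational implementation, the many-to-many and self-referencing cases above are typically realised as follows (a minimal sketch; the referenced Student and Course tables and all column names are illustrative assumptions): an M:N relationship becomes a separate junction table, and a self-referencing relationship becomes a foreign key pointing back to the same table.

-- M:N "Enrolls": a junction table whose composite key pairs students with courses.
CREATE TABLE Enrolls (
    Student_ID INT,
    Course_ID  INT,
    PRIMARY KEY (Student_ID, Course_ID),
    FOREIGN KEY (Student_ID) REFERENCES Student(Student_ID),
    FOREIGN KEY (Course_ID)  REFERENCES Course(Course_ID)
);

-- Self-referencing "Reports To": Manager_ID refers to another row of Employee.
CREATE TABLE Employee (
    Employee_ID INT PRIMARY KEY,
    Name        VARCHAR(50),
    Manager_ID  INT,
    FOREIGN KEY (Manager_ID) REFERENCES Employee(Employee_ID)
);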
ER Diagram Representation:
Relationship Type   | Symbol in ER Diagram        | Example
One-to-One (1:1)    | Line with "1" on both ends  | Employee ↔️ Department ("Manages")
One-to-Many (1:N)   | Line with "1" and "N"       | Customer ↔️ Cars ("Owns")
Many-to-Many (M:N)  | Line with "M" and "N"       | Student ↔️ Course ("Enrolls")
Self-Referencing    | Loopback arrow              | Employee ↔️ Employee ("Reports To")
10. Discuss the additional features of the ER model, such as specialization, generalization, and
aggregation. How do these enhance database design?
A. https://www.geeksforgeeks.org/generalization-specialization-and-aggregation-in-er-model/
UNIT 2:
1.Define integrity constraints in relational databases and explain the different types of
constraints with examples
A. https://in.docworkspace.com/d/sIP_GyZzUAaqX87sG
2. Discuss how integrity constraints are enforced in a relational database. Illustrate with
examples how a violation of constraints is handled.
Relational databases use integrity constraints to ensure the accuracy, consistency, and reliability of
data. These constraints are enforced by the database management system (DBMS) at the time of
data insertion, update, or deletion.
1. Enforcement Mechanisms
a. Domain Constraint
• Rule: The values in a column must adhere to a predefined data type or range.
• Example: CREATE TABLE Students (
Student_ID INT,
Age INT CHECK (Age >= 18)
);
Violation: INSERT INTO Students (Student_ID, Age) VALUES (101, 15);
b. Entity Integrity
• Rule: Every table must have a primary key, and primary key columns cannot contain NULL or duplicate values.
• Example: In the Students table above, if Student_ID were declared as the PRIMARY KEY, an INSERT with a NULL or duplicate Student_ID would be rejected.
c. Referential Integrity
• Rule: Foreign keys must match a primary key in the referenced table or be null.
• Example: CREATE TABLE Departments (
Dept_ID INT PRIMARY KEY,
Dept_Name VARCHAR(50)
);
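A hedged sketch of the referencing side (the Employees table and its column names are assumptions added for illustration): the foreign key ties each employee to an existing department, so inserting a row that points to a non-existent department violates referential integrity and is rejected by the DBMS.

CREATE TABLE Employees (
    Emp_ID   INT PRIMARY KEY,
    Emp_Name VARCHAR(50),
    Dept_ID  INT,
    FOREIGN KEY (Dept_ID) REFERENCES Departments(Dept_ID)
);

-- Violation: no department with Dept_ID = 999 exists, so this insert is rejected.
INSERT INTO Employees (Emp_ID, Emp_Name, Dept_ID) VALUES (1, 'Alice', 999);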
e. Check Constraint
• Rule: A CHECK constraint restricts a column to values satisfying a given condition, as in the Age >= 18 example above; inserts or updates that fail the condition are rejected.
3. Handling Violations
• Rejecting the Transaction: The DBMS stops the operation and returns an error.
• Error Reporting: A detailed message is provided to the user for correction.
• Default Actions: For referential integrity, actions like CASCADE or SET NULL may resolve
the issue automatically.
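For instance, if the foreign key in the sketch above were declared with an ON DELETE action, the DBMS resolves deletions automatically instead of rejecting them (a hedged continuation of the assumed Employees/Departments example):

-- Declaring the action on the foreign key (alternative to the plain FK above):
--   FOREIGN KEY (Dept_ID) REFERENCES Departments(Dept_ID) ON DELETE CASCADE
-- With CASCADE, deleting a department also deletes its employees;
-- with ON DELETE SET NULL, their Dept_ID would be set to NULL instead.
DELETE FROM Departments WHERE Dept_ID = 10;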
Significance
https://www.geeksforgeeks.org/violation-of-constraints-in-relational-database/
3. Explain the process of querying relational data using SQL. Write SQL queries to
demonstrate the use of SELECT, WHERE, GROUP BY, and ORDER BY.
A.https://in.docworkspace.com/d/sIP_GyZzUAaqX87sG
4. What is logical database design? Explain the steps involved in designing a logical schema with
an example .
A.https://in.docworkspace.com/d/sIP_GyZzUAaqX87sG
Example:
For a school database, entities might include Student, Course, and Enrollment, with
relationships like "Students enroll in Courses."
Normalization example: ensure the Student table does not have repeating courses, and the Course table does not store multiple instructors in one column.
Logical Schema for the school example (see the sketch below):
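A minimal sketch of the resulting logical schema in SQL, assuming illustrative column names (the original notes name only the Student, Course, and Enrollment entities):

CREATE TABLE Student (
    Student_ID INT PRIMARY KEY,
    Name       VARCHAR(50)
);

CREATE TABLE Course (
    Course_ID  INT PRIMARY KEY,
    Title      VARCHAR(50),
    Instructor VARCHAR(50)
);

-- "Students enroll in Courses" is an M:N relationship, so it becomes its own table.
CREATE TABLE Enrollment (
    Student_ID INT,
    Course_ID  INT,
    PRIMARY KEY (Student_ID, Course_ID),
    FOREIGN KEY (Student_ID) REFERENCES Student(Student_ID),
    FOREIGN KEY (Course_ID)  REFERENCES Course(Course_ID)
);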
Logical design is a crucial step in database development, bridging the gap between conceptual
understanding and physical implementation.
5. What are views in relational databases? Discuss their advantages and limitations with
examples of creating, altering, and dropping views in SQL.
https://in.docworkspace.com/d/sIP_GyZzUAaqX87sG
A view is a virtual table in a relational database that is based on the result of a SELECT query. It does
not store data itself but provides a way to access or manipulate data from one or more underlying
tables. Views are often used to simplify complex queries, enhance security, and present data in a user-
friendly format.
Advantages of Views
1. Simplification: Complex joins and filters can be wrapped in a view and queried like a table.
2. Security: A view can expose only selected columns or rows, hiding sensitive data.
3. Logical data independence: Applications query the view, so some changes to the underlying tables do not affect them.
4. Consistency: The same derived presentation of the data is reused across queries.
Limitations of Views
1. Performance: Complex views involving joins or aggregations can slow down queries.
2. Updatability: Views with complex queries (e.g., involving joins, aggregations, or distinct)
may not allow direct updates.
3. Dependency Issues: Dropping or modifying underlying tables can render views invalid.
4. Storage Overhead: While views don’t store data, materialized views (stored views) do.
Creating, Querying, Altering, and Dropping a View: the syntax and an example for each of the four operations are sketched below.
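A hedged sketch of the four operations, using an assumed Employees(emp_id, name, salary, dept_id) table and a view name chosen for illustration:

-- 1. Creating a view (CREATE VIEW view_name AS SELECT ...):
CREATE VIEW HighPaidEmployees AS
SELECT emp_id, name, salary
FROM Employees
WHERE salary > 50000;

-- 2. Querying a view, exactly like a table:
SELECT * FROM HighPaidEmployees;

-- 3. Altering a view by replacing its definition:
CREATE OR REPLACE VIEW HighPaidEmployees AS
SELECT emp_id, name, salary, dept_id
FROM Employees
WHERE salary > 60000;

-- 4. Dropping a view:
DROP VIEW HighPaidEmployees;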
Typical uses: a simple view that exposes a subset of columns, a view that restricts access to sensitive columns for certain users, and a complex view built from joins or aggregations to simplify reporting queries.
6. Explain how to destroy and alter tables and views in a relational database. Provide examples
of SQL commands for each operation.
In relational databases, tables and views can be destroyed (dropped) or altered to accommodate
changes in the schema or to remove unnecessary objects.
1. Destroying Tables
To destroy (delete) a table and all its data, the DROP TABLE command is used. This action is
irreversible.
Syntax: DROP TABLE table_name;
Example: DROP TABLE Employees;
Effect:
The Employees table and all its data will be permanently removed.
2. Altering Tables
The ALTER TABLE command is used to modify the structure of an existing table.
Common Operations:
1. Add a Column:
ALTER TABLE table_name ADD column_name datatype;
Example: ALTER TABLE Employees ADD email VARCHAR(100);
2. Modify a Column (syntax varies by DBMS; MySQL uses MODIFY, SQL Server uses ALTER COLUMN):
ALTER TABLE table_name MODIFY column_name new_datatype;
Example: ALTER TABLE Employees MODIFY email VARCHAR(150);
3. Drop a Column:
ALTER TABLE table_name
DROP COLUMN column_name;
Example: ALTER TABLE Employees DROP COLUMN email;
4. Rename a Table:
ALTER TABLE table_name RENAME TO new_table_name;
Example: ALTER TABLE Employees RENAME TO Staff;
3. Destroying Views
Syntax: DROP VIEW view_name;
Example: DROP VIEW HighPaidEmployees;
Effect: The view definition is removed; the underlying table data is not affected.
4. Altering Views
The CREATE OR REPLACE VIEW command is used to alter the definition of an existing view.
Syntax: CREATE OR REPLACE VIEW view_name AS SELECT ...;
Example: CREATE OR REPLACE VIEW HighPaidEmployees AS SELECT emp_id, name, salary FROM Employees WHERE salary > 60000;
Effect: The view's definition is replaced; existing queries against the view now see the new definition.
Operation | Purpose                                               | Effect
Drop      | Permanently deletes a table or view.                  | Irreversible; all data in tables/views is lost.
Alter     | Modifies the structure or definition of a table/view. | Structure changes, but data is retained unless explicitly removed.
Example Scenario: create an Employees table, alter it, create a view on it, and finally drop the view and the table, as sketched below.
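A minimal sketch of such a scenario (table, column, and view names are illustrative assumptions):

-- 1. Create a table:
CREATE TABLE Employees (
    emp_id INT PRIMARY KEY,
    name   VARCHAR(50),
    salary DECIMAL(10, 2)
);

-- 2. Alter it by adding a column:
ALTER TABLE Employees ADD dept_id INT;

-- 3. Create a view over it:
CREATE VIEW EmployeeNames AS SELECT emp_id, name FROM Employees;

-- 4. Drop the view, then the table:
DROP VIEW EmployeeNames;
DROP TABLE Employees;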
7. Describe the fundamental operations of relational algebra with examples. How do these
operations support data retrieval and manipulation?
A. https://www.geeksforgeeks.org/introduction-of-relational-algebra-in-dbms/
Fundamental Operations of Relational Algebra
Relational Algebra provides a set of operations to manipulate and retrieve data from relational
databases. These operations form the theoretical foundation for SQL and support data retrieval and
manipulation by allowing the combination, filtering, and transformation of data.
1. Selection (σ)
• Purpose: Retrieves the tuples (rows) of a relation that satisfy a given condition.
• Operation: σ(condition)(Relation)
• Example: Query: Retrieve students aged 20: σ(Age = 20)(Students)
Result:
ID | Name    | Age | Grade
1  | Alice   | 20  | A
3  | Charlie | 20  | C
2. Projection (π)
• Purpose: Retrieves only the specified attributes (columns) of a relation, eliminating duplicate rows.
• Example: Query: Retrieve only the names and grades of students: π(Name,
Grade)(Students)
Result:
Name Grade
Alice A
Bob B
Charlie C
3. Cartesian Product (×)
• Purpose: Combines two relations by pairing every tuple in one relation with every tuple in another.
• Operation:
Relation1 × Relation2
• Example: Courses table:
CourseID | CourseName
101      | Math
102      | Physics
Query: Students × Courses
Result:
ID  | Name  | Age | Grade | CourseID | CourseName
1   | Alice | 20  | A     | 101      | Math
1   | Alice | 20  | A     | 102      | Physics
2   | Bob   | 22  | B     | 101      | Math
... | ...   | ... | ...   | ...      | ...
4. Union (∪)
• Purpose: Combines two relations to include all unique rows present in either.
• Operation:
Relation1 ∪ Relation2
Students ∪ Alumni
5. Intersection (∩)
• Purpose: Returns only the rows that appear in both relations, e.g. Students ∩ Alumni.
6. Difference (-)
• Purpose: Returns the rows of the first relation that do not appear in the second.
• Example:
From the Students and Alumni tables above, query students who are not alumni: Students -
Alumni
Result:
ID Name Age Grade
1 Alice 20 A
2 Bob 22 B
3 Charlie 20 C
7. Joins
Joins are special cases of Cartesian Product, combined with a selection operation. They combine
related rows from two or more relations.
Types of Joins:
2. Natural Join (⋈): Combines rows with matching attribute names and values.
Students ⋈ Courses
8. Renaming (ρ)
• Purpose: Renames a relation or its attributes so the result can be reused in further expressions, e.g. ρ(S, Students) gives the Students relation the name S.
Conclusion
Relational Algebra provides the theoretical foundation for complex SQL queries, allowing powerful
and precise data manipulation and retrieval. These operations enable flexible database design and
efficient query execution.
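For reference, the core operators map to SQL roughly as follows (a hedged sketch using the Students, Courses, and Alumni relations from the examples above; exact operator support, e.g. INTERSECT and EXCEPT, varies by DBMS):

-- Selection  σ(Age = 20)(Students):
SELECT * FROM Students WHERE Age = 20;

-- Projection π(Name, Grade)(Students):
SELECT DISTINCT Name, Grade FROM Students;

-- Cartesian product Students × Courses:
SELECT * FROM Students CROSS JOIN Courses;

-- Union / Intersection / Difference (assuming Students and Alumni have the same columns):
SELECT * FROM Students UNION     SELECT * FROM Alumni;
SELECT * FROM Students INTERSECT SELECT * FROM Alumni;
SELECT * FROM Students EXCEPT    SELECT * FROM Alumni;

-- Natural join (matches rows on attributes with the same name):
SELECT * FROM Students NATURAL JOIN Courses;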
8. What is tuple relational calculus? Compare it with relational algebra and provide examples of
queries written in tuple relational calculus
9. Explain domain relational calculus with examples. How does it differ from tuple relational
calculus?
10. Differences Between Relational Algebra, Tuple Relational Calculus, and Domain Relational
Calculus
UNIT – 3
1.Explain the structure of a basic SQL query with examples. Write and explain queries that use
SELECT, WHERE, ORDER BY, and GROUP BY clauses.
A.https://in.docworkspace.com/d/sIP_GyZzUAaqX87sG
2. What are UNION, INTERSECT, and EXCEPT operators in SQL? Explain their use with
examples and highlight how they handle duplicates.
These are set operators in SQL used to combine or compare the results of two or more queries. They
operate on the principle of set theory and allow manipulation of query results.
1. UNION
Combines the results of two or more queries into a single result set, removing duplicates by default.
• Syntax:
SELECT column1 FROM table1
UNION
SELECT column1 FROM table2;
• Key Points:
o Removes duplicates unless UNION ALL is used.
o The number and order of columns in both queries must match.
o The data types of the columns must be compatible.
• Example: Table 1: Students1
Name
Alice
Bob
Table 2: Students2
Name
Bob
Charlie
Query: SELECT Name FROM Students1 UNION SELECT Name FROM Students2;
Result:
Name
Alice
Bob
Charlie
Result (with UNION ALL, i.e. SELECT Name FROM Students1 UNION ALL SELECT Name FROM Students2;):
Name
Alice
Bob
Bob
Charlie
2. INTERSECT
Returns only the rows that appear in the results of both queries.
• Syntax:
SELECT column1 FROM table1
INTERSECT
SELECT column1 FROM table2;
• Key Points:
o Only the rows present in both result sets are included.
o Duplicates are removed by default.
• Example: Table 1: Students1
Name
Alice
Bob
Table 2: Students2
Name
Bob
Charlie
Query: SELECT Name FROM Students1 INTERSECT SELECT Name FROM Students2;
Result:
Name
Bob
3. EXCEPT
Returns the rows from the first query that are not present in the second query.
• Syntax:
SELECT column1 FROM table1
EXCEPT
SELECT column1 FROM table2;
• Key Points:
o Only rows unique to the first query are included.
o Duplicates are removed by default.
• Example: Table 1: Students1
Name
Alice
Bob
Table 2: Students2
Name
Bob
Charlie
Query: SELECT Name FROM Students1 EXCEPT SELECT Name FROM Students2;
Result:
Name
Alice
Handling Duplicates
By default all three operators return distinct rows; appending ALL (most commonly UNION ALL) keeps duplicates and is generally faster because no duplicate-elimination step is needed.
Key Differences
Operator  | Combines/Compares                | Removes Duplicates | Result Set
UNION     | Combines results                 | Yes (default)      | All rows from both queries
INTERSECT | Finds common rows                | Yes                | Common rows only
EXCEPT    | Finds rows unique to first query | Yes                | Rows only in the first query
These operators are powerful tools for data retrieval, especially when combining or comparing
datasets across tables or queries!
A. Nested queries (subqueries) in SQL, with example tables, SQL queries, and corresponding outputs for each clause:
A subquery is a query within another query. The subquery's result is used by the outer query.
Subqueries can be used in the WHERE, FROM, and SELECT clauses to perform advanced filtering,
aggregation, or data manipulation.
Subqueries in the WHERE clause are often used to filter rows based on conditions involving other
tables or aggregate results.
Example: Find employees who earn more than the average salary.
Sample employees table:
employee_id | name    | salary | department_id
1           | Alice   | 50000  | 101
2           | Bob     | 60000  | 102
3           | Charlie | 70000  | 101
4           | David   | 80000  | 103
5           | Eve     | 55000  | 102
SQL Query (one formulation that produces the output below):
SELECT employee_id, name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
Output:
employee_id | name    | salary
3           | Charlie | 70000
4           | David   | 80000
Subqueries in the FROM clause are treated as a temporary table (also known as a derived table) that the
outer query can use.
SQL Query (one formulation; the subquery acts as a derived table):
SELECT dept_avg.department_id, dept_avg.avg_salary
FROM (SELECT department_id, AVG(salary) AS avg_salary
      FROM employees
      GROUP BY department_id) AS dept_avg;
Subquery Explanation: The inner query groups employees by department and computes each department's average salary; the outer query then selects from this derived table.
Output:
department_id | avg_salary
101           | 60000
102           | 57500
103           | 80000
Subqueries in the SELECT clause calculate values dynamically for each row in the result set.
Example: Display each employee's name and the total number of employees in their
department.
SQL Query (one formulation that produces the output below):
SELECT e1.name,
       e1.department_id,
       (SELECT COUNT(*)
        FROM employees e2
        WHERE e2.department_id = e1.department_id) AS total_employees
FROM employees e1;
Subquery Explanation: For every row of the outer query, the correlated subquery counts how many employees share that row's department_id.
Output:
name    | department_id | total_employees
Alice   | 101           | 2
Bob     | 102           | 2
Charlie | 101           | 2
David   | 103           | 1
Eve     | 102           | 2
Summary of Clauses
Clause | Purpose                                                    | Key Use
WHERE  | Filters rows based on subquery results.                    | Comparing individual rows or values.
FROM   | Treats subquery results as a temporary (derived) table.    | Aggregation or joining.
SELECT | Dynamically calculates values for each row in the output.  | Adding computed columns.
Nested queries provide flexibility and power, enabling complex data manipulation and advanced
filtering. Each clause serves a unique purpose, as demonstrated above.
Aggregation operators (or aggregate functions) in SQL perform calculations on a group of rows and
return a single summarized value. These functions are widely used to analyze and summarize data,
often in combination with the GROUP BY clause to aggregate data within specified groups.
1. COUNT(): Counts the rows in a result set or the non-NULL values in a column.
2. SUM(): Computes the total of numeric column values.
3. AVG(): Calculates the average of numeric column values.
4. MAX(): Finds the highest value in a column.
5. MIN(): Finds the lowest value in a column.
Sample employees table:
employee_id | name    | department_id | salary
1           | Alice   | 101           | 50000
2           | Bob     | 102           | 60000
3           | Charlie | NULL          | NULL
4           | David   | 103           | 80000
5           | Eve     | 101           | NULL
1. COUNT()
Query: SELECT COUNT(*) AS total_rows, COUNT(salary) AS non_null_salaries FROM employees;
Explanation: COUNT(*) counts every row, while COUNT(salary) counts only rows where salary is not NULL.
Output:
total_rows | non_null_salaries
5          | 3
2. SUM()
Query: SELECT SUM(salary) AS total_salary FROM employees;
Explanation: SUM ignores NULL salaries, so only the three non-NULL values are added (50000 + 60000 + 80000).
Output:
total_salary
190000
3. AVG()
Query: SELECT AVG(salary) AS average_salary FROM employees;
Explanation: AVG also ignores NULLs, so the average is 190000 / 3.
Output:
average_salary
63333.33
4. MAX() and MIN()
Query: SELECT MAX(salary) AS max_salary, MIN(salary) AS min_salary FROM employees;
Explanation: MAX and MIN return the highest and lowest non-NULL salary values.
Output:
max_salary min_salary
80000 50000
5. Aggregation with GROUP BY
Query: SELECT department_id, COUNT(*) AS total_employees, AVG(salary) AS avg_salary FROM employees GROUP BY department_id;
Explanation: Rows are grouped by department_id; each group reports how many employees it has and the average of its non-NULL salaries. Rows with a NULL department_id form their own group.
Output:
department_id | total_employees | avg_salary
101           | 2               | 50000
102           | 1               | 60000
103           | 1               | 80000
NULL          | 1               | NULL
Integrity constraints in SQL are rules enforced on data in tables to ensure its accuracy, reliability, and
consistency. Complex integrity constraints involve rules that maintain the relationships and validity
of data across multiple columns, tables, or conditions.
Types of Constraints
1. CHECK Constraint
The CHECK constraint ensures that all values in a column satisfy a specific condition. It is used for
enforcing business rules at the database level.
Example: Enforcing Age Restriction
Explanation:
Insert Operation:
The FOREIGN KEY constraint maintains referential integrity by ensuring that a value in one table
matches a value in another table.
Explanation:
Insert Operations:
Explanation:
Insert Operation:
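A hedged sketch combining both constraint types discussed above (table and column names are assumptions added for illustration):

-- CHECK: enforce an age restriction at the column level.
CREATE TABLE Members (
    member_id INT PRIMARY KEY,
    age       INT CHECK (age >= 18)
);
INSERT INTO Members VALUES (1, 16);   -- rejected: violates the CHECK constraint

-- FOREIGN KEY: orders must reference an existing member.
CREATE TABLE Orders (
    order_id  INT PRIMARY KEY,
    member_id INT,
    FOREIGN KEY (member_id) REFERENCES Members(member_id)
);
INSERT INTO Orders VALUES (10, 99);   -- rejected: member 99 does not exist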
Summary
• CHECK: Validates specific conditions on data, ensuring logical consistency within a table.
• FOREIGN KEY: Enforces referential integrity between related tables, ensuring data
consistency across tables.
These constraints together make databases robust, ensuring that invalid or inconsistent data does not
corrupt the system.
DECIMAL(10, 2) in SQL
The DECIMAL data type in SQL is used to store exact numeric values with a fixed number of digits.
The parameters (10, 2) define the precision and scale of the number.
Explanation
1. Precision (10):
a. The total number of digits that the number can have, including both the digits before
and after the decimal point.
b. In DECIMAL(10, 2), the total digits allowed are 10.
2. Scale (2):
a. The number of digits that can appear after the decimal point.
b. In DECIMAL(10, 2), up to 2 digits can appear after the decimal point.
3. Digits Before Decimal:
a. The number of digits allowed before the decimal point is calculated as Precision -
Scale.
b. In DECIMAL(10, 2), the digits before the decimal point are 10 - 2 = 8.
Examples: with DECIMAL(10, 2), a value such as 12345678.99 fits (8 digits before the point, 2 after), whereas 123456789.99 does not (9 digits before the point exceeds the allowed 8).
Use Case
When storing currency or financial data, DECIMAL ensures precise representation, avoiding rounding
errors common with floating-point types like FLOAT or REAL.
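A minimal sketch showing DECIMAL(10, 2) in use (the products table is an assumed example whose rows match the result shown next):

CREATE TABLE products (
    product_id   INT PRIMARY KEY,
    product_name VARCHAR(50),
    price        DECIMAL(10, 2)   -- up to 8 digits before the point, 2 after
);

INSERT INTO products VALUES (1, 'Laptop',   899.99);
INSERT INTO products VALUES (2, 'Mouse',     25.50);
INSERT INTO products VALUES (3, 'Keyboard',  49.95);

SELECT * FROM products;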
Result:
product_id | product_name | price
1          | Laptop       | 899.99
2          | Mouse        | 25.50
3          | Keyboard     | 49.95
Summary
• DECIMAL(10, 2):
o Maximum 10 digits total.
o Up to 2 digits after the decimal.
• Ideal for financial or monetary data requiring precision.
A. Triggers in SQL
A trigger in SQL is a special type of stored procedure that is automatically executed (or "triggered")
in response to certain events on a database table or view. Triggers are used to enforce business rules,
validate data, maintain audit logs, or automatically update related data when specific conditions are
met.
Active databases utilize triggers to automate tasks and ensure data integrity and consistency. Triggers play a crucial role in enforcing business rules, validating data before it is stored, maintaining audit logs, and automatically updating related data when specific conditions are met.
Types of Triggers
1. BEFORE Trigger: Executes before the triggering event (INSERT, UPDATE, DELETE).
2. AFTER Trigger: Executes after the triggering event.
3. INSTEAD OF Trigger: Executes instead of the triggering event (primarily for views).
Syntax of Triggers (general form; details vary by DBMS):
CREATE TRIGGER trigger_name
{BEFORE | AFTER | INSTEAD OF} {INSERT | UPDATE | DELETE} ON table_name
FOR EACH ROW
trigger_body;
1. BEFORE Trigger
A BEFORE trigger is used to validate or modify data before it is inserted or updated in the table.
Explanation:
• The BEFORE INSERT trigger checks the salary value.
• If the salary is less than 3000, the trigger raises an error and prevents the insertion.
Test:
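A hedged sketch of such a trigger in MySQL-style syntax (the table, column, and trigger names are assumptions; other DBMSs use different error-raising mechanisms):

DELIMITER $$
CREATE TRIGGER check_min_salary
BEFORE INSERT ON employees
FOR EACH ROW
BEGIN
    -- Reject any new employee whose salary is below 3000.
    IF NEW.salary < 3000 THEN
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'Salary must be at least 3000';
    END IF;
END$$
DELIMITER ;

-- Test: this insert fails with the error message above.
INSERT INTO employees (employee_id, name, salary) VALUES (10, 'Frank', 2500);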
2. AFTER Trigger
An AFTER trigger is used to perform actions after the data has been inserted, updated, or deleted.
Explanation:
• The AFTER INSERT trigger automatically adds a log entry in the audit_log table whenever
a new employee is added.
Test:
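A corresponding hedged sketch (again MySQL-style; the audit_log column names are assumptions consistent with the audit-log example in the next question):

CREATE TRIGGER log_new_employee
AFTER INSERT ON employees
FOR EACH ROW
INSERT INTO audit_log (operation_type, employee_id, name, salary, operation_time)
VALUES ('INSERT', NEW.employee_id, NEW.name, NEW.salary, NOW());

-- Test: inserting an employee also writes one row into audit_log.
INSERT INTO employees (employee_id, name, salary) VALUES (11, 'Grace', 4500);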
Summary
Triggers are powerful tools for maintaining database integrity and automating workflows, but they
should be used carefully to avoid performance overhead.
An audit log is used in conjunction with an AFTER trigger to automatically track changes to a table.
The primary purpose is to maintain a historical record of changes (insertions, updates, deletions) to
ensure accountability, traceability, and compliance with business or regulatory requirements.
• Track Changes: Record details about who made changes, what changes were made, and
when.
• Ensure Transparency: Provide a complete history of data modifications.
• Error Investigation: Helps in debugging and identifying erroneous operations.
• Security Compliance: Meet regulatory standards that require maintaining logs of data
changes.
Example Scenario: Using an AFTER Trigger for Audit Logging
Theory Example
Suppose you have an employees table, and you want to log every time a record is inserted, updated,
or deleted into a separate audit_log table. An AFTER trigger is used here because you want the
logging to occur only after the operation is successfully completed.
Table Schemas
1. employees Table: Stores the employee records (employee_id, name, salary).
2. audit_log Table: Stores the audit trail of operations performed on the employees table.
The following AFTER INSERT trigger logs details into the audit_log table whenever a new
employee is added:
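A hedged sketch of the audit table and the trigger (MySQL-style syntax; column names are chosen to match the result shown further below):

CREATE TABLE audit_log (
    log_id         INT AUTO_INCREMENT PRIMARY KEY,
    operation_type VARCHAR(10),
    employee_id    INT,
    name           VARCHAR(50),
    salary         DECIMAL(10, 2),
    operation_time DATETIME
);

CREATE TRIGGER employees_after_insert
AFTER INSERT ON employees
FOR EACH ROW
INSERT INTO audit_log (operation_type, employee_id, name, salary, operation_time)
VALUES ('INSERT', NEW.employee_id, NEW.name, NEW.salary, NOW());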
How It Works
1. Trigger Activation:
a. The trigger is activated after a new row is successfully inserted into the employees
table.
2. Logging:
a. The trigger inserts a new row into the audit_log table.
b. This row contains details of the operation (INSERT), the employee_id, name,
salary, and the operation_time.
Result:
log_id | operation_type | employee_id | name  | salary  | operation_time
1      | INSERT         | 1           | Alice | 4000.00 | 2025-01-12 14:30:00
Audit logs, combined with AFTER triggers, are essential tools for ensuring data changes are properly
tracked in active databases.
https://www.geeksforgeeks.org/the-problem-of-redundancy-in-database/
Redundancy refers to the unnecessary duplication of data within a database. While redundancy might
seem harmless, it often leads to several issues, including data inconsistencies, wastage of storage
space, and maintenance complexities.
Key Problems Caused by Redundancy
Examples of Anomalies
Employee_ID | Employee_Name | Department | Department_Location
1           | Alice         | HR         | New York
2           | Bob           | IT         | San Francisco
3           | Alice         | HR         | New York
4           | Charlie       | Finance    | Chicago
1. Insertion Anomaly
Problem: Redundancy makes it challenging to insert new information without including unrelated or
duplicated data.
Example:
• If a new department, "Marketing" in "Los Angeles," is introduced but no employees have yet
joined, we cannot insert the department details into the table because Employee_ID and
Employee_Name cannot be null.
Solution: Normalize the table by splitting it into separate tables for employees and departments.
Normalized Tables:
Employees Table:
Employee_ID | Employee_Name | Department_ID
1           | Alice         | 101
2           | Bob           | 102
4           | Charlie       | 103
Departments Table:
Department_ID | Department | Department_Location
101           | HR         | New York
102           | IT         | San Francisco
103           | Finance    | Chicago
104           | Marketing  | Los Angeles
2. Update Anomaly
Problem: Redundant data leads to inconsistencies when updates are not applied to all relevant rows.
Example:
• If the location of the "HR" department changes from "New York" to "Boston," all rows
containing "HR" must be updated. If one row is missed, inconsistent data arises.
Employee_ID | Employee_Name | Department | Department_Location
1           | Alice         | HR         | Boston
3           | Alice         | HR         | New York
Solution: Use normalization to store department information in a separate table. Updating the
"Departments" table ensures consistency across all employees.
3. Deletion Anomaly
Problem: Deleting a row can unintentionally remove unrelated information that was stored in the same row.
Example:
• If Charlie (Employee_ID = 4) leaves the organization, the row for "Finance" is deleted. This
inadvertently removes the information that the "Finance" department is located in Chicago.
Solution: Normalize the data by separating employee details and department details into distinct
tables. Deleting an employee does not affect the department information.
Summary
Normalization (splitting data into multiple related tables) is a critical solution to eliminate
redundancy and its associated problems.
Functional Dependency (FD) is a constraint that describes the relationship between attributes in a
relation. In simple terms, it indicates how one attribute uniquely determines another attribute in a
database.
Notation:
A→B
This means that for any two tuples in the table, if A values are the same, the B values must also be the
same.
Role of Functional Dependencies in Normalization
Functional dependencies are fundamental in achieving the First Normal Form (1NF), Second
Normal Form (2NF), and Third Normal Form (3NF) by identifying and eliminating redundancies,
partial dependencies, and transitive dependencies.
1. First Normal Form (1NF)
A relation is in 1NF if every attribute holds atomic (indivisible) values and there are no repeating groups.
• FDs ensure that each column in a relation has a single value that can uniquely determine the value of another column.
Example:
Unnormalized Table:
Normalized to 1NF:
2. Second Normal Form (2NF)
A relation is in 2NF if:
1. It is in 1NF.
2. It eliminates partial dependencies (non-key attributes are fully dependent on the entire
primary key).
• Identifies attributes that depend only on a part of the primary key and removes such
dependencies.
Example:
1NF Table:
StudentID | CourseID | StudentName | CourseName
1         | C101     | Alice       | Math
2         | C102     | Bob         | Chemistry
• Functional Dependencies:
o StudentID → StudentName
o CourseID → CourseName
o StudentID, CourseID → (StudentName, CourseName)
Decomposed into 2NF (Students Table):
StudentID | StudentName
1         | Alice
2         | Bob
Courses Table:
CourseID | CourseName
C101     | Math
C102     | Chemistry
Enrollment Table:
StudentID | CourseID
1         | C101
2         | C102
3. Third Normal Form (3NF)
A relation is in 3NF if:
1. It is in 2NF.
2. It eliminates transitive dependencies (non-key attributes depend on other non-key
attributes).
• FDs help identify transitive dependencies and eliminate them by creating separate tables.
Example:
2NF Table:
Instructor InstructorOffice
Smith Room 101
Johnson Room 102
Enrollment Table:
Normal Form | Key Condition                       | Role of Functional Dependencies
1NF         | Atomic values, no repeating groups  | Ensures unique and atomic data based on FDs like A → B.
2NF         | No partial dependency               | Eliminates FDs where attributes depend only on part of the primary key.
3NF         | No transitive dependency            | Removes FDs where non-key attributes depend on other non-key attributes.
Functional dependencies are crucial in identifying and resolving anomalies, ensuring a well-structured
and efficient database design.
Boyce-Codd Normal Form (BCNF)
A relation is in BCNF if:
1. It is already in 3NF.
2. Every determinant is a candidate key:
a. A determinant is an attribute that uniquely determines another attribute.
b. BCNF eliminates anomalies caused by functional dependencies.
Example:
3NF Table:
Here, CourseID determines Instructor, but CourseID is not a candidate key (it does not
uniquely identify rows).
CourseID Instructor
C101 Smith
C101 Johnson
Enrollment Table:
StudentID CourseID
1 C101
2 C101
BCNF is a critical step in database normalization that ensures optimal data organization by
eliminating redundancy and dependency anomalies. Here's why BCNF is important:
1. Eliminates Redundancy
• BCNF reduces data duplication in the database by addressing functional dependencies that
could lead to repeated data.
• Example: A table where a course determines the instructor (Course → Instructor) may
lead to the instructor's name being repeated across rows. Splitting into separate tables resolves
this.
2. Prevents Update Anomalies
• In a non-BCNF table, updating one instance of a repeated value requires updating all rows
containing that value.
• Example: If the instructor for "Math" changes, you would need to update all rows with
"Math." BCNF prevents such inconsistencies.
3. Prevents Deletion Anomalies
• Deleting a record in a non-BCNF table might result in the unintended loss of important data.
• Example: If a student withdraws from a course, the instructor information might also be lost.
BCNF ensures that instructor data is stored independently.
4. Ensures Data Integrity
• BCNF enforces rules that make sure every determinant is a candidate key, meaning no
attribute depends on non-unique data.
• This ensures that data relationships are logically sound and consistent.
5. Simplifies Querying
• A properly normalized table (in BCNF) is easier to query and understand because
dependencies are clear and straightforward.
• This leads to better performance and simpler joins in complex queries.
6. Reduces Storage Requirements
• By minimizing redundancy, BCNF reduces the amount of storage space required to maintain
the database, which is particularly beneficial for large-scale systems.
7. Enhances Scalability
• A database in BCNF can handle changes in requirements (e.g., adding new attributes or
relationships) more easily because the structure is clean and modular.
Non-BCNF Table: conceptually, a single table holding CourseID, Instructor, and Room together, where CourseID determines both Instructor and Room but CourseID is not a candidate key of the table.
BCNF Decomposition:
Course-Instructor Table:
CourseID Instructor
C101 Smith
C102 Johnson
Course-Room Table:
CourseID Room
C101 Room 101
C102 Room 102
• Advantage: Updating or deleting data is now simpler and does not cause inconsistencies or
data loss.
By ensuring every determinant is a candidate key, BCNF offers a robust foundation for maintaining
efficient, reliable, and scalable databases.
https://www.geeksforgeeks.org/lossless-decomposition-in-dbms/
Lossless join decomposition is a key concept in relational database design. It ensures that when a
table is decomposed into two or more smaller tables (to achieve normalization), the original table can
be reconstructed without any loss of information by performing a natural join on the decomposed
tables.
• When decomposing a table, lossless join guarantees that no tuples or data are lost.
• It avoids anomalies and ensures data consistency.
A decomposition of a relation R into R1 and R2 is lossless if at least one of the following conditions holds:
• R1 ∩ R2 → R1 (the common attributes form a superkey of R1), or
• R1 ∩ R2 → R2 (the common attributes form a superkey of R2).
Here:
• R1 ∩ R2: the set of attributes common to R1 and R2.
• Superkey: a set of attributes that uniquely identifies tuples in a relation.
Example of Lossless Join
Original relation Employee:
EmpID | EmpName | DeptID | DeptName
1     | Alice   | D101   | HR
2     | Bob     | D102   | IT
3     | Carol   | D101   | HR
Decomposition:
R1 (EmpID, EmpName, DeptID):
EmpID | EmpName | DeptID
1     | Alice   | D101
2     | Bob     | D102
3     | Carol   | D101
R2 (DeptID, DeptName):
DeptID | DeptName
D101   | HR
D102   | IT
Performing a natural join on the decomposed tables reconstructs the original table.
EmpID | EmpName | DeptID | DeptName
1     | Alice   | D101   | HR
2     | Bob     | D102   | IT
3     | Carol   | D101   | HR
Example of Lossy Decomposition (illustrative):
Original relation Employee:
EmpID | Salary | DeptID
1     | 5000   | D101
2     | 6000   | D102
3     | 5000   | D103
Decomposition on the non-key attribute Salary:
R1 (EmpID, Salary):
EmpID | Salary
1     | 5000
2     | 6000
3     | 5000
R2 (Salary, DeptID):
Salary | DeptID
5000   | D101
6000   | D102
5000   | D103
Joining R1 and R2 on Salary pairs EmpID 1 with both D101 and D103 (and likewise EmpID 3), producing spurious tuples. The reconstructed table does not match the original relation, so the decomposition is lossy: the common attribute Salary is not a superkey of either relation.
Key Takeaways
• Lossless join decomposition is essential to ensure no data is lost when a table is divided into
smaller relations.
• To verify lossless join, ensure that the common attributes between decomposed relations act
as a superkey in at least one of the relations.
• Lossy decomposition can lead to incorrect or incomplete reconstruction, causing anomalies
and inconsistencies.
A Multivalued Dependency (MVD) occurs when one attribute in a table determines a set of values for another attribute, independently of the remaining attributes.
This means that for a given value of attribute A, there are multiple values of attribute B associated with it, while a third attribute C remains independent of B.
An MVD A →→ B exists in a relation if, for every pair of tuples with the same value of A, the set of B values associated with A is independent of the values of the other attributes.
Example: consider a relation with attributes Actor, Movie, and Award.
The Movie and Award attributes are independent of each other but both depend on Actor, giving the MVDs Actor →→ Movie and Actor →→ Award.
Problems Caused by Multivalued Dependencies: they force the same value of the determining attribute to be repeated for every combination of the dependent values, causing redundancy and insertion, update, and deletion anomalies.
Fourth Normal Form (4NF)
A relation is in 4NF if:
1. It is already in BCNF.
2. It has no non-trivial multivalued dependencies.
In 4NF, tables are decomposed to eliminate multivalued dependencies while maintaining a lossless
join.
Example:
Relation 1 (Actor-Movie):
Actor Movie
John Action Hero
John Spy Thriller
Relation 2 (Actor-Award):
Actor Award
John Best Actor
Fifth Normal Form (5NF)
A relation is in 5NF if:
1. It is already in 4NF.
2. It has no join dependency that is not implied by candidate keys.
A join dependency occurs when a relation can be reconstructed by joining multiple smaller relations,
but not by joining any subset of them.
Here:
Relation 1 (Supplier-Part):
Supplier Part
S1 P1
S1 P2
S2 P1
Relation 2 (Part-Project):
Part Project
P1 PR1
P1 PR2
P2 PR1
Relation 3 (Supplier-Project):
Supplier Project
S1 PR1
S2 PR2
This decomposition eliminates the join dependency, ensures no redundancy, and maintains lossless
joins.
Summary
UNIT 4
1. Explain the concept of a database transaction. Discuss
the different states of a transaction with the help of a
state diagram
A. Database Transaction
A transaction is a logical unit of work consisting of one or more database operations (reads, inserts, updates, or deletes) that must be executed as a single, indivisible unit: either all of its operations take effect, or none do. Transactions preserve the ACID properties (Atomicity, Consistency, Isolation, Durability).
States of a Transaction
A transaction goes through multiple states during its lifecycle. These states are:
1. Active: The transaction begins and is in progress. Operations like INSERT, DELETE,
UPDATE, or SELECT are performed during this phase.
2. Partially Committed: After the final statement of the transaction is executed but
before the changes are permanently saved, the transaction enters this state.
3. Committed: Once the changes are successfully written to the database, the
transaction moves to the committed state.
4. Failed: If an error occurs during the transaction execution, the transaction enters this
state.
5. Aborted: If a failure occurs or the transaction is explicitly rolled back, it transitions to
the aborted state. Any changes made are undone.
6. Terminated: The transaction ends, either after being committed or aborted.
+--------+     +---------------------+     +-----------+     +------------+
| Active |---->| Partially Committed |---->| Committed |---->| Terminated |
+--------+     +---------------------+     +-----------+     +------------+
    |                    |                                          ^
    |                    v                                          |
    |              +----------+           +-----------+             |
    +------------->|  Failed  |---------->|  Aborted  |-------------+
                   +----------+           +-----------+
This state diagram encapsulates the lifecycle of a database transaction and highlights the
importance of maintaining database consistency and reliability through the use of these states.
Here are examples of all four ACID properties (Atomicity, Consistency, Isolation, and
Durability) to explain how they work in the context of database transactions:
1. Atomicity
Atomicity ensures that all operations in a transaction are completed successfully as a single
unit of work, or none are performed.
Scenario: Transfer 100 from Account A to Account B, debiting A and crediting B as a single unit of work.
BEGIN TRANSACTION;
UPDATE accounts
SET balance = balance - 100
WHERE account_id = 'A';
UPDATE accounts
SET balance = balance + 100
WHERE account_id = 'B';
COMMIT;
• If Step 1 fails (e.g., insufficient balance in Account A), the transaction is rolled back:
ROLLBACK;
Result: Either both steps occur, or neither occurs. No partial deduction or credit happens.
2. Consistency
Consistency ensures the database remains in a valid state before and after the transaction.
Scenario: A student is enrolled in a course, and the course's seat count is updated.
• Transaction Steps:
o Add the student to the enrollment table.
o Deduct one seat from the course's available seats.
• SQL:
BEGIN TRANSACTION;
-- (illustrative statements; the enrollments/courses table and column names are assumed)
INSERT INTO enrollments (student_id, course_id) VALUES (501, 'C101');
UPDATE courses SET available_seats = available_seats - 1 WHERE course_id = 'C101';
COMMIT;
• If a failure occurs in any step, the database state is inconsistent (e.g., the student is
added, but seats are not updated). Hence, the transaction is rolled back.
Result: The database enforces integrity by ensuring seat counts and enrollments are
consistent.
3. Isolation
Isolation ensures that concurrent transactions do not interfere with each other.
Scenario: Two customers, Customer A and Customer B, attempt to book the last available
movie ticket simultaneously.
• Without Isolation:
o Both transactions read that 1 ticket is available.
o Both deduct 1 ticket.
o Tickets are overbooked, causing inconsistency.
• With Isolation: Using isolation levels like Serializable, one transaction locks the
ticket row, preventing the other transaction from proceeding until the first is
completed.
• SQL:
BEGIN TRANSACTION;
-- Customer A locks the ticket row before updating it (table and column names are assumed).
SELECT available_tickets FROM shows WHERE show_id = 1 FOR UPDATE;
UPDATE shows SET available_tickets = available_tickets - 1 WHERE show_id = 1;
COMMIT;
-- Customer B's transaction will wait until Customer A's completes.
4. Durability
Durability ensures that once a transaction is committed, its changes are permanent, even in
case of a system crash.
Scenario: A customer places an order, and the order details are saved to the database.
• Transaction Steps:
o Deduct stock from inventory.
o Record the order in the orders table.
• SQL:
BEGIN TRANSACTION;
UPDATE inventory
SET stock = stock - 1
WHERE product_id = 'P001';
-- Record the order as well (an illustrative orders table is assumed).
INSERT INTO orders (product_id, quantity) VALUES ('P001', 1);
COMMIT;
• After the COMMIT, the changes are written to non-volatile storage (e.g., disk).
Result: Even if the server crashes immediately after the commit, the changes remain intact
when the system restarts.
Property    | Scenario                              | Outcome
Atomicity   | Bank account transfer                 | Both debit and credit occur, or neither occurs.
Consistency | Student course enrollment             | Database integrity is maintained (seats and enrollments).
Isolation   | Two customers booking the last ticket | Concurrent transactions are properly managed.
Durability  | Placing an order                      | Committed changes survive system failures.
These examples demonstrate how ACID properties ensure reliable and consistent database
operations.