Unit 1 Notes
A Database Management System (DBMS) is software for creating, managing, and interacting with
databases. It provides an efficient, secure, and convenient way to store, retrieve, and
manipulate data.
What is a Database?
A database is an organized collection of data that can be easily accessed, managed, and updated. Examples
include customer records, inventory systems, financial transactions, etc.
Functions of a DBMS
o Stores data efficiently and allows users to retrieve and update it easily.
o Ensures data consistency, accuracy, and security through constraints and user access control.
6. Data Independence:
Types of DBMS
1. Hierarchical DBMS:
Disadvantages of DBMS
o MySQL
o PostgreSQL
o Oracle Database
o MongoDB (Document-based)
o Cassandra (Column-family)
o Redis (Key-value)
o Neo4j (Graph-based)
The File Processing System (FPS) is an early method of managing data in which information is stored in
separate files, often maintained independently by different programs. Although it was widely used before
modern databases, it has several limitations that led to the development of Database Management Systems (DBMS).
o Impact:
o Example: Incorrect values for age or salary may be entered without validation.
o Impact:
o Impact:
▪ Duplication of effort.
4. Data Isolation
o Problem: Data is scattered across multiple files and formats, making it hard to retrieve.
o Example: Retrieving data from sales and customer records requires separate manual searches.
o Impact:
5. Program-Data Dependence
o Impact:
6. Concurrency Issues
o Problem: Multiple users accessing the same file can cause conflicts.
o Example: Two users updating an inventory record simultaneously may overwrite changes.
o Impact:
▪ Data inconsistencies.
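The overwrite described in the inventory example can be sketched as a read-modify-write race. This is a minimal illustration only: the in-memory `inventory` dictionary is a hypothetical stand-in for a shared file, and the interleaving of the two users is simulated step by step rather than with real concurrent processes.

```python
# Hypothetical shared "file": one inventory record, no locking of any kind.
inventory = {"widget": 10}

def read_stock(item):
    """Each user reads the current value into a private copy."""
    return inventory[item]

def write_stock(item, value):
    """Writing replaces whatever is stored, with no conflict check."""
    inventory[item] = value

def simulate_lost_update():
    # Both users read the same starting value (10).
    user_a_copy = read_stock("widget")
    user_b_copy = read_stock("widget")
    # User A sells 3 units and writes back 7.
    write_stock("widget", user_a_copy - 3)
    # User B sells 2 units using the stale copy and writes back 8,
    # silently discarding user A's update.
    write_stock("widget", user_b_copy - 2)
    return inventory["widget"]

final_stock = simulate_lost_update()
print(final_stock)  # 8, although 5 units were sold in total
```

The stored value ends at 8 instead of 5 because the second write is based on data read before the first write happened.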
o Problem: No built-in support for data backup and recovery in case of failure.
o Impact:
8. Scalability Issues
o Example: Searching for a record in large files becomes slower over time.
o Impact:
9. Lack of Standardization
o Example: Payroll may use CSV files, while HR uses Excel sheets.
o Impact:
• Impact:
A Database Management System (DBMS) is essential for efficiently storing, managing, and retrieving data in
a structured manner. Traditional file processing systems have several limitations, such as data redundancy,
inconsistency, and difficulty in managing large volumes of data. DBMS overcomes these limitations and
provides a robust, secure, and scalable way to handle data.
o Problem in File Systems: Data duplication occurs when the same data is stored in multiple
files, leading to inconsistencies.
o DBMS Solution:
o Example: Customer details are stored once and referenced across multiple modules (billing,
orders, etc.).
o Problem in File Systems: Difficult for multiple users to access and modify data
simultaneously.
o DBMS Solution:
o Example: Employees in different departments can access and update data without conflicts.
o Problem in File Systems: No control over unauthorized access and accidental data loss.
o DBMS Solution:
▪ Role-based access ensures only authorized users can modify or view sensitive data.
o DBMS Solution:
▪ Enforces integrity constraints such as Primary Key, Foreign Key, and Unique
constraints.
o Problem in File Systems: Data is stored in different files across multiple locations.
o DBMS Solution:
o Example: A single database for an entire enterprise improves decision-making and reporting.
o Problem in File Systems: Manual backup procedures and no efficient recovery mechanism.
o DBMS Solution:
o DBMS Solution:
▪ Provides a structured framework with built-in functions for data handling (SQL).
o Example: Developers can quickly retrieve data using simple SQL queries instead of complex
programming logic.
8. Data Independence
o Problem in File Systems: Changes in data structure require modifying application code.
o DBMS Solution:
o Example: Adding a new column to a table does not require changes in user applications.
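The column example can be tried out with a small sketch. It assumes SQLite (through Python's standard `sqlite3` module) as the DBMS and a hypothetical `employees` table; the point is only that a query naming its columns keeps working after the table gains a new one.

```python
import sqlite3

def fetch_names(conn):
    """Application query written against the original two-column table."""
    return [row[0] for row in conn.execute("SELECT name FROM employees ORDER BY id")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO employees (id, name) VALUES (?, ?)",
                 [(1, "Alice"), (2, "Bob")])
conn.commit()

before = fetch_names(conn)
# Schema change: add a column. Existing rows get NULL for it.
conn.execute("ALTER TABLE employees ADD COLUMN email TEXT")
after = fetch_names(conn)
```

`fetch_names` is unchanged across the schema change and returns the same rows both times.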
o DBMS Solution:
▪ Efficient indexing, query optimization, and partitioning mechanisms for large data
handling.
• Problem in File Systems: Simultaneous updates by multiple users may lead to data inconsistencies.
• DBMS Solution:
o Ensures data consistency using locking mechanisms and transaction isolation levels.
1. Data
• Definition: Raw facts and figures that have no meaning until processed.
2. Database
• Definition: A structured collection of related data that can be accessed, managed, and updated easily.
• Example: A library database containing book records with titles, authors, and publication years.
3. DBMS (Database Management System)
• Definition: A software system that allows users to create, manage, and manipulate databases.
4. Schema
• Definition: The logical structure of the database that defines tables, columns, and relationships.
• Example: A schema of an employee database includes tables for employees, departments, and salaries.
5. Table (Relation)
• Definition: A collection of rows and columns that store related data in a structured format.
• Example: A "Students" table with columns like ID, Name, Age, Course.
6. Row (Tuple)
• Example: (101, John, 25, CS) represents one student in the "Students" table.
7. Attribute (Column)
• Definition: A column of a table that represents one characteristic of the entity.
• Example: Name, Age, and Course are attributes of the "Students" table.
8. Primary Key
• Definition: A unique identifier for each row in a table to ensure no duplicate records.
9. Foreign Key
• Definition: An attribute in one table that refers to the primary key of another table to establish
relationships.
10. Candidate Key
• Definition: A minimal set of attributes that can uniquely identify a row. One candidate key is chosen
as the primary key.
11. Composite Key
• Definition: A key formed by combining two or more columns to uniquely identify a row.
12. Super Key
• Definition: Any set of one or more attributes that uniquely identifies a record in a table; it may
contain attributes that are not needed for uniqueness.
13. Index
• Definition: A data structure used to quickly locate and access data within a table.
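A rough way to see an index at work is to compare the query plan before and after creating one. The sketch below assumes SQLite via Python's `sqlite3` module; the `students` table and the index name `idx_students_name` are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany("INSERT INTO students (name, age) VALUES (?, ?)",
                 [(f"student{i}", 18 + i % 10) for i in range(1000)])
conn.commit()

def query_plan(conn, sql):
    """Return SQLite's plan description (last column of each EXPLAIN QUERY PLAN row)."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

lookup = "SELECT * FROM students WHERE name = 'student42'"
plan_without_index = query_plan(conn, lookup)   # a full scan of the table
conn.execute("CREATE INDEX idx_students_name ON students (name)")
plan_with_index = query_plan(conn, lookup)      # a search that names the index
```

Printing the two plans shows the lookup switching from scanning every row to searching through `idx_students_name`.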
14. Normalization
• Definition: The process of organizing data to reduce redundancy and dependency by dividing tables
logically.
• Example: Splitting a "Students" table into separate "Students" and "Courses" tables.
15. Denormalization
• Definition: The process of combining tables to improve query performance at the cost of redundancy.
17. Transaction
• Example: Transferring money between bank accounts (debit and credit operations).
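The bank transfer can be sketched as a single all-or-nothing unit. This is an illustration under stated assumptions, not a banking implementation: it uses SQLite through Python's `sqlite3` module, a hypothetical `accounts` table, and a CHECK constraint to make an overdraft fail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one unit; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))
```

Calling `transfer(conn, 1, 2, 30)` moves the money; calling `transfer(conn, 1, 2, 500)` fails on the debit, and the credit that was already applied is rolled back with it, so no money appears from nowhere.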
18. Query
• Example:
• Example:
• Example: Constraints like NOT NULL, UNIQUE, and CHECK maintain data integrity.
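The three constraints named above can be exercised in a short sketch. It assumes SQLite via Python's `sqlite3` module and a hypothetical `employees` table; the helper reads the constraint name from SQLite's error message, which is a convenience for the demonstration rather than a stable API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE,
        age   INTEGER CHECK (age >= 18)
    )
""")

def try_insert(conn, email, age):
    """Return 'ok', or the kind of constraint that rejected the row."""
    try:
        with conn:
            conn.execute("INSERT INTO employees (email, age) VALUES (?, ?)",
                         (email, age))
        return "ok"
    except sqlite3.IntegrityError as exc:
        # SQLite messages look like "UNIQUE constraint failed: employees.email".
        return str(exc).split(" constraint")[0]

results = [
    try_insert(conn, "a@example.com", 30),  # valid row
    try_insert(conn, "a@example.com", 41),  # duplicate email
    try_insert(conn, None, 25),             # missing email
    try_insert(conn, "b@example.com", 12),  # age below the allowed minimum
]
```

Only the first row is stored; each of the other three is rejected by a different constraint.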
• Definition: Issues arising from data redundancy and inconsistency during operations like insert,
update, and delete.
• Example: An update anomaly when the same data must be changed in multiple places.
• Example: A student record cannot exist without a valid department in the "Departments" table.
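The student and department rule can be sketched with a foreign key. The example assumes SQLite via Python's `sqlite3` module, where foreign key enforcement has to be switched on per connection; the column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE Departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute("""
    CREATE TABLE Students (
        student_id INTEGER PRIMARY KEY,
        name       TEXT,
        dept_id    INTEGER NOT NULL REFERENCES Departments(dept_id)
    )
""")
conn.execute("INSERT INTO Departments VALUES (1, 'CS')")
conn.commit()

def add_student(conn, student_id, name, dept_id):
    """Insert a student; return False if the department does not exist."""
    try:
        with conn:
            conn.execute("INSERT INTO Students VALUES (?, ?, ?)",
                         (student_id, name, dept_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

`add_student(conn, 101, "John", 1)` succeeds, while `add_student(conn, 102, "Mary", 99)` is rejected because department 99 is not in "Departments".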
24. Views
• Definition: A virtual table based on the result of an SQL query that does not store data itself.
• Example:
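One possible illustration, assuming SQLite via Python's `sqlite3` module and the "Students" table used earlier in these notes; the view name `cs_students` is hypothetical. The view stores only its defining query, so rows added to the base table later appear in it without any extra step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER, Course TEXT)")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?)",
                 [(101, "John", 25, "CS"), (102, "Asha", 22, "IT")])
# The view stores the query text, not a copy of the rows.
conn.execute(
    "CREATE VIEW cs_students AS SELECT ID, Name FROM Students WHERE Course = 'CS'")
conn.commit()

def cs_students(conn):
    """Read the view exactly as if it were a table."""
    return conn.execute("SELECT ID, Name FROM cs_students ORDER BY ID").fetchall()

first = cs_students(conn)
# A later insert into the base table shows up in the view automatically.
conn.execute("INSERT INTO Students VALUES (103, 'Ravi', 23, 'CS')")
conn.commit()
second = cs_students(conn)
```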
• Definition: A precompiled set of SQL statements stored in the database for reuse.
• Example:
26. Trigger
• Definition: A special procedure that automatically executes when a specified event occurs in the
database.
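A trigger can be sketched as an audit log that fills itself. The example assumes SQLite via Python's `sqlite3` module; the tables `employees` and `salary_audit` and the trigger name `log_salary_change` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    CREATE TABLE salary_audit (emp_id INTEGER, old_salary INTEGER, new_salary INTEGER);

    -- Fires automatically after every update of the salary column.
    CREATE TRIGGER log_salary_change
    AFTER UPDATE OF salary ON employees
    BEGIN
        INSERT INTO salary_audit VALUES (OLD.id, OLD.salary, NEW.salary);
    END;

    INSERT INTO employees VALUES (1, 'Alice', 50000);
""")

# The application only updates the salary; it never touches salary_audit.
conn.execute("UPDATE employees SET salary = 55000 WHERE id = 1")
conn.commit()
audit_rows = conn.execute("SELECT * FROM salary_audit").fetchall()
```

After the update, `audit_rows` holds one row recording the old and new salary, written by the trigger rather than by the application.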
27. Deadlock
• Definition: A situation where two or more transactions are waiting for each other to release resources,
causing a standstill.
• Example: Transaction A waits for a record locked by transaction B, while B waits for A.
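One common way to describe a deadlock is as a cycle in a wait-for graph, where an edge from A to B means "A is waiting for a lock held by B". The sketch below is an illustrative cycle search over such a graph; the dictionary representation and the function name `find_deadlock` are assumptions, and it is not how any particular DBMS implements detection.

```python
def find_deadlock(waits_for):
    """Return the transactions forming a cycle in the wait-for graph, or None.

    waits_for maps each transaction to the transactions it is waiting on.
    """
    visiting = []   # transactions on the current depth-first path
    done = set()    # transactions already proven to be cycle-free

    def visit(txn):
        if txn in visiting:
            # Reached a transaction already on the path: that is a cycle.
            return visiting[visiting.index(txn):]
        if txn in done:
            return None
        visiting.append(txn)
        for blocker in waits_for.get(txn, ()):
            cycle = visit(blocker)
            if cycle:
                return cycle
        visiting.pop()
        done.add(txn)
        return None

    for txn in waits_for:
        cycle = visit(txn)
        if cycle:
            return cycle
    return None
```

For the example above, `find_deadlock({"A": ["B"], "B": ["A"]})` reports the cycle between A and B, while a graph in which B waits for nothing yields `None`.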
28. Replication
• Definition: The process of copying data from one database to another for backup or distribution
purposes.
29. Partitioning
• Definition: Dividing a large table into smaller, more manageable pieces for better performance.
• OLAP (Online Analytical Processing): Used for complex queries and data analysis (e.g., business
intelligence).
The architecture of a database system defines how data is stored, processed, and accessed within the system.
It provides a systematic way to manage databases efficiently and securely. There are two primary types of
database architectures:
The three-level architecture (also called the three-schema architecture) separates the database system into
different layers to enhance performance, scalability, and maintainability. The three levels are:
• Description:
o The lowest level of the database architecture that defines how data is stored physically on
storage devices (hard disks, SSDs, etc.).
• Responsibilities:
• Description:
o This level provides a logical view of the entire database and defines relationships among data.
o It specifies what data is stored and the relationships between them without focusing on how
they are physically stored.
• Description:
o Different users may require different views of the same data based on their roles.
• Responsibilities:
• Example: A bank teller sees only customer balance, while the manager sees all customer information.
The physical database architecture defines how the database system is physically deployed and accessed
by users. It is commonly categorized into:
a) Centralized Architecture
• Description:
• Advantages:
b) Distributed Architecture
• Description:
o Each site can have its local database but communicate with others.
• Advantages:
• Disadvantages:
c) Client-Server Architecture
• Description:
• Advantages:
A database system consists of several key components that work together to ensure data is managed
efficiently.
2. Software: The DBMS software that manages the database (e.g., MySQL, PostgreSQL, Oracle).
• Example: MS Access.
2. Two-Tier Architecture
3. Three-Tier Architecture
A data model defines how data is structured, organized, and manipulated within a database system. It
provides a systematic way to store and retrieve data while ensuring consistency and integrity.
• Concept:
• Example:
o Organization structure:
Company
├── HR Department
├── IT Department
Advantages:
Disadvantages:
• Concept:
o Represents data using a graph-like structure with nodes and relationships (many-to-many
relationships).
• Example:
o A student can enroll in multiple courses, and each course can have multiple students.
• Advantages:
• Disadvantages:
• Concept:
o Organizes data into tables (relations) consisting of rows (tuples) and columns (attributes).
Students Table:
+-----------+---------+
| StudentID | Name    |
+-----------+---------+
| 101       | Alice   |
| 102       | Bob     |
+-----------+---------+
Advantages:
Disadvantages:
• Concept:
• Example:
o A "Car" object may contain attributes like Brand, Model, and behaviors like Start() and
Stop().
• Advantages:
Dr J Dhanalakshmi, Assistant Professor, DSBS, SRMIST
UNIT 1
o Useful for complex data like multimedia, CAD/CAM.
• Disadvantages:
• Concept:
o Represents data as entities (objects) and relationships between them using diagrams.
• Example:
o Entity: Student
• Advantages:
• Disadvantages:
• Concept:
o Used in NoSQL databases, storing data as key-value pairs, documents, column-family stores,
or graphs.
• Types:
o Column-Family Model: Data stored in columns rather than rows (e.g., Cassandra).
• Advantages:
• Disadvantages:
a) Entities
• Types of Entities:
o Weak Entity: Depends on a strong entity and cannot exist without it (e.g., Order Item,
dependent on Order).
Notation:
• Represented by a rectangle.
o Example: STUDENT
Attributes
• Types of Attributes:
2. Composite: Can be broken down into smaller components (e.g., Name → First Name, Last
Name).
3. Derived: Calculated from other attributes (e.g., Age from Date of Birth).
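A derived attribute is computed when needed instead of being stored. As a small sketch of the Age example, the hypothetical helper `age_on` below derives age in completed years from a stored date of birth; the reference date is passed in explicitly so the result is reproducible.

```python
from datetime import date

def age_on(date_of_birth, today):
    """Derive age in completed years from a stored date of birth."""
    # The birthday has occurred this year if (month, day) is not before it.
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)
```

For someone born on 15 June 2000, `age_on(date(2000, 6, 15), date(2025, 6, 14))` gives 24 and the same call one day later gives 25.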
Notation:
• Types of Relationships:
2. One-to-Many (1:M): One entity in A can relate to multiple entities in B, but each entity in B relates to at most one entity in A.
Notation:
Keys
• Types of Keys:
a) Cardinality
Defines the number of instances of an entity that can be associated with instances of another entity.
• Types:
Notation:
• 1: Single relationship
• M or N: Multiple relationships
b) Participation
The Enhanced ER Model (EER) extends the basic ER model by adding more sophisticated concepts to
handle complex database requirements.
a) Generalization
b) Specialization
c) Aggregation
• Notation: Draw a box around the relationship and relate it to another entity.
d) Categorization
• A way to represent a single entity as a subclass of multiple superclasses.
Example of an ER Diagram
Entities:
Relationships:
+-------------+ +-----------+
| STUDENT | | COURSE |
+-------------+ +-----------+
| |
| |
+-------------+ +-----------+
| PROFESSOR |
+-------------+
Advantages of ER Diagrams
Once the ER diagram is designed, it needs to be converted into a relational schema using the following
steps:
2. Convert Relationships.
3. Define Keys.
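The conversion can be sketched for the student and course case used earlier in these notes. This is an illustration under assumptions: it uses SQLite via Python's `sqlite3` module, and the relationship table name `Enrolls` and the column names are hypothetical. Each entity becomes a table with a primary key, and the many-to-many relationship becomes its own table whose key combines the keys of both entities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Entities become tables; each gets a primary key.
    CREATE TABLE Student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Course  (course_id  INTEGER PRIMARY KEY, title TEXT);

    -- A many-to-many relationship becomes its own table whose key
    -- combines the keys of the two participating entities.
    CREATE TABLE Enrolls (
        student_id INTEGER REFERENCES Student(student_id),
        course_id  INTEGER REFERENCES Course(course_id),
        PRIMARY KEY (student_id, course_id)
    );

    INSERT INTO Student VALUES (101, 'Alice'), (102, 'Bob');
    INSERT INTO Course  VALUES (1, 'Databases'), (2, 'Networks');
    INSERT INTO Enrolls VALUES (101, 1), (101, 2), (102, 1);
""")

def courses_of(conn, student_id):
    """Follow the relationship table from a student to that student's courses."""
    rows = conn.execute(
        "SELECT c.title FROM Enrolls e JOIN Course c ON c.course_id = e.course_id "
        "WHERE e.student_id = ? ORDER BY c.title", (student_id,))
    return [title for (title,) in rows]
```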
The Extended Entity-Relationship (EER) model extends the basic ER model by incorporating additional
concepts such as generalization, specialization, and aggregation, which help in designing more complex and
realistic data models.
1. Generalization
Definition:
Generalization is the process of combining two or more lower-level entity sets into a higher-level entity set
based on common attributes. It abstracts common characteristics and creates a more generalized,
higher-level entity.
Example:
Consider Car and Truck as separate entities with common attributes such as vehicle_id, manufacturer, and
price. Using generalization, we can create a higher-level entity called Vehicle that encompasses these
common attributes.
    Vehicle
    /     \
  Car     Truck
Key Points:
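CREATE TABLE Vehicle (
    vehicle_id INT PRIMARY KEY,
    manufacturer VARCHAR(50),
    price DECIMAL(10,2)
);
CREATE TABLE Car (
    vehicle_id INT PRIMARY KEY REFERENCES Vehicle(vehicle_id),
    car_type VARCHAR(50)
);
CREATE TABLE Truck (
    vehicle_id INT PRIMARY KEY REFERENCES Vehicle(vehicle_id),
    load_capacity INT
);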
2. Specialization
Definition:
Specialization is the opposite of generalization. It involves creating subclasses from a higher-level entity
based on distinguishing characteristics. It adds more specific attributes to each subclass.
Example:
If we have a general entity Employee, we can specialize it into Manager and Technician, with each having
unique attributes such as manager_bonus and technical_skill.
Diagram Representation:
      Employee
      /      \
Manager    Technician
Key Points:
• Introduces constraints like disjoint (no overlap between subclasses) or overlapping (an instance can
belong to multiple subclasses).
CREATE TABLE Employee (
    emp_id INT PRIMARY KEY,
    name VARCHAR(50),
    salary DECIMAL(10,2)
);
CREATE TABLE Manager (
    emp_id INT PRIMARY KEY,
    manager_bonus DECIMAL(10,2),
FOREIGN KEY (emp_id) REFERENCES Employee(emp_id)
);
CREATE TABLE Technician (
    emp_id INT PRIMARY KEY,
    technical_skill VARCHAR(50),
    FOREIGN KEY (emp_id) REFERENCES Employee(emp_id)
);
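To see how the subclass tables are used, the sketch below builds the Employee, Manager, and Technician tables in SQLite (via Python's `sqlite3` module) and joins a subclass back to its superclass. The sample rows and the helper `managers` are made up for illustration, and each subclass is assumed to share the `emp_id` key of its superclass.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Employee (
        emp_id INTEGER PRIMARY KEY,
        name VARCHAR(50),
        salary DECIMAL(10,2)
    );
    CREATE TABLE Manager (
        emp_id INTEGER PRIMARY KEY,
        manager_bonus DECIMAL(10,2),
        FOREIGN KEY (emp_id) REFERENCES Employee(emp_id)
    );
    CREATE TABLE Technician (
        emp_id INTEGER PRIMARY KEY,
        technical_skill VARCHAR(50),
        FOREIGN KEY (emp_id) REFERENCES Employee(emp_id)
    );
    INSERT INTO Employee VALUES (1, 'Alice', 90000), (2, 'Bob', 60000);
    INSERT INTO Manager VALUES (1, 5000);
    INSERT INTO Technician VALUES (2, 'Networking');
""")

def managers(conn):
    """Join the subclass back to the superclass to get a full manager record."""
    return conn.execute(
        "SELECT e.name, e.salary, m.manager_bonus "
        "FROM Manager m JOIN Employee e ON e.emp_id = m.emp_id").fetchall()
```

Common attributes (name, salary) live once in Employee, while the subclass table adds only what is specific to managers.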
3. Aggregation
Definition:
Aggregation is a higher-level abstraction that treats a relationship between entities as a higher-level entity
in its own right. This allows a relationship to take part in another relationship.
Example:
Consider an entity Project that is assigned to an entity Employee, and there is a relationship Works_On. If
we want to track which Department oversees a given project, we can aggregate the Works_On relationship
into a higher-level entity.
Diagram Representation:
Employee    Project
      \      /
      Works_On
         |
     Overseen_By
         |
     Department
Key Points:
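CREATE TABLE Employee (
    emp_id INT PRIMARY KEY,
    name VARCHAR(50)
);
CREATE TABLE Project (
    proj_id INT PRIMARY KEY,
    proj_name VARCHAR(50)
);
CREATE TABLE Works_On (
    emp_id INT,
    proj_id INT,
    hours_worked INT,
    PRIMARY KEY (emp_id, proj_id),
    FOREIGN KEY (emp_id) REFERENCES Employee(emp_id),
    FOREIGN KEY (proj_id) REFERENCES Project(proj_id)
);
CREATE TABLE Department (
    dept_id INT PRIMARY KEY,
    dept_name VARCHAR(50)
);
CREATE TABLE Overseen_By (
    proj_id INT,
    dept_id INT,
    FOREIGN KEY (proj_id) REFERENCES Project(proj_id),
    FOREIGN KEY (dept_id) REFERENCES Department(dept_id)
);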
1. Use Generalization when you find multiple entities with common attributes and want to simplify
your design.
2. Use Specialization when you need to introduce more specific attributes for some entities.
3. Use Aggregation when dealing with complex relationships involving multiple entities.
4. Enrollment – Represents the association between students and courses (many-to-many relationship).
1. Student
o first_name
o last_name
o email
o phone
o date_of_birth
2. Course
o course_name
o credits
3. Instructor
o first_name
o last_name
o email
o phone
o enrollment_date
5. Department
o department_name
o head_of_department
6. Grades
o grade
o comments
1. Student enrolls in multiple Courses, and each course can have multiple students. (Many-to-Many)
2. Each Course is taught by one Instructor, but an instructor can teach multiple courses. (One-to-Many)
4. Each Student belongs to one Department, but a department can have many students. (One-to-Many)
5. Each Enrollment belongs to a single Student and a single Course. (Many-to-One for both)
+-------------------+ +-------------------+
| Department | | Instructor |
+-------------------+ +-------------------+
| head_of_department | | first_name |
+-------------------+ | last_name |
| +-------------------+
| (One-to-Many)
+-------------------------------+
| Course |
+-------------------------------+
| course_id (PK) |
| course_name |
| credits |
| department_id (FK) |
+-------------------------------+
+----------------------+ +----------------------+
| Enrollment | | Student |
+----------------------+ +----------------------+
| enrollment_date | | email |
+----------------------+ | phone |
+----------------------+
| (One-to-Many)
+----------------+
| Grades |
+----------------+
| grade_id (PK) |
| student_id (FK) |
| course_id (FK) |
| grade |
| comments |
+----------------+
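-- Department Table
CREATE TABLE Department (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(100),
    head_of_department VARCHAR(100)
);
-- Instructor Table
CREATE TABLE Instructor (
    instructor_id INT PRIMARY KEY,  -- key name is an assumption
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100),
    phone VARCHAR(15),
    department_id INT,
    FOREIGN KEY (department_id) REFERENCES Department(department_id)
);
-- Student Table
CREATE TABLE Student (
    student_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100),
    phone VARCHAR(15),
    department_id INT,
    FOREIGN KEY (department_id) REFERENCES Department(department_id)
);
-- Course Table
CREATE TABLE Course (
    course_id INT PRIMARY KEY,
    course_name VARCHAR(100),
    credits INT,
    department_id INT,
    FOREIGN KEY (department_id) REFERENCES Department(department_id)
);
-- Enrollment Table
CREATE TABLE Enrollment (
    student_id INT,
    course_id INT,
    enrollment_date DATE,
    PRIMARY KEY (student_id, course_id),  -- composite key is an assumption
    FOREIGN KEY (student_id) REFERENCES Student(student_id),
    FOREIGN KEY (course_id) REFERENCES Course(course_id)
);
-- Grades Table
CREATE TABLE Grades (
    grade_id INT PRIMARY KEY,
    student_id INT,
    course_id INT,
    grade CHAR(2),
    comments VARCHAR(255),
    FOREIGN KEY (student_id) REFERENCES Student(student_id),
    FOREIGN KEY (course_id) REFERENCES Course(course_id)
);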
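SELECT s.first_name, s.last_name  -- selected columns are an assumption
FROM Student s
WHERE s.department_id = 1;

SELECT c.course_name, COUNT(*) AS grade_count  -- join and aggregate are assumptions
FROM Grades g
JOIN Course c ON g.course_id = c.course_id
GROUP BY c.course_name;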
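The two queries above filter students by department and group grade records by course. The sketch below runs queries of that shape against a cut-down version of the schema in SQLite (via Python's `sqlite3` module). The sample rows are made up, and the selected columns, the join to Course, and the COUNT aggregate are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (student_id INTEGER PRIMARY KEY, first_name TEXT,
                          last_name TEXT, department_id INTEGER);
    CREATE TABLE Course  (course_id INTEGER PRIMARY KEY, course_name TEXT,
                          credits INTEGER, department_id INTEGER);
    CREATE TABLE Grades  (grade_id INTEGER PRIMARY KEY, student_id INTEGER,
                          course_id INTEGER, grade CHAR(2), comments TEXT);

    INSERT INTO Student VALUES (1, 'Alice', 'Rao', 1), (2, 'Bob', 'Iyer', 2),
                               (3, 'Chitra', 'Nair', 1);
    INSERT INTO Course VALUES (10, 'Databases', 4, 1), (11, 'Networks', 3, 1);
    INSERT INTO Grades VALUES (1, 1, 10, 'A', NULL), (2, 3, 10, 'B', NULL),
                              (3, 1, 11, 'A', NULL);
""")

# Students who belong to department 1.
students_in_dept_1 = conn.execute(
    "SELECT s.first_name, s.last_name FROM Student s "
    "WHERE s.department_id = 1 ORDER BY s.student_id").fetchall()

# Number of grade records per course, using the Grades-to-Course join.
grades_per_course = conn.execute(
    "SELECT c.course_name, COUNT(*) FROM Grades g "
    "JOIN Course c ON c.course_id = g.course_id "
    "GROUP BY c.course_name ORDER BY c.course_name").fetchall()
```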