Unit 1 Notes

This document provides an overview of Database Management Systems (DBMS), including their functions, types, advantages, and disadvantages. It highlights the limitations of traditional file processing systems and explains the need for DBMS in managing data efficiently. Key terminologies related to databases, such as data, schema, tables, and ACID properties, are also defined.

UNIT 1

Introduction to DBMS (Database Management System)

A Database Management System (DBMS) is software for creating, managing, and interacting with
databases. It provides an efficient, secure, and convenient way to store, retrieve, and
manipulate data.

What is a Database?

A database is an organized collection of data that can be easily accessed, managed, and updated. Examples
include customer records, inventory systems, financial transactions, etc.

Functions of a DBMS

1. Data Storage, Retrieval, and Manipulation:

o Stores data efficiently and allows users to retrieve and update it easily.

2. Data Integrity and Security:

o Ensures data consistency, accuracy, and security through constraints and user access control.

3. Data Redundancy Reduction:

o Minimizes data duplication and optimizes storage.

4. Multi-user Access Control:

o Allows concurrent access to data by multiple users while maintaining consistency.

5. Backup and Recovery:

o Provides mechanisms to recover data in case of system failures.

6. Data Independence:

o Separates the data structure from application programs, making changes easier.

Types of DBMS

1. Hierarchical DBMS:

o Data is organized in a tree-like structure.

o Example: IBM's Information Management System (IMS).

o Pros: Fast access; Cons: Difficult to modify relationships.

Dr J Dhanalakshmi, Assistant Professor, DSBS, SRMIST


2. Network DBMS:

o Data is represented using a graph with multiple parent-child relationships.

o Example: Integrated Data Store (IDS).

o Pros: More flexible than hierarchical; Cons: Complex structure.

3. Relational DBMS (RDBMS):

o Data is stored in tables (relations) with rows and columns.

o Example: MySQL, PostgreSQL, Oracle, SQL Server.

o Pros: Easy to use and highly flexible.

4. Object-Oriented DBMS (OODBMS):

o Data is stored as objects, similar to programming concepts (OOP).

o Example: ObjectDB, db4o.

o Pros: Good for complex data types.

Advantages of Using DBMS

• Improved Data Sharing: Multiple users can access data concurrently.

• Data Consistency: Ensures accuracy and integrity.

• Security: Provides role-based access control and encryption.

• Scalability: Can handle large amounts of data efficiently.

• Data Abstraction: Hides the complexity from the end user.

Disadvantages of DBMS

• Complexity: Setting up and managing a DBMS requires expertise.

• Cost: Licensing and maintenance can be expensive.

• Performance Overhead: More resource-intensive compared to simple file storage.

• Failure Risks: Because data is centralized, a single DBMS failure can affect every application that uses it.

Popular DBMS Software

1. Relational DBMS (RDBMS):

o MySQL

o PostgreSQL

o Oracle Database

o Microsoft SQL Server

2. NoSQL DBMS (Non-Relational):

o MongoDB (Document-based)

o Cassandra (Column-family)

o Redis (Key-value)

o Neo4j (Graph-based)

Key Components of a DBMS

1. DBMS Engine: Manages data storage, processing, and retrieval.

2. Database Schema: Defines the logical structure of the database.

3. Query Processor: Interprets and executes queries (e.g., SQL).

4. Transaction Management: Ensures data consistency through ACID properties.

5. Backup and Recovery Systems: Prevent data loss and restore data after failures.

Issues in File Processing System

The File Processing System (FPS) is an early method of managing data in which information is stored in separate
files, each maintained by its own application program. Although it was widely used before modern databases,
it has several limitations that led to the development of Database Management Systems (DBMS).

Key Issues in File Processing System

1. Data Redundancy and Inconsistency

o Problem: Same data is stored in multiple files, leading to duplication.

o Example: Customer details stored separately in billing and order files.

o Impact:

▪ Increases storage space requirements.

▪ Changes in one file may not reflect in others, causing inconsistencies.

2. Lack of Data Integrity and Security

o Problem: No centralized control to enforce data integrity rules (e.g., constraints).

o Example: Incorrect values for age or salary may be entered without validation.

o Impact:

▪ Leads to unreliable data.

▪ No access control—unauthorized users can modify data.

3. Limited Data Sharing and Accessibility

o Problem: Different applications maintain separate files with no shared access.

o Example: Payroll system cannot access HR data easily.

o Impact:

▪ Duplication of effort.

▪ Difficulties in data sharing across departments.

4. Data Isolation

o Problem: Data is scattered across multiple files and formats, making it hard to retrieve.

o Example: Retrieving data from sales and customer records requires separate manual searches.

o Impact:

▪ Complex queries are difficult to implement.

▪ Inconsistent data formats cause compatibility issues.

5. Program-Data Dependence

o Problem: File structures are hard-coded in application programs.

o Example: Any changes in file format require rewriting application code.

o Impact:

▪ High maintenance cost.

▪ Slows down system upgrades and modifications.

6. Concurrency Issues

o Problem: Multiple users accessing the same file can cause conflicts.

o Example: Two users updating an inventory record simultaneously may overwrite changes.

o Impact:

▪ Data inconsistencies.

▪ Difficulty in multi-user environments.

7. Lack of Backup and Recovery Mechanisms

o Problem: No built-in support for data backup and recovery in case of failure.

o Example: If a file is accidentally deleted, it may be lost permanently.

o Impact:

▪ Risk of data loss.

▪ Business continuity is affected.

8. Scalability Issues

o Problem: As data grows, managing files manually becomes challenging.

o Example: Searching for a record in large files becomes slower over time.

o Impact:

▪ Poor performance with increasing data volume.

▪ Limited ability to handle growing data needs.

9. Lack of Standardization

o Problem: Different applications use different file formats and structures.

o Example: Payroll may use CSV files, while HR uses Excel sheets.

o Impact:

▪ Difficult integration between systems.

▪ Data exchange requires conversion efforts.

10. Complex Data Relationships

o Problem: Difficult to represent complex relationships between data elements.

o Example: Customer orders linked to multiple products and payments.

o Impact:

▪ Hard to perform relational queries.

▪ Data interconnectivity is difficult to maintain.
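
The redundancy and inconsistency issue listed first above can be sketched in a few lines. This is only an illustration: the two in-memory "files", the customer record, and the helper find_inconsistencies are hypothetical stand-ins for separate billing and order files.

```python
import csv
import io

# Two separate "files" (hypothetical contents), each holding its own copy
# of the same customer's address, as in a file processing system.
billing_file = io.StringIO("customer_id,name,address\n1,John,12 Oak St\n")
orders_file = io.StringIO("customer_id,name,address\n1,John,12 Oak St\n")

billing = list(csv.DictReader(billing_file))
orders = list(csv.DictReader(orders_file))

# The billing program updates its own copy; nothing propagates the change.
billing[0]["address"] = "99 Pine Ave"

def find_inconsistencies(file_a, file_b, key="customer_id", field="address"):
    """Return the keys whose `field` value differs between the two record lists."""
    b_index = {row[key]: row[field] for row in file_b}
    return [row[key] for row in file_a
            if row[key] in b_index and row[field] != b_index[row[key]]]

conflicts = find_inconsistencies(billing, orders)
print(conflicts)
```

The order file still holds the old address, so the same customer now has two different addresses depending on which program is asked.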

Need for DBMS

A Database Management System (DBMS) is essential for efficiently storing, managing, and retrieving data in
a structured manner. Traditional file processing systems have several limitations, such as data redundancy,
inconsistency, and difficulty in managing large volumes of data. DBMS overcomes these limitations and
provides a robust, secure, and scalable way to handle data.

Key Reasons for Using a DBMS

1. Data Redundancy and Inconsistency Control

o Problem in File Systems: Data duplication occurs when the same data is stored in multiple
files, leading to inconsistencies.

o DBMS Solution:

▪ Centralized data storage reduces redundancy.

▪ Ensures data consistency through integrity constraints and normalization.

o Example: Customer details are stored once and referenced across multiple modules (billing,
orders, etc.).

2. Efficient Data Sharing and Multi-user Access

o Problem in File Systems: Difficult for multiple users to access and modify data
simultaneously.

o DBMS Solution:

▪ Supports concurrent access with transaction management (ACID properties).

▪ Role-based access control to manage data sharing securely.

o Example: Employees in different departments can access and update data without conflicts.

3. Improved Data Security

o Problem in File Systems: No control over unauthorized access and accidental data loss.

o DBMS Solution:

▪ Provides authentication, authorization, and encryption mechanisms.

▪ Role-based access ensures only authorized users can modify or view sensitive data.

o Example: Only HR personnel can access employee salary details.

4. Data Integrity and Accuracy

o Problem in File Systems: No control over incorrect or duplicate data entries.

o DBMS Solution:

▪ Enforces integrity constraints such as Primary Key, Foreign Key, and Unique
constraints.

▪ Ensures consistent and accurate data across the system.

o Example: Prevents duplicate entries of a customer by enforcing unique constraints.

5. Centralized Data Management

o Problem in File Systems: Data is stored in different files across multiple locations.

o DBMS Solution:

▪ Centralized storage allows easier management and access.

▪ Simplifies maintenance and improves data visibility.

o Example: A single database for an entire enterprise improves decision-making and reporting.

6. Data Backup and Recovery

o Problem in File Systems: Manual backup procedures and no efficient recovery mechanism.

o DBMS Solution:

▪ Automated backup mechanisms reduce the risk of data loss.

▪ Recovery options allow restoring data in case of system failures.

o Example: Recovery features ensure business continuity in case of accidental deletions.

7. Reduced Application Development Time


o Problem in File Systems: Developers need to write complex code to handle data storage and
retrieval.

o DBMS Solution:

▪ Provides a structured framework with built-in functions for data handling (SQL).

▪ Simplifies application development with CRUD (Create, Read, Update, Delete) operations.

o Example: Developers can quickly retrieve data using simple SQL queries instead of complex
programming logic.

8. Data Independence

o Problem in File Systems: Changes in data structure require modifying application code.

o DBMS Solution:

▪ Provides physical and logical data independence.

▪ Many changes in database structure can be made without affecting applications.

o Example: Adding a new column to a table does not require changes in applications that do not use it.

9. Scalability and Performance Optimization

o Problem in File Systems: Poor performance with increasing data size.

o DBMS Solution:

▪ Efficient indexing, query optimization, and partitioning mechanisms for large data
handling.

▪ Supports scalability to handle large volumes of transactions.

o Example: E-commerce platforms handle millions of transactions efficiently.

10. Concurrency Control

o Problem in File Systems: Simultaneous updates by multiple users may lead to data inconsistencies.

o DBMS Solution:

▪ Ensures data consistency using locking mechanisms and transaction isolation levels.

o Example: Prevents double booking of hotel rooms in an online reservation system.
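
As a small sketch of reason 4 (Data Integrity and Accuracy), the following uses Python's built-in sqlite3 module to show a DBMS rejecting invalid rows on its own. The Customers table, its columns, and the helper try_insert are assumptions made for this illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Customers (
        Customer_ID INTEGER PRIMARY KEY,
        Email       TEXT NOT NULL UNIQUE,
        Age         INTEGER CHECK (Age BETWEEN 0 AND 150)
    )
""")
conn.execute("INSERT INTO Customers VALUES (1, 'john@example.com', 25)")
conn.commit()

def try_insert(row):
    """Attempt an insert; return True if accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO Customers VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_email = try_insert((2, "john@example.com", 30))  # violates UNIQUE
invalid_age = try_insert((3, "mary@example.com", -5))      # violates CHECK
valid_row = try_insert((4, "alice@example.com", 41))       # accepted
row_count = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]
```

The application code contains no validation logic of its own; the rules live in the table definition and apply to every program that writes to it.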

Basic terminologies of Database

1. Data

• Definition: Raw facts and figures that have no meaning until processed.

• Example: "John," "25," and "NYC" are individual pieces of data.

2. Database

• Definition: A structured collection of related data that can be accessed, managed, and updated easily.

• Example: A library database containing book records with titles, authors, and publication years.

3. DBMS (Database Management System)

• Definition: A software system that allows users to create, manage, and manipulate databases.

• Examples: MySQL, PostgreSQL, Oracle, MongoDB.

4. Schema

• Definition: The logical structure of the database that defines tables, columns, and relationships.

• Example: A schema of an employee database includes tables for employees, departments, and salaries.

5. Table (Relation)

• Definition: A collection of rows and columns that store related data in a structured format.

• Example: A "Students" table with columns like ID, Name, Age, Course.

6. Row (Tuple, Record)

• Definition: A single entry or record in a table.

• Example: (101, John, 25, CS) represents one student in the "Students" table.

7. Column (Attribute, Field)

• Definition: A named, vertical set of values in a table that represents one characteristic of the entity; every row holds one value for each column.

• Example: Name, Age, and Course are attributes of the "Students" table.

8. Primary Key

• Definition: A unique identifier for each row in a table to ensure no duplicate records.

• Example: The Student_ID in a "Students" table.

9. Foreign Key

• Definition: An attribute in one table that refers to the primary key of another table to establish
relationships.

• Example: Department_ID in the "Employees" table, referencing the "Departments" table.

10. Candidate Key

• Definition: A minimal set of attributes that uniquely identifies a row, meaning no attribute can be removed
without losing uniqueness. One candidate key is chosen as the primary key.

• Example: Email and Phone_Number in an "Employees" table can each be a candidate key.

11. Composite Key

• Definition: A key formed by combining two or more columns to uniquely identify a row.

• Example: OrderID and ProductID together to identify a specific order item.

12. Super Key

• Definition: Any set of one or more attributes that uniquely identifies a record in a table; unlike a candidate key, it may contain extra attributes.

• Example: Student_ID alone or the combination of Student_ID and Email.
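
The difference between a super key and a candidate key can be checked mechanically on sample data. The sketch below is illustrative only: the rows and the two helper functions are hypothetical, and uniqueness in a sample merely suggests a key, since real keys are declared from the meaning of the data.

```python
from itertools import combinations

# Hypothetical sample rows for an "Employees" table.
rows = [
    {"Emp_ID": 1, "Email": "a@x.com", "Phone": "111", "Dept": "HR"},
    {"Emp_ID": 2, "Email": "b@x.com", "Phone": "222", "Dept": "IT"},
    {"Emp_ID": 3, "Email": "c@x.com", "Phone": "333", "Dept": "IT"},
]

def is_super_key(attrs, rows):
    """True if no two rows share the same values for `attrs`."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(attrs, rows):
    """A candidate key is a super key with no smaller super key inside it."""
    if not is_super_key(attrs, rows):
        return False
    return not any(
        is_super_key(subset, rows)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

print(is_super_key(("Emp_ID", "Email"), rows))      # unique, but not minimal
print(is_candidate_key(("Emp_ID", "Email"), rows))  # Emp_ID alone already suffices
print(is_candidate_key(("Email",), rows))
```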

13. Index

• Definition: A data structure used to quickly locate and access data within a table.

• Example: Indexing the Last_Name column to speed up search queries.
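
A minimal sketch of the indexing example, assuming SQLite (through Python's sqlite3 module) as the DBMS. The table contents, the index name idx_last_name, and the helper query_plan are hypothetical; EXPLAIN QUERY PLAN is SQLite-specific, and its wording differs between versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (Emp_ID INTEGER PRIMARY KEY, Last_Name TEXT, Age INTEGER)"
)
conn.executemany(
    "INSERT INTO Employees (Last_Name, Age) VALUES (?, ?)",
    [("Smith", 30), ("Jones", 41), ("Brown", 25)],
)

query = "SELECT * FROM Employees WHERE Last_Name = 'Jones'"

def query_plan(sql):
    """Return SQLite's plan description for a query as one string."""
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in plan_rows)

plan_before = query_plan(query)   # without an index: a scan of the whole table
conn.execute("CREATE INDEX idx_last_name ON Employees (Last_Name)")
plan_after = query_plan(query)    # with the index: a search through idx_last_name

print(plan_before)
print(plan_after)
```

The query text is identical in both cases; only the access path chosen by the DBMS changes.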

14. Normalization

• Definition: The process of organizing data to reduce redundancy and dependency by dividing tables
logically.

• Example: Splitting a "Students" table into separate "Students" and "Courses" tables.

15. Denormalization

• Definition: The process of combining tables to improve query performance at the cost of redundancy.

• Example: Merging "Orders" and "Customers" tables for faster access.

16. ACID Properties

• Definition: Set of properties to ensure reliable transactions in a database.

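o Atomicity: Either all operations of a transaction are completed, or none are.

o Consistency: A transaction takes the database from one valid state to another, so integrity rules hold before and after it.

o Isolation: Concurrent transactions do not interfere with each other; the result is as if they ran one at a time.

o Durability: Once a transaction is committed, its changes persist even after system failures.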

17. Transaction

• Definition: A set of operations performed as a single unit of work.

• Example: Transferring money between bank accounts (debit and credit operations).
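
The bank transfer example can be sketched with Python's sqlite3 module to show atomicity in practice. The Accounts table, the balances, and the transfer helper are hypothetical; the credit is deliberately applied before the debit so that a failed debit leaves something to roll back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts (Name TEXT PRIMARY KEY, "
    "Balance INTEGER NOT NULL CHECK (Balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move `amount` from src to dst as one unit; return True if committed."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute(
                "UPDATE Accounts SET Balance = Balance + ? WHERE Name = ?", (amount, dst))
            conn.execute(
                "UPDATE Accounts SET Balance = Balance - ? WHERE Name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT Name, Balance FROM Accounts"))

ok = transfer("A", "B", 30)        # both updates are applied
failed = transfer("A", "B", 500)   # debit violates the CHECK, so the credit is undone
final = balances()
print(final)
```

After the failed transfer, account B does not keep the credit it briefly received, which is the "all or none" behaviour described under Atomicity.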

18. Query

• Definition: A request to retrieve or manipulate data from a database.

• Example:

SELECT * FROM Students WHERE Age > 20;

19. SQL (Structured Query Language)

• Definition: A standard language used to interact with relational databases.

• Example:

INSERT INTO Employees (Name, Age) VALUES ('Alice', 30);

20. Data Integrity

• Definition: The accuracy, consistency, and reliability of data stored in a database.

• Example: Constraints like NOT NULL, UNIQUE, and CHECK maintain data integrity.

21. Data Redundancy

• Definition: The unnecessary duplication of data across multiple locations.

• Example: Storing customer addresses in multiple tables.

22. Data Anomalies

• Definition: Issues arising from data redundancy and inconsistency during operations like insert,
update, and delete.

• Example: An update anomaly when the same data must be changed in multiple places.
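
An update anomaly, and how a normalized design avoids it, can be sketched as follows. The table names OrdersFlat, Customers, and Orders and the sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalized: the customer's address is repeated on every order row.
conn.execute(
    "CREATE TABLE OrdersFlat (Order_ID INTEGER PRIMARY KEY, Customer TEXT, Address TEXT)"
)
conn.executemany(
    "INSERT INTO OrdersFlat VALUES (?, ?, ?)",
    [(1, "John", "12 Oak St"), (2, "John", "12 Oak St")],
)
# Update anomaly: only one of the duplicated copies gets changed.
conn.execute("UPDATE OrdersFlat SET Address = '99 Pine Ave' WHERE Order_ID = 1")
flat_addresses = {r[0] for r in conn.execute(
    "SELECT Address FROM OrdersFlat WHERE Customer = 'John'")}

# Normalized: the address is stored once and referenced by key.
conn.execute(
    "CREATE TABLE Customers (Customer_ID INTEGER PRIMARY KEY, Name TEXT, Address TEXT)"
)
conn.execute(
    "CREATE TABLE Orders (Order_ID INTEGER PRIMARY KEY, "
    "Customer_ID INTEGER REFERENCES Customers(Customer_ID))"
)
conn.execute("INSERT INTO Customers VALUES (1, 'John', '12 Oak St')")
conn.executemany("INSERT INTO Orders VALUES (?, ?)", [(1, 1), (2, 1)])
conn.execute("UPDATE Customers SET Address = '99 Pine Ave' WHERE Customer_ID = 1")
normalized_addresses = {r[0] for r in conn.execute(
    "SELECT c.Address FROM Orders o JOIN Customers c ON c.Customer_ID = o.Customer_ID")}

print(flat_addresses)        # two conflicting addresses for one customer
print(normalized_addresses)  # a single address, seen by every order
```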

23. Referential Integrity


• Definition: Ensures that foreign key values always reference existing primary key values.

• Example: A student record cannot refer to a department that does not exist in the "Departments" table.
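
A short sketch of referential integrity being enforced, assuming SQLite through Python's sqlite3 module. Note that SQLite checks foreign keys only after PRAGMA foreign_keys = ON; the column names and the helper below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE Departments (Dept_ID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("""
    CREATE TABLE Students (
        Student_ID INTEGER PRIMARY KEY,
        Name       TEXT,
        Dept_ID    INTEGER NOT NULL REFERENCES Departments(Dept_ID)
    )
""")
conn.execute("INSERT INTO Departments VALUES (10, 'CS')")
conn.execute("INSERT INTO Students VALUES (101, 'John', 10)")  # valid reference
conn.commit()

def is_rejected(sql):
    """Run a statement; return True if the DBMS rejects it as an integrity violation."""
    try:
        with conn:
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

orphan_insert = is_rejected("INSERT INTO Students VALUES (102, 'Mary', 99)")  # no department 99
parent_delete = is_rejected("DELETE FROM Departments WHERE Dept_ID = 10")     # still referenced
```

Both statements are refused: a student cannot point at a missing department, and a department cannot be removed while students still reference it.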

24. Views

• Definition: A virtual table based on the result of an SQL query that does not store data itself.

• Example:

CREATE VIEW StudentView AS SELECT Name, Age FROM Students;
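
To see that a view stores no data of its own, the statement above can be run against a small table. The sketch uses Python's sqlite3 module; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER, Course TEXT)"
)
conn.execute("INSERT INTO Students VALUES (101, 'John', 25, 'CS')")
conn.execute("CREATE VIEW StudentView AS SELECT Name, Age FROM Students")

before = conn.execute("SELECT * FROM StudentView").fetchall()

# The view holds no rows itself: a change to the base table is visible
# the next time the view is queried.
conn.execute("INSERT INTO Students VALUES (102, 'Alice', 22, 'IT')")
after = conn.execute("SELECT * FROM StudentView ORDER BY Name").fetchall()

print(before)
print(after)
```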

25. Stored Procedure

• Definition: A precompiled set of SQL statements stored in the database for reuse.

• Example:

CREATE PROCEDURE GetStudentDetails AS SELECT * FROM Students;

(SQL Server syntax; the exact form of CREATE PROCEDURE varies between database systems.)

26. Trigger

• Definition: A special procedure that automatically executes when a specified event occurs in the
database.

• Example: Automatically updating inventory after an order is placed.
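
The inventory example might look like the following in SQLite, run here through Python's sqlite3 module. The table layout and the trigger name reduce_stock are assumptions; trigger syntax differs somewhat between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Inventory (Product_ID INTEGER PRIMARY KEY, Stock INTEGER NOT NULL);
    CREATE TABLE Orders (Order_ID INTEGER PRIMARY KEY, Product_ID INTEGER, Quantity INTEGER);

    -- Fires automatically after each new order row is inserted.
    CREATE TRIGGER reduce_stock AFTER INSERT ON Orders
    BEGIN
        UPDATE Inventory
        SET Stock = Stock - NEW.Quantity
        WHERE Product_ID = NEW.Product_ID;
    END;

    INSERT INTO Inventory VALUES (1, 10);
""")

# The application only records the order; the trigger adjusts the stock.
conn.execute("INSERT INTO Orders (Product_ID, Quantity) VALUES (1, 3)")
stock = conn.execute("SELECT Stock FROM Inventory WHERE Product_ID = 1").fetchone()[0]
print(stock)
```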

27. Deadlock

• Definition: A situation where two or more transactions are waiting for each other to release resources,
causing a standstill.

• Example: Transaction A waits for a record locked by transaction B, while B waits for A.

28. Replication

• Definition: The process of copying data from one database to another for backup or distribution
purposes.

• Example: A backup database maintained in another region.

29. Partitioning

• Definition: Dividing a large table into smaller, more manageable pieces for better performance.

• Example: A sales table partitioned by year.

30. OLTP vs OLAP

• OLTP (Online Transaction Processing): Handles real-time transaction-oriented operations (e.g.,
banking systems).

• OLAP (Online Analytical Processing): Used for complex queries and data analysis (e.g., business
intelligence).

Database system Architecture

The architecture of a database system defines how data is stored, processed, and accessed within the system.
It provides a systematic way to manage databases efficiently and securely. There are two primary types of
database architectures:

1. Logical Architecture (Three-level model)

2. Physical Architecture (Client-server model)

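1. Logical Database Architecture (Three-level Architecture)

The three-level architecture (also known as the three-schema or ANSI-SPARC architecture) separates the database
system into levels of abstraction, so that users and applications are insulated from storage details. It should not
be confused with the three-tier application architecture described later in this unit. The three levels are: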

a) Internal Level (Physical Level)

• Description:

o The lowest level of the database architecture that defines how data is stored physically on
storage devices (hard disks, SSDs, etc.).

o It handles file organization, indexing, and data structures.

• Responsibilities:

o Data compression and encryption.

o Storage allocation and access methods.

• Example: B-tree indexing, tables stored in binary format.

b) Conceptual Level (Logical Level)

• Description:

o This level provides a logical view of the entire database and defines relationships among data.

o It specifies what data is stored and the relationships between them without focusing on how
they are physically stored.

• Responsibilities:

o Schema definition (tables, columns, constraints).

o Defining data integrity rules.

• Example: ER models, relational schema (tables, attributes, keys).

c) External Level (View Level)

• Description:

o The highest level that provides user-specific views of the database.

o Different users may require different views of the same data based on their roles.

• Responsibilities:

o Data abstraction and security.

o Hiding unnecessary details from users.

• Example: A bank teller sees only customer balance, while the manager sees all customer information.

Key Advantages of Three-level Architecture:

• Improved security (each user gets a limited view of the data).

• Enhances data independence (physical storage changes do not affect the conceptual schema, and conceptual changes need not affect user views).

• Simplifies database management for large systems.

2. Physical Database Architecture (Client-Server Architecture)

The physical database architecture defines how the database system is physically deployed and accessed
by users. It is commonly categorized into:

a) Centralized Architecture

• Description:

o A single database is stored on a central server and accessed by multiple clients.

o All processing takes place in one central location.

• Advantages:

o Easier maintenance and backup.

o Centralized security management.


• Disadvantages:

o Performance bottleneck with many users.

o Single point of failure.

b) Distributed Architecture

• Description:

o Data is distributed across multiple locations, improving availability and scalability.

o Each site can have its local database but communicate with others.

• Advantages:

o Increased fault tolerance.

o Faster data access for remote users.

• Disadvantages:

o Complex synchronization and consistency management.

c) Client-Server Architecture

• Description:

o The database system is divided into two main components:

1. Client: Sends queries and requests data.

2. Server: Processes requests and returns data to the client.

• Advantages:

o Better resource allocation and efficient load balancing.

o Easier maintenance and updates.

Database System Components

A database system consists of several key components that work together to ensure data is managed
efficiently.

1. Hardware: Physical devices such as servers, storage, and network infrastructure.

2. Software: The DBMS software that manages the database (e.g., MySQL, PostgreSQL, Oracle).

3. Data: The actual data stored in the system.


4. Users: Individuals who interact with the database (e.g., end-users, developers, administrators).

5. Procedures: Rules and instructions for operating the database effectively.

Types of Database System Architectures

1. One-Tier Architecture (Monolithic)

• Everything (application and database) resides on a single system.

• Suitable for personal or small applications.

• Example: MS Access.

2. Two-Tier Architecture

• Separates the database and application layers.

• Client sends requests directly to the database server.

• Example: Web applications with direct database connections.

3. Three-Tier Architecture

• Separates the system into three layers:

o Presentation Layer (User Interface)

o Application Layer (Business Logic)

o Database Layer (Data Storage)

• Example: Web-based applications like e-commerce platforms.
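
The three layers can be sketched as three groups of functions in one small program. In a real deployment each layer usually runs as a separate process or machine; the Products table, prices, and function names here are hypothetical.

```python
import sqlite3

# Database layer: the only code that touches storage.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (Name TEXT PRIMARY KEY, Price REAL)")
conn.execute("INSERT INTO Products VALUES ('Pen', 2.0)")

def fetch_price(name):
    """Return the stored price of a product, or None if it does not exist."""
    row = conn.execute("SELECT Price FROM Products WHERE Name = ?", (name,)).fetchone()
    return None if row is None else row[0]

# Application layer: business logic, with no SQL and no formatting.
def order_total(name, quantity):
    price = fetch_price(name)
    if price is None:
        raise ValueError(f"unknown product: {name}")
    return price * quantity

# Presentation layer: turns results into what the user sees.
def render_total(name, quantity):
    return f"{quantity} x {name} = {order_total(name, quantity):.2f}"

print(render_total("Pen", 3))
```

Because only fetch_price knows about SQL, the storage could be replaced without touching the other two layers.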


Various Data Models in DBMS

A data model defines how data is structured, organized, and manipulated within a database system. It
provides a systematic way to store and retrieve data while ensuring consistency and integrity.

Types of Data Models

Data models are broadly classified into the following categories:

1. Hierarchical Data Model

2. Network Data Model

3. Relational Data Model

4. Object-Oriented Data Model

5. Entity-Relationship (E-R) Model

6. Document and NoSQL Data Models

1. Hierarchical Data Model

• Concept:

o Organizes data in a tree-like structure (parent-child relationship).

o Each parent node can have multiple child nodes, but each child has only one parent.

• Example:

o Organization structure:

Company
├── HR Department
└── IT Department
    ├── Software Team
    └── Hardware Team

Advantages:

• Fast data retrieval for hierarchical relationships.

• Easy navigation using parent-child relationships.

Disadvantages:

• Complex to modify (inserting/deleting nodes).

• Redundant data storage, because many-to-many relationships can only be represented by duplicating records (each child has a single parent).
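
A minimal sketch of the single-parent rule, using the organization example and assuming the Software and Hardware teams sit under the IT Department. The dictionary layout and helper names are hypothetical and do not reflect how a hierarchical DBMS such as IMS stores data.

```python
# Each child records exactly one parent, as the hierarchical model requires.
parent_of = {
    "HR Department": "Company",
    "IT Department": "Company",
    "Software Team": "IT Department",
    "Hardware Team": "IT Department",
}

def path_to_root(node):
    """Follow parent links upward from a node to the root."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path

def children_of(node):
    """Return the direct children of a node, in sorted order."""
    return sorted(child for child, parent in parent_of.items() if parent == node)

print(path_to_root("Software Team"))
print(children_of("Company"))
```

Because a key in parent_of can map to only one value, a team shared by two departments could not be represented without duplicating it, which is the redundancy noted above.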

2. Network Data Model

• Concept:

o Represents data using a graph-like structure with nodes and relationships (many-to-many
relationships).

o Entities (nodes) can have multiple relationships (edges).

• Example:

o A student can enroll in multiple courses, and each course can have multiple students.

• Advantages:

o Flexible relationships (many-to-many).

o Faster access due to pointer-based navigation.

• Disadvantages:

o Complex structure compared to relational models.


o Difficult to modify relationships.
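
The student and course example can be sketched as a graph in which links are kept in both directions. This only loosely mirrors the pointer-based navigation of a network database; the names and the enroll helper are hypothetical.

```python
from collections import defaultdict

# Links are stored in both directions so either side can be navigated.
courses_of = defaultdict(set)
students_in = defaultdict(set)

def enroll(student, course):
    """Record a many-to-many link between a student and a course."""
    courses_of[student].add(course)
    students_in[course].add(student)

enroll("Alice", "Databases")
enroll("Alice", "Networks")
enroll("Bob", "Databases")

print(sorted(courses_of["Alice"]))
print(sorted(students_in["Databases"]))
```

Unlike the hierarchical model, a course here has several "parents" (students) and a student has several courses, with no record duplicated.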

3. Relational Data Model (Most Popular)

• Concept:

o Organizes data into tables (relations) consisting of rows (tuples) and columns (attributes).

o Uses keys (Primary, Foreign) to establish relationships.

Students Table:

+-----------+-------+
| StudentID | Name  |
+-----------+-------+
| 101       | Alice |
| 102       | Bob   |
+-----------+-------+

Advantages:

• Easy to use with SQL (Structured Query Language).

• Data integrity and consistency.

Disadvantages:

• Performance issues with very large datasets.

• Complex joins may slow down queries.

4. Object-Oriented Data Model

• Concept:

o Data is stored as objects, similar to object-oriented programming (OOP).

o Supports encapsulation, inheritance, and polymorphism.

• Example:

o A "Car" object may contain attributes like Brand, Model, and behaviors like Start() and
Stop().

• Advantages:
o Useful for complex data like multimedia, CAD/CAM.

o Combines data and the operations on it (methods) in a single unit.

• Disadvantages:

o Steeper learning curve.

o Slower compared to relational models for simple queries.
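
The "Car" example corresponds to a class like the one below. The sketch shows only the object structure that an object-oriented DBMS would persist; it stores nothing itself, and the ElectricCar subclass and attribute values are hypothetical.

```python
class Car:
    """An object bundles state (attributes) with behavior (methods)."""

    def __init__(self, brand, model):
        self.brand = brand
        self.model = model
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


class ElectricCar(Car):
    """Inheritance: reuses Car and adds an attribute of its own."""

    def __init__(self, brand, model, battery_kwh):
        super().__init__(brand, model)
        self.battery_kwh = battery_kwh


car = ElectricCar("ExampleBrand", "X1", 60)
car.start()
print(car.brand, car.model, car.running)
```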

5. Entity-Relationship (E-R) Model

• Concept:

o Represents data as entities (objects) and relationships between them using diagrams.

o Entities have attributes that define their properties.

• Example:

o Entity: Student

o Attributes: StudentID, Name, Age

o Relationship: A student "enrolls in" a course.

• Advantages:

o Intuitive visual representation.

o Helps in database design.

• Disadvantages:

o Limited use for direct implementation.

o Requires conversion to other models (relational).

6. Document and NoSQL Data Models

• Concept:

o Used in NoSQL databases, storing data as key-value pairs, documents, column-family stores,
or graphs.

o Suitable for unstructured or semi-structured data.

• Types:

o Key-Value Model: Data stored as key-value pairs (e.g., Redis).


o Document Model: JSON-like documents (e.g., MongoDB).

o Column-Family Model: Data stored in columns rather than rows (e.g., Cassandra).

o Graph Model: Focuses on relationships (e.g., Neo4j).

• Advantages:

o Highly scalable and flexible schema.

o Ideal for big data applications.

• Disadvantages:

o Lack of standardization (no common query language, and schemas are often not enforced).

o Often weaker consistency guarantees than a relational DBMS (e.g., eventual consistency).
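
The key-value and document models can be imitated with plain Python structures. This is a sketch of the data shapes only, not of any particular product's API; the keys, documents, and the find helper are hypothetical.

```python
import json

# Key-value model: opaque values looked up by key (a dict stands in for the store).
kv_store = {"session:42": "user=alice"}

# Document model: each record is a self-describing, JSON-like document,
# and documents in one collection need not share the same fields.
collection = [
    {"_id": 1, "name": "Alice", "phones": ["111", "222"]},
    {"_id": 2, "name": "Bob", "address": {"city": "NYC"}},
]

def find(collection, **criteria):
    """Return documents whose top-level fields match all criteria."""
    return [doc for doc in collection
            if all(doc.get(k) == v for k, v in criteria.items())]

matches = find(collection, name="Bob")
serialized = json.dumps(collection[1], sort_keys=True)
print(matches)
print(serialized)
```

The two documents have different fields, which is the flexible schema listed as an advantage and also the reason a fixed structure cannot be assumed when querying.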

ER diagram basics and extensions

An Entity-Relationship (ER) diagram is a visual representation of data that helps in database design by
showing entities, attributes, and relationships between them. It is widely used to create a conceptual model
of a database before converting it to a relational schema.

Basic Components of an ER Diagram

a) Entities

An entity represents a real-world object that can be uniquely identified.

• Types of Entities:

o Strong Entity: Can exist independently (e.g., Student, Employee).

o Weak Entity: Depends on a strong entity and cannot exist without it (e.g., Order Item,
dependent on Order).

Notation:

• Represented by a rectangle.

o Example: STUDENT

o Weak entities are represented with double rectangles.

b) Attributes

Attributes describe the properties of an entity.

• Types of Attributes:

1. Simple/Atomic: Cannot be divided further (e.g., Name, Age).

2. Composite: Can be broken down into smaller components (e.g., Name → First Name, Last
Name).

3. Derived: Calculated from other attributes (e.g., Age from Date of Birth).

4. Multivalued: Can have multiple values (e.g., Phone Numbers).

5. Key Attribute: Uniquely identifies an entity (e.g., StudentID).

Notation:

• Represented by ellipses connected to the entity.

o Example: STUDENT → (StudentID, Name, Age)
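
The attribute types listed above map naturally onto fields of a record. The sketch below is one possible rendering; the field names, the sample student, and the age_on method are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Student:
    student_id: int          # key attribute
    first_name: str          # composite attribute "Name", split into
    last_name: str           # its two components
    date_of_birth: date      # simple attribute
    phone_numbers: list = field(default_factory=list)  # multivalued attribute

    def age_on(self, today):
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)

s = Student(101, "John", "Doe", date(2000, 6, 15), ["111", "222"])
print(s.age_on(date(2025, 6, 14)))  # the day before the birthday
print(s.age_on(date(2025, 6, 15)))  # on the birthday
```

Storing the date of birth and deriving the age avoids a value that would otherwise become stale every year.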

c) Relationships

A relationship represents an association between two or more entities.

• Types of Relationships:

1. One-to-One (1:1): Each entity in A is related to at most one entity in B, and vice versa.

2. One-to-Many (1:M): One entity in A can relate to multiple entities in B, but not vice versa.

3. Many-to-Many (M:N): Multiple entities in A can relate to multiple entities in B.

Notation:

• Represented by a diamond shape between related entities.

o Example: STUDENT ── enrolls in ── COURSE

d) Keys

A key is an attribute (or set of attributes) that uniquely identifies an entity.

• Types of Keys:

1. Primary Key: Uniquely identifies records in an entity (e.g., StudentID).

2. Candidate Key: Any minimal set of attributes that could serve as the primary key.

3. Foreign Key: An attribute referring to the primary key of another entity.

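ER Diagram Notations Summary

• Rectangle: Entity

• Double rectangle: Weak entity

• Ellipse: Attribute

• Diamond: Relationship

• Double line: Total participation

• Single line: Partial participation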

ER Diagram Cardinality and Participation

a) Cardinality

Defines the number of instances of an entity that can be associated with instances of another entity.

• Types:

1. 1:1 (One-to-One): One employee has one parking space.

2. 1:M (One-to-Many): One department has many employees.

3. M:N (Many-to-Many): Many students enroll in many courses.

Notation:

• 1: At most one related instance

• M or N: Many related instances

b) Participation

Defines whether all entities participate in a relationship.

• Total Participation: Every instance must participate. (Double line)

• Partial Participation: Some instances may participate. (Single line)

ER Diagram Extensions (Enhanced ER Model - EER)

The Enhanced ER Model (EER) extends the basic ER model by adding more sophisticated concepts to
handle complex database requirements.

a) Generalization

• Process of combining multiple entities into a single generalized entity.

• Example: Car and Truck can be generalized into Vehicle.

• Notation: Upward triangle pointing to the generalized entity.

b) Specialization

• Process of dividing a generalized entity into more specific entities.

• Example: Employee can be specialized into Manager and Clerk.

• Notation: Downward triangle pointing to specialized entities.

c) Aggregation

• A relationship between relationships, allowing abstraction of complex relationships.

• Example: The "works on" relationship between Employee and Project is treated as a single unit, which a Manager then "oversees".

• Notation: Draw a box around the relationship and relate it to another entity.

d) Categorization
• A way to represent a subclass (a category, or union type) whose members are drawn from the union of several different superclasses.

• Example: A vehicle Owner can be a Person, a Bank, or a Company.

Example of an ER Diagram

Consider a university database with the following entities and relationships:

Entities:

• STUDENT (StudentID, Name, Age)

• COURSE (CourseID, CourseName)

• PROFESSOR (ProfID, Name, Department)

Relationships:

• STUDENT enrolls in COURSE (M:N)

• PROFESSOR teaches COURSE (1:M)

+---------+  M              N  +--------+  M            1  +-----------+
| STUDENT |----(enrolls in)----| COURSE |----(teaches)-----| PROFESSOR |
+---------+                    +--------+                  +-----------+

Advantages of ER Diagrams

• Provides a clear conceptual design of a database.

• Helps in data modeling and visualization.

• Facilitates communication between stakeholders and developers.

• Helps expose inconsistencies and redundant data early in the design.

Limitations of ER Diagrams

• Difficult to represent complex operations.

• Performance aspects like indexing are not considered.

• Not directly implementable in database systems without conversion.

Converting ER Diagram to Relational Schema

Once the ER diagram is designed, it needs to be converted into a relational schema using the following
steps:

1. Convert Entities to Tables.

o Each entity becomes a table with attributes as columns.

2. Convert Relationships.

o One-to-many relationships are represented by a foreign key on the "many" side.

o Many-to-many relationships require a separate table.

3. Define Keys.

o Primary and foreign keys are established.

4. Normalize the schema.

o Remove redundancy and improve efficiency.
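As a sketch of these steps, the university ER diagram shown earlier (STUDENT, COURSE, PROFESSOR) could be converted as follows. The column types and the junction table name ENROLLS_IN are assumptions made for illustration; the notes give only the entity and attribute names.

```sql
-- Step 1: each entity becomes a table.
CREATE TABLE PROFESSOR (
    ProfID     INT PRIMARY KEY,
    Name       VARCHAR(50),
    Department VARCHAR(50)
);

CREATE TABLE STUDENT (
    StudentID INT PRIMARY KEY,
    Name      VARCHAR(50),
    Age       INT
);

-- Steps 2 and 3: the 1:M "teaches" relationship becomes a foreign key
-- on the "many" side (COURSE).
CREATE TABLE COURSE (
    CourseID   INT PRIMARY KEY,
    CourseName VARCHAR(100),
    ProfID     INT,
    FOREIGN KEY (ProfID) REFERENCES PROFESSOR(ProfID)
);

-- Steps 2 and 3: the M:N "enrolls in" relationship becomes its own table
-- whose primary key combines the keys of both participating entities.
CREATE TABLE ENROLLS_IN (
    StudentID INT,
    CourseID  INT,
    PRIMARY KEY (StudentID, CourseID),
    FOREIGN KEY (StudentID) REFERENCES STUDENT(StudentID),
    FOREIGN KEY (CourseID)  REFERENCES COURSE(CourseID)
);
```

Placing ProfID in COURSE rather than the reverse reflects the cardinality: each course has one professor, while a professor may teach many courses.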

Extended ER Model Concepts: Generalization, Specialization, and Aggregation

The Extended Entity-Relationship (EER) model extends the basic ER model by incorporating additional
concepts such as generalization, specialization, and aggregation, which help in designing more complex and
realistic data models.

1. Generalization

Definition:
Generalization is the process of combining two or more lower-level entity sets into a higher-level entity set
based on common attributes. It abstracts common characteristics and creates a more generalized, higher-
level entity.

Example:
Consider Car and Truck as separate entities with common attributes such as vehicle_id, manufacturer, and
price. Using generalization, we can create a higher-level entity called Vehicle that encompasses these
common attributes.

Diagram Representation:

      Vehicle
      /     \
   Car     Truck

Key Points:

• Moves from specific to general (bottom-up approach).

• Common attributes are factored into the generalized entity.

• Reduces redundancy by grouping similar entities.

SQL Implementation Example:

-- Generalized (higher-level) entity holding the common attributes
CREATE TABLE Vehicle (
    vehicle_id   INT PRIMARY KEY,
    manufacturer VARCHAR(50),
    price        DECIMAL(10,2)
);

-- Lower-level entities share the Vehicle key and add their own attributes
CREATE TABLE Car (
    vehicle_id INT PRIMARY KEY,
    car_type   VARCHAR(50),
    FOREIGN KEY (vehicle_id) REFERENCES Vehicle(vehicle_id)
);

CREATE TABLE Truck (
    vehicle_id    INT PRIMARY KEY,
    load_capacity INT,
    FOREIGN KEY (vehicle_id) REFERENCES Vehicle(vehicle_id)
);
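A brief usage sketch for the tables above: because Car references Vehicle, the common attributes are inserted into Vehicle first, and a join reassembles the full record. The sample values (Acme Motors, Hatchback) are invented for illustration.

```sql
-- The parent row must exist before the child row that references it.
INSERT INTO Vehicle (vehicle_id, manufacturer, price)
VALUES (1, 'Acme Motors', 18500.00);

INSERT INTO Car (vehicle_id, car_type)
VALUES (1, 'Hatchback');

-- Combine the common attributes with the car-specific ones.
SELECT v.vehicle_id, v.manufacturer, v.price, c.car_type
FROM Vehicle v
JOIN Car c ON c.vehicle_id = v.vehicle_id;
```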

2. Specialization

Definition:
Specialization is the opposite of generalization. It involves creating subclasses from a higher-level entity
based on distinguishing characteristics. It adds more specific attributes to each subclass.

Example:
If we have a general entity Employee, we can specialize it into Manager and Technician, with each having
unique attributes such as manager_bonus and technical_skill.

Diagram Representation:

        Employee
        /      \
  Manager    Technician

Key Points:

• Moves from general to specific (top-down approach).

• Adds specialized attributes to subclasses.

• Introduces constraints like disjoint (no overlap between subclasses) or overlapping (an instance can
belong to multiple subclasses).

SQL Implementation Example:

-- General (higher-level) entity
CREATE TABLE Employee (
    emp_id INT PRIMARY KEY,
    name   VARCHAR(50),
    salary DECIMAL(10,2)
);

-- Subclasses add their own specialized attributes
CREATE TABLE Manager (
    emp_id        INT PRIMARY KEY,
    manager_bonus DECIMAL(10,2),
    FOREIGN KEY (emp_id) REFERENCES Employee(emp_id)
);

CREATE TABLE Technician (
    emp_id          INT PRIMARY KEY,
    technical_skill VARCHAR(50),
    FOREIGN KEY (emp_id) REFERENCES Employee(emp_id)
);
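The tables above allow the same emp_id to appear in both Manager and Technician, which corresponds to an overlapping specialization. One possible way to enforce a disjoint specialization is a type discriminator combined with a composite foreign key, sketched below. The table names (Staff, Staff_Manager, Staff_Technician) and the staff_type column are hypothetical and chosen to avoid clashing with the tables above. The sketch also assumes the DBMS enforces CHECK constraints (MySQL does so from version 8.0.16).

```sql
CREATE TABLE Staff (
    emp_id     INT PRIMARY KEY,
    name       VARCHAR(50),
    salary     DECIMAL(10,2),
    -- Discriminator: each staff member has exactly one subtype.
    staff_type VARCHAR(10) NOT NULL
        CHECK (staff_type IN ('MANAGER', 'TECHNICIAN')),
    -- Needed so the subclass tables can reference (emp_id, staff_type).
    UNIQUE (emp_id, staff_type)
);

CREATE TABLE Staff_Manager (
    emp_id        INT PRIMARY KEY,
    staff_type    VARCHAR(10) NOT NULL DEFAULT 'MANAGER'
        CHECK (staff_type = 'MANAGER'),
    manager_bonus DECIMAL(10,2),
    FOREIGN KEY (emp_id, staff_type) REFERENCES Staff(emp_id, staff_type)
);

CREATE TABLE Staff_Technician (
    emp_id          INT PRIMARY KEY,
    staff_type      VARCHAR(10) NOT NULL DEFAULT 'TECHNICIAN'
        CHECK (staff_type = 'TECHNICIAN'),
    technical_skill VARCHAR(50),
    FOREIGN KEY (emp_id, staff_type) REFERENCES Staff(emp_id, staff_type)
);
```

Since a Staff row carries a single staff_type, it can be matched by a row in only one of the two subclass tables.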

3. Aggregation

Definition:
Aggregation is a higher-level abstraction that treats relationships between entities as higher-level entities
themselves. It allows modeling of complex relationships involving relationships.

Example:
Consider the entities Employee and Project, linked by the relationship Works_On. To track which Department
oversees that work, we can treat the Works_On relationship as a higher-level entity and relate it to
Department through a second relationship, Overseen_By.

Diagram Representation:

Employee        Project
      \          /
     [ Works_On ]
          |
     Overseen_By
          |
      Department

Key Points:

• Treats a relationship, together with its participating entities, as a single higher-level entity.

• Useful when a relationship (often many-to-many) must itself take part in another relationship.

• Avoids redundant relationships that would otherwise repeat the same entity pairings.

SQL Implementation Example:

CREATE TABLE Employee (
    emp_id INT PRIMARY KEY,
    name   VARCHAR(50)
);

CREATE TABLE Project (
    proj_id   INT PRIMARY KEY,
    proj_name VARCHAR(50)
);

-- Relationship between Employee and Project
CREATE TABLE Works_On (
    emp_id       INT,
    proj_id      INT,
    hours_worked INT,
    PRIMARY KEY (emp_id, proj_id),
    FOREIGN KEY (emp_id) REFERENCES Employee(emp_id),
    FOREIGN KEY (proj_id) REFERENCES Project(proj_id)
);

CREATE TABLE Department (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(50)
);

-- Relationship between Project and Department
CREATE TABLE Overseen_By (
    proj_id INT,
    dept_id INT,
    PRIMARY KEY (proj_id, dept_id),
    FOREIGN KEY (proj_id) REFERENCES Project(proj_id),
    FOREIGN KEY (dept_id) REFERENCES Department(dept_id)
);
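Note that Overseen_By above is keyed on proj_id, so it records oversight per project. If oversight should instead attach to the aggregated Works_On relationship, that is, to a specific employee-project assignment, one possible variant references the composite key of Works_On. The table name Assignment_Oversight is hypothetical; the sketch assumes the Works_On and Department tables defined above.

```sql
CREATE TABLE Assignment_Oversight (
    emp_id  INT,
    proj_id INT,
    dept_id INT,
    PRIMARY KEY (emp_id, proj_id, dept_id),
    -- References the relationship itself, not just one of its entities.
    FOREIGN KEY (emp_id, proj_id) REFERENCES Works_On(emp_id, proj_id),
    FOREIGN KEY (dept_id) REFERENCES Department(dept_id)
);
```

With this design, a department can only oversee an employee-project pairing that actually exists in Works_On.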

Comparison of Generalization, Specialization, and Aggregation

Feature    | Generalization                      | Specialization                              | Aggregation
-----------+-------------------------------------+---------------------------------------------+---------------------------------------------------
Definition | Combining similar entities into one | Dividing an entity into sub-entities        | Treating relationships as entities
Approach   | Bottom-up                           | Top-down                                    | Relationship abstraction
Focus      | Common attributes                   | Unique attributes                           | Complex relationships
Example    | Vehicle (generalizing Car, Truck)   | Employee (specializing Manager, Technician) | Employee assigned to Project managed by Department

When to Use Each Concept

1. Use Generalization when you find multiple entities with common attributes and want to simplify
your design.

2. Use Specialization when you need to introduce more specific attributes for some entities.

3. Use Aggregation when a relationship itself needs to take part in another relationship.

Worked Example: ER Model for an Online Student Management System

This example develops an ER model for an Online Student Management System, covering students, courses,
instructors, and enrollments, and then implements it as a database schema.

Step 1: Identify Entities

The main entities in the system are:

1. Student – Represents students enrolled in the system.

2. Course – Represents courses offered by the institution.

3. Instructor – Represents teachers/professors who teach courses.

4. Enrollment – Represents the association between students and courses (many-to-many relationship).

5. Department – Represents different academic departments.

6. Grades – Stores grades for enrolled students.

Step 2: Define Attributes for Each Entity

1. Student

o student_id (Primary Key)

o first_name

o last_name

o email

o phone

o date_of_birth

o department_id (Foreign Key)

2. Course

o course_id (Primary Key)

o course_name

o credits

o department_id (Foreign Key)

3. Instructor

o instructor_id (Primary Key)

o first_name

o last_name

o email

o phone

o department_id (Foreign Key)

4. Enrollment (Associative Entity)

o enrollment_id (Primary Key)

o student_id (Foreign Key)

o course_id (Foreign Key)

o enrollment_date

5. Department

o department_id (Primary Key)

o department_name

o head_of_department

6. Grades

o grade_id (Primary Key)

o student_id (Foreign Key)

o course_id (Foreign Key)

o grade

o comments

Step 3: Define Relationships Between Entities

1. Each Student can enroll in multiple Courses, and each Course can have multiple Students. (Many-to-Many)

2. Each Course is taught by one Instructor, but an Instructor can teach multiple Courses. (One-to-Many)

3. Each Course belongs to a specific Department, and each Department can have multiple Courses. (One-to-Many)

4. Each Student belongs to one Department, but a Department can have many Students. (One-to-Many)

5. Each Enrollment belongs to a single Student and a single Course. (Many-to-One for both)

Step 4: ER Model Diagram Representation (Text-Based)

+--------------------+  1         M  +--------------------+
| Department         |---------------| Instructor         |
+--------------------+               +--------------------+
| department_id (PK) |               | instructor_id (PK) |
| department_name    |               | first_name         |
| head_of_department |               | last_name          |
+--------------------+               | department_id (FK) |
          | 1                        +--------------------+
          |     (One-to-Many)
          | M
+--------------------+
| Course             |
+--------------------+
| course_id (PK)     |
| course_name        |
| credits            |
| department_id (FK) |
+--------------------+
          | 1
          |     (Many-to-Many with Student via Enrollment)
          | M
+--------------------+  M         1  +--------------------+
| Enrollment         |---------------| Student            |
+--------------------+               +--------------------+
| enrollment_id (PK) |               | student_id (PK)    |
| student_id (FK)    |               | first_name         |
| course_id (FK)     |               | last_name          |
| enrollment_date    |               | email              |
+--------------------+               | phone              |
                                     +--------------------+
                                               | 1
                                               |     (One-to-Many)
                                               | M
                                     +--------------------+
                                     | Grades             |
                                     +--------------------+
                                     | grade_id (PK)      |
                                     | student_id (FK)    |
                                     | course_id (FK)     |
                                     | grade              |
                                     | comments           |
                                     +--------------------+

Step 5: SQL Implementation for the ER Model

-- Department Table
CREATE TABLE Department (
    department_id      INT PRIMARY KEY AUTO_INCREMENT,
    department_name    VARCHAR(100),
    head_of_department VARCHAR(100)
);

-- Instructor Table
CREATE TABLE Instructor (
    instructor_id INT PRIMARY KEY AUTO_INCREMENT,
    first_name    VARCHAR(50),
    last_name     VARCHAR(50),
    email         VARCHAR(100),
    phone         VARCHAR(15),
    department_id INT,
    FOREIGN KEY (department_id) REFERENCES Department(department_id)
);

-- Student Table
CREATE TABLE Student (
    student_id    INT PRIMARY KEY AUTO_INCREMENT,
    first_name    VARCHAR(50),
    last_name     VARCHAR(50),
    email         VARCHAR(100),
    phone         VARCHAR(15),
    date_of_birth DATE,
    department_id INT,
    FOREIGN KEY (department_id) REFERENCES Department(department_id)
);

-- Course Table
CREATE TABLE Course (
    course_id     INT PRIMARY KEY AUTO_INCREMENT,
    course_name   VARCHAR(100),
    credits       INT,
    department_id INT,
    FOREIGN KEY (department_id) REFERENCES Department(department_id)
);

-- Enrollment Table (Many-to-Many Relationship)
CREATE TABLE Enrollment (
    enrollment_id   INT PRIMARY KEY AUTO_INCREMENT,
    student_id      INT,
    course_id       INT,
    enrollment_date DATE,
    FOREIGN KEY (student_id) REFERENCES Student(student_id),
    FOREIGN KEY (course_id) REFERENCES Course(course_id)
);

-- Grades Table
CREATE TABLE Grades (
    grade_id   INT PRIMARY KEY AUTO_INCREMENT,
    student_id INT,
    course_id  INT,
    grade      CHAR(2),
    comments   VARCHAR(255),
    FOREIGN KEY (student_id) REFERENCES Student(student_id),
    FOREIGN KEY (course_id) REFERENCES Course(course_id)
);
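Two points from Step 3 are not yet reflected in the tables above: the Course table has no column recording which Instructor teaches it, and nothing prevents the same student from being enrolled in the same course twice. A possible follow-up in MySQL syntax (matching the AUTO_INCREMENT usage above) is sketched below. The instructor_id column on Course and both constraint names are assumptions, not part of the original schema.

```sql
-- Relationship 2 in Step 3: each Course is taught by one Instructor.
ALTER TABLE Course
    ADD COLUMN instructor_id INT,
    ADD CONSTRAINT fk_course_instructor
        FOREIGN KEY (instructor_id) REFERENCES Instructor(instructor_id);

-- Allow at most one Enrollment row per (student, course) pair.
ALTER TABLE Enrollment
    ADD CONSTRAINT uq_enrollment_student_course
        UNIQUE (student_id, course_id);
```

The new column is left nullable so that existing Course rows remain valid until an instructor is assigned.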

Step 6: Example Queries for the Application

1. Insert Sample Data

INSERT INTO Department (department_name, head_of_department)
VALUES ('Computer Science', 'Dr. Smith');

INSERT INTO Student (first_name, last_name, email, phone, date_of_birth, department_id)
VALUES ('Alice', 'Johnson', '[email protected]', '1234567890', '2000-05-15', 1);

INSERT INTO Course (course_name, credits, department_id)
VALUES ('Database Systems', 3, 1);

INSERT INTO Enrollment (student_id, course_id, enrollment_date)
VALUES (1, 1, '2025-01-10');

2. Retrieve All Students and Their Enrolled Courses

SELECT s.first_name, s.last_name, c.course_name
FROM Student s
JOIN Enrollment e ON s.student_id = e.student_id
JOIN Course c ON e.course_id = c.course_id;

3. Find Students in a Specific Department

SELECT first_name, last_name
FROM Student
WHERE department_id = 1;

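4. Get Average Grade for a Course

-- grade is stored as a letter (CHAR(2)), so it is mapped to grade points before averaging.
-- The 4-point scale is an assumed mapping; unmapped grades become NULL and are ignored by AVG.
SELECT c.course_name,
       AVG(CASE g.grade WHEN 'A' THEN 4 WHEN 'B' THEN 3 WHEN 'C' THEN 2
                        WHEN 'D' THEN 1 WHEN 'F' THEN 0 END) AS avg_grade
FROM Grades g
JOIN Course c ON g.course_id = c.course_id
GROUP BY c.course_name;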
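One further query in the same spirit, offered as a sketch against the tables defined in Step 5: counting the students enrolled in each course. The column alias enrolled_students is an arbitrary choice.

```sql
-- LEFT JOIN keeps courses that have no enrollments (their count is 0).
SELECT c.course_id, c.course_name, COUNT(e.student_id) AS enrolled_students
FROM Course c
LEFT JOIN Enrollment e ON e.course_id = c.course_id
GROUP BY c.course_id, c.course_name
ORDER BY enrolled_students DESC;
```

COUNT(e.student_id) is used instead of COUNT(*) so that a course without enrollments is reported as 0 rather than 1.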
