
DBMS

Data refers to distinct units of information that can be stored in various forms, while a database is an organized collection of inter-related data that allows for efficient retrieval and management. A Database Management System (DBMS) is software that manages databases, ensuring data integrity, security, and consistency, with various architectures like single-tier, two-tier, and three-tier. The document also discusses the evolution of databases, types of databases, and the differences between DBMS and RDBMS.

Uploaded by

Nagothi Sailaja
Copyright
© © All Rights Reserved

What is Data?

Data is a collection of distinct, small units of information. It can take a variety of forms, such as
text, numbers, media, or bytes, and it can be stored on paper or in electronic memory.

The word 'data' comes from the word 'datum', which means 'a single piece of information'. 'Data' is
the plural of 'datum'.

What is a Database?

A database is a collection of inter-related data organized so that it can be retrieved, inserted, and
deleted efficiently. The data is typically organized into tables, schemas, views, and reports.

For example, a college database organizes data about the admin, staff, students, and faculty.

Many dynamic websites on the World Wide Web are backed by databases.

Database Management System

o A database management system (DBMS) is software used to manage a database. Popular examples
include MySQL, Oracle, PostgreSQL, and SQL Server. Oracle and SQL Server are commercial
products, while MySQL and PostgreSQL are available as open source.

o A DBMS provides an interface for operations such as creating a database, creating tables in it,
storing data, and updating data.

o It provides protection and security for the database. When multiple users work at once, it also
maintains data consistency.

Data consistency means the data is correct across the whole database: every operation must take the
database from one valid state to another valid state while preserving the defined rules.
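
The operations listed above (creating a database and a table, storing data, updating it, and reading it back) can be sketched with Python's built-in sqlite3 module. This is only an illustration: SQLite stands in for the DBMS, and the student table and its rows are made-up names, not part of the original notes.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Database and table creation (hypothetical "student" table).
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL, grade TEXT)"
)

# Storing data.
conn.executemany(
    "INSERT INTO student (id, name, grade) VALUES (?, ?, ?)",
    [(1, "Asha", "B"), (2, "Ravi", "A")],
)

# Updating data.
conn.execute("UPDATE student SET grade = ? WHERE id = ?", ("A", 1))
conn.commit()

# Retrieving data.
rows = conn.execute("SELECT id, name, grade FROM student ORDER BY id").fetchall()
print(rows)
```

The application never touches the storage format directly; every operation goes through the interface the DBMS exposes.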

The main characteristics of a Database Management System (DBMS) are:

1. Data Independence: Separates data structure from application logic, so changes in data
don't affect the software.

2. Data Integrity: Ensures data accuracy and consistency, using rules and constraints.

3. Data Security: Protects data through access control, ensuring only authorized users can
access or modify it.

4. Data Consistency: Prevents data conflicts and redundancy, making sure data remains the
same across all accesses.
5. Concurrent Access: Allows multiple users to access and modify data at the same time
without interference.

6. Backup and Recovery: Provides automatic backup and recovery to restore data after system
failures.

7. Data Abstraction: Hides the complexity of data storage, presenting only useful views of data
to users.

These features make DBMS reliable, secure, and efficient for managing large datasets.

Advantages: the same as the characteristics listed above.

Disadvantages:

1. Cost: DBMS software and hardware can be expensive to set up and maintain.

2. Complexity: Requires trained personnel to manage and operate, as DBMS systems can be
complex.

3. Performance Issues: With heavy data processing and multiple users, DBMS performance can
slow down.

4. Maintenance: Regular maintenance, updates, and backups are required to keep the system
functioning properly.

Evolution of Databases

1. Flat File Databases

 Definition: The simplest form of database, where data is stored in plain text files. Each line
represents a record, and fields are separated by delimiters.

 Example: A CSV (Comma-Separated Values) file containing names, addresses, and phone
numbers.

 Real-Time Example: A simple Excel spreadsheet that lists student names, grades, and contact
information.
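
A flat file can be read with nothing more than Python's csv module, as the sketch below suggests. The file contents, names, and phone numbers are invented for illustration, and an in-memory string stands in for a file on disk.

```python
import csv
import io

# Hypothetical flat-file contents; in practice this would be a .csv file on disk.
flat_file = io.StringIO(
    "name,address,phone\n"
    "Asha,12 Lake Road,555-0101\n"
    "Ravi,8 Hill Street,555-0102\n"
)

# Each line is one record; the comma delimiter separates the fields.
records = list(csv.DictReader(flat_file))

# There is no index or query language: finding a record means scanning every line.
matches = [r for r in records if r["name"] == "Ravi"]
print(matches[0]["phone"])
```

The linear scan is the main limitation that the later database generations address.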

2. Hierarchical Databases

 Definition: Data is organized in a tree-like structure, with parent-child relationships. Each
child record has only one parent, creating a hierarchy.

 Example: IBM's Information Management System (IMS).

 Application: Used in applications like banking systems where customer records are related to
accounts.

3. Network Databases

 Definition: Network databases allow a record to have multiple parents as well as multiple
children, so the records form a graph. This makes the structure more flexible than a
hierarchical database.

 Example: Integrated Data Store (IDS).

 Real-Time Example: A university database where students can enroll in multiple courses, and
each course can have many students.

4. Relational Databases

 Definition: Data is stored in tables (relations) that can be linked through common fields
(keys). It uses Structured Query Language (SQL) for data manipulation.

 Example: MySQL, PostgreSQL, and Oracle Database.

 Real-Time Example: A retail store's database that includes separate tables for customers,
orders, and products, allowing easy access to related data.
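
The retail example can be sketched as three tables linked by key columns and queried with SQL. The table names, column names, and sample rows below are assumptions made for illustration, and SQLite (through Python's sqlite3 module) stands in for any relational database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products  (product_id  INTEGER PRIMARY KEY, title TEXT, price REAL);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        product_id  INTEGER REFERENCES products(product_id),
        quantity    INTEGER
    );
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO products  VALUES (10, 'Notebook', 2.5), (11, 'Pen', 1.0);
    INSERT INTO orders    VALUES (100, 1, 10, 4), (101, 2, 11, 3);
""")

# The common key columns let one query combine all three tables.
order_lines = conn.execute("""
    SELECT c.name, p.title, o.quantity * p.price AS total
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    JOIN products  AS p ON p.product_id  = o.product_id
    ORDER BY o.order_id
""").fetchall()
print(order_lines)
```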

5. Object-Oriented Databases

 Definition: Stores data as objects, similar to object-oriented programming. Each object can
contain both data and behavior.

 Example: ObjectDB, db4o.

 Real-Time Example: A multimedia application that manages images and videos as objects,
each with properties (like size, format) and methods (like play, resize).

6. NoSQL Databases

 Definition: Non-relational databases designed to handle large volumes of unstructured or
semi-structured data. They prioritize scalability and flexibility.

 Example: MongoDB, Cassandra, and Redis.

 Real-Time Example: A social media platform like Facebook uses NoSQL databases to store
user profiles, posts, and comments in a flexible way.

Types of Databases

Centralized Database

 Definition: A centralized database is a single database located in one place, which can be
accessed by multiple users or applications from various locations.

 Example: A bank’s main database that stores customer information and transaction
records, which all branches access.

Advantages of Centralized Database

 Easier to manage and back up.
 Better control over data access.
 Fast access for nearby users.

Disadvantages of Centralized Database

o A large centralized database can increase the response time for fetching data.
o Updating such an extensive database system is not easy.
o The server is a single point of failure: if it fails, the whole database becomes unavailable,
and data can be lost if no backup exists.

Distributed Database

 Definition: A distributed database is a collection of multiple databases located across
different geographical locations, which communicate and work together as a single
database.

 Data is available closer to where it is needed.

 More reliable: data can be replicated across multiple sites for redundancy and fault
tolerance.

 Easier to grow by adding more databases.

 Example: A global company with databases in different countries for local offices.

Cloud Database
 Definition: A cloud database is a database service that is hosted on cloud computing
platforms, allowing users to store and manage data over the internet rather than on local
servers.

o Can be accessed from anywhere with internet.

o Scalable to handle more data easily.

o Managed by service providers, reducing maintenance work.

 Example: Google Cloud SQL or Amazon RDS for managing databases online.

Summary

 Centralized Database = One location, easier to manage.

 Distributed Database = Multiple locations, better access and reliability.

 Cloud Database = Online database, accessible from anywhere, less maintenance needed.

What is RDBMS (Relational Database Management System)

RDBMS stands for Relational Database Management System.

Many widely used database management systems, such as MS SQL Server, IBM DB2, Oracle, MySQL, and
Microsoft Access, are relational. (SQL itself is the query language these systems use, not a DBMS.)

It is called a Relational Database Management System because it is based on the relational model
introduced by E.F. Codd, who proposed the model in a paper published in 1970.

Following are the various terminologies of RDBMS:


What is table/Relation?

Everything in a relational database is stored in the form of relations. An RDBMS uses tables to store
data. A table is a collection of related data entries, arranged in rows and columns. Each table
represents a kind of real-world object, such as a person, place, or event, about which information is
collected. The organization of data into relational tables is known as the logical view of the
database.

Properties of a Relation:

o Each relation has a unique name by which it is identified in the database.

o Relation does not contain duplicate tuples.

o The tuples of a relation have no specific order.

o All attributes in a relation are atomic, i.e., each cell of a relation contains exactly one value.

What is a row or record?

A row of a table is also called a record or tuple. It contains the specific information of each entry in
the table. It is a horizontal entity in the table.

Properties of a row:

o No two tuples are identical to each other in all their entries.

o All tuples of the relation have the same format and the same number of entries.

o The order of the tuple is irrelevant. They are identified by their content, not by their position.

What is a column/attribute?

A column is a vertical entity in the table which contains all information associated with a specific field
in a table.

Properties of an Attribute:

o Every attribute of a relation must have a name.

o Null values are permitted for the attributes.

o A default value can be specified for an attribute; it is inserted automatically if no other
value is supplied.

o Attributes that uniquely identify each tuple of a relation are the primary key.

What is data item/Cells?

The smallest unit of data in the table is the individual data item. It is stored at the intersection of
tuples and attributes.

Properties of data items:


o Data items are atomic.

o The data items for an attribute should be drawn from the same domain.

Degree:

The total number of attributes that comprise a relation is known as the degree of the table.

For example, a student table with 4 attributes has a degree of 4.

Cardinality:

The total number of tuples at any one time in a relation is known as the table's cardinality. The
relation whose cardinality is 0 is called an empty table.

For example, a student table that currently holds 5 rows has a cardinality of 5.
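
Degree and cardinality can both be read off a live table. The sketch below assumes a hypothetical student table with 4 columns and 5 rows, and uses SQLite's PRAGMA table_info, which returns one row per column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, city TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(i, f"student{i}", 18 + i, "Pune") for i in range(1, 6)],
)

# Degree: number of attributes (columns).
degree = len(conn.execute("PRAGMA table_info(student)").fetchall())

# Cardinality: number of tuples (rows) at this moment.
cardinality = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(degree, cardinality)
```

The degree changes only when the table definition changes, while the cardinality changes with every insert or delete.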

Domain:

The domain is the set of possible values an attribute can contain. It can be specified using
standard data types such as integers and floating-point numbers. For example, an attribute named
Marital_Status may be limited to the values married and unmarried.

NULL Values

A NULL value indicates that a field was left blank when the record was created, so no value is
stored. It is different from a field that contains zero or a field that contains a space.
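
The difference between NULL, zero, and an empty string shows up as soon as the data is queried. The contact table below is a made-up example, run against SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (id INTEGER PRIMARY KEY, phone TEXT, visits INTEGER)")

# Row 1 stores an empty string and a zero; row 2 leaves both fields blank (NULL).
conn.execute("INSERT INTO contact (id, phone, visits) VALUES (1, '', 0)")
conn.execute("INSERT INTO contact (id) VALUES (2)")

null_ids = [r[0] for r in conn.execute("SELECT id FROM contact WHERE phone IS NULL")]
empty_ids = [r[0] for r in conn.execute("SELECT id FROM contact WHERE phone = ''")]
zero_ids = [r[0] for r in conn.execute("SELECT id FROM contact WHERE visits = 0")]
print(null_ids, empty_ids, zero_ids)
```

Note that NULL is tested with IS NULL: a comparison such as visits = 0 never matches a NULL field.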

Data Integrity

Each RDBMS supports the following categories of data integrity:

Entity integrity: Every row must be uniquely identifiable, so a table has no duplicate rows and its
primary key is never NULL.

Domain integrity:

Definition: Domain integrity ensures that all data in a database column meets certain constraints or
rules based on the data type and valid values for that column.

Examples of Constraints:

 Data Type: Ensuring a column that is meant to store dates only contains date values (e.g.,
DATE, TIME).

 Value Constraints: Setting limits, like ensuring an age column only contains values between 0
and 120.

 Null Constraints: Specifying whether a column can accept NULL values.


Referential integrity: A foreign key value must match an existing row in the referenced table, so a
row cannot be deleted while other records still refer to it.

User-defined integrity: It enforces some specific business rules defined by users. These rules are
different from the entity, domain, or referential integrity.

Example: A customer cannot place an order if their account is inactive.
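
Entity, domain, and referential integrity can all be declared as constraints and left to the database to enforce. The sketch below is one possible illustration using SQLite. The customer and orders tables are hypothetical, and SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: off by default
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,                    -- entity integrity
        name        TEXT NOT NULL,                          -- null constraint
        age         INTEGER CHECK (age BETWEEN 0 AND 120)   -- domain integrity
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
                    REFERENCES customer(customer_id)        -- referential integrity
    );
    INSERT INTO customer VALUES (1, 'Asha', 30);
    INSERT INTO orders VALUES (100, 1);
""")

def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

results = {
    "duplicate key": rejected("INSERT INTO customer VALUES (1, 'Ravi', 25)"),
    "age out of range": rejected("INSERT INTO customer VALUES (2, 'Ravi', 150)"),
    "missing name": rejected("INSERT INTO customer VALUES (3, NULL, 25)"),
    "orphan order": rejected("INSERT INTO orders VALUES (101, 99)"),
    "delete referenced row": rejected("DELETE FROM customer WHERE customer_id = 1"),
}
print(results)
```

User-defined rules such as "inactive customers cannot order" usually need application logic or triggers, because they depend on more than one column's declared type and range.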

DBMS vs RDBMS

Feature | DBMS (Database Management System) | RDBMS (Relational Database Management System)
Data Structure | Data is stored as files; it may not be structured. | Data is stored in tabular form (tables) with rows and columns.
Relationships | Typically does not support relationships between data. | Supports relationships between tables through foreign keys.
Normalization | Does not typically use normalization; data redundancy is common. | Uses normalization to reduce data redundancy and improve data integrity.
Data Integrity | Integrity constraints are not enforced. | Enforces data integrity through primary keys and foreign keys.
Query Language | Typically uses a file-based query language or no standard query language. | Uses Structured Query Language (SQL) for data manipulation and queries.
Examples | Hierarchical databases (like IBM IMS) and network databases. | MySQL, PostgreSQL, Oracle, and SQL Server.
Data Access | Data access can be slower, as it may require more complex navigation. | Data access is generally faster due to indexing and efficient querying.
Usage | Suitable for smaller applications and less complex data management. | Suitable for larger applications with complex data management needs.

File System vs DBMS

Aspect | File System | Database Management System (DBMS)
Definition | A method for storing and organizing files on a disk. | A software system that enables the creation, management, and manipulation of databases.
Data Structure | Data is stored in files and directories. | Data is organized in structured formats (tables, records).
Data Redundancy | Higher redundancy; the same data may be stored in multiple files. | Minimizes redundancy; data normalization reduces duplication.
Data Integrity | Limited integrity constraints; the user must manage consistency. | Enforces data integrity through constraints and rules.
Data Access | Accessed through file operations (read, write). | Accessed through queries, typically written in Structured Query Language (SQL).
Security | Basic security; relies on file system permissions. | Advanced security features (user roles, permissions) for data protection.
Concurrency | Limited support for concurrent access; can lead to file locking. | Supports multiple simultaneous users, with concurrency control to prevent conflicts.
Scalability | Less scalable; managing large volumes of data can be challenging. | More scalable; can handle large datasets efficiently.
Data Relationships | Relationships between data are not explicitly defined. | Defines relationships between data through keys, such as foreign keys.
Backup & Recovery | Manual backup and recovery procedures are needed. | Provides automated backup and recovery options.


DBMS Architecture
Database Management System (DBMS) architecture refers to the structure that defines how data is
stored, accessed, and managed within a DBMS. Here’s a simple explanation of the three main types
of DBMS architectures along with real-time examples for better understanding.

1. Single-tier Architecture (Monolithic)

 Description: In this architecture, the DBMS is installed on a single machine where both the
database and the application reside. Users interact directly with the database.

 Example: A desktop application that stores data in a local database file (like Microsoft
Access). When a user opens the application, they directly access and manage the data stored
on their machine.

2. Two-tier Architecture

 Description: This architecture divides the DBMS into two parts: the client (application) and
the server (database). The client sends requests to the server, which processes the request
and sends back the results.

 Example: A client-server application like a banking system where the client application (user
interface) runs on a user’s computer, and the database server runs on a centralized server.
When a user checks their account balance, the client sends a request to the server, which
retrieves the information from the database and returns it to the client.

3. Three-tier Architecture
 Description: This architecture adds an additional layer between the client and the database
server, typically referred to as the application server. This layer processes business logic and
acts as an intermediary between the user interface and the database.

 Example: An online shopping platform (like Amazon).

o Presentation Layer: The web interface (the user sees) where customers browse
products.

o Application Layer: The server-side logic that processes user requests (adding items
to a cart, checking out).

o Data Layer: The database that stores product information, user accounts, and
transactions. When a user adds an item to their cart, the request goes to the
application server, which then communicates with the database server to update the
information.
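
The three layers of the shopping example can be sketched as three groups of functions in one Python file. In a real deployment each layer would run as a separate process or server. All function names, table names, and messages here are hypothetical, and SQLite stands in for the database server.

```python
import sqlite3

# --- Data layer: the only code that talks to the database. ---
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, title TEXT, stock INTEGER);
    CREATE TABLE cart_item (user_id INTEGER, product_id INTEGER, quantity INTEGER);
    INSERT INTO product VALUES (1, 'Notebook', 3);
""")

def get_stock(product_id):
    row = conn.execute(
        "SELECT stock FROM product WHERE product_id = ?", (product_id,)
    ).fetchone()
    return row[0] if row else 0

def insert_cart_item(user_id, product_id, quantity):
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO cart_item VALUES (?, ?, ?)", (user_id, product_id, quantity)
        )

# --- Application layer: business rules, no SQL and no user interface. ---
def add_to_cart(user_id, product_id, quantity):
    if quantity <= 0 or quantity > get_stock(product_id):
        return {"ok": False, "message": "Requested quantity is not available"}
    insert_cart_item(user_id, product_id, quantity)
    return {"ok": True, "message": "Item added to cart"}

# --- Presentation layer: formats the response for the user. ---
def render(response):
    return ("SUCCESS: " if response["ok"] else "ERROR: ") + response["message"]

print(render(add_to_cart(user_id=7, product_id=1, quantity=2)))
print(render(add_to_cart(user_id=7, product_id=1, quantity=5)))
```

Because the presentation code only sees the dictionary returned by add_to_cart, the database could be swapped out without touching it, which is the main argument for the extra tier.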

Visual Representation

Here’s a simple diagram to visualize the three architectures:

Single-tier Architecture:

-------------------------------------------------
|                  Application                  |
|              (DBMS and Database)              |
-------------------------------------------------

Two-tier Architecture:

-------------------------------------------------
|              Client Application               |
-------------------------------------------------
                        |
                        |
-------------------------------------------------
|                Database Server                |
-------------------------------------------------

Three-tier Architecture:

-------------------------------------------------
|              Client Application               |
-------------------------------------------------
                        |
                        |
-------------------------------------------------
|              Application Server               |
-------------------------------------------------
                        |
                        |
-------------------------------------------------
|                Database Server                |
-------------------------------------------------

Summary

 Single-tier: Everything is in one place; easy to set up but not scalable.

 Two-tier: Client and server communicate directly; good for small to medium applications.

 Three-tier: More scalable and flexible; best for larger applications with complex logic.

Three Schema Architecture

The Three Schema Architecture is a framework for designing databases that separates the user's
view of data from the physical storage of data. This architecture helps in data abstraction and
ensures that changes to the database structure do not affect how users access the data. The
architecture is divided into three levels:

1. Internal Schema (Physical Level)


 Description: This is the lowest level of data abstraction. It describes how the data is
physically stored in the database, including the data structures and file formats.

 Example: It defines the storage format (like B-trees, hashing) and the data organization on
the storage device. If a database uses compression or specific indexing techniques to
optimize storage, those details are part of the internal schema.

2. Conceptual Schema (Logical Level)

 Description: This level provides a community view of the entire database. It defines what
data is stored, the relationships between the data, and the constraints on the data without
worrying about how the data is physically stored.

 Example: In a university database, the conceptual schema might define entities such as
Student, Course, and Enrollment, along with their relationships. It specifies that a Student
can enroll in multiple Courses and each Course can have multiple Students.

3. External Schema (View Level)

 Description: This is the highest level of abstraction. It describes how individual users view
the data. Different users may have different views based on their needs and roles.

 Example: In the university database, an external schema for a professor might only show
Courses they teach and the Students enrolled in those courses. Meanwhile, an external
schema for a student might show their enrolled courses and grades. Each user accesses data
relevant to their role without seeing the entire database.

Summary of Benefits
 Data Abstraction: Users interact with the database without needing to understand how it is
stored.

 Independence: Changes in one schema do not affect others. For example, if the internal
schema changes (like changing file storage methods), it doesn't impact the conceptual or
external schemas.

 Security: By providing different views for different users, sensitive data can be hidden from
unauthorized users.

 Mappings: The conceptual/internal mapping translates the conceptual schema (logical data
structure) to the internal schema (physical storage), ensuring data integrity and efficient
storage. The external/conceptual mapping defines how the different user views (external
schemas) relate to the conceptual schema, allowing users to access relevant data without
interacting with the entire database structure.
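
In SQL databases, an external schema is commonly realized as a view defined over the conceptual schema. The sketch below assumes a small university schema. The table names, the professor 'Dr. Rao', and the view name rao_roster are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema: the full logical structure.
    CREATE TABLE student    (student_id INTEGER PRIMARY KEY, name TEXT, phone TEXT);
    CREATE TABLE course     (course_id INTEGER PRIMARY KEY, title TEXT, professor TEXT);
    CREATE TABLE enrollment (student_id INTEGER, course_id INTEGER, grade TEXT);

    INSERT INTO student VALUES (1, 'Asha', '555-0101'), (2, 'Ravi', '555-0102');
    INSERT INTO course VALUES (10, 'Databases', 'Dr. Rao'), (11, 'Networks', 'Dr. Iyer');
    INSERT INTO enrollment VALUES (1, 10, 'A'), (2, 10, 'B'), (2, 11, 'A');

    -- External schema for one professor: only their courses, and no phone numbers.
    CREATE VIEW rao_roster AS
        SELECT c.title, s.name, e.grade
        FROM enrollment AS e
        JOIN student AS s ON s.student_id = e.student_id
        JOIN course  AS c ON c.course_id  = e.course_id
        WHERE c.professor = 'Dr. Rao';
""")

roster = conn.execute("SELECT title, name, grade FROM rao_roster ORDER BY name").fetchall()
print(roster)
```

The view definition plays the role of the external/conceptual mapping: it states how the professor's view is derived from the underlying tables.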

Data Models
Data models are fundamental frameworks used to define the structure, organization, and
relationships of data within a database. They serve as blueprints for designing databases and help in
understanding how data is stored, accessed, and manipulated. Here are the main types of data
models:

1. Hierarchical Data Model

 Description: Organizes data in a tree-like structure where each record has a single parent
and potentially many children, resembling a hierarchy.

 Example: An organization chart where an employee can have one direct supervisor but
multiple subordinates.

2. Network Data Model

 Description: Similar to the hierarchical model, but allows more complex relationships by
permitting each record to have multiple parent and child records, forming a graph structure.

 Example: A transportation network where cities can be connected to multiple other cities,
representing various routes.

3. Relational Data Model

 Description: Represents data in tables (relations) where each table consists of rows (records)
and columns (attributes). Relationships between tables are established through foreign keys.

 Example: A customer database where a Customers table is related to an Orders table
through a CustomerID foreign key.

4. Object-oriented Data Model

 Description: Integrates object-oriented programming principles with database technology,
storing data in the form of objects that include both data and behavior (methods).

 Example: A multimedia database where images, videos, and audio files are treated as
objects with properties (size, format) and methods (play, edit).
5. Entity-Relationship Model (ER Model)

 Description: Uses entities (objects) and relationships to visually represent the structure of a
database. Entities have attributes, and relationships depict how entities are connected.

 Example: A university database where Students, Courses, and Enrollments are represented
as entities, showing their attributes and how they relate to each other.

An entity is a real-world object or concept that can be distinctly identified, such as a student, a
faculty member, or a chair.

Attributes are the properties of an entity, such as name and id.

A relationship is an association between two or more entities, for example: a student enrolls in a
course.

Semi-Structured Data Model

Definition: A semi-structured data model is a type of data model where data does not conform to a
fixed schema but still contains tags or markers to separate and identify data elements. It is more
flexible than structured data (like relational databases) but more organized than unstructured data
(like plain text). Common formats for semi-structured data include XML and JSON, and many NoSQL
databases store data in such formats.

Examples:

1. JSON (JavaScript Object Notation):

o Description: A lightweight format for data interchange that uses a simple key-value
pair structure.

2. XML (eXtensible Markup Language):

o Description: A markup language that defines rules for encoding documents in a
format that is both human-readable and machine-readable.

3. Email:

o Description: An email message contains semi-structured data, as it has a defined
format but can vary in content.
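
As a small illustration of the JSON and XML formats described above, the sketch below parses one made-up student record in each format using Python's standard json and xml.etree.ElementTree modules. The field names and values are assumptions, not taken from the original notes.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical student record in JSON: key-value pairs, with a list-valued field.
json_text = '{"id": 1, "name": "Asha", "phones": ["555-0101", "555-0199"]}'
student = json.loads(json_text)

# A similar record in XML: tags mark each data element.
xml_text = "<student id='2'><name>Ravi</name><phone>555-0102</phone></student>"
root = ET.fromstring(xml_text)

# Fields are found by key or tag name, not by a fixed column position,
# and a missing field (here, email) is not an error.
print(student["name"], len(student["phones"]))
print(root.get("id"), root.findtext("name"), root.findtext("email", default="(no email)"))
```

The two records do not share the same set of fields, which is exactly what a fixed relational schema would not allow without NULLs or extra tables.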

Data Model Schema and Instance

o The data which is stored in the database at a particular moment of time is called an instance
of the database.

o The overall design of a database is called schema.

o A database schema is the skeleton structure of the database. It represents the logical view of
the entire database.

o A schema contains schema objects like table, foreign key, primary key, views, columns, data
types, stored procedure, etc.

Data Independence
o Data independence can be explained using the three-schema architecture.

o Data independence is the ability to modify the schema at one level of the database system
without altering the schema at the next higher level.

There are two types of data independence:

1. Logical Data Independence

o Logical data independence is the ability to change the conceptual schema without having to
change the external schema.

o Logical data independence separates the external level from the conceptual level.

o If we change the conceptual view of the data, the user view of the data is not affected.

o Logical data independence occurs at the user interface level.

2. Physical Data Independence

o Physical data independence is the capacity to change the internal schema without having to
change the conceptual schema.

o If we change the storage size of the database system server, the conceptual structure of the
database is not affected.

o Physical data independence separates the conceptual level from the internal level.

o Physical data independence occurs at the logical interface level.
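
Both kinds of independence can be observed in a small experiment, sketched here with SQLite. Adding an index is used as an example of an internal-level change, and adding a column as an example of a conceptual-level change. The table, index, and view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Delhi"), (3, "Meera", "Pune")],
)

query = "SELECT name FROM student WHERE city = ? ORDER BY name"
before = conn.execute(query, ("Pune",)).fetchall()

# Physical data independence: an index changes how rows are found,
# but the table definition and the query text stay the same.
conn.execute("CREATE INDEX idx_student_city ON student (city)")
after = conn.execute(query, ("Pune",)).fetchall()

# Logical data independence: the view (external schema) keeps working
# after a column is added to the underlying table (conceptual schema).
conn.execute("CREATE VIEW student_names AS SELECT id, name FROM student")
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
view_rows = conn.execute("SELECT id, name FROM student_names ORDER BY id").fetchall()
print(before == after, view_rows)
```

Logical independence is harder to achieve in general: removing or renaming a column that a view depends on would still break that view.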

Database Languages in DBMS

Database languages can be used to read, store and update the data in the database.

Types of Database Languages


1. Data Definition Language (DDL)

o DDL stands for Data Definition Language. It is used to define database structure or pattern.

o It is used to create schema, tables, indexes, constraints, etc. in the database.

Here are some tasks that come under DDL:

o Create: It is used to create objects in the database.

o Alter: It is used to alter the structure of the database.

o Drop: It is used to delete objects from the database.

o Truncate: It is used to remove all records from a table.

o Rename: It is used to rename an object.
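
The DDL tasks above can be tried out in SQLite through Python's sqlite3 module, as in the following sketch. The table names are made up. SQLite has no TRUNCATE statement, so an unfiltered DELETE is used in its place here, which is an SQLite-specific substitution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names():
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]

conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")  # Create
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")                 # Alter
conn.execute("ALTER TABLE student RENAME TO learner")                     # Rename
columns = [row[1] for row in conn.execute("PRAGMA table_info(learner)")]
after_rename = table_names()

conn.execute("INSERT INTO learner (id, name) VALUES (1, 'Asha')")
conn.execute("DELETE FROM learner")  # stands in for TRUNCATE: removes all records
remaining = conn.execute("SELECT COUNT(*) FROM learner").fetchone()[0]
conn.commit()

conn.execute("DROP TABLE learner")                                        # Drop
after_drop = table_names()
print(columns, after_rename, remaining, after_drop)
```

Truncate and Drop differ in what survives: after truncating, the empty table still exists, while after dropping, the table itself is gone.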

2. Data Manipulation Language (DML)

DML stands for Data Manipulation Language. It is used for accessing and manipulating data in a
database. It handles user requests.

Here are some tasks that come under DML:

o Select: It is used to retrieve data from a database.

o Insert: It is used to insert data into a table.

o Update: It is used to update existing data within a table.

o Delete: It is used to delete records from a table (all records, if no condition is given).

3. Data Control Language (DCL)

o DCL stands for Data Control Language.

o It controls access to the data in the database.

Here are some tasks that come under DCL:

o Grant: It is used to give user access privileges to a database.

o Revoke: It is used to take back permissions from the user.

4. Transaction Control Language (TCL)


TCL is used to manage the changes made by DML statements. DML statements can be grouped into a
logical transaction, and TCL commands manage these transactions to ensure data integrity.

Here are some tasks that come under TCL:

o Commit: It is used to save the transaction on the database.

o Rollback: It is used to restore the database to its state as of the last Commit.
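
Commit and Rollback can be observed through Python's sqlite3 module, where they are exposed as the connection methods commit() and rollback(). The student table and its rows below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

def count():
    return conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.commit()    # Commit: the insert becomes permanent

conn.execute("INSERT INTO student VALUES (2, 'Ravi')")
conn.execute("UPDATE student SET name = 'Changed' WHERE id = 1")
inside_transaction = count()   # uncommitted changes are visible to this connection

conn.rollback()  # Rollback: undo everything since the last commit
after_rollback = count()
name = conn.execute("SELECT name FROM student WHERE id = 1").fetchone()[0]
print(inside_transaction, after_rollback, name)
```

Both the second insert and the update are undone together, because they belong to the same uncommitted transaction.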

ACID Properties

The ACID properties in a Database Management System (DBMS) ensure that transactions are
processed reliably. Each letter in ACID stands for one of the key properties: Atomicity, Consistency,
Isolation, and Durability. Here’s a simple breakdown with examples:

1. Atomicity

 Definition: Atomicity ensures that a transaction is all-or-nothing: either every part of it is
completed, or none of it is. The transaction is treated as a single 'unit of work'.

 Example: Suppose you’re transferring $100 from Account A to Account B. The transaction
involves two steps: deducting $100 from Account A and adding $100 to Account B. If one
step succeeds and the other fails, the whole transaction is canceled, and no money is
transferred. This avoids partial updates that could cause inconsistencies.

2. Consistency

 Definition: Consistency ensures that a transaction brings the database from one valid state
to another. Any data written to the database must follow all defined rules, constraints, and
triggers.

 Example: In the same transfer, if Account A’s balance goes below $0 and the system rules
prevent overdrafts, the transaction is aborted. Consistency ensures that all data changes
follow database rules and constraints.

3. Isolation

 Definition: Isolation ensures that concurrent transactions do not interfere with each other.
Changes made by one transaction are not visible to other ongoing transactions until it is
committed.

 Example: If two users try to transfer money from the same account at the same time,
isolation ensures that the result is the same as if the transactions had been processed one
after the other, preventing conflicts or incorrect balances.

4. Durability

 Definition: Durability guarantees that once a transaction is committed, it will remain saved,
even in case of a system crash or power failure.

 Example: After the $100 transfer is completed and confirmed, the system will save it so that
even if the system crashes afterward, the transaction is still recorded and will not be lost.
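
The transfer example can be sketched as follows, assuming a hypothetical account table with a CHECK constraint that forbids overdrafts. The sketch illustrates atomicity and consistency only. Isolation needs several concurrent connections and durability needs an on-disk database, neither of which is shown with this in-memory setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)   -- rule: no overdrafts
    );
    INSERT INTO account VALUES ('A', 150), ('B', 50);
""")

def transfer(source, target, amount):
    """Move money between accounts as one unit of work. Returns True on success."""
    try:
        with conn:  # commits if the block finishes, rolls back if it raises
            # Credit first, so a failed debit leaves a partial update to undo.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, target)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, source)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))

first = transfer("A", "B", 100)    # allowed: A keeps 50
second = transfer("A", "B", 100)   # rejected: A would drop below 0
print(first, second, balances())
```

In the rejected transfer, the credit to B had already been applied when the debit failed, and the rollback removes it again, so no money appears from nowhere.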

Summary of ACID Properties


 Atomicity: All or nothing.

 Consistency: Maintains valid data rules.

 Isolation: Each transaction is independent.

 Durability: Once saved, it stays saved.

ER (Entity Relationship) Diagram in DBMS


o ER model stands for Entity-Relationship model. It is a high-level data model used to define
the data elements and relationships for a specified system.

For example, suppose we design a school database. In this database, the student is an entity with
attributes like address, name, id, and age. The address can be another entity with attributes like
city, street name, and pin code, and there is a relationship between the two.

Components of ER Diagram

1. Entity:

An entity may be any object, class, person, or place. In an ER diagram, an entity is represented by
a rectangle.

Consider an organization as an example: manager, product, employee, department, etc. can each be
taken as an entity.

a. Weak Entity

An entity that depends on another entity is called a weak entity. A weak entity does not have a key
attribute of its own. It is represented by a double rectangle.

2. Attribute
An attribute describes a property of an entity. An ellipse is used to represent an attribute.

For example, id, age, contact number, name, etc. can be attributes of a student.

a. Key Attribute

The key attribute uniquely identifies each instance of an entity. It represents a primary key. The
key attribute is represented by an ellipse with the text underlined.

b. Composite Attribute

An attribute that is composed of several other attributes is known as a composite attribute. It is
represented by an ellipse, and the ellipses of its component attributes are connected to that ellipse.
c. Multivalued Attribute

An attribute that can have more than one value is known as a multivalued attribute.
A double oval is used to represent a multivalued attribute.

For example, a student can have more than one phone number.

d. Derived Attribute

An attribute that can be derived from other attributes is known as a derived attribute. It is
represented by a dashed ellipse.

For example, a person's age changes over time and can be derived from another attribute like date
of birth.
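As a small sketch of a derived attribute, age can be computed from a stored date of birth instead of being stored itself. The function name age_on and the sample dates are assumptions made for this illustration.

```python
from datetime import date

def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in whole years from a stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

print(age_on(date(2000, 6, 15), date(2024, 6, 14)))  # 23
print(age_on(date(2000, 6, 15), date(2024, 6, 15)))  # 24
```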

3. Relationship

A relationship is used to describe the relation between entities. Diamond or rhombus is used to
represent the relationship.

Types of relationship are as follows:


a. One-to-One Relationship

When each instance of one entity is associated with at most one instance of the other entity, and
vice versa, it is known as a one-to-one relationship.

For example, a female can marry one male, and a male can marry one female.

b. One-to-many relationship

When only one instance of the entity on the left and more than one instance of the entity on the
right are associated with the relationship, it is known as a one-to-many relationship.

For example, a scientist can make many inventions, but each invention is made by one specific
scientist.

c. Many-to-one relationship

When more than one instance of the entity on the left and only one instance of the entity on the
right are associated with the relationship, it is known as a many-to-one relationship.

For example, a student enrolls in only one course, but a course can have many students.

d. Many-to-many relationship

When more than one instance of the entity on the left and more than one instance of the entity on
the right are associated with the relationship, it is known as a many-to-many relationship.

For example, an employee can be assigned to many projects, and a project can have many employees.

Notations of ER model
Cardinality in a Database Management System (DBMS) refers to the number of instances of one
entity that can or must be associated with each instance of another entity.

Mapping constraints define the relationships between entities in a database, specifying how many
instances of one entity can or must be associated with instances of another entity.

Types of Cardinality

1. One-to-One (1:1):

o Definition: Each instance of Entity A is related to one instance of Entity B, and vice
versa.

o Example: A person can have only one passport, and a passport is assigned to only
one person.

2. One-to-Many (1:M):

o Definition: Each instance of Entity A can relate to multiple instances of Entity B, but
each instance of Entity B relates to only one instance of Entity A.

o Example: A teacher can teach multiple students, but each student has only one
teacher for a specific subject.

3. Many-to-One (M:1):

o Definition: Each instance of Entity B can relate to multiple instances of Entity A, but
each instance of Entity A relates to only one instance of Entity B.

o Example: Multiple students can belong to one class, but each student is enrolled in
just that one class.

4. Many-to-Many (M:M):
o Definition: Each instance of Entity A can relate to multiple instances of Entity B, and
each instance of Entity B can relate to multiple instances of Entity A.

o Example: Students can enroll in multiple courses, and each course can have multiple
students enrolled.

In the context of a Database Management System (DBMS), keys are attributes or a combination of
attributes that help uniquely identify records within a table. They play a crucial role in ensuring data
integrity and establishing relationships between tables. Here’s a breakdown of the main types of
keys:

Types of Keys

1. Primary Key:

o Definition: A unique identifier for each record in a table. No two rows can have the
same primary key value, and it cannot contain null values.

o Example: In a table of students, the student ID can serve as the primary key.

2. Foreign Key:

o Definition: An attribute that creates a link between two tables. It refers to the
primary key in another table, establishing a relationship.

o Example: In an enrollment table, a student ID could be a foreign key that references
the primary key in the students table.

3. Composite Key:

o Definition: A combination of two or more attributes that together uniquely identify a
record in a table.

o Example: In a table tracking course enrollments, a combination of student ID and
course ID may serve as a composite key.

4. Candidate Key:

o Definition: An attribute, or set of attributes, that can uniquely identify a record.
There can be multiple candidate keys in a table, but only one can be chosen as the
primary key.

o Example: Both student ID and email address could be candidate keys for a students
table.

5. Alternate Key:

o Definition: Any candidate key that is not chosen as the primary key.

o Example: If student ID is the primary key, then email address would be the alternate
key.

6. Super Key:

o Definition: Any set of attributes that can uniquely identify a record in a table,
potentially including extra attributes that are not needed for uniqueness.

7. Artificial Key:

o Definition: Also known as a surrogate key, a unique identifier created for a record
that is not derived from the actual data but generated by the system.

o Example: An auto-incremented student number.
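The key types above can be sketched as table definitions using Python's built-in sqlite3 module. The table names, column names and sample values below are assumptions chosen for illustration; other database systems use slightly different syntax for auto-generated keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,               -- primary key
    email      TEXT NOT NULL UNIQUE               -- candidate key kept as an alternate key
);
CREATE TABLE courses (
    course_no INTEGER PRIMARY KEY AUTOINCREMENT,  -- artificial (surrogate) key
    title     TEXT NOT NULL
);
CREATE TABLE enrollments (
    student_id INTEGER NOT NULL REFERENCES students(student_id),  -- foreign key
    course_no  INTEGER NOT NULL REFERENCES courses(course_no),    -- foreign key
    PRIMARY KEY (student_id, course_no)                           -- composite key
);
""")

def accepted(sql, params=()):
    """Return True if the statement runs, False if a key rule rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

accepted("INSERT INTO students VALUES (1, 'alice@example.com')")
first_course = conn.execute("INSERT INTO courses (title) VALUES ('Databases')").lastrowid
accepted("INSERT INTO enrollments VALUES (1, ?)", (first_course,))

checks = {
    "duplicate primary key": accepted("INSERT INTO students VALUES (1, 'bob@example.com')"),
    "duplicate alternate key": accepted("INSERT INTO students VALUES (2, 'alice@example.com')"),
    "unknown student (foreign key)": accepted("INSERT INTO enrollments VALUES (9, ?)", (first_course,)),
    "duplicate composite key": accepted("INSERT INTO enrollments VALUES (1, ?)", (first_course,)),
}
for rule, ok in checks.items():
    print(rule, "->", "accepted" if ok else "rejected")
```

Every statement in checks breaks one of the key rules, so each is rejected, while the system generates the course_no value on its own.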

Generalization
o Generalization is like a bottom-up approach in which two or more entities of lower
level combine to form a higher level entity if they have some attributes in common.
o In generalization, entities are combined to form a more generalized entity, i.e.,
subclasses are combined to make a superclass.
For example, Faculty and Student entities can be generalized and create a higher level entity
Person.

Specialization
o Specialization is a top-down approach, and it is opposite to Generalization. In
specialization, one higher level entity can be broken down into two or more lower level
entities.
o Normally, the superclass is defined first, the subclasses and their related attributes are
defined next, and relationship sets are then added.
For example: In an Employee management system, EMPLOYEE entity can be specialized as
TESTER or DEVELOPER based on what role they play in the company.
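One loose way to picture the superclass and subclass idea is class inheritance in a programming language. The sketch below is only an analogy: the attributes emp_id, name, test_tool and language and the sample rows are hypothetical and not part of the ER example.

```python
from dataclasses import dataclass

@dataclass
class Employee:              # higher level entity (superclass)
    emp_id: int
    name: str

@dataclass
class Tester(Employee):      # specialization with its own attribute
    test_tool: str

@dataclass
class Developer(Employee):   # specialization with its own attribute
    language: str

staff = [Tester(1, "Asha", "Selenium"), Developer(2, "Ravi", "Python")]
# Every specialized entity is still an Employee and keeps the shared attributes.
names = [e.name for e in staff if isinstance(e, Employee)]
print(names)  # ['Asha', 'Ravi']
```

Read bottom-up, the same classes illustrate generalization: the attributes shared by Tester and Developer are collected in Employee.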
Aggregation
In aggregation, the relationship between two entities is treated as a single entity: a
relationship together with its participating entities is aggregated into a higher level entity.
For example: the Center entity offering the Course entity acts as a single entity, which is in
a relationship with another entity, Visitor. In the real world, a visitor to a coaching center
will not enquire about the Course alone or about the Center alone; he will enquire about
both together.

Reducing an Entity-Relationship (ER) diagram to tables (or relational schema) involves
transforming the entities, attributes, and relationships defined in the ER diagram into a
format suitable for a relational database. Here are the key steps in this process, along with
an example:
Steps to Reduce an ER Diagram to Tables
1. Identify Entities:
o Each entity in the ER diagram will become a table.
2. Define Attributes:
o List out all attributes for each entity. These will become the columns of the
table.
3. Determine Primary Keys:
o Identify a primary key for each table, which uniquely identifies each record.
4. Translate Relationships:
o For each relationship:
 One-to-One: Include the primary key of one entity as a foreign key in
the other.
 One-to-Many: Add the primary key of the "one" side as a foreign key
in the "many" side.
 Many-to-Many: Create a new table that includes the primary keys of
both entities as foreign keys.
5. Handle Attributes in Relationships:
o If a relationship has attributes, include them in the new table created for the
relationship.
Example
ER Diagram
Assume an ER diagram with the following:
 Entities: Student and Course
 Attributes:
o Student: StudentID (Primary Key), Name, Email
o Course: CourseID (Primary Key), CourseName, Credits
 Relationship: Enrolls (Many-to-Many) between Student and Course with an attribute
of EnrollmentDate.
Steps of Reduction
1. Identify Entities:
o Student
o Course
2. Define Attributes:
o Student Table: StudentID, Name, Email
o Course Table: CourseID, CourseName, Credits
3. Determine Primary Keys:
o Student Table: Primary Key is StudentID
o Course Table: Primary Key is CourseID
4. Translate Relationships:
o Since "Enrolls" is a many-to-many relationship, create a new table called
Enrollment:
 Enrollment Table: StudentID (Foreign Key), CourseID (Foreign Key),
EnrollmentDate
5. Handle Attributes in Relationships:
o The Enrollment Table will have the attributes: StudentID, CourseID,
EnrollmentDate.
Final Tables
 Student Table:

StudentID Name Email

1 Alice [email protected]

2 Bob [email protected]

 Course Table:

CourseID CourseName Credits

101 Database Systems 3

102 Data Structures 4

 Enrollment Table:

StudentID CourseID EnrollmentDate

1 101 2024-01-15

1 102 2024-01-20

2 101 2024-01-18
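The reduction can be sketched with Python's built-in sqlite3 module. The column types and the composite primary key on Enrollment are assumptions (the steps only name the foreign keys), and the Email values are left empty because they are not shown in the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Student (
    StudentID INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL,
    Email     TEXT
);
CREATE TABLE Course (
    CourseID   INTEGER PRIMARY KEY,
    CourseName TEXT NOT NULL,
    Credits    INTEGER NOT NULL
);
-- The many-to-many relationship becomes its own table.
CREATE TABLE Enrollment (
    StudentID      INTEGER NOT NULL REFERENCES Student(StudentID),
    CourseID       INTEGER NOT NULL REFERENCES Course(CourseID),
    EnrollmentDate TEXT NOT NULL,
    PRIMARY KEY (StudentID, CourseID)  -- assumed: one enrollment per student and course
);
""")
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)",
                 [(1, "Alice", None), (2, "Bob", None)])
conn.executemany("INSERT INTO Course VALUES (?, ?, ?)",
                 [(101, "Database Systems", 3), (102, "Data Structures", 4)])
conn.executemany("INSERT INTO Enrollment VALUES (?, ?, ?)",
                 [(1, 101, "2024-01-15"), (1, 102, "2024-01-20"), (2, 101, "2024-01-18")])

# Follow the foreign keys back to the entity tables.
rows = conn.execute("""
    SELECT s.Name, c.CourseName, e.EnrollmentDate
    FROM Enrollment AS e
    JOIN Student AS s ON s.StudentID = e.StudentID
    JOIN Course  AS c ON c.CourseID  = e.CourseID
    ORDER BY e.EnrollmentDate
""").fetchall()
for row in rows:
    print(row)
```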
Relationship of higher degree
The degree of relationship can be defined as the number of occurrences in one entity that are
associated with the number of occurrences in another entity.
There are three degrees of relationship:
1. One-to-one (1:1)
2. One-to-many (1:M)
3. Many-to-many (M:N)
1. One-to-one
o In a one-to-one relationship, one occurrence of an entity relates to only one
occurrence in another entity.
o A one-to-one relationship rarely exists in practice.
o For example: if an employee is allocated a company car then that car can only be
driven by that employee.
o Therefore, employee and company car have a one-to-one relationship.
2. One-to-many
o In a one-to-many relationship, one occurrence in an entity relates to many
occurrences in another entity.
o For example: An employee works in one department, but a department has many
employees.
o Therefore, department and employee have a one-to-many relationship.

3. Many-to-many
o In a many-to-many relationship, many occurrences in an entity relate to many
occurrences in another entity.
o As with a one-to-one relationship, a many-to-many relationship rarely exists in
practice.
o For example: At the same time, an employee can work on several projects, and a
project has a team of many employees.
o Therefore, employee and project have a many-to-many relationship.

Relational Model in DBMS


The relational model makes querying much easier than in hierarchical or network database
systems.
E. F. Codd proposed it in 1970.
A relational database is defined as a group of independent tables which are linked to each
other using some common fields of each related table.
This model can be represented as a table with columns and rows.
Each row is known as a tuple.
Each column of the table has a name, called an attribute.
It is well known in database technology because it is usually used to represent real-world
objects and the relationships between them.
Popular relational databases in use today include Oracle, Sybase, DB2, MySQL
Server, etc.
Relational Model Terminologies:
Following are the terminologies of Relational Model:

Relation: Table

Tuple: Row, Record

Attribute: Column, Field

Domain: The set of legal values an attribute can take

Cardinality: The number of rows

Degree: The number of columns
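As a quick sketch of these terms, a relation can be modelled in Python as a set of tuples plus a list of attribute names. The sample rows are only illustrative data.

```python
# Attributes (columns) of the relation.
attributes = ("EMP_CODE", "EMP_NAME")

# Tuples (rows) of the relation; a set, because a relation holds no duplicate tuples.
relation = {(101, "Stephan"), (102, "Jack"), (103, "Harry")}

degree = len(attributes)     # number of columns
cardinality = len(relation)  # number of rows
print(degree, cardinality)   # 2 3
```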

Merits of the Relational Model


1. Simplicity: The relational model uses a straightforward structure based on tables,
making it easy to understand and use.
2. Data Integrity: It enforces data integrity through constraints like primary keys and
foreign keys, ensuring accuracy and consistency.
3. Flexibility: Tables can be easily modified, allowing for the addition of new data
without affecting existing data.
4. Data Independence: Changes in data structure do not require changes to application
programs, promoting easier maintenance.
5. Powerful Query Language: The use of SQL (Structured Query Language) allows for
complex queries to be executed with ease.
6. Support for Transactions: The relational model supports ACID properties (Atomicity,
Consistency, Isolation, Durability), ensuring reliable transactions.
7. Normalization: It allows for normalization, which reduces data redundancy and
improves data organization.
Demerits of the Relational Model
1. Complexity in Design: Designing a normalized database can be complex and time-
consuming, requiring careful planning.
2. Performance Issues: For very large datasets, performance can degrade, especially
with complex queries involving multiple joins.
3. Scalability Challenges: Scaling a relational database horizontally (across multiple
servers) can be more challenging compared to non-relational databases.
4. Overhead: The need for additional storage for constraints and indexes can lead to
increased overhead.
5. Fixed Schema: The fixed schema can be a limitation when dealing with dynamic data
requirements, as altering the schema may require significant effort.
6. Difficulty with Unstructured Data: It is less suited for handling unstructured or semi-
structured data, which is better managed by NoSQL databases.

Relational Algebra is a formal system used to manipulate and query relational databases. It
consists of a set of operations that take one or two relations (tables) as input and produce a
new relation as output. These operations allow users to perform various tasks, such as
retrieving data, filtering records, and combining tables.
Key Operations in Relational Algebra
1. Selection (σ):
o Purpose: To filter rows based on a specified condition.
o Notation: σ(condition)(Relation)
o Example: σ(Age > 25)(Employees) returns all employees older than 25.
2. Projection (π):
o Purpose: To retrieve specific columns from a relation.
o Notation: π(column1, column2, ...)(Relation)
o Example: π(Name, Age)(Employees) returns a list of employee names and
their ages.

3. Union (∪):
o Purpose: To combine the tuples from two relations, eliminating duplicates.
o Notation: Relation1 ∪ Relation2
o Example: Employees1 ∪ Employees2 returns all unique employees from both
tables.
4. Difference (−):
o Purpose: To find tuples that are in one relation but not in another.
o Notation: Relation1 − Relation2
o Example: Employees1 − Employees2 returns employees in Employees1 that
are not in Employees2.
5. Cartesian Product (×):
o Purpose: To combine every tuple of one relation with every tuple of another
relation.
o Notation: Relation1 × Relation2
o Example: Employees × Departments returns all combinations of employees
and departments.
6. Rename (ρ):
o Purpose: To rename a relation or its attributes for clarity.
o Notation: ρ(new_relation_name, old_relation_name)
7. Intersection (∩):
o Purpose: To find tuples common to both relations.
o Notation: Relation1 ∩ Relation2 (not a primitive operation; it can be expressed
using difference alone as Relation1 − (Relation1 − Relation2)).
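The operations above can be sketched in Python by modelling a relation as a set of tuples. The helper names select and project and the sample employee rows are assumptions made for this illustration; a real DBMS evaluates these operations very differently.

```python
from itertools import product

# A relation is modelled as attribute names plus a set of tuples.
employee_attrs = ("Name", "Age")
employees1 = {("Asha", 31), ("Ben", 24), ("Chen", 45)}
employees2 = {("Chen", 45), ("Dev", 29)}

def select(rows, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return {row for row in rows if predicate(row)}

def project(attrs, rows, wanted):
    """Projection: keep only the wanted columns; the set drops duplicates."""
    idx = [attrs.index(a) for a in wanted]
    return {tuple(row[i] for i in idx) for row in rows}

older = select(employees1, lambda r: r[1] > 25)           # selection with Age > 25
names = project(employee_attrs, employees1, ["Name"])     # projection on Name
union = employees1 | employees2                           # union
difference = employees1 - employees2                      # difference
intersection = employees1 - (employees1 - employees2)     # derived from difference
cartesian = {a + b for a, b in product(employees1, employees2)}  # Cartesian product

print(sorted(older))
print(sorted(intersection))
print(len(cartesian))
```

Because the relations are sets, duplicate elimination in union and projection comes for free, and the intersection line confirms that the derived expression gives the same result as the built-in set intersection.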

Join Operations:

A Join operation combines related tuples from different relations, if and only if a given join
condition is satisfied. It is denoted by ⋈.

Example:
EMPLOYEE

EMP_CODE EMP_NAME
101 Stephan

102 Jack

103 Harry

SALARY

EMP_CODE SALARY

101 50000

102 30000

103 25000

1. Operation: (EMPLOYEE ⋈ SALARY)


Result:

EMP_CODE EMP_NAME SALARY

101 Stephan 50000

102 Jack 30000

103 Harry 25000

Types of Join operations:


1. Natural Join:
o A natural join is the set of tuples of all combinations in R and S that are equal on their
common attribute names.

o It is denoted by ⋈.
Example: Let's use the above EMPLOYEE table and SALARY table:
Input:

1. ∏EMP_NAME, SALARY (EMPLOYEE ⋈ SALARY)


Output:

EMP_NAME SALARY

Stephan 50000

Jack 30000

Harry 25000
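A natural join can be sketched in Python with each relation held as a list of dictionaries. The function name natural_join is an assumption for this illustration; the data is the EMPLOYEE and SALARY example above.

```python
EMPLOYEE = [
    {"EMP_CODE": 101, "EMP_NAME": "Stephan"},
    {"EMP_CODE": 102, "EMP_NAME": "Jack"},
    {"EMP_CODE": 103, "EMP_NAME": "Harry"},
]
SALARY = [
    {"EMP_CODE": 101, "SALARY": 50000},
    {"EMP_CODE": 102, "SALARY": 30000},
    {"EMP_CODE": 103, "SALARY": 25000},
]

def natural_join(r, s):
    """Join two relations on every attribute name they have in common."""
    result = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                result.append({**a, **b})  # common attributes appear once
    return result

joined = natural_join(EMPLOYEE, SALARY)
# Projection on EMP_NAME and SALARY, as in the example above.
projected = [(row["EMP_NAME"], row["SALARY"]) for row in joined]
print(projected)  # [('Stephan', 50000), ('Jack', 30000), ('Harry', 25000)]
```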
2. Outer Join:
The outer join operation is an extension of the join operation. It is used to deal with missing
information.
Example:
EMPLOYEE

EMP_NAME STREET CITY

Ram Civil line Mumbai

Shyam Park street Kolkata

Ravi M.G. Street Delhi

Hari Nehru nagar Hyderabad

FACT_WORKERS

EMP_NAME BRANCH SALARY

Ram Infosys 10000

Shyam Wipro 20000

Kuber HCL 30000

Hari TCS 50000

Input:

1. (EMPLOYEE ⋈ FACT_WORKERS)
Output:

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000


Shyam Park street Kolkata Wipro 20000

Hari Nehru nagar Hyderabad TCS 50000

An outer join is basically of three types:


a. Left outer join
b. Right outer join
c. Full outer join
a. Left outer join:
o Left outer join contains the set of tuples of all combinations in R and S that are equal
on their common attribute names.
o In addition, tuples in R that have no matching tuples in S are also kept, with NULL in
the attributes that come from S.

o It is denoted by ⟕.
Example: Using the above EMPLOYEE table and FACT_WORKERS table
Input:

1. EMPLOYEE ⟕ FACT_WORKERS
Output:

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000

Shyam Park street Kolkata Wipro 20000

Hari Nehru nagar Hyderabad TCS 50000

Ravi M.G. Street Delhi NULL NULL

b. Right outer join:


o Right outer join contains the set of tuples of all combinations in R and S that are
equal on their common attribute names.
o In addition, tuples in S that have no matching tuples in R are also kept, with NULL in
the attributes that come from R.

o It is denoted by ⟖.
Example: Using the above EMPLOYEE table and FACT_WORKERS Relation
Input:
1. EMPLOYEE ⟖ FACT_WORKERS
Output:

EMP_NAME BRANCH SALARY STREET CITY

Ram Infosys 10000 Civil line Mumbai

Shyam Wipro 20000 Park street Kolkata

Hari TCS 50000 Nehru nagar Hyderabad

Kuber HCL 30000 NULL NULL

c. Full outer join:


o Full outer join is like a left or right join except that it contains all rows from both
tables.
o In full outer join, tuples in R that have no matching tuples in S and tuples in S that
have no matching tuples in R on their common attribute name are both kept, with NULL
in the missing attributes.

o It is denoted by ⟗.
Example: Using the above EMPLOYEE table and FACT_WORKERS table
Input:

1. EMPLOYEE ⟗ FACT_WORKERS
Output:

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000

Shyam Park street Kolkata Wipro 20000

Hari Nehru nagar Hyderabad TCS 50000

Ravi M.G. Street Delhi NULL NULL

Kuber NULL NULL HCL 30000
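The three outer joins can be sketched in Python with each relation held as a list of dictionaries, using None in place of NULL. The function names are assumptions for this illustration, and the sketch assumes non-empty relations that share a single join attribute.

```python
EMPLOYEE = [
    {"EMP_NAME": "Ram", "STREET": "Civil line", "CITY": "Mumbai"},
    {"EMP_NAME": "Shyam", "STREET": "Park street", "CITY": "Kolkata"},
    {"EMP_NAME": "Ravi", "STREET": "M.G. Street", "CITY": "Delhi"},
    {"EMP_NAME": "Hari", "STREET": "Nehru nagar", "CITY": "Hyderabad"},
]
FACT_WORKERS = [
    {"EMP_NAME": "Ram", "BRANCH": "Infosys", "SALARY": 10000},
    {"EMP_NAME": "Shyam", "BRANCH": "Wipro", "SALARY": 20000},
    {"EMP_NAME": "Kuber", "BRANCH": "HCL", "SALARY": 30000},
    {"EMP_NAME": "Hari", "BRANCH": "TCS", "SALARY": 50000},
]

def left_outer_join(r, s, key):
    """Keep every row of r; pad with None where s has no match on key."""
    s_only = [c for c in s[0] if c != key]  # columns that exist only in s
    out = []
    for a in r:
        matches = [b for b in s if b[key] == a[key]]
        if matches:
            out.extend({**a, **b} for b in matches)
        else:
            out.append({**a, **dict.fromkeys(s_only)})
    return out

def right_outer_join(r, s, key):
    """Keep every row of s: a left outer join with the operands swapped."""
    return left_outer_join(s, r, key)

def full_outer_join(r, s, key):
    """Left outer join plus the rows of s that matched nothing in r."""
    r_keys = {a[key] for a in r}
    r_only = [c for c in r[0] if c != key]
    unmatched = [{**dict.fromkeys(r_only), **b} for b in s if b[key] not in r_keys]
    return left_outer_join(r, s, key) + unmatched

left = left_outer_join(EMPLOYEE, FACT_WORKERS, "EMP_NAME")
right = right_outer_join(EMPLOYEE, FACT_WORKERS, "EMP_NAME")
full = full_outer_join(EMPLOYEE, FACT_WORKERS, "EMP_NAME")
print([row["EMP_NAME"] for row in full])
```

Ravi appears in the left and full results with BRANCH and SALARY set to None, and Kuber appears in the right and full results with STREET and CITY set to None.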

3. Equi join:
It is the most common form of inner join. It is based on matched data as
per the equality condition. The equi join uses the comparison operator (=).
Example:
CUSTOMER RELATION

CLASS_ID NAME

1 John

2 Harry

3 Jackson

PRODUCT

PRODUCT_ID CITY

1 Delhi

2 Mumbai

3 Noida

Input:

1. CUSTOMER ⋈ (CUSTOMER.CLASS_ID = PRODUCT.PRODUCT_ID) PRODUCT
Output:

CLASS_ID NAME PRODUCT_ID CITY

1 John 1 Delhi

2 Harry 2 Mumbai

3 Jackson 3 Noida
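An equi join can be sketched in Python as a nested loop that keeps a pair of rows only when the two named columns are equal. The function name equi_join is an assumption, and the join condition CLASS_ID = PRODUCT_ID is inferred from the example tables.

```python
CUSTOMER = [
    {"CLASS_ID": 1, "NAME": "John"},
    {"CLASS_ID": 2, "NAME": "Harry"},
    {"CLASS_ID": 3, "NAME": "Jackson"},
]
PRODUCT = [
    {"PRODUCT_ID": 1, "CITY": "Delhi"},
    {"PRODUCT_ID": 2, "CITY": "Mumbai"},
    {"PRODUCT_ID": 3, "CITY": "Noida"},
]

def equi_join(r, s, left_col, right_col):
    """Pair rows of r and s whose left_col and right_col values are equal."""
    return [{**a, **b} for a in r for b in s if a[left_col] == b[right_col]]

result = equi_join(CUSTOMER, PRODUCT, "CLASS_ID", "PRODUCT_ID")
for row in result:
    print(row["CLASS_ID"], row["NAME"], row["PRODUCT_ID"], row["CITY"])
```

Unlike a natural join, both compared columns stay in the result because they have different names.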

Integrity Constraints
o Integrity constraints are a set of rules. They are used to maintain the quality of
information.
o Integrity constraints ensure that data insertion, updating, and other processes
are performed in such a way that data integrity is not affected.
o Thus, integrity constraints are used to guard against accidental damage to the database.
Types of Integrity Constraint

1. Domain constraints
o Domain constraints can be defined as the definition of a valid set of values for an
attribute.
o The data type of domain includes string, character, integer, time, date, currency, etc.
The value of the attribute must be available in the corresponding domain.
Example:
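A domain constraint can be sketched with a CHECK rule in Python's built-in sqlite3 module. The student table, its columns, and the allowed age range of 3 to 120 are assumptions made for this illustration. SQLite is loosely typed, so the rule tests the stored type explicitly; most other database systems reject a wrongly typed value from the column type alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    -- domain of age: whole numbers from 3 to 120 (an assumed range)
    age  INTEGER CHECK (typeof(age) = 'integer' AND age BETWEEN 3 AND 120)
)""")

def try_insert(row):
    """Insert one row and report whether the domain rule let it through."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert((1, "Asha", 17)),   # inside the domain
    try_insert((2, "Ben", "A")),   # not an integer
    try_insert((3, "Chen", -4)),   # integer, but outside the range
]
print(results)  # ['accepted', 'rejected', 'rejected']
```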
2. Entity integrity constraints
o The entity integrity constraint states that primary key value can't be null.
o This is because the primary key value is used to identify individual rows in relation
and if the primary key has a null value, then we can't identify those rows.
o A table can contain a null value other than the primary key field.
Example:
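The entity integrity rule can be sketched with Python's built-in sqlite3 module. The employee table and its rows are assumptions made for this illustration. NOT NULL is written out on the key because SQLite, unlike most database systems, does not imply it for every kind of primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE employee (
    emp_id TEXT PRIMARY KEY NOT NULL,  -- the key may never be NULL
    name   TEXT,
    salary INTEGER                     -- non-key columns may be NULL
)""")

def try_insert(row):
    """Insert one row and report whether it was accepted."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert(("E1", "Jack", None)),    # NULL in a non-key column is fine
    try_insert((None, "Harry", 20000)),  # NULL primary key
    try_insert(("E1", "John", 30000)),   # duplicate primary key
]
print(results)  # ['accepted', 'rejected', 'rejected']
```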

3. Referential Integrity Constraints


o A referential integrity constraint is specified between two tables.
o In the Referential integrity constraints, if a foreign key in Table 1 refers to the Primary
Key of Table 2, then every value of the Foreign Key in Table 1 must be null or be
available in Table 2.
Example:
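Referential integrity can be sketched with Python's built-in sqlite3 module. The department and employee tables and their rows are assumptions made for this illustration; employee plays the role of Table 1 and department the role of Table 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (
    dept_no INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_no INTEGER REFERENCES department(dept_no)  -- foreign key, may be NULL
);
INSERT INTO department VALUES (11, 'Sales'), (24, 'Research');
""")

def try_insert(row):
    """Insert one employee and report whether it was accepted."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert((1, "Jack", 11)),     # department 11 exists
    try_insert((2, "Harry", None)),  # NULL foreign key is allowed
    try_insert((3, "John", 18)),     # department 18 does not exist
]
print(results)  # ['accepted', 'accepted', 'rejected']
```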
4. Key constraints
o A key is an attribute or set of attributes that is used to identify an entity within its
entity set uniquely.
o An entity set can have multiple keys, out of which one key will be the primary
key. A primary key must contain a unique value and cannot be null in the relational table.
Example:

Aspect | Entity Integrity Constraint | Key Constraint

Focus | Unique row identifier (primary key) | Unique values in any specified columns

NULL Values | Cannot have NULL values | Can have NULL values, but no duplicates

Example | Student ID must be unique and not NULL | Email addresses must be unique, NULL allowed
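The difference between the two constraints can be sketched with Python's built-in sqlite3 module. The student table and its rows are assumptions made for this illustration. How many NULLs a unique column accepts varies between database systems; SQLite, used here, accepts several.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,  -- entity integrity: unique and not NULL
    email      TEXT UNIQUE           -- key constraint: unique, NULL allowed
)""")

def try_insert(row):
    """Insert one row and report whether it was accepted."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert((1, "asha@example.com")),
    try_insert((2, None)),                # NULL email is allowed
    try_insert((3, None)),                # SQLite treats NULLs as distinct
    try_insert((4, "asha@example.com")),  # duplicate email
]
print(results)  # ['accepted', 'accepted', 'accepted', 'rejected']
```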
