Unit 1
Unit 1: Syllabus
• Introduction: Overview, Database System vs File System, Database System Concept
and Architecture.
• Data Model Schema and Instances, Data Independence, and Database Language
and Interfaces.
• Data Definition Language (DDL), Data Manipulation Language (DML), Overall
Database Structure.
• Data Modeling Using the Entity Relationship Model: ER Model Concepts, Notation
for ER Diagram.
• Mapping Constraints, Keys, Concepts of Super Key, Candidate Key, Primary Key.
Introduction
Data is a collection of raw facts, such as numbers, text, images, or other formats,
that are collected for reference or analysis. Data itself does
not carry any meaning until it is processed or interpreted.
Example: The number ’2024’ is data, but it does not convey any meaning until we
associate it with a year or a quantity.
What is a Database?
A Database is an organized collection of data that can be easily accessed, managed, and
updated. Databases store data in a structured format, using tables, records, and fields,
which allows for efficient querying and manipulation of the data.
Example: A customer database in a retail store contains information like customer
names, addresses, purchase history, and contact details.
A Database Management System (DBMS) is software that interacts with end-users,
applications, and the database itself to capture and analyze data. A DBMS provides an
interface for the users to create, update, and manage databases efficiently.
Example: Widely used DBMSs include Oracle, MySQL, Microsoft SQL Server, MongoDB,
and PostgreSQL.
What is the Need for a DBMS?
A DBMS is needed for the following reasons:
• Data Redundancy Control: Minimizes data duplication and ensures data consistency
across multiple locations.
• Data Integrity: Maintains the accuracy and consistency of data over its lifecycle.
• Backup and Recovery: Provides tools to recover data in case of system failures
or data corruption.
• Data Independence: Allows changes in data structure without affecting the
application programs.
Example of DBMS:
Advantages of DBMS
A Database Management System (DBMS) offers several advantages:
• Data Integrity and Consistency: Ensures that data remains accurate, consistent,
and reliable across the database.
• Data Security: Provides robust security measures to protect data from unauthorized
access and breaches.
• Data Independence: Separates data from the applications that use it, allowing
for flexibility in modifying the database without affecting the application
layer.
• Efficient Data Access: Uses indexing, query optimization, and caching techniques
to enhance data retrieval and manipulation speed.
• Concurrent Access and Crash Recovery: Allows multiple users to access the
data simultaneously and ensures data recovery in case of system failures.
Disadvantages of DBMS
While a DBMS provides numerous benefits, it also has some disadvantages:
Types of Database Users
• Database Administrators (DBAs): Responsible for managing and maintaining
the overall database environment, including user management, backup, recovery,
and security.
Example: A DBA in a large corporation might configure database servers, monitor
performance, and handle disaster recovery planning.
• Application Programmers: Developers who write application programs that
interact with the database. They use languages such as Java, Python, or
SQL to access and manipulate data.
Example: An application programmer might create an e-commerce application
that retrieves product data from a database and displays it on a website.
• End Users: The individuals who interact with the database through applications
to perform tasks like data entry, retrieval, or reporting.
Example: A bank customer using an online portal to check their account balance
is an end user of the database.
• System Analysts: Professionals who design and develop the overall system
architecture, including database design, to meet business requirements.
Example: A system analyst might work with both end-users and DBAs to create
a database schema that supports new business processes.
• Database Designers: Individuals responsible for designing the structure of the
database, including defining schemas, relationships, and constraints.
Example: A database designer might define the relationships between tables in a
hospital management system.
• Naive Users: Users who interact with the database through pre-defined
applications without writing any queries or using advanced features.
business goals.
• Data Policy Development: Establishing policies, standards, and procedures
for data management, ensuring data quality, consistency, and security across the
organization.
Example: Developing guidelines for data entry to reduce errors and maintain
consistency.
• Data Standardization: Ensuring uniformity in data formats, definitions, and
representations to facilitate data integration and interoperability among different
systems.
Example: Standardizing date formats across multiple databases (e.g., using
YYYY-MM-DD).
• Data Security and Privacy: Establishing rules and protocols to protect sensitive
data from unauthorized access, breaches, and misuse, in compliance with legal and
regulatory requirements.
Example: Implementing data masking techniques for personally identifiable
information (PII).
• Data Quality Management: Monitoring and managing data accuracy, completeness,
consistency, and reliability to ensure high-quality data across the organization.
Example: Setting up data validation rules to detect and correct errors during data
entry.
• Data Lifecycle Management: Overseeing the complete lifecycle of data, from
creation and storage to archiving and deletion, ensuring that data is properly
managed throughout its lifespan.
Example: Developing retention policies that specify how long certain types of data
should be stored.
• Collaboration with IT and Business Teams: Working closely with database
administrators (DBAs), system analysts, and business users to align data
management strategies with organizational objectives.
Example: Coordinating with the IT team to ensure data backup and disaster
recovery processes are in place.
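The data standardization and validation examples in the list above can be sketched in a few lines of Python. This is only an illustration: the function name standardize_date and the set of accepted input formats are assumptions made for the example, not part of any particular DBMS or policy.

```python
from datetime import datetime

# Hypothetical set of input formats that a data-entry policy might accept.
ACCEPTED_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def standardize_date(text):
    """Return the date as YYYY-MM-DD, or None if it fails validation."""
    for fmt in ACCEPTED_FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # try the next accepted format
        return parsed.strftime("%Y-%m-%d")
    return None  # no format matched, or the date does not exist


print(standardize_date("31/12/2024"))
print(standardize_date("2024-02-30"))
```

A value that cannot be parsed is reported as None rather than stored, which is one simple way to apply a validation rule at data-entry time.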
relationships.
Data Abstraction
Data Abstraction refers to the process of hiding the complexities of the database from
the user and providing a simplified view of the data. It helps in managing the large
amounts of data stored in the database by abstracting its details, enabling users to
interact with the data without needing to understand its internal structure or storage
details.
Data abstraction is achieved through three different levels:
• Physical Level: This is the lowest level of data abstraction, which describes how
the data is physically stored in the database. It deals with the storage of data
on storage media, such as hard drives, and the implementation details like file
organization, indexing, and data compression techniques.
Example: At this level, the database administrator might work with storage blocks
and sectors, or manage how data is indexed in the database for quick retrieval.
• Logical Level: This level provides a higher level of abstraction and focuses on
what data is stored in the database and what the relationships are among those
data. It describes the structure of the entire database for a group of users. This
level is independent of how the data is stored physically and provides a logical view
of the data.
Example: At this level, the data might be represented using tables, columns, rows,
and relationships like one-to-one, one-to-many, or many-to-many, without concern
for physical storage details.
• View Level: This is the highest level of data abstraction and describes only a part
of the entire database. The view level simplifies the interaction for the end-users
by providing only the relevant data needed for their specific tasks or applications.
It is also used to enhance security by restricting access to certain data.
Example: A bank employee might only see the customer details relevant to their
role, like name and account balance, without access to sensitive data like Social
Security numbers.
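The view level can be illustrated with a small sketch using Python's built-in sqlite3 module. The table name customer, the view name teller_view, and the sample row are assumptions made for the example. Note that SQLite itself has no user permissions, so here the view only narrows what is shown; in a multi-user DBMS the view would normally be combined with access rights.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full table, including a sensitive column.
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, name TEXT, balance REAL, ssn TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha', 2500.0, '000-00-0000')")

# View level: expose only the columns a bank employee needs.
conn.execute("CREATE VIEW teller_view AS SELECT name, balance FROM customer")

cursor = conn.execute("SELECT * FROM teller_view")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)
print(view_rows)
```

Queries against teller_view never see the ssn column, even though it is still stored in the underlying table.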
Aspect | Physical Level | Logical Level | View Level
Definition | Describes how data is physically stored in storage devices. | Describes what data is stored and the relationships among those data. | Shows only a subset of the database that is relevant to the user or application.
Focus | Storage structure and access methods. | Overall data structure, schema, and relationships. | User-specific views, simplifying data interaction.
Visibility | Low-level details visible to DBAs only. | Mid-level abstraction visible to developers and designers. | High-level abstraction visible to end-users.
Data Independence | Provides low data independence; changes affect physical storage. | Provides logical data independence; changes do not affect storage. | Offers external data independence; changes do not affect internal schema.
Security | Minimal impact on security; deals with storage details. | Moderate impact; focuses on logical data security. | High impact; restricts user access to sensitive data.
Example | File storage formats, indexing, data compression. | ER diagrams, tables, relationships. | Customer view, employee view, product catalog view.
Users | Database Administrators (DBAs). | Database Designers and Developers. | End-users and Application Programmers.
Example: Suppose a bank uses a file system to manage its customer information. The
system may face challenges like redundancy, inconsistency, and difficulty in managing
concurrent access by multiple users. In contrast, a DBMS will handle these issues
efficiently.
DBMS architecture is the design that defines how different components of a DBMS
interact with each other. There are
primarily three types of DBMS architectures: 1-tier, 2-tier, and 3-tier architecture.
• 1-Tier Architecture:
In 1-tier architecture, the database is directly accessible to the user without any
intermediary application. The user directly interacts with the DBMS, which is
usually installed on their local machine. This architecture is mainly used for
development purposes, where the developer directly communicates with the database
for testing and design.
Aspect | DBMS | File System
Definition | A software system that facilitates the creation, management, and manipulation of databases. | A method for storing, organizing, and retrieving files on a storage device.
Data Redundancy | Minimizes redundancy by using normalization techniques. | High redundancy due to independent file storage, leading to data duplication.
Data Consistency | Ensures data consistency through integrity constraints and transactions. | Lacks mechanisms for maintaining data consistency across multiple files.
Data Security | Provides robust security features, including access control, encryption, and user authentication. | Limited security features; relies on operating system security measures.
Backup and Recovery | Offers automated and systematic backup and recovery processes. | Backup and recovery processes are manual and less reliable.
Data Access | Supports complex querying and data manipulation using SQL or similar languages. | Limited to basic file operations (create, read, update, delete).
Concurrency Control | Manages multiple users accessing the data simultaneously through concurrency control mechanisms. | Lacks concurrency control; file locking is often required to prevent conflicts.
Example: SQL*Plus, Oracle Forms, etc., where the developer interacts directly
with the database system.
Figure 1: 1-Tier Architecture of DBMS
• 2-Tier Architecture:
In 2-tier architecture, the DBMS is split into two parts: the client side and
the server side. The client communicates directly with the database server. This
type of architecture is used in small to medium-sized applications where the client
(user interface) connects directly to the server (database) through an application
interface like ODBC or JDBC.
Example: Applications using client-server models like Microsoft Access and FoxPro.
• 3-Tier Architecture:
The 3-tier architecture is the most commonly used architecture for DBMS systems.
It divides the application into three layers: the presentation layer (client), the
application layer (business logic), and the database layer (server). The client interacts
with the application server, which further communicates with the database server.
This architecture offers better security, scalability, and flexibility.
Example: Web applications where the client (browser) sends requests to the web
server (application server), which then interacts with the database server.
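The separation of layers in the 3-tier architecture can be sketched inside a single Python program. This is a simplified illustration, assuming an in-memory SQLite database in place of a real database server; the class names, the product table, and the 10% tax rule are all hypothetical.

```python
import sqlite3


class DatabaseLayer:
    """Database tier: the only code that talks to the database."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
        )
        self.conn.execute("INSERT INTO product VALUES (1, 'Keyboard', 40.0)")

    def fetch_product(self, product_id):
        return self.conn.execute(
            "SELECT name, price FROM product WHERE id = ?", (product_id,)
        ).fetchone()


class ApplicationLayer:
    """Application tier: business logic, with no SQL and no formatting."""

    def __init__(self, database):
        self.database = database

    def product_summary(self, product_id):
        row = self.database.fetch_product(product_id)
        if row is None:
            return None
        name, price = row
        # Hypothetical business rule: add 10% tax.
        return {"name": name, "price_with_tax": round(price * 1.10, 2)}


def render(summary):
    """Presentation tier: turns the result into text for the user."""
    if summary is None:
        return "Product not found"
    return f"{summary['name']}: {summary['price_with_tax']:.2f}"


app = ApplicationLayer(DatabaseLayer())
page = render(app.product_summary(1))
missing = render(app.product_summary(99))
print(page)
print(missing)
```

The presentation code never touches the database directly; it only sees what the application layer returns, which is the property that gives the 3-tier design its security and flexibility.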
– Network Data Model
∗ Explanation: Extends the hierarchical model by allowing multiple
relationships between records (many-to-many). Uses a graph structure where
nodes represent records and edges represent relationships.
– Relational Data Model
∗ Explanation: Organizes data in tables (relations) consisting of rows and
columns. Each table typically has a primary key that uniquely identifies its
rows, and relationships between tables are defined using foreign keys.
– Entity-Relationship (ER) Model
∗ Explanation: Represents data using entities (objects) and relationships
between them. Widely used for conceptual modeling of databases.
• Object-Oriented Data Model
– Explanation: Combines object-oriented programming principles with database
management. Data is represented as objects, similar to instances of classes in
object-oriented languages.
Data Schema
– Definition: The logical structure of the database that defines the organization
of data, such as tables, fields, and relationships.
Instances
– Definition: The actual content stored in the database at a given time. For
example, in a relational model, instances are the rows stored in a table.
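The difference between a schema and an instance can be seen in a short sketch using Python's built-in sqlite3 module. The student table and its rows are made up for the example; PRAGMA table_info is SQLite's way of reading back a table's column definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)


def schema(connection):
    """Schema: column names and types, independent of the stored rows."""
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    rows = connection.execute("PRAGMA table_info(student)")
    return [(row[1], row[2]) for row in rows]


def instance(connection):
    """Instance: the rows stored in the table at this moment."""
    return connection.execute(
        "SELECT student_id, name FROM student ORDER BY student_id"
    ).fetchall()


schema_before, instance_before = schema(conn), instance(conn)

conn.execute("INSERT INTO student VALUES (1, 'Ravi')")
conn.execute("INSERT INTO student VALUES (2, 'Meena')")

schema_after, instance_after = schema(conn), instance(conn)

print(schema_before == schema_after)
print(instance_before, instance_after)
```

Inserting rows changes the instance but leaves the schema untouched; the schema changes only when the table definition itself is altered.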
Database Languages and Interfaces
Database languages are used to define, manipulate, control, and query the data
within a database. They include several types:
– Data Definition Language (DDL)
∗ Commands include CREATE, ALTER, DROP, and TRUNCATE.
∗ Examples: Defining the schema of a table or modifying an existing table
structure.
– Data Manipulation Language (DML)
∗ Used for accessing and manipulating data stored in the database.
∗ Commands include INSERT, UPDATE, DELETE, and MERGE.
∗ Examples: Inserting a new record, updating an existing record, or deleting
records from a table.
– Data Query Language (DQL)
∗ Examples: Fetching data from one or more tables, filtering data using
conditions, or joining tables.
Example: SQL (Structured Query Language) is a widely used language that
includes commands for DDL, DML, DCL, DQL, and VDL.
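As a rough illustration of the DDL, DML, and DQL categories, the sketch below runs one or two statements of each kind through Python's built-in sqlite3 module. The employee table and its data are assumptions made for the example. SQLite supports only a subset of SQL (for instance, it has no TRUNCATE, MERGE, or GRANT), so those commands are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and then modify the structure of a table.
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.execute("ALTER TABLE employee ADD COLUMN salary REAL")

# DML: insert, update, and delete rows.
conn.executemany(
    "INSERT INTO employee (id, name, dept, salary) VALUES (?, ?, ?, ?)",
    [
        (1, "Anil", "Sales", 30000.0),
        (2, "Bina", "IT", 45000.0),
        (3, "Chen", "IT", 50000.0),
    ],
)
conn.execute("UPDATE employee SET salary = salary + 5000 WHERE id = 1")
conn.execute("DELETE FROM employee WHERE id = 3")

# DQL: fetch and filter data.
it_staff = conn.execute(
    "SELECT name, salary FROM employee WHERE dept = 'IT' ORDER BY name"
).fetchall()
all_rows = conn.execute(
    "SELECT id, name, dept, salary FROM employee ORDER BY id"
).fetchall()

print(it_staff)
print(all_rows)
```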
– Menu-Based Interface
∗ Provides a list of options or commands in a menu format.
∗ Users can navigate through different menus to execute specific database
operations.
∗ Commonly used in applications where users prefer easy and guided
navigation.
– Forms-Based Interface
∗ Allows users to enter data and interact with the database using forms.
∗ Suitable for data entry tasks where structured input is required.
∗ Often used in applications where non-technical users interact with the
database.
– Graphical User Interface (GUI)
∗ Provides a visual interface with icons, buttons, and other graphical
elements.
∗ Allows users to interact with the database through point-and-click
actions.
∗ Commonly used in modern applications to enhance usability and user
experience.
– Natural Language Interface
∗ Enables users to interact with the database using natural language queries.
∗ Suitable for users who are not familiar with query languages like SQL.
∗ Relies on natural language processing (NLP) to interpret user input.
– Interfaces for Database Administrators
∗ Provides advanced tools and options for managing the database system.
∗ Includes functionalities like performance monitoring, backup, and
security management.
∗ Tailored for experienced users who need to maintain and optimize the
database.
Overall Database Structure
– Storage Manager:
∗ Handles tasks like data allocation, organization, retrieval, and buffering.
∗ Components include the buffer manager, file manager, and disk space
manager.
– Query Processor:
∗ Translates high-level SQL queries into low-level instructions that the
database engine can execute.
∗ Performs query parsing, optimization, and execution.
∗ Includes components like the query parser, query optimizer, and query
executor.
– Transaction Manager:
∗ Ensures that all database transactions are processed reliably and adhere
to ACID properties (Atomicity, Consistency, Isolation, Durability).
∗ Manages transaction logs, concurrency control, and recovery processes.
∗ Includes components like the lock manager and log manager.
– Buffer Manager:
∗ Manages the buffer pool in main memory to reduce disk I/O operations.
∗ Decides which data pages to cache in memory and which to flush back
to disk.
– Index Manager:
∗ Manages index structures like B-trees, hash tables, and bitmap indexes.
∗ Improves query performance by reducing the search space for data.
– Metadata Manager:
– Recovery Manager:
∗ Implements techniques for backup, restore, and log-based recovery.
∗ Works closely with the transaction manager to provide rollback and
commit functionalities.
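The commit and rollback behaviour described for the transaction and recovery managers can be observed from application code. The sketch below uses Python's built-in sqlite3 module, where using the connection as a context manager commits on success and rolls back if an exception is raised. The account table, the CHECK constraint, and the transfer function are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(connection, source, target, amount):
    """Apply both updates or neither (atomicity)."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected a negative balance


def balances(connection):
    return connection.execute(
        "SELECT id, balance FROM account ORDER BY id"
    ).fetchall()


ok = transfer(conn, 1, 2, 30)
after_ok = balances(conn)
failed = transfer(conn, 1, 2, 500)
after_failed = balances(conn)

print(ok, after_ok)
print(failed, after_failed)
```

In the failed transfer the credit to the target account has already been executed when the debit violates the constraint, yet the rollback undoes it, so the balances are unchanged.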
– Data Modeling Using the Entity-Relationship (ER) Model:
∗ Notation for ER Diagram: An ER diagram uses various symbols to
depict entities (usually as rectangles), attributes (as ovals), and
relationships (as diamonds). Lines connect these elements to show how they
interact or are associated with each other.
Purpose of an ER Diagram:
– Mapping Constraints and Keys:
∗ Mapping Constraints: Define how entities are associated with one
another in a database. They specify the cardinality of relationships,
which determines the number of occurrences of one entity that can be
associated with occurrences of another entity. The types include:
· One-to-One (1:1): Each entity in the first entity set is associated
with at most one entity in the second entity set, and vice versa.
· One-to-Many (1:N): An entity in the first entity set can be associated
with multiple entities in the second entity set, but each entity
in the second entity set is associated with at most one entity in the
first entity set.
· Many-to-One (N:1): Many entities in the first entity set can be
associated with a single entity in the second entity set, but each
entity in the first entity set is associated with at most one
entity in the second entity set.
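One common way to express these cardinalities in a relational database is through foreign keys, with a UNIQUE constraint added when the relationship is one-to-one. The sketch below uses Python's built-in sqlite3 module; the department, employee, and locker tables are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);

-- One-to-many: many employees may reference the same department.
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT,
    dept_id INTEGER REFERENCES department(dept_id)
);

-- One-to-one: UNIQUE allows at most one locker per employee.
CREATE TABLE locker (
    locker_id INTEGER PRIMARY KEY,
    emp_id    INTEGER UNIQUE REFERENCES employee(emp_id)
);
""")

conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)", [(1, "Anil", 10), (2, "Bina", 10)]
)
conn.execute("INSERT INTO locker VALUES (100, 1)")


def second_locker_rejected(connection):
    """Try to give employee 1 a second locker."""
    try:
        connection.execute("INSERT INTO locker VALUES (101, 1)")
    except sqlite3.IntegrityError:
        return True
    return False


employees_in_sales = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE dept_id = 10"
).fetchone()[0]
one_to_one_enforced = second_locker_rejected(conn)

print(employees_in_sales, one_to_one_enforced)
```

Read from the employee side, the same foreign key describes a many-to-one relationship: many employees, one department.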
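∗ Keys: Attributes, or sets of attributes, that identify entities within
an entity set. The types include:
· Super Key: A set of one or more attributes that can uniquely
identify an entity.
Example: In a student database, the combination of StudentID and
Email can be a super key if each combination uniquely identifies a
student.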
· Candidate Key: A minimal super key, meaning it contains no
redundant attributes. Each candidate key can uniquely identify an
entity without any unnecessary attributes.
Example: In the same student database, StudentID alone can be a
candidate key if it is sufficient to uniquely identify each student.
· Primary Key: A candidate key chosen by the database designer to
uniquely identify each entity within an entity set. It must be unique
and not null.
Example: StudentID might be chosen as the primary key in the
student database because it uniquely identifies each student and is
not null.
· Composite Key: A key that consists of two or more attributes
that together uniquely identify an entity.
Example: In an enrollment database, a combination of StudentID
and CourseID might be used as a composite key to uniquely identify
each enrollment record.
· Alternate Key: A candidate key that was not chosen as the primary
key but can still uniquely identify an entity.
Example: In the student database, Email might be an alternate
key if StudentID is chosen as the primary key.
· Foreign Key: An attribute or set of attributes in one table that
refers to the primary key in another table. It establishes a link
between the two tables.
Example: In an enrollment database, CourseID in the Enrollment
table might be a foreign key that references the CourseID primary
key in the Course table.
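The key types above map directly onto table constraints. The sketch below, using Python's built-in sqlite3 module, reuses the StudentID, Email, and CourseID names from the examples; the remaining column names and the sample rows are assumptions. SQLite checks foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE Student (
    StudentID INTEGER PRIMARY KEY,   -- primary key
    Email     TEXT NOT NULL UNIQUE   -- alternate key
);
CREATE TABLE Course (CourseID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE Enrollment (
    StudentID INTEGER REFERENCES Student(StudentID),  -- foreign key
    CourseID  INTEGER REFERENCES Course(CourseID),    -- foreign key
    PRIMARY KEY (StudentID, CourseID)                 -- composite key
);
INSERT INTO Student VALUES (1, 'a@example.com');
INSERT INTO Course VALUES (101, 'Databases');
INSERT INTO Enrollment VALUES (1, 101);
""")


def rejected(sql):
    """Return True if the DBMS refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_primary_key = rejected("INSERT INTO Student VALUES (1, 'b@example.com')")
duplicate_alternate_key = rejected("INSERT INTO Student VALUES (2, 'a@example.com')")
duplicate_composite_key = rejected("INSERT INTO Enrollment VALUES (1, 101)")
dangling_foreign_key = rejected("INSERT INTO Enrollment VALUES (1, 999)")

print(duplicate_primary_key, duplicate_alternate_key)
print(duplicate_composite_key, dangling_foreign_key)
```

Each rejected statement violates one kind of key: a repeated primary key, a repeated alternate key, a repeated composite key, and a foreign key that points to a course that does not exist.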
Table 2: Differences Between Super Key, Candidate Key, and Primary Key
Key Type | Super Key
Definition | A set of one or more attributes that can uniquely identify an entity.
Uniqueness | Must uniquely identify each entity, but can include extra attributes.
Redundancy | May contain redundant attributes.
Key Type | Candidate Key
Definition | A minimal super key, meaning it has no redundant attributes.
∗ Generalization, Aggregation, and Reduction of an ER Diagram to Tables:
· Generalization: A process of extracting common characteristics
from multiple entities and creating a generalized entity that represents
these shared characteristics. This is often used in hierarchical
data modeling.
Specialization
Definition
Specialization is a process where a general entity is divided into more
specific sub-entities or subclasses. Each subclass inherits attributes
and relationships from the general entity but may also have additional
attributes or relationships.
Aspect | Generalization | Specialization | Aggregation
Definition | The process of extracting common characteristics from multiple entities and combining them into a generalized entity. | The process of defining a new subclass from an existing class to capture more specific characteristics. | A process of creating a higher-level abstraction by combining several entities into a single entity.
Purpose | To simplify and unify similar entities by identifying common attributes. | To refine and categorize a general entity into more specific sub-entities. | To simplify complex ER diagrams by grouping related entities into a single abstraction.
Direction | From specific entities to a generalized entity (bottom-up). | From a generalized entity to more specific sub-entities (top-down). | Combining multiple entities into a higher-level entity (horizontal aggregation).
Use Case | Useful when multiple entities share common attributes or relationships. | Useful when an entity needs to be divided into more detailed types to represent specific characteristics. | Useful when representing complex relationships between entities in a simplified manner.
ER Diagram Representation | Generalization is often represented with a triangle pointing to a single generalized entity. | Specialization is represented with a hierarchy, where a general entity is at the top and specific sub-entities below. | Aggregation is represented by a diamond or an oval encompassing multiple entities to show a higher-level relationship.
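Specialization can be carried into tables in more than one way. The sketch below shows one common option, assuming a general person entity with student and employee subclasses: each subclass table shares the primary key of the general table and adds its own attributes. The table and column names are made up for the example, which uses Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- General entity: attributes shared by every subclass.
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT);

-- Subclasses: same key as person, plus their own attributes.
CREATE TABLE student (
    person_id INTEGER PRIMARY KEY REFERENCES person(person_id),
    major     TEXT
);
CREATE TABLE employee (
    person_id INTEGER PRIMARY KEY REFERENCES person(person_id),
    salary    REAL
);

INSERT INTO person VALUES (1, 'Ravi'), (2, 'Meena');
INSERT INTO student VALUES (1, 'Physics');
INSERT INTO employee VALUES (2, 52000.0);
""")

# A join recovers the inherited attribute (name) for each subclass.
students = conn.execute(
    "SELECT p.name, s.major FROM person p "
    "JOIN student s ON s.person_id = p.person_id"
).fetchall()
employees = conn.execute(
    "SELECT p.name, e.salary FROM person p "
    "JOIN employee e ON e.person_id = p.person_id"
).fetchall()

print(students)
print(employees)
```

Reading the same tables in the other direction gives generalization: person collects what student and employee have in common.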
8. If the relationship has attributes, these attributes become addi-
tional columns in the relationship table.
9. Handle Primary and Foreign Keys:
10. Primary keys in entity tables are used to establish relationships
with other tables.
11. Foreign keys are added to tables to represent the relationships
between entities. They are columns that reference the primary
keys of other tables.
12. Ensure that referential integrity is maintained, meaning that
foreign keys must correspond to valid primary keys in the referenced
tables.
13. Convert Multi-Valued Attributes:
14. Multi-valued attributes (attributes that can have multiple values
for a single entity) are handled by creating a new table.
15. The new table includes a foreign key that references the primary
key of the entity and a column for the multi-valued attribute.
16. This table essentially captures the one-to-many relationship be-
tween the original entity and the multi-valued attribute.
17. Convert Weak Entities:
18. Weak entities (entities that do not have a sufficient primary key
on their own) are handled by creating a table that includes the
primary key of the strong entity it depends on.
19. The table includes the partial key of the weak entity along with
the foreign key from the strong entity.
20. The combination of the strong entity’s primary key and the weak
entity’s partial key forms the primary key of the weak entity’s
table.
Finally, check that the resulting tables are free from anomalies and that
the relationships between tables are accurately represented.
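The steps for multi-valued attributes and weak entities can be sketched as follows, using Python's built-in sqlite3 module. The employee entity, its phone numbers (a multi-valued attribute), and its dependents (a weak entity) are assumptions chosen for the example. SQLite checks foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);

-- Multi-valued attribute: one row per (employee, phone) pair.
CREATE TABLE employee_phone (
    emp_id INTEGER REFERENCES employee(emp_id),
    phone  TEXT,
    PRIMARY KEY (emp_id, phone)
);

-- Weak entity: owner's primary key plus the partial key (dependent_name).
CREATE TABLE dependent (
    emp_id         INTEGER REFERENCES employee(emp_id),
    dependent_name TEXT,
    relation       TEXT,
    PRIMARY KEY (emp_id, dependent_name)
);

INSERT INTO employee VALUES (1, 'Anil');
INSERT INTO employee_phone VALUES (1, '555-0100'), (1, '555-0101');
INSERT INTO dependent VALUES (1, 'Kiran', 'child');
""")

phones = [
    row[0]
    for row in conn.execute(
        "SELECT phone FROM employee_phone WHERE emp_id = 1 ORDER BY phone"
    )
]


def orphan_dependent_rejected():
    """A dependent cannot exist without its owning employee."""
    try:
        conn.execute("INSERT INTO dependent VALUES (42, 'Nobody', 'child')")
    except sqlite3.IntegrityError:
        return True
    return False


orphan_rejected = orphan_dependent_rejected()
print(phones, orphan_rejected)
```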
· Extended ER Model (EER): Extends the original ER model
by adding more modeling constructs, such as specialization,
generalization, categorization, and inheritance, to better represent more
complex database designs.
Activity Diagram
Use Case Diagram
· Relationships: Lines connecting actors to use cases, including
associations, generalizations, and include/extend relationships.
· Use Cases: Understanding system requirements and user interactions.
Interaction Overview Diagram
Timing Diagram
· Components:
· Lifelines: Represent objects or participants in the interaction.
· Time Intervals: Show the passage of time along the diagram.
Sequence Diagram
Class Diagram
· Relationships: Lines showing associations and dependencies between classes.
· Use Cases: Designing and understanding the structure of the system.