
DBMS Unit 01


Introduction to DBMS

File System vs DBMS

File System:

A file system is a method of storing and organizing files and their data. While it’s a fundamental part
of an operating system, it has several limitations when it comes to managing large amounts of data.

1. Data Redundancy and Inconsistency:

o Redundancy: Data is often duplicated in different files, leading to wastage of storage
space.

o Inconsistency: If data is duplicated, there is a risk that changes made to the data in
one file are not reflected in other files.

2. Data Isolation:

o Because data is stored in various files, it can be challenging to retrieve data from
multiple sources for analysis.

3. Data Integrity:

o Maintaining data integrity (ensuring data is accurate and consistent) is difficult
without centralized control.

4. Access Control:

o Controlling access to files is primitive compared to the complex access control
mechanisms available in a DBMS.

5. Atomicity Issues:

o Ensuring that either all operations in a transaction complete or none of them take
effect is difficult, making it hard to recover from failures.

6. Concurrency Problems:

o Simultaneous access to data by multiple users can lead to conflicts and
inconsistencies.

DBMS (Database Management System):

A DBMS is software that uses a standard method to store and organize data. It overcomes many of
the limitations of file systems by providing a systematic way to create, retrieve, update, and manage
data.

1. Data Redundancy and Inconsistency:

o Redundancy is controlled through normalization, which organizes data into tables to
reduce duplication.

o Ensures data consistency by maintaining a single version of the data.

2. Data Isolation:

o Data is stored in tables that can be linked using relationships, making it easier to
retrieve data from multiple tables.

3. Data Integrity:

o Integrity constraints (like primary keys, foreign keys, and check constraints) ensure
the accuracy and consistency of the data.

4. Access Control:

o Provides robust security mechanisms, including user authentication, roles, and
permissions to control access.

5. Atomicity, Consistency, Isolation, Durability (ACID) Properties:

o Ensures that all transactions are processed reliably and that the database remains
consistent in case of failures.

6. Concurrency Control:

o Manages simultaneous data access by multiple users through locking mechanisms
and concurrency control protocols to prevent conflicts and ensure data integrity.

Example:

• File System: Imagine a library where each book's information is stored separately by title,
author, and genre in different files. If the author’s name changes, updating every file is
cumbersome and prone to errors.

• DBMS: The same library’s data is stored in a DBMS with tables for books, authors, and
genres linked by relationships. Updating the author’s name in one place updates it
everywhere, maintaining consistency.
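
The library example can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a full library schema: the table names, column names, and sample rows are assumptions made up for the sketch. It shows that when the author's name is stored once and referenced by key, a single UPDATE is reflected for every book.

```python
import sqlite3

# Hypothetical library schema: authors are stored once, books reference them by key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE books (
        book_id   INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(author_id)
    );
    INSERT INTO authors VALUES (1, 'J. Smith');
    INSERT INTO books VALUES (10, 'Intro to Databases', 1), (11, 'Advanced SQL', 1);
""")

def titles_with_author(conn):
    """Join books to authors so each title is shown with the current author name."""
    rows = conn.execute(
        "SELECT b.title, a.name FROM books b "
        "JOIN authors a ON a.author_id = b.author_id ORDER BY b.book_id"
    )
    return rows.fetchall()

# The name lives in exactly one row, so one UPDATE is seen by every book.
conn.execute("UPDATE authors SET name = 'Jane Smith' WHERE author_id = 1")
result = titles_with_author(conn)
```

In a file-based layout, the same change would have to be repeated in every file that copies the author's name.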

Advantages of DBMS

1. Data Independence:

o Logical Data Independence: Changes in the logical structure (like adding new fields)
do not affect application programs.

o Physical Data Independence: Changes in the physical storage of data do not affect
the DBMS's logical structure.

2. Efficient Data Access:

o Uses indexing, query optimization, and other techniques to retrieve data quickly,
even from large datasets.

3. Data Integrity and Security:

o Integrity constraints ensure that only valid data is entered.

o Security measures, such as user authentication and access control, protect data
from unauthorized access.

4. Data Administration:

o Centralized data management reduces redundancy, maintains consistency, and
simplifies data maintenance.

5. Concurrent Access and Crash Recovery:

o Manages concurrent data access by multiple users using transaction management
and concurrency control.

o Ensures data integrity in case of system failures through recovery mechanisms.

6. Reduced Application Development Time:

o Provides high-level interfaces and common functions, reducing the time and effort
required to develop applications.

Example:

• A university DBMS manages student records, course registrations, and grades. It ensures
that a student cannot register for a course without the prerequisite, maintains consistency
even if multiple administrators access the data simultaneously, and recovers data in case of
system failures.
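
As a rough sketch of how integrity constraints keep invalid data out, the following uses Python's sqlite3 module with a hypothetical students/grades schema (all names and values are assumptions for illustration). It covers primary key, foreign key, and check constraints only; a prerequisite rule like the one in the example would normally need additional logic such as a trigger or an application-level check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE grades (
        student_id INTEGER NOT NULL REFERENCES students(student_id),
        course     TEXT NOT NULL,
        score      INTEGER NOT NULL CHECK (score BETWEEN 0 AND 100),
        PRIMARY KEY (student_id, course)
    );
    INSERT INTO students VALUES (1, 'John');
""")

def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

insert_grade = "INSERT INTO grades VALUES (?, ?, ?)"
ok_valid = try_insert(insert_grade, (1, "Database Systems", 88))      # valid row
ok_range = try_insert(insert_grade, (1, "Algorithms", 140))           # violates CHECK
ok_orphan = try_insert(insert_grade, (99, "Algorithms", 70))          # no such student
ok_duplicate = try_insert(insert_grade, (1, "Database Systems", 90))  # duplicate key
```

Only the first insert is accepted; the other three are rejected by the check, foreign key, and primary key constraints respectively.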

Storage Data

Data Storage in DBMS:

1. Disk Space Manager:

o Allocates and manages disk space for database files.

2. File Manager:

o Manages the creation, deletion, reading, and writing of files within the DBMS.

3. Buffer Manager:

o Handles the transfer of data between disk storage and main memory, optimizing
performance.

4. Storage Hierarchy:

o Data is stored in a hierarchy from fast and expensive storage (like main memory and
cache) to slower and cheaper storage (like magnetic disks and tapes). This hierarchy
balances cost and performance.

Example:

• When a user queries a database, the buffer manager retrieves the required data from disk
storage to main memory, ensuring quick access. The disk space manager ensures that the
data is stored efficiently on the disk.
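
The role of the buffer manager can be illustrated with a toy buffer pool. This is only a sketch under stated assumptions: the class name BufferPool, the dictionary standing in for the disk, and the least-recently-used (LRU) replacement policy are all choices made for the example; the notes above do not name a replacement policy, and a real buffer manager also tracks modified pages and pinning.

```python
from collections import OrderedDict

class BufferPool:
    """Toy buffer manager: keeps at most `capacity` pages in memory, evicting the LRU page."""

    def __init__(self, disk, capacity):
        self.disk = disk              # dict page_id -> contents, standing in for disk storage
        self.capacity = capacity
        self.frames = OrderedDict()   # page_id -> contents, least recently used first
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.frames:
            self.frames.move_to_end(page_id)   # hit: mark as most recently used
            return self.frames[page_id]
        self.disk_reads += 1                   # miss: the page must be read from disk
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)    # evict the least recently used page
        self.frames[page_id] = self.disk[page_id]
        return self.frames[page_id]

disk = {1: "page-1", 2: "page-2", 3: "page-3"}
pool = BufferPool(disk, capacity=2)
for pid in (1, 2, 1, 3, 1, 2):
    pool.get_page(pid)
```

Six page requests cause only four disk reads here, because two of the requests for page 1 are served from memory.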
Queries

Query Processing:

1. Query Parser:

o Parses SQL commands, checking for syntax errors and converting the query into an
internal representation.

2. Query Optimizer:

o Determines the most efficient way to execute the query by considering different
execution plans and selecting the one with the lowest cost.

3. Query Executor:

o Executes the query using relational operators like selection, projection, and join.

4. Buffer and Disk Managers:

o Manage data retrieval from disk to memory during query execution, ensuring
efficient data access.

Example:

• When a user runs an SQL query to find all students enrolled in a specific course, the query
parser validates the query syntax, the query optimizer finds the best way to access the data,
and the query executor retrieves and displays the results.
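
The parser and optimizer stages can be observed in SQLite through Python's sqlite3 module. The sketch below uses a hypothetical enrollments table and index name; the exact wording of the plan text depends on the SQLite version, so the code only inspects it loosely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE enrollments (student_id INTEGER, course_id TEXT);
    INSERT INTO enrollments VALUES (1, 'CS101'), (2, 'CS101'), (3, 'MA201');
""")

# Parser: a misspelled keyword is rejected before any data is touched.
try:
    conn.execute("SELEC student_id FROM enrollments")
    syntax_ok = True
except sqlite3.OperationalError:
    syntax_ok = False

query = "SELECT student_id FROM enrollments WHERE course_id = ?"

def plan(conn):
    """Return the optimizer's plan description for the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query, ("CS101",)).fetchall()
    return " | ".join(row[-1] for row in rows)   # last column holds the detail text

# Optimizer: the chosen plan changes once an index on course_id exists.
plan_before = plan(conn)   # without an index, a full table scan
conn.execute("CREATE INDEX idx_enroll_course ON enrollments(course_id)")
plan_after = plan(conn)    # with the index, a search that uses it

# Executor: run the query and collect the result rows.
students = [r[0] for r in conn.execute(query + " ORDER BY student_id", ("CS101",))]
```

Printing plan_before and plan_after shows how the same SQL text is executed differently depending on the access paths available.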

DBMS Structure

A typical DBMS structure includes several components:

1. SQL Interface:

o Accepts and processes SQL commands from users.

2. Query Optimizer:

o Optimizes query execution plans for efficient data retrieval.

3. Execution Engine:

o Executes the optimized query plans.

4. File and Access Methods:

o Manage file storage and access methods like indexing.

5. Buffer Manager:

o Handles data buffering between disk and main memory.

6. Disk Space Manager:

o Manages disk space allocation for database files.

7. Concurrency Control:

o Ensures safe concurrent access to data by multiple users.

8. Transaction Manager:

o Manages transaction processing, ensuring ACID properties.

9. Recovery Manager:

o Handles crash recovery to maintain data integrity.

Example:

• When a banking system processes a transaction, the transaction manager ensures that the
transaction follows ACID properties, the concurrency control mechanism handles multiple
users accessing their accounts simultaneously, and the recovery manager ensures that all
transactions are properly logged for recovery in case of a system failure.
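
Atomicity in the banking example can be sketched as follows, again with Python's sqlite3 module. The accounts table, the balances, and the transfer function are hypothetical. The credit is deliberately applied before the debit so that a failed transfer shows the first update being rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        account_id INTEGER PRIMARY KEY,
        balance    INTEGER NOT NULL CHECK (balance >= 0)
    );
    INSERT INTO accounts VALUES (1, 100), (2, 50);
""")

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE account_id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE account_id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft

def balances(conn):
    return dict(conn.execute("SELECT account_id, balance FROM accounts"))

ok_small = transfer(conn, 1, 2, 30)    # succeeds
after_small = balances(conn)
ok_large = transfer(conn, 1, 2, 500)   # would overdraw account 1
after_large = balances(conn)
```

After the failed transfer the balances are unchanged, even though the credit to account 2 had already been executed inside the transaction.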

Types of Databases

1. Hierarchical Databases:

o Data is organized in a tree-like structure with a single root and multiple levels of
nested child records.

o Example: An organizational chart where each department has sub-departments.

2. Network Databases:

o Similar to hierarchical databases but allows more complex relationships with
multiple parent records.

o Example: A network model for telecommunication systems where each switch is
connected to multiple other switches.

3. Relational Databases:

o Data is stored in tables with rows and columns. Relationships are established
through foreign keys.

o Example: A customer and order database where customers and orders are linked by
customer IDs.

4. Key-Value Databases:

o Simple data storage where each item is a key-value pair.

o Example: NoSQL databases like Redis, where user sessions are stored as key-value
pairs.

5. Object-Oriented Databases:

o Data is stored as objects, similar to object-oriented programming.

o Example: Multimedia databases storing images, videos, and audio files as objects.

6. XML Databases:

o Designed to store and query XML data.


o Example: Databases managing web data in XML format, such as RSS feeds.

Overview of File Structures in Database

File Structures:

1. Heap Files:

o Unordered records are stored in pages. Suitable for small databases where searching
isn't intensive.

o Example: Storing log files where retrieval order is not important.

2. Sorted Files:

o Records are stored in a sorted order based on a key. Efficient for range queries.

o Example: Storing employee records sorted by employee ID.

3. Hashed Files:

o Records are distributed across buckets based on a hash function. Provides efficient
access for equality searches.

o Example: Storing user accounts where each username is hashed to a bucket.

4. Indexes:

o Auxiliary structures to speed up data retrieval. Types include:

▪ B-trees: Balanced tree structure for efficient range and equality searches.

▪ Hash Indexes: Uses a hash function for efficient equality searches.

▪ Bitmap Indexes: Uses bitmaps for efficient search on low-cardinality
columns.

o Example: Using a B-tree index to quickly find records in a large table.
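
The difference between heap, sorted, and hashed organization can be sketched in plain Python. The records, the bucket count, and the function names below are assumptions for illustration; real file structures work on disk pages rather than in-memory lists.

```python
from bisect import bisect_left

records = [(17, "Ravi"), (4, "Mina"), (42, "Omar"), (8, "Lena"), (23, "Ivan")]

# Heap file: records stay in arrival order, so an equality search scans them one by one.
def heap_lookup(heap, key):
    for k, name in heap:
        if k == key:
            return name
    return None

# Sorted file: records are ordered by key, so binary search locates the start of a range.
sorted_file = sorted(records)
sorted_keys = [k for k, _ in sorted_file]

def sorted_range(lo, hi):
    """Return names with lo <= key <= hi."""
    start = bisect_left(sorted_keys, lo)
    out = []
    for k, name in sorted_file[start:]:
        if k > hi:
            break
        out.append(name)
    return out

# Hashed file: a hash function sends each key to one bucket.
NUM_BUCKETS = 4
buckets = [[] for _ in range(NUM_BUCKETS)]
for k, name in records:
    buckets[k % NUM_BUCKETS].append((k, name))

def hash_lookup(key):
    """Equality search that inspects only the bucket the key hashes to."""
    for k, name in buckets[key % NUM_BUCKETS]:
        if k == key:
            return name
    return None
```

The hashed layout answers equality searches by looking at a single bucket, but it gives no help with a range such as keys 5 to 25, which the sorted layout handles directly.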


Database Design: Data Models and Their Importance

Data Models

Data models are essential frameworks for database design. They define the structure of a database,
the relationships among data elements, and the constraints that apply to the data. Let's explore
different types of data models in detail:

1. Hierarchical Data Model:

o Structure: Represents data in a tree-like structure, with a single root and multiple
levels of nested child records.

o Example: An organizational chart where each department has sub-departments and
employees.

o Usage: Suitable for applications where data naturally forms a hierarchy, such as file
systems and XML documents.

o Advantages: Simple to implement and efficient for one-to-many relationships.

o Disadvantages: Limited flexibility as it requires the data to be structured in a
predefined hierarchy. Difficult to manage many-to-many relationships.

2. Network Data Model:

o Structure: An extension of the hierarchical model, it allows multiple parent records
for a child record, forming a graph structure.

o Example: A telecommunications network where switches are connected to multiple
other switches.

o Usage: Suitable for complex relationships and scenarios requiring more flexibility
than the hierarchical model.

o Advantages: Supports many-to-many relationships and provides a more flexible
approach than the hierarchical model.

o Disadvantages: Complexity in design and maintenance due to the graph structure.

3. Relational Data Model:

o Structure: Data is stored in tables (relations) consisting of rows (tuples) and columns
(attributes). Relationships among tables are established through foreign keys.

o Example: A customer and order database where customers and orders are linked by
customer IDs.

o Usage: Widely used in various applications, from small systems to large enterprise
applications.

o Advantages: Provides data independence, supports complex queries with SQL, and
ensures data integrity through constraints.

o Disadvantages: Can suffer performance issues with very large datasets and with
queries that join many tables.

4. Object-Oriented Data Model:


o Structure: Integrates object-oriented programming principles with database
technology. Data is stored as objects, similar to the structure used in object-oriented
programming languages.

o Example: Multimedia databases storing images, videos, and audio files as objects
with properties and methods.

o Usage: Suitable for applications requiring complex data representations like
CAD/CAM, multimedia applications, and scientific databases.

o Advantages: Supports complex data types and encapsulates both data and behavior.

o Disadvantages: Complexity in design and less mature technology compared to
relational databases.

5. Object-Relational Data Model:

o Structure: A hybrid of object-oriented and relational models. It extends the
relational model to support complex data types and object-oriented features.

o Example: An e-commerce platform where products are represented as objects with
attributes and methods, but stored in relational tables.

o Usage: Combines the benefits of both relational and object-oriented models.

o Advantages: Supports complex data types and inheritance while maintaining the
familiarity of relational models.

o Disadvantages: Increased complexity and potential performance overhead.

6. Entity-Relationship (ER) Model:

o Structure: Focuses on entities, their attributes, and relationships. Uses ER diagrams
to visually represent the data model.

o Example: A university database where entities include students, courses, and
instructors, and relationships represent enrollments and teaching assignments.

o Usage: Useful in the initial design phase to conceptualize and visualize the data
model.

o Advantages: Easy to understand and communicate with stakeholders.

o Disadvantages: More abstract and requires transformation into a logical model for
implementation.
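
To make the contrast between the hierarchical and relational models concrete, the sketch below stores the same organizational chart both ways in plain Python. The department names and helper functions are made up for the example; the nested dictionaries stand in for hierarchical records and the tuples stand in for rows of a relational table.

```python
# Hierarchical model: each child record hangs under exactly one parent.
org_tree = {
    "name": "Company",
    "children": [
        {"name": "Engineering", "children": [{"name": "Backend", "children": []}]},
        {"name": "Sales", "children": []},
    ],
}

def tree_to_rows(node, parent=None, rows=None):
    """Flatten the tree into relational-style (name, parent_name) tuples."""
    if rows is None:
        rows = []
    rows.append((node["name"], parent))
    for child in node["children"]:
        tree_to_rows(child, node["name"], rows)
    return rows

# Relational model: one flat table; the parent column plays the role of a foreign key.
department_rows = tree_to_rows(org_tree)

def children_of(rows, parent):
    """Relational-style selection: rows whose parent column matches."""
    return [name for name, p in rows if p == parent]

top_level = children_of(department_rows, "Company")
```

In the tree, reaching a department means walking down from the root; in the flat table, any row can be selected directly by the value of a column.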

Importance of Data Models

Data models are critical for several reasons:

1. Data Organization:

o Provides a systematic way to organize data, ensuring consistency and integrity. By
defining how data is stored and accessed, data models help maintain the logical
structure of data.

2. Simplification of Design:

o Simplifies the database design process by providing a clear blueprint. This helps
database designers and developers understand the structure and relationships of
the data.

3. Improved Data Integrity:

o Enforces data integrity rules, ensuring that the data remains accurate and
consistent. This is crucial for maintaining the quality and reliability of the data.

4. Enhanced Communication:

o Serves as a communication tool between stakeholders, including developers,
business analysts, and end-users. Provides a common language to discuss database
requirements and design.

5. Support for Data Management:

o Defines data operations and constraints, supporting efficient data management. This
includes data retrieval, updates, and deletions, ensuring that the database performs
optimally.

6. Facilitation of Maintenance:

o Makes it easier to maintain and update the database. Provides a clear structure that
helps identify the impact of changes and ensures that updates are made
consistently.

7. Scalability and Flexibility:

o Allows for scalability and flexibility in database design. Can be adapted to
accommodate changes in business requirements and data volume, ensuring that the
database remains relevant over time.

Example:

Consider a company that needs to manage employee data. A well-designed data model ensures that:

• Employee records are stored consistently, with attributes like ID, name, position, and
department.

• Relationships between employees and departments are clearly defined, facilitating queries
like "Find all employees in a specific department."

• Data integrity is maintained, preventing duplicate records and ensuring accurate updates.
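
The employee example might translate into a relational schema along the following lines. This is a sketch using Python's sqlite3 module; the table layout, names, and sample rows are assumptions, not a prescribed design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE employees (
        emp_id   INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        position TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES departments(dept_id)
    );
    INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales');
    INSERT INTO employees VALUES
        (100, 'Asha', 'Developer', 1),
        (101, 'Ben',  'Analyst',   2),
        (102, 'Chen', 'Tester',    1);
""")

def employees_in(conn, dept_name):
    """Find all employees in a specific department by following the relationship."""
    rows = conn.execute(
        "SELECT e.name FROM employees e "
        "JOIN departments d ON d.dept_id = e.dept_id "
        "WHERE d.name = ? ORDER BY e.emp_id",
        (dept_name,),
    )
    return [name for (name,) in rows]

engineering = employees_in(conn, "Engineering")

# The primary key on emp_id rejects a second record with an existing ID.
try:
    conn.execute("INSERT INTO employees VALUES (100, 'Dup', 'Clerk', 2)")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False
```

Because the relationship is expressed through dept_id, the department query is a single join rather than a search across separate files.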

E-R Model: Entities, Attributes, and Entity Sets; Relationship and Relationship Set; Mapping

Entities, Attributes, and Entity Sets

• Entity:

o An entity is a real-world object or concept that can be distinctly identified. Examples
include a student, a course, or an instructor in a university database.

o Example: In a library system, an entity could be a "Book" with attributes such as
ISBN, title, author, and publisher.

• Attributes:

o Attributes are properties or characteristics of an entity. Each attribute represents a
specific aspect of the entity.

o Example: For a "Book" entity, attributes might include ISBN, title, author, and
publisher.

• Entity Set:

o An entity set is a collection of similar entities. For instance, all students in a
university database form a student entity set. Each entity in an entity set shares the
same attributes.

o Example: The entity set for "Books" includes all books in the library, each
represented with attributes like ISBN, title, and author.

Relationships and Relationship Sets

• Relationship:

o A relationship is an association between two or more entities. It represents how
entities are connected to each other.

o Example: A "Student" being enrolled in a "Course" is a relationship between those
two entities. This relationship might be named "Enrolled."

• Relationship Set:

o A relationship set is a collection of similar relationships. For instance, the enrollment
relationship set includes all instances of students enrolling in courses.

o Example: The "Enrolled" relationship set includes all pairs of students and courses,
indicating which students are enrolled in which courses.

• Descriptive Attributes:

o Relationships can have descriptive attributes that provide additional information
about the relationship.

o Example: The "Enrolled" relationship might have an attribute "Enrollment Date" to
indicate when the student enrolled in the course.

Mapping Cardinalities

• Mapping Cardinalities specify the number of entities to which another entity can be
associated via a relationship set. Common mapping cardinalities include:

o One-to-One (1:1): Each entity in the first set is associated with at most one entity in
the second set, and vice versa.

▪ Example: Each employee has one unique employee ID, and each ID
corresponds to one employee.

o One-to-Many (1:N): Each entity in the first set can be associated with multiple entities
in the second set, but each entity in the second set is associated with at most one
entity in the first set.

▪ Example: A department has many employees, but each employee belongs to
one department.

o Many-to-One (N:1): Each entity in the first set is associated with at most one entity
in the second set, but each entity in the second set can be associated with multiple
entities in the first set.

▪ Example: Many employees report to one manager, but each employee
reports to only one manager.

o Many-to-Many (M:N): Entities in both sets can be associated with multiple entities in
the other set.

▪ Example: Students can enroll in multiple courses, and each course can have
multiple students enrolled.

Example:

Consider a university database with entities "Student" and "Course." The "Enrolled" relationship
represents the association between students and courses:

• Entity Set:

o Students: John, Jane, and Alex.

o Courses: Database Systems, Algorithms, and Data Structures.

• Attributes:

o Student: Student ID, Name, Major.

o Course: Course ID, Title, Credits.

• Relationship Set:

o John is enrolled in Database Systems and Algorithms.

o Jane is enrolled in Data Structures.

o Alex is enrolled in Algorithms and Data Structures.

• Descriptive Attribute:

o Enrollment Date: Indicates when a student enrolled in a specific course.

• Mapping Cardinalities:

o Many-to-Many: One course can have many students enrolled, and each student can
enroll in many courses.
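
The relationship set listed above can be written as a set of (student, course) pairs and checked mechanically. The function below is a small sketch with a made-up name; it classifies only the pairs it is given, whereas a mapping cardinality in a real design is a rule about every possible instance, so treat the result as an observation rather than a proof.

```python
from collections import Counter

# The "Enrolled" relationship set from the example, as (student, course) pairs.
enrolled = {
    ("John", "Database Systems"), ("John", "Algorithms"),
    ("Jane", "Data Structures"),
    ("Alex", "Algorithms"), ("Alex", "Data Structures"),
}

def mapping_cardinality(pairs):
    """Classify a binary relationship set from the pairs it actually contains."""
    left_counts = Counter(a for a, _ in pairs)    # partners per entity in the first set
    right_counts = Counter(b for _, b in pairs)   # partners per entity in the second set
    left_many = any(n > 1 for n in left_counts.values())
    right_many = any(n > 1 for n in right_counts.values())
    if left_many and right_many:
        return "many-to-many"
    if left_many:
        return "one-to-many"
    if right_many:
        return "many-to-one"
    return "one-to-one"

enrolled_kind = mapping_cardinality(enrolled)

# A department/employee relationship set for comparison (hypothetical data).
works_in = {("Sales", "Ann"), ("Sales", "Bob"), ("Engineering", "Cara")}
works_in_kind = mapping_cardinality(works_in)
```

John is enrolled in two courses and Algorithms has two students, so the pairs in the "Enrolled" set show a many-to-many pattern, while the department data shows one-to-many.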
