Module 1 Full
Introduction
Thapar Institute of Engineering and Technology, Patiala
Outlines
• Introduction of database
• File processing system
• Disadvantages of file processing system
• Transition to Database management system
• The desired characteristics of the database management system
• Summary
What is a Database Management System (DBMS)?
• Data - Facts that can be recorded or stored
• e.g. Person Name, Age, Gender and Weight etc. (what is an information?)
• Database - Collection of logically related data
• e.g. Books Database in Library, Student Database in University etc.
• Management - Manipulation, Searching and Security of data
• e.g. Viewing result in Thapar website, Searching exam papers in
Thapar website etc.
• System - Programs or tools used to manage database
• e.g. SQL Server Studio Express, Oracle etc.
• DBMS - A Database Management System (DBMS) is software designed to define,
create, manipulate, retrieve and manage data in a database.
• e.g. MS SQL Server, Oracle, MySQL, SQLite, MongoDB etc.
How it works?
[Figure: three departments (Accounts, HR, Production), each with its own Data store]
Each department maintains its own set of data. There is no link between these data pools.
Advantages of file based system
Disadvantages of file-processing system
• Data redundancy and inconsistency
• Redundancy occurs when the same piece of data is held in two or more separate places.
• Inconsistency occurs when copies of the same data, kept in two or more separate places,
differ in format or value.
• Data isolation
• Data is scattered across various files, and the files may be in different formats, e.g., text files, CSV,
binary files.
Impact:
Difficulties in querying and combining data for decision-making.
Increased processing time to gather and align data.
• Integrity problems
Integrity problems arise when data fails to meet specified constraints (e.g., primary keys,
foreign keys) or is inconsistent across multiple locations.
Impact:
• Data corruption and errors during analysis or reporting.
• Time-consuming manual checks to ensure consistency.
Example: a student's Roll No is changed in one file but not in the other files that refer to it.
• Atomicity problems
• A database operation must either execute completely or not execute at all.
If a system failure occurs after deducting $500 from Person A but before adding it to Person B, the
money disappears, violating atomicity.
• It is difficult to ensure atomicity in a conventional file-processing system.
• Concurrent-access anomalies
• Concurrent-access anomalies occur when multiple users or processes try to access
and manipulate shared data simultaneously in a manner that leads to unexpected
or incorrect results.
• Security problems
A file-based system lacks centralized control and sophisticated mechanisms for managing data
access and security.
This creates significant vulnerabilities, leading to security problems that compromise the
confidentiality, integrity, and availability of data.
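The atomicity problem described above (the failed transfer of 500 from Person A to Person B) can be sketched with a DBMS transaction. This is a minimal illustration, assuming an in-memory SQLite database; the `account` table, the `transfer` helper, and the `fail_midway` flag are hypothetical names introduced only for this example.

```python
import sqlite3

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                # Simulated crash between the debit and the credit.
                raise RuntimeError("simulated system failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
    except RuntimeError:
        pass  # the debit has already been rolled back at this point

def balances(conn):
    """Return the current balances as {name: balance}."""
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 200)])
conn.commit()

transfer(conn, "A", "B", 500, fail_midway=True)
after_failure = balances(conn)   # unchanged: the partial debit was undone

transfer(conn, "A", "B", 500)
after_success = balances(conn)   # both updates applied together
```

With plain files, the program itself would have to detect the half-finished update and undo it; here the rollback is done by the DBMS.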
How DBMS works?
[Figure: a single shared Data store]
Characteristics of the
Database Approach
• In the database approach, a single repository maintains data.
• It should be accessed by various users repeatedly through queries,
transactions, and application programs.
• The main characteristics of the database approach are:
A. Self-describing nature of a database system
B. Insulation between programs and data
C. Data abstraction
D. Support of multiple views of the data
E. Sharing of data and multiuser transaction processing
A. Self-Describing Nature of a Database System
• A database system contains metadata that describes its own structure, the
data it holds, and how the data is organized.
• This metadata allows the system to understand its own data without needing
external explanations or documentation.
• For example, if you have a table named Employees, the metadata will describe:
Ramez Elmasri and Shamkant B. Navathe. 1989. Fundamentals of database systems. Benjamin-Cummings Publishing Co., Inc., USA.
Why is this important?
• Flexibility: Since the database knows its structure, it can adapt to different
queries or changes in the system (like adding new tables or columns) without
requiring extra configuration or documentation.
• Data Integrity: The database can enforce rules and constraints automatically
using its self-describing metadata (e.g., ensuring that a column always holds
valid data).
• Ease of Access: Developers and users don’t need to manually track the
structure of the data. They can simply query the database, and the system
knows how to respond based on its internal structure.
Example: Imagine a simple Customer table with the following columns:
• CustomerID (integer)
• Name (string)
• Email (string)
The database doesn't need you to manually tell it the types or rules about the data—it
already knows this from its metadata.
When you query the database to get information about customers, it can use this self-
described structure to correctly process the request and return the correct results.
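The Customer example above can be tried out as a small sketch. It assumes SQLite, one of the DBMSs named earlier, accessed through Python's standard `sqlite3` module; other systems expose their catalog differently (for example through `INFORMATION_SCHEMA` views), so the catalog queries here are SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer ("
    " CustomerID INTEGER PRIMARY KEY,"
    " Name TEXT NOT NULL,"
    " Email TEXT)"
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(Customer)").fetchall()
described = [(name, col_type) for _cid, name, col_type, *_rest in columns]

# The catalog table sqlite_master stores the schema itself as data.
catalog_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
```

Nothing outside the database was consulted: the column names and types came back from the metadata the system keeps about itself.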
B. Insulation between Programs
and Data
• In traditional file processing, the structure of data files is embedded
in the application programs.
Example: Imagine you have an application program that
reads and writes data to a file containing customer
information:
File structure: The file might store customer data in columns like Name,
Address, and PhoneNumber (perhaps in a simple text file or a CSV file).
Change: Suppose a new column, Email, must be added to the file.
Since the application was written specifically to handle the old structure (without an
email), you would need to go into every program that accesses the file and modify
it to handle the new column (Email).
This can be a lot of work, especially if multiple programs use the same file.
In contrast, in a Database Management System (DBMS), the
structure of data files (called the schema) is separate from the
application programs that access the data.
This means that you can change the structure of the database without
directly affecting the application programs, as long as you maintain
the same interface or access methods.
How it works:
• Adding a new field (Email) to the schema does not affect an existing query
unless the application explicitly needs to use the new field.
• The database handles the schema change internally without impacting existing
programs.
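A minimal sketch of this program-data insulation, assuming SQLite and a hypothetical `customer` table with the Name, Address, and PhoneNumber fields from the example. The function `list_phone_book` stands in for an existing application program; it is written once and is not edited after the schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, address TEXT, phone_number TEXT)")
conn.execute("INSERT INTO customer VALUES ('Asha', 'Patiala', '555-0100')")

def list_phone_book(conn):
    """An 'existing program': it names only the columns it needs."""
    return conn.execute(
        "SELECT name, phone_number FROM customer ORDER BY name").fetchall()

before = list_phone_book(conn)

# Schema change: add the Email field. Existing rows get NULL for it.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
conn.execute("INSERT INTO customer (name, address, phone_number, email) "
             "VALUES ('Ravi', 'Delhi', '555-0101', 'ravi@example.com')")

after = list_phone_book(conn)  # same program, no code changes
```

The insulation holds because the program asks for columns by name. A program that relied on `SELECT *` and on column positions would be more exposed to the change.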
Data abstraction in a DBMS is essential because it:
In short, data abstraction makes the database system more manageable, secure,
adaptable, and user-friendly, ensuring smooth operations for both developers and
end-users.
D. Support of Multiple Views of
the Data
• A database typically has many types of users, each of whom may
require a different perspective or view of the database.
Example: Consider a university database with the following information:
Student Table: Stores information about students (name, age, department).
Course Table: Stores information about courses (course name, department).
Enrollment Table: Stores which students are enrolled in which courses.
• A student might only need to see their own details and enrolled courses.
• A teacher might want to see the list of students enrolled in their courses.
• An administrator might need to view all students, all courses, and enrollment details
across the university.
1. Security: Only authorized users can access certain views, ensuring sensitive
information is kept secure.
2. Data Abstraction: The database structure is abstracted from users, allowing them to
focus on the data they need without worrying about the underlying implementation.
3. User-Centric Customization: Users can interact with data in a way that is most
relevant to their tasks, improving usability.
4. Efficiency: Views can simplify complex data structures, allowing users to retrieve
only the relevant data with simpler queries.
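The university example can be sketched with an SQL view. This is an illustration under assumptions: the table and column names (`student`, `course`, `enrollment`, `course_roster`) and the sample rows are hypothetical, and SQLite is used only because it runs in memory. SQLite has no user accounts, so the per-user access control mentioned under Security is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, age INTEGER, department TEXT);
CREATE TABLE course (course_id INTEGER PRIMARY KEY, course_name TEXT, department TEXT);
CREATE TABLE enrollment (student_id INTEGER, course_id INTEGER);

INSERT INTO student VALUES (1, 'Asha', 19, 'CSE'), (2, 'Ravi', 20, 'ECE');
INSERT INTO course VALUES (10, 'DBMS', 'CSE'), (11, 'Signals', 'ECE');
INSERT INTO enrollment VALUES (1, 10), (2, 10), (2, 11);

-- Teacher's view: who is enrolled in which course (no ages exposed).
CREATE VIEW course_roster AS
SELECT c.course_name, s.name AS student_name
FROM enrollment e
JOIN student s ON s.student_id = e.student_id
JOIN course c ON c.course_id = e.course_id;
""")

# A teacher of DBMS queries the view like an ordinary table.
dbms_roster = [row[0] for row in conn.execute(
    "SELECT student_name FROM course_roster "
    "WHERE course_name = 'DBMS' ORDER BY student_name")]

# A student's perspective: only their own enrolled courses.
ravi_courses = [row[0] for row in conn.execute(
    "SELECT course_name FROM course_roster "
    "WHERE student_name = ? ORDER BY course_name", ("Ravi",))]
```

Both users read from the same three stored tables; only the perspective differs.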
E. Sharing of Data and
Multiuser Transaction
Processing
• A multiuser DBMS must allow multiple users to access the database at the same time.
• Both concepts (Sharing of Data and Multiuser Transaction) are essential for modern
DBMSs, as they allow for robust, secure, and scalable database systems that can
handle high volumes of users and transactions efficiently.
MCQ
• Data Isolation is caused in a traditional file system due to ____.
1. Duplicate Data
2. Scattering of Data
3. Complex Data
4. Atomic Data
MCQ
• Person A wants to transfer Rs. 500 to Person B. If a failure occurs
after removing Rs. 500 from Account A and before transferring it to
Account B, then the problem caused is ____.
1. Data Isolation
2. Data Atomicity
3. None of these
4. Data Redundancy
MCQ
• Duplication of data at several places is called ____.
1. Data Inconsistency
2. Atomicity Problem
3. Data Isolation
4. Data Redundancy
MCQ
• If common fields in redundant files do not match, then it results in ____.
1. Data Integrity Problem
2. Data Isolation
3. Data Redundancy
4. Data Inconsistency
MCQ
• It is more difficult to access data in a conventional file system than in a
Database System.
1. True
2. False
MCQ
• Suppose a user has a Saving Account and a Checking Account in the bank.
Saving Account stores the following information -
account-no
name
address
mobile
and Checking Account stores -
account-no
name
address
mobile
• Which of the information is not redundant?
1. Address
2. Name
3. account-no
4. Mobile
Definition of Database
A database is an organized collection of data that is stored and accessed electronically from a computer
system.
It allows for efficient storage, retrieval, and management of data.
Databases are designed to store large volumes of structured or unstructured data in a way that allows for
easy access, manipulation, and reporting.
1. Data: The actual information stored in the database (e.g., customer names, addresses, product
details).
2. Schema: The design or structure of the database, including tables, fields (columns), and relationships
between tables.
3. DBMS (Database Management System): A software system used to manage and interact with the
database. It allows users to create, read, update, and delete data (CRUD operations).
Example: In a Library Management System, the database could include:
Purpose of a Database:
• Efficient Data Storage: Data is stored in a structured manner that allows for easy retrieval.
• Data Integrity: Ensures that the data is accurate and consistent.
• Security: Provides access control to protect sensitive information.
• Scalability: Allows databases to grow with increasing data needs.
In summary, a database is a powerful tool for managing large amounts of data in a way that is
both efficient and secure.
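Aspects of Studying DBMS
1. Modeling and Design of Databases
2. Exploration of Issues
3. Programming: Queries and DB Operations
4. DBMS Implementation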
Modeling and Design of Databases
The process of creating a conceptual representation of the database structure and its elements.
Before building a database, it's crucial to explore and address several key issues that could impact
the final design.
• Example:
• Deciding between a relational database (SQL) and a NoSQL database based on
scalability and performance requirements.
Programming: Queries and DB Operations
Writing and executing queries, as well as performing operations like data updates, retrieval, and
deletion.
•Types of DB Operations:
1.Data Retrieval: Using SELECT queries to get data from tables.
2.Data Insertion: Using INSERT queries to add new data to tables.
3.Data Update: Using UPDATE queries to modify existing data.
4.Data Deletion: Using DELETE queries to remove data.
•SQL Syntax:
•SELECT: SELECT * FROM books WHERE author = 'J.K. Rowling';
•INSERT: INSERT INTO books (title, author) VALUES ('New Book', 'Author');
•UPDATE: UPDATE books SET title = 'Updated Title' WHERE book_id = 1;
•DELETE: DELETE FROM books WHERE book_id = 1;
•Example:
•A Customer Relationship Management (CRM) system might need to update customer details
using UPDATE queries.
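The four statements above can be run end to end as a small sketch. It assumes an in-memory SQLite database and a `books` table with the `book_id`, `title`, and `author` columns that the statements imply; the table definition itself is an assumption, since the slides do not give one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT, author TEXT)")

# INSERT: add a row; book_id is assigned automatically (1 for the first row).
conn.execute("INSERT INTO books (title, author) VALUES ('New Book', 'Author')")

# SELECT: retrieve rows matching a condition.
found = conn.execute(
    "SELECT book_id, title FROM books WHERE author = 'Author'").fetchall()

# UPDATE: modify an existing row.
conn.execute("UPDATE books SET title = 'Updated Title' WHERE book_id = 1")
updated_title = conn.execute(
    "SELECT title FROM books WHERE book_id = 1").fetchone()[0]

# DELETE: remove the row.
conn.execute("DELETE FROM books WHERE book_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
```

In application code, values that come from users are normally passed as `?` parameters rather than written into the SQL string.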
DBMS Implementation
The actual creation and deployment of the database system, including configuring the DBMS
and integrating it with applications.
• Example:
• Implementing a Customer Orders Database that integrates with an e-commerce
website.
Summary
• Key Points:
• Modeling and Design: Crucial for organizing and structuring data in an efficient way.
• Exploration of Issues: Identifying potential challenges before committing to implementation.
• Programming: Writing SQL queries to interact with the database.
• DBMS Implementation: Deploying the system into production for use.
• Conclusion: Studying these aspects of DBMS is vital to build efficient, scalable, and secure
database systems.
DBMS Architecture
• The DBMS design depends upon its architecture.
• A client/server architecture consists of many PCs and a workstation connected via the network.
• DBMS architecture depends upon how users are connected to the database to get their
request done.
A Tier-1 Architecture (also known as Single-Tier Architecture) refers to a system where all
components—data storage, business logic, and presentation—are bundled into a single unit.
There is no separation of concerns; the user interface, application logic, and database reside on
the same system.
• Example: A desktop application where the database and application logic are part of the same
system.
Components:
• Presentation Layer: The user interface (UI) interacts directly with the business logic and data
storage.
• Business Logic: The application logic that processes data is embedded directly in the same
system.
• Database Layer: The database and data storage are often stored locally within the system.
Advantages of Tier-1 (Single-Tier) Architecture:
• Simplicity: Easy to implement and deploy for smaller applications, requiring minimal infrastructure.
• Low Latency: As all components are on the same machine, there are minimal delays in
communication between the UI, application logic, and database.
• Cost-Effective: For small-scale applications, the architecture requires less overhead in terms of
network infrastructure and hardware.
• Ease of Deployment: Everything is in one place, so deployment and maintenance are simpler.
• Ideal for Small Applications: Best suited for small, stand-alone applications where scalability and
separation of concerns are not required.
Disadvantages:
•Limited Scalability: It’s difficult to scale because the entire system is dependent on a
single unit.
•Lack of Flexibility: Modifying or upgrading the system can be challenging due to the tight
coupling of all components.
•Risk of Downtime: If any part of the system fails (e.g., the database), the entire
application could stop working.
•Limited Data Security: With all components on the same system, securing sensitive data
can be challenging.
Tier-2 Architecture: Two-Tier Architecture
A Tier-2 Architecture (also known as Two-Tier Architecture) divides the application into two
separate layers: the client layer (presentation) and the server layer (data management). The
client communicates directly with the server to access or manipulate data.
Components:
• Client Layer: The user interface (UI) that interacts with the
application. The client may run on a user’s machine or a web
browser.
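The user interfaces and application programs run
on the client side.
• Server Layer: The database server, which stores and manages the data.
Advantages of Tier-2 (Two-Tier) Architecture: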
• Centralized Data Management: The database and business logic reside on the
server, making data management easier and more secure.
• Better Performance: Since data processing is offloaded to the server, the client has
less burden and can perform better.
• Data Security: The database resides on the server, so it can be better secured and
managed.
Disadvantages:
•Network Dependency: The client depends on network communication to interact with the
server, so performance may degrade with network latency or interruptions.
•Limited Flexibility: While the client and server are separate, they still tightly depend on each
other. Changes in the server may impact the client.
•Scaling Challenges: While there’s a separation between the client and the server, the
application still has limitations in terms of scaling large or complex systems.
Three-Tier Architecture
• Presentation Layer (Client Layer): The user interface that interacts with the application. This
is the front-end, such as a web browser, mobile app, or desktop app.
• Application Layer (Business Logic Layer): The backend that processes business logic,
handles requests, and performs necessary calculations. This layer acts as the intermediary
between the presentation and database layers.
• Database Layer: This layer is responsible for storing and managing data. It could be a
database server or cloud-based storage.
Advantages of Tier-3 (Three-Tier) Architecture:
• Scalability: Each layer can be scaled independently based on demand. For instance, if more
processing power is needed, you can scale the application layer without affecting the database.
• Separation of Concerns: Clear separation between the presentation, business logic, and data
management layers makes the application easier to manage, maintain, and update.
• Flexibility: Modifications in one layer (e.g., changing the database or business logic) can be
made without affecting the other layers.
• Security: Sensitive data and business logic are centralized in the backend (server) and database,
reducing the risk of unauthorized access from the client side.
• Improved Performance: The business logic layer can optimize data handling and communication
between the presentation and database layers.
• Fault Tolerance: If one layer fails, the impact is limited because the other layers are still
functional. For example, the presentation layer can still be functional even if the database layer
is temporarily unavailable.
Disadvantages:
• Higher Cost: More infrastructure is needed, leading to higher initial setup costs and
maintenance.
• Tier-1 (Single-Tier) is ideal for small, self-contained applications where scalability and
separation of concerns are not critical.
• Tier-2 (Two-Tier) is suitable for client-server applications where the client interacts directly
with a centralized server, providing better scalability and separation of concerns than Tier-1.
• Tier-3 (Three-Tier) offers the highest scalability, flexibility, and security, making it the best
choice for larger, more complex applications that require independent scaling, robust fault
tolerance, and enhanced security.
Each architecture has its own advantages and trade-offs, and the choice
depends on the specific needs of the application and its scale.
Concept of Data Dependency
Why is it Important?
• Essential for understanding and optimizing database normalization.
• Influences data integrity and consistency.
Types of Data Dependencies
• 1. Functional Dependency:
• A relationship where one attribute uniquely determines another.
• Example: In a student database, Student_ID -> Student_Name.
• 2. Multivalued Dependency:
• Occurs when one attribute determines a set of values for another attribute.
• Example: Course_ID ->> Instructor_Name in a course database.
• 3. Transitive Dependency:
• A relationship where one attribute depends on another through an intermediate attribute.
• Example: A -> B and B -> C implies A -> C.
Functional Dependency
• Definition:
• Denoted as X -> Y, where X determines Y.
• Key Characteristics:
• Helps identify primary keys.
• Basis for normalization.
• Example:
• Employee_ID -> Employee_Name.
• Use Case:
• Optimizing query performance by reducing redundancy.
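A functional dependency X -> Y holds when any two rows that agree on X also agree on Y. The sketch below tests that condition on sample rows. The helper `holds_fd` and the `Dept` column are hypothetical, introduced for illustration. A check on sample data can only refute a dependency; whether it holds in general is a design decision about the schema.

```python
def holds_fd(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

employees = [
    {"Employee_ID": 1, "Employee_Name": "Asha", "Dept": "HR"},
    {"Employee_ID": 2, "Employee_Name": "Ravi", "Dept": "HR"},
    {"Employee_ID": 3, "Employee_Name": "Asha", "Dept": "Accounts"},
]

# Each ID appears with exactly one name, so the dependency holds.
id_determines_name = holds_fd(employees, ["Employee_ID"], ["Employee_Name"])
# Two employees named Asha work in different departments, so this one fails.
name_determines_dept = holds_fd(employees, ["Employee_Name"], ["Dept"])
```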
Multivalued Dependency
• Definition:
• Denoted as X ->> Y, where X determines multiple independent values of Y.
• Importance:
• Highlights the need for decomposing databases into multiple tables.
• Example:
• A movie database where
Movie_Title ->> Actor_Name
• Impact:
• Supports database normalization (4NF).
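For a relation with attributes X, Y, and Z, the multivalued dependency X ->> Y holds when, for each X value, every Y value appears with every Z value. A minimal sketch of that test follows. The helper `holds_mvd`, the third attribute `Language`, and the sample rows are assumptions made for illustration, and the helper handles single attributes only.

```python
from itertools import product

def holds_mvd(rows, x, y, z):
    """Check X ->> Y in rows with attributes x, y, z: for each X value, the
    (Y, Z) pairs present must be every combination of its Y and Z values."""
    groups = {}
    for row in rows:
        groups.setdefault(row[x], set()).add((row[y], row[z]))
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        if pairs != set(product(ys, zs)):
            return False
    return True

movies = [
    {"Movie_Title": "M1", "Actor_Name": "Actor A", "Language": "Hindi"},
    {"Movie_Title": "M1", "Actor_Name": "Actor A", "Language": "English"},
    {"Movie_Title": "M1", "Actor_Name": "Actor B", "Language": "Hindi"},
    {"Movie_Title": "M1", "Actor_Name": "Actor B", "Language": "English"},
]
complete = holds_mvd(movies, "Movie_Title", "Actor_Name", "Language")
# Dropping one combination breaks the independence of actors and languages.
incomplete = holds_mvd(movies[:3], "Movie_Title", "Actor_Name", "Language")
```

The four rows needed to record two actors and two languages show why such a table is usually split into (Movie_Title, Actor_Name) and (Movie_Title, Language) for 4NF.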
Transitive Dependency
• Definition:
• When an attribute depends indirectly on another via a third attribute.
• Example:
• In a sales database: Product_ID -> Category_ID and Category_ID ->
Category_Name, then Product_ID -> Category_Name.
• Significance:
• Eliminated during normalization to 3NF.
• Drawback:
• Can lead to redundancy if not resolved.
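The sales example can be sketched as a decomposition. This assumes SQLite and hypothetical table names (`product_flat`, `product`, `category`) with made-up rows. The flat table carries the transitive dependency; splitting it stores each category name once, and a join recovers the original rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Before: Category_Name depends on Product_ID only through Category_ID,
-- so the name 'Stationery' is repeated on every stationery product.
CREATE TABLE product_flat (product_id INTEGER PRIMARY KEY,
                           category_id INTEGER, category_name TEXT);
INSERT INTO product_flat VALUES (1, 10, 'Stationery'), (2, 10, 'Stationery'),
                                (3, 20, 'Toys');

-- After (3NF): each category name is stored once.
CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE product (product_id INTEGER PRIMARY KEY,
                      category_id INTEGER REFERENCES category(category_id));
INSERT INTO category SELECT DISTINCT category_id, category_name FROM product_flat;
INSERT INTO product SELECT product_id, category_id FROM product_flat;
""")

flat = conn.execute(
    "SELECT product_id, category_id, category_name FROM product_flat "
    "ORDER BY product_id").fetchall()
rejoined = conn.execute(
    "SELECT p.product_id, p.category_id, c.category_name "
    "FROM product p JOIN category c ON c.category_id = p.category_id "
    "ORDER BY p.product_id").fetchall()
category_rows = conn.execute("SELECT COUNT(*) FROM category").fetchone()[0]
```

Renaming a category now means updating one row in `category` instead of every matching product row.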
Symbols and Their Meanings:
Symbol    Meaning
→         Functional Dependency
↠         Multivalued Dependency
*         Join Dependency
⊆         Inclusion Dependency
→→        Transitive Dependency (indirect path)
Importance of Understanding Data Dependencies
• Data Integrity:
• Ensures consistency in data relationships.
• Normalization:
• Guides the process of dividing tables into smaller, related ones.
• Query Efficiency:
• Improves performance by structuring data logically.
Applications of Data Dependency
• Database Normalization:
• Achieves 1NF, 2NF, 3NF, BCNF, and higher forms.
• Data Modeling:
• Used in designing entity-relationship diagrams.
• Query Optimization:
• Enables indexing and efficient data retrieval.
• Integrity Constraints:
• Helps enforce rules like primary and foreign keys.
Challenges in Managing Data Dependencies
• Complexity:
• Requires deep understanding of relationships.
• Performance Overhead:
• Normalization can sometimes lead to slower queries because more joins are needed.
• Dynamic Changes:
• Adapting to schema updates while maintaining dependencies.
• Consistency Maintenance:
• Avoiding anomalies during updates, insertions, or deletions.
• Key Takeaways:
• Data dependency forms the backbone of relational databases.
• Understanding dependencies is critical for normalization and query optimization.
• Proper management ensures data integrity and reduces redundancy.
• Final Thought:
• Mastering data dependencies leads to robust and efficient database systems.
3-Schema Architecture of DBMS
• University Database:
- External Schema: Student, faculty, and admin views.
- Conceptual Schema: Courses, students, faculty, schedules.
- Internal Schema: File storage, indexing.
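The three levels in the university example can be loosely mapped onto SQL objects: tables for the conceptual schema, a view for an external schema, and an index for the internal schema. The sketch below assumes SQLite and hypothetical names (`student`, `cse_students`, `idx_student_department`). It is an approximation of the idea, not a full implementation of the architecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: the logical tables.
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, department TEXT);
INSERT INTO student VALUES (1, 'Asha', 'CSE'), (2, 'Ravi', 'ECE');

-- External level: a view tailored to one group of users.
CREATE VIEW cse_students AS
SELECT student_id, name FROM student WHERE department = 'CSE';
""")

query = "SELECT name FROM cse_students ORDER BY name"
before_index = conn.execute(query).fetchall()

# Internal level: a storage-side change (adding an index) ...
conn.execute("CREATE INDEX idx_student_department ON student (department)")

# ... leaves the external view and the query against it untouched.
after_index = conn.execute(query).fetchall()
```

The unchanged result after the index is added is a small instance of physical data independence.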
Challenges of 3-Schema Architecture
• Complexity:
- Designing and managing schemas require expertise.
• Performance Overhead:
- Additional layers may introduce latency.
• Synchronization:
- Ensuring consistency across schemas can be challenging.
Conclusion
• Key Takeaways:
- Provides a structured approach to database design.
- Enhances data abstraction and independence.
- Supports diverse user needs and simplifies database management.
• Final Thought:
- A cornerstone of modern DBMS, bridging user requirements and data management.
Advantages of the DBMS Approach
• Controlling Redundancy
• In file processing, every user group maintains its own files for handling its
data-processing applications.
• Redundancy introduces the following problems:
• Duplication of effort
• Storage space is wasted
• Data inconsistency
• In the database approach, the views of different user groups are integrated so that each logical data
item is stored and updated in only one place in the database.
• Providing Efficient Query Processing
• Database is typically stored on disk.
• DBMS must provide specialized data structures and search techniques to speed up
disk search for the desired records.
• The query processing and optimization module of the DBMS is responsible for
choosing an efficient query execution plan for each query based on the existing
storage structures.
• Providing Backup and Recovery
• DBMS must provide facilities for recovering from hardware or software failures.
• The backup and recovery subsystem of the DBMS is responsible for recovery.
• Providing Multiple User Interfaces
• A DBMS should provide a variety of user interfaces for users with varying
levels of technical knowledge.
• Apps for mobile users
• Query languages for casual users.
• Programming language interfaces for application programmers.
• Menu-driven interfaces and natural language interfaces for standalone users.
• Representing Complex Relationships among Data
• A database may include numerous varieties of data that are interrelated in many
ways.
• DBMS must have the capability to represent a variety of complex relationships
among the data, to define new relationships as they arise, and to retrieve and
update related data easily and efficiently.
• Enforcing Integrity Constraints
• The database applications have certain integrity constraints that must hold for
the data.
• The simplest type of integrity constraint involves specifying a data
type for each data item.
• Permitting Inferencing
• Database systems provide capabilities for inferencing new information from
the stored database facts.
• Such systems are called deductive database systems.
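The point about enforcing integrity constraints can be illustrated with a short sketch. It assumes SQLite and a hypothetical `student` table; the constraints and the helper `try_insert` are chosen for illustration. SQLite is lenient about declared column types by default, so the sketch uses PRIMARY KEY, NOT NULL, and CHECK constraints rather than relying on data types alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age BETWEEN 15 AND 100))"
)
conn.execute("INSERT INTO student VALUES (1, 'Asha', 19)")

def try_insert(row):
    """Return True if the DBMS accepts the row, False if it rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

ok = try_insert((2, "Ravi", 20))
duplicate_key = try_insert((1, "Meena", 21))   # violates PRIMARY KEY
missing_name = try_insert((3, None, 22))       # violates NOT NULL
bad_age = try_insert((4, "Kiran", 7))          # violates CHECK
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

The rules are declared once in the schema and enforced by the DBMS for every program that writes to the table.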
• Restricting Unauthorized Access
• Not every user should be allowed to access all information in the database.
• A DBMS should provide a security and authorization subsystem
• Where DBA creates accounts and specifies account restrictions
• Then DBMS should enforce these restrictions automatically
• Providing Persistent Storage for Program Objects
• Databases can be used to provide persistent storage for program objects and
data structures
• Such an object is said to be persistent because it survives the termination of program
execution and can later be directly retrieved by another program.
Summary
• Studied the traditional file system
• Drawbacks of the traditional file system
• Characteristics required for the data storage system
• We have completed the overview of data, databases, and database
management systems.
• We also discussed the advantages of a DBMS over a file
processing system.