DBMS Notes


It allows users to create, modify, and query a database, as well as
manage the security and access controls for that database.

popular commercial
database which is used in
A database stores a lot of critical information that must be accessed
quickly and securely.

Hence it is important to select the correct architecture for efficient data
management.

We choose a database architecture depending on several factors, such as the
size of the database, the number of users, and the relationships between the
users.
1-Tier architecture is the simplest database architecture: the client,
server, and database all reside on the same machine.

A simple 1-Tier example is installing a database on your own system and
accessing it to practice SQL queries.

Such an architecture is rarely used in production.

A 1-Tier architecture can be suitable where the data management needs are
relatively simple and the system is not intended to be accessed by multiple
users or clients simultaneously.

For example, a small shop owner who wants to keep track of inventory and
sales data on their computer might choose a 1-Tier setup.
In 2-Tier architecture, there is a clear separation between the client (user
interface and application logic) and the server (database management).

The client handles the user interface and basic processing, while the
server manages the database and more complex application logic.

The client communicates with the server to request data or perform
updates, and the server responds accordingly.

Example: A desktop application or web interface (client) that communicates
with a central database server to retrieve or update information.
The presentation layer is the user interface that users interact with.

It could be a web browser on your PC, tablet, or mobile device.

Example: When you access an e-commerce website on your laptop and
browse through product listings, add items to your cart, and proceed to
checkout, you're interacting with the presentation layer.

The application layer, often referred to as the server side, handles
the business logic, data processing, and communication between the
presentation layer and the database server.

Example: When you click the "Checkout" button on the e-commerce
website, the application layer processes this request.

It verifies your cart, calculates the total, and communicates with the
database to update inventory and store the order details.

This layer is responsible for handling various business rules,
authentication, and transaction management.

The database layer is responsible for storing and managing the data. It
handles data retrieval, storage, and manipulation based on the
requests from the application layer.

Example: The database server stores product information, user
details, order history, and other relevant data. When you place an
order on the e-commerce site, the application layer communicates
with the database server to update the order information and
deduct the purchased items from the inventory.
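
The checkout flow described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: an in-memory SQLite
database stands in for the database server, and the table names, column
names, and the checkout function are hypothetical, not part of these notes.

```python
import sqlite3

def checkout(conn, cart):
    """Application-layer sketch (hypothetical): verify the cart, compute the
    total, then update inventory and store the order in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        total = 0.0
        for product_id, qty in cart.items():
            row = conn.execute(
                "SELECT price, quantity FROM product WHERE product_id = ?",
                (product_id,),
            ).fetchone()
            if row is None or row[1] < qty:
                raise ValueError(f"product {product_id} unavailable")
            total += row[0] * qty
            conn.execute(
                "UPDATE product SET quantity = quantity - ? WHERE product_id = ?",
                (qty, product_id),
            )
        conn.execute("INSERT INTO orders (total) VALUES (?)", (total,))
    return total

# Database layer: an in-memory SQLite database stands in for the server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, price REAL, quantity INTEGER)"
)
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, total REAL)")
conn.execute("INSERT INTO product VALUES (1, 10.0, 5), (2, 2.5, 8)")
conn.commit()

# Presentation layer: the cart the user submitted (product_id -> quantity).
order_total = checkout(conn, {1: 2, 2: 4})
remaining = conn.execute(
    "SELECT quantity FROM product WHERE product_id = 1"
).fetchone()[0]
```

The presentation layer only passes the cart in; all business rules and all
SQL stay on the application and database side.
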
The external schema represents how data appears to different users or user
groups.

Each external schema defines a specific view of the data tailored to the
needs of a particular user or application.

Consider an e-commerce database where there are different external schemas
for customers, administrators, and suppliers.

Customer view: shows information about products, prices, and orders.

Administrator view: includes additional details about inventory, sales
reports, and user management.

Supplier view: focuses on inventory and order fulfillment.
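
One common way to realize external schemas is with SQL views. The sketch
below uses Python's built-in sqlite3 module; the table, its columns, and
the view names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT,
    price      REAL,
    quantity   INTEGER   -- stock level
);
INSERT INTO product VALUES (1, 'Lamp', 19.99, 40), (2, 'Desk', 120.00, 3);

-- Customer view: products and prices only, stock level hidden.
CREATE VIEW customer_products AS
    SELECT name, price FROM product;

-- Supplier view: stock levels only, prices hidden.
CREATE VIEW supplier_inventory AS
    SELECT product_id, name, quantity FROM product;
""")

customer_rows = conn.execute(
    "SELECT * FROM customer_products ORDER BY name"
).fetchall()
supplier_cols = [
    d[0] for d in conn.execute("SELECT * FROM supplier_inventory").description
]
```

Both views read the same stored table, but each user group sees only the
columns relevant to it.
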


The conceptual schema represents the logical structure of the entire
database, independent of how data is viewed or stored.

It defines relationships, constraints, and the overall organization of data.

In our e-commerce database, the conceptual schema outlines entities like
Products, Orders, Users, and their relationships.

Product (ProductID, Name, Price, Quantity)
Order (OrderID, CustomerID, OrderDate, Status)
User (UserID, Username, Password, Role)

One-to-Many: Each Order belongs to one Customer, but a Customer can have
multiple Orders.

Many-to-Many: Users can have multiple Roles, and a Role can be associated
with multiple Users.
The internal schema represents how data is physically stored on the storage
devices.

It deals with issues like indexing, file structures, and optimization
for efficient storage and retrieval.

Going back to our e-commerce database, the internal schema considers how
data is stored on the disk, such as using specific file structures,
indexing mechanisms, and storage optimization techniques.
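
As a small illustration of an internal-level decision, the sketch below
adds an index to a table and checks that the same query returns the same
rows before and after. It uses SQLite through Python's sqlite3 module; the
table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO product (name, price) VALUES (?, ?)",
    [(f"item{i}", float(i)) for i in range(1000)],
)

query = "SELECT price FROM product WHERE name = ?"
before = conn.execute(query, ("item42",)).fetchall()

# Internal-level change: add an index on the name column.
conn.execute("CREATE INDEX idx_product_name ON product (name)")

after = conn.execute(query, ("item42",)).fetchall()

# The last column of each EXPLAIN QUERY PLAN row describes one step of the plan.
plan = " ".join(
    row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("item42",))
)
```

The query text and its result are unchanged; only the access path chosen
by the engine differs once the index exists.
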
A DBMS enforces data integrity constraints, preventing inconsistencies and
ensuring accurate and reliable data.

Example: A student database ensures that each student record includes a
unique identification number, avoiding duplicate entries.
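
A minimal sketch of the student example, assuming SQLite and a hypothetical
student table: declaring the identification number as the primary key makes
the DBMS itself reject a duplicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO student VALUES (101, 'Asha')")

try:
    # Same identification number again: the DBMS rejects the row.
    conn.execute("INSERT INTO student VALUES (101, 'Ravi')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```
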

A DBMS centralizes data storage, reducing duplication and ensuring a
single, consistent source of information.

Example: In an online store, product details like name, price, and
description are stored centrally in a database, avoiding redundant entries.

A DBMS provides mechanisms for access control, protecting sensitive data
from unauthorized access.

Example: A personnel database in a company restricts access to salary
information based on user roles to maintain confidentiality.
A DBMS handles simultaneous access by multiple users, managing
transactions to ensure data consistency.

Example: An airline reservation system allows multiple users to book
flights concurrently, maintaining accurate seat availability.

Example: In a software application for managing employee information,
changes to how data is stored (like adding a new field for
employee photos) can be made without affecting how the information
appears in the user interface.

A DBMS allows developers to modify the internal structure without
disrupting the software's functionality.

It's like upgrading the database engine without requiring users to learn a
new way to interact with the employee records.
The provision of the functionality that we expect of a good DBMS
makes the DBMS an extremely complex piece of software.

Database designers and developers, data and database administrators, and
end-users must understand this functionality to take full advantage of it.

Failure to understand the system can lead to bad design decisions, which can
have serious consequences for an organization.

The complexity and breadth of functionality make the DBMS an
extremely large piece of software, occupying many megabytes of disk
space and requiring substantial amounts of memory to run efficiently.

The cost of DBMSs varies significantly, depending on the environment and
functionality provided.

For example, a single-user DBMS for a personal computer may only cost $100.

However, a large mainframe multi-user DBMS servicing hundreds of users can be
extremely expensive, perhaps $100,000 or even $1,000,000.

There is also the recurrent annual maintenance cost, which is typically a
percentage of the list price.
A DBMS introduces some performance overhead due to query optimization and
management of transactions.

Example: High-volume transactional systems may experience latency due to
the overhead of managing multiple concurrent transactions.
Think of logical data independence like upgrading your smartphone without
changing how your apps work.

If you add a new feature to a game app, you don't want it to break just
because you got a new phone.

Similarly, in a DBMS with logical data independence, you can modify the
way data is organized or add new data structures without impacting the
applications that use the database.

Example: Suppose you have a student database, and you decide to
add a new field for students' hobbies. With logical data independence,
you can make this change without rewriting the software that displays
student information.
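
The hobbies example can be sketched as follows, assuming SQLite and a
hypothetical student table. The display function is illustrative; it
stands in for existing application code written before the column was
added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")

def display_students(conn):
    """Existing application code: it names its columns explicitly, so
    columns added later do not change what it returns."""
    return conn.execute(
        "SELECT student_id, name FROM student ORDER BY student_id"
    ).fetchall()

before = display_students(conn)

# Logical-level change: add a new field for hobbies.
conn.execute("ALTER TABLE student ADD COLUMN hobbies TEXT")
conn.execute("UPDATE student SET hobbies = 'chess' WHERE student_id = 1")

after = display_students(conn)
```

Code that selects specific columns keeps working after the schema grows.
Code that relies on `SELECT *` and positional access is more fragile.
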
Physical data independence means that you can change how the data is stored (its
physical structure) without affecting how users interact with it.

Example: In a DBMS with physical data independence, you might decide to switch
from one type of storage system to another for better performance.

Users and applications accessing the data won't notice this change, as the
interaction remains the same.
A database administrator, or DBA, is responsible for maintaining, securing, and
operating databases and also ensures that data is correctly stored and
retrieved.

In addition, DBAs often work with developers to design and implement new
features and troubleshoot any issues.

A DBA must have a strong understanding of both technical and business needs.

Designing and planning the structure of the database, including
defining tables, relationships, and constraints.

A DBA enhances query processing by improving speed, performance, and
accuracy.
With cloud computing, DBAs are no longer responsible for managing the
underlying infrastructure; this is all managed by the provider.

DBAs now perform more strategic tasks, such as data analytics, user experience design, and
cybersecurity.

DBAs often work directly with users and business leaders on developing new ways to use
data and software to automate processes, reduce costs, and stay competitive.

This requires a new set of skills from DBAs.

In the past, having strong technical skills was the most important requirement.

There is less need for these skills with cloud computing.

Instead, DBAs need to communicate and collaborate with users to understand their needs
and business environment.

They also need to work with other teams, such as DevOps, to help deliver software that
will solve business problems.
In a Relational Database Management System (RDBMS), data is organized
using a relational data structure.

This structure is based on tables, where data is stored in rows and columns.

Consider a database to store information about students, including their
details and the courses they are enrolled in.

We can represent this data using relational tables.

In the context of relational databases, a "relation" is another term for a table.

It represents a collection of similar data entries organized in rows and
columns.

A relationship in a DBMS is the way in which two or more data sets are linked.

Attributes are the columns or fields in a table.

They represent the properties or characteristics of the entities in the table.

For example, in a "Student" table, attributes could include "StudentID," "Name,"
and "Age."
A schema defines the structure and organization of a database, including
tables, columns, and relationships.

It's like a blueprint for the database.

Instances, in the context of a database, refer to the actual data stored in the
database at a particular moment.

Each row in a table is an instance of that table.


Referential integrity is a property that ensures that relationships between
tables remain consistent.

In the context of foreign keys, it means that a foreign key in one table must
match the primary key value in another table, ensuring that relationships
between tables are valid.
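
A minimal sketch of referential integrity in practice, assuming SQLite
through Python's sqlite3 module. The department and employee tables are
hypothetical. Note that SQLite enforces foreign keys only after the pragma
shown below is turned on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " dept_id INTEGER REFERENCES department(dept_id))"
)
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (100, 'Asha', 1)")  # matches department 1

try:
    # dept_id 99 matches no primary key in department, so the row is rejected.
    conn.execute("INSERT INTO employee VALUES (101, 'Ravi', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```
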

Entity integrity ensures that each row (entity) in a table has a unique and non-
null primary key.

It means that the primary key must have a value (not null), and no two rows
can have the same primary key value.

The term "entity integrity" is often used to emphasize the holistic nature of
maintaining the integrity of the entire record, not just the primary key column.
A data model is a conceptual representation of how data is
structured, stored, and accessed in a database.

It provides an abstraction that helps users and developers
understand the relationships between different elements in
the database.

The data model serves as a blueprint for designing the
database and organizing information in a way that meets the
requirements of the application or system.

The relational data model is a conceptual framework for organizing
and representing data in a database.

It defines the structure of the data and the relationships between
entities (tables) within the database.
Example: Consider a "Person" entity and a "Passport" entity. Each
person can have only one passport, and each passport belongs to
only one person. So, the cardinality constraint here is one-to-one.

Example: Think of a "Department" entity and an "Employee"
entity. Each department can have many employees, but each
employee works in only one department. So, the cardinality
constraint here is one-to-many.

Using the same "Department" and "Employee" example, if you view it
from the perspective of employees, it's a many-to-one relationship:
many employees work in one department, but each employee works in
only one department.

Example: Consider a "Student" entity and a "Course" entity.
Each student can enroll in multiple courses, and each course
can have multiple students enrolled. So, the cardinality
constraint here is many-to-many.
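
In a relational database, a many-to-many relationship is usually stored in
a separate junction table. The sketch below shows the Student and Course
example using SQLite; the table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);
-- Junction table: one row per (student, course) enrollment.
CREATE TABLE enrollment (
    student_id INTEGER REFERENCES student(student_id),
    course_id  INTEGER REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many courses.
courses_of_asha = [r[0] for r in conn.execute(
    "SELECT c.title FROM course c JOIN enrollment e ON e.course_id = c.course_id "
    "WHERE e.student_id = 1 ORDER BY c.title")]

# One course, many students.
students_in_databases = [r[0] for r in conn.execute(
    "SELECT s.name FROM student s JOIN enrollment e ON e.student_id = s.student_id "
    "WHERE e.course_id = 10 ORDER BY s.name")]
```
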
Relational algebra is a mathematical query language for
relations.
The set difference operator helps identify the tuples that exist in
one relation but not in another, making it useful for various data
comparison and filtering tasks in relational databases.
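
Since a relation is a set of tuples, set difference can be illustrated
directly with Python sets. The two relations and their contents below are
made up for the example.

```python
# Relations modeled as sets of tuples with the same attributes (name, course).
enrolled = {("Asha", "Databases"), ("Ravi", "Databases"), ("Meena", "Networks")}
passed = {("Asha", "Databases"), ("Meena", "Networks")}

# enrolled - passed: tuples present in the first relation but not in the second.
not_passed = enrolled - passed
```

Both relations must have the same attributes for the difference to be
meaningful, and the operation is not symmetric: `passed - enrolled` is
empty here.
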
Left outer join: returns all rows from the left table and the matching
rows from the right table.

Notice that all rows from the EMPLOYEE table are included in the result,
and for rows where there is no match in the SALARY table, NULL values are
included for the Salary column.

This illustrates the difference between a natural join, which only includes
matching rows, and an outer join, which includes all rows from one table
even when they have no match in the other.
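
The difference can be reproduced with a small sketch in SQLite. The
EMPLOYEE and SALARY table names come from the text above, while the
columns and rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE salary   (emp_id INTEGER, salary INTEGER);
INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meena');
INSERT INTO salary   VALUES (1, 50000), (2, 42000);   -- no row for employee 3
""")

# Matching-only join: employee 3 has no salary row and disappears.
inner = conn.execute(
    "SELECT e.name, s.salary FROM employee e "
    "JOIN salary s ON s.emp_id = e.emp_id ORDER BY e.emp_id").fetchall()

# Left outer join: every employee is kept; a missing salary becomes NULL (None).
left_outer = conn.execute(
    "SELECT e.name, s.salary FROM employee e "
    "LEFT OUTER JOIN salary s ON s.emp_id = e.emp_id ORDER BY e.emp_id").fetchall()
```
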
Right outer join: returns all rows from the right table and the matching
rows from the left table.
It describes what data to retrieve from the
database by using variables, quantifiers, and
predicates to express the desired properties of
the tuples in the result.
Example: we want to retrieve the names of all students who are older than 20.
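
A declarative query of this kind, roughly "the set of t.name such that t
is a student and t.age > 20", maps closely onto a Python set
comprehension. The Student tuple type and the sample rows are assumptions.

```python
from collections import namedtuple

Student = namedtuple("Student", ["name", "age"])
students = {Student("Asha", 22), Student("Ravi", 19), Student("Meena", 25)}

# { t.name | t is in students and t.age > 20 }
names = {t.name for t in students if t.age > 20}
```

The comprehension states which tuples qualify (the predicate) and what to
return from each, without spelling out a step-by-step retrieval procedure.
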
Each student is enrolled in a specific course, so the
course is functionally dependent on the student's name.

Each course is taught by a specific professor, so the
professor's name is functionally dependent on the course.
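
Whether a functional dependency holds in a given set of rows can be
checked mechanically: rows that agree on the left-hand attributes must
agree on the right-hand ones. The helper below is a hypothetical sketch
with made-up sample data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts): rows that
    agree on every lhs attribute must agree on every rhs attribute."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this key; any later mismatch
        # violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True

rows = [
    {"student": "Asha", "course": "Databases", "professor": "Dr. Rao"},
    {"student": "Ravi", "course": "Databases", "professor": "Dr. Rao"},
    {"student": "Meena", "course": "Networks", "professor": "Dr. Iyer"},
]

student_to_course = fd_holds(rows, ["student"], ["course"])
course_to_professor = fd_holds(rows, ["course"], ["professor"])
course_to_student = fd_holds(rows, ["course"], ["student"])
```

A check like this can only refute a dependency on the data at hand.
Whether a dependency is meant to hold always is a design decision.
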
A join dependency is a property that indicates a specific relationship
between the columns in a table, where the table can be recreated by
joining multiple related tables with fewer columns.

Join dependency is based on the principle of lossless
decomposition, which ensures that no data is lost when splitting
a table into multiple related tables and then joining them back
together.

Optimizing database schema during the normalization process.

Improving query performance by reducing the amount of
redundant data.

Ensuring data integrity and consistency across systems.
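
The lossless-decomposition idea behind join dependency can be sketched in
plain Python: project a table onto smaller attribute sets, join the pieces
back, and compare with the original. All helper names and the sample data
are hypothetical.

```python
def canon(row):
    """Hashable, order-independent form of a row dict."""
    return tuple(sorted(row.items()))

def project(rows, attrs):
    """Projection: keep only attrs; duplicates vanish because the result is a set."""
    return {canon({a: dict(r)[a] for a in attrs}) for r in rows}

def natural_join(r1, r2):
    """Natural join: combine rows that agree on all shared attributes."""
    out = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(canon({**d1, **d2}))
    return out

table = {canon(r) for r in [
    {"student": "Asha", "course": "Databases", "professor": "Dr. Rao"},
    {"student": "Ravi", "course": "Databases", "professor": "Dr. Rao"},
    {"student": "Meena", "course": "Networks", "professor": "Dr. Iyer"},
]}

# Decomposition that shares the course attribute: joining restores the table.
enrolled = project(table, ["student", "course"])
taught_by = project(table, ["course", "professor"])
lossless = natural_join(enrolled, taught_by) == table

# Decomposition with no shared attribute: the join produces spurious rows.
only_students = project(table, ["student"])
lossy = natural_join(only_students, taught_by) != table
```
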


In 1NF (first normal form), each table cell should contain only a single
value, and each column should have a unique name.
In a traditional normalized database, we store data in separate logical
tables and attempt to minimize redundant data.

We may strive to have only one copy of each piece of data in a database.

Under denormalization, we decide that we’re okay with some redundancy and
some extra effort to update the database in order to get the efficiency
advantages of fewer joins.

Retrieving data is faster since we do fewer joins.

Queries to retrieve can be simpler (and therefore less likely to have
bugs), since we need to look at fewer tables.

Updates and inserts are more expensive.

Denormalization can make update and insert code harder to write.

Data may be inconsistent.

Data redundancy necessitates more storage.

In a system that demands scalability, like that of any
major tech company, we almost always use elements of
It is used to control the flow of execution.
If a given non-serial schedule can be converted into a serial schedule
by swapping its non-conflicting operations, then it is called a
conflict-serializable schedule.
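
In practice, conflict serializability is usually tested with a precedence
graph rather than by trying swaps: the schedule is conflict serializable
exactly when that graph has no cycle. The function below is a minimal
sketch of that test; the schedule encoding is an assumption made for the
example.

```python
def conflict_serializable(schedule):
    """schedule: list of (txn, op, item) with op in {"R", "W"}.
    Builds the precedence graph and returns True if it has no cycle."""
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            # Two operations conflict if they come from different transactions,
            # touch the same item, and at least one of them is a write.
            if t1 != t2 and x1 == x2 and "W" in (op1, op2):
                edges.add((t1, t2))

    nodes = {t for t, _, _ in schedule}
    # Repeatedly remove transactions with no incoming edge (topological sort).
    while nodes:
        free = {n for n in nodes
                if not any(b == n and a in nodes for a, b in edges)}
        if not free:
            return False  # the remaining transactions form at least one cycle
        nodes -= free
    return True

interleaved_ok = conflict_serializable([
    ("T1", "R", "A"), ("T2", "R", "B"), ("T1", "W", "A"), ("T2", "W", "B"),
])
interleaved_bad = conflict_serializable([
    ("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A"),
])
```

In the second schedule, T1 reads A before T2 writes it, and T2 writes A
before T1 does, so each transaction would have to come first and no
serial order exists.
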
View serializability ensures that the final result of concurrent
transactions is the same as if they were executed serially,
by ensuring that the read and write operations of each
transaction appear in the same order in the final result as they
do in some serial execution.

Example: Transaction T1 reads product prices and calculates discounts,
while Transaction T2 updates product prices.

View serializability ensures that T1 reads the latest product
prices updated by T2, preserving consistency in discount
calculations.
refers to the property that ensures that once a transaction
commits, its effects persist even in the event of system failures.

Transactions need to commit in the right order, especially if one
transaction reads data updated by another.

Transactions should only read data that's been fully updated and
committed by other transactions.

Imagine you're checking your bank balance after transferring money.
Your balance should only update after the transfer transaction is
committed.

Imagine an online banking application where users can transfer
funds between accounts.

Transaction T1 transfers $100 from account A to account B.

Transaction T2 checks the balance of account B after the transfer.

T2 should only commit after T1 commits to ensure it reads the
updated balance reflecting the $100 transfer.
The write timestamp (WTS) of a data item is the timestamp of the last
transaction that successfully wrote it.

It helps in determining the order of conflicting write operations
and enforcing transaction serialization.
Fine-grained locking is suitable for scenarios where the data items
are relatively small and transactions frequently access different
parts of the data simultaneously without conflicting with each
other.

In coarse-grained locking, locks are applied at a higher level,
typically at the level of larger data structures such as entire disk
blocks or even entire tables.

Each data structure is associated with a single lock, meaning that
transactions must acquire the lock for the entire structure even if
they only need to access a small portion of it.

Coarse-grained locking reduces concurrency because transactions may be
forced to wait for access to the entire data structure even if they
only need to access a small portion of it.

It requires less overhead in terms of managing fewer locks, reduced
lock and unlock operations, and less storage space for the locks.

It is suitable for scenarios where the data items are relatively large
and transactions tend to access most or all of the data structure at
once, reducing the likelihood of conflicts between transactions.
