DBMS
Data abstraction in DBMS plays a crucial role in simplifying and securing data
access while managing its complexity. It essentially hides the intricacies of
how data is stored and accessed from users, providing a clear and concise
interface for interaction. Here's a breakdown of its key aspects:
Levels of Abstraction:
● Physical Level: The lowest level, dealing with the actual physical
storage of data like disk blocks and pointers. Users have no direct
access to this level.
● Logical Level: Defines the overall database structure, including tables,
columns, data types, and relationships. Users interact with this level
through queries and data manipulation languages (DMLs).
● View Level: Presents customized subsets of data based on specific user
needs and access privileges. This allows different users to see different
versions of the same data, enhancing security and privacy.
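The view level is usually realized with SQL views. A minimal sketch, assuming a
hypothetical employees table and a restricted role named directory_user (both
illustrative), where salary details are hidden from ordinary users:

    CREATE VIEW employee_directory AS
    SELECT emp_id, name, department      -- salary column deliberately omitted
    FROM employees;

    GRANT SELECT ON employee_directory TO directory_user;  -- users query the view, not the base table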
Data Definition Language (DDL) is a subset of SQL used to create, modify, and
delete database objects. Think of it as the architect's blueprint for your
database, defining the structure and organization of your data. Unlike Data
Manipulation Language (DML), which focuses on retrieving and manipulating
data, DDL deals with the "what" of your data (its structure) rather than with
how it is accessed.
● Object Creation: CREATE statements define new tables, views, indexes, and
other objects.
● Object Modification: ALTER statements change the structure of existing
objects, for example adding a column.
● Object Deletion: DROP statements remove objects and their data from the
database.
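A minimal DDL sketch covering all three operations, assuming a hypothetical
employees table (names and columns are illustrative):

    CREATE TABLE employees (             -- object creation
        emp_id     INT PRIMARY KEY,
        name       VARCHAR(100) NOT NULL,
        department VARCHAR(50)
    );

    ALTER TABLE employees ADD COLUMN hire_date DATE;   -- object modification

    DROP TABLE employees;                              -- object deletion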
Data Manipulation Language (DML) is your magic wand for interacting with the
actual data stored within your database. It's the counterpart to DDL, which
focuses on defining the "what" (database structure), while DML deals with the
"how" (manipulating data). Think of it as the instructions you give your
database to retrieve, insert, update, or delete data.
● SELECT: This retrieves data from one or more tables based on specified
criteria. You can filter, sort, and aggregate data to extract valuable
insights.
● INSERT: This adds new rows of data into a table, following the defined
schema and constraints.
● UPDATE: This modifies existing data in a table, changing specific values
or columns based on conditions.
● DELETE: This removes unwanted rows of data from a table permanently.
Typical DML tasks include:
● Running a query to find all customers who made purchases in the last
month.
● Adding a new employee record to a company database.
● Updating the price of a product in an online store.
● Deleting outdated order records from a database.
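A hedged sketch of these four tasks, assuming hypothetical customers, orders,
employees, and products tables; exact date-arithmetic syntax varies by DBMS:

    SELECT c.customer_id, c.name
    FROM customers c
    JOIN orders o ON o.customer_id = c.customer_id
    WHERE o.order_date >= CURRENT_DATE - INTERVAL '1' MONTH;   -- purchases in the last month

    INSERT INTO employees (emp_id, name, department)
    VALUES (1001, 'Asha Rao', 'Sales');                        -- add a new employee record

    UPDATE products SET price = 499.00 WHERE product_id = 42;  -- change a product's price

    DELETE FROM orders WHERE order_date < DATE '2020-01-01';   -- remove outdated order records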
Data models are blueprints for organizing and accessing data in databases.
Here's a comparison of four key models:
1. Hierarchical Model:
2. Network Model:
3. Relational Model:
The best model depends on your specific needs and data complexity.
Integrity constraints are the rules that ensure the validity, consistency, and
accuracy of data within a database. They act as safeguards, preventing invalid
data from entering the system, and maintaining the logical relationships
between various data elements.
1. Domain Constraints:
● These define the valid values that can be stored in a specific column.
For example, a "customer age" column might only allow values between
1 and 120.
● Types:
○ Data type constraints: Specify the data type like integer, string,
date, etc.
○ Range constraints: Limit the range of acceptable values (e.g., age
between 18 and 65).
○ Check constraints: Define custom validation rules for specific
data formats or patterns.
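A small sketch of these constraint types on a hypothetical customers table;
CHECK constraints are widely supported, though exact syntax can vary:

    CREATE TABLE customers (
        customer_id INT PRIMARY KEY,
        name        VARCHAR(100) NOT NULL,                 -- data type constraint
        age         INT CHECK (age BETWEEN 1 AND 120),     -- range constraint
        email       VARCHAR(255) CHECK (email LIKE '%@%')  -- simple custom pattern check
    );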
Data manipulation operations are the actions you take to interact with the data
stored in a database. These operations cover a wide range of tasks, from
simply retrieving specific data points to transforming and analyzing entire
datasets.
Basic Operations:
● Read (SELECT): This retrieves data from one or more tables based on
specified criteria. You can filter, sort, and aggregate data to extract
valuable insights.
● Create (INSERT): This adds new rows of data into a table, following the
defined schema and constraints.
● Update (UPDATE): This modifies existing data in a table, changing
specific values or columns based on conditions.
● Delete (DELETE): This removes unwanted rows of data from a table
permanently.
Advanced Operations:
Beyond these basic operations, DML also supports joins that combine rows from
several tables, aggregation with GROUP BY, subqueries, and set operations such
as UNION.
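A short sketch of a join with aggregation, assuming hypothetical customers and
orders tables:

    SELECT c.name, COUNT(*) AS order_count, SUM(o.total) AS total_spent
    FROM customers c
    JOIN orders o ON o.customer_id = c.customer_id
    GROUP BY c.name
    HAVING COUNT(*) > 5            -- only customers with more than five orders
    ORDER BY total_spent DESC;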
For storing data, there are different types of storage options available. These
storage types differ from one another in speed, cost, and accessibility. The
following types of storage devices are used for storing data:
○ Primary Storage
○ Secondary Storage
○ Tertiary Storage
Primary Storage (RAM)
Fastest access, but volatile and expensive. Used for active data sets in use.
Primary storage is the area that offers the quickest access to stored data. It
is also known as volatile storage because it does not store data permanently:
if the system loses power or crashes, the data is lost. Main memory and cache
are the types of primary storage.
○ Main Memory: This is where data is held while it is being operated on, and
where each instruction of the computer is handled. Main memory can store
gigabytes of data but is usually too small to hold an entire database. It
loses all of its contents if the system shuts down because of a power
failure or other reasons.
○ Cache: The cache is one of the costliest storage media, but it is also the
fastest. It is a tiny storage area that is usually managed by the computer
hardware. When designing algorithms and query processors for data
structures, designers take cache effects into account.
Secondary Storage (Hard Disk Drives, SSDs)
Slower access than RAM, but persistent and more affordable. Used for storing
large datasets.
Secondary storage is also called online storage. It is the storage area that
allows the user to save and store data permanently. This type of memory does
not lose data due to a power failure or system crash, which is why it is also
called non-volatile storage.
There are some commonly described secondary storage media which are available
in almost every type of computer system:
○ Flash Memory: Flash memory stores data in devices such as USB (Universal
Serial Bus) keys, which are plugged into the USB slots of a computer. USB
keys make it easy to transfer data between systems, although they vary in
capacity. Unlike main memory, flash memory retains stored data even after a
power cut. It is commonly used in server systems for caching frequently
used data, which improves performance, and it can hold larger amounts of
data than main memory.
○ Magnetic Disk Storage: This type of storage media is also known as online
storage. A magnetic disk stores data for a long time and can hold an entire
database. The computer system is responsible for loading data from the disk
into main memory so it can be accessed, and if any operation modifies the
data, the modified data must be written back to the disk. A great strength
of magnetic disks is that data survives a system crash or power failure;
however, a failure of the disk itself can destroy the stored data.
Tertiary Storage (Tape Drives, Cloud Storage)
Very slow access, but extremely inexpensive. Used for long-term archival
purposes.
Tertiary storage is external to the computer system. It has the slowest speed
but can store very large amounts of data. It is also known as offline storage
and is generally used for data backup. The following tertiary storage devices
are available:
○ Tape Storage: Tape is cheaper than disk and is generally used for archiving
or backing up data. It provides slow access because data is read
sequentially from the start, so tape storage is also known as
sequential-access storage. Disk storage, by contrast, is known as
direct-access storage because data can be read directly from any location
on the disk.
Imagine sorting all the books in a library by author's name instead of browsing
each shelf randomly. An index in a DBMS does something similar. It acts as a
sorted data structure based on specific columns, allowing for rapid
identification and retrieval of data rows that match a query's criteria.
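A minimal sketch of creating and using an index, assuming a hypothetical
customers table; most relational systems accept this syntax:

    CREATE INDEX idx_customers_last_name ON customers (last_name);

    -- Queries filtering on the indexed column can locate matching rows
    -- without scanning the whole table:
    SELECT * FROM customers WHERE last_name = 'Sharma';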
B-trees are a self-balancing tree data structure designed for efficient data
storage and retrieval, particularly in databases. They offer several advantages
over simpler tree structures like binary search trees, especially when dealing
with large datasets.
Sr. No.   Algorithm   Time Complexity
1.        Search      O(log n)
2.        Insert      O(log n)
3.        Delete      O(log n)
● Multiple children per node: Unlike binary search trees which have at
most two child nodes, B-tree nodes can have a minimum and maximum
number of children (often denoted by "t"). This allows for storing more
data points in each node and reducing overall tree height.
● Balanced structure: B-trees automatically adjust their structure to
maintain a roughly consistent height across the tree. This ensures
efficient searches, regardless of the data distribution, because the
number of levels to traverse remains predictable.
● Ordered data: Data within each node is kept sorted in ascending order.
This facilitates faster searching by quickly narrowing down the potential
location of the target data point.
● Dynamic insertion and deletion: B-trees can efficiently handle data
insertion and deletion without compromising the balanced structure.
They automatically redistribute data or split/merge nodes to maintain
order and search performance.
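Most relational databases implement their default indexes as B-trees (or
B+-trees). As a hedged illustration, MySQL lets you name the index type
explicitly; the orders table here is hypothetical:

    CREATE INDEX idx_orders_order_date
    ON orders (order_date)
    USING BTREE;   -- MySQL syntax; most systems use a B-tree by default anyway

    -- Range queries benefit from the ordered, balanced structure of the B-tree:
    SELECT * FROM orders
    WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31';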
Hashing in DBMS plays a crucial role in optimizing data access and retrieval.
It's a powerful technique that leverages hash functions to transform large,
variable-length data into short, fixed-length strings called hash values. These
values essentially act as fingerprints for your data, enabling quick
identification and comparison, especially within large datasets.
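As a hedged illustration, PostgreSQL can build a hash index instead of its
default B-tree; the users table and email column are hypothetical:

    CREATE INDEX idx_users_email_hash
    ON users USING hash (email);    -- PostgreSQL hash index

    -- An equality lookup hashes the search key and jumps straight to the matching bucket:
    SELECT * FROM users WHERE email = 'someone@example.com';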
There are two main hashing schemes:
○ Static Hashing
○ Dynamic Hashing: the dynamic hashing method is used to overcome the
problems of static hashing, such as bucket overflow.
Static Hashing:
● Concept: The number of hash buckets and the hash function are fixed at
the time the hash table is created. Data is evenly distributed across the
pre-defined number of buckets based on their hash values.
● Advantages:
○ Simple and efficient: Easy to implement and understand, offering
predictable performance for operations like insertion and search.
○ Less overhead: Requires minimal memory and processing
resources for maintenance.
○ Suitable for static datasets: Works well for situations where the
data size and access patterns are relatively stable.
● Disadvantages:
○ Performance bottleneck: Can suffer from collisions and
performance degradation as the data grows and fills up buckets
unevenly.
○ Limited scalability: Difficult to adapt to changes in data size or
access patterns, requiring rebuilding the entire hash table if
significant changes are needed.
○ Wasteful space: May lead to empty buckets if the data distribution
is uneven, potentially wasting storage space.
Dynamic Hashing:
● Concept: The number of buckets is not fixed; buckets are added or removed as
the data grows and shrinks, which avoids the bucket-overflow problems of
static hashing.
Shared Lock: A shared lock is also known as a read lock, which allows multiple
transactions to read the same data item at the same time. A transaction
which is holding a shared lock can only read the data item; it cannot update it.
Exclusive Lock : Exclusive lock is also known as the write lock. Exclusive
lock allows a transaction to update a data item. Only one transaction can
hold the exclusive lock on a data item at a time.
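A hedged sketch of how these locks surface in SQL; the FOR SHARE and FOR UPDATE
clauses are PostgreSQL/MySQL-style row locks, and the accounts table is
hypothetical:

    BEGIN;
    SELECT balance FROM accounts WHERE acct_id = 1 FOR SHARE;    -- shared (read) lock
    COMMIT;

    BEGIN;
    SELECT balance FROM accounts WHERE acct_id = 1 FOR UPDATE;   -- exclusive (write) lock
    UPDATE accounts SET balance = balance - 100 WHERE acct_id = 1;
    COMMIT;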
Two-Phase Locking Protocol (2PL) – a cornerstone of transaction processing
in database management systems! It's one of the most widely used
concurrency control mechanisms for ensuring data consistency and
preventing interference between concurrent transactions accessing the same
data.
2PL divides a transaction's execution into two phases. In the growing phase,
once a transaction acquires a lock, that lock cannot be released until the
transaction has obtained all the locks it needs on every data
item. Once the transaction starts releasing the locks (the shrinking phase),
it cannot acquire any new locks.
Lock Types:
● Shared Lock: Allows other transactions to read the data, but prevents
them from modifying it.
● Exclusive Lock: Allows only the holding transaction to both read and
write the data, blocking other transactions from accessing it in any way.
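A hedged illustration of strict two-phase locking in SQL, where locks are
acquired as statements run and released only at COMMIT; the accounts table and
FOR UPDATE row locks are illustrative (most engines take such locks implicitly):

    BEGIN;                                                         -- growing phase starts
    SELECT balance FROM accounts WHERE acct_id = 1 FOR UPDATE;     -- acquire exclusive lock on account 1
    SELECT balance FROM accounts WHERE acct_id = 2 FOR UPDATE;     -- acquire exclusive lock on account 2
    UPDATE accounts SET balance = balance - 100 WHERE acct_id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE acct_id = 2;
    COMMIT;                                                        -- shrinking phase: all locks released together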
Benefits of 2PL:
● Guarantees conflict-serializable schedules, so concurrent transactions
produce the same result as some serial execution, preserving consistency.
Advantages of Concurrency
In general, concurrency means that more than one transaction can work on a
system at the same time. The advantages of a concurrent system are:
● Reduced waiting time
● Better response time
● Higher resource utilization
● Greater efficiency (throughput)
Disadvantages of Concurrency
● Overhead: Implementing concurrency control requires additional processing
and memory to manage locks and to keep the results accurate.
ACID Properties
A transaction is a single logical unit of work that accesses and possibly
modifies the contents of a database. Transactions access data using read
and write operations.
In order to maintain consistency in a database, before and after the
transaction, certain properties are followed. These are called ACID
properties.
Atomicity: Imagine a bank transfer. Atomicity guarantees that either the entire
transfer happens successfully (money deducted from sender, credited to
receiver) or not at all. No partial transfers! This prevents inconsistent states
and incomplete changes.
Consistency: Think of updating a shopping cart. Consistency ensures that the
database remains in a valid state after a transaction. For example, updating
product availability only after successfully deducting the quantity from
inventory maintains consistency. In the bank transfer example above, the total
amount of money before and after the transaction must be the same.
Isolation: Picture multiple users booking movie tickets at the same time.
Isolation ensures that one user's booking doesn't interfere with another's.
Even if multiple bookings happen concurrently, each appears to complete in
its own isolated environment, preventing overbooking or inconsistent seat
allocations.
Durability: Imagine a power outage during a purchase. Durability guarantees
that once a transaction is committed (successfully completed), its changes are
permanently stored in the database, even if the system crashes. No more
worrying about lost data!
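A hedged sketch of the bank-transfer transaction, assuming a hypothetical
accounts table; either both updates take effect or neither does:

    BEGIN;
    UPDATE accounts SET balance = balance - 500 WHERE acct_id = 'sender';
    UPDATE accounts SET balance = balance + 500 WHERE acct_id = 'receiver';
    COMMIT;      -- durability: once committed, the transfer survives a crash
    -- On any error before COMMIT, issuing ROLLBACK undoes the partial transfer (atomicity).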
Serializability:
● A schedule represents the interleaved operations of different transactions
in a concurrent run.
● Two schedules are considered equivalent if they produce the same final
database state.
● A schedule is serializable if it is equivalent to some serial schedule.
Types of Serializability:
● Strict serializability: The most restrictive form, ensuring the final state is
identical to a serial schedule where transactions are executed in order
of their start times.
● Conflict serializability: Allows more flexibility, as long as conflicting
operations from different transactions appear in the same relative order
in all equivalent serial schedules.
Benefits of Serializability:
● Concurrent transactions behave as if they had run one after another, so the
database stays consistent even when many transactions are interleaved.
Types of Failures:
● Transaction failure: a single transaction aborts because of a logical error
or a deadlock.
● System crash: power loss or a software fault wipes out the contents of main
memory.
● Disk failure: the storage media itself is damaged and the data on it is lost.
Log-Based Recovery:
Transaction logs track changes made to the database. By replaying the redo
log from the point of failure, the database can be brought back to a consistent
state.
Undo Logging (Rollback/Undo):
● Benefits:
○ Efficient recovery: Undoing changes directly is often faster than
replaying redo logs, especially for short-lived transactions.
○ Minimizes data loss: Only unwanted changes are reversed,
potentially preserving some recent data compared to restoring
from a backup.
○ Easy to understand: The concept of undoing actions is intuitive
and easy to comprehend.
● Drawbacks:
○ Increased overhead: Maintaining undo logs adds overhead to the
system, consuming storage space and requiring processing
power to keep them updated.
○ Limited effectiveness: Undo logs typically only store information
for recent transactions. Recovering from older failures might
require other techniques like backups.
Redo Logging (Commit/Redo):
● Benefits:
○ Guaranteed consistency: Redo logs ensure that only successful
transactions are applied, guaranteeing data integrity and
consistency even after failures.
○ Scalability: Redo logs can be large enough to store information
for all recent transactions, allowing recovery from older failures
compared to undo logs.
○ Efficient for long-lived transactions: Replaying committed
changes can be faster than reversing a large number of undo
operations for complex or long-running transactions.
● Drawbacks:
○ Increased overhead: Redo logs can be large and require
significant storage space and processing power to manage.
○ Potential data loss: If a failure occurs before the transaction is
logged, its changes might be lost and need to be recovered from
backups.
○ More complex: The concept of replaying logs might be less
intuitive compared to simply undoing actions.
Feature            Rollback/Undo                           Commit/Redo
Recovery action    Reverses changes of uncommitted work    Replays changes of committed work
Best suited for    Short-lived transactions                Long-lived or complex transactions
Main risk          Covers only recent transactions         Changes lost if failure occurs before logging
Unit 5
Database security encompasses everything you do to safeguard your
database from unauthorized access, malicious attacks, and accidental or
intentional damage. It's like a sturdy vault securing your most valuable
information, ensuring its confidentiality, integrity, and availability.
Authentication:
● Concept: Verifying that users are who they claim to be before they are
allowed into the database. Think of it as checking ID at the door.
● Techniques: Passwords, multi-factor authentication, biometrics, and
certificate- or token-based logins.
● Importance: Keeps unknown or impersonating users out, forming the first
line of defence for the data.
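A minimal sketch of creating a database login, in PostgreSQL-style SQL (other
systems use variants such as CREATE USER ... IDENTIFIED BY); the user name is
hypothetical:

    CREATE USER report_app WITH PASSWORD 'change-me';   -- the DBMS can now verify this login
    -- Connecting then requires presenting this username and password
    -- (or a stronger factor such as a client certificate, where supported).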
Authorization:
● Concept: Determining what actions a user can perform after they've
been authenticated. Think of it like assigning roles and permissions in a
team project.
● Techniques: Access control lists (ACLs), user roles and groups, and
resource-based policies where permissions are assigned based on specific
resources.
● Importance: Limits user actions based on their roles and needs,
preventing unauthorized modifications or misuse of data.
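A minimal authorization sketch using standard SQL GRANT/REVOKE; the user and
table names are hypothetical:

    GRANT SELECT, INSERT ON orders TO clerk_user;   -- clerk may read and add orders
    GRANT SELECT ON orders TO auditor_user;         -- auditor may only read
    REVOKE INSERT ON orders FROM clerk_user;        -- later withdraw the insert right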
Access Control:
Discretionary Access Control (DAC) leaves it to the owner of a data object to
decide who else may access it.
Strengths:
● Simple to implement: Users have direct control over their data and can
easily share it with others.
● User autonomy: Users can manage access based on their own needs
and preferences.
● Flexible: Can be adapted to various situations and user groups.
Weaknesses:
● Security depends on individual owners making sound decisions, so permissions
are easy to misconfigure and hard to audit across a large organization.
Mandatory Access Control (MAC) instead enforces a central policy: data carries
classification labels, users carry clearances, and the system rather than the
owner decides who may access what.
Strengths:
● Centrally enforced policy gives strong, consistent protection for sensitive
data.
Weaknesses:
● Rigid and administratively heavy, making it a poor fit for environments that
need flexible, ad hoc sharing.
Role-Based Access Control (RBAC) takes a different direction than DAC and
MAC, offering a well-structured approach to access control. Imagine assigning
responsibilities and permissions based on roles in a play, with actors having
access to props and areas relevant to their assigned roles. RBAC works
similarly, assigning pre-defined roles with associated permissions to users,
granting access based on their assigned roles.
Strengths:
● Scales well: permissions are managed per role rather than per user, which
simplifies administration and supports the principle of least privilege.
Weaknesses:
● Can suffer from role explosion when many slightly different roles are
needed, and it is less fine-grained than per-user or per-resource controls.
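A hedged RBAC sketch in PostgreSQL-style SQL; the role, user, and table names
are illustrative:

    CREATE ROLE sales_rep;                           -- define the role once
    GRANT SELECT, INSERT ON orders TO sales_rep;     -- attach permissions to the role
    GRANT SELECT ON products TO sales_rep;

    CREATE USER asha WITH PASSWORD 'change-me';      -- then assign the role to users
    GRANT sales_rep TO asha;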
SQL injection usually occurs when you ask a user for input, such as their
username or user ID, and instead of a name or ID the user supplies an SQL
fragment that you then unknowingly run against your database.
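A hedged illustration of what the injected input does to the query, assuming a
hypothetical users table and an attacker who types ' OR '1'='1 as the user ID:

    -- Intended query, built by pasting the user's input into the SQL string:
    SELECT * FROM users WHERE user_id = 'alice';

    -- Query that actually runs after the malicious input is pasted in:
    SELECT * FROM users WHERE user_id = '' OR '1'='1';   -- always true: every row is returned

    -- Defence: pass the input as a bound parameter instead of splicing it into the SQL text.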