
Serializability in DBMS




In this article, we explain the concept of serializability, why it matters so much in a DBMS, and illustrate it with examples, finishing with a discussion of its importance. A well-designed DBMS forms the foundation of most modern applications, providing high-performance and reliable storage for application data.
What is a serializable schedule, and what is it used
for?
A non-serial schedule is said to be serializable if it can be transformed into an equivalent serial schedule. Simply put, a non-serial schedule is called a serializable schedule if it yields the same result as some serial schedule.
Non-serial Schedule
A non-serial schedule is one in which the operations of several transactions are interleaved. Because schedules are used to carry out real database workloads, multiple transactions run at once, and they may be working on the same data items. It is therefore crucial that non-serial schedules be serializable, so that our database is consistent both before and after the transactions are executed.
Example:
Transaction-1       Transaction-2
R(a)
W(a)
                    R(b)
                    W(b)
R(b)
                    R(a)
W(b)
                    W(a)
We can observe that Transaction-2 begins its execution before Transaction-1 has finished, and both are working on the same data items, "a" and "b", interchangeably. Here "R" denotes a read and "W" a write.
Serializability testing
We can use the Serialization Graph, also called the Precedence Graph, to examine a schedule's serializability. A serialization graph is a directed graph built from all the transactions of a schedule.

Precedence Graph

It can be described as a graph G(V, E) with vertices V = {V1, V2, V3, …, Vn} and directed edges E = {E1, E2, E3, …, En}. Each vertex represents a transaction, and an edge Ti -> Tj means that transaction Ti performs an operation (a read or a write) that conflicts with, and comes before, an operation of transaction Tj.
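To make this concrete, here is a minimal Python sketch (not from the original article) that builds a precedence graph from a schedule and tests it for cycles. The encoding of a schedule as a list of (transaction, operation, data item) tuples is an assumption made purely for illustration.

```python
# Minimal sketch: build a precedence graph from a schedule and test it for
# cycles. A schedule is assumed to be a list of (transaction, op, item)
# tuples where op is "R" or "W"; this encoding is an illustrative choice.

def precedence_graph(schedule):
    """Return adjacency sets {Ti: {Tj, ...}} with an edge Ti -> Tj whenever
    an operation of Ti conflicts with a later operation of Tj."""
    edges = {txn: set() for txn, _, _ in schedule}
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges[ti].add(tj)
    return edges

def is_acyclic(edges):
    """Depth-first search with white/grey/black colouring; a grey-to-grey
    (back) edge means the graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in edges}

    def visit(node):
        colour[node] = GREY
        for nxt in edges[node]:
            if colour[nxt] == GREY:
                return False
            if colour[nxt] == WHITE and not visit(nxt):
                return False
        colour[node] = BLACK
        return True

    return all(visit(n) for n in edges if colour[n] == WHITE)
```

An edge is added exactly when two operations satisfy the conflict conditions discussed in the next section.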
Types of Serializability
There are two ways to check whether any non-serial schedule is serializable.
Types of Serializability – Conflict & View

1. Conflict serializability
Conflict serializability is a form of serializability that preserves the consistency of a database by constraining the order in which conflicting operations on the same data item may execute. A schedule is conflict serializable if it can be transformed into a serial schedule by swapping only non-conflicting operations. For example, consider an orders table and a customers table: each order is associated with one customer even though a single customer may place many orders, and many transactions may touch the same rows concurrently, so their conflicting operations must be kept in order. However, certain conditions must hold for two operations to conflict. Here are those conditions:
1. The two operations must belong to different transactions.
2. Both operations must access the same data item.
3. At least one of the two operations must be a write operation.
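These three conditions translate directly into a small predicate. A sketch, using the same illustrative tuple encoding as the precedence-graph example above:

```python
# Sketch: the three conflict conditions as a predicate, using the same
# illustrative (transaction, op, item) tuples as the precedence-graph code.

def is_conflict(op1, op2):
    t1, kind1, item1 = op1
    t2, kind2, item2 = op2
    return (t1 != t2                    # different transactions
            and item1 == item2          # same data item
            and "W" in (kind1, kind2))  # at least one write

# Example: W(a) by t1 followed by R(a) by t2 is a conflict.
assert is_conflict(("t1", "W", "a"), ("t2", "R", "a"))
```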
Example
Three transactions, t1, t2, and t3, run concurrently in a schedule "S". Let's construct its precedence graph.
Transaction-1 (t1)    Transaction-2 (t2)    Transaction-3 (t3)
R(a)
                      R(b)
                                            R(b)
                      W(b)
W(a)
                                            W(a)
                      R(a)
                      W(a)
Because the precedence graph has no cycles (it is a DAG), the schedule is conflict serializable. We can also determine an equivalent serial order of the transactions from the graph.

DAG of transactions

As there is no incoming edge on Transaction 1, t1 will be executed first. t3 will run second because it depends only on t1. Due to its dependence on both t1 and t3, t2 will be executed last.
Therefore, the serial schedule's equivalent order is: t1 –> t3 –> t2
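For illustration, running the earlier precedence-graph sketch on this schedule (same assumed encoding) reproduces that order via a topological sort:

```python
# The example schedule S, encoded with the illustrative tuple format above.
schedule_s = [
    ("t1", "R", "a"), ("t2", "R", "b"), ("t3", "R", "b"), ("t2", "W", "b"),
    ("t1", "W", "a"), ("t3", "W", "a"), ("t2", "R", "a"), ("t2", "W", "a"),
]

edges = precedence_graph(schedule_s)   # t1 -> {t2, t3}, t3 -> {t2}, t2 -> {}
print(is_acyclic(edges))               # True, so S is conflict serializable

# A topological sort of the DAG yields the equivalent serial order.
def topo_order(edges):
    indegree = {n: 0 for n in edges}
    for n in edges:
        for m in edges[n]:
            indegree[m] += 1
    order = []
    ready = [n for n in edges if indegree[n] == 0]
    while ready:
        n = ready.pop()
        order.append(n)
        for m in edges[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    return order

print(topo_order(edges))               # ['t1', 't3', 't2']
```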
Note: A conflict serializable schedule is unquestionably consistent. However, a schedule that is not conflict serializable may or may not be serializable. To examine its serial behavior further, we employ the idea of View Serializability.
2. View Serializability
View serializability is a form of serializability in which a schedule must produce the same results as some serial execution of its transactions: every read must observe the same values, and the final state of each data item must match. Like conflict serializability, view serializability is concerned with avoiding database inconsistency, but it compares schedules by what each transaction observes rather than by their conflicting operations.
To understand view serializability in DBMS further, consider two schedules S1 and S2 built over the same two transactions T1 and T2. For the two schedules to be equivalent, three conditions must hold. These conditions are listed below.
1. The first prerequisite is that the same set of transactions appears in both schedules. The same group of transactions must appear in schedules S1 and S2; if one schedule commits a transaction that the other does not, the schedules are not equivalent to one another.
2. The second requirement is that the schedules must not differ in their write operations. If schedule S1 contains two write operations on a data item while schedule S2 contains only one, the two schedules are not equivalent; the number of write operations must be the same in both schedules, although a difference in the number of read operations alone does not break equivalence.
3. The third and final requirement is that the two schedules must not conflict in their execution order for a single data item. Assume, for instance, that transaction T1 in schedule S1 and transaction T2 in schedule S2 both write data item A; if they write it in a different order, the schedules are not equivalent. The schedules are considered equivalent only when the writes to each data item occur in the same order.
What is view equivalency?
Schedules S1 and S2 must satisfy these three requirements in order to be view equivalent:
1. Initial read: the first read of each data item must be performed by the same transaction in both schedules. For instance, if transaction t1 reads "A" first in schedule S1, then t1 must also read A first in schedule S2.
2. Final write: the final write of each data item must be performed by the same transaction in both schedules. As an illustration, if transaction t1 performed the last write of A in S1, it must also perform the final write of A in S2.
3. Intermediate reads must follow suit: if in S1 transaction t1 reads a value of A written by t2, then in S2 t1 must also read the value of A written by t2.
Testing view serializability is the process of determining whether a schedule is view equivalent to some serial schedule.
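These conditions can be checked mechanically. Below is a minimal sketch, again under the assumed (transaction, op, item) tuple encoding, that compares the operations, the writer observed by every read (initial and intermediate), and the final writes:

```python
# Sketch: check view equivalence of two schedules under the assumed
# (transaction, "R"/"W", item) tuple encoding used earlier.

def reads_from(schedule):
    """For each transaction, list which writer every one of its reads
    observes (None means it reads the initial database value)."""
    last_writer, observed = {}, {}
    for txn, op, item in schedule:
        if op == "R":
            observed.setdefault(txn, []).append((item, last_writer.get(item)))
        else:
            last_writer[item] = txn
    return observed

def final_writes(schedule):
    """Map each data item to the transaction that wrote it last."""
    last_writer = {}
    for txn, op, item in schedule:
        if op == "W":
            last_writer[item] = txn
    return last_writer

def view_equivalent(s1, s2):
    return (sorted(s1) == sorted(s2)                   # same operations
            and reads_from(s1) == reads_from(s2)       # initial/intermediate reads
            and final_writes(s1) == final_writes(s2))  # final writes
```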
Example
We have a schedule “S” with two concurrently running transactions, “t1” and
“t2.”
Schedule – S:
Transaction-1 (t1)    Transaction-2 (t2)
R(a)
W(a)
                      R(a)
                      W(a)
R(b)
W(b)
                      R(b)
                      W(b)
By swapping the interleaved read-write blocks of the two transactions, let's construct its view-equivalent schedule (S').
Schedule – S’:
Transaction-1 (t1)    Transaction-2 (t2)
R(a)
W(a)
R(b)
W(b)
                      R(a)
                      W(a)
                      R(b)
                      W(b)
Since a view-equivalent serial schedule (S') exists, S is a view serializable schedule.
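Feeding S and S' through the view-equivalence sketch above confirms this:

```python
# Schedules S and S' from the tables above, in the assumed encoding.
s = [("t1", "R", "a"), ("t1", "W", "a"), ("t2", "R", "a"), ("t2", "W", "a"),
     ("t1", "R", "b"), ("t1", "W", "b"), ("t2", "R", "b"), ("t2", "W", "b")]

s_prime = [("t1", "R", "a"), ("t1", "W", "a"), ("t1", "R", "b"), ("t1", "W", "b"),
           ("t2", "R", "a"), ("t2", "W", "a"), ("t2", "R", "b"), ("t2", "W", "b")]

print(view_equivalent(s, s_prime))  # True, so S is view serializable
```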
Note: A conflict serializable schedule is always view serializable, but the converse is not always true.

Advantages of Serializability
1. Execution is predictable: Under serializable execution, concurrent transactions behave as if they ran one at a time, so the DBMS holds no surprises: no data loss or corruption occurs, and all variables are updated as intended.
2. Easier to reason about and debug: Because each transaction can be understood in isolation, it is much simpler to understand and troubleshoot each database thread. This can greatly simplify the debugging process, since concurrent interleavings are no longer a concern.
3. Lower costs: Serializability can reduce the cost of the hardware required for the efficient operation of the database. It may also lower the cost of developing the software.
4. Increased performance: Since serializable executions give developers the opportunity to optimize their code for performance, they occasionally outperform non-serializable equivalents.
For a DBMS transaction to be regarded as serializable, it must adhere to the ACID properties. Serializability in DBMS comes in a variety of forms, each with advantages and disadvantages of its own. Most of the time, choosing the best form of serializability involves a trade-off between performance and correctness.
Making the wrong choice of serializability can result in database issues that are challenging to track down and resolve. This guide should have given you a better understanding of how serializability in DBMS works and the different types that are available.
FAQs on Serializability in DBMS
Q.1: How does a DBMS achieve serializability?
Answer:
A DBMS achieves serializability through concurrency control techniques such as locking, timestamp ordering, and optimistic concurrency control. These methods ensure that transactions are carried out in a serializable order while still permitting simultaneous access to the database.
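As one concrete illustration, here is a minimal sketch of lock-based concurrency control in the spirit of strict two-phase locking; the LockManager and Transaction classes are invented for this example and are not a real DBMS API.

```python
import threading

# Illustrative sketch of strict two-phase locking (2PL): a transaction
# acquires a lock on every item before touching it (growing phase) and
# releases all locks only at commit (shrinking phase), which forces a
# serializable interleaving. The classes below are invented for the example.

class LockManager:
    def __init__(self):
        self._locks = {}                  # data item -> threading.Lock
        self._guard = threading.Lock()    # protects the _locks dict itself

    def lock_for(self, item):
        with self._guard:
            return self._locks.setdefault(item, threading.Lock())

class Transaction:
    def __init__(self, manager):
        self.manager = manager
        self.held = []

    def access(self, item):
        """Acquire the item's lock before reading or writing it."""
        lock = self.manager.lock_for(item)
        if lock not in self.held:
            lock.acquire()                # growing phase: only ever acquire
            self.held.append(lock)

    def commit(self):
        for lock in self.held:            # shrinking phase: release at commit
            lock.release()
        self.held.clear()
```

A real DBMS would also distinguish shared (read) from exclusive (write) locks and detect deadlocks; both are omitted here for brevity.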

Q.2: How is View Serializability different from Conflict Serializability?


Answer:
View serializability is a weaker (less restrictive) form of serializability than conflict serializability. Conflict serializability demands that transactions have no conflicting accesses to the same data item out of order, whereas view serializability only requires that transactions yield the same reads and final writes as a serial schedule. As a result, some schedules that are view serializable are not conflict serializable.

Q.3: What distinguishes strong serializability from weak serializability?


Answer:
Strong serializability is a stricter form of serializability than weak serializability. Under strong serializability, a schedule must be equivalent to a serial schedule in which the transactions are carried out in the same order as in the original schedule. Under weak serializability, a schedule only needs to be conflict equivalent to some serial schedule.

Q.4: What does serializability testing’s precedence graph entail?


Answer:
A precedence graph is an instrument for determining whether a schedule is conflict serializable. Each transaction is represented as a node in the graph, and a directed edge is drawn from one node to another if an operation in the second transaction conflicts with, and follows, an operation in the first transaction. If the graph is acyclic, the schedule is conflict serializable.

What is Data Reconciliation?


Data Reconciliation is a critical process in the world of Data Management, essential for ensuring accuracy and consistency across different data sources. In basic terms, Data Reconciliation can be defined as the process of verifying data during a migration.

It compares data from two or more sources to identify discrepancies, resolve differences,
and update records to reflect accurate and unified information. This process is crucial in
industries where data integrity directly impacts decision-making, compliance, and
operational efficiency.

Evolution of Data Reconciliation


The evolution of Data Reconciliation mirrors the progress of technology in Data Management. Initially, it was a labour-intensive, manual process that required a huge amount of human effort and was prone to errors.

However, as databases became more sophisticated and computing power increased, automated tools and algorithms began to take over. This significantly enhanced the speed, accuracy, and efficiency of Data Reconciliation processes. Today, advanced analytics, Machine Learning algorithms, and Artificial Intelligence play a significant role in automating and optimising reconciliation tasks.


Importance of Data Reconciliation


The main role of Data Reconciliation is to make modern Data Management and data utilisation practices effective. It is a crucial process for ensuring the accuracy, consistency, and reliability of data across different databases and platforms, and it strengthens an organisation's capacity for policy formulation and execution planning. Listed here are key points highlighting its importance:

1) Enhances data accuracy and quality: By identifying and resolving discrepancies in data from different sources, Data Reconciliation improves the overall quality and accuracy of the information that organisations rely on for decision-making.

2) Supports compliance and reporting: Accurate data is essential for regulatory compliance and accurate reporting. Data Reconciliation ensures that organisations meet legal standards and report accurately on their operations, financials, and customer interactions.

3) Facilitates better decision-making: High-quality, reconciled data provides a solid foundation for Business Intelligence and analytics. Organisations can make more informed decisions when they are confident in the accuracy and completeness of their data.

4) Improves operational efficiency: Resolving data inconsistencies and errors through reconciliation reduces operational bottlenecks and inefficiencies. It ensures that systems and processes run smoothly, with less downtime and fewer errors.

5) Minimises risk: Inaccurate data can lead to strategic missteps, financial losses, and damaged reputations. Data Reconciliation mitigates these risks by ensuring that critical data is accurate and consistent across all systems.

6) Enhances customer satisfaction: Accurate, reconciled data enables organisations to understand their customers and offer more personalised and efficient services. This leads to enhanced customer satisfaction and loyalty.

7) Facilitates system integrations and migrations: During system upgrades, migrations, or integrations, Data Reconciliation ensures that data remains consistent and accurate across old and new systems, preventing data loss or corruption.

8) Reduces costs: By identifying and correcting errors early, Data Reconciliation can help
avoid the higher costs associated with rectifying issues later in the data lifecycle. This
includes costs related to erroneous decision-making, compliance penalties, and operational
inefficiencies.


Instances Requiring Data Reconciliation


Now that you know what Data Reconciliation is, you should also remember that it is necessary in numerous situations, including but not limited to those listed below:

Ensuring accuracy during data migrations


Data Reconciliation is a major part of any data migration, which involves transferring sensitive information from a legacy system to a new one. During this process, Data Reconciliation verifies that the data loaded into the new system matches the data in the source system.

By comparing and harmonising the two datasets, reconciliation enables discrepancies to be found and ironed out. This is essential for securing data stability, integrity, and accuracy throughout the migration.

Assessing data quality in regular business operations


In regular business operations, Data Reconciliation helps maintain data quality and consistency. Whether it's customer contact details, production data, or order information, regular inspection ensures that the data remains reliable. By reconciling data across multiple platforms or systems, businesses can achieve a single customer view, improve operational efficiency, and make informed decisions.

Addressing complexities in financial services


Organisations in data-intensive sectors such as e-commerce, logistics, healthcare, and insurance, along with financial institutions such as banks and investment firms, rely heavily on accurate data. Data Reconciliation is critical for the following:

1) Risk Management: Detecting fraud or potential errors.

2) Compliance: Ensuring accuracy in financial reporting.

3) Investment Data Reconciliation: Maintaining data integrity for investment portfolios.

Methods of Data Reconciliation


Data Reconciliation is an important process in maintaining data accuracy and consistency.
Let’s explore some essential methods for effective reconciliation of data:

1) Master Data Reconciliation


Master data, such as product details, customer information, or employee records, serves as a reference point for other datasets. Reconciling master data involves verifying its accuracy across various systems. Key steps include identifying discrepancies, resolving conflicts, and updating the master dataset.
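As an illustration, here is a minimal Python sketch of key-based master data comparison; the dict-based record layout and the customer_id key are hypothetical:

```python
# Sketch: compare master records from two systems by a shared key and report
# records that are missing on either side plus field-level mismatches.
# The key name "customer_id" and the record layout are hypothetical.

def reconcile_master(source, target, key="customer_id"):
    src = {rec[key]: rec for rec in source}
    tgt = {rec[key]: rec for rec in target}
    missing_in_target = sorted(set(src) - set(tgt))
    missing_in_source = sorted(set(tgt) - set(src))
    mismatches = []
    for k in src.keys() & tgt.keys():
        for field in src[k]:
            if src[k].get(field) != tgt[k].get(field):
                mismatches.append((k, field, src[k][field], tgt[k].get(field)))
    return missing_in_target, missing_in_source, mismatches
```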

2) Activity Accuracy
Activity-based reconciliation focuses on tracking events or transactions. For instance, in
financial systems, reconciling bank statements with transaction records ensures that all
deposits, withdrawals, and fees are accurately recorded. Regular audits and automated
checks help maintain activity accuracy.

3) Transactional Data Alignment


Transactional Data Reconciliation involves comparing data generated during specific processes. For example, reconciling inventory records with sales orders ensures that stock levels match customer demand. Timely adjustments prevent stockouts or excess inventory.
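A small sketch of this alignment, with hypothetical sales-order and stock-movement records:

```python
from collections import Counter

# Sketch: verify that the quantities recorded as sold match the quantities
# recorded as shipped out of stock. The record shapes are assumptions.

def align_transactions(sales_orders, stock_movements):
    sold = Counter()
    for order in sales_orders:
        sold[order["item"]] += order["qty"]
    shipped = Counter()
    for move in stock_movements:
        shipped[move["item"]] += move["qty"]
    # Items whose sold and shipped totals disagree need a timely adjustment.
    return {item: (sold[item], shipped[item])
            for item in set(sold) | set(shipped)
            if sold[item] != shipped[item]}
```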

4) Automated Reconciliation Techniques


Leveraging technology and automated reconciliation streamlines the process. Techniques include the following:

1) Matching Algorithms: These algorithms identify similarities between datasets, flagging discrepancies for further investigation.

2) Exception Reports: Automated systems generate reports highlighting inconsistencies, enabling quick resolution.

3) Data Validation Rules: Implementing predefined rules ensures data integrity and reduces manual effort.
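These three techniques fit together in a few lines. A sketch with an invented tolerance-based matching rule, two sample validation rules, and an exception report listing everything that fails:

```python
# Sketch combining the three techniques: a tolerance-based matching rule,
# two sample validation rules, and an exception report listing everything
# that fails. All field names and thresholds are invented for illustration.

def matches(a, b, tolerance=0.01):
    return a["id"] == b["id"] and abs(a["amount"] - b["amount"]) <= tolerance

VALIDATION_RULES = [
    ("amount must be non-negative", lambda rec: rec["amount"] >= 0),
    ("id must be present",          lambda rec: bool(rec.get("id"))),
]

def exception_report(source, target):
    report = []
    by_id = {rec["id"]: rec for rec in target}
    for rec in source:
        for name, rule in VALIDATION_RULES:       # data validation rules
            if not rule(rec):
                report.append((rec.get("id"), "validation failed: " + name))
        other = by_id.get(rec["id"])
        if other is None:                         # unmatched record
            report.append((rec["id"], "missing in target"))
        elif not matches(rec, other):             # matching algorithm
            report.append((rec["id"], "amount mismatch"))
    return report
```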

Challenges in Data Reconciliation


The process of Data Reconciliation consists of comparing data from different sources, identifying differences, and settling them to keep the data accurate and consistent. However, the following challenges can complicate reconciliation efforts:

1) Lack of data consistency


Data consistency problems arise when different systems or departments within an organisation adopt different formats, standards, or representations for data of the same type. This absence of uniformity makes it difficult to match and reconcile data properly and can lead to erroneous reporting or analysis.

2) Human errors
Human intervention in data entry, processing, or management often leads to mistakes.
These can range from simple typos to more significant errors like duplications or omissions.
Human errors not only make Data Reconciliation more challenging but can also have
cascading effects on data quality overall.

3) Obsolete systems
Many organisations rely on outdated systems for data storage and processing. These legacy
systems might not be compatible with newer technologies, making it difficult to extract,
transform, and load data for reconciliation purposes.

4) Managing large volumes of data


With the exponential growth of data, organisations face challenges in handling vast volumes
of information efficiently. The sheer scale of data can overwhelm traditional Data
Reconciliation tools and processes, leading to bottlenecks and delays.

5) System integration issues


Organisations often use a variety of software applications and databases, each designed for
specific purposes. Integrating these disparate systems to achieve seamless data flow for
reconciliation purposes is a complex task that requires significant effort and expertise.

6) Temporal disparities
Differences in the timing of data capture, processing, and reporting can lead to
discrepancies. For example, if two systems update data at different times, reconciling this
data can be challenging without accounting for these temporal disparities.

7) Development costs and complexities


Developing or purchasing software solutions for Data Reconciliation can be costly.
Furthermore, the complexity of implementing these solutions, especially in a way that is
tailored to an organisation's specific needs, adds to the challenge.

Tools for Data Reconciliation


Despite these challenges, several tools and solutions have been developed to aid the process of Data Reconciliation. Let's explore the tools listed below:
1) OpenRefine
OpenRefine is an open-source tool designed for working with messy data. It allows users to clean and transform data and to extend it with web services and external data. OpenRefine's capabilities make it a valuable tool for reconciling differences in data from various sources.

2) TIBCO Clarity
TIBCO Clarity is a powerful tool that helps organisations cleanse, standardise, and validate
data. It supports Data Reconciliation by ensuring that data from different sources is accurate
and consistent before it is merged or analysed.

3) Winpure
Winpure is a data matching and cleansing software that offers numerous features to improve data quality. It helps identify and remove duplicates, correct errors, and standardise data formats, which are crucial steps in the Data Reconciliation process.
