Empirical Analysis of Performance Parameters For Consistency in Distributed Database

The document summarizes research on consistency models in distributed databases. It discusses several consistency models like linearizability, serializability, strict serializability, sequential consistency, causal consistency, and eventual consistency. It also discusses transactional consistency and client-centric consistency models. The document outlines the methodology for empirically analyzing performance parameters that impact consistency in distributed databases.

Uploaded by

Mona Abdelaleem

Summary of a research paper
Empirical Analysis of Performance
Parameters for Consistency (C from ACID)
in Distributed Databases

BY: Mona Abdelaleem

Major titles
1. Introduction
2. Related work
3. Methodology
4. Results

1. Introduction
Consistency is a property of a distributed system under which every node or replica (for example, a data store) has the same view of the data at a given moment, irrespective of who updated it. What consistency means depends on whether the underlying system is treated as a distributed system or as a database, because the two communities use the word differently.

In distributed systems, the term means that an item has the same value on different data stores, while in databases it is the C from ACID and implies enforcing integrity. On this basis, we use the term data consistency mainly for distributed systems and transactional consistency for databases.

A consistency model is a defined set of rules that a transaction, or parts of a transaction, follows to achieve consistency (accuracy or expected output). Consistency modelling is therefore divided into two parts: the data consistency model and the transactional consistency model.

Data consistency models depend on the underlying distributed system, while transactional consistency models come from the database community and depend mainly on concurrency and isolation levels.
Several reasons necessitate an empirical analysis of performance parameters for consistency in distributed databases:

• It is essential to optimise the efficacy of distributed databases that operate in complex and dynamic environments.

• Empirical analysis lets researchers and practitioners understand the effect of various performance parameters on system behaviour.

• Comparison and benchmarking: Empirical analysis enables comparison and benchmarking of various distributed database systems or consistency protocols.

• In conclusion, empirical analysis of performance parameters for consistency in distributed databases is necessary for optimising performance, validating theoretical models, making informed design decisions, comparing systems, and understanding how distributed databases behave in real-world situations.
• It offers insights and quantitative data for improving the performance and dependability of distributed database systems.
2. Related work
• Various models are used to address consistency in distributed databases. Consistency models have been proposed from both the distributed-systems and the transactional perspective.
• The sections below discuss how each model works and the parameters on which it depends.

2.1 Linearizability:
• In this type of consistency, one operation is performed on one object at a time, so the system behaves like a single local copy serving one user at a time.
• Consensus protocols such as Raft and Paxos are used to build such systems. Linearizability provides strong consistency, but at a cost to the performance and availability of the database system.
• This type of consistency falls under data consistency, which depends on the underlying distributed system.
• The major contributing factors are network latency and replication. It also depends on data allocation, fragment allocation, and data modelling.

2.2 Serializability:
• This type of consistency covers multiple operations on multiple objects by many users.
• It guarantees that the execution of a set of transactions, or parts of transactions, over multiple items is equivalent to some serial order of those transactions.
• Correctness depends on the order of the operations and the resulting values, which in turn depends on concurrency. In concurrency management, correctness is maintained through the isolation levels between transactions. The three major issues in this type of consistency are dirty reads, non-repeatable reads, and phantom reads.
• To avoid these issues, four isolation levels are provided: i) Read Uncommitted, ii) Read Committed, iii) Repeatable Read, and iv) Serializable.
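
As a minimal sketch of the serializability guarantee described above, the following Python checks whether a schedule is conflict-serializable, a standard sufficient condition for serializability. It builds a precedence graph between transactions and tests it for cycles. The schedule format (tuples of transaction, operation, item) and the function name are assumptions made for illustration and are not taken from the research being summarized.

```python
from collections import defaultdict


def is_conflict_serializable(schedule):
    """schedule: list of (txn, op, item) tuples, op is "r" or "w" (hypothetical format)."""
    edges = defaultdict(set)
    txns = {txn for txn, _, _ in schedule}

    for i, (t1, op1, item1) in enumerate(schedule):
        for t2, op2, item2 in schedule[i + 1:]:
            # Two operations conflict when they belong to different transactions,
            # touch the same item, and at least one of them is a write.
            if t1 != t2 and item1 == item2 and "w" in (op1, op2):
                edges[t1].add(t2)  # t1 must precede t2 in any equivalent serial order

    # Depth-first search for a cycle in the precedence graph.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {txn: WHITE for txn in txns}

    def has_cycle(node):
        colour[node] = GREY
        for nxt in edges[node]:
            if colour[nxt] == GREY:
                return True
            if colour[nxt] == WHITE and has_cycle(nxt):
                return True
        colour[node] = BLACK
        return False

    return not any(colour[txn] == WHITE and has_cycle(txn) for txn in txns)


# T1 finishes with x before T2 touches it: equivalent to the serial order T1, T2.
ordered = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]
# T2 writes x between T1's read and write: no equivalent serial order exists.
interleaved = [("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "x")]
```

Here `is_conflict_serializable(ordered)` is true and `is_conflict_serializable(interleaved)` is false, because the second schedule forces T1 before T2 and T2 before T1 at the same time.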

2.3 Strict Serializability:
• In this model, the real-time start of each execution is considered along with the order of the transactions.

• It is similar to linearizability. The only difference is that linearizability works for one operation on one object, preferably on a local system, while strict serializability works for multiple operations on different objects in distributed systems. It assumes the presence of a global clock.
• It is very difficult to achieve, especially in a distributed system in the presence of network failures. It depends on both the distributed system and the concurrency control of the database.

2.4 Sequential Consistency:
• In this consistency model, the order of operations is maintained for each individual process, not across all processes as in the strict consistency models.
• Interleaving of operations from different processes is allowed.
• Each process maintains its own order.
• It follows the concept of single-copy data.
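
The points above can be sketched as a small checker: given one proposed interleaving, it confirms that each process keeps its own order and that every read matches a single copy of the data. This is only a sketch under assumed formats. The operation tuples `("w", key, value)` and `("r", key, observed_value)` and the function name are hypothetical, and the code verifies one given interleaving rather than searching for one.

```python
def is_valid_sequential_witness(interleaving, programs):
    """interleaving: list of (process, op); programs: dict process -> list of ops.

    Each op is ("w", key, value) or ("r", key, observed_value) (hypothetical format).
    """
    position = {proc: 0 for proc in programs}
    store = {}  # the single logical copy of the data

    for proc, op in interleaving:
        idx = position[proc]
        # Each process must issue its operations in its own program order.
        if idx >= len(programs[proc]) or programs[proc][idx] != op:
            return False
        position[proc] = idx + 1

        kind, key, value = op
        if kind == "w":
            store[key] = value
        elif store.get(key) != value:
            # A read must return the latest value written to the single copy.
            return False

    # Every operation of every process must appear in the interleaving.
    return all(position[proc] == len(ops) for proc, ops in programs.items())


programs = {
    "P1": [("w", "x", 1)],
    "P2": [("r", "x", 1), ("r", "x", 1)],
}
good = [("P1", ("w", "x", 1)), ("P2", ("r", "x", 1)), ("P2", ("r", "x", 1))]
bad = [("P2", ("r", "x", 1)), ("P1", ("w", "x", 1)), ("P2", ("r", "x", 1))]
```

The `good` interleaving is accepted, while `bad` is rejected because P2 observes the value 1 before any process has written it.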

2.5 Causal consistency:
• In this type of consistency, only operations that are related are ordered.
• To handle concurrency, related operations must be ordered.
• Operations that do not depend on each other can be placed in any order for better performance.
• Causally related writes are stored in one place to keep them in order.
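
One common way to decide whether two operations are related is a vector clock. The slides do not name a mechanism, so the sketch below is an assumption: it represents a clock as a dictionary from node name to counter and classifies two clocks as ordered or concurrent. The function name and the clock representation are hypothetical.

```python
def compare_clocks(a, b):
    """Compare two vector clocks given as dicts of node -> counter (missing = 0)."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)

    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a causally precedes b, so their order must be kept
    if b_le_a:
        return "after"       # b causally precedes a
    return "concurrent"      # unrelated: replicas may apply them in any order
```

For example, `compare_clocks({"A": 1}, {"A": 1, "B": 1})` returns `"before"`, while `compare_clocks({"A": 2}, {"B": 1})` returns `"concurrent"`, which is the case where the operations can be put in any order.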

2.6 Eventual consistency:
• Another approach is eventual consistency. It differs from sequential consistency and serializability, where updated values must be visible to all replicas at the same time.
• In this approach, an updated value is not immediately visible to all replicas, but becomes visible later.
• Mostly, the data is modified at one copy and the other replicas receive the modified values after some time. Most of the consistency types above are data-centric, i.e. they concern how stored data is updated and how that update becomes visible to other replicas.
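
The delayed visibility described above can be illustrated with two in-memory replicas that exchange state later. This is a sketch, not the method of the summarized research: it assumes a last-writer-wins rule with caller-supplied logical timestamps, and the class and method names are hypothetical. Equal timestamps would need an extra tiebreaker, which is left out here.

```python
class Replica:
    """In-memory replica that resolves conflicts by last-writer-wins (an assumption)."""

    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        # Keep the incoming value only if it is newer than what is stored.
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        # Anti-entropy step: pull every entry from the other replica.
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)


primary, secondary = Replica(), Replica()
primary.write("x", "new", timestamp=2)
stale_read = secondary.read("x")       # the update is not visible here yet
secondary.sync_from(primary)
converged_read = secondary.read("x")   # visible after the replicas synchronise
```

Before the synchronisation step the second replica has no value for `x`; afterwards both replicas return `"new"`.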

• If data has not yet been committed by a user, is the latest update visible to that same user from the same instance? Distributed databases handle this kind of issue with client-centric models. Broadly, these can be divided into the following types:
• Monotonic reads: Once a user has read a data item, any later read by that user returns the same value or a more recent one.
• Monotonic writes: A write must be propagated to the other replicas before a new write by the same process.
• Read your writes: Any read by a process shows the latest write done by the same process.
• Writes follow reads: A write by a process on an item it has previously read is applied to the value that was read or a more recent one.
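
Two of the guarantees above, read your writes and monotonic reads, can be sketched with a session object that remembers the highest version it has written or read and refuses replicas that are older. The single-item replica, the version counter, and all names below are assumptions made for illustration only.

```python
class VersionedReplica:
    """Holds one item with a version number (hypothetical single-item model)."""

    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        if version > self.version:
            self.version, self.value = version, value


class Session:
    """Tracks the newest version this client has written or read."""

    def __init__(self):
        self.min_version = 0

    def write(self, replica, value):
        version = replica.version + 1
        replica.apply(version, value)
        # Read your writes: later reads must see at least this version.
        self.min_version = max(self.min_version, version)

    def read(self, replicas):
        for replica in replicas:
            if replica.version >= self.min_version:
                # Monotonic reads: never go back to an older version afterwards.
                self.min_version = replica.version
                return replica.value
        raise RuntimeError("no replica is fresh enough for this session")


fresh, stale = VersionedReplica(), VersionedReplica()
session = Session()
session.write(fresh, "a")
value_seen = session.read([stale, fresh])  # skips the stale replica
```

In this run the session skips `stale`, which has not yet received the write, and reads `"a"` from `fresh`. A read offered only the stale replica raises an error instead of returning older data.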

2.7 Transaction:
• A transaction is a set of operations that must be performed as one unit. This is required to ensure data integrity.
• In the basic model of a transaction, from the early days of database transactions, one user uses one database on a single machine.
• Updated values become visible only once the transaction is complete. Once concurrency is allowed, with many users accessing the same data in the same place, this model requires review, so the ACID model is used for this type of consistency.
• This model suits client-server architectures where data is stored in one place, with limited concurrency levels and short-lived, flat transactions.
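
A minimal sketch of a transaction as one unit, using Python's built-in `sqlite3` module on a single in-memory database. The `accounts` table, the `transfer` function, and the balances are hypothetical. The `CHECK` constraint stands in for an integrity rule, and the connection's context manager commits on success and rolls back on an exception.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as a single unit; return True if it committed."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above
            # is rolled back as well, so no partial update becomes visible.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()
```

Calling `transfer(conn, "alice", "bob", 30)` commits both updates. Calling `transfer(conn, "alice", "bob", 500)` fails on the debit and leaves both balances unchanged, which is the integrity property the ACID model is meant to ensure.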

3. Methodology
• The empirical analysis of performance parameters for consistency in distributed databases entails a comprehensive methodology with several stages.

• These stages include defining research objectives, selecting appropriate consistency models, designing an experimental setup, defining performance parameters, developing test scenarios, implementing a distributed database system, collecting empirical data, conducting multiple experimental runs, systematically varying performance parameters, analysing performance data, interpreting results, validating and refining models, discussing implications and recommendations, and documenting and sharing findings.

• This methodology facilitates informed decision-making, system performance optimisation, and the advancement of knowledge about distributed database systems among researchers.

• Through this methodology, researchers can learn which performance parameters affect the consistency of distributed databases.

• This knowledge can be used to make well-informed decisions, improve system performance, and further the understanding of distributed database systems. Apart from the literature study of consistency models, a bibliometric analysis has also been performed.

4. Results and Discussions
• This study has identified the parameters on which consistency, whether data or transactional, depends. They are discussed below, and Table 1 presents the dependency of each consistency type on these factors.
• a. Network latency: This is the delay introduced by the communication network. If the delay is small and tolerable without affecting consistency, it is known as low-level latency; otherwise it is known as high-level latency. High-level latency affects the bandwidth of network communication, which delays communication between data servers or nodes.
• b. Concurrency: A database can be accessed by many transactions simultaneously, from one server or from different servers. This is managed with locks and isolation levels. For a high level of concurrency, the isolation level has to be chosen accordingly.
• c. Replication: The main advantage of distribution is data redundancy for availability and scalability, and replication [21, 24, 27] plays an important role in this. However, replication is also one of the sources of data inconsistency. A system may use fully replicated databases, partially replicated databases, or other types. If strict or similar consistency is required, immediate replication is applied. The stricter consistency levels require synchronous replication, which hampers the availability of the database.
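
The cost of synchronous replication noted above can be made concrete with a small worked calculation. The model is an assumption, not a measurement from the study: replicas are contacted in parallel, a synchronous write waits for the slowest replica, and an asynchronous write returns after the local commit. The function name and the millisecond figures are hypothetical.

```python
def write_latency_ms(local_ms, replica_rtts_ms, synchronous):
    """Estimated client-visible write latency under a simple parallel model."""
    if synchronous and replica_rtts_ms:
        # The write is acknowledged only after the slowest replica confirms.
        return local_ms + max(replica_rtts_ms)
    # Asynchronous: acknowledge after the local commit, replicate in the background.
    return local_ms


sync_ms = write_latency_ms(2, [10, 80, 35], synchronous=True)    # 2 + 80 = 82
async_ms = write_latency_ms(2, [10, 80, 35], synchronous=False)  # 2
```

Modelling an unreachable replica as a round-trip time of `float("inf")` makes the synchronous result infinite, which reflects how a strict, synchronous scheme loses availability when one replica cannot be reached.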
• d. Transaction nature: Transactions can be flat or nested. They may also be categorized by execution location, i.e. local or global transactions [13, 15, 25, 37]. The stricter kinds of consistency prefer flat, local transactions for better results, while causal consistency can handle nested and global (distributed) transactions.
• e. Level of redundancy: Data redundancy means that the same data is held in different locations. It is achieved with the help of replication. In distributed systems [38-40], data redundancy is required for availability and fault tolerance. The level refers to how much redundancy is desired, as opposed to full replication of the data. The level of redundancy therefore plays an important role in the selection of a consistency model [41, 42].
• f. System design: Designing a distributed system means deciding which framework to use and what kind of scalability is required, horizontal or vertical. Similarly, the architecture decision includes whether sharding is required. Sharding [41] means that a part of the database is held at one particular location. The design also decides where a query executes, preferably where the least data has to move from one location to another.
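
As an illustration of sharding, the sketch below maps a record key to one of several locations with a stable hash, so a query for that key can be sent straight to the location holding it. Hash-based placement is only one possible scheme and is an assumption here. The function names and the location list are hypothetical.

```python
import hashlib


def shard_for(key, shard_count):
    """Return a stable shard index in range(shard_count) for a string key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % shard_count


def location_for(key, locations):
    """Pick the location that holds the shard for this key."""
    return locations[shard_for(key, len(locations))]


locations = ["site-a", "site-b", "site-c"]  # hypothetical data centres
target = location_for("customer:42", locations)
```

Because the mapping depends only on the key and the number of shards, every node computes the same target location without consulting a central directory. Changing the number of shards moves most keys, which is one reason real systems often use other placement schemes.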

Conclusion
• This study analyses multiple consistency models and then compares them theoretically on the basis of the significant parameters relevant to ensuring consistency in distributed databases. A system can therefore choose among a variety of consistency models to improve its performance and availability. The ACID consistency model places greater emphasis on network consistency as its principal parameter. For linearizability, which can only be achieved in smaller systems, this parameter is of lesser importance, whereas it has a greater impact in the context of the ACID or BASE model.
thanks
