19-DistributedDatabases
Distributed architectures
Client/server
Simplest and most widespread
Server manages the database
Client manages the user interface
Distributed database system
Different DBMS servers on different network nodes
autonomous
able to cooperate
Guaranteeing the ACID properties requires more
complex techniques
Data replication
A replica is a copy of the data stored on a different
network node
The replication server autonomously manages
the update of the copies
Simpler architecture than distributed database
Parallel architectures
Performance increase is the only objective
Different architectures
Multiprocessor machines
CPU clusters
Dedicated network connections
Data warehouses
Servers specialized in decision support
Perform OLAP (On-Line Analytical Processing),
which differs from OLTP (On-Line Transaction
Processing)
Relevant properties
Portability
Capability of moving a program from a system to a
different system
Guaranteed by the SQL standard
Interoperability
Capability of different DBMS servers to cooperate
on a given task
Interaction protocols are needed
ODBC
X-Open-DTP
Database Management Systems
Client/server Architectures
2-Tier
Thick clients
with some application logic
DBMS server
provides access to data
[Figure: CLIENT1 ... CLIENTn connected to the DBMS SERVER, which accesses the DB]
3-Tier
Thin client
browser
Application server
implements business logic
typically also a web server
DBMS server
provides access to data
[Figure: CLIENT1 ... CLIENTn connected to the APPLICATION SERVER, which is connected to the DBMS SERVER and its DB]
SQL execution
Compile & Go
The query is sent to the server
The query is prepared
generation of the query plan
The query is executed
The result is shipped
The query plan is not stored on the server
Effective for one-shot query executions
provides flexible execution of dynamic SQL
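The Compile & Go steps can be sketched with Python's standard sqlite3 module. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the DBMS server, the table S and its columns SNo and SName are hypothetical, and compile_and_go is a helper name introduced here. The sketch mirrors the sequence of steps (send, prepare, execute, ship the result) and does not reproduce how a networked DBMS handles query plans.

```python
import sqlite3

# Assumption: an in-memory SQLite database plays the role of the DBMS server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE S (SNo TEXT PRIMARY KEY, SName TEXT)")
conn.executemany("INSERT INTO S VALUES (?, ?)", [("S1", "Smith"), ("S2", "Jones")])


def compile_and_go(connection, sql_text, params=()):
    """Send a query text that is only known at run time and run it once."""
    # execute() prepares the statement (query plan) and runs it in one call.
    cursor = connection.execute(sql_text, params)
    # fetchall() stands for shipping the result back to the client.
    return cursor.fetchall()


# The query text is built dynamically, as in a one-shot dynamic SQL request.
wanted_column = "SName"
rows = compile_and_go(conn, f"SELECT {wanted_column} FROM S WHERE SNo = ?", ("S1",))
print(rows)
```

Because the query text is assembled at run time, nothing about it can be prepared in advance, which is the case where Compile & Go fits.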
Distributed database systems
Functional advantages
Appropriate localization of data and applications
e.g., geographical distribution
Technological advantages
Increased data availability
The probability of a total block of the system is reduced
Local blocks may be more frequent
Enhanced scalability
Provided by the modularity and flexibility of the
architecture
Data fragmentation
Horizontal fragmentation
Example
Vertical fragmentation
Example
Fragmentation properties
Completeness
every piece of information in relation R is contained in at
least one fragment Ri
Correctness
the information in R can be rebuilt from its
fragments
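The two properties can be checked on a toy relation. The sketch below is an assumption-laden illustration: the relation R, its attributes (id, name, city) and the helper names are all hypothetical. It fragments R horizontally (by a selection predicate) and vertically (by projection that keeps the key), then verifies that R can be rebuilt by union in the first case and by a join on the key in the second.

```python
# Hypothetical relation R(id, name, city), one dict per tuple.
R = [
    {"id": 1, "name": "Ada", "city": "Turin"},
    {"id": 2, "name": "Bob", "city": "Milan"},
    {"id": 3, "name": "Eve", "city": "Turin"},
]


def horizontal_fragment(relation, predicate):
    """Keep whole tuples that satisfy the predicate (selection)."""
    return [row for row in relation if predicate(row)]


def vertical_fragment(relation, key, attributes):
    """Keep a subset of attributes plus the key (projection)."""
    return [{a: row[a] for a in (key, *attributes)} for row in relation]


def as_set(relation):
    """Order-independent view of a relation, used to compare contents."""
    return {tuple(sorted(row.items())) for row in relation}


# Horizontal fragmentation: the predicates cover every tuple (completeness),
# and the union of the fragments rebuilds R (correctness).
h1 = horizontal_fragment(R, lambda r: r["city"] == "Turin")
h2 = horizontal_fragment(R, lambda r: r["city"] != "Turin")
horizontal_ok = as_set(h1 + h2) == as_set(R)

# Vertical fragmentation: every attribute appears in some fragment, and the
# join of the fragments on the key rebuilds R.
v1 = vertical_fragment(R, "id", ["name"])
v2 = vertical_fragment(R, "id", ["city"])
v2_by_key = {row["id"]: row for row in v2}
rebuilt = [{**row, **v2_by_key[row["id"]]} for row in v1]
vertical_ok = as_set(rebuilt) == as_set(R)

print(horizontal_ok, vertical_ok)
```

Keeping the key in every vertical fragment is what makes the join-based reconstruction possible.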
Distributed database design
Allocation of fragments
[Figure: fragments allocated to SITE 1 and SITE 2]
[Figure: fragments allocated to SITE 1, SITE 2, and SITE 1 + SITE 2]
Transparency levels
Fragmentation transparency
SELECT SName
FROM S
WHERE S#=:CODE
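With fragmentation transparency the application addresses the global relation S and never names its fragments. A minimal single-process sketch of that idea, using Python's sqlite3 module, follows. Everything beyond the query shape is an assumption: the fragments S1 and S2, the horizontal split, the City column and the column name SNo (used in place of S#) are hypothetical, and both fragments live in one local database, whereas a distributed DBMS would place them on different nodes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical horizontal fragments of the supplier relation.
    CREATE TABLE S1 (SNo TEXT PRIMARY KEY, SName TEXT, City TEXT);
    CREATE TABLE S2 (SNo TEXT PRIMARY KEY, SName TEXT, City TEXT);
    INSERT INTO S1 VALUES ('S1', 'Smith', 'Turin');
    INSERT INTO S2 VALUES ('S2', 'Jones', 'Milan');
    -- The view hides the fragments behind the global relation S.
    CREATE VIEW S AS SELECT * FROM S1 UNION ALL SELECT * FROM S2;
    """
)


def supplier_name(connection, code):
    """Query the global relation only; fragments are never mentioned."""
    row = connection.execute(
        "SELECT SName FROM S WHERE SNo = :code", {"code": code}
    ).fetchone()
    return row[0] if row else None


print(supplier_name(conn, "S2"))
```

The application code stays the same if the fragments are later redefined, as long as the view S is kept up to date.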
Allocation transparency
Transaction classification
Remote request
Read-only request
only SELECT statements
Single remote server
Remote transaction
Any SQL command
Single remote server
Distributed transaction
Any SQL command
Each SQL statement is addressed to one single
server
Global atomicity is needed
2 phase commit protocol
Distributed request
Each SQL command may refer to data on different
servers
Distributed optimization is needed
Fragmentation transparency is available in this class only
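The four classes can be told apart from two facts about a transaction: which kinds of SQL statements it contains and which servers each statement refers to. The following sketch encodes that decision rule. The Statement type and the classify function are hypothetical names introduced for illustration; they are not part of any DBMS interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    kind: str           # "SELECT", "UPDATE", "INSERT", ...
    servers: frozenset  # servers whose data the statement refers to


def classify(statements):
    """Return the transaction class for a list of statements."""
    # One statement spanning several servers: distributed optimization needed.
    if any(len(s.servers) > 1 for s in statements):
        return "distributed request"
    touched = set().union(*(s.servers for s in statements))
    # Each statement hits a single server, but the servers differ:
    # global atomicity (2 phase commit) is needed.
    if len(touched) > 1:
        return "distributed transaction"
    # A single remote server: read-only or not.
    if all(s.kind == "SELECT" for s in statements):
        return "remote request"
    return "remote transaction"


transfer = [
    Statement("UPDATE", frozenset({"server_A"})),
    Statement("UPDATE", frozenset({"server_B"})),
]
print(classify(transfer))  # distributed transaction
```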
Example
UPDATE Account
SET Balance = Balance + 100
WHERE Acc# = 13000
EoT (End of transaction)
ACID properties
Atomicity
It requires distributed techniques
2 phase commit
Consistency
Constraints are currently enforced only locally
Isolation
It requires strict 2PL and 2 Phase Commit
Durability
It requires the extension of local procedures to
manage atomicity in presence of failure
Other issues
Atomicity
2 Phase Commit protocol
Objective
Coordination of the conclusion of a distributed
transaction
Analogy with a wedding
Priest celebrating the wedding
Coordinates the agreement
Couple to be married
Participates in the agreement
Distributed transaction
One coordinator
Transaction Manager (TM)
Several DBMS servers which take part in the
transaction
Resource Managers (RM)
Any participant may take the role of TM
including the client requesting the transaction execution
New log records
[Figure: 2 Phase Commit protocol, TM and RM timelines; the TM writes to its LOG and sends the Prepare message]
Phase I
1. The TM
Writes the prepare record in the log
Sends the prepare message to all RM
(participants)
Sets a timeout, defining maximum waiting time for
RM answer
[Figure: 2 Phase Commit protocol, TM and RM timelines; the RM writes to its LOG and answers Prepare with the Ready message]
2. The RMs
Wait for the prepare message
When they receive it
If they are in a reliable state
Write the ready record in the log
Send the ready message to the TM
If they are not in a reliable state
Send a not ready message to the TM
Terminate the protocol
Perform local rollback
If the RM has crashed
No answer is sent
[Figure: 2 Phase Commit protocol, TM and RM timelines; after Prepare and Ready, the TM writes the Global Commit/Abort decision to its LOG]
3. The TM
Collects all incoming messages from the RMs
If it receives ready from all RMs
The commit global decision record is written in the
log
If it receives one or more not ready or the
timeout expires
The abort global decision record is written in the
log
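The three Phase I steps can be put together in a small in-memory sketch. It is an illustration under simplifying assumptions, not a real implementation: messages are plain method calls, logs are Python lists, a crashed RM is modelled by returning None (which stands for the TM timeout expiring), and the names ResourceManager, on_prepare and phase_one are hypothetical.

```python
class ResourceManager:
    """Participant (RM) side of Phase I."""

    def __init__(self, name, reliable=True, crashed=False):
        self.name = name
        self.reliable = reliable
        self.crashed = crashed
        self.log = []
        self.rolled_back = False

    def on_prepare(self):
        if self.crashed:
            return None               # no answer: the TM timeout will expire
        if self.reliable:
            self.log.append("ready")  # ready record written before answering
            return "ready"
        self.rolled_back = True       # local rollback, protocol terminated
        return "not ready"


def phase_one(tm_log, resource_managers):
    """Coordinator (TM) side of Phase I; returns the global decision."""
    tm_log.append("prepare")
    answers = [rm.on_prepare() for rm in resource_managers]
    # Commit only if every RM answered ready; a "not ready" answer or a
    # missing answer (timeout) leads to abort.
    decision = "commit" if all(a == "ready" for a in answers) else "abort"
    tm_log.append(f"global {decision}")
    return decision


tm_log = []
decision = phase_one(tm_log, [ResourceManager("rm1"), ResourceManager("rm2")])
print(decision, tm_log)
```

Passing an RM created with reliable=False or crashed=True makes the same function return the abort decision.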
Phase II
1. The TM
Sends the global decision to the RMs
Sets a timeout for the RM answers
[Figure: 2 Phase Commit protocol, TM and RM timelines; the TM sends the Global Commit/Abort decision, the RM writes Commit/Abort to its LOG and updates the DB]
2. The RM
Waits for the global decision
When it receives it
The commit/abort record is written in the log
The database is updated
An ACK message is sent to the TM
[Figure: 2 Phase Commit protocol, TM and RM timelines; after the RM answers, the TM writes the Complete record to its LOG]
3. The TM
Collects the ACK messages from the RMs
If all ACK messages are received
The complete record is written in the log
If the timeout expires and some ACK messages
are missing
A new timeout is set
The global decision is resent to the RMs which did
not answer
until all answers are received
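Phase II, including the resend loop of step 3, can be sketched in the same in-memory style. The assumptions are stated loudly: lost messages are simulated with a counter, a timeout is modelled as one iteration of the loop, and max_rounds is an artificial bound added only so the sketch always terminates (the protocol itself resends until all answers arrive). The names ResourceManager, on_decision and phase_two are hypothetical.

```python
class ResourceManager:
    """Participant (RM) side of Phase II."""

    def __init__(self, name, lost_messages=0):
        self.name = name
        self.lost_messages = lost_messages  # deliveries that never arrive
        self.log = []
        self.database_state = "pending"

    def on_decision(self, decision):
        if self.lost_messages > 0:
            self.lost_messages -= 1
            return None  # no ACK reaches the TM before its timeout
        if not self.log:  # a resent decision must not be applied twice
            self.log.append(decision)  # commit/abort record
            self.database_state = (
                "committed" if decision == "commit" else "rolled back"
            )
        return "ack"


def phase_two(tm_log, resource_managers, decision, max_rounds=10):
    """Coordinator (TM) side of Phase II; returns the number of rounds used."""
    pending = list(resource_managers)
    rounds = 0
    while pending and rounds < max_rounds:
        rounds += 1  # one round = send the decision, wait for the timeout
        pending = [rm for rm in pending if rm.on_decision(decision) != "ack"]
    if not pending:
        tm_log.append("complete")  # written only when all ACKs are in
    return rounds


tm_log = ["prepare", "global commit"]
rms = [ResourceManager("rm1"), ResourceManager("rm2", lost_messages=2)]
rounds = phase_two(tm_log, rms, "commit")
print(rounds, tm_log)
```

In the run above the second RM misses two deliveries, so the TM needs three rounds before it can write the complete record.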
[Figure: complete 2 Phase Commit protocol between TM and RM. Phase I: Prepare (TM LOG), Ready (RM LOG), Global Commit/Abort (TM LOG). Phase II: Commit/Abort (RM LOG, DB update), Complete (TM LOG). The uncertainty window is marked on the timeline]
Uncertainty window
Failure of a participant (RM)
X-Open-DTP
Interfaces
Standard features
Presumed abort
Read only
Heuristic decision
Protocol interaction
Client -> TM (TM Interface)     TM -> RM (XA Interface)
TM.Init()
TM.Open                         XA.Open()
TM.Begin()                      XA.Start()
  :                               :
TM.Commit()                     XA.Precom()
                                XA.Abort()
                                XA.Commit()
                                XA.End()
TM.Term()                       XA.Close()
TM.Exit()
Client - TM communication: from TM.Init() to TM.Exit()
Session: from TM.Open to TM.Term()
Transaction: from TM.Begin() to TM.Commit()
Parallel DBMS
Parallelism
Intra-query parallelism
DBMS benchmarks
TPC-C
Order entry transactions
It simulates the behavior of an OLTP system
Its more recent evolution is TPC-E
TPC-H
Decision support (OLAP)
It is a mix of complex queries
See also TPC-DI and TPC-DS
TPCx-HS
Big data management
It assesses implementations of Hadoop clusters