Introduction To Distributed Databases: Intro To Database Systems Andy Pavlo
Distributed Databases
ADMINISTRIVIA
UPCOMING DATABASE EVENTS
PARALLEL VS. DISTRIBUTED
Parallel DBMSs:
→ Nodes are physically close to each other.
→ Nodes connected with high-speed LAN.
→ Communication cost is assumed to be small.
Distributed DBMSs:
→ Nodes can be far from each other.
→ Nodes connected using public network.
→ Communication cost and problems cannot be ignored.
DISTRIBUTED DBMSs
TODAY'S AGENDA
System Architectures
Design Issues
Partitioning Schemes
Distributed Concurrency Control
SYSTEM ARCHITECTURE
[Figure: system architectures, with nodes connected over a network]
SHARED MEMORY
SHARED DISK
SHARED NOTHING
[Figure: an application server sends Get Id=200 to the nodes; partition labels shown: P3→ID:101-200, P2→ID:201-300, P2→ID:151-300]
EARLY DISTRIBUTED DATABASE SYSTEMS
Gray
CMU 15-445/645 (Fall 2019)
DESIGN ISSUES
DATA TRANSPARENCY
DATABASE PARTITIONING
NAÏVE TABLE PARTITIONING
Table1
HORIZONTAL PARTITIONING
Table1 (the second column is the partitioning key)
101  a  XXX  2019-11-29  hash(a)%4 = P2
102  b  XXY  2019-11-28  hash(b)%4 = P4
103  c  XYZ  2019-11-29  hash(c)%4 = P3
104  d  XYX  2019-11-27  hash(d)%4 = P2
105  e  XYY  2019-11-29  hash(e)%4 = P1
Partitions: P1, P2, P3, P4
Ideal Query:
SELECT * FROM table
WHERE partitionKey = ?
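The hash(key)%4 placement above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular DBMS does it: zlib.crc32 stands in for the system's hash function, the mapping of remainders 0..3 to labels P1..P4 is an assumption, and the helper names partition_for and route_rows are hypothetical. The resulting placement will not match the P2/P4/P3/P2/P1 assignments on the slide, since those come from an unspecified hash.

```python
import zlib

NUM_PARTITIONS = 4  # matches the "%4" in the slide


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> str:
    """Return the partition label (P1..Pn) for a partitioning-key value."""
    # crc32 is a stand-in hash; unlike Python's built-in hash() on strings,
    # it returns the same value on every run.
    h = zlib.crc32(key.encode("utf-8"))
    # Assumption: remainder 0 maps to P1, remainder 1 to P2, and so on.
    return f"P{h % num_partitions + 1}"


def route_rows(rows, key_index=1):
    """Group rows by the partition their partitioning key hashes to."""
    partitions = {f"P{i}": [] for i in range(1, NUM_PARTITIONS + 1)}
    for row in rows:
        partitions[partition_for(row[key_index])].append(row)
    return partitions


rows = [
    (101, "a", "XXX", "2019-11-29"),
    (102, "b", "XXY", "2019-11-28"),
    (103, "c", "XYZ", "2019-11-29"),
    (104, "d", "XYX", "2019-11-27"),
    (105, "e", "XYY", "2019-11-29"),
]
placement = route_rows(rows)

# The "ideal query" has an equality predicate on the partitioning key,
# so it only needs to visit a single partition.
target = partition_for("c")
matches = [row for row in placement[target] if row[1] == "c"]
```

Because the same function is used for both loading and lookup, the query for key "c" is answered by exactly one partition without contacting the other three.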
CONSISTENT HASHING
[Figure: hash ring from 0 to 1 with nodes A, B, C, D, E and key positions hash(key1), hash(key2)]
Replication Factor = 3
If hash(key)=D
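A minimal sketch of a consistent hash ring with a replication factor, under stated assumptions: the ring uses 128-bit integers from MD5 instead of the 0-to-1 range in the figure, each node gets a single position on the ring (no virtual nodes), and replicas go to the next distinct nodes clockwise from the owner. The class and method names are hypothetical.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to a point on the ring (here a 128-bit integer)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Sorted list of (position, node) pairs, one position per node.
        self._ring = sorted((ring_position(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._ring]

    def add_node(self, node):
        bisect.insort(self._ring, (ring_position(node), node))
        self._positions = [pos for pos, _ in self._ring]

    def remove_node(self, node):
        self._ring.remove((ring_position(node), node))
        self._positions = [pos for pos, _ in self._ring]

    def nodes_for(self, key):
        """Return the owner of key followed by its replicas, clockwise."""
        if not self._ring:
            return []
        # First node at or after the key's position, wrapping past the end.
        start = bisect.bisect_left(self._positions, ring_position(key))
        count = min(self.replication_factor, len(self._ring))
        return [
            self._ring[(start + i) % len(self._ring)][1] for i in range(count)
        ]


ring = ConsistentHashRing(["A", "B", "C", "D", "E"], replication_factor=3)
owners = ring.nodes_for("key1")  # three distinct nodes
```

The property that makes this scheme attractive is that adding or removing a node only affects keys adjacent to it on the ring; keys whose owner and replicas are elsewhere keep the same placement.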
LOGICAL PARTITIONING
[Figure: an application server sends Get Id=1 and Get Id=3 to the nodes; one node is responsible for Id=1 and Id=2, the other for Id=3 and Id=4; all four records (Id=1 to Id=4) reside in shared storage]
PHYSICAL PARTITIONING
[Figure: an application server sends Get Id=1 and Get Id=3 to the nodes; one node stores Id=1 and Id=2, the other stores Id=3 and Id=4]
SINGLE-NODE VS. DISTRIBUTED
TRANSACTION COORDINATION
TP MONITORS
CENTRALIZED COORDINATOR
[Figure: an application server sends a Lock Request and a Commit Request to the Coordinator, which returns an Acknowledgement; partitions P1, P2, P3, P4]
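The lock request / commit request / acknowledgement exchange in the figure can be sketched as a single in-memory lock table. This is an illustration under assumptions that the slide does not state: locks are taken at whole-partition granularity, a lock request is granted all-or-nothing, and a commit simply releases the transaction's locks. The class name, method names, and transaction IDs are hypothetical.

```python
class Coordinator:
    """Toy centralized coordinator that tracks one lock per partition."""

    def __init__(self, partitions):
        # partition name -> transaction currently holding its lock (or None)
        self.lock_holder = {p: None for p in partitions}

    def lock_request(self, txn_id, partitions) -> bool:
        """Grant every requested partition lock, or none of them."""
        for p in partitions:
            if self.lock_holder[p] not in (None, txn_id):
                return False  # some other transaction holds this partition
        for p in partitions:
            self.lock_holder[p] = txn_id
        return True  # acknowledgement: the application may proceed

    def commit_request(self, txn_id) -> bool:
        """Release the transaction's locks and acknowledge the commit."""
        held = [p for p, t in self.lock_holder.items() if t == txn_id]
        if not held:
            return False  # unknown transaction, nothing to commit
        for p in held:
            self.lock_holder[p] = None
        return True


coordinator = Coordinator(["P1", "P2", "P3", "P4"])
granted_t1 = coordinator.lock_request("T1", ["P1", "P3"])   # True
granted_t2 = coordinator.lock_request("T2", ["P3", "P4"])   # False, P3 is taken
committed = coordinator.commit_request("T1")                # True
retry_t2 = coordinator.lock_request("T2", ["P3", "P4"])     # True after T1 commits
```

Since every transaction passes through one lock table, conflicts are easy to detect, but the coordinator is also a single point of contention.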
CENTRALIZED COORDINATOR
[Figure: an application server sends Query Requests and a Commit Request to the Middleware, which forwards them to partitions P1, P2, P3, P4 and asks "Safe to commit?"]
Middleware partition map:
P1→ID:1-100
P2→ID:101-200
P3→ID:201-300
P4→ID:301-400
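The middleware's routing decision can be sketched as a lookup against the ID ranges listed above. A minimal sketch: the ranges come from the slide, while the function name route and the choice to raise KeyError for out-of-range IDs are assumptions.

```python
import bisect

# (inclusive upper bound of the ID range, partition), sorted by upper bound.
RANGES = [(100, "P1"), (200, "P2"), (300, "P3"), (400, "P4")]
UPPER_BOUNDS = [upper for upper, _ in RANGES]
LOWEST_ID = 1


def route(record_id: int) -> str:
    """Return the partition whose ID range contains record_id."""
    if record_id < LOWEST_ID or record_id > UPPER_BOUNDS[-1]:
        raise KeyError(f"no partition covers ID {record_id}")
    # bisect_left finds the first range whose upper bound is >= record_id.
    return RANGES[bisect.bisect_left(UPPER_BOUNDS, record_id)][1]


destination = route(150)  # falls in P2's range, 101-200
```

Keeping the map sorted by upper bound makes each lookup a binary search, and rebalancing only requires editing the map in the middleware.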
DECENTRALIZED COORDINATOR
[Figure: an application server sends a Begin Request, Query Requests, and a Commit Request directly to partitions P1, P2, P3, P4; the partitions ask each other "Safe to commit?"]
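The begin / query / commit flow in the figure can be sketched without a separate coordinator process. This is a rough illustration under assumptions: the partition that receives the begin request also handles the commit request, and it asks every partition the transaction touched whether it is safe to commit. The ok flag is a placeholder for whatever local check a real system would perform; all names are hypothetical.

```python
class Partition:
    """Toy partition that can also coordinate a commit among its peers."""

    def __init__(self, name):
        self.name = name
        self.participating = set()   # transactions that touched this partition
        self.cannot_commit = set()   # transactions that failed a local check

    def begin_request(self, txn_id):
        self.participating.add(txn_id)

    def query_request(self, txn_id, ok=True):
        # ok=False stands in for a local conflict or failure.
        self.participating.add(txn_id)
        if not ok:
            self.cannot_commit.add(txn_id)

    def safe_to_commit(self, txn_id) -> bool:
        return txn_id not in self.cannot_commit

    def commit_request(self, txn_id, cluster) -> bool:
        """Ask every participating partition whether it is safe to commit."""
        participants = [p for p in cluster if txn_id in p.participating]
        return all(p.safe_to_commit(txn_id) for p in participants)


p1, p2, p3, p4 = (Partition(n) for n in ("P1", "P2", "P3", "P4"))
cluster = [p1, p2, p3, p4]

p1.begin_request("T1")
p1.query_request("T1")
p3.query_request("T1")
t1_commits = p1.commit_request("T1", cluster)   # True: P1 and P3 both agree

p2.begin_request("T2")
p4.query_request("T2", ok=False)
t2_commits = p2.commit_request("T2", cluster)   # False: P4 objects
```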
DISTRIBUTED 2PL
[Figure: Node 1 and Node 2 connected over a network; values shown: A=1, A=2, B=7, B=8; a waits-for graph between the transactions]
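Deadlock detection over a waits-for graph that spans nodes can be sketched as follows. A minimal sketch under assumptions: each node reports its local waits-for edges as a dictionary mapping a waiting transaction to the transactions it waits for, the edges are merged in one place, and a depth-first search looks for a cycle. The transaction names T1 and T2 and both function names are hypothetical.

```python
def merge_waits_for(*local_graphs):
    """Union the waits-for edges reported by each node."""
    merged = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged


def has_deadlock(graph) -> bool:
    """Return True if the waits-for graph contains a cycle."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}

    def visit(txn):
        state[txn] = IN_PROGRESS
        for holder in graph.get(txn, ()):
            holder_state = state.get(holder, UNSEEN)
            if holder_state == IN_PROGRESS:
                return True  # back edge: we returned to a txn on the current path
            if holder_state == UNSEEN and visit(holder):
                return True
        state[txn] = DONE
        return False

    return any(
        state.get(txn, UNSEEN) == UNSEEN and visit(txn) for txn in list(graph)
    )


# Node 1: T2 waits for T1 (which holds the lock on A).
node1_edges = {"T2": {"T1"}}
# Node 2: T1 waits for T2 (which holds the lock on B).
node2_edges = {"T1": {"T2"}}

local_only = has_deadlock(node1_edges) or has_deadlock(node2_edges)  # False
global_view = has_deadlock(merge_waits_for(node1_edges, node2_edges))  # True
```

Neither node sees a cycle in its own edges; the deadlock only becomes visible once the edges from both nodes are combined.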
CONCLUSION
NEXT CLASS