04-DistributedDataManagementAndProcessing

The document outlines the complexities and benefits of distributed data management and processing, emphasizing key concepts such as distributed systems, databases, and cloud databases. It covers various challenges, transparency layers, and architectural considerations, as well as the CAP theorem and data management issues related to fragmentation, allocation, and replication. Additionally, it discusses the implications of multi-tenancy and the need for efficient query processing in distributed environments.

Distributed Data Management and Processing
23D020: Big Data Management for Data Science
Barcelona School of Economics

Knowledge objectives
1. Give a definition of Distributed System
2. Enumerate the 6 challenges of a Distributed System
3. Give a definition of a Distributed Database
4. Explain the different transparency layers in DDBMS
5. Identify the requirements that distribution imposes on the ANSI/SPARC architecture
6. Draw a classical reference functional architecture for DDBMS
7. Enumerate the 8 main features of Cloud Databases
8. Explain the difficulties of Cloud Database providers to have multiple tenants
9. Enumerate the 4 main problems tenants/users need to tackle in Cloud Databases
10. Distinguish the cost of sequential and random access
11. Explain the difference between the cost of sequential and random access
12. Distinguish vertical and horizontal fragmentation
13. Recognize the complexity and benefits of data allocation
14. Explain the benefits of replication
15. Discuss the alternatives of a distributed catalog
16. Explain the CAP theorem
17. Identify the 3 configuration alternatives given by the CAP theorem
18. Explain the 4 synchronization protocols we can have
19. Explain what eventual consistency means
20. Enumerate the phases of distributed query processing
21. Explain the difference between data shipping and query shipping
22. Explain the meaning of “reconstruction” and “reduction” in syntactic optimization
23. Enumerate the 4 different cost factors in distributed query processing
24. Explain the different kinds of parallelism
25. Identify the impact of fragmentation in intra-operator parallelism
26. Explain the impact of tree topologies (i.e., linear and bushy) in inter-operator parallelism

Distributed System

Distributed DBMS

Cloud DBMS

Distributed Systems

Distributed system
“One in which components located at networked computers communicate
and coordinate their actions only by passing messages.”
G. Coulouris et al.
• Characteristics:
• Concurrency of components
• Independent failures of components
• Lack of a global clock

Challenges of distributed systems
• Openness
• Scalability
• Quality of service
• Performance/Efficiency
• Reliability/Availability
• Confidentiality
• Concurrency

• Transparency
• Heterogeneity of components

Scalability
Cope with large workloads
• Scale up
• Scale out

• Use:
• Automatic load-balancing

• Avoid:
• Bottlenecks
• Unnecessary communication
• Peer-to-peer

Performance/Efficiency
Efficient processing
• Minimize latencies
• Maximize throughput

• Use
• Parallelism

• Network optimization
• Specific techniques

Reliability/Availability
a) Keep consistency
b) Keep the system running
• Even in the case of failures

• Use
• Replication
• Flexible routing
• Heartbeats
• Automatic recovery

Concurrency
Share resources as much as possible

• Use
• Consensus Protocols

• Avoid
• Interferences
• Deadlocks

Transparency
a) Hide implementation (i.e., physical) details from the users
b) Make transparent to the user all the mechanisms to solve the other
challenges


Further objectives
• Use
• Platform-independent software

• Avoid
• Complex configurations
• Specific hardware/software

Distributed System

Distributed DBMS

Cloud DBMS

Distributed Database Systems

Distributed database
“A Distributed DataBase (DDB) is an integrated collection of databases that is physically
distributed across sites in a computer network. A Distributed DataBase Management
System (DDBMS) is the software system that manages a distributed database such that
the distribution aspects are transparent to the users.”
Encyclopedia of Database Systems


Transparency layers (I)

Fragmentation Transparency

Replication Transparency

Update Transparency

Network Transparency

Name Transparency

Location Transparency

Data Independence

Transparency layers (II)
• Fragmentation transparency
• The user must not be aware of the existence of different fragments
• Replication transparency
• The user must not be aware of the existing replicas
• Network transparency
• Data access must be independent of where the data is located
• Each data object must have a unique name
• Data independence at the logical and physical level must be guaranteed
• Inherited from centralized DBMSs (ANSI SPARC)

Classification According to Degree of Autonomy

                 Autonomy   Central schema   Query transparency   Update transparency
DDBMS            No         Yes              Yes                  Yes
T.C. Federated   Low        Yes              Yes                  Limited
L.C. Federated   Medium     No               Yes                  Limited
Multi-database   High       No               No                   No

Extended ANSI-SPARC Architecture of Schemas

• Global catalog (Mappings between ESs – GCS and GCS – LCSs)


• Each node has a local catalog (Mappings between LCSi – ISi)
Centralized DBMS Functional Architecture

[Figure: Query Manager (View Manager, Security Manager, Constraint Checker, Query Optimizer) → Execution Manager (Scheduler) → Data Manager (Recovery Manager, Log, Buffer Manager over the buffer pool in memory), running on top of the operating system and file system.]

Distributed DBMS Functional Architecture
[Figure: One coordinator runs the Global Query Manager (View Manager, Security Manager, Constraint Checker, Query Optimizer), the Global Execution Manager, and the Global Scheduler, backed by the GLOBAL CATALOG (external schemas, global conceptual schema, fragment schema, allocation schema). Many workers each run a Local Query Manager and a Local Execution Manager with a Data Manager (Recovery Manager, Log, Buffer Manager over the buffer pool in memory) on top of the operating and file systems, backed by a LOCAL CATALOG (local conceptual schema, local internal schema).]
Distributed System

Distributed DBMS

Cloud DBMS

Cloud Databases

Parallel database architectures

[Figure: parallel database architectures (shared-memory, shared-disk, shared-nothing). D. DeWitt & J. Gray; figure by D. Abadi]

Key Features of Cloud Databases
• Scalability
a) Ability to horizontally scale (scale out)
• Quality of service
• Performance/Efficiency
b) Fragmentation: Replication & Distribution
c) Indexing: Distributed indexes and RAM
• Reliability/Availability
• Concurrency
d) Weaker concurrency model than ACID
• Transparency
e) Simple call level interface or protocol
• No declarative query language
• Further objectives
f) Flexible schema
• Ability to dynamically add new attributes
g) Quick/Cheap set up
h) Multi-tenancy

Multi-tenancy platform problems (provider side)
• Difficulty: Unpredictable load characteristics
• Variable popularity
• Flash crowds
• Variable resource requirements
• Requirement: Support thousands of tenants
a) Maintain metadata about tenants (e.g., activated features)
b) Self-managing
c) Tolerating failures
d) Scale-out is necessary (sooner or later)
• Rolling upgrades one server at a time
e) Elastic load balancing
• Dynamic partitioning of databases

Data management problems (tenant side)
I. (Distributed) data design
• Data fragmentation
• Data allocation
• Data replication
II. (Distributed) catalog management
• Metadata fragmentation
• Metadata allocation
• Metadata replication
III. (Distributed) transaction management
• Enforcement of ACID properties
• Distributed recovery system
• Distributed concurrency control system
• Replica consistency
• Latency & Availability vs. Update performance

IV. (Distributed) query processing


• Optimization considering
1) Distribution/Parallelism
• Communication overhead
2) Replication

(Distributed) Data Design
Challenge I

DDB Design
• Given a DB and its workload, how should the DB be split and allocated to sites so as to optimize certain objective functions
• E.g., minimize resource consumption for query processing

• Two main issues:
• Data fragmentation
• Data allocation
• Data replication (a generalization of allocation)

Data Fragmentation
• Usefulness
• An application typically accesses only a subset of data
• Different subsets are (naturally) needed at different sites
• The degree of concurrency is enhanced
• Facilitates parallelism
• Fragments can even be defined dynamically (i.e., at query time, not at design time)

• Difficulties
• Complicates the catalog management
• May lead to poorer performance when multiple fragments need to be joined
• Fragments likely to be used jointly can be colocated to minimize communication overhead
• Costly to enforce the dependency between attributes in different fragments

Fragmentation Correctness
• Completeness
• Every datum in the relation must be assigned to a fragment
• Disjointness
• There is no redundancy and every datum is assigned to only one fragment
• The decision to replicate data is in the allocation phase
• Reconstruction
• The original relation can be reconstructed from the fragments
• Union for horizontal fragmentation
• Join for vertical fragmentation
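The three correctness rules can be checked mechanically. A minimal sketch in plain Python for horizontal fragmentation (the `employees` relation and the fragmentation predicate on `city` are hypothetical):

```python
# Hypothetical relation: a list of (id, name, city) tuples.
employees = [
    (1, "Ana", "BCN"),
    (2, "Bob", "MAD"),
    (3, "Eva", "BCN"),
]

# Horizontal fragmentation by a predicate on `city`.
frag_bcn = [t for t in employees if t[2] == "BCN"]
frag_rest = [t for t in employees if t[2] != "BCN"]

# Completeness: every tuple is assigned to some fragment.
assert all(t in frag_bcn or t in frag_rest for t in employees)
# Disjointness: no tuple appears in more than one fragment.
assert not set(frag_bcn) & set(frag_rest)
# Reconstruction: the union of the fragments yields the original relation.
assert set(frag_bcn) | set(frag_rest) == set(employees)
```

For vertical fragmentation, the same checks would use projections on attribute subsets (each keeping the key) and a join for reconstruction.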

Finding the best fragmentation strategy
• Consider it per table
• Finding the optimal fragmentation is NP-hard
• Needed information
• Workload
• Frequency of each query
• Access plan and cost of each query
• Take intermediate results and repetitive access into account
• Value distribution and selectivity of predicates
• Work in three phases
1. Determine primary partitions (i.e., attribute subsets often accessed together)
2. Generate a disjoint and covering combination of primary partitions
3. Evaluate the cost of all combinations generated in the previous phase

Data Allocation
• Given a set of fragments and a set of sites on which a number of applications are running, allocate each fragment such that some optimization criterion is met (subject to certain constraints)
• It is known to be an NP-hard problem
• The optimal solution depends on many factors
• Location in which the query originates
• The query processing strategies (e.g., join methods)
• Furthermore, in a dynamic environment the workload and access patterns may change
• The problem is typically simplified with certain assumptions
• E.g., only communication cost considered
• Typical approaches build cost models and any optimization algorithm can be
adapted to solve it
• Sub-optimal solutions
• Heuristics are also available
• E.g., best-fit for non-replicated fragments
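The best-fit heuristic for non-replicated fragments can be sketched as a greedy choice: place each fragment at the site that accesses it most often. All names and frequencies below are hypothetical:

```python
# Toy best-fit allocation for non-replicated fragments (hypothetical data):
# accesses[fragment][site] = how often that site accesses the fragment.
accesses = {
    "F1": {"S1": 10, "S2": 3},
    "F2": {"S1": 1, "S2": 8},
}

# Greedy: each fragment goes to the site with the highest access frequency.
allocation = {
    frag: max(sites, key=sites.get)
    for frag, sites in accesses.items()
}
print(allocation)  # {'F1': 'S1', 'F2': 'S2'}
```

This ignores join co-location and communication between sites, which is exactly why it is only a sub-optimal heuristic.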

Data Replication
• Generalization of Allocation (for more than one location)
• Provides execution alternatives
• Improves availability
• Generates consistency problems
• Especially useful for read-only workloads
• No synchronization required

(Distributed) Catalog
Management
Challenge II

DDBMS Catalog Characteristics
• Fragmentation
• Global metadata (GLOBAL CATALOG)
• External schemas
• Global conceptual schema
• Fragment schema
• Allocation schema
• Local metadata (LOCAL CATALOG)
• Local conceptual schema
• Physical (internal) schema
• Allocation
• Global metadata in the coordinator node
• Local metadata in the workers
• Replication
a) Single-copy (coordinator node)
• Single point of failure
• Poor performance (potential bottleneck)
b) Multi-copy (mirroring, secondary node)
• Requires synchronization
(Distributed) Transaction
Management
Challenge III

CAP theorem
“We can only achieve two of Consistency, system Availability, and tolerance to network Partition.”
Eric Brewer

• Consistency (C): equivalent to a single up-to-date copy of the data
• High availability (A) of the data (for updates)
• Tolerance to network partitions (P)

[Figure: a write to a replica during a network partition either a) fails (system unavailable) or b) succeeds (replicas become inconsistent).]
Configuration alternatives
a) Strong consistency (give away availability)
• Replicas are synchronously modified and guarantee consistent query answering
• The whole system will be declared not to be available in case of network partition
b) Eventually consistent (give away consistency)
• Changes are asynchronously propagated to replicas so answer to the same query
depends on the replica being used
• In case of network partition, changes will be simply delayed
c) Non-distributed data (give away network partitioning)
• Connectivity cannot be lost
• We can have strong consistency without affecting availability

Managing replicas
• Replicating fragments improves query latency and availability
• Requires dealing with consistency and update (a.k.a. synchronization) performance
• Replication protocol characteristics
• Primary copy – Distributed versioning
• Eager – Lazy replication

[Figure: the four synchronization protocols — a) eager primary copy replication (writes go to the primary, propagated synchronously to replicas: strong consistency), b) lazy primary copy replication (propagated asynchronously: eventually consistent), c) eager distributed replication (writes at any replica, synchronous propagation), d) lazy distributed replication (writes at any replica, asynchronous propagation).]
Eventual consistency

[Figure: eventual consistency illustrated — Justin Travis Waith-Mair]

Replication management configuration
• Definitions
• N: #replicas
• W: #replicas that have to be written before commit
• R: #replicas read that need to coincide before giving a response
• Named situations
• Inconsistency window: W < N
• Strong consistency: R + W > N
• Eventually consistent: R + W ≤ N
• The sets of machines read (R) and written (W) may not overlap
• Potential conflict: W < (N+1)/2
• The sets of writing machines (W) may not overlap
• Typical configurations
• Fault-tolerant system: N=3; W=2; R=2
• Massive replication for read scaling: R=1
• Read One-Write All (ROWA): R=1; W=N (1+N > N → strong consistency)
• Fast read
• Slow write (low probability of succeeding)
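The quorum conditions above can be checked mechanically. A minimal sketch (the function name `classify` is ours, not part of the slides):

```python
# Quorum arithmetic for N replicas, W write acks, R coinciding reads.
def classify(n: int, w: int, r: int) -> str:
    """Classify a replication configuration per the R/W/N rules."""
    if r + w > n:
        return "strong consistency"   # read and write sets must overlap
    return "eventually consistent"    # they may miss each other

assert classify(3, 2, 2) == "strong consistency"     # fault-tolerant: N=3; W=2; R=2
assert classify(3, 3, 1) == "strong consistency"     # ROWA: W=N, R=1
assert classify(3, 1, 1) == "eventually consistent"  # massive read scaling
```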

Visual Guide to NOSQL Systems
Data models: Relational; Key-Value; Column-Oriented/Tabular; Document-Oriented

• Availability (A): each client can always read and write.
• Consistency (C): all clients always have the same view of the data.
• Partition tolerance (P): the system works well despite physical network partitions.

Pick two:
• CA: RDBMSs (MySQL, Postgres, …); Aster Data; Greenplum; Vertica
• AP: Dynamo; Voldemort; Tokyo Cabinet; KAI; Cassandra; SimpleDB; CouchDB; Riak
• CP: BigTable; Hypertable; HBase; MongoDB; Terrastore; BerkeleyDB; Scalaris; MemcacheDB; Redis

source: https://blog.nahurst.com/visual-guide-to-nosql-systems
(Distributed) Query
Processing
Challenge IV

Challenges in distributed query processing
• Communication cost (data shipping)
• Not that critical for LAN networks
• Assuming high enough I/O cost
• Fragmentation / Replication
• Metadata and statistics about fragments (and replicas) in the global catalog
• Join optimization
• Join order
• Semi-join strategy
• How to decide the execution plan
• Exploit parallelism (!)
• Who executes what

Note: a centralized optimizer minimizes the number of accesses to disk; a distributed optimizer minimizes the use of network bandwidth.
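The semi-join strategy can be illustrated with a toy sketch: instead of shipping a whole relation between sites, one site first ships only the join-key values, and the other ships back just the tuples that can actually join (relation contents below are hypothetical):

```python
# Semi-join sketch: R lives at site 1, S at site 2; join on the first field.
R = [(1, "a"), (2, "b"), (3, "c")]   # at site 1
S = [(2, "x"), (3, "y"), (4, "z")]   # at site 2

# Step 1: site 1 ships only the join keys of R to site 2 (a small message).
r_keys = {key for key, _ in R}

# Step 2: site 2 ships back only the S-tuples that can possibly join.
s_reduced = [t for t in S if t[0] in r_keys]

# Step 3: site 1 completes the join locally on the reduced S.
result = [(rk, rv, sv) for rk, rv in R for sk, sv in s_reduced if rk == sk]
print(result)  # [(2, 'b', 'x'), (3, 'c', 'y')]
```

The saving comes from never transferring tuples like (4, "z") that cannot contribute to the result.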

The main scenarios in data processing
• Data shipping
• The data is retrieved from the storage site to the site executing the query
• Avoids bottlenecks on frequently used data
• Too much data may be moved – bandwidth intensive!
• Query shipping
• The evaluation of the query is delegated to the site where the data is stored
• Avoids transferring large amounts of data
• Overloads the machines containing the data!
• Hybrid strategy
• Dynamically decide between data and query shipping

[Figure: under data shipping, the querying site S1 pulls fragments D1 and D2 to itself; under query shipping, the query is sent to the site (S2) holding D1 and D2.]
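The hybrid decision can be sketched as a simple cost comparison, assuming for illustration that we only weigh bytes moved over the network (the function name and all numbers are hypothetical):

```python
# Decide data vs. query shipping by comparing bytes sent over the network.
def choose_shipping(data_bytes: int, query_bytes: int, result_bytes: int) -> str:
    """Data shipping moves the data; query shipping moves query + result."""
    if data_bytes < query_bytes + result_bytes:
        return "data shipping"
    return "query shipping"

# Shipping a 1 GB table to answer a tiny query is clearly worse than
# shipping the query text and getting a small result back.
assert choose_shipping(10**9, 10**3, 10**6) == "query shipping"
# For a tiny table, moving the data once may be cheaper.
assert choose_shipping(500, 10**3, 10**6) == "data shipping"
```

A real optimizer would also weigh the load on the data-holding machines, which is the drawback query shipping has on this slide.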

Phases of distributed query processing

Syntactic optimizer
• Ordering operators
• Left- or right-deep trees (hard to parallelize)
• Bushy trees (easy to parallelize)
• Added difficulties
• Consider multi-way joins
• Consider the size of the datasets
• Especially the size of the intermediate joins

Physical optimizer
• Transforms an internal query representation into an efficient plan
• Replaces the logical query operators by
1) Specific algorithms (plan operators)
2) Access methods
• Decides in which order to execute them
• Parallelism (!)
• Selects where to execute them (exploit Data Location)
• More difficult for joins (multi-way joins)
• This is done by…
• Enumerating alternative but equivalent plans
• Estimating their costs
• Searching for the best solution
• Using available statistics regarding the physical state of the system

Criteria to choose the access plan
• Usually with the goal to optimize response time
• Time needed to execute a query (i.e., latency or response time)
• Benefits from parallelism
• Cost Model
• Sum of local cost and communication cost
• Local cost
• Cost of CPU processing (#cycles)
• Unit cost of I/O operation (#I/O ops)
• Communication cost (commonly assumed it is linear in the number of bytes transmitted)
• Cost of initiating a message and sending a message (#messages)
• Cost of transmitting one byte (#bytes)
• Knowledge required
• Size of elementary data units processed
• Selectivity of operations to estimate intermediate results

Cost model example
• Parameters:
• Local processing:
• Average CPU time to process an instance (Tcpu)
• Number of instances processed (#inst)
• I/O time per operation (TI/O)
• Number of I/O operations (#I/Os)
• Global processing:
• Message time (TMsg)
• Number of messages issued (#msgs)
• Transfer time to send a byte from one site to another (TTR)
• Number of bytes transferred (#bytes)

Note: the statistics are not difficult to collect; the problem is that, to estimate the response time, we need to know a priori what is going to be executed in parallel and what is going to be executed sequentially!

• Calculations:
Resources Used = Wcpu * Tcpu * #inst + WI/O * TI/O * #I/Os + WMsg * TMsg * #msgs + WTR * TTR * #bytes
Response Time = Tcpu * seq#inst + TI/O * seq#I/Os + TMsg * seq#msgs + TTR * seq#bytes
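The two formulas translate directly into code. A sketch with the weights W, unit times T, and counts as defined above; the function names and all numeric values are made up for illustration:

```python
# Sketch of the slide's cost model; all parameter values are hypothetical.
def resources_used(w, t, counts):
    """Weighted total resource consumption over the four cost factors."""
    return sum(w[k] * t[k] * counts[k] for k in ("cpu", "io", "msg", "tr"))

def response_time(t, seq_counts):
    """Only the sequential part of the work contributes to latency."""
    return sum(t[k] * seq_counts[k] for k in ("cpu", "io", "msg", "tr"))

t = {"cpu": 1e-8, "io": 1e-3, "msg": 1e-4, "tr": 1e-7}    # unit times (s)
w = {"cpu": 1.0, "io": 1.0, "msg": 1.0, "tr": 1.0}        # weights
total = {"cpu": 1_000_000, "io": 100, "msg": 20, "tr": 50_000}
seq = {"cpu": 250_000, "io": 25, "msg": 5, "tr": 10_000}  # sequential part

print(resources_used(w, t, total))  # total work summed across all sites
print(response_time(t, seq))        # latency along the sequential critical path
```

With parallelism, the sequential counts are smaller than the totals, so the response time is below the total resources used — exactly the gap the slide's note warns is hard to predict a priori.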

The problem of parallelism

[Figure: theory vs. practice of parallel speed-up — cartoon by Samuel Yee]

Bulk Synchronous Parallel Model

[Figure: Bulk Synchronous Parallel supersteps. In the ideal case all workers finish together; in the real case stragglers cause wasted computing time at each synchronization barrier. Source: SAILING lab slides]

Kinds of parallelism
• Inter-query: different queries executed in parallel
• Intra-query
• Intra-operator (within one operator)
• Unary (e.g., selection): static partitioning
• Binary (e.g., join): static or dynamic partitioning
• Inter-operator
• Independent
• Pipelined

[Figure: pipelined vs. independent inter-operator parallelism]
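Intra-operator parallelism on a unary operator can be sketched by statically partitioning the input and running the same selection on each fragment concurrently (the partition count and data are hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

# Static partitioning of a relation into 3 fragments (hypothetical data).
relation = list(range(100))
fragments = [relation[i::3] for i in range(3)]

def select(fragment):
    """The same selection predicate runs on every fragment in parallel."""
    return [x for x in fragment if x % 10 == 0]

with ThreadPoolExecutor(max_workers=3) as pool:
    partial_results = pool.map(select, fragments)

# The union of the partial results equals the non-parallel selection.
result = sorted(x for part in partial_results for x in part)
print(result)  # [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
```

This works for a unary operator because selection distributes over partitions; a binary operator such as join additionally needs compatible (or dynamically re-done) partitioning of both inputs.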

Closing

Summary
• Distributed Systems
• Distributed Database Systems
• Distributed Database Systems Architectures
• Cloud Databases
• Distributed Database Design
• Fragmentation (kinds, characteristics)
• Allocation
• Replication
• Distributed Catalog
• Distributed Transactions
• CAP Theorem
• Strong and Eventual Consistency
• Distributed Querying
• Data shipping
• Query shipping
• Query Optimizer
• Parallelism
References
• D. DeWitt & J. Gray. Parallel Database Systems: The future of High Performance
Database Processing. Communications of the ACM, June 1992
• N. J. Gunther. A Simple Capacity Model of Massively Parallel Transaction
Systems. CMG National Conference, 1993
• L. Liu, M.T. Özsu (Eds.). Encyclopedia of Database Systems. Springer, 2009
• M. T. Özsu & P. Valduriez. Principles of Distributed Database Systems, 3rd Ed.
Springer, 2011
• G. Coulouris et al. Distributed Systems: Concepts and Design, 5th Ed. Addison-Wesley, 2012
• G. Graefe. Query Evaluation Techniques. In ACM Computing Surveys, 25(2),
1993
• L. G. Valiant. A bridging model for parallel computation. Commun. ACM.
August 1990
