C3-Distributed_Databases


NATIONAL ENGINEERING SCHOOL OF TUNIS

Distributed Systems
Level: 3rd year Software Engineering
Instructor: Dr. Wafa MEFTEH
Plan

1 - Key Elements and Architectures

2 - Distributed Databases

3 - Agent-Based Modeling, Simulation and Programming

4 - Distributed Artificial Intelligence

Wafa MEFTEH - ENIT 25/11/2024


Distributed Systems
Part 2: Distributed Databases


3 - Processing & Optimisation of Distributed Queries
Challenges

▪ The execution rules and query optimization methods defined for a centralized context remain valid, but we must also consider:
• The fragmentation and distribution of data across different sites.
• The cost of the inter-site communication needed to transfer data.

▪ Fragmentation, with or without replication, mainly affects updates, while communication costs mainly affect ordinary queries.
Updates

An update to a relation of the global schema results in several updates on different fragments.

1. First, identify the fragments affected by the update operation.

2. Then decompose the operation accordingly into a set of update operations on these fragments.
Updates

Insert: Identify the appropriate horizontal fragment based on the conditions defining each fragment, then insert the tuple into all corresponding vertical fragments to maintain data integrity and alignment.

Delete: Search for the tuple within the fragments most likely to contain it, then remove the corresponding attribute values across all associated vertical fragments to ensure consistency.

Update: Identify the relevant tuples, apply the necessary modifications, and relocate them to the appropriate fragments as required to maintain data accuracy and alignment.
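
The insert rule can be sketched in a few lines of Python. This is only an illustration under assumed names: the relation Client, the two horizontal conditions on ville, and the vertical split into "names" and "cities" are hypothetical and are not the schema shown on the slides.

```python
# Hypothetical hybrid fragmentation of Client(nclient, nom, ville):
# horizontal conditions pick a group, vertical fragments split the attributes.
# The key nclient is repeated in every vertical fragment so tuples can be rejoined.
HORIZONTAL = {
    "paris": lambda t: t["ville"] == "Paris",
    "other": lambda t: t["ville"] != "Paris",
}
VERTICAL = {"names": ("nclient", "nom"), "cities": ("nclient", "ville")}

# One storage list per (horizontal, vertical) fragment.
fragments = {(h, v): [] for h in HORIZONTAL for v in VERTICAL}


def insert(tuple_):
    """Insert a tuple into every vertical fragment of the matching horizontal fragment."""
    for h, condition in HORIZONTAL.items():
        if condition(tuple_):
            for v, attrs in VERTICAL.items():
                fragments[(h, v)].append({a: tuple_[a] for a in attrs})
            return h
    raise ValueError("no fragment condition accepts this tuple")


target = insert({"nclient": 7, "nom": "Dupont", "ville": "Paris"})
```

Delete and update would follow the same routing step: evaluate the fragment conditions first, then touch only the fragments they designate.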
Updates Examples

▪ The horizontal fragment concerned can be identified from the CCs (in this case, CC3).

▪ Next, insert the tuple into all the vertical fragments.
Updates Examples

▪ We use the CC conditions: here, CC3 and CC4 are concerned.

▪ So, we search in the corresponding fragments.
Updates Examples

▪ No CC is involved.

▪ All fragments must be searched.

Updates Examples

▪ CC3, CC4 and CC5 are affected.

▪ We apply the modification and verify that the CCs still hold.
• Since numeq = 1, the tuple must be removed from F3 and F5.
• The tuple is then moved into fragment 4.
Query on DDB

▪ In a distributed environment, queries formulated at the global level are broken down into subqueries.

▪ These subqueries are sent to the systems at the local sites, where they are executed.

▪ The local responses are then combined to build the response to the global query.
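
The three steps above can be sketched with Python's standard sqlite3 module, using one in-memory database per "site". The placement of rows on the two sites and the helper names make_site and global_query are assumptions made for the illustration; a real DDBMS would also ship the subqueries over a network.

```python
import sqlite3


def make_site(rows):
    """Create an in-memory database standing in for one local site."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE Client (nclient INTEGER, nom TEXT, ville TEXT)")
    db.executemany("INSERT INTO Client VALUES (?, ?, ?)", rows)
    return db


# Hypothetical placement: each site stores one horizontal fragment of Client.
sites = [
    make_site([(1, "Dupont", "Paris"), (2, "Martin", "Paris")]),
    make_site([(3, "Durand", "Lyon")]),
]


def global_query(subquery):
    """Send the same subquery to every site and merge the local answers."""
    result = []
    for db in sites:
        result.extend(db.execute(subquery).fetchall())  # local execution
    return result  # union of the local responses


names = global_query("SELECT nom FROM Client")
```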
Query on DDB

▪ This is the process we will describe, considering global queries initially formulated in SQL.

▪ They are rewritten in algebraic form so that they can be reduced and optimized.

▪ The fragmentation schema makes it possible to determine which local queries are sent to each site.
Fragmentation of Dist-Queries

1. Construction of the global execution plan - Express the query as an algebraic tree
❖ leaf = relation
❖ node = relational operation
2. Expression of the plan in terms of fragments - Replace each leaf with the program that reconstructs the corresponding global relation.

3. Plan transformation - Apply reduction techniques to eliminate unnecessary operations.
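
Steps 1 and 2 can be sketched as a small tree rewrite in Python. The Node class, the helper names, and the reconstruction program "Client = Client1 UNION Client2" are assumptions chosen for the illustration, since the fragmentation schema of the slides is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    op: str              # relational operation, or "rel" for a leaf
    arg: str = ""        # relation name, attribute list, or predicate
    children: tuple = ()


# Hypothetical fragmentation schema: Client = Client1 UNION Client2.
RECONSTRUCTION = {
    "Client": Node("union", children=(Node("rel", "Client1"), Node("rel", "Client2"))),
}


def localize(node):
    """Step 2: replace each global leaf by its reconstruction program."""
    if node.op == "rel":
        return RECONSTRUCTION.get(node.arg, node)
    return Node(node.op, node.arg, tuple(localize(c) for c in node.children))


def show(node):
    """Render the tree as a nested algebraic expression."""
    if node.op == "rel":
        return node.arg
    inner = ", ".join(show(c) for c in node.children)
    return f"{node.op}[{node.arg}]({inner})" if node.arg else f"{node.op}({inner})"


# Step 1: SELECT nom FROM Client as a tree (a projection node over one leaf).
plan = Node("project", "nom", (Node("rel", "Client"),))
localized = show(localize(plan))
```

Step 3 would then walk the localized tree and prune the branches that cannot contribute to the result.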
Fragmentation of Dist-Queries
Examples

Consider the relations:
Client(nclient, nom, ville)
Cde(ncde, #nclient, produit, qte)

Fragmentation Schema

Fragmentation of Dist-Queries
Examples

SELECT nom FROM Client;

The algebraic tree of the query →

Fragmentation of Dist-Queries
Examples
Reduction of horizontal fragmentation
Rule: eliminate access to unnecessary fragments.
SELECT nom FROM Client WHERE ville = 'Paris';
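
A minimal Python sketch of this rule follows. It assumes a hypothetical schema in which Client1 holds the clients with ville = 'Paris' and Client2 holds all the others; a fragment is kept only if its defining condition is compatible with the constant used in the query's selection.

```python
# Hypothetical fragment conditions on the attribute ville.
FRAGMENT_CONDITIONS = {
    "Client1": lambda ville: ville == "Paris",
    "Client2": lambda ville: ville != "Paris",
}


def useful_horizontal_fragments(ville_constant):
    """Keep only the fragments whose condition is compatible with ville = constant."""
    return [
        name
        for name, condition in FRAGMENT_CONDITIONS.items()
        if condition(ville_constant)
    ]


paris_plan = useful_horizontal_fragments("Paris")  # fragments needed for ville = 'Paris'
lyon_plan = useful_horizontal_fragments("Lyon")    # fragments needed for ville = 'Lyon'
```

Under these assumptions the query on Paris reads a single fragment, so the union over the fragments disappears from the plan.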

Fragmentation of Dist-Queries
Examples
Reduction of vertical fragmentation
Rule: eliminate access to base relations (fragments) that contain no attributes useful for the result.
SELECT nclient FROM Cde;
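
The vertical rule can be sketched the same way. The split of Cde into Cde1(ncde, nclient) and Cde2(ncde, produit, qte) is an assumption made for the example; only the key ncde is shared between the two fragments.

```python
# Hypothetical vertical fragmentation of Cde; the key ncde is kept in both fragments.
VERTICAL_FRAGMENTS = {
    "Cde1": {"ncde", "nclient"},
    "Cde2": {"ncde", "produit", "qte"},
}
KEY = {"ncde"}


def useful_vertical_fragments(needed_attributes):
    """Keep the fragments that contribute a needed attribute beyond the shared key."""
    needed = set(needed_attributes)
    if needed <= KEY:
        # A query on the key alone can be answered from any single fragment.
        return [next(iter(VERTICAL_FRAGMENTS))]
    return [
        name
        for name, attrs in VERTICAL_FRAGMENTS.items()
        if (attrs - KEY) & needed
    ]


plan = useful_vertical_fragments(["nclient"])
```

With this split, the query above touches Cde1 only, and the join that would rebuild Cde from its vertical fragments is no longer needed.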

Fragmentation of Dist-Queries
Examples
Reduction of derived horizontal fragmentation
Rule: distribute joins over unions, then apply the reductions for horizontal fragmentation.
SELECT * FROM Client, Cde WHERE Client.nclient = Cde.nclient AND ville = 'Paris';
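
The rule can be sketched by enumerating the fragment-to-fragment joins produced by the distribution and discarding those known to be empty. The schema below is an assumption: Client is split on ville, and each Cde fragment is derived from one Client fragment, so that a join between non-matching fragments cannot return any tuple.

```python
from itertools import product

# Hypothetical schema: Client is split on ville, and Cde is split by derivation
# so that Cde_i holds exactly the orders of the clients stored in Client_i.
CLIENT_FRAGMENTS = {"Client1": "ville = 'Paris'", "Client2": "ville <> 'Paris'"}
DERIVED_FROM = {"Cde1": "Client1", "Cde2": "Client2"}


def reduced_joins(selected_client_fragments):
    """Distribute the join over the unions, then drop the joins known to be empty."""
    all_pairs = product(CLIENT_FRAGMENTS, DERIVED_FROM)  # (Ci, Cdej) for all i, j
    return [
        (c, o)
        for c, o in all_pairs
        if DERIVED_FROM[o] == c               # Ci JOIN Cdej is empty when i != j
        and c in selected_client_fragments    # horizontal reduction on ville
    ]


# ville = 'Paris' is only compatible with the condition defining Client1.
plan = reduced_joins({"Client1"})
```

Of the four joins obtained by distribution, a single one survives under these assumptions.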

Execution Plan

▪ In a centralized DB, the execution plan is the sequence of algebraic operators used to compute a query.

▪ In a distributed DB, it is the sequence of algebraic operators and inter-site data exchanges used to compute a query.

Execution Plan

SELECT nom FROM Client, Cde
WHERE Client.nclient = Cde.nclient AND qte > 10;
Two execution plans are possible.

[Figure: the two candidate execution plans, one of them labeled "algebraically optimal"]

Execution Plan

Rule-based optimization

• Principle: Perform the least expensive operators (projection, selection) first, to reduce the size of the input data for the most expensive operators (join).

• Methodology: Push the selections, then the projections, as far down the tree as possible.
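
The effect of pushing a selection below a join can be seen on a toy example. The data and the nested-loop join below are hypothetical; the count of compared pairs is only a rough stand-in for the cost of the join.

```python
# Toy data (hypothetical): 3 clients and 4 orders, one of which has qte > 10.
clients = [(1, "Dupont"), (2, "Martin"), (3, "Durand")]      # (nclient, nom)
orders = [(10, 1, 5), (11, 1, 20), (12, 2, 3), (13, 3, 8)]    # (ncde, nclient, qte)


def join(client_rows, order_rows):
    """Nested-loop join on nclient; also reports how many pairs were compared."""
    rows = [(c, o) for c in client_rows for o in order_rows if c[0] == o[1]]
    return rows, len(client_rows) * len(order_rows)


# Plan A: join first, then select qte > 10 and project nom.
joined_a, work_a = join(clients, orders)
names_a = [c[1] for c, o in joined_a if o[2] > 10]

# Plan B: push the selection below the join, so the join sees fewer orders.
big_orders = [o for o in orders if o[2] > 10]
joined_b, work_b = join(clients, big_orders)
names_b = [c[1] for c, o in joined_b]
```

Both plans return the same names, but plan B compares 3 pairs instead of 12.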
Execution Plan
Example

Suppose that:

• size(Cde1) = size(Cde2) = 10,000 tuples
• size(Client1) = size(Client2) = 2,000 tuples
• Transfer cost of 1 tuple = 1
• Selectivity (qte > 10) = 1%

Execution Plan
Example
Cost comparison of the two solutions

▪ Solution 1:
1. Transfer Cde1 + Cde2 = 20,000 tuples
2. Transfer Client1 + Client2 = 4,000 tuples
▪ Solution 2:
1. Transfer C1 + C2 = 200 tuples
2. Transfer C3 + C4 = 200 tuples
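
The arithmetic behind this comparison can be checked with a short Python calculation. It uses only the figures given on the slides; reading C1 and C2 as the locally filtered order fragments is an assumption, and the size of C3 + C4 is taken as given rather than derived.

```python
TRANSFER_COST_PER_TUPLE = 1
SIZE_CDE_FRAGMENT = 10_000      # size(Cde1) = size(Cde2)
SIZE_CLIENT_FRAGMENT = 2_000    # size(Client1) = size(Client2)
SELECTIVITY_PERCENT = 1         # selectivity of qte > 10

# Solution 1: ship both Cde fragments and both Client fragments unfiltered.
solution_1 = (2 * SIZE_CDE_FRAGMENT + 2 * SIZE_CLIENT_FRAGMENT) * TRANSFER_COST_PER_TUPLE

# Solution 2: filter locally first. Each filtered order fragment keeps
# 1% of 10,000 = 100 tuples (assumed here to be C1 and C2 on the slide).
filtered_orders = 2 * SIZE_CDE_FRAGMENT * SELECTIVITY_PERCENT // 100
join_results = 200              # C3 + C4, taken directly from the slide
solution_2 = (filtered_orders + join_results) * TRANSFER_COST_PER_TUPLE

ratio = solution_1 // solution_2
```

Solution 1 transfers 24,000 tuples and solution 2 transfers 400, so filtering before shipping divides the communication cost by 60 in this example.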
Complexity of Distributed Queries

▪ In a centralized database, only the I/O and CPU factors determine the complexity of a query.

▪ In a DDB, the complexity of a query is determined by:
• Disk input/output: the cost of accessing the data.
• CPU cost: the cost of processing the data to perform algebraic operations (joins, selections, etc.).
• Network communication: the time needed to exchange a volume of data between the sites involved in executing the query.
Complexity of Distributed Queries

Note that we distinguish between the total cost and the global response time of a query:

• Total cost: the sum of all the times required to complete a query. It includes the execution times at the different sites, the data access times, and the communication times between the sites involved.

• Global response time: the elapsed time between the start and the end of the query's execution. Because some operations can be performed in parallel at multiple sites, the global response time is generally less than the total cost.
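
The difference between the two measures can be illustrated with a small calculation. All the timings below are hypothetical, and the model is deliberately simple: three sites work in parallel, each sends its result to a coordinating site, and that site then merges the results.

```python
# Hypothetical per-site times (in seconds) for subqueries that run in parallel,
# followed by a final step at the coordinating site that merges the results.
local_times = {"site1": 4.0, "site2": 6.0, "site3": 5.0}
transfer_times = {"site1": 1.0, "site2": 1.5, "site3": 0.5}
merge_time = 2.0

# Total cost: every second spent anywhere is added up.
total_cost = sum(local_times.values()) + sum(transfer_times.values()) + merge_time

# Response time: parallel branches overlap, so only the slowest one counts.
slowest_branch = max(local_times[s] + transfer_times[s] for s in local_times)
response_time = slowest_branch + merge_time
```

With these numbers the total cost is 20 seconds while the response time is 9.5 seconds.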
Data Transfer

▪ The transfer time of a message is made up of the access time and the transmission time (data volume / transmission rate).

▪ The access time is negligible on a local network but can reach a few seconds for transmissions over long distances or via satellite.

▪ Under these conditions, data should be processed as complete sets rather than tuple by tuple.

▪ The inter-site transfer unit is a relation or a fragment.
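
The transfer time described above can be written as a one-line function. The function name, the units (bits and bits per second), and the link figures are assumptions chosen for the example.

```python
def transfer_time(volume_bits, rate_bits_per_s, access_time_s=0.0):
    """Transfer time = access time + data volume / transmission rate."""
    return access_time_s + volume_bits / rate_bits_per_s


# Hypothetical fragment of 8 Mbit sent over two kinds of links.
lan = transfer_time(8_000_000, 100_000_000)                         # negligible access time
satellite = transfer_time(8_000_000, 2_000_000, access_time_s=0.5)  # long-distance link
```

Because the access time is paid once per message, sending a whole relation or fragment in one transfer spreads that fixed cost over many tuples.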


Thanks,

See You Next Session


NchaALLAH
