02 Distributed DBMS: Storage
Systems@TUDa http://tuda.systems/
Announcements (29/10/2024)
Slides on Lab 0 tasks have been updated.
• Clarification on next_free and free_list
READING LIST
Silberschatz et al.: Database System Concepts, 7th Edition (Chapter 20 to Chapter 21)
DISTRIBUTED DBMSs
[Figure: four servers (Server 1 to Server 4) connected by a communication network. Two architectures are sketched: a coordinator/worker setup (e.g., Oracle) and a peer-to-peer setup (e.g., SAP, MSQL).]
FOCUS OF THIS & NEXT LECTURES
Distributed DBMS within a Data Center (aka Parallel DBMSs)
• Shared-nothing: multiple workers connected by a network, each with its own compute (CPU, RAM) and storage; every worker holds a private portion of the data and runs its share of a query on that private data.
• Shared storage: multiple compute servers (CPU, RAM) that access the shared data on separate storage over the network; any compute server can run queries on any part of the data.
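A toy sketch (Python; the partition layout and the scheduler are assumptions, not part of the lecture) contrasting the two designs: with shared-nothing a worker can only scan the partition it owns, while with shared storage any compute server can be assigned any partition.

# Hypothetical partitioned table: partition id -> rows stored in it.
partitions = {0: [1, 2], 1: [3, 4], 2: [5, 6], 3: [7, 8]}

def shared_nothing_scan(worker_id):
    # Shared-nothing: worker i owns exactly partition i and scans only that.
    return partitions[worker_id]

def shared_storage_scan(assigned_partitions):
    # Shared storage: a compute server reads whichever partitions the
    # scheduler assigns to it over the network.
    return [row for p in assigned_partitions for row in partitions[p]]

print(shared_nothing_scan(2))        # worker 2 -> its private rows [5, 6]
print(shared_storage_scan([0, 3]))   # any server can read partitions 0 and 3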
Inter-query parallelism
▪ Different queries are run in parallel across machines of a cluster
[Figure: the Orders table is split horizontally in the storage layer into Orders0 to Orders3, stored as Partition 0 to Partition 3. Three assignment schemes are shown: a plain four-way split, hash partitioning with h(oid) = oid % 4 (a tuple with h(oid) = i goes to Partition i), and range partitioning on the Time attribute (orders from Q1 to Q4 go to Partition 0 to Partition 3).]
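A minimal sketch (Python, not part of the slides; treating the plain split as round-robin is an assumption) of how a tuple's target partition could be computed under each scheme:

NUM_PARTITIONS = 4

def round_robin_partition(row_position):
    # Plain split: spread tuples evenly, independent of their values (assumed round-robin).
    return row_position % NUM_PARTITIONS

def hash_partition(oid):
    # Hash partitioning with the slide's hash function h(oid) = oid % 4.
    return oid % NUM_PARTITIONS

def range_partition(quarter):
    # Range partitioning on the Time attribute: Q1..Q4 -> Partition 0..3.
    return quarter - 1

print(hash_partition(7))    # h(7) = 3 -> Partition 3
print(range_partition(2))   # an order from Q2 -> Partition 1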
[Figure: motivating join example c ⨝ o = ?. A Customer tuple (cid 1, Smith) and Orders tuples such as (oid 1, total 100, cid 1), (oid 3, 99, 2), and (oid 4, 199, 1) are placed on nodes such as N2; without co-partitioning, matching Customer and Orders tuples are not guaranteed to sit on the same node.]
Range-Partitioning:
• Pro: Pruning for key-lookups (year=2017) and range queries (year>=2010)
• Con: Sensitive to partition skew
Hash-Partitioning:
• Pro: Pruning for key-lookups, avoids partitioning skew (if hash function is good)
• Con: Pruning does not work for range queries
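To make the pruning argument concrete, here is a small sketch (Python; the partition boundaries and the hash function on year are assumptions) of which partitions must be scanned for a key lookup versus a range predicate:

NUM_PARTITIONS = 4

def prune_hash(year_eq=None):
    # Hash partitioning (assumed h(year) = year % 4): only an equality
    # predicate can be mapped to a single partition.
    if year_eq is not None:
        return {year_eq % NUM_PARTITIONS}
    return set(range(NUM_PARTITIONS))          # range predicate: scan all partitions

def prune_range(year_ge, boundaries=(2000, 2010, 2020)):
    # Range partitioning with assumed boundaries: partition 0 holds year < 2000,
    # partition 1 holds 2000 <= year < 2010, and so on.
    first = sum(1 for b in boundaries if year_ge >= b)
    return set(range(first, NUM_PARTITIONS))   # scan only partitions >= first

print(prune_hash(year_eq=2017))   # key lookup -> exactly one partition
print(prune_hash())               # year >= 2010 -> all partitions, no pruning
print(prune_range(2010))          # year >= 2010 -> partitions {2, 3}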
RECAP: CO-PARTITION & DISTRIBUTED JOINS
Co-partitioning on the join key: Customer and Orders are both hash-partitioned on cid with cid % 3, so the join c.cid = o.cid can be computed locally on every node.
[Figure: Customer0 and Orders0 reside on node 0; Customer1, Orders1, and Part1 on node 1; Customer2, Orders2, and Part2 on node 2. Customer(cid, cname, cage) and Orders(oid, total, cid, part) are hash-partitioned on cid (cid % 3); Part(part, name) with rows such as (1, Phone), (2, TV), (3, Comp.) is stored alongside but apparently not co-partitioned with Orders on the part key (marked "?").]
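A minimal sketch (Python; the data values loosely follow the figure, the code itself is not from the lecture) of why co-partitioning both tables on cid with cid % 3 lets every node join purely locally:

NUM_NODES = 3

def node_of(cid):
    # The same partitioning function is used for both tables.
    return cid % NUM_NODES

customer = [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"), (6, "F")]   # (cid, cname)
orders   = [(5, 199, 3), (19, 55, 6), (77, 77, 1), (45, 100, 2)]          # (oid, total, cid)

# Both tables are split with the same function on the join key cid ...
cust_parts  = {n: [c for c in customer if node_of(c[0]) == n] for n in range(NUM_NODES)}
order_parts = {n: [o for o in orders   if node_of(o[2]) == n] for n in range(NUM_NODES)}

# ... so each node can join its local partitions without shipping any data.
for n in range(NUM_NODES):
    local_join = [(c, o) for c in cust_parts[n] for o in order_parts[n] if c[0] == o[2]]
    print(f"node {n}: {local_join}")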
PREDICATE-BASED REFERENCE PARTITIONING
Based on paper: https://dl.acm.org/doi/10.1145/2723372.2723718
[Figure: schema graph with foreign-key (fk) edges CUSTOMER C - ORDERS O - LINEITEM L, with NATION and SUPPLIER S attached via further fk edges; the schema graph is shown twice. The partitioning scheme follows such fk join paths.]
Workload-driven (WD): Use workload to derive join paths
SELECT D3.city,
SUM(F.price)
FROM F, D2, D3, D4
WHERE F.d2 = D2.id
AND F.d3 = D3.id
AND F.d4 = D4.id
AND D2.quarter = 4
AND D2.year = 2019
AND D3.country='Germany'
AND D4.product='Lenovo T61'
GROUP BY D3.city
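The workload-driven idea can be illustrated with a rough sketch (Python, purely illustrative; the paper's actual analysis is more involved): collect the equi-join predicates of the workload queries to obtain the join paths along which tables should be co-partitioned.

import re

query = """
SELECT D3.city, SUM(F.price)
FROM F, D2, D3, D4
WHERE F.d2 = D2.id AND F.d3 = D3.id AND F.d4 = D4.id
  AND D2.quarter = 4 AND D2.year = 2019
  AND D3.country = 'Germany' AND D4.product = 'Lenovo T61'
GROUP BY D3.city
"""

# Equi-join predicates of the form <table>.<col> = <table>.<col>.
join_edges = re.findall(r"(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)", query)
join_edges = [e for e in join_edges if e[0] != e[2]]   # keep cross-table edges only
for lt, lc, rt, rc in join_edges:
    print(f"join path: {lt}.{lc} -> {rt}.{rc}")
# -> F.d2 -> D2.id, F.d3 -> D3.id, F.d4 -> D4.id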
EXAMPLE: STAR SCHEMA
Step 1: Fact table and large dimension tables are partitioned
[Figure: the fact table F (Sales) is split into partitions Sales0 to Sales3; the large dimension D4: Product is split into Product0 to Product3; D1: Payment is a further dimension table of the schema.]
EXAMPLE: STAR SCHEMA
Step 2: Partitions of dimension and fact table are co-located on nodes
[Figure: Node 1 holds Sales0 and D4: Product0, Node 2 holds Sales1 and D4: Product1, Node 3 holds Sales2 and D4: Product2, Node 4 holds Sales3 and D4: Product3. In a second diagram, the small dimension tables D1: Payment, D2: Time, and D3: Location are additionally replicated to the nodes.]
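A small sketch (Python; the node numbering follows the figure, the replication list is an assumption based on it) of the resulting placement: partition i of Sales and of D4: Product are co-located on the same node, and the small dimension tables are replicated to every node.

NUM_NODES = 4
REPLICATED_DIMS = ["D1:Payment", "D2:Time", "D3:Location"]   # small dimensions

# Node i+1 gets Sales_i, the co-partitioned Product_i, and copies of the small dims.
placement = {
    f"Node {n + 1}": [f"Sales{n}", f"D4:Product{n}"] + REPLICATED_DIMS
    for n in range(NUM_NODES)
}

for node, tables in placement.items():
    print(node, tables)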
ISSUE: MANY LARGE DIMENSION TABLES
Problem: There are multiple large dimension tables that do not all fit on one node, so they cannot be fully replicated.
Solution 1:
• Partition all dimension tables
• ... but then co-partitioning all dimension tables with the fact table is not possible → a distributed join with the fact table is needed (see next lecture)
OLTP PARTITIONING: SCHISM
Based on paper: https://dspace.mit.edu/handle/1721.1/73347
1. Build a graph from a workload trace
▪ Nodes: Tuples accessed by the transactions (txns) in the trace
▪ Edges: Connect tuples that are accessed by the same txn
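A minimal sketch of step 1 (Python, using networkx and a hypothetical trace; Schism itself does not prescribe a library): every tuple accessed in the trace becomes a node, and each pair of tuples touched by the same transaction is connected by an edge whose weight counts the co-accesses.

import itertools
import networkx as nx

# Hypothetical workload trace: each transaction lists the tuple ids it accesses.
trace = [
    {"c1", "o1", "o2"},   # txn 1
    {"c1", "o3"},         # txn 2
    {"c2", "o3", "o4"},   # txn 3
]

g = nx.Graph()
for txn in trace:
    for u, v in itertools.combinations(sorted(txn), 2):
        # Increase the edge weight each time two tuples are co-accessed.
        w = g.get_edge_data(u, v, default={"weight": 0})["weight"]
        g.add_edge(u, v, weight=w + 1)

print(list(g.edges(data=True)))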
OLTP PARTITIONING: SCHISM
2. Partitioning should minimize distributed txns
• Idea: a min-cut of the graph minimizes the number of cross-partition edges, i.e., distributed txns
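A rough sketch of step 2 under the same assumptions (Python with networkx; Kernighan-Lin is used here to approximate a balanced min-cut, whereas the Schism paper uses METIS): the cut edges correspond to tuple pairs that would be accessed by distributed transactions.

import networkx as nx
from networkx.algorithms.community import kernighan_lin_bisection

g = nx.Graph()
g.add_weighted_edges_from([
    ("c1", "o1", 3), ("c1", "o2", 3), ("o1", "o2", 3),   # frequently co-accessed group
    ("c2", "o3", 2), ("c2", "o4", 2), ("o3", "o4", 2),   # second co-accessed group
    ("o2", "o3", 1),                                     # rare cross access
])

# Kernighan-Lin bisection approximates a balanced 2-way min-cut.
part_a, part_b = kernighan_lin_bisection(g, weight="weight")
cut = [(u, v) for u, v in g.edges if (u in part_a) != (v in part_a)]
print(part_a, part_b)        # tuples grouped per partition
print("cut edges:", cut)     # each cut edge ~ a potentially distributed txn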
EXAMPLE: BUILDING A GRAPH