
CARNEGIE MELLON UNIVERSITY

COMPUTER SCIENCE DEPARTMENT

15-445/645 – DATABASE SYSTEMS (FALL 2021)
PROF. LIN MA

Homework #3 (by Sophie Qiu) – Solutions

Due: Sunday Oct 24, 2021 @ 11:59pm

IMPORTANT:
• Upload this PDF with your answers to Gradescope by 11:59pm on Sunday Oct 24, 2021.
• Plagiarism: Homework may be discussed with other students, but all homework is to be
completed individually.
• You have to use this PDF for all of your answers.
For your information:
• Graded out of 100 points; 2 questions total
• Rough time estimate: ≈ 1 - 2 hours (0.5 - 1 hours for each question)
Revision : 2021/11/16 01:04

Question              Points   Score

Sorting Algorithms      40
Join Algorithms         60
Total:                 100

15-445/645 (Fall 2021) Homework #3 Page 2 of 5

Question 1: Sorting Algorithms [40 points]

Graded by:
We have a database file with six million pages (N = 6,000,000 pages), and we want to sort it
using external merge sort. Assume that the DBMS is not using double buffering or blocked
I/O, and that it uses quicksort for in-memory sorting. Let B denote the number of buffers.
(a) [10 points] Assume that the DBMS has six buffers. How many passes does the DBMS
need to perform in order to sort the file?
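□ 5 □ 7 □ 8 ■ 10 □ 12
Solution:

$$1 + \left\lceil \log_{B-1} \left\lceil \frac{N}{B} \right\rceil \right\rceil = 1 + \left\lceil \log_5 \left\lceil 6,000,000 / 6 \right\rceil \right\rceil = 1 + \left\lceil \log_5 1,000,000 \right\rceil = 1 + 9 = 10$$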

(b) [5 points] Again, assume that the DBMS has six buffers. What is the total I/O cost to
sort the file?
□ 60,000,000 ■ 120,000,000 □ 144,000,000 □ 240,000,000 □ 480,000,000

Solution: Cost = 2N × #passes = 2 × 6,000,000 × 10 = 120,000,000
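The pass count and I/O cost above can be checked with a short sketch. It assumes the cost model stated in the question (no double buffering, every pass reads and writes each page once); the function names are illustrative and not part of the homework. Integer arithmetic is used instead of a floating-point logarithm to avoid rounding surprises.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b."""
    return -(-a // b)


def merge_sort_passes(n_pages: int, n_buffers: int) -> int:
    """Passes for external merge sort: pass 0 produces ceil(N/B) sorted runs,
    and each later pass merges up to B-1 runs at a time."""
    if n_buffers < 3:
        raise ValueError("need at least 3 buffers for a (B-1)-way merge")
    runs = ceil_div(n_pages, n_buffers)
    passes = 1
    while runs > 1:
        runs = ceil_div(runs, n_buffers - 1)
        passes += 1
    return passes


def merge_sort_io_cost(n_pages: int, n_buffers: int) -> int:
    """Each pass reads and writes every page once: 2N * passes."""
    return 2 * n_pages * merge_sort_passes(n_pages, n_buffers)


print(merge_sort_passes(6_000_000, 6))   # 10
print(merge_sort_io_cost(6_000_000, 6))  # 120000000
```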

(c) [10 points] What is the smallest number of buffers B with which the DBMS can sort the
target file using only two passes?
□ 172 □ 173 □ 174 ■ 2,450 □ 2,451 □ 2,452 □ 2,827 □ 2,828
□ 2,829 □ 3,999,999 □ 4,000,000 □ 4,000,001
Solution: We want the smallest B where $N \le B \times (B-1)$. If B = 2,450, then
$6,000,000 \le 2,450 \times 2,449 = 6,000,050$; any smaller value of B would fail
(for example, $2,449 \times 2,448 = 5,995,152 < 6,000,000$).

(d) [10 points] What is the smallest number of buffers B with which the DBMS can sort the
target file using only six passes?
□ 14 ■ 15 □ 16 □ 1,240 □ 1,241 □ 1,242 □ 1,256 □ 1,257
□ 1,258 □ 2,934 □ 2,935 □ 2,936 □ 3,999,999 □ 4,000,000
□ 4,000,001
Solution: We want the smallest B where $N \le B \times (B-1)^5$. With B = 15,
$B \times (B-1)^5 = 15 \times 14^5 = 8,067,360 \ge 6,000,000$. Any smaller
value of B would fail (for B = 14, $14 \times 13^5 = 5,198,102 < 6,000,000$).

(e) [5 points] Suppose the DBMS has twenty-four buffers. What is the largest database file
(expressed in terms of N, the number of pages) that can be sorted with external merge
sort using six passes?

□ 65,610 □ 65,601 □ 131,071 □ 131,072 □ 3,590,490 □ 3,590,940
□ 49,251,980 □ 49,521,980 □ 154,472,230 ■ 154,472,232
Solution: We want N such that $N \le B \times (B-1)^5$. The largest such value is $B \times (B-1)^5$
itself, which is $24 \times 23^5 = 154,472,232$.
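Parts (c) through (e) all use the same capacity bound, $N \le B \times (B-1)^{p-1}$ for p passes. A minimal sketch of that bound follows; the function names are hypothetical, and the search simply tries B = 3, 4, 5, ... under the assumption that at least three buffers are needed to merge.

```python
def max_sortable_pages(n_buffers: int, passes: int) -> int:
    """Largest N sortable in `passes` passes: B * (B-1)^(passes-1)."""
    return n_buffers * (n_buffers - 1) ** (passes - 1)


def min_buffers(n_pages: int, passes: int) -> int:
    """Smallest B (starting from 3) with N <= B * (B-1)^(passes-1)."""
    b = 3
    while max_sortable_pages(b, passes) < n_pages:
        b += 1
    return b


print(min_buffers(6_000_000, 2))   # 2450
print(min_buffers(6_000_000, 6))   # 15
print(max_sortable_pages(24, 6))   # 154472232
```

A linear search is enough here because the capacity grows monotonically with B; for very large answers a binary search would be the natural replacement.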


Question 2: Join Algorithms [60 points]

Graded by:
Consider relations R(a, b), S(a, c, d), and T(a, e) to be joined on the common attribute
a. Assume that there are no indexes available on the tables to speed up the join algorithms.

• There are B = 60 pages in the buffer

• Table R spans M = 1,400 pages with 60 tuples per page
• Table S spans N = 2,200 pages with 200 tuples per page
• The result of joining R and S spans K = 2,000 pages
• Table T spans L = 1,000 pages with 200 tuples per page

Answer the following questions on computing the I/O costs for the joins. You can assume the
simplest cost model where pages are read and written one at a time. You can also assume that
you will need one buffer block to hold the evolving output block and one input block to hold
the current input block of the inner relation. You may ignore the cost of the writing of the final
results.

(a) [5 points] Block nested loop join with R as the outer relation and S as the inner relation:
□ 11,200 □ 23,000 ■ 56,400 □ 85,000 □ 92,600
Solution: $M + \lceil \frac{M}{B-2} \rceil \times N = 1,400 + \lceil \frac{1,400}{58} \rceil \times 2,200 = 1,400 + 25 \times 2,200 = 1,400 + 55,000 = 56,400$

(b) [5 points] Block nested loop join with S as the outer relation and R as the inner relation:
□ 31,200 □ 43,000 □ 43,600 □ 52,900 ■ 55,400
Solution: $N + \lceil \frac{N}{B-2} \rceil \times M = 2,200 + \lceil \frac{2,200}{58} \rceil \times 1,400 = 2,200 + 38 \times 1,400 = 2,200 + 53,200 = 55,400$
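The two block nested loop costs can be reproduced with a small helper. This is a sketch under the question's assumptions (one buffer reserved for output, one for the inner relation's current page, so B − 2 buffers hold the outer block); `bnlj_cost` is an illustrative name, not something defined in the homework.

```python
def bnlj_cost(outer_pages: int, inner_pages: int, n_buffers: int) -> int:
    """Block nested loop join I/O: read the outer once, and scan the inner
    once per outer block of (B - 2) pages."""
    block_size = n_buffers - 2
    outer_blocks = -(-outer_pages // block_size)  # ceiling division
    return outer_pages + outer_blocks * inner_pages


M, N, B = 1_400, 2_200, 60
print(bnlj_cost(M, N, B))  # R outer, S inner: 56400
print(bnlj_cost(N, M, B))  # S outer, R inner: 55400
```

Here the larger table as the outer relation is slightly cheaper, because the extra outer blocks (38 versus 25) are more than offset by the smaller inner relation scanned per block.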

(c) Hash join with S as the outer relation and R as the inner relation. You may ignore recursive
partitioning and partially filled blocks.
i. [5 points] What is the cost of the partition phase?
□ 2,800 □ 4,400 □ 5,000 □ 5,800 ■ 7,200
Solution: $2 \times (M + N) = 2 \times (1,400 + 2,200) = 2 \times 3,600 = 7,200$
ii. [5 points] What is the cost of the probe phase?
□ 2,800 □ 4,400 ■ 3,600 □ 4,800 □ 7,200
Solution: $(M + N) = (1,400 + 2,200) = 3,600$
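As a quick check of the two hash join phases, the sketch below assumes the simplifications stated in the question (no recursive partitioning, no partially filled blocks): partitioning reads and writes every page of both relations once, and probing reads every partitioned page once. The function names are illustrative.

```python
def hash_join_partition_cost(m_pages: int, n_pages: int) -> int:
    """Partition phase: read and write both relations once, 2 * (M + N)."""
    return 2 * (m_pages + n_pages)


def hash_join_probe_cost(m_pages: int, n_pages: int) -> int:
    """Probe phase: read every partition of both relations once, M + N."""
    return m_pages + n_pages


M, N = 1_400, 2_200
partition = hash_join_partition_cost(M, N)
probe = hash_join_probe_cost(M, N)
print(partition, probe, partition + probe)  # 7200 3600 10800
```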

(d) [10 points] Assume that the tables do not fit in main memory and that a large number
of distinct values hash to the same bucket under your hash function h1. Which of the
following approaches works best?


□ Create hash tables for the inner and outer relations using h1 and rehash into an
embedded hash table using h1 for large buckets

■ Create hash tables for the inner and outer relations using h1 and rehash into an
embedded hash table using h2 != h1 for large buckets

□ Use linear probing for collisions and page in and out parts of the hash table needed
at a given time

□ Create 2 hash tables half the size of the original one, run the same hash join
algorithm on the tables, and then merge the hash tables together

Solution: Use Grace hash join with recursive partitioning, which is what the correct
option describes.
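Why the second hash function must differ from the first can be seen in a toy sketch. The functions `h1` and `h2` below are hypothetical stand-ins chosen so the effect is easy to follow; they are not the hash functions of any real DBMS, and the partitioning is done in memory purely for illustration.

```python
def partition(keys, n_buckets, hash_fn):
    """Distribute keys into n_buckets lists using hash_fn(key) % n_buckets."""
    buckets = [[] for _ in range(n_buckets)]
    for key in keys:
        buckets[hash_fn(key) % n_buckets].append(key)
    return buckets


def h1(key: int) -> int:
    """Hypothetical first-level hash: the key itself."""
    return key


def h2(key: int) -> int:
    """Hypothetical second-level hash that ignores the bits h1 already used."""
    return key // 4


N_BUCKETS = 4
keys = list(range(0, 32, 4))            # 8 distinct keys, all multiples of 4

overflowing = partition(keys, N_BUCKETS, h1)[0]   # every key lands in bucket 0
again_h1 = partition(overflowing, N_BUCKETS, h1)  # rehash with the same function
with_h2 = partition(overflowing, N_BUCKETS, h2)   # rehash with a different one

print([len(b) for b in again_h1])  # [8, 0, 0, 0]
print([len(b) for b in with_h2])   # [2, 2, 2, 2]
```

Reusing `h1` sends every key back to the same bucket, so the overflowing partition never shrinks; a different function spreads the same keys across sub-partitions.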

(e) Sort-merge join with S as the outer relation and R as the inner relation:
i. [4 points] What is the cost of sorting the tuples in R on attribute a?
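□ 3,000 ■ 5,600 □ 7,400 □ 9,600 □ 10,800
Solution: passes $= 1 + \lceil \log_{B-1} \lceil \frac{M}{B} \rceil \rceil = 1 + \lceil \log_{59} \lceil \frac{1,400}{60} \rceil \rceil = 1 + \lceil \log_{59} 24 \rceil = 1 + 1 = 2$
$2M \times$ passes $= 2 \times 1,400 \times 2 = 5,600$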
ii. [4 points] What is the cost of sorting the tuples in S on attribute a?
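□ 3,400 □ 4,000 □ 6,400 □ 7,600 ■ 8,800
Solution: passes $= 1 + \lceil \log_{B-1} \lceil \frac{N}{B} \rceil \rceil = 1 + \lceil \log_{59} \lceil \frac{2,200}{60} \rceil \rceil = 1 + \lceil \log_{59} 37 \rceil = 1 + 1 = 2$
$2N \times$ passes $= 2 \times 2,200 \times 2 = 8,800$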
iii. [10 points] What is the cost of the merge phase assuming there are no duplicates in
the join attribute?
□ 1,400 □ 1,800 ■ 3,600 □ 4,400 □ 4,800
Solution: $M + N = 1,400 + 2,200 = 3,600$
iv. [10 points] What is the cost of the merge phase in the worst-case scenario?
□ 1,080,000 □ 2,880,000 ■ 3,080,000 □ 4,750,000 □ 10,080,000
Solution: $M \times N = 1,400 \times 2,200 = 3,080,000$
v. [2 points] Now consider joining R and S, and then joining the result with T. What is
the cost of the merge phase of this second join, assuming there are no duplicates in
the join attribute?
□ 1,000 □ 2,000 ■ 3,000 □ 5,000 □ 2,000,000
Solution: $K + L = 2,000 + 1,000 = 3,000$
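The sort-merge numbers in this part can be tied together in one sketch. It assumes the same external merge sort cost model as Question 1 and uses the page counts given at the start of Question 2; `sort_cost` is an illustrative helper name.

```python
def sort_cost(n_pages: int, n_buffers: int) -> int:
    """External merge sort I/O: 2 * pages * passes, with ceil(N/B) initial
    runs and a (B-1)-way merge on every later pass."""
    runs = -(-n_pages // n_buffers)           # ceiling division
    passes = 1
    while runs > 1:
        runs = -(-runs // (n_buffers - 1))
        passes += 1
    return 2 * n_pages * passes


M, N, K, L, B = 1_400, 2_200, 2_000, 1_000, 60

sort_r = sort_cost(M, B)     # part i
sort_s = sort_cost(N, B)     # part ii
merge_best = M + N           # part iii: each page read once
merge_worst = M * N          # part iv: every tuple shares one join key
merge_rs_t = K + L           # part v: merging the R-S result with T

print(sort_r, sort_s, merge_best, merge_worst, merge_rs_t)
# 5600 8800 3600 3080000 3000
```

Under these assumptions the full sort-merge join of S and R without duplicates costs 5,600 + 8,800 + 3,600 = 18,000 I/Os.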

End of Homework #3
