
Database Systems Assignment Help

For any assignment-related queries, call us at +1 678 648 4277,
mail us at [email protected], or
reach us at https://www.databasehomeworkhelp.com/
Problem 1:

Describe an alternative to this hashing-based algorithm. Your answer shouldn’t require more than a sentence or two.

Solution:

Remember that the tuples do not fit in memory.


One solution is to use an external-memory sort on the table, with each tuple’s gbyf field as the sort key.
After sorting, sequentially scan the sorted file’s tuples, keeping a running average of the aggf field for each gbyf group (the groups appear as contiguous chunks in the file).
Other solutions include building an external-memory index, such as a B+Tree, on gbyf and using it to group the tuples before computing each average.
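As a concrete illustration, here is a minimal Python sketch of the post-sort pass (the field names gbyf and aggf follow the problem statement; itertools.groupby stands in for reading the contiguous groups off the sorted file, and the external sort itself is assumed to have already run):

from itertools import groupby
from operator import itemgetter

def sort_based_average(sorted_tuples):
    # One sequential pass over a gbyf-sorted stream: only the current
    # group's running sum and count ever need to be held in memory.
    for key, group in groupby(sorted_tuples, key=itemgetter("gbyf")):
        total = count = 0
        for t in group:                     # groups are contiguous chunks
            total += t["aggf"]
            count += 1
        yield key, total / count

# Hypothetical usage, with rows already sorted on gbyf:
rows = [{"gbyf": "x", "aggf": 2}, {"gbyf": "x", "aggf": 4},
        {"gbyf": "y", "aggf": 5}]
print(dict(sort_based_average(rows)))       # {'x': 3.0, 'y': 5.0}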

I Cost Estimation

Suppose you are given a database with the following tables and sizes (the solutions below imply T1 = 100 pages, T2 = 1,000 pages, and T3 = 5,000 pages), and that each data page holds 100 tuples, that both leaf and non-leaf B+Tree pages are dense-packed and hold 100 keys, and that you have 102 pages of memory. Assume that the buffer pool is managed as described in the DBMIN paper (“An Evaluation of Buffer Management Strategies for Relational Database Systems”, VLDB 1985).

Problem 2:

Estimate the minimum number of I/Os required for the following join operations. Ignore the
difference between random and sequential I/O. Assume that B+Trees store pointers to
records in heap files in their leaves (i.e., B+Tree pages only store keys, not tuples.)

Solution:

• Nested loops join between T1 and T2, no indices.

T1 fits in memory, so make it the inner relation and read all of T1 once, caching it. Then scan through T2 once as the outer. Total: |T1| + |T2| = 100 + 1,000 = 1,100 I/Os.
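A sketch of this plan, assuming each relation is a list of pages (each page a list of tuples) and pred is the join predicate; the names are illustrative:

def nested_loops_join(outer_pages, inner_pages, pred):
    # The inner (T1) fits in memory: read it once and keep it cached.
    inner = [t for page in inner_pages for t in page]     # |T1| I/Os
    for page in outer_pages:                              # |T2| I/Os
        for s in page:
            for r in inner:
                if pred(r, s):
                    yield r, s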

• Grace hash join between T2 and T3, no indices.

Grace hash join hash-partitions all of T2 and T3 in one pass, writing the partitions out as it goes. It then scans the partitioned output to perform the join. Every tuple is therefore read twice and written once, for a total cost of 3(|T2| + |T3|) = 3(1,000 + 5,000) = 18,000 I/Os.
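A sketch of the two passes, with in-memory lists standing in for the on-disk partition files (function and parameter names are illustrative; n_parts = 101 echoes the 102-page buffer pool, one input page plus 101 output buffers):

def grace_hash_join(r_pages, s_pages, r_key, s_key, n_parts=101):
    def partition(pages, key):
        # Pass 1: read every page once, write every tuple out once.
        parts = [[] for _ in range(n_parts)]
        for page in pages:
            for t in page:
                parts[hash(key(t)) % n_parts].append(t)
        return parts

    # Pass 2: read each partition pair back and join it in memory.
    for rp, sp in zip(partition(r_pages, r_key), partition(s_pages, s_key)):
        table = {}
        for r in rp:                          # build side
            table.setdefault(r_key(r), []).append(r)
        for s in sp:                          # probe side
            for r in table.get(s_key(s), []):
                yield r, s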

• Index nested loops join between a foreign key of T2 and the primary key of T3, with a
B+Tree index on T3 and no index on T2.

Put T2 as the outer and T3 as the inner. Assume the matching tuple is always on a single B+Tree leaf page and that selectivity is 1, since this is a foreign-key/primary-key join. Cache the upper levels of the B+Tree: T3 has 500,000 tuples, so the tree needs 3 levels (top level = 100 pointers, second level = 100² = 10,000 pointers, third level = 100³ = 1,000,000 pointers). Cache the root plus all but one of the level-2 pages (100 pages total). Read all of T2 once (one page at a time). For each tuple in T2, do a lookup in the B+Tree, read one leaf page (at level 3), then follow the pointer to fetch the actual tuple from the heap file (using the remaining page of memory).

Total cost is: 1,000 (one scan of T2) + 100 (caching the top of the B+Tree) + the B+Tree lookup cost, summed over T2’s 100,000 tuples.

For 99/100 of the B+Tree lookups, the two upper levels are cached, so each lookup costs 2 I/Os (one leaf page plus one heap page):
99/100 × 2 × 100,000 = 198,000 I/Os.
For 1/100 of the lookups, only the root level is cached, so each costs 3 I/Os:
1/100 × 3 × 100,000 = 3,000 I/Os.
So the total is:
1,000 + 100 + 198,000 + 3,000 = 202,100 I/Os

We were flexible with the exact calculation here, due to the complexity introduced by not being able to completely cache the top two levels of the B+Tree.
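The arithmetic can be checked mechanically (sizes as inferred above):

T2_PAGES, T2_TUPLES = 1_000, 100_000

scan_t2  = T2_PAGES                   # one sequential scan of T2
tree_top = 100                        # root + 99 of the 100 level-2 pages
cached   = 99 / 100 * 2 * T2_TUPLES   # leaf + heap fetch       = 198,000
uncached = 1 / 100 * 3 * T2_TUPLES    # level-2 + leaf + heap   =   3,000

print(int(scan_t2 + tree_top + cached + uncached))   # 202100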

II ARIES with CLRs

Suppose you are given the following log file.

Problem 3:

After recovery, which transactions will be committed and which will be aborted?

Solution:

T1 commits, while T2 and T3 abort, since T1 is the only transaction that has a COMMIT
record in the log.

Problem 4:

Suppose the dirty page table in the CP record has only page A in it. At what LSN will the
REDO pass of recovery begin?

Solution:

The REDO pass will start at LSN 2.

The LSN selected is the earliest recoveryLSN in the dirty page table for any dirty page.
Since only page A is in the dirty page table, and the first log record in which it was modified
(its recoveryLSN) is 2, the REDO pass will begin at LSN 2. Pages B and C might have been
stolen (STEAL) by some background flush process before the checkpoint was written out,
and so they do not appear in the dirty page table.
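The rule is a one-liner; here the dirty page table is modeled as a hypothetical page → recoveryLSN mapping:

def redo_start_lsn(dirty_page_table):
    # REDO begins at the smallest recoveryLSN over all dirty pages.
    return min(dirty_page_table.values())

print(redo_start_lsn({"A": 2}))       # 2: only page A is dirty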
Problem 5:

Again, suppose the dirty page table in the CP record has only page A in it. What pages may
be written during the REDO pass of recovery?

Solution:

Pages A, B, D, E. REDO starts at LSN 2 and re-applies any update (UP) or CLR record from that point on (E is written when the CLR at LSN 13 is re-applied). Since B and C are not in the dirty page table, any change to them logged before the checkpoint must already be on disk, so those updates are not redone. Since B is updated again after the checkpoint, it is written at that point.

Problem 6:

Once again, suppose the dirty page table in the CP record has only page A in it. What pages
may be written during the UNDO pass of recovery?

Solution:

Pages C and D. The UNDO pass starts at the last LSN of each transaction to be aborted and proceeds backward through the log. LSN 13 is read first, but since it is a CLR (a previous recovery attempt already undid that update and logged the compensation, which REDO has re-applied), E is already correct on disk and is not rewritten. LSN 12 is also a CLR, so we skip the update to B for the same reason. Following LSN 13’s prevLSN pointer, we see that LSN 7 was the start of transaction 3, so we are done undoing transaction 3. Following LSN 12’s prevLSN pointer leads us to LSN 8, so we undo the update to D (overwriting D). Following LSN 8’s prevLSN pointer to LSN 5, we undo the update to C (overwriting C). Following LSN 5’s prevLSN, we reach LSN 4, the start of transaction 2, and thus the end of the UNDO pass.
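The walk can be sketched as follows. The log records below are a partial reconstruction from the solution text (kinds, pages, and prevLSN links are inferred, so treat them as illustrative); for a CLR, prev_lsn plays the role of undoNextLSN, the next record of that transaction still needing undo:

from collections import namedtuple

Rec = namedtuple("Rec", "kind page prev_lsn")

def undo_pass(log, loser_tails):
    # Walk backward from each loser's last LSN, largest LSN first.
    written, frontier = [], set(loser_tails)
    while frontier:
        lsn = max(frontier)
        frontier.discard(lsn)
        rec = log[lsn]
        if rec.kind == "UP":
            written.append(rec.page)  # undo it: the page is overwritten
        # a CLR writes nothing, since its update was already undone
        if rec.kind != "BEGIN":
            frontier.add(rec.prev_lsn)
        # a BEGIN record ends that transaction's chain
    return written

log = {
    4:  Rec("BEGIN", None, None),     # start of T2
    5:  Rec("UP",    "C",  4),        # T2 updates C
    7:  Rec("BEGIN", None, None),     # start of T3
    8:  Rec("UP",    "D",  5),        # T2 updates D
    12: Rec("CLR",   "B",  8),        # CLR for T2; undoNextLSN = 8
    13: Rec("CLR",   "E",  7),        # CLR for T3; undoNextLSN = 7
}
print(undo_pass(log, {12, 13}))       # ['D', 'C']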

III Snapshot Isolation

Oracle and Postgres both use a form of transaction isolation called snapshot isolation. One possible implementation of snapshot isolation is as follows (a code sketch follows the list):

• Every object (e.g., tuple or page) in the database has a timestamp; multiple copies
(“versions”) of objects with old timestamps are kept until no transaction needs to read them
again. (For this question, you don’t need to worry about how such old versions are
maintained or discarded.)
• When a transaction begins, the system records the transaction’s start timestamp, tss. Timestamps are monotonically increasing, such that no two transactions have the same timestamp value.
• When a transaction T writes an object O, it adds the new version of O to T’s local write set. Versions in the local write set are not read by other transactions until after T has committed.

• When a transaction T reads an object O, it reads the most recent committed version with
timestamp ≤ tss, reading O from T’s own local write set if O has been previously written by
T.
• When a transaction T commits, a new timestamp, tsc, is taken. For every object O in T’s local write set, if the most recent version of O in the database has timestamp ≤ tss, then O is written into the database with timestamp tsc. Otherwise, T aborts. Only one transaction commits at a time.
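Here is a compact, single-threaded Python sketch of this scheme (class and method names are my own; “only one transaction commits at a time” is implicit in the lack of concurrency):

class SnapshotDB:
    def __init__(self, objects):
        # per-object version history: ascending list of (timestamp, value)
        self.versions = {k: [(0, v)] for k, v in objects.items()}
        self.clock = 0                         # monotonic timestamp source

    def begin(self):
        self.clock += 1
        return Txn(self, tss=self.clock)

class Txn:
    def __init__(self, db, tss):
        self.db, self.tss, self.writes = db, tss, {}

    def read(self, obj):
        if obj in self.writes:                 # read your own writes first
            return self.writes[obj]
        # most recent committed version with timestamp <= tss
        ts, val = max(v for v in self.db.versions[obj] if v[0] <= self.tss)
        return val

    def write(self, obj, value):
        self.writes[obj] = value               # buffered in the local write set

    def commit(self):
        for obj in self.writes:                # first-committer-wins check
            if self.db.versions[obj][-1][0] > self.tss:
                return False                   # a concurrent writer won: abort
        self.db.clock += 1                     # take tsc
        tsc = self.db.clock
        for obj, value in self.writes.items():
            self.db.versions[obj].append((tsc, value))
        return True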

For example, consider the following schedule:


Initial database: objects A and B, both version 0 (denoted A0 and B0)

Here, T2 aborts because it tried to write A concurrently with T1.

Problem 7:

Is snapshot isolation conflict serializable? If yes, state briefly why. If not, give an example
of a non-serializable schedule.

Solution:

No. Snapshot isolation ignores read-write conflicts. Consider T1: (RA, WB) and T2: (RB, WA), executed in the following order:
RA0 (T1), RB0 (T2), WB1 (T1), WA2 (T2)

Under snapshot isolation this order is acceptable and both transactions can commit. However, the result matches neither serial order T1, T2 (where T2 would read B1) nor serial order T2, T1 (where T1 would read A2).
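Played out on the sketch above (write values are chosen to echo the schedule’s version subscripts):

db = SnapshotDB({"A": 0, "B": 0})
t1, t2 = db.begin(), db.begin()

print(t1.read("A"), t2.read("B"))     # 0 0: both read the initial versions
t1.write("B", 1)                      # WB1
t2.write("A", 2)                      # WA2
print(t1.commit(), t2.commit())       # True True: both commit, yet neither
                                      # serial order yields this result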

Problem 8:

Oracle claims that snapshot isolation is much faster than traditional concurrency control.
Why?

Solution:

Writers never block readers. For example, if a long-running transaction has written a value for A, then with traditional (locking) concurrency control any transaction that wants to read A must wait to acquire a lock on A. With snapshot isolation, the reading transactions simply read the previous version and continue.

This is similar to optimistic concurrency control in that there are no locks and transactions
do not block each other. However, with snapshot isolation, a transaction sees a consistent
snapshot of the database, determined when it begins. With optimistic concurrency control, a
transaction can see values from transactions that commit after it begins.
