DBMS Test 2 Advanced Key
Q1. In what normal form is the LOTS relation schema in the figure with respect to the
restrictive interpretations of normal forms that take only the primary key into account?
Will it be in the same normal form if the general definitions of normal forms were used?
4.5M
Answer:
If we take only the primary key into account, the LOTS relation schema in the figure
is in 2NF, since there are no partial dependencies on the primary key. However, it is not
in 3NF, since there are two transitive dependencies on the primary key:
PROPERTY_ID# -> COUNTY_NAME -> TAX_RATE, and PROPERTY_ID# -> AREA -> PRICE.
If we instead take all keys into account and use the general definitions of 2NF and
3NF, the LOTS relation schema is only in 1NF, because there is a partial
dependency COUNTY_NAME -> TAX_RATE on the secondary key {COUNTY_NAME, LOT#},
which violates 2NF.
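The partial-dependency argument can be checked mechanically with an attribute-closure computation. The sketch below is illustrative only: the figure is not reproduced in this key, so the attribute list and the FD set are assumptions reconstructed from the dependencies named in the answer, and the helper names `closure` and `partial_dependencies` are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of a set of attributes under FDs given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# ASSUMPTION: attribute list and FDs reconstructed from the answer text,
# not copied from the (missing) figure.
ALL = frozenset({"PROPERTY_ID#", "COUNTY_NAME", "LOT#", "AREA", "PRICE", "TAX_RATE"})
FDS = [
    (frozenset({"PROPERTY_ID#"}), ALL),            # primary key
    (frozenset({"COUNTY_NAME", "LOT#"}), ALL),     # secondary key
    (frozenset({"COUNTY_NAME"}), frozenset({"TAX_RATE"})),
    (frozenset({"AREA"}), frozenset({"PRICE"})),
]
PRIME = {"PROPERTY_ID#", "COUNTY_NAME", "LOT#"}    # attributes that belong to some key

def partial_dependencies(key, fds=FDS, prime=PRIME):
    """List (subset, non-prime attributes) for every proper subset of `key`
    that determines at least one non-prime attribute."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) - set(subset) - set(prime)
            if determined:
                found.append((subset, sorted(determined)))
    return found

print(partial_dependencies({"PROPERTY_ID#"}))         # []
print(partial_dependencies({"COUNTY_NAME", "LOT#"}))  # [(('COUNTY_NAME',), ['TAX_RATE'])]
```

Under these assumptions the single-attribute primary key has no proper subsets and therefore no partial dependencies, while the secondary key exposes COUNTY_NAME -> TAX_RATE.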
Scheme
Normal form taking only the primary key into account? 2M
Will it be in the same normal form if the general definitions of normal form were used?
2.5M
Q2. Suppose that we have the following three tuples in a legal instance of a relation
schema with three attributes ABC: (1,2,3), (4,2,3), and (5,3,3). Which of the
following dependencies can you infer does not hold over the schema? 4.5M
Answer:
BC→A
(2, 3) → 1
(2, 3) → 4
(3, 3) → 5
The tuples (1,2,3) and (4,2,3) agree on BC = (2,3) but differ on A (1 versus 4), so
BC→A does not hold.
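The same check can be run in code. This is a minimal sketch with a hypothetical helper `fd_holds`; note that an instance can only refute a dependency, so a result of True means "not violated by these tuples", not "holds over the schema".

```python
def fd_holds(rows, schema, lhs, rhs):
    """Return False if two rows agree on `lhs` but differ on `rhs`, else True."""
    idx = {name: i for i, name in enumerate(schema)}
    seen = {}
    for row in rows:
        left = tuple(row[idx[a]] for a in lhs)
        right = tuple(row[idx[a]] for a in rhs)
        # setdefault stores the first right-hand side seen for this left-hand side
        if seen.setdefault(left, right) != right:
            return False
    return True

rows = [(1, 2, 3), (4, 2, 3), (5, 3, 3)]
print(fd_holds(rows, "ABC", "BC", "A"))  # False: (2,3) maps to both 1 and 4
print(fd_holds(rows, "ABC", "A", "B"))   # True: not violated by this instance
```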
Scheme
Identifying dependencies that do not hold 2M
Suppose you have a table Employee that has emp_id as its primary key. A clustered
index created on the primary key sorts the Employee table by emp_id.
A non-clustered index, on the other hand, involves one extra step: each index entry
points to the physical location of the record.
Scheme
1. 3M
2. 3M
3. 2M
Scheme
Join 2.5M
Reduce Side Join 2M
Q9. Consider the schedule below. Determine whether the schedule is strict, cascadeless,
recoverable, or nonrecoverable. Determine the strictest recoverability condition that
the schedule satisfies. 8M
S3: r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); c1; w3 (Y); c3; r2 (Y); w2 (Z);
w2 (Y); c2;
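Answer: In this schedule no transaction reads or writes an item until the transaction
that last wrote it has committed: X is not accessed again after w1 (X), and T2 reads and
writes Y only after T3, which wrote Y, has committed (c3). Hence it is a strict schedule,
and therefore also cascadeless and recoverable.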
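The classification of S3 can be reproduced with a short checker. This is a sketch under stated assumptions: the function names `parse` and `classify` are hypothetical, item names are single letters, and aborts are not modeled (S3 contains only reads, writes, and commits).

```python
import re

def parse(text):
    """Turn 'r1 (X); c1; ...' into a list of (op, txn, item) tuples; item is '' for commits."""
    return [(op, int(t), item)
            for op, t, item in re.findall(r"([rwc])(\d+)\s*(?:\((\w)\))?", text)]

def classify(schedule):
    """Return the strictest recoverability condition the schedule satisfies."""
    committed = set()
    last_writer = {}   # item -> transaction that last wrote it
    dirty_reads = {}   # reader -> writers that were uncommitted when read
    strict = cascadeless = recoverable = True
    for op, txn, item in schedule:
        if op == "c":
            # Recoverable: a reader may not commit before the writers it read from
            if any(w not in committed for w in dirty_reads.get(txn, ())):
                recoverable = False
            committed.add(txn)
            continue
        writer = last_writer.get(item)
        if writer is not None and writer != txn and writer not in committed:
            strict = False             # touched an item with an uncommitted writer
            if op == "r":
                cascadeless = False    # dirty read
                dirty_reads.setdefault(txn, set()).add(writer)
        if op == "w":
            last_writer[item] = txn
    if strict:
        return "strict"
    if cascadeless:
        return "cascadeless"
    return "recoverable" if recoverable else "nonrecoverable"

S3 = "r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); c1; w3 (Y); c3; r2 (Y); w2 (Z); w2 (Y); c2;"
print(classify(parse(S3)))  # strict
```

For contrast, `classify(parse("w1 (X); r2 (X); c2; c1;"))` returns "nonrecoverable", because T2 commits after reading a value whose writer has not yet committed.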
Scheme
Schedule 8M
Q10. Describe the MapReduce join procedures for Sort-Merge join, Partition Join,
N-way Map-side join, and Simple N-way join. 8M
Answer:
What is a Join?
The join operation is used to combine two or more database tables based on foreign keys. In
general, companies maintain separate tables for customer and transaction records in
their database, and they often need to generate analytic reports from the data held in
these separate tables. To do so, they perform a join operation on the tables using a
common column (foreign key), such as customer id, to produce a combined table, and
then analyze the combined table to get the desired reports.
Joins in MapReduce
Just like a SQL join, a join operation can be performed in MapReduce on different data sets.
There are two types of join operations in MapReduce:
Map Side Join: As the name implies, the join operation is performed in the map
phase itself. The mapper performs the join, so it is mandatory that the input to each
map is partitioned and sorted according to the keys.
Reduce Side Join: As the name suggests, in the reduce side join the
reducer is responsible for performing the join operation. It is comparatively simple
and easier to implement than the map side join, because the sorting and shuffling phase
sends the values having identical keys to the same reducer, so by default
the data is already organized for us.
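The reduce side join described above can be illustrated without a cluster. The sketch below simulates the map, shuffle, and reduce phases in plain Python; it is not Hadoop code, and the record layouts, the "C"/"T" tags, and all function names are hypothetical choices made for the illustration.

```python
from collections import defaultdict

def map_customer(record):
    """Emit (join key, tagged value) for a customer record (cust_id, name)."""
    cust_id, name = record
    yield cust_id, ("C", name)

def map_transaction(record):
    """Emit (join key, tagged value) for a transaction record (txn_id, cust_id, amount)."""
    _txn_id, cust_id, amount = record
    yield cust_id, ("T", amount)

def shuffle(pairs):
    """Group values by key and sort the keys, as the shuffle and sort phase does."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))

def reduce_join(cust_id, values):
    """Inner join: pair every customer name with every transaction amount for one key."""
    names = [v for tag, v in values if tag == "C"]
    amounts = [v for tag, v in values if tag == "T"]
    for name in names:
        for amount in amounts:
            yield cust_id, name, amount

def reduce_side_join(customers, transactions):
    pairs = []
    for rec in customers:
        pairs.extend(map_customer(rec))
    for rec in transactions:
        pairs.extend(map_transaction(rec))
    joined = []
    for key, values in shuffle(pairs).items():
        joined.extend(reduce_join(key, values))
    return joined

customers = [(1, "Asha"), (2, "Ben")]
transactions = [(100, 1, 25.0), (101, 2, 40.0), (102, 1, 5.0), (103, 3, 9.0)]
print(reduce_side_join(customers, transactions))
# [(1, 'Asha', 25.0), (1, 'Asha', 5.0), (2, 'Ben', 40.0)]
```

Tagging each value with its source table is what lets the reducer tell the two inputs apart once they arrive grouped under the same key. Transaction 103 is dropped because customer 3 has no matching customer record.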
Scheme
Join 4.5M
Reduce Side Join 4M
Q11. Consider the three transactions T1, T2, and T3, and the schedules S1 and S2 given
below. Draw the serializability (precedence) graphs for S1 and S2, and state whether
each schedule is serializable or not. If a schedule is serializable, write down the
equivalent serial schedule(s). 12.5M
T1: r1 (X); r1 (Z); w1 (X);
T2: r2 (Z); r2 (Y); w2 (Z); w2 (Y);
T3: r3 (X); r3 (Y); w3 (Y);
S1: r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); w3 (Y); r2 (Y); w2 (Z); w2 (Y);
S2: r1 (X); r2 (Z); r3 (X); r1 (Z); r2 (Y); r3 (Y); w1 (X); w2 (Z); w3 (Y); w2 (Y);
Answer:
Transactions T1, T2, T3 (time runs downward)
_______________________
| T1    | T2    | T3
| r1(X) | r2(Z) | r3(X)
| r1(Z) | r2(Y) | r3(Y)
| w1(X) | w2(Z) | w3(Y)
|       | w2(Y) |
Schedule: S1 (time runs downward)
_______________________
| T1    | T2    | T3
| r1(X) |       |
|       | r2(Z) |
| r1(Z) |       |
|       |       | r3(X)
|       |       | r3(Y)
| w1(X) |       |
|       |       | w3(Y)
|       | r2(Y) |
|       | w2(Z) |
|       | w2(Y) |
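The precedence graphs for S1 and S2 can be derived from the schedules given in the question. The following is a minimal sketch, with hypothetical function names, that adds an edge Ti -> Tj whenever an operation of Ti precedes a conflicting operation of Tj (same item, different transactions, at least one write) and then looks for a topological order.

```python
import re

def parse(text):
    """Turn 'r1 (X); w2 (Y); ...' into a list of (op, txn, item) tuples."""
    return [(op, int(t), item)
            for op, t, item in re.findall(r"([rw])(\d+)\s*\((\w)\)", text)]

def precedence_edges(ops):
    """Edges (Ti, Tj) for every pair of conflicting operations, Ti's operation first."""
    edges = set()
    for i, (op1, t1, x1) in enumerate(ops):
        for op2, t2, x2 in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges

def serial_order(edges, txns):
    """Return one topological order of the graph, or None if it contains a cycle."""
    order, remaining = [], set(txns)
    while remaining:
        free = sorted(t for t in remaining
                      if not any(a in remaining and b == t for a, b in edges))
        if not free:
            return None          # every remaining node has an incoming edge: cycle
        order.append(free[0])
        remaining.remove(free[0])
    return order

S1 = "r1 (X); r2 (Z); r1 (Z); r3 (X); r3 (Y); w1 (X); w3 (Y); r2 (Y); w2 (Z); w2 (Y);"
S2 = "r1 (X); r2 (Z); r3 (X); r1 (Z); r2 (Y); r3 (Y); w1 (X); w2 (Z); w3 (Y); w2 (Y);"

for name, text in (("S1", S1), ("S2", S2)):
    edges = precedence_edges(parse(text))
    print(name, sorted(edges), serial_order(edges, {1, 2, 3}))
# S1 [(1, 2), (3, 1), (3, 2)] [3, 1, 2]
# S2 [(1, 2), (2, 3), (3, 1), (3, 2)] None
```

S1 has edges T3 -> T1, T1 -> T2, and T3 -> T2 with no cycle, so it is conflict serializable, equivalent to the serial schedule T3, T1, T2. S2 contains the cycle T2 -> T3 -> T2, so it is not conflict serializable.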
Answer: Suppose we have two concurrent transactions T1 and T2 that both update data
item d. T1 starts first and reads d for update. As soon as T1 has read d, T2 starts and
reads d for its own update. T1 then updates d to d’ and completes, after which T2
updates d to d”. T2 is unaware of T1’s update because it read d before T1 wrote it;
similarly, T1 is unaware of T2’s update. Which value of d is final, d’ or d”?
Since T2 is unaware of T1’s update and writes last, only T2’s update is retained.
T1’s update is lost completely, and no trace of it is kept. This anomaly is known as a
lost update.
But T1 is a valid transaction and cannot be ignored; its update is as important as
T2’s. T1’s update might even have changed the result of T2’s update, for example when
the update depends on the current value of the column being updated (d = d*10). Hence
we cannot lose data updated by any transaction. A lost update can be prevented if the
transactions are grouped and executed serially. If T1 is allowed to read and write d,
and T2 is allowed to read d only after T1’s write completes, then the updates of both
T1 and T2 take effect. T2 will then change the value written by T1, but T1’s update
will be recorded in the undo log or rollback segment, so we know at least that an
intermediate value existed between the beginning of the group (T1 and T2 together)
and its end (the end of T2). Such grouping of transactions and defining their order of
execution is known as scheduling or serialization. This type of execution guarantees
isolation of transactions: it has no dirty reads, non-repeatable reads, deadlocks, or
lost updates.
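The difference between the interleaved and the serial execution can be shown with a deterministic sketch. The update d = d*10 for T2 comes from the answer above; the update d = d + 10 for T1 and the starting value 5 are assumptions chosen only for illustration.

```python
def interleaved(d):
    """T1 and T2 both read d before either writes: T1's update is lost."""
    t1_local = d            # T1 reads d
    t2_local = d            # T2 reads d before T1 writes
    d = t1_local + 10       # T1 writes d' (assumed update)
    d = t2_local * 10       # T2 writes d'' from its stale copy, overwriting d'
    return d

def serial(d):
    """T1 runs to completion before T2 reads d: both updates take effect."""
    d = d + 10              # T1 reads and writes d
    d = d * 10              # T2 reads T1's result, then writes
    return d

print(interleaved(5))  # 50: T1's +10 has vanished
print(serial(5))       # 150: T2's update builds on T1's
```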
Scheme
Write the final values of X and Y as per schedule A. 3M
Is this a serializable schedule? ii. 6.5M
Write the final values of X and Y for all possible serial schedules as per schedule B. 3M