cs317 s2022 Midsem
b. Given the relation reading(time, val), write an SQL query to find for each time t the
average of the readings at t-1, t and t+1. Those times where there is no reading
at t-1 or t+1 are to be omitted. (3 marks)
Answer:
SELECT r2.time, (r1.val+r2.val+r3.val)/3
FROM reading as r1, reading as r2, reading as r3
WHERE r1.time+1=r2.time AND r2.time+1=r3.time
Rubric: 1 mark for from clause, 1 mark for where clause and 1 mark for select
clause
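The self-join answer can be tried on a small data set. The following is a minimal sketch using Python's built-in sqlite3 module; the integer time column with unit steps, the REAL type for val, the sample rows, and the helper name neighbour_averages are all assumptions made for illustration, and the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

def neighbour_averages(readings):
    """Return (t, average of readings at t-1, t, t+1) using the self-join query."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: integer time steps, real-valued readings.
    conn.execute("CREATE TABLE reading (time INTEGER PRIMARY KEY, val REAL)")
    conn.executemany("INSERT INTO reading VALUES (?, ?)", readings)
    rows = conn.execute(
        """
        SELECT r2.time, (r1.val + r2.val + r3.val) / 3
        FROM reading AS r1, reading AS r2, reading AS r3
        WHERE r1.time + 1 = r2.time AND r2.time + 1 = r3.time
        ORDER BY r2.time
        """
    ).fetchall()
    conn.close()
    return rows

# Hypothetical sample: there is no reading at time 4, so only t = 2 qualifies.
sample = [(1, 3.0), (2, 6.0), (3, 9.0), (5, 12.0)]
print(neighbour_averages(sample))  # [(2, 6.0)]
```

Times 1, 3 and 5 are dropped because the join finds no partner row for at least one neighbour, which is how the query meets the "omit" requirement without an explicit test.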
Alternative using SQL window syntax:
SELECT time, avg(val) OVER (ORDER BY time ROWS BETWEEN 1
PRECEDING AND 1 FOLLOWING) AS "avg"
FROM reading;
Rubric: 1 mark each for preceding and following row, 1 mark for complete query
c. Given entity sets e1 and e2 with primary keys p1 and p2 respectively and a
many-to-one relationship from e1 to e2 called adv, write down the SQL create
table statement for relation r, with all relevant constraints. (errata: should have
said relations, not relation r) (3 marks)
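One common way to represent a many-to-one relationship is to fold it into the relation on the "many" side as a foreign key. The sketch below, run through Python's sqlite3 module, illustrates that idea only; the INTEGER column types, the nullable p2 column (which assumes participation of e1 in adv is partial) and the helper names build_schema and try_insert are assumptions, not part of the question.

```python
import sqlite3

def build_schema():
    """Create e2 and e1, with adv folded into e1 as a foreign key to e2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(
        """
        CREATE TABLE e2 (p2 INTEGER PRIMARY KEY);
        CREATE TABLE e1 (
            p1 INTEGER PRIMARY KEY,
            -- adv: each e1 row refers to at most one e2 row (many-to-one)
            p2 INTEGER REFERENCES e2(p2)
        );
        """
    )
    return conn

def try_insert(conn, sql, params):
    """Return True if the insert succeeds, False on a constraint violation."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

conn = build_schema()
print(try_insert(conn, "INSERT INTO e2 VALUES (?)", (10,)))       # e2 entity
print(try_insert(conn, "INSERT INTO e1 VALUES (?, ?)", (1, 10)))  # e1 -> e2
print(try_insert(conn, "INSERT INTO e1 VALUES (?, ?)", (2, 10)))  # many e1 share one e2
print(try_insert(conn, "INSERT INTO e1 VALUES (?, ?)", (3, 99)))  # dangling reference
```

If participation of e1 were total, p2 would additionally be declared NOT NULL.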
Answer: RA query: _{R.a}γ_{count(S.c)}(r ⟕ s), i.e. group by R.a and compute count(S.c)
over r left outer join s. [Note the left outer join used here;
it is also OK to use inner joins and union to implement left outer join]
Rubrics: 1 mark for gamma operator use, 1 mark for the group by attribute, 1
mark for aggregate, 2 marks for r left outer join s
b. Would your relational algebra query and SQL query be equivalent if r.A is not a
primary key for r? Explain your answer. (3 marks)
Answer: Not always. There can be duplicate rows in the output. The duplicate
rows, if any, will be merged by the RA query. [1.5 marks]
Reason: As attribute A can have duplicate values when A is not a primary key,
there can be rows with the same value for A but different value for B. [1.5 marks]
c. How can you simplify the query if r.B is a foreign key referencing s.B, and r.B is
declared as not null? (2 marks)
Answer: In that case the left outer join will be equivalent to an inner join because
every r.B value will have a matching s.B value, so we can replace the left outer
join by an inner join
Rubrics: 1 mark for saying inner join (or natural join) instead of outer join, 1 mark
for explanation of why it will be equivalent.
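The equivalence can be checked on a small instance. This is a sketch under assumptions: the schemas r(A, B) and s(B, C), the sample rows, and the helper name grouped_counts are made up for illustration, and the SQL is one plausible form of the grouped count query rather than the graded answer.

```python
import sqlite3

def grouped_counts(join_kind):
    """Run the grouped count with the given join keyword and return the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE s (B INTEGER PRIMARY KEY, C TEXT);
        -- r.B is NOT NULL and references s.B, as in part (c)
        CREATE TABLE r (A INTEGER, B INTEGER NOT NULL REFERENCES s(B));
        INSERT INTO s VALUES (1, 'x'), (2, 'y');
        INSERT INTO r VALUES (10, 1), (10, 2), (20, 2);
        """
    )
    rows = conn.execute(
        f"SELECT r.A, count(s.C) FROM r {join_kind} s ON r.B = s.B "
        "GROUP BY r.A ORDER BY r.A"
    ).fetchall()
    conn.close()
    return rows

print(grouped_counts("LEFT OUTER JOIN"))  # [(10, 2), (20, 1)]
print(grouped_counts("INNER JOIN"))       # [(10, 2), (20, 1)]
```

Because every r row finds a partner in s, the outer join never has to pad a row with nulls, so both variants return the same result.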
3. FDs (10 marks) (Paras Garg): Given the relational schema r(A,B,C,D,E,F) and FDs
A->BC, B->C, CD->E, AD->E
do the following:
a. Find a candidate key and explain why it is a candidate key (3 marks)
b. Find a canonical cover, showing all the steps in computing it. (4 marks)
c. Give a 3NF decomposition of the relation with an explanation of how you
computed it (3 marks)
Answer :
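For part (a), a candidate key can be checked mechanically with attribute closure. The sketch below is illustrative and is not the graded answer: it encodes the four FDs given in the question, and the helper names closure and candidate_keys are assumptions. It relies on the standard fixpoint computation of the closure and a brute-force search over attribute subsets in increasing size.

```python
from itertools import combinations

# FDs from the question, written as (left side, right side).
FDS = [("A", "BC"), ("B", "C"), ("CD", "E"), ("AD", "E")]
ATTRS = "ABCDEF"

def closure(attrs, fds=FDS):
    """Attribute closure: repeatedly apply FDs whose left side is covered."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs=ATTRS, fds=FDS):
    """Minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            # Skip supersets of keys already found; they are not minimal.
            if any(set(k) <= set(combo) for k in keys):
                continue
            if closure(combo, fds) == set(attrs):
                keys.append("".join(combo))
    return keys

print(sorted(closure("ADF")))  # ['A', 'B', 'C', 'D', 'E', 'F']
print(candidate_keys())        # ['ADF']
```

A, D and F never appear on the right side of any FD, so every key must contain them, which is consistent with the search returning ADF alone.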
4. Multisets and null values (10 marks) (Shiva Tarun, 180050042): Multisets behave differently from sets.
Consider relation r(A,B,C).
a. A is primary key of relation r implies that A->BC. Does A-> BC imply that A is
primary key of r, if r is a multiset relation? Explain your answer. (3 marks)
Answer: No, A->BC does not imply that A is the primary key. (1 mark)
b. Given B->C, the decomposition r1(A,B) and r2(B,C) is lossless join with sets.
Does this property hold with multisets, where r1 and r2 are created by multiset
projection on r, and the multiset version of join is used instead of the set version?
Give a small example to explain your answer. (4 marks)
Counter-example/explanation:
Consider the case where r contains two identical rows [x,y,z]. B->C holds here. On
decomposition we get two copies of [x,y] and two copies of [y,z]. On joining them back
we get four copies of [x,y,z], which is not the original multiset. (3 marks)
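The counter-example can be reproduced with multisets modelled as collections.Counter objects that map a row to its multiplicity. This is a minimal sketch; the helper names project and join_on_b are assumptions, and the join multiplies multiplicities as in the usual multiset join definition.

```python
from collections import Counter

def project(rel, idxs):
    """Multiset projection: keep the chosen columns and add up multiplicities."""
    out = Counter()
    for row, n in rel.items():
        out[tuple(row[i] for i in idxs)] += n
    return out

def join_on_b(r1, r2):
    """Multiset join of r1(A, B) and r2(B, C): multiplicities multiply."""
    out = Counter()
    for (a, b1), n1 in r1.items():
        for (b2, c), n2 in r2.items():
            if b1 == b2:
                out[(a, b1, c)] += n1 * n2
    return out

r = Counter({("x", "y", "z"): 2})  # two identical rows [x, y, z]
r1 = project(r, (0, 1))            # two copies of (x, y)
r2 = project(r, (1, 2))            # two copies of (y, z)
joined = join_on_b(r1, r2)
print(joined)                      # Counter({('x', 'y', 'z'): 4})
```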
c. Restricting ourselves to sets, suppose that B may have null values. Then is the
decomposition lossless join? Explain your answer. (3 marks)
Answer: It is a lossy decomposition. (1 mark)
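One way to see the loss is that the SQL comparison null = null is not true, so a row whose B value is null finds no partner when r1 and r2 are joined back. The sketch below illustrates this reading with Python's sqlite3 module; the sample rows and the helper name rejoin_after_decomposition are assumptions, and DISTINCT is used to stay within set semantics.

```python
import sqlite3

def rejoin_after_decomposition(rows):
    """Decompose r(A,B,C) into r1(A,B) and r2(B,C), then join them back on B."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE r (A, B, C)")
    conn.executemany("INSERT INTO r VALUES (?, ?, ?)", rows)
    conn.execute("CREATE TABLE r1 AS SELECT DISTINCT A, B FROM r")
    conn.execute("CREATE TABLE r2 AS SELECT DISTINCT B, C FROM r")
    result = conn.execute(
        "SELECT DISTINCT r1.A, r1.B, r2.C FROM r1 JOIN r2 ON r1.B = r2.B "
        "ORDER BY r1.A"
    ).fetchall()
    conn.close()
    return result

# Hypothetical instance: the second row has a null in B.
rows = [(1, "b", "c"), (2, None, "d")]
print(rejoin_after_decomposition(rows))  # [(1, 'b', 'c')]
```

The row with A = 2 does not reappear, so the original relation cannot be recovered from r1 and r2.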
5. Temporal data (10 marks) (Sai Phanindra): Given two temporal relations r(A, B, start, end) and s(B, C,
start, end), where the valid time of a tuple is [start,end).
a. Write an SQL query to check if the relation r satisfies the constraint that r.A is a
temporal primary key. The query should return a non-empty relation if the
temporal primary key constraint is violated, and the empty relation if the
constraint is satisfied. You can assume a function overlaps(s1, e1, s2, e2) which
returns true iff [s1, e1) overlaps with [s2, e2) (5 marks)
Answer:
SELECT r1.A
FROM r as r1, r as r2
WHERE r1.A = r2.A AND overlaps(r1.start, r1.end, r2.start, r2.end) AND
NOT (r1.B = r2.B AND r1.start = r2.start AND r1.end = r2.end)
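The query can be exercised with Python's sqlite3 module by registering overlaps as a user-defined function. This is a sketch under assumptions: the Python body of overlaps is one straightforward implementation for half-open intervals, the column names start and end are quoted because END is an SQLite keyword, the sample rows are made up, and the result is de-duplicated in Python only for readability.

```python
import sqlite3

def overlaps(s1, e1, s2, e2):
    """1 iff the half-open intervals [s1, e1) and [s2, e2) overlap, else 0."""
    return int(s1 < e2 and s2 < e1)

def temporal_pk_violations(rows):
    """Return the A values that violate the temporal primary key constraint."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("overlaps", 4, overlaps)
    conn.execute('CREATE TABLE r (A, B, "start", "end")')
    conn.executemany("INSERT INTO r VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT r1.A
        FROM r AS r1, r AS r2
        WHERE r1.A = r2.A
          AND overlaps(r1."start", r1."end", r2."start", r2."end")
          AND NOT (r1.B = r2.B AND r1."start" = r2."start" AND r1."end" = r2."end")
        """
    ).fetchall()
    conn.close()
    return sorted(set(result))

# Adjacent intervals [0,5) and [5,9) do not overlap: no violation.
print(temporal_pk_violations([(1, "p", 0, 5), (1, "q", 5, 9)]))  # []
# [0,5) and [3,9) overlap for A = 1: violation reported.
print(temporal_pk_violations([(1, "p", 0, 5), (1, "q", 3, 9), (2, "z", 0, 4)]))  # [(1,)]
```

The NOT (...) condition is what stops every tuple from being reported as conflicting with itself.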
6. Big Data and map-reduce (12 marks) (Vinayak Gosula): Using the signature map(record) and
reduce(key, list-of-values), write pseudocode for the following:
a. Given a relation packet(time, src, dest, size) recording each packet flowing on a
network link, write a map-reduce program that outputs the total number of bytes
flowing from each source in each second. NOTE: this is not a streaming system,
the relation is stored already. (6 marks)
Answer: Assuming 1 record per line, and inputs to map function are lines,
MAP(record):
    time, src, dest, size = record.split()
    emit( {src, floor(time)}, size )
REDUCE(key, list):
    s = 0
    for each value in list:
        s = s + value
    emit( key, s )
Rubrics: 1 mark for 1st line of map, 2 marks for 2nd line of map
(partial mark of 1 may be given if floor is omitted).
3 marks for reduce (partial mark of 1 may be given in case of small error)
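The pseudocode can be run locally with a tiny in-memory driver that stands in for the map-reduce framework. This is a sketch: the driver run_mapreduce, the whitespace-separated record format, and the sample packets are assumptions made for illustration; the map and reduce functions follow the answer above.

```python
import math
from collections import defaultdict

def map_packet(record):
    """Emit ((src, second), size) for one packet line."""
    time, src, dest, size = record.split()
    yield (src, math.floor(float(time))), int(size)

def reduce_bytes(key, values):
    """Sum the sizes seen for one (src, second) key."""
    yield key, sum(values)

def run_mapreduce(records, mapper, reducer):
    """Hypothetical single-process driver: group map output by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    out = []
    for key in sorted(groups):
        out.extend(reducer(key, groups[key]))
    return out

packets = ["0.2 h1 h2 100", "0.9 h1 h3 50", "1.1 h1 h2 70", "0.5 h2 h1 10"]
totals = run_mapreduce(packets, map_packet, reduce_bytes)
print(totals)  # [(('h1', 0), 150), (('h1', 1), 70), (('h2', 0), 10)]
```

Dropping the floor call would give every distinct timestamp its own key, which is why the rubric awards only partial credit when it is omitted.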
b. Write map-reduce code to execute the following query on relations r(A,B), and
s(B,C,D) (6 marks):
SELECT r.A, r.B, s.C, s.D FROM r LEFT OUTER JOIN s ON (r.B=s.B)
Answer:
MAP(record):
    if record is from r: emit (r.B, ("r", r.A))
    if record is from s: emit (s.B, ("s", s.C, s.D))
REDUCE(key, list):
    r-list = s-list = empty
    iterate over list and add records to r-list or s-list depending
        on whether the value is tagged "r" or "s"
    if s-list is empty:
        for each record r1 in r-list: emit (r1.A, key, null, null)
    else:
        for each record r1 in r-list, for each record s1 in s-list:
            emit (r1.A, key, s1.C, s1.D)
NOTE: value emitted by map can include r.B and s.B,
although not required since it is in key
Rubrics: Map: 1 mark for each emit.
Reduce: 1 mark for creation of the two lists
1 mark for proper if condition
1 mark each for correct emit in each of two cases
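The reduce-side join can be tried with a small in-memory driver. This is a sketch under assumptions: each record is a tuple whose first field is a hypothetical tag ("r" or "s") naming its source relation, the driver run_mapreduce stands in for the framework, and the sample records are made up.

```python
from collections import defaultdict

def map_join(record):
    """Emit (B, tagged value) so that matching r and s rows meet at one reducer."""
    if record[0] == "r":
        _, a, b = record
        yield b, ("r", a)
    else:
        _, b, c, d = record
        yield b, ("s", c, d)

def reduce_join(key, values):
    """Left outer join for one B value: pad r rows with nulls if s has no match."""
    r_list = [v for v in values if v[0] == "r"]
    s_list = [v for v in values if v[0] == "s"]
    if not s_list:
        for _, a in r_list:
            yield (a, key, None, None)
    else:
        for _, a in r_list:
            for _, c, d in s_list:
                yield (a, key, c, d)

def run_mapreduce(records):
    """Hypothetical single-process driver: group map output by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_join(record):
            groups[key].append(value)
    out = []
    for key in sorted(groups):
        out.extend(reduce_join(key, groups[key]))
    return out

records = [
    ("r", 1, 10), ("r", 2, 20),
    ("s", 10, "c1", "d1"), ("s", 10, "c2", "d2"), ("s", 30, "c3", "d3"),
]
joined = run_mapreduce(records)
print(joined)  # [(1, 10, 'c1', 'd1'), (1, 10, 'c2', 'd2'), (2, 20, None, None)]
```

The s row with B = 30 produces no output because its r-list is empty, which matches left outer join semantics.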