(Slide) Containment of Conjunctive Queries
select-project-join queries.
Useful for optimization of active elements
(e.g., checking distributed constraints,
maintaining materialized views).
Useful for information integration.
Applying a CQ to a Database
Example
p(X,Y) :- q(X,Z) & q(Z,Y)
Substitutions that make both subgoals true:
1. X -> 1; Y -> 3; Z -> 2.
2. X -> 2; Y -> 4; Z -> 3.
These yield heads p(1,3) and p(2,4).
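The substitution search in this example can be sketched in Python. This is only an illustrative sketch: the tuple encoding of atoms, the helper names `is_var`, `match`, and `apply_cq`, and the database {q(1,2), q(2,3), q(3,4)} are assumptions, chosen so that exactly the two substitutions listed above succeed.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()

def match(atom, fact, subst):
    """Extend subst so that atom becomes fact; return None if impossible."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    subst = dict(subst)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            if subst.setdefault(term, value) != value:
                return None  # variable already bound to a different value
        elif term != value:
            return None      # constant in the subgoal does not match the fact
    return subst

def apply_cq(head, body, db):
    """Return the set of head facts produced by applying the CQ to db."""
    substs = [{}]
    for atom in body:
        # Keep every way of extending a partial substitution to this subgoal.
        substs = [s2 for s in substs for fact in db
                  if (s2 := match(atom, fact, s)) is not None]
    return {(head[0],) + tuple(s.get(t, t) for t in head[1:]) for s in substs}

# Hypothetical database, assumed for illustration only.
db = {("q", 1, 2), ("q", 2, 3), ("q", 3, 4)}
heads = apply_cq(("p", "X", "Y"), [("q", "X", "Z"), ("q", "Z", "Y")], db)
print(sorted(heads))
```

On this assumed database the two successful substitutions produce the heads p(1,3) and p(2,4).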
Containment
Q1 ⊆ Q2 iff for every database D, Q1(D) ⊆ Q2(D).
Containment problem is NP-complete, but
Example
A: p(X,Y) :- r(X,W) & b(W,Z) & r(Z,Y)
B: p(X,Y) :- r(X,W) & b(W,W) & r(W,Y)
Claim: B ⊆ A.
In proof, suppose p(x,y) is in B(D). Then
there is some w such that r(x,w), b(w,w), and
r(w,y) are in D.
In A, make the substitution X -> x, Y -> y,
W -> w, Z -> w.
Thus, the head of A becomes p(x,y), and all
subgoals of A are in D.
Thus, p(x,y) is also in A(D), proving B ⊆ A.
1. Containment mappings.
2. Canonical databases.
Similar for basic CQ case, but (2) is useful for
more general cases like negated subgoals.
Containment Mappings
Example
A, B as above:
A: p(X,Y) :- r(X,W) & b(W,Z) & r(Z,Y)
B: p(X,Y) :- r(X,W) & b(W,W) & r(W,Y)
Example
C1: p(X) :- a(X,Y) & a(Y,Z) & a(Z,W)
C2: p(X) :- a(X,Y) & a(Y,X)
Proof (no containment mapping from C2 to C1):
a) X -> X required for head.
b) Thus, first subgoal of C2 must map to
first subgoal of C1; Y must map to Y.
c) Similarly, 2nd subgoal of C2 must map to
2nd subgoal of C1, so X must map to Z.
d) But we already found X maps to X.
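The case analysis above can be checked mechanically with a small backtracking search. This is a minimal sketch, assuming atoms are encoded as tuples and variables as capitalized strings; `containment_mapping` is a hypothetical helper name, not part of these notes. It looks for a mapping on the variables of the source query that sends its head to the target head and every subgoal to some subgoal of the target.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()

def extend(atom, target, mapping):
    """Extend mapping so that atom maps onto target; None if impossible."""
    if atom[0] != target[0] or len(atom) != len(target):
        return None
    mapping = dict(mapping)
    for s, t in zip(atom[1:], target[1:]):
        if is_var(s):
            if mapping.setdefault(s, t) != t:
                return None  # s is already mapped to something else
        elif s != t:
            return None      # constants must map to themselves
    return mapping

def containment_mapping(src, dst):
    """Search for a containment mapping from query src to query dst."""
    (src_head, src_body), (dst_head, dst_body) = src, dst

    def search(i, mapping):
        if i == len(src_body):
            return mapping
        for target in dst_body:  # try each subgoal of dst as the image
            m = extend(src_body[i], target, mapping)
            if m is not None:
                found = search(i + 1, m)
                if found is not None:
                    return found
        return None

    start = extend(src_head, dst_head, {})  # heads must map to each other
    return None if start is None else search(0, start)

A = (("p", "X", "Y"), [("r", "X", "W"), ("b", "W", "Z"), ("r", "Z", "Y")])
B = (("p", "X", "Y"), [("r", "X", "W"), ("b", "W", "W"), ("r", "W", "Y")])
C1 = (("p", "X"), [("a", "X", "Y"), ("a", "Y", "Z"), ("a", "Z", "W")])
C2 = (("p", "X"), [("a", "X", "Y"), ("a", "Y", "X")])

print(containment_mapping(A, B))    # a mapping from A to B exists
print(containment_mapping(C2, C1))  # None: no mapping from C2 to C1
```

The search is exponential in the worst case, which is consistent with containment being NP-complete; for queries this small that cost is negligible.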
Proof (If)
Let μ: Q2 -> Q1 be a containment mapping. Let D
be any DB.
Every tuple t in Q1(D) is produced by some
substitution σ on the variables of Q1 that
makes Q1's subgoals all become facts in D.
Claim: σ∘μ is a substitution for variables of
Q2 that produces t.
1. σ∘μ(Fi) = σ(some Gj), for each subgoal Fi
of Q2. Therefore, it is in D.
2. σ∘μ(H2) = σ(H1) = t.
Thus, every t in Q1(D) is also in Q2(D); i.e.,
Q1 ⊆ Q2.
Example
p(X) :- a(X,Y) & a(Y,Z) & a(Z,W)
[Figure: diagram relating Q2 (E :- F1 & ... & Fm), Q1, and the tuple t]
Example
Again consider
A: p(X,Y) :- r(X,W) & b(W,Z) & r(Z,Y)
B: p(X,Y) :- r(X,W) & b(W,W) & r(W,Y)
Example
C1: p(X) :- a(X,Y) & a(Y,Z) & a(Z,W)
C2: p(X) :- a(X,Y) & a(Y,X)
Here is the test for C2 ⊆ C1:
Choose constants X -> 0, Y -> 1.
Canonical DB from C2 is
D = {a(0,1), a(1,0)}
C1(D) = {p(0), p(1)}, which contains the frozen
head p(0) of C2, so C2 ⊆ C1.
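The canonical-database test can be sketched as follows. This is an illustrative sketch under assumptions: atoms are tuples, variables are capitalized strings, the frozen constants are the integers 0, 1, 2, ... (so the queries themselves must not already use integer constants), and `freeze`, `evaluate`, and `contained` are hypothetical helper names.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()

def freeze(head, body):
    """Replace each variable by a distinct constant 0, 1, 2, ...

    Assumes the query contains no integer constants of its own.
    """
    consts = {}
    def fz(atom):
        return (atom[0],) + tuple(
            consts.setdefault(t, len(consts)) if is_var(t) else t
            for t in atom[1:])
    frozen_head = fz(head)  # head first, so head variables get 0, 1, ...
    return frozen_head, {fz(atom) for atom in body}

def evaluate(head, body, db):
    """Apply a CQ to db by extending substitutions subgoal by subgoal."""
    substs = [{}]
    for atom in body:
        new = []
        for s in substs:
            for fact in db:
                if fact[0] != atom[0] or len(fact) != len(atom):
                    continue
                s2 = dict(s)
                if all((s2.setdefault(t, v) == v) if is_var(t) else t == v
                       for t, v in zip(atom[1:], fact[1:])):
                    new.append(s2)
        substs = new
    return {(head[0],) + tuple(s.get(t, t) for t in head[1:]) for s in substs}

def contained(q1, q2):
    """Test Q1 ⊆ Q2: freeze Q1, then check Q2 derives Q1's frozen head."""
    frozen_head, db = freeze(*q1)
    return frozen_head in evaluate(q2[0], q2[1], db)

C1 = (("p", "X"), [("a", "X", "Y"), ("a", "Y", "Z"), ("a", "Z", "W")])
C2 = (("p", "X"), [("a", "X", "Y"), ("a", "Y", "X")])

print(freeze(*C2))        # frozen head p(0) and D = {a(0,1), a(1,0)}
print(contained(C2, C1))  # True
print(contained(C1, C2))  # False
```

Freezing the head before the body is a design choice made here so that the constants match the example (X -> 0, Y -> 1); any assignment of distinct constants would work equally well.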
general.
Saraiya's algorithm is a polynomial-time test
of Q1 ⊆ Q2 for the common case that no
predicate appears more than twice among the
subgoals of Q1.
They can appear any number of times in
Q2.
The algorithm is a reduction to 2SAT and
yields a linear-time algorithm.
Our algorithm is more direct, but quadratic.
The Algorithm
Example
Example
CQ Q1 is:
Harder Cases
Datalog program ⊆ CQ: doubly exponential