Asked By : Ken Li
Answered By : Patrick87
- $(r, t)$ is a tuple in $R$;
- (a) $(t, s)$ is a tuple of $S$ or (b) $s = w$;
- If $(r, t, s)$ is in the set for $s neq w$, then $(r, t, w)$ is not in the set.
Example: $R$’s schema is $(A_{1}, A_{2}, A_{3})$, $S$’s schema is $(A_{2}, A_{3}, A_{4})$, and we have that $R = {(1, 2, 3), (4, 5, 6)}$ and $S = {(2, 3, 4), (2, 3, 6)}$. By (1) and (2) we get the intermediate result ${(1, 2, 3, 4),(1, 2, 3, 6), (1, 2, 3, epsilon),(4, 5, 6, epsilon)}$. By (3) we must remove $(1, 2, 3, epsilon)$, since we have (for instance) $(1, 2, 3, 4)$ and $s = 4 neq epsilon = w$. We are thus left with ${(1, 2, 3, 4), (1, 2, 3, 6), (4, 5, 6, epsilon)}$, the expected result for a left join. Theorem: R LEFT JOIN S is equivalent to (R EQUIJOIN S) UNION ((((PROJECT_T R) DIFFERENCE (PROJECT_T S)) EQUIJOIN R) JOIN w). Proof: (R EQUIJOIN S) gives us everything required by (1) and (2a). We claim that ((((PROJECT_T R) DIFFERENCE (PROJECT_T S)) EQUIJOIN R) JOIN w) gives us everything of the form (r, t, w) required by (2b) and (3). To see this, first notice that (((PROJECT_T R) DIFFERENCE (PROJECT_T S)) EQUIJOIN R) is the set of all tuples in $R$ for which there is no corresponding tuple in $S$. To see that, it suffices to note that by projecting the common attributes out of $R$ and $S$ (the attribute set $T$) and taking the difference, one is left with all and only those tuples (with schema $T$) which are represented in $R$ but not $S$. By the EQUIJOIN with $R$, we recover all and only those tuples in $R$ which have values for attributes in $T$ which are present in $R$ but not in $S$; namely, precisely the set of tuples we have so far claimed. Next, notice that the schema of (((PROJECT_T R) DIFFERENCE (PROJECT_T S)) is the same as that of $R$ (namely, $(R’, T)$), while the schema of $w$ is $S'$. The JOIN operation is therefore a Cartesian product, we we get all tuples of the form $(r, t, w)$ where there is no $(t, s)$ in $S$ corresponding to $(r, t)$ in $R$. To see that this is precisely the set of tuples we needed to add to R EQUIJOIN S in order to construct R LEFT JOIN S, consider the following: by construction, (3) is satisfied, since R EQUIJOIN S cannot contain $(r, t, s)$ if ((((PROJECT_T R) DIFFERENCE (PROJECT_T S)) EQUIJOIN R) JOIN w) contains $(r, t, w)$ (if it did, then the second part’s containing $(r, t, w)$ would be a contradiction); if we were to add another $(r, t, w)$ not in ((((PROJECT_T R) DIFFERENCE (PROJECT_T S)) EQUIJOIN R) JOIN w), then there would be a $(t, s)$ in $S$ corresponding to $(r, t)$ in $R$, and by the definition of EQUIJOIN, $(r, t, s)$ would also be in R LEFT JOIN S, a contradiction of (3). This completes the proof. Now we show that left join can be used to construct difference: Theorem: R DIFFERENCE S is equivalent to PROJECT_T(SELECT_{t’=w}(R LEFT JOIN (SELECT_{s=s’}(((S JOIN RENAME_{T->T’}(S))))))) Proof: Notice that here, $R'$ and $S'$ are empty, since all attributes are shared for DIFFERENCE to make sense. First, we create a new relation from $S$ by duplicating the attribute set in $S$ (handled by RENAME and JOIN) so that it consists of tuples $(t, t')$ on attribute set $(T, T')$ where $t = t'$ (handled by the SELECT). The left join leaves us with tuples of the form $(t, t')$ where $t=t'$ or $t'=w$. Now, to get rid of entries which do also appear in $S$, we must keep only the tuples of the form $(t, w)$, which is handled by the outermost SELECT. The last PROJECT gets rid of the temporary attribute set $T'$ and leaves us with the difference in terms of the original schema. Example: Let $R = {(1, 2), (3, 4), (5, 6)}$ and $S = {(3, 4), (5, 6), (7, 8)}$. We first get $S$ with the RENAMEd attribute set $T'$: ${(3, 4), (5, 6), (7, 8)}$. The JOIN operation gives us the Cartesian product with all nine possible pairings; this set is not written here for reasons of formatting. The SELECT then pares this down to ${(3, 4, 3, 4), (5, 6, 5, 6), (7, 8, 7, 8)}$. The LEFT JOIN with $R$ gives ${(1, 2, epsilon, epsilon), (3, 4, 3, 4), (5, 6, 5, 6)}$. The SELECT gives ${(1, 2, epsilon, epsilon)}$. The PROJECT gives ${(1, 2)}$, the desired answer.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/2