__Question__:

[Joe Celko's Equal Sets Puzzle]

Set theory has two symbols for subsets. One is a
"horseshoe" on its side, which means that set A
is contained within set B and is sometimes called a
proper subset. The other is the same symbol with
a horizontal bar under it, which means "contained in
or equal to," which is sometimes called a subset
or a containment operator.

Standard SQL has never had an operator to compare
tables against each other. Several college text
books on Relational Databases mention a CONTAINS
predicate which does not exist in standard SQL-89
nor was it proposed for SQL-92. Two such offenders
are *An Introduction to Data Base Systems* by
Bipin C. Desai (West Publishing 1990, ISBN 0-314-66771-7)
and *Fundamentals of Database Systems* by Elmasri and Navthe
(Benjamin Cummings 1989, ISBN 0-8053-0145-3). This
predicate used to exist in the original System R,
IBM's first experimental SQL system, but it was
dropped from later SQL implementations because
of the expense of running it.

The IN predicate is a test for membership, not for
subsets. For those of you who remember your high school
set theory, membership is shown with a stylized epsilon
with the containing set on the right side. Membership
is for one element, while a subset is itself a set, not
just an element.

Chris Date's puzzle in the December issue of *Database
Programming & Design* magazine ("A Matter of Integrity, Part II"
*According to Date*, 1993 December) was to use a supplier and
parts table to find pairs of suppliers who provide EXACTLY
the same parts. This is the same thing as finding two equal sets.

CREATE TABLE SupParts
(sno CHAR(2) NOT NULL,
pno CHAR(2) NOT NULL,
PRIMARY KEY (sno, pno));

The usual way of proving that two sets are equal to each
other is to show that set A contains set B, and set B contains set A.
What you would usually do in standard SQL would be to show that
there exists no element in set A that is not in set B and
therefore A is a subset of B.

*Answer*:

Instead of using subsets, I thought I would look for
another way to do set equality. First I join one
supplier to another on their common parts, eliminating
the situation where supplier one is the same as
supplier two, so that I have the intersection of the
two sets. If the intersection has the same number of
pairs as each of the two sets has elements,
then the two sets are equal.

SELECT SP1.sno AS Supplier1, SP2.sno AS Supplier2
FROM SupParts AS SP1 JOIN SupParts AS SP2
ON (SP1.pno = SP2.pno AND SP1.sno < SP2.sno)
GROUP BY Supplier1, Supplier2
HAVING (SELECT COUNT(*)
FROM SupParts AS SP3
WHERE SP3.sno = SP1.sno)
= (SELECT COUNT(*)
FROM SupParts AS SP4
WHERE SP4.sno = SP2.sno);

This was tested in Watcom SQL and uses some SQL-92 features.
If there is an index on the supplier number in the SupParts
table, it can provide the counts directly as well as help
with the join operation.

It is left to the reader as an exercise to modify the query to
do a proper subset operation. Hint: first remember to change
the "less than" to "not equal", then look at the WHERE clause.

*Puzzle provided courtesy of:
*

Joe Celko

71062.1056@compuserve.com