Question:
[Joe Celko’s Promo Puzzle]
Let’s assume that you have just gotten a job as the sales manager for a department store. Your database has the following two tables. One is a calendar of the promotional events the store has had, and the other is a list of the sales which have been made during the promotions. You need to write a query which will tell me which clerk had the highest amount of sales for each promotion, so we can pay them a performance bonus.
CREATE TABLE Promotions (promo CHAR (25) NOT NULL PRIMARY KEY, startdate DATE NOT NULL, enddate DATE NOT NULL); Promotions promo startdate enddate ===================================================== ‘Feast of St. Fred’ ‘1995-02-01’ ‘1995-02-07’ ‘National Pickle Pageant’ ‘1995-11-01’ ‘1995-11-07’ ‘Christmas Week’ ‘1995-12-18’ ‘1995-12-25’and
CREATE TABLE Sales (clerk CHAR (15) NOT NULL, saledate DATE NOT NULL, amount DECIMAL (8,2) NOT NULL); Sales clerk saledate amount ================================= ‘Curly’ ‘1995-02-03’ 250.99 ‘Curly’ ‘1995-02-03’ 250.99 ‘Curly’ ‘1995-02-04’ 100.00 ‘Curly’ ‘1995-02-05’ 400.98 ‘Curly’ ‘1995-12-19’ 400.98 ‘Curly’ ‘1995-12-20’ 4.98 ‘Larry’ ‘1995-02-03’ 257.50 ‘Larry’ ‘1995-02-04’ 110.00 ‘Larry’ ‘1995-02-05’ 300.98 ‘Larry’ ‘1995-11-01’ 150.25 ‘Larry’ ‘1995-11-01’ 325.00 ‘Larry’ ‘1995-11-02’ 150.75 ‘Larry’ ‘1995-12-23’ 257.50 ‘Larry’ ‘1995-12-24’ 25.50 ‘Moe’ ‘1995-11-01’ 325.00 ‘Moe’ ‘1995-11-01’ 999.75 ‘Moe’ ‘1995-11-03’ 150.00 ‘Moe’ ‘1995-12-18’ 800.00 ‘Moe’ ‘1995-12-19’ 100.00 ‘Moe’ ‘1995-12-20’ 200.00 ‘Moe’ ‘1995-01-04’ 100.00
Answer:
The trick in this query is that we need to find out what each employee sold during each promotion and finally pick the highest sum from those groups. The first part is a fairly easy join and GROUP BY statement.
The final step of finding the largest total sales in each grouping requires a fairly tricky HAVING clause. Let’s look at the answer first, then explain it.
SELECT S1.clerk, P1.promo, SUM(S1.amount) AS totsales FROM Sales AS S1, Promotions AS P1 WHERE S1.saledate BETWEEN P1.startdate AND P1.enddate GROUP BY S1.clerk, P1.promo HAVING SUM(amount) >= ALL (SELECT SUM(amount) FROM Sales AS S2 WHERE S2.clerk S1.clerk AND S2.saledate BETWEEN (SELECT startdate FROM Promotions AS P2 WHERE P2.promo = P1.promo) AND (SELECT enddate FROM Promotions AS P3 WHERE P3.promo = P1.promo) GROUP BY S2.clerk);We want the total sales for the chosen clerk and promotion to be equal or greater than the other total sales of all the other clerks during that promotion. The predicate “S2.clerk S1.clerk” excludes the other clerks from the subquery total. The subquery expressions in the BETWEEN predicate make sure that we are using the right dates for the promotion.
The first thought when trying to improve this query is to replace the subquery expressions in the BETWEEN predicate with direct outer references, like this:
SELECT S1.clerk, P1.promo, SUM(S1.amount) AS totsales FROM Sales AS S1, Promotions AS P1 WHERE S1.saledate BETWEEN P1.startdate AND P1.enddate GROUP BY S1.clerk, P1.promo HAVING SUM(amount) >= ALL (SELECT SUM(amount) FROM Sales AS S2 WHERE S2.clerk S1.clerk AND S2.saledate — Error !! BETWEEN P1.startdate AND P1.enddate GROUP BY S2.clerk);But this will not work — and if you know why, then you really know your SQL. Cover the rest of this column and try to figure it out before you read further.
The “GROUP BY S1.clerk, P1.promo” clause has created a grouped table whose rows contain only aggregate functions and two grouping columns. The original working table built in the FROM clause ceased to exist and was replaced by this grouped working table, so the startdate and enddate also ceased to exist at that point.
However, the subquery expressions work because they reference the outer table P1 while it is still available since the query works from the innermost subqueries outward, and not the grouped table. If we were looking for sales performance between two known, constant dates, then the second query would work when we replaced P1.startdate and P1.enddate with those constants.
To give you a better feel for the data, here are the total sales per clerk for each promotion:
‘Feast of St. Fred’ ‘Curly’ $1,002.96Puzzle provided courtesy of:
Joe Celko
[email protected]