devxlogo

Calculating Averages When Joins Are Required to Filter Data

Calculating Averages When Joins Are Required to Filter Data

Question:
[Joe Celko’s Magazine Sales Puzzle]

This one was posted on the Sybase forum of CompuServe by Keith McGregor in November of 1994.

One of his end users came to him with the following query. After nearly three days of trial and error, he still didn’t have a clue how to tell her to do it. He could have done this in about 30 minutes using COBOL and flat files.

You are given the following tables for a magazine distribution database:

    CREATE TABLE Titles        (product_id INTEGER NOT NULL PRIMARY KEY,        title INTEGER NOT NULL,        issue INTEGER NOT NULL,        issue_year INTEGER NOT NULL);    CREATE TABLE Sales        (product_id INTEGER NOT NULL,        custno INTEGER NOT NULL,         net_sold_qty INTEGER NOT NULL,         PRIMARY KEY (product_id, customer));    CREATE TABLE Customers        (custno INTEGER NOT NULL PRIMARY KEY        customer_name CHAR (20) NOT NULL);
He needs to select the customer_name(s) where:
a) The average net_sold_qty is greater than two for BOTH title 02667 and 48632 (if the average is two or less for either one, do not select the customer at all).
or
b) The average net_sold_qty is greater than five for title 01107 (if this is true, select the customer regardless of the result of condition a).

Answer:
Let’s create a view of the three tables joined together which will give us the basic information we are after. Maybe this view can be used for other reports later.

    CREATE VIEW Magazine(customer_name, title, net_sold_qty)        AS SELECT Sales.customer_name, Titles.title, net_sold_qty            FROM Titles, Sales, Customers            WHERE Sales.custno = Customers.custno                 AND Titles.product_id = Sales.product_id;
Then we write the query from hell:
    SELECT customer_name        FROM Magazines AS M0        GROUP BY customer_name         HAVING  — the two accept conditions            (5 < (SELECT AVG(net_sold_qty)                FROM Magazines AS M1                WHERE M1.custno = M0.custno                     AND title = '01107'))            OR (2 < (SELECT AVG(net_sold_qty)                FROM Magazines AS M2                WHERE M2.custno = M0.custno                     AND title IN ('02667', '48632')))         AND NOT -- the two reject conditions            ( 2 > (SELECT AVG(net_sold_qty)                FROM Magazines AS M3                WHERE M3.custno = M0.custno                     AND title IN = ‘02667’)            OR             2 > (SELECT AVG(net_sold_qty)                FROM Magazines AS M4                WHERE M4.custno = M0.custno                     AND title = ‘48632’)            );
For bonus points, can any reader simplify or improve this expression? Hint: DeMorgan’s law might be useful and it would help to have a decision table.
See also  How College Students Can Shape the Future of Tech Responsibility

The solution provided to the puzzle can be greatly simplified by using two techniques: First create a VIEW of the average sales and include an ‘EXISTS’ for the condition of two titles which both must exceed a threshold.

The DML:

    CREATE VIEW Magazines (custno, title, avg_qty_sold) AS        SELECT Sales.custno, Titles.title, AVG(Sales.net_sold_qty)            FROM Titles, Customers, Sales            WHERE Titles.product_id = Sales.product_id                AND Customers.custno = Sales.custno                AND Titles.title IN (01107, 02667, 48632)            GROUP BY Sales.custno, Titles.title;    SELECT DISTINCT Customers.customer_name        FROM Magazines, Customers        WHERE Customers.custno = Magazines.custno            AND ((Magazines.title = 1107 AND Magazines.avg_qty_sold > 5)            OR (Magazines.title = 2667 AND Magazines.avg_qty_sold > 2            AND EXISTS                (SELECT 1                    FROM Magazines, Other                    WHERE Other.title = 48632                        AND Other.custno = Magazines.custno                        AND Other.avg_qty_sold > 2)));

Puzzle provided courtesy of:
Joe Celko
[email protected]

devxblackblue

About Our Editorial Process

At DevX, we’re dedicated to tech entrepreneurship. Our team closely follows industry shifts, new products, AI breakthroughs, technology trends, and funding announcements. Articles undergo thorough editing to ensure accuracy and clarity, reflecting DevX’s style and supporting entrepreneurs in the tech sphere.

See our full editorial policy.

About Our Journalist