Calculating Averages When Joins Are Required to Filter Data

Question:
[Joe Celko’s Magazine Sales Puzzle]

This one was posted on the Sybase forum of CompuServe by Keith McGregor in November of 1994.

One of his end users came to him with the following query. After nearly three days of trial and error, he still didn’t have a clue how to tell her to do it. He could have done this in about 30 minutes using COBOL and flat files.

You are given the following tables for a magazine distribution database:

    CREATE TABLE Titles        (product_id INTEGER NOT NULL PRIMARY KEY,        title INTEGER NOT NULL,        issue INTEGER NOT NULL,        issue_year INTEGER NOT NULL);    CREATE TABLE Sales        (product_id INTEGER NOT NULL,        custno INTEGER NOT NULL,         net_sold_qty INTEGER NOT NULL,         PRIMARY KEY (product_id, customer));    CREATE TABLE Customers        (custno INTEGER NOT NULL PRIMARY KEY        customer_name CHAR (20) NOT NULL);
He needs to select the customer_name(s) where:
a) The average net_sold_qty is greater than two for BOTH title 02667 and 48632 (if the average is two or less for either one, do not select the customer at all).
or
b) The average net_sold_qty is greater than five for title 01107 (if this is true, select the customer regardless of the result of condition a).

Answer:
Let’s create a view of the three tables joined together which will give us the basic information we are after. Maybe this view can be used for other reports later.

    CREATE VIEW Magazine(customer_name, title, net_sold_qty)        AS SELECT Sales.customer_name, Titles.title, net_sold_qty            FROM Titles, Sales, Customers            WHERE Sales.custno = Customers.custno                 AND Titles.product_id = Sales.product_id;
Then we write the query from hell:
    SELECT customer_name        FROM Magazines AS M0        GROUP BY customer_name         HAVING  — the two accept conditions            (5 < (SELECT AVG(net_sold_qty)                FROM Magazines AS M1                WHERE M1.custno = M0.custno                     AND title = '01107'))            OR (2 < (SELECT AVG(net_sold_qty)                FROM Magazines AS M2                WHERE M2.custno = M0.custno                     AND title IN ('02667', '48632')))         AND NOT -- the two reject conditions            ( 2 > (SELECT AVG(net_sold_qty)                FROM Magazines AS M3                WHERE M3.custno = M0.custno                     AND title IN = ‘02667’)            OR             2 > (SELECT AVG(net_sold_qty)                FROM Magazines AS M4                WHERE M4.custno = M0.custno                     AND title = ‘48632’)            );
For bonus points, can any reader simplify or improve this expression? Hint: DeMorgan’s law might be useful and it would help to have a decision table.

The solution provided to the puzzle can be greatly simplified by using two techniques: First create a VIEW of the average sales and include an ‘EXISTS’ for the condition of two titles which both must exceed a threshold.

The DML:

    CREATE VIEW Magazines (custno, title, avg_qty_sold) AS        SELECT Sales.custno, Titles.title, AVG(Sales.net_sold_qty)            FROM Titles, Customers, Sales            WHERE Titles.product_id = Sales.product_id                AND Customers.custno = Sales.custno                AND Titles.title IN (01107, 02667, 48632)            GROUP BY Sales.custno, Titles.title;    SELECT DISTINCT Customers.customer_name        FROM Magazines, Customers        WHERE Customers.custno = Magazines.custno            AND ((Magazines.title = 1107 AND Magazines.avg_qty_sold > 5)            OR (Magazines.title = 2667 AND Magazines.avg_qty_sold > 2            AND EXISTS                (SELECT 1                    FROM Magazines, Other                    WHERE Other.title = 48632                        AND Other.custno = Magazines.custno                        AND Other.avg_qty_sold > 2)));

Puzzle provided courtesy of:
Joe Celko
[email protected]

Share the Post:
Share on facebook
Share on twitter
Share on linkedin

Overview

The Latest

your company's audio

4 Areas of Your Company Where Your Audio Really Matters

Your company probably relies on audio more than you realize. Whether you’re creating a spoken text message to a colleague or giving a speech, you want your audio to shine. Otherwise, you could cause avoidable friction points and potentially hurt your brand reputation. For example, let’s say you create a

chrome os developer mode

How to Turn on Chrome OS Developer Mode

Google’s Chrome OS is a popular operating system that is widely used on Chromebooks and other devices. While it is designed to be simple and user-friendly, there are times when users may want to access additional features and functionality. One way to do this is by turning on Chrome OS

homes in the real estate industry

Exploring the Latest Tech Trends Impacting the Real Estate Industry

The real estate industry is changing thanks to the newest technological advancements. These new developments — from blockchain and AI to virtual reality and 3D printing — are poised to change how we buy and sell homes. Real estate brokers, buyers, sellers, wholesale real estate professionals, fix and flippers, and beyond may