Finding the Greatest Lower Bound

[Joe Celko’s Sales Gaps Puzzle]

Given a table “sales” with columns “saledate” and “customer,” is there a way to get the average number of days between sale dates for each customer in a single SQL statement? Let’s assume nobody makes a sale to the same person on the same day, so we have this very simple table:

    CREATE TABLE Sales
        (customer CHAR(5) NOT NULL,
         saledate DATE NOT NULL,
         PRIMARY KEY (customer, saledate));

This is a problem where the more you know, the more you hurt yourself. To find the gap between sales, the SQL guru does a self join or builds VIEWs. The first task is to get the sales into a table that pairs each saledate with the date of that customer’s previous purchase:

    CREATE VIEW LastSales (customer, thissaledate, lastsaledate)
        AS SELECT S1.customer, S1.saledate,
                  (SELECT MAX(S2.saledate)
                     FROM Sales AS S2
                    WHERE S2.saledate < S1.saledate
                      AND S2.customer = S1.customer)
             FROM Sales AS S1;

This is a greatest lower bound query: we want the highest date in the set of dates for this customer which comes before the current date. For a customer’s first sale there is no earlier date, so the subquery returns NULL.
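To see what the greatest lower bound subquery returns, here is a minimal sketch that runs it against a few made-up rows. It assumes SQLite (through Python's standard sqlite3 module) and ISO-formatted date strings, which compare correctly as text; the customers and dates are invented for illustration.

```python
import sqlite3

# In-memory database with a handful of hypothetical sales.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales
        (customer CHAR(5) NOT NULL,
         saledate DATE NOT NULL,
         PRIMARY KEY (customer, saledate));

    INSERT INTO Sales VALUES
        ('Alice', '2024-01-01'),
        ('Alice', '2024-01-11'),
        ('Alice', '2024-01-31'),
        ('Bob',   '2024-03-05');
""")

# For each sale, find the latest earlier sale date for the same customer.
# The first sale of each customer has no earlier date, so it gets NULL.
last_sales = conn.execute("""
    SELECT S1.customer, S1.saledate,
           (SELECT MAX(S2.saledate)
              FROM Sales AS S2
             WHERE S2.saledate < S1.saledate
               AND S2.customer = S1.customer) AS lastsaledate
      FROM Sales AS S1
     ORDER BY S1.customer, S1.saledate
""").fetchall()

for row in last_sales:
    print(row)
```

Each customer's first sale comes back with None (SQL NULL) as its previous date, and every later sale is paired with the sale immediately before it.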

Now we construct a VIEW with the gap in days between this sale and their last purchase. You could combine the two views in one statement, but it would be unreadable.

    CREATE VIEW SalesGap (customer, gap)
        AS SELECT customer, DAYS(thissaledate, lastsaledate)
             FROM LastSales;

The DAYS function, or something like it, is a library routine in each vendor’s SQL which will give the interval between two dates in days. The final answer is one query:

    SELECT customer, AVG(gap)
      FROM SalesGap
     GROUP BY customer;

You could combine the two nested views into the AVG() call, but it would be unreadable, might blow up, and would run like molasses.

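Or, if you stop and think about the question being asked, you notice that the gaps between consecutive sales add up to the span from the first sale to the last, and that n sales have n - 1 gaps between them. So you simply write:

    SELECT customer,
           (MAX(saledate) - MIN(saledate)) / (COUNT(*) - 1) AS gap
      FROM Sales
     GROUP BY customer
    HAVING COUNT(*) > 1;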
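As a sanity check, the sketch below runs the view-based approach and the single-statement shortcut side by side on a few invented rows. It assumes SQLite via Python's sqlite3 module and uses SQLite's julianday() in place of the vendor-specific DAYS function and of direct date subtraction; the sample customers and dates are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales
        (customer CHAR(5) NOT NULL,
         saledate DATE NOT NULL,
         PRIMARY KEY (customer, saledate));

    INSERT INTO Sales VALUES
        ('Alice', '2024-01-01'), ('Alice', '2024-01-11'), ('Alice', '2024-01-31'),
        ('Bob',   '2024-03-05'),
        ('Carol', '2024-02-01'), ('Carol', '2024-02-08');

    -- Each sale paired with the same customer's previous sale date.
    CREATE VIEW LastSales AS
        SELECT S1.customer,
               S1.saledate AS thissaledate,
               (SELECT MAX(S2.saledate)
                  FROM Sales AS S2
                 WHERE S2.saledate < S1.saledate
                   AND S2.customer = S1.customer) AS lastsaledate
          FROM Sales AS S1;

    -- Gap in days; julianday() stands in for DAYS() (SQLite-specific).
    CREATE VIEW SalesGap AS
        SELECT customer,
               julianday(thissaledate) - julianday(lastsaledate) AS gap
          FROM LastSales;
""")

# View-based answer: AVG() skips the NULL gap of each first sale.
view_avg = conn.execute("""
    SELECT customer, AVG(gap)
      FROM SalesGap
     GROUP BY customer
     ORDER BY customer
""").fetchall()

# Shortcut: total span divided by the number of gaps.
shortcut = conn.execute("""
    SELECT customer,
           (julianday(MAX(saledate)) - julianday(MIN(saledate)))
               / (COUNT(*) - 1) AS gap
      FROM Sales
     GROUP BY customer
    HAVING COUNT(*) > 1
     ORDER BY customer
""").fetchall()

print(view_avg)   # [('Alice', 15.0), ('Bob', None), ('Carol', 7.0)]
print(shortcut)   # [('Alice', 15.0), ('Carol', 7.0)]
```

The two agree for every customer with at least two sales. The only difference is the single-sale customer: the view version reports a NULL average, while the HAVING clause in the shortcut drops that customer entirely. In products where subtracting dates yields an integer, the division may truncate, so a cast to a decimal type may be needed there.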

Puzzle provided courtesy of:
Joe Celko
