Nested Outer Joins

Nested Outer Joins

Question:
[Joe Celko’s Outer (JOIN) Limits Puzzle]

Gerard Manko at ARI had a problem in 1994 April. They had just switched over from Paradox to Watcom SQL. The conversion of the legacy database was done on a one-for-one basis, without any thought of normalization. Yes, I know that as the SQL Guru, I should have sent him to whatever ring of hell we are reserving for people who don’t normalize. But that does not get the job done, and it is something we find in the real world all the time.

The system tracks teams of employees to work on jobs. Each job has a slot for a primary and a slot for an optional assistant mechanic. The tables involved look like this:

    CREATE TABLE Jobs        (jobid INTEGER NOT NULL PRIMARY KEY,        startdate DATE NOT NULL,        … );    CREATE TABLE Employees        (empid INTEGER NOT NULL PRIMARY KEY        name CHAR (20) NOT NULL,        …);    CREATE TABLE Teams        (jobid INTEGER NOT NULL,        mechslot INTEGER NOT NULL,        empid INTEGER NOT NULL,        … );
Your first task is just to add some integrity checking to the Teams table. We will not worry about normalization, since a lot of the procedures depend on having these tables set up as they are.

What you want to do is build a query for a report that lists all the jobs by jobid, the primary mechanic (if any), and the assistant mechanic (if any). Hint: you can get the jobids from Jobs because that table has all of the current jobs, while the Teams table lists only those jobs for which a team has been assigned.

Answer:
The first problem is to add a little referential integrity. The Teams table should probably be tied to the others with foreign key references and it is always a good idea to check the codes in the database schema, like so:

    CREATE TABLE Teams        (jobid INTEGER NOT NULL REFERENCES Jobs(jobid),        mechslot INTEGER NOT NULL            CHECK (mechslot IN (‘Primary’, ‘Assistant’)),        empid INTEGER NOT NULL REFERENCES Employees(empid),        …);
The trick in this problem is that experienced SQL people will immediately think of using a LEFT OUTER JOIN because if they wanted to get just the primary mechanics, they would write:
    SELECT Jobs.jobid, Teams.empid AS Primary        FROM (Jobs LEFT OUTER JOIN Teams            ON Jobs.jobid = Teams.jobid)        WHERE Teams.mechslot = ‘Primary’;
Or realizing that Jobs is the preserved table in the left outer join:
    SELECT Jobs.jobid, Teams.empid AS Primary        FROM (Jobs LEFT OUTER JOIN Teams            ON (Jobs.jobid = Teams.jobid                AND Teams.mechslot = ‘Primary’));
The reader might want to verify for himself that if the Teams table were the preserved table, this transform would not work.

Or if you wanted to get the employees by name, you could nest the joins deeper:

    SELECT Jobs.jobid, Employees.name AS Primary        FROM ((Jobs LEFT OUTER JOIN Teams            ON Jobs.jobid = Teams.jobid)         AS P1 LEFT OUTER JOIN Employees             ON P1.empid = Employees.empid)        WHERE P1.mechslot = ‘Primary’;
The problem here is that you want to do two independent outer joins for each mechanic’s slot on a team and put the results in one table. It is probably possible to build a horrible, deeply nested self outer join all in one SELECT statement. But you could not read it or understand it. You could do the report with VIEWs for primary and assistant mechanics as shown, then put them together. But you can avoid all of this mess with:
    SELECT Jobs.jobid,        (SELECT empid FROM Teams            WHERE Jobs.jobid = Teams.jobid                AND Teams.mechslot = ‘Primary’) AS Primary,        (SELECT empid FROM Teams            WHERE Jobs.jobid = Teams.jobid                AND Teams.mechslot = ‘Assistant’) AS Assistant        FROM Jobs;
The neat SQL-92 trick is the ability to nest two independent scalar SELECT statements in the outermost SELECT. To add the employee’s name, just change the innermost SELECT statements.
    SELECT Jobs.jobid,        (SELECT name FROM Teams, Employees            WHERE Jobs.jobid = Teams.jobid                AND Employees.empid = Teams.empid                AND Teams.mechslot = ‘Primary’) AS Primary,        (SELECT name FROM Teams, Employees            WHERE Jobs.jobid = Teams.jobid                AND Employees.empid = Teams.empid                AND Teams.mechslot = ‘Assistant’) AS Assistant        FROM Jobs;
If you have a guy acting as both primary and assistant mechanic on a single job, then you will get him in both slots. If you have two or more primary mechanics, or you have two or more assistant mechanics on a job, then you will get an error. If you have no primary or assistant mechanics, then you will get an empty SELECT result and that becomes a NULL. That gives you the outer joins you wanted.

Puzzle provided courtesy of:
Joe Celko
[email protected]

Share the Post:
Heading photo, Metadata.

What is Metadata?

What is metadata? Well, It’s an odd concept to wrap your head around. Metadata is essentially the secondary layer of data that tracks details about the “regular” data. The regular

XDR solutions

The Benefits of Using XDR Solutions

Cybercriminals constantly adapt their strategies, developing newer, more powerful, and intelligent ways to attack your network. Since security professionals must innovate as well, more conventional endpoint detection solutions have evolved

AI is revolutionizing fraud detection

How AI is Revolutionizing Fraud Detection

Artificial intelligence – commonly known as AI – means a form of technology with multiple uses. As a result, it has become extremely valuable to a number of businesses across

AI innovation

Companies Leading AI Innovation in 2023

Artificial intelligence (AI) has been transforming industries and revolutionizing business operations. AI’s potential to enhance efficiency and productivity has become crucial to many businesses. As we move into 2023, several

data fivetran pricing

Fivetran Pricing Explained

One of the biggest trends of the 21st century is the massive surge in analytics. Analytics is the process of utilizing data to drive future decision-making. With so much of

kubernetes logging

Kubernetes Logging: What You Need to Know

Kubernetes from Google is one of the most popular open-source and free container management solutions made to make managing and deploying applications easier. It has a solid architecture that makes