Using SQL’s EXISTS Predicate to Identify Missing Data

Using SQL’s EXISTS Predicate to Identify Missing Data

inding data in a relational database is usually a simple task, especially if you’re simply comparing entries to a specific value, but queries rapidly become more complicated when you start hunting for data that’s missing, in other words, data that could exist, but doesn’t. Missing data is data related to existing data that simply doesn’t occur. You could find missing values using a two-query solution, but SQL’s EXISTS predicate can help you quickly find missing values with just one query?a subquery to be exact.

For example, you may create a query to find active customers by counting recent orders. However, finding inactive customers may be just as important; perhaps you can learn why they’re inactive and perhaps turn them into active customers. In this article, you’ll see how to use the SQL EXISTS predicate to find missing data. This article is aimed at the Jet and Transact-SQL (T-SQL) audience. We’ll offer specific instructions when syntax or rules differ between the two dialects.

Fast Facts
This article refers to Jet and Transact-SQL. The examples work in Access 97, 2000, and 2002. The T-SQL syntax is correct for SQL Server 2000.

SQL’s EXISTS predicate specifies a subquery and then compares a value against the existence of one or more rows in that subquery. The subquery returns True when the subquery contains any rows and False when it doesn’t. However, all of that is probably clear as mud if you’re not familiar with subqueries.

A subquery is a SELECT query within another SELECT, INSERT, UPDATE, DELETE query, or another subquery. In other words, the subquery returns a result set, which is then subject to the main query’s conditions and criteria. Think of the subquery as a filter. The results of the embedded SELECT, also known as the inner query or inner select, become part of the search condition for the main query, otherwise known as the outer query or outer select.

The subquery can take many forms:

   SELECT field1?, (subquery) AS alias FROM datasource   SELECT fieldlist FROM datasource WHERE field comparison operator (subquery)   SELECT fieldlist FROM datasource WHERE field ANY|SOME|ALL (subquery)   WHERE expression [NOT] EXISTS (subquery)

In the preceding code, comparison operator equals one of the following: =, , , >=, , !lt;, or

Create a Subquery with EXISTS
When combined with EXISTS, an outer query checks for the existence of values within the inner query’s result set. Now, using the Northwind sample database that comes with Access, suppose you want to learn which companies have placed orders. The solution is simple?you don’t need a complex EXISTS solution just yet. The following query returns a unique list of companies with records in the Orders table (see Figure 1):

   SELECT DISTINCT Orders.CustomerID   FROM Orders

There are 89 companies that have placed at least one order (see Figure 1). The Access query displays the company name instead of the CustomerID value because the CustomerID field in the Orders table is a lookup field. A lookup field is an Access data type that displays a value other than the value that’s actually stored (SQL Server 2000 also supports lookup fields via an ADP front-end).

Figure 2).
Figure 2: Combining the EXISTS predicate with the NOT operator returns the two customers who haven’t placed an order.

The inner query compares the CustomerID value for each record in the Orders table to the CustomerID values in the Customers table. When a matching value is found, the inner query returns True. A True value means the customer has placed an order, but the NOT operator preceding the EXISTS predicate eliminates that particular value from the outer query’s result set.

When the inner query doesn’t find any matching records, it returns False. Remember, a False value means the customer hasn’t ordered. The NOT operator then negates that False value, so the outer query can include that record in its result set. The EXISTS predicate finds values that do exist; preceding that predicate with the NOT operator finds those that don’t.

Missing data can often be just as informative as data that’s available. Unfortunately, learning what’s missing isn’t always as easy as finding existing data. Combining SQL’s EXISTS predicate with the NOT operator solves the problem and quickly turns up missing information.


Share the Post: