Index Covering Boosts SQL Server Query Performance

reating a non-clustered index that contains all the columns used in a SQL query, a technique called index covering, is a quick and easy solution to many query performance problems. Sometimes just adding a column or two to an index can really boost the query’s performance. Check out a couple of all-too-common scenarios where applying index covering can speed up the queries in your Microsoft SQL Server database.

Scenario 1: Speed Up a Query for a Table with a Clustered Index

Consider a very typical scenario: you have a table ORDERS with a clustered index on CUSTOMER_ID column. You need to speed up a select query quickly:

SELECT ORDER_DATE, SUM(AMOUNT) FROM ORDERS GROUP BY ORDER_DATE

The query’s execution plan is quite simple: the database engine scans the whole clustered index, and then it sorts the intermediate result set to satisfy the GROUP BY clause. Can a non-clustered index speed up the query? Definitely. Just create a non-clustered index that contains all the columns used in the query:

CREATE  INDEX order_amt ON dbo.ORDERS(ORDER_DATE, AMOUNT)

Rerun the query and you’ll find it runs very fast. Why? First, look at the execution plan: the query accesses only the index order_amt; it doesn’t touch the table at all. In fact, there is no need to access the table because all the columns necessary to satisfy the query are already stored in the index. This is index covering. The index is much smaller than the table, which is why the query read many fewer pages. Also, the index entries are already stored in the necessary order, so there is no need to sort the intermediate result set.

Not only does the plan look good, but the real execution cost of the query after creating the index also is much cheaper:

logical reads 1919, (snip)   CPU time = 875 ms

This is significantly less than the real execution costs before the index was created:

logical reads 8365,  (snip)   CPU time = 2015 ms

As you can see, the query ran much faster because it read fewer pages and did not sort intermediate results.

Now that the query runs fast enough, it is time to consider the price you paid for speeding it up. The index uses up some storage. Also, modifying the table might become slower. If having this query return as fast as possible is a high priority for you, you can stop at this point. Depending on your circumstances, however, you might want to consider the impact of this new index on the overall performance of the system. You can use the Index Tuning Wizard in SQL Server to validate your quick fix against a realistic workload, but that is beyond the scope of this article. (Microsoft TechNet offers instructions for Index Tuning Wizard use in SQL Server 2005 and SQL Server 7.0.)

Scenario 2: Speed Up Select Queries Using Predicates on a Column

Consider another typical scenario when index covering is very useful: you need to speed up several select queries, like the following two, using predicates on ITEM_ID:

SELECT TURNAROUND, COUNT(*) FROM DBO.ORDERS WHERE ITEM_ID=10000GROUP BY TURNAROUNDSELECT SUM(AMOUNT) FROM DBO.ORDERS WHERE ITEM_ID=10093

The table ORDERS is the same as in the previous scenario. It has a non-clustered index on the ITEM_ID column. Both queries use the non-clustered index on ITEM_ID and access the qualifying rows in the table via bookmark lookups. It would be great to create a clustered index on ITEM_ID (having a clustered index on a foreign key is very common), but that is not an option: a clustered index already exists on CUSTOMER_ID.

Instead, you can go for a covering non-clustered index on (ITEM_ID, AMOUNT, TURNAROUND). Both queries will run much faster, because the need to do bookmark lookups is eliminated. All the columns necessary for both queries are stored in index entries. Also consider dropping the index on (ITEM_ID)?you probably will no longer need it.

The Fine Print

When you add an index on a table, you usually slow down modifications against that table. This is why keeping the number of indexes as small as possible usually (not always) is very important. So under most circumstances you should have just one index on (ITEM_ID, AMOUNT, TURNAROUND), not separate indexes on (ITEM_ID, AMOUNT) and (ITEM_ID, TURNAROUND). This one index will use less space, and it will not slow down the modifications as much as several indexes would.

If you have a clustered index, then every index entry in a non-clustered index also stores all the columns necessary to locate the row via the clustered index (they are called bookmarks). These bookmark columns also count towards index covering. For instance, if the table ORDERS is clustered on CUSTOMER_ID, then the non-clustered index on ORDER_DATE covers the query:

SELECT DISTINCT CUSTOMER_ID FROM ORDERS WHERE ORDER_DT='20050917'

Note that some well-known rules of thumb about non-clustered indexes are not relevant to covering non-clustered ones. For instance, even though ORDER_DT is less selective than ITEM_ID, you can successfully use an index on ORDERS(ORDER_DT, ITEM_ID) to cover a query:

SELECT DISTINCT ITEM_ID FROM ORDERS WHERE ORDER_DT='20050202'

More to the point, putting the more selective column ITEM_ID first in the index will result in worse performance. Another example is the following query:

SELECT DISTINCT LAST_NAME, FIRST_NAME FROM CUSTOMERS WHERE FIRST_NAME LIKE '%eather%'

This query is covered by an index on (LAST_NAME, FIRST_NAME) even though only the second column in the index is used in the WHERE clause and the predicate FIRST_NAME LIKE ‘%eather%’ is not index-friendly (a.k.a. not sargeable).

For SQL Server-to-Oracle Migrants

If you have recently switched to MS SQL Server from Oracle, be aware that MS SQL Server’s indexes store entries even for the rows for which all the columns in the index definition are null. For example, even if both LAST_NAME and FIRST_NAME columns are nullable, the index on LAST_NAME, FIRST_NAME will still have entries for all the rows in the table?even for those rows with both LAST_NAME and FIRST_NAME null. So, unlike Oracle, MS SQL Server will consider the index on (LAST_NAME, FIRST_NAME) as covering the query:

SELECT DISTINCT LAST_NAME, FIRST_NAME FROM CUSTOMERS

(Index) Covering the Basics

What you have just learned are only the basics. Learn more about index covering and the Index Tuning Wizard to get the best performance from your queries.

Good luck!

Share the Post:
Share on facebook
Share on twitter
Share on linkedin

Overview

Recent Articles: