DevX HomePage

DB2 Explains Itself: A Roadmap to Faster Query Runtime

DB2 creates a plan for each dynamic SQL query you run. Evaluating this plan is one of the keys to query runtime efficiency. In this 10-Minute Solution, DB2 Pro Greg Nash shows you how to use DB2's explain facility to identify and correct the planning weaknesses that are slowing your queries down.
ave you ever wondered why some of your queries take so long to run? Or how DB2 interprets your requests?

As with all high-level programming languages, SQL must be translated into a set of smaller operations, which can be performed in turn. Because the efficiency of each operation can vary greatly with the nature of the data, considerable attention should be paid to selecting the best order of operations.

You can compare this with shopping. There may be many paths from your home to the shop you need to visit. You might want to park as close to the shop as you can, or perhaps park further away to avoid parking station costs and queuing to leave. There may be a smaller shop closer to home that has what you want. You may want to avoid a busy road. The best shopping path isn't always obvious, and you could waste hours though poor planning.

To shorten a query's runtime, you need to locate any weaknesses in the operation planning. How? IBM's online documentation boasts: "DB2 provides the most comprehensive explain facility in the industry with detailed optimizer information on the access plan chosen for an explained SQL statement." The explain facility outlines how the database will go about your work. This allows you to identify and correct any inefficiencies in your query planning.



How can I make my queries faster?



Use DB2's EXPLAIN feature to access and evaluate how it plans to access your data.




Background: Plans and Costs
DB2 builds plans. A plan is basically the directions that will be followed to the data (or to the shops). To save time, some items on the plan can be performed in parallel (if you have a spouse, they might buy meat while you buy fruit).

Static plans are built once, and used many times—DB2 stores them when you BIND a collection of static SQL. It's like establishing a favorite path for your shopping. You don't have to think about it every time, as you have already decided on a good plan. Plans for dynamic SQL, however, must be evaluated each time. Each plan can be evaluated for its overall cost, which takes into account the nature of the data, and DB2's understanding of the system on which it's running. Critically important in this evaluation are the runtime statistics—the size of your tables, the cardinality of various columns, and the clustering in the indexes.

Figure 1. Color Coding: Here is a sample tree from "Visual Explain."
Figure 2.
Drilling Down: Here is a sample details of a node from "Visual Explain."

Show Me the Plan
There are several tools centered around explaining DB2's plans: All these tools use a set of data known as the "explain tables." Visual Explain is capable of creating these tables on demand. With the other tools, you'll need to run a script such as in Listing 2:




C:\PROGRA~1\SQLLIB\bin>cd ..\misc


C:\PROGRA~1\SQLLIB\misc>db2 connect to sample

   Database Connection Information

 Database server        = DB2/NT 7.2.2
 SQL authorization ID   = GRNASH
 Local database alias   = SAMPLE


C:\PROGRA~1\SQLLIB\misc>db2 -tf explain.ddl

******* IMPORTANT **********

USAGE: db2 -tf EXPLAIN.DDL

******* IMPORTANT **********


DB20000I  The UPDATE COMMAND OPTIONS command completed successfully.

Now you can assess the results.




The Plan Explained

Figure 3. Examining the Nodes: This larger graph shows some nodes feeding multiple higher nodes (using temporary tables).
Whichever explain tool you use, you will have an estimate of the cost of the query in "timerons". A "timeron" is an abstract unit of measure. While it doesn't directly equate to any actual elapsed time, it does give a rough relative estimate of the resources (cost) required by a database manager to execute an access plan. You can try different variations of a query, different optimization levels or other changes, and look for a lower cost to indicate a faster result. Working from the bottom up, examining the entire plan may help you find specific inefficiencies. The explain lists and graphs show how the required data is located, collated, and prepared for return. Note that the graph isn't always a tree - it takes sensible shortcuts if it saves performing the same operation twice. At each node, you can examine the arguments to the operator, the estimated cost of the node, and the cumulative cost of all operations up to this point.

Here are the main node types:

That covers the basic operators. There's a lot more information than just the node types and estimated cost. You can find it in IBM's "Description of db2expln and dynexpln Output."




Reducing the Cost
Now that you can read the price tags, it's time to reduce the cost. Here are some guidelines:

1.) Better-informed decisions. DB2's optimizer must be well informed to make the best decisions.

2.) Don't buy junk. It's a simple rule - don't buy what you don't need. This includes:

3.) Keep an index handy. Many columns can be used in an index, and the order of the columns is significant. You might want to use more than one index.

4.) Use a Summary Table. See my solution "Tame Beastly Data with Summary Tables."

While there's much more that can be learned on your way to writing efficient queries, you've got a good start in knowing how to measure the impact of your efforts.

Greg Nash is an integration engineer with a manufacturing company.


DevX is a division of Jupitermedia Corporation
© Copyright 2007 Jupitermedia Corporation. All Rights Reserved. Legal Notices