The performance of application queries is critical. Simply improving query efficiency can often yield manifold performance gains, and with IT budgets being squeezed, getting the maximum performance out of existing infrastructure has become mandatory. Adding more hardware (RAM, faster disks, or more CPUs) is likely not an option in the current economic climate.
Determining which queries will provide the best performance for a given application often comes down to a simple question: does Query X perform better than Query Y? Performance is normally defined in terms of response time, throughput, and scalability. Response time is the elapsed time between submitting a request and receiving its response. Throughput is the number of requests or transactions handled in a given span of time. Scalability is the ability of a system to improve throughput, response time, or both as the underlying resources (CPU, memory, and disk) grow.
This article explores utilities and methods that can help you compare query response times in Oracle, enabling you to answer the which-query-is-better question. It does not cover utilities for isolating problem queries; such queries are usually identified during development and testing, or flagged by deployment/production DBAs.
Comparing queries requires some test cases, so let's set those up first.
Setting Up the Test Cases
You will use three sets of comparable queries that return equivalent results. The queries are somewhat contrived, but they are derived from real-world examples:
- Set 1:
- Query 1 (Version A): Returns totals and sub-totals for all objects, by owner and object types
- Query 1 (Version B): Returns totals and sub-totals for all objects, by owner and object types, based on the new GROUP BY CUBE syntax (find examples of such queries in Mastering Oracle SQL, 2nd Edition by Sanjay Mishra and Alan Beaulieu)
- Set 2:
- Query 2 (Version A): Retrieves the last updated object by type, per owner
- Query 2 (Version B): Retrieves the last updated object by type, per owner, based on the analytical function syntax MAX OVER PARTITION BY (see Tom Kyte’s article “On Cursors, SQL, and Analytics” for a longer discussion)
- Set 3:
- Query 3 (Version A): Returns all tables that have one column with frequency histograms, one column with nulls, and one column with a length larger than 25 (The actual query gets data for questions such as “get all orders where diapers, bananas, and beer were bought together in one order.”)
- Query 3 (Version B): Retrieves the same data as Version A, but uses the WITH clause and the INTERSECT set operator
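The article does not reproduce the SQL for these queries, but the Version B variants can be sketched against the data dictionary. The following is a hypothetical reconstruction, assuming the queries run against the ALL_OBJECTS and DBA_TAB_COLUMNS views; the actual queries used for the tests may differ:

```sql
-- Hypothetical Query 1 (Version B): totals and sub-totals by owner
-- and object type in one pass, via GROUP BY CUBE
SELECT owner, object_type, COUNT(*) AS cnt
FROM   all_objects
GROUP  BY CUBE (owner, object_type);

-- Hypothetical Query 2 (Version B): last updated object per owner
-- and type, using the analytic MAX ... OVER (PARTITION BY ...)
SELECT owner, object_type, object_name, last_ddl_time
FROM  (SELECT owner, object_type, object_name, last_ddl_time,
              MAX(last_ddl_time) OVER (PARTITION BY owner, object_type) AS max_time
       FROM   all_objects)
WHERE  last_ddl_time = max_time;

-- Hypothetical Query 3 (Version B): WITH clause plus INTERSECT to find
-- tables satisfying all three column conditions at once
WITH freq_hist AS
  (SELECT owner, table_name FROM dba_tab_columns WHERE histogram = 'FREQUENCY'),
has_nulls AS
  (SELECT owner, table_name FROM dba_tab_columns WHERE num_nulls > 0),
long_cols AS
  (SELECT owner, table_name FROM dba_tab_columns WHERE data_length > 25)
SELECT owner, table_name FROM freq_hist
INTERSECT
SELECT owner, table_name FROM has_nulls
INTERSECT
SELECT owner, table_name FROM long_cols;
```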
I ran the tests under Oracle 11g, but I didn't use any 11g-specific features in the queries or the tools. Output from earlier versions may differ slightly, but Oracle 10g and above should be fine for reproducing the tests.
The Utilities You’ll Use
You will use the following utilities to perform your query performance comparisons.
Explain Plan
The Explain Plan statement shows how Oracle will process a query. You can view the details of operations such as table access methods, join methods, and sorting/filtering, along with their sequence. Other useful data that Explain Plan displays includes the order in which tables are accessed, how much data is passed from one stage of processing to the next, and the cost and cardinality of each operation.
The script utlxplan.sql creates the PLAN_TABLE, which must exist for Explain Plan to work.
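A typical Explain Plan session looks like this (the statement ID q1a and the sample query are illustrative):

```sql
-- Populate PLAN_TABLE with the optimizer's plan for a candidate query
EXPLAIN PLAN SET STATEMENT_ID = 'q1a' FOR
SELECT owner, object_type, COUNT(*)
FROM   all_objects
GROUP  BY CUBE (owner, object_type);

-- Format and display the stored plan
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'q1a'));
```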
Autotrace
The Autotrace utility reports logical I/O, physical I/O, and rows processed, which together give a fair idea of a query's performance. Other useful data that Autotrace shows includes in-memory sorts, disk sorts, and redo generated.
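In SQL*Plus, Autotrace is switched on per session; for example (assuming the PLUSTRACE role has been granted to the user):

```sql
-- Show execution statistics only, suppressing the result rows
SET AUTOTRACE TRACEONLY STATISTICS

SELECT owner, object_type, COUNT(*)
FROM   all_objects
GROUP  BY CUBE (owner, object_type);

SET AUTOTRACE OFF
```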
RunStats
The RunStats utility (written by Thomas Kyte) compares two queries based on latch activity. Latches are internal locks that Oracle uses to maintain concurrency and parallelism; the more latches a query takes, the less scalable it is. This utility differs from the rest in that it can be used to predict which query is likely to limit performance as data grows.
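A RunStats comparison follows a start/middle/stop pattern. Here is a sketch, assuming Tom Kyte's runstats_pkg package is installed and using the Query 1 variants as illustrative stand-ins:

```sql
EXEC runStats_pkg.rs_start;

-- Query A: plain GROUP BY version (illustrative)
SELECT owner, object_type, COUNT(*)
FROM   all_objects
GROUP  BY owner, object_type;

EXEC runStats_pkg.rs_middle;

-- Query B: GROUP BY CUBE version (illustrative)
SELECT owner, object_type, COUNT(*)
FROM   all_objects
GROUP  BY CUBE (owner, object_type);

-- Report statistics and latches whose values differ by more than 500
EXEC runStats_pkg.rs_stop(500);
```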
SQL Trace/10046 Trace
The SQL Trace/10046 Trace utility provides the most detailed picture of the actual processing of a query. All internal data about the processing (for instance, the plan, wait events, CPU cycles, and logical and physical I/O) is exposed via the trace.
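Extended SQL trace can be enabled and disabled at the session level; for example:

```sql
-- Tag the trace file so it is easy to find in the trace directory
ALTER SESSION SET tracefile_identifier = 'query_compare';

-- Level 12 captures bind values and wait events in addition to the basics
ALTER SESSION SET EVENTS '10046 trace name context forever, level 12';

SELECT owner, object_type, COUNT(*)
FROM   all_objects
GROUP  BY CUBE (owner, object_type);

ALTER SESSION SET EVENTS '10046 trace name context off';
```

The raw trace file can then be formatted with TKProf (for example, `tkprof mytrace.trc report.txt sys=no`) or a third-party profiler such as OraSRP.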
Table 1 below summarizes the comparison data provided by these utilities, and Table 2 summarizes the factors that affect the applicability and usage of each.
Table 1. Comparison Data Provided by the Utilities (the data compared includes logical and physical I/O, rows processed, latches, and wait event data)
Table 2. Factors Affecting Applicability and Usage of the Utilities (factors include whether results are affected by caching effects/priming, whether figures are estimated or actual, whether the utility is supplied by Oracle, whether formatting tools such as TKProf and OraSRP are available, and how easy the results are to compare)
Test Protocol and Example Results
When I ran the tests for this example, I used the following protocol for the utilities:
- Explain Plan: All the queries were run three times, and I selected the third output for comparison.
- Autotrace: All the queries were run three times, and I selected the third output for comparison.
- RunStats: To compare Query A with Query B, I first ran Query A followed by Query B to prime the caches, then ran the two again in the same order and compared the results. I also carried out a reverse test: prior to comparison, I ran Query B followed by Query A, then ran them again in that order and compared the results.
- SQL Trace/10046: I ran the queries twice and then turned the tracing on for the third execution. I compared the third executions.
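The SQL Trace protocol above can be scripted so that only the third execution is traced; a sketch for one illustrative query:

```sql
-- Two warm-up executions to prime the buffer and library caches
SELECT owner, object_type, COUNT(*) FROM all_objects GROUP BY CUBE (owner, object_type);
SELECT owner, object_type, COUNT(*) FROM all_objects GROUP BY CUBE (owner, object_type);

-- Trace only the third execution
ALTER SESSION SET EVENTS '10046 trace name context forever, level 12';
SELECT owner, object_type, COUNT(*) FROM all_objects GROUP BY CUBE (owner, object_type);
ALTER SESSION SET EVENTS '10046 trace name context off';
```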
Table 3. Comparison Results for Query Tests by Utility (one column per query: Query 1 Versions A and B, Query 2 Versions A and B, Query 3 Versions A and B)
Table 4. Comparison Results of RunStats Reverse Test (each pair compared in both orders: 1A vs. 1B and 1B vs. 1A, 2A vs. 2B and 2B vs. 2A, 3A vs. 3B and 3B vs. 3A)
Analysis of Test Results
For the Query 1A vs. Query 1B comparison, Query B is better:
- Explain Plan: Query B's estimated cost is about 25 percent of Query A's.
- Autotrace: Query B will use one-fourth the number of logical I/Os (consistent gets) that Query A will.
- SQL Trace/10046: Query B performed one-fourth the I/O (physical and logical) that Query A did. The disk and query columns in the output represent physical and logical I/O, respectively. Query B's execution time was also lower than Query A's.
- Based on RunStats: Query B used fewer latches than Query A did, and therefore will scale better with more data. (Note: The numbers reported by the utility vary greatly, and it is not easy to equalize the effects of caching.)
For the Query 2A vs. Query 2B comparison, Query B is better on most measures:
- Explain Plan: Query A's estimated cost is about 80 percent of Query B's.
- Autotrace: Query A will use about twice the number of logical I/Os (consistent gets) as will Query B.
- SQL Trace/10046: Query A performed twice the I/O (physical and logical) that Query B did. While Query A's elapsed time is greater than Query B's, Query A's CPU time is less than Query B's.
- Based on RunStats: Query A used more latches than Query B did, and therefore will not scale as well with more data. (Note: The numbers reported by the utility vary greatly, and it is not easy to equalize the effects of caching.)
For the Query 3A vs. Query 3B comparison, Query A is better:
- Based on Explain Plan: Query A will run at about 25 percent the cost of Query B.
- Based on Autotrace: Query A will perform 65 percent of the logical I/Os (consistent gets) that Query B will.
- Based on SQL Trace/10046: Query A performed about 65 percent of the I/O (physical and logical) that Query B did. The elapsed and CPU times of Query A are better than those of Query B.
- Based on RunStats: Query A used fewer latches than Query B did, and therefore will scale better with more data. (Note: The numbers reported by the utility vary greatly, and it is not easy to equalize the effects of caching.)
Comparing the Utilities as Well
Given the right test data, SQL Trace/Event 10046 is the best comparison tool because it reports actual execution statistics. Explain Plan and Autotrace are quick-and-dirty solutions that present estimates. RunStats results vary greatly, and the version used for these tests (Tom Kyte's) cannot equalize the effects of caching; however, Connor McDonald has a version that captures multiple snapshots, which may alleviate these issues.