Hence, we could skip the overhead of a refresh immediate summary, and rebuild the summary each day. A refresh deferred summary table is a neat way to do this. It allows more flexible queries, but has its own limitations.
Refresh deferred summary tables are similar to refresh immediate, except they:
- Operate asynchronously; they are refreshed only when you explicitly refresh them
- May use scalar functions
- May use other column functions, such as MIN and MAX
- Do not require a COUNT(*) or COUNT_BIG(*)
- May contain a HAVING clause
- Cannot be used to automatically optimize static SQL queries
- Can be used to automatically optimize dynamic SQL queries under certain circumstances (see sidebar "A Faster Disaster").
Listing 5 shows a refresh deferred definition. Remember, you can increase flexibility and reduce overhead by deferring the update.
Listing 5: Create a Deferred Summary Table
CREATE SUMMARY TABLE TEST.GOOP_DEF AS (
SELECT Machine, DateMade,
SUM(Weight) AS SumWeight,
COUNT(Weight) AS CountWeight,
COUNT(*) AS Bags,
MIN(SerialNo) as MinSerial,
MAX(SerialNo) as MaxSerial
FROM TEST.GOOP
GROUP BY Machine, DateMade
) DATA INITIALLY DEFERRED REFRESH DEFERRED @
create index TEST.GOOPDEFMDIX ON TEST.GOOP_DEF
(Machine, DateMade) @
Note that the contents of goop_def are only updated upon the command
refresh table TEST.GOOP_DEF
meaning that the contents of goop_def are not guaranteed to be current.
Refresh deferred summary tables are good when:
- You can refresh a summary table once and query it many times, due to the timing of the transactions and queriesfor example, if you run queries on historical daily summaries or the source data arrives in bursts.
- You need the extra functions that aren't available with refresh immediate.
So it's still a balancing act, and there's no single answer to all situations, but now you have two tools to cut your query times when collating large data sets. Decide whether your summary needs to be live, or simply refreshable, and let DB2 work smartlike Eddie.