Login | Register   
LinkedIn
Google+
Twitter
RSS Feed
Download our iPhone app
TODAY'S HEADLINES  |   ARTICLE ARCHIVE  |   FORUMS  |   TIP BANK
Browse DevX
Sign up for e-mail newsletters from DevX


advertisement
 

Tame Beastly Data with Summary Tables-3 : Page 3


advertisement
Immediate Results
With DB2, the functionality we need to reduce the workload of our reporting is built right into the database, by utilizing Automatic Summary Tables (ASTs). An AST is a "materialized view" —a view that is stored on disc for direct access.

To use Automatic Summary Tables we simply:

  1. Decide how to group the data, and whether we want it updated immediately ("refresh immediate") or on-request ("refresh deferred")
  2. Create a summary table for the summary and optionally index it for even greater performance (see Listing 3).


Listing 3: Create a summary table. CREATE SUMMARY TABLE TEST.GOOPST AS ( SELECT Machine, Color, DateMade, SUM(Weight) AS SumWeight, COUNT(Weight) AS CountWeight, COUNT(*) AS Bags FROM TEST.GOOP GROUP BY CUBE(Machine, Color, DateMade) ) DATA INITIALLY DEFERRED REFRESH IMMEDIATE @ refresh table TEST.GOOPST @ create index TEST.GOOPSTMCIX ON TEST.GOOPST (Machine, Color) @ create index TEST.GOOPSTDMCIX ON TEST.GOOPST (DateMade, Machine, Color) @ -- Do a RUNSTATS, but not until you've populated the table! RUNSTATS ON TABLE TEST.GOOP WITH DISTRIBUTION AND DETAILED INDEXES ALL @ RUNSTATS ON TABLE TEST.GOOPST WITH DISTRIBUTION AND DETAILED INDEXES ALL @ -- DB2s query optimizer will make better-informed decisions -- if you perform a RUNSTATS on both the source table -- and the summary table while they are both populated



Select Your Refreshment
You could now change your reports to query the summary table or you can continue using the original queries. First you need to decide whether to use refresh immediate or refresh deferred ASTs.

Refresh immediate summary tables:

  • are defined by

    CREATE SUMMARY TABLE summaryname [(columname, columname,..)] AS (fullselect) DATA INITIALLY DEFERRED REFRESH IMMEDIATE

  • operate synchronously, that is, they are updated as part of any transaction on their source tables. This incurs some overhead, but the results are always available.
  • Can be used by DB2 to automatically optimize both static and dynamic SQL queries
  • Must contain a GROUP BY clause and a COUNT(*) or COUNT_BIG(*)
  • May use GROUPING SETS, CUBE, and ROLLUP (See Listing 4, below)
  • May contain a SUM(columname) column function, but requires a COUNT(columname) on the same column
  • May be based on an inner-join between tables or views, but cannot use the INNER JOIN keywords.

In Listing 4, we use a CUBE summary table with our revised queries to give us optimal performance. Query A can now be answered from a single row:

Listing 4: A CUBE summary table with revised queries. CREATE SUMMARY TABLE TEST.GOOPCUBE AS ( SELECT Machine, Color, DateMade, SUM(Weight) AS SumWeight, COUNT(Weight) AS CountWeight, COUNT(*) AS Bags FROM TEST.GOOP GROUP BY CUBE(Machine, Color, DateMade) ) DATA INITIALLY DEFERRED REFRESH IMMEDIATE @ refresh table TEST.GOOPCUBE@ create index TEST.GOOPCDMCIX ON TEST.GOOPCUBE ( DateMade, Machine, Color) @ -- We'll only use one index this time, as -- machine/color totals can be found where DateMade is null RUNSTATS ON TABLE TEST.GOOPCUBE WITH DISTRIBUTION AND DETAILED INDEXES ALL @ -- You can DROP TABLE TEST.GOOPST @ -- as GOOPCUBE does the same plus more. -- Now we can use queries on the original table, -- and DB2 should choose the AST, -- or we can explicitly query the cube AST as follows: -- Query A SELECT Machine, SumWeight/Bags as AveWeight FROM TEST.GOOPcube WHERE Machine IS NOT NULL AND DateMade IS NULL AND Color IS NULL@ -- Query B SELECT Color, SUM(SumWeight) as SumWeight, SUM(Bags) as Bags, SUM(SumWeight)/sum(Bags) as AveWeight FROM TEST.GOOPcube WHERE DateMade BETWEEN (current date - 7 days) AND (current date - 1 day) AND Machine IS NOT NULL AND Color IS NOT NULL GROUP BY Color@ -- Query C -- Also includes an all-Colors total SELECT Color, Bags, SumWeight FROM TEST.GOOPcube WHERE DateMade = CURRENT DATE AND Machine IS NULL GROUP BY Color, Bags, SumWeight@

Refresh immediate summary tables are good when:

  • Having a summary table will avoid the need to scan part or all of its source table
  • You can afford the transaction overhead
  • You need a completely up-to-date summary
  • Using a regular query or view is too slow, or fails due to other constraints.


Comment and Contribute

 

 

 

 

 


(Maximum characters: 1200). You have 1200 characters left.

 

 

Sitemap
Thanks for your registration, follow us on our social networks to keep up-to-date