- Grouped in the PARTITION BY clause, if present
- Limited to a subset of rows using a ROWS or RANGE clause, if specified, where the range operates in the ORDER given.
The easiest way to see this is by example. Listing 8 shows the SUM, AVG, and MAX functions used to fill a range of different roles. If you run the query on our sample data you'll see some patterns emerge among the different clients.
Listing 8. A Wider Range of Functions SELECT
CLIENT, MONTHEND, SALEVOL,
-- Total sales ever
SUM(SALEVOL) OVER ()
AS TOTALSALES,
--Year-to-date cumulative total
SUM(SALEVOL) OVER(
PARTITION BY CLIENT, YEAR(MONTHEND)
ORDER BY MONTHEND
ROWS UNBOUNDED PRECEDING)
AS YTD,
--All-time cumulative total across all
--clients prior to this month
SUM(SALEVOL) OVER(
ORDER BY MONTHEND
ROWS UNBOUNDED PRECEDING)
AS ALLYTD,
--Moving 3-month average
AVG(SALEVOL) OVER(
PARTITION BY CLIENT
ORDER BY MONTHEND
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
AS MOVAVGDIF,
--Difference to last year's moving avg
SALEVOL - AVG(SALEVOL) OVER(
PARTITION BY CLIENT
ORDER BY MONTHEND
ROWS BETWEEN 13 PRECEDING AND 11 PRECEDING)
AS MOVPREVDIF,
--Greatest volume by any client for the
--current month
MAX(SALEVOL) OVER(
PARTITION BY MONTHEND) AS MONTHMAX,
--Percentage of total for month
SMALLINT(SALEVOL * 100 /
SUM(SALEVOL) OVER (PARTITION BY MONTHEND))
AS THISPCT,
--Number of months where this client's sales
-- are within +/- 5 units of this month
COUNT(*) OVER(
PARTITION BY CLIENT
ORDER BY SALEVOL
RANGE BETWEEN 5 PRECEDING AND 5 FOLLOWING)
AS NUMSAME
FROM MYSALES
ORDER BY CLIENT, MONTHEND
You can use RANGE in place of ROWS, except that RANGE operates only on a single numeric column, and compares the values in the column rather than the number of rows. (See the NUMSAME calculation near the end of Listing 8 above).
That's only the beginning of the possibilities. It's important to remember that OLAP queries can place a heavy load on a database, and sometimes the whole data set must be processed even to return a small number of results. That said, you don't have to transfer large amounts of data to other programs, or worse, across a network, to get the results.