Browse DevX
Sign up for e-mail newsletters from DevX


The Baker's Dozen: 13 Tips for Querying OLAP Databases with MDX : Page 4

MDX is a SQL-like query language used to retrieve data from OLAP databases and to define Key Performance Indicators (KPI) in SQL Server 2005 Analysis Services.




Building the Right Environment to Support AI, Machine Learning and Deep Learning

Tip 7: Never Too Late to "Generate"
One of the more complicated MDX functions is Generate. You will usually use it when you need to combine sets of data. For instance, suppose you want to retrieve the top five states by sales, and within each of those ten states, you want to know the top five product categories. Here is the full query you would use:

SELECT [Measures].[Internet Sales Amount] ON COLUMNS, Generate ( TOPCOUNT ( [Customer].[State-Province].children,5 , [Internet Sales Amount]), ({[Customer].[State-Province].CurrentMember}, Topcount([Product].[SubCategory].children,10, [Internet Sales Amount] )),ALL ) on rows FROM [Adventure Works]

You may be saying, "Ouch! That looks nasty!" Well, if you break it down into smaller pieces, you'll see that it really isn't that bad.

First, you need to establish the top five states by sales, which you learned how to do back in Tip 6:

TOPCOUNT ( [Customer].[State-Province].children,5, [Internet Sales Amount]),

That command will essentially return one set, with the top five states. Your MDX query needs to take each state and perform a second query for the top ten products for each state:

({ [Customer].[State-Province].CurrentMember}, Topcount([Product].[SubCategory].children, 10, [Internet Sales Amount] ))

The second query uses the CurrentMember of the State dimension and "applies" it to the second TOPCOUNT call to retrieve the top five products for each specific state. Note that by wrapping parenthesis around the entire code block, you are creating a second set of results.

If your query requirements involve combining multiple sets, start thinking about using the Generate function.
So you have a set of five states, and a second set that will return ten products for each state CurrentMember. You finally combine them with the Generate function, which will return the final result of each relevant state/product combination.

One final thing: the order of the result set will be the states with the highest aggregate sales amount, and products by sales amount within each state. That may be fine, or you may want the result set in the natural or original hierarchical order (state name and product subcategory name). You may be tempted to insert some ORDER statements to force the order you wish. Fortunately, there is a much easier way! You can use the Hierarchize function to force the results into the original hierarchical order. Simply by wrapping the entire ON ROW contents inside Hierarchize, you can force the result set order:

select [Measures].[Internet Sales Amount] ON columns, Hierarchize ( Generate ( TOPCOUNT (...

Are you on overload yet? Well I hope not, because there's one more important use of Generate that further reinforces the value of this function. Back in Tip 5, I presented a basic RANK function. But you may want to know how to rank sales by state/product category, independent of the hierarchy. Once again, you'll use the Generate function in the same way you used it to perform multiple TOPCOUNT queries. The code below provides a full example. Once again, breaking down the query will help:

-- First, create a Set WITH SET [SalesRankSet] AS GENERATE( [Customer].[State-Province].Children, ORDER ( NONEMPTY( ([Customer].[State-Province].CurrentMember, [Product].[Subcategory].Children) , { [Measures].[Internet Sales Amount] } ), [Measures].[Internet Sales Amount], BDESC )) -- Next, create a calculated member MEMBER [Measures].[CategoryRank] AS RANK( ([Customer].[State-Province].CurrentMember, [Product].[Subcategory].CurrentMember), EXISTS( [SalesRankSet], {[Customer].[State-Province].CurrentMember}) ) SELECT {[Measures].[Internet Sales Amount], [Measures].[CategoryRank]} ON 0, [SalesRankSet] ON 1 FROM [Adventure Works]

First, you'll create a SET by using Generate to combine States with an ordered set of State/Product Subcategory combinations (sorted by dollar amount, descending):

WITH SET [SalesRankSet] AS GENERATE( [Customer].[State-Province].Children, ORDER ( NONEMPTY( ([Customer].[State-Province].CurrentMember, [Product].[Subcategory].Children) , { [Measures].[Internet Sales Amount]}), [Measures].[Internet Sales Amount], BDESC ))

Next, you'll create a calculated member to determine the ranking of each state/product subcategory combination. You'll call the RANK function and pass two parameters. The first parameter will be each CurrentMember combination of State and SubCategory (placed in parenthesis to form a single expression). The second parameter will be the SET returned from the MDX Exists function, which will return a set of data from the SalesRankSet from above, based on each state CurrentMember:

MEMBER [Measures].[CategoryRank] AS RANK( ( [Customer].[State-Province].CurrentMember, [Product].[Subcategory].CurrentMember), EXISTS([SalesRankSet], {[Customer].[State-Province].CurrentMember}))

Finally, you can simply reference the CategoryRank member and the SalesRankSet set:

SELECT {[Measures].[Internet Sales Amount], [Measures].[CategoryRank]} ON 0, [SalesRankSet] ON 1 FROM [Adventure Works]

Tip 8: "Heads or Tails"
While not as common, sometimes you may want to retrieve the first row(s), or last row(s) from an ordered set. For example, in a list of top ten states by Sales, you may want to retrieve the first three. You can use the HEAD function, as follows:

-- From the list of top ten states, -- grab the first three SELECT [Measures].[Internet Sales Amount] on 0, HEAD( TOPCOUNT( [Customer].[State-Province].CHILDREN, 10,[Internet Sales Amount]),3) ON 1 FROM [Adventure Works]

If you want to retrieve the ninth and tenth rows from the same list of top ten states by Sales, you can use the TAIL function, as follows:

-- From the list of top 10 states, grab the 10th SELECT [Measures].[Internet Sales Amount] on 0, TAIL( TOPCOUNT( [Customer].[State-Province].CHILDREN, 10,[Internet Sales Amount]),10) ON 1 FROM [Adventure Works]

Both functions take on two parameters: the set to read and the number of rows to ultimately retrieve.

Thanks for your registration, follow us on our social networks to keep up-to-date