2009-04-05

SSAS:Non Empty Vs NonEmpty

For reference

"The NonEmpty() function is powerful because it can be applied to MDX expressions whereas the NON EMPTY clause was only available at the top level in SELECT statements."

NonEmpty({set},[filter set])

From Mrachek
http://www.mrachek.com/PermaLink,guid,8820cd3a-35f9-4c55-91b2-cc1dbd47fc83.aspx

From Technet
nonempty
http://technet.microsoft.com/en-us/library/ms145988.aspx

From BI Best Practice Blog (Internet Archive:Repost)
Best Practices for Microsoft Business Intelligence

The Difference Between NONEMPTY(), NONEMPTYCROSSJOIN(), NON EMPTY BEHAVIOR and NON EMPTY THRESHOLD
Analysis Services uses two algorithms to remove empty cells: the 'old' one, which is slower but 100% reliable, and the 'new' one which is faster but can only reliably be used when there are no calculated members, custom rollups, etc. involved in the query. When you put a NON EMPTY clause on an axis, Analysis Services can use either of these algorithms; alternatively, to make sure you use the second algorithm in your query you can use the NONEMPTYCROSSJOIN function, but you must understand its limitations or you will get unexpected results.
When you are using a NON EMPTY clause (and some client tools, like Excel, do not give you the choice to use NONEMTPYCROSSJOIN), the decision on which algorithm for filtering empty cells is based on two factors. The first is the NON EMPTY THRESHOLD connection string property (default value=5000). This is the lower threshold for the number of tuples on an axis that must be evaluated for Analysis Services to decide that it is worth the effort of using the 'new' algorithm or not; that is to say, that if there are fewer tuples on the axis than the value specified in the property, then the old algorithm will always be used. The second factor is whether there are calculated members, custom rollups etc which would normally mean that the first algorithm will have to be used. However, it is possible to use the second algorithm in a query using calculated members if the calculated members have their NON_EMPTY_BEHAVIOR property set. This property specifies a 'real' measure which will return an empty value in the same circumstances as when the calculated member returns an empty value. It is up to the developer though, to make sure that the 'real' measure does in fact behave in the same way as the calculated member, because Analysis Services doesn't do any checks to determine if this is the case.
To illustrate this functionality in action, take a look at the following set of queries for Foodmart 2000 and run them in the MDX Sample application. Let's start with a simple query which includes a calculated measure and returns six cells, three of which have data and three of which don't:
WITH MEMBER [MEASURES].[DEMO] AS 'IIF([MEASURES].[UNIT SALES]>25000, 1, NULL)'SELECT{[MEASURES].[DEMO]}ON 0,CROSSJOIN([Product].[Product Family].MEMBERS, [GENDER].[GENDER].MEMBERS) ON 1FROM SALES
To get rid of the empty cells, we can simply add in a NON EMPTY clause, so:
WITH MEMBER [MEASURES].[DEMO] AS 'IIF([MEASURES].[UNIT SALES]>25000, 1, NULL)'SELECT{[MEASURES].[DEMO]}ON 0,NON EMPTYCROSSJOIN([Product].[Product Family].MEMBERS, [GENDER].[GENDER].MEMBERS) ON 1FROM SALES
At the same time, note that if we tried to use NONEMPTYCROSSJOIN here, as with
WITH MEMBER [MEASURES].[DEMO] AS 'IIF([MEASURES].[UNIT SALES]>25000, 1, NULL)'SELECT{[MEASURES].[DEMO]}ON 0,NONEMPTYCROSSJOIN([Product].[Product Family].MEMBERS, [GENDER].[GENDER].MEMBERS) ON 1FROM SALES
we would still return the empty cells. This is because the 'new' algorithm for removing empty cells, as used in NONEMPTYCROSSJOIN, only removes tuple combinations which don't exist in the fact table and doesn't check the result set itself for empty cells. Since for each tuple on the rows axis data exists in the fact table, then these tuples aren't removed. if we had put [MEASURES].[UNIT SALES] on columns instead, you'd see that all six cells had data.
However, you can force Analysis Services to use the 'new' algorithm here. The first thing to do is to include in your connection string the property NON EMPTY THRESHOLD=1;. The other thing we need to do is set the NON_EMPTY_BEHAVIOR property for the calculated measure. In the MDX Sample application, close the connection to the server and then reopen it, but when you see the dialog asking for the server to connect to, instead of leaving the name of the server as MYSERVERNAMEchange it toMYSERVERNAME; NON EMPTY THRESHOLD=1and then run the following query:
WITH MEMBER [MEASURES].[DEMO] AS 'IIF([MEASURES].[UNIT SALES]>25000, 1, NULL)', NON_EMPTY_BEHAVIOR='[MEASURES].[UNIT SALES]'SELECT{[MEASURES].[DEMO]}ON 0,NON EMPTYCROSSJOIN([Product].[Product Family].MEMBERS, [GENDER].[GENDER].MEMBERS) ON 1FROM SALES
Here the empty cells are returned again, even though we're not using NONEMPTYCROSSJOIN. This shows we're using the 'new' algorithm. Either removing the NON_EMPTY_BEHAVIOR property from the calculated member, or using a NON_EMPTY_THRESHOLD value of greater than 5 (as there are 6 tuples on rows), would mean that for this query the 'old' algorithm would be used again and the empty cells would disappear.

No comments:

Post a Comment