SCOPING at different granularities in DAX (Part I)

One of the most powerful MDX functions is SCOPE.  With it, one can override cell values at any level of granularity in an SSAS cube and their new custom values will now be considered when returning aggregate results.

It is generally understood you can’t use SCOPE-type functionality in an SSAS Tabular model.  But if we examine the topic a bit closer, we realize that a lot of what is possible with SCOPE statements can also be replicated with DAX expressions.

One of these examples is the ability to manipulate the granularity at which a computation executes.   My colleague Jason Thomas ( blog | twitter ) recently posted a blog entry of a technique in which the expression scope is changed in order to iteratively prototype the correct granularity needed for a business calculation (See his blog entry here). This is the same technique can also be used to solve advanced Last Non Empty scenarios, something we will blog about soon.

Lets look at another example.  In the Adventure Works cube, there is a SCOPE statement in the MDX cube script in which months are evenly assigned the sales amount quota of their parent quarter on the fiscal year 2002:

image

In the cube designer we see that this is necessary in order to have monthly values, as the dimension usage has been defined by tying the ‘Sales Targets’ measure group with the Date dimension at the Year/Quarter level of granularity:

image

When browsing in Excel, we see the output of the behavior as:

image

We can achieve the same behavior on a tabular model by iterating through the distinct list of months in the lookup (dimension) table. To begin with, we load the appropriate tables in the tabular model:

image

Note that I created a hierarchy with custom fiscal year (FY) and fiscal quarter (FQ) calculated columns, in order to match the dimension member caption in the Adventure Works cube. When pivoting the data, without any DAX expressions defined, we get:

image

Notice the sales amount quota is tied to only one month.  This is because DimTime is at the daily level or granularity;  when Fact Sales Quota relates to DimTime, it assigns values on a specific date – however, as these quota values have been defined by the business at the quarter level, the regular sum aggregation at the monthly level does not make much sense.

SCOPE, in the MDX cube script, deals with it by getting the quarter value and dividing it in three, and assigning this new value to each of the months belonging to that quarter. To match the output of MDX SCOPE in a Tabular model, we use the following DAX expression:

Scoping (SUMX):=SUMX(
    VALUES(DimTime[EnglishMonthName]),
    CALCULATE( SUM( FactSalesQuota[SalesAmountQuota]  ) /
            CALCULATE(
                COUNTROWS( VALUES( DimTime[EnglishMonthName] ) ),
                ALL( DimTime[EnglishMonthName] ) ),
        ALL( DimTime[EnglishMonthName] ) )
)

Which yields the following output:

image

This DAX pattern involves using a row-by-row execution using SUMX over a table of distinct values at the desired level of granularity:

AggregateX ( <granularity level>, <expression> )

As you see, the pattern is just using the signature of any of the AggregateX functions.  The only difference is one of perspective:  in our case, we are explicitly leveraging the first argument to change the granularity at which the expression evaluates.  The granularity level is specified by generating a table at the appropriate level desired, whether you use VALUES, SUMMARIZE or even CALCULATETABLE.

This same technique is something Jason and I have explored when tackling interesting analytical scenarios, like the calculation of Last Ever Non Empty, which also uses the same pattern.

Clearly, the same result can be achieved by using other methods in this case.  One of those is by detecting level, as explained in more detail on prior blog entry.  If we apply that technique here, we get:

Scoping (ISFITLERED):=IF(
    ISFILTERED(DimTime[EnglishMonthName]),
        CALCULATE(
            SUM( FactSalesQuota[SalesAmountQuota] ) / 3,
            ALL(DimTime[EnglishMonthName] )
        ) ,
    IF(
        ISFILTERED(DimTime[FQ]),
            SUM( FactSalesQuota[SalesAmountQuota] ) ,
    IF(
        ISFILTERED(DimTime[FY]),
            SUM( FactSalesQuota[SalesAmountQuota] )
        )
    )
)

image

Notice how the total level is not showing a value, as I didn’t account for that level in my conditional IF statements. This is not ideal as in many scenarios you need the ability to compute on a level that is not displayed on the pivot table or report.

For example, if we wanted to slice this measure by  the Sales Territory Group dimension, we would get:

image

As you can see, when an unexpected context has been introduced, the technique using SUMX is still able to compute at other levels whereas using ISFILTERED can’t due to the unexpected filter which is not accounted for. Clearly, using ISFILTERED in this case would diminish the usability of the model.

ANATOMY OF THE CALCULATION

As we are computing rows for each member at a specific level or granularity, it is necessary to iterate through this members. The SUMX function allows us to do just that.

It is known that iterative functions like this do not perform as well as other calculations that compute on ‘bulk’ mode (like SUM).  However, SUM can only take a column for parameter, whereas with SUMX we can specify a table on which the expression will be computed on a row-by-row basis. An we need this table parameter precisely because it is the one that allows us to define the grain at which we the evaluation to be evaluated.

The calculation takes the list of distinct months and, through row context, grabs the quarter value and divides it by the count of months belonging to that quarter.  This is possible due to the fact that the context at the month level has been cleared out, and hence the values always belong to  the next level for which there is context (quarter in this case):

image

Notice also that the distinct list of months must be generated off the appropriate side of the one-to-many relationship:  if we had a column for month on the fact table and used that we would have gotten the following, obviously incorrect results:

Scoping (Incorrect Grain):=SUMX(

   VALUES(FactSalesQuota[FactMonth]),

    CALCULATE(

        SUM( FactSalesQuota[SalesAmountQuota]  ) / CALCULATE( COUNTROWS( VALUES( DimTime[EnglishMonthName] ) ), ALL( DimTime[EnglishMonthName] ) ),

        ALL( DimTime[EnglishMonthName] )

    )

)

image

Clearly, values not assigned on all qualifying months – only the ones active based on query context. As I have described on a prior blog post, DAX context does not automatically propagate from the base table to the lookup table.

We can take advantage of this by generating the distinct list of months based on the lookup table: doing so guarantees we will get a list of all months, regardless of the current filters in action on the fact (base) table.

IS THIS A REAL ASSIGNMENT OF VALUES?

It is not. In DAX, we are not really assigning or overriding values as we would with an MDX scope statement.  However, in Tabular, we have the ability to generate a new calculation that computes at a certain level of grain; this allows us to mimic the same behavior as that achieved through MDX.

The experienced MDX or DAX user may have noticed that our defined calculation expects a hierarchy in order to show the assigned values: we do this as we are clearing some filters but expecting others to be kept. When looking at the results at one level of grain only (monthly), this will lead to different results to those we get by use of the MDX SCOPE function. However, using DAX we are able to overcome this: see how on the second blog entry on this topic available here: https://javierguillen.wordpress.com/2012/05/04/scoping-at-different-granularities-part-ii/

Advertisements

Active day count in DAX

Alberto Ferrari posted a very nice blog entry some time ago regarding how to count active days in PowerPivot (http://sqlblog.com/blogs/alberto_ferrari/archive/2011/05/12/powerpivot-counting-active-days.aspx ).   His example relies on a data model that specifies ‘Active From’ and ‘Active To’ fields in order to compute the count.

I want to write a follow up entry for when such fields do not exist in the data model.  If all you have is a start date reflected as a transaction recorded in your fact table, counting active days is still possible in DAX.  Lets see how.

If we drop the entire list of products in the Adventure Works database in a pivot table, we are able to determine the first ‘active’ day by using the LOOKUPVALUE function:

First Activity Date:=LOOKUPVALUE(

                 DimDate[FullDateAlternateKey],

                 DimDate[DateKey],

                 MIN(FactInternetSales[OrderDateKey])

)

image

Having tested this, we can now simply use the same technique to count the rows between the first activity date and the last activity date in the fact table.  For this, I use the following expression:

Active Days Since Product Inception:=IF(

    CALCULATE( COUNTROWS( FactInternetSales ) ) ,

    CALCULATE(COUNTROWS(DimDate),

        DATESBETWEEN(

            DimDate[FullDateAlternateKey],

            LOOKUPVALUE(

                DimDate[FullDateAlternateKey],

                DimDate[DateKey],

                MIN(FactInternetSales[OrderDateKey]) 

            ),

        CALCULATE(

            LOOKUPVALUE(

                DimDate[FullDateAlternateKey],

                DimDate[DateKey],

                MAX(FactInternetSales[OrderDateKey])

            ), ALL(FactInternetSales) ) 

        ) 

    ) – 1,

    BLANK()

)

This expression is pretty straight forward:  it uses the DATESBETWEEN function for compute the amount of dates between the first date for which there is transaction data for any given product in the fact table and the last date in the fact table across all products.  The result is the following:

image

The power of this calculation is that it works even if we change the granularity of elements in the row labels.  As an example, if we replace product by category we get:

image

If what we want is to compute the active days between the first and last dates for which there is transaction activity for any given product or category, we can leverage the same technique we used to get the first activity date, but this time we use it to get the last activity date. We then compute the dates between the two:

Active Days:=IF(

    CALCULATE( COUNTROWS( FactInternetSales ) ) ,

    CALCULATE(COUNTROWS(DimDate),

        DATESBETWEEN(

            DimDate[FullDateAlternateKey],

            LOOKUPVALUE(

                DimDate[FullDateAlternateKey],

                DimDate[DateKey],

                MIN(FactInternetSales[OrderDateKey]) 

            ),        

            LOOKUPVALUE(

                DimDate[FullDateAlternateKey],

                DimDate[DateKey],

                MAX(FactInternetSales[OrderDateKey])

            )

        ) 

    ) – 1,

    BLANK()

)

image

Notice that is necessary for these calculations to have a proper there is a need to have Date table. That means there should be a list of dates without gaps, otherwise the count would be incorrect.

We can see how these two formulas compare, by placing them side to side.  I have highlighted the rows in which their outputs are different, indicating the products for which the last active date was not the same as the last transaction date in the table:

image

There is one last scenario that may be of use to know about.  Sometimes we want to calculate the amount of days from the first activity date until the most current date.  In other words, it is a countdown of days until a specific event, for example, the last date for which there was fact data. In this sense, it is the reverse of what we have computed so far.

In order to compute this figure, we can’t rely anymore on the DATESBETWEEN function. Instead, we can simply subtract the two dates, as follows:

Countdown to last fact date:=IF(

    CALCULATE( COUNTROWS( FactInternetSales ) ) ,

    INT(

        CALCULATE( LOOKUPVALUE(

                DimDate[FullDateAlternateKey],

                DimDate[DateKey],

                MAX(FactInternetSales[OrderDateKey]) 

            ), ALL(DimDate) )    –

            LOOKUPVALUE(

                DimDate[FullDateAlternateKey],

                DimDate[DateKey],

                MIN(FactInternetSales[OrderDateKey]) 

            )

    )   

         

    , BLANK()

)

 

image

Clearly, in this last formula we can replace the first date argument with any other arbitrary date for which we need to compute a countdown.  Interestingly, as this expression does not rely on the built-in tabular time intelligence functions, it doesn’t need a date table which was necessary in the prior expressions used in this blog post. 

We can use this last technique to dynamically calculate things like countdown to date when payment is due, for example.