Simulating an "approximate match" VLOOKUP in PowerPivot
February 10, 2012 8 Comments
It is often said that relationships in PowerPivot work in a similar way to VLOOKUPs. In reality, this is only partially true. If we examine the claim a bit closer we realize that Excel’s VLOOKUP function has a parameter in which it is possible to use an approximation when matching values against the lookup table:
Regular PowerPivot relationships emulate the ‘exact match’ option of the Excel VLOOKUP function And although PowerPivot relationships are a lot more powerful, being able to retrieve lookup values over huge tables at amazing speed, we are still left with the question: how can we achieve same type of behavior used by the ‘approximate matches’ option of the VLOOKUP function? Lets walk through the process.
Imagine we have the following tables:
As you can see, the table at the left is the one that will be executing the lookup call against that table on the right. The first thing we need to ensure is that there are no relationships among those tables in the PowerPivot model. That’s right, we will be generating the entire lookup as a DAX calculation. In other words, we will be generating a calculated relationship.
Lets first examine what Excel yields when using the VLOOKUP function with approximate matches:
If an exact match is not found, VLOOKUP returns the next largest value that is less than the lookup value.
In PowerPivot, we would generate the same effect with this calculation:
Hasonevalue ( TableA ),
Max ( TableB[TableB] ),
Filter ( TableB, TableB[TableB] <= Values ( TableA[TableA] ) )
The first condition (HASONEVALUE) forces the computation to be executed only when there is one and only one value on TableA (in other words, it avoids calculating for the grand total row if there was one as this would be meaningless to do so).
The actual computation is done via the CALCULATE expression. As you can see, it takes the maximum value of TableB in a context modified by the second parameter. As there is no relationship between TableA and TableB, context is not propagated and the calculation uses all rows in TableB. The FILTER function then filters the table and returns only those rows which are less than or equal to the current value in TableA.
The result is this:
As you can see, we were able to match the output we got with VLOOKUP. But what will happen if we use are trying to match text values instead of numbers?
We get the following error:
Calculation error in measure ‘TableA'[Measure 3]: The function MAX takes an argument that evaluates to numbers or dates and cannot work with values of type String.
A solution is found by using the LASTNONBLANK function:
Hasonevalue( TableA_Text ),
LASTNONBLANK ( TableB_Text[TableB], 1 ),
Filter ( TableB_Text, TableB_Text[TableB] <= Values ( TableA_Text[TableA] ) )
LASTNONBLANK requires two parameters, and we can use the first one for the column for which we want to get the last value in the lookup table. The second parameter must be an expression that evaluates for blanks; by using ‘1’ here we retrieve all rows on that table but that is OK as the table is being filtered by the second parameter of the CALCULATE function. In other words we truly get the last value in context which is exactly what we need.
This last calculation can be used with text values, but it will also work with numbers or dates, and it is a great pattern when simulating VLOOKUPs with approximate matches.
This post has been featured on Excel’s MVP Bill Jelen “VLOOKUP Week” (March 25 – 31, 2012 ) which groups podcasts and blogs describing the use of this and other interesting Excel-related lookup functions and techniques. For more information, visit Microsoft’s Excel team blog @ http://tinyurl.com/c2aw89y , or directly access VLOOKUP’s week website @ http://vlookupweek.wordpress.com/