Business Intelligence Prototyping with PowerPivot
September 13, 2011
PowerPivot is generally marketed as a ‘Self-Service BI’ tool. However, it can alternatively be used as a way to enhance the design phases of traditional Business Intelligence projects, even if the final implementation does not use PowerPivot technology at all.
Why is that? Because it more closely aligns business experts with BI developers during architectural discussions. To explain this better, I think it’s helpful to first describe the process by which data analysts can provide relevant input to BI designers:
A lot of intermediate Excel users will probably never use VBA macros and will barely feel comfortable with Pivot Tables. For advanced Excel users, however, there is a thrill associated with getting a cool formula to work, or knowing the most interesting formatting shortcuts or even knowing how to add dropdowns to a spreadsheet to achieve dynamic interactivity.
These folks do not normally belong to the technology team; however, in addition to being savvy Excel users, they are on the front lines of the decision-making process, which often makes them subject matter experts (SMEs). It is likely they will eventually be motivated to start playing with the data import facility in PowerPivot and to start developing DAX calculations.
Fast forward a little into the future, and imagine these users developing meaningful and insightful PowerPivot reports. These contain connections to a multitude of data sources and aggregate data in a way that provides enlightening information to decision makers.
Though not always aware of it, these users will have accomplished three things of tremendous value for Business Intelligence professionals:
- Validated one or more sources of quality data
- Validated a useful data model
- Validated one or more business calculations
In other words, as they benefit and make use of ‘Self-Service BI’ technology they will unknowingly serve as ‘data explorers’ to the Business Intelligence and Data Warehousing team that benefits from their discoveries.
In the past, these ‘spreadsheet-based discoveries’ went unnoticed by members of the technology team (and became spreadmarts). However, Business Intelligence team members can now gain insight into popular spreadsheet reports across the enterprise (via PowerPivot for SharePoint). In addition, they will be able to take these PowerPivot applications and open them in Visual Studio as SQL Server Analysis Services Business Intelligence Semantic Models (BISM) in SQL Server 2011. This is the server equivalent of a PowerPivot for Excel workbook: it is built on the same technology (the columnar storage engine "Vertipaq") but with a number of robust features appropriate to an enterprise solution.
One of the great things about this new approach to BI that Microsoft is proposing is that it will most likely help close the gap between Excel power users and Business Intelligence professionals by creating an analytical environment built on the same technical foundation on both ends: DAX. This will certainly help avoid the communication challenges that normally occur when both sides speak totally different “languages”: Excel formulas and business calculations on one side, and T-SQL/MDX and OLAP on the other.
So what about BI prototyping? And how can we benefit from it even before SQL Server 2011 is released? Well, imagine having the ability to engage subject matter experts in their most familiar environment (Excel) and, with their help, rapidly test assumptions about BI project deliverables before starting development. Rather than purely describing a calculation in words or in spreadsheets disconnected from a database, we can quickly and iteratively build a sample of it tied to a data source and observe the calculation in action as it is sliced by multiple dimensions. This can generate the early insights that help correct potential misunderstandings quickly and produce a more complete and accurate data model to guide actual development. And these insights will have already been validated by the SMEs (in collaboration with the BI team).
As an example, imagine a dimension called “Orders”. During the initial design phase, it becomes clear it must be added to the cube in order to generate the expected reports. Fast forward a few months (and many hours of development). While building the final reports, the BI team becomes aware that something is missing: even though DimOrders is used in many reports (as business users stated at the beginning of the project), it was built at a different grain than what the reports require; conditional logic is discovered – yes, at the last stages of the project – that treats some Order Details differently from others. What to do? Different approaches are possible, from redesigning parts of the ETL to creating new measure groups based on named queries in order to include the missing data. Architectural re-arrangements will have to be made to complete the project on time, and any ‘quick’ solution will risk compromising the ad hoc nature of the final product (which will in turn limit its usefulness for analysis).
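The grain problem above can be sketched with a tiny, hypothetical data set (the table names, columns, and sample rows below are invented for illustration, and plain Python stands in for the modeling tool): a dimension built at the order-header grain simply has nowhere to store an attribute that lives at the order-detail grain.

```python
from collections import defaultdict

# Hypothetical order-detail rows: (order_id, product, qty, is_backordered).
# The backorder flag lives at the DETAIL grain.
order_details = [
    (1, "Widget", 10, False),
    (1, "Gadget",  5, True),   # the late-discovered conditional logic
    (2, "Widget",  3, False),
]

# DimOrders built at the order-HEADER grain: one row per order_id.
# The per-line backorder flag cannot be represented here without
# losing information - this is the grain mismatch.
dim_orders = {1: {"customer": "Acme"}, 2: {"customer": "Globex"}}

# A report that must exclude backordered lines needs the detail grain:
qty_by_order = defaultdict(int)
for order_id, product, qty, backordered in order_details:
    if not backordered:              # per-line conditional logic
        qty_by_order[order_id] += qty

print(dict(qty_by_order))  # {1: 10, 2: 3}
```

Discovering this with a few sample rows during prototyping is cheap; discovering it after the ETL and cube have been built is not.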
How would this change if PowerPivot were used as a prototyping tool? Using an Agile approach, during early scrum meetings it would be possible to ‘sample’ the Order dimension against a copy of all relevant reports. By loading a relatively small set of data into PowerPivot and designing ad hoc reports to mirror actual views across any existing functional reports, one is able to quickly uncover the fact that a different grain is required for the dimension in question. This work would take a very small amount of time compared to what it would require to re-design and re-develop parts of the ETL and OLAP solution. Time – and money – can be saved while increasing the accuracy of the design.
How would it be possible to ‘mimic’ the ‘Orders’ dimension without building a real SSAS cube? This is the most interesting part: PowerPivot was built on SQL Server Analysis Services code. In other words, it is a version of SSAS. All rules expected by SSAS when building cubes apply to PowerPivot as well; for example, dimension tables in a star schema translate to ‘lookup’ tables in the PowerPivot window. PowerPivot data is consumed by Excel through ‘cube’ interfaces (the MS OLAP provider, Excel CUBE functions, etc.). However, PowerPivot was built for fast development, which makes it an ideal tool for ‘mimicking’ an SSAS cube without the complexity of advanced server features.
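As a rough illustration of the star-schema idea PowerPivot borrows from SSAS – a fact table related to a ‘lookup’ (dimension) table, with a measure sliced by a dimension attribute – here is a minimal sketch in plain Python (all names and sample values are hypothetical):

```python
# Fact table: (product_id, sales_amount) rows at the transaction grain.
fact_sales = [
    (101, 250.0),
    (102, 120.0),
    (101, 80.0),
]

# 'Lookup' (dimension) table keyed by product_id, as it would appear
# in the PowerPivot window after defining a relationship.
dim_product = {
    101: {"name": "Widget", "category": "Hardware"},
    102: {"name": "Gadget", "category": "Electronics"},
}

# 'Slice' the Sales measure by the Category attribute of the lookup table -
# conceptually what a Pivot Table does against the PowerPivot model.
sales_by_category = {}
for product_id, amount in fact_sales:
    category = dim_product[product_id]["category"]
    sales_by_category[category] = sales_by_category.get(category, 0.0) + amount

print(sales_by_category)  # {'Hardware': 330.0, 'Electronics': 120.0}
```

The point of the sketch is only the shape of the model: once facts and lookup tables are related, any attribute of the lookup table becomes a valid slicer for the measure.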
I want to emphasize: we are not trying to build the entire BI project or corporate data warehouse in PowerPivot. We are just ‘sampling’ the data model and ‘testing’ the different calculations in order to allow subject matter experts to help us gain much greater clarity regarding the quality of the data, the relationships among the different data entities, and the validity and relevance of the business calculations. It is often the case that business users will start ‘refining’ their idea of what they want once they see a working model.
The final result of a BI project that used PowerPivot for prototyping can very well be a traditional UDM/OLAP environment using MDX calculations. The discoveries gained during the prototyping phase have no direct relationship to the specific technology that will be implemented; they simply point to a better architecture that increases the chances of a successful implementation, one that will have a profound impact on the business.
A lot of Business Intelligence projects fail for two major reasons: lack of business sponsorship and poor data quality. There is a good chance that using PowerPivot as a prototyping tool can help tackle both issues, as subject matter experts use their familiar Excel environment while collaborating with BI professionals within an iterative Agile framework. Rapid dimension/measure sampling can quickly uncover data quality issues while establishing a solid data model that analysts can understand, which gradually builds support for the BI vision (in contrast to solutions they perceive as overly complex and often inaccurate, which is the main reason ‘spreadmarts’ exist in the first place).