Siddharth Mehta's Blog: Intelligent Distribution Analysis using Analyzer

Wednesday, September 07, 2011

Intelligent Distribution Analysis using Analyzer

Business intelligence reporting tools should be able to analyze massive amounts of quantitative data by categorizing it into frequency distributions and groups. The intelligence expected here lies in facilitating analysis of how data density is distributed across logically related or unrelated groups, identifying outliers, supporting quality control, identifying non-linear relationships between different business parameters, and more. This corresponds to a branch of statistical analysis known as distribution analysis.

For example, business assignments that are tracked using project management metrics typically include the cost performance index (CPI) and the schedule performance index (SPI) among those metrics. A large organization could have several hundred projects running simultaneously. To analyze how the business is performing on cost against schedule, the data for all projects is categorized into clusters based on CPI versus SPI, and the clusters are then analyzed to derive the relationship between these parameters as well as the performance of the projects in each cluster.
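To make the clustering idea concrete, here is a minimal Python sketch. It assumes the conventional reading of these indices, where a value of 1.0 or more means the project is within budget (CPI) or on schedule (SPI), and it simply splits projects into four quadrants. The project names and numbers are made up for illustration, and a real analysis might use finer clusters than these four.

```python
from collections import defaultdict


def classify(cpi: float, spi: float) -> str:
    """Place a project in one of four CPI/SPI quadrants, using 1.0 as the threshold."""
    cost = "within budget" if cpi >= 1.0 else "over budget"
    schedule = "on schedule" if spi >= 1.0 else "behind schedule"
    return f"{cost}, {schedule}"


# Hypothetical project data: (name, CPI, SPI)
projects = [
    ("Alpha", 1.10, 1.05),
    ("Beta", 0.85, 1.20),
    ("Gamma", 0.90, 0.80),
    ("Delta", 1.02, 0.95),
    ("Epsilon", 0.75, 0.70),
]

# Group project names by quadrant so each cluster can be studied separately.
clusters = defaultdict(list)
for name, cpi, spi in projects:
    clusters[classify(cpi, spi)].append(name)

for label, names in sorted(clusters.items()):
    print(f"{label}: {', '.join(names)}")
```

Once projects are grouped this way, the size of each cluster already says something about how cost and schedule performance move together across the portfolio.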

Any intelligent distribution analysis starts with the study of higher-level composition. In this Analyzer recipe, we will look at how a capable reporting tool can support an interesting and intelligent distribution analysis and help extract insights.

The above screenshot is a typical example of time series analysis using the Adventureworks cube, showing the sales performance of different geographies. The next logical step would be to study the distribution across geographies within a particular year, and the above graph is not suitable for this. The first step in studying distribution is to merge all individual values into a single entity, i.e. the different bars in any particular year, which stand as individual entities, are combined into one. A simple way to do this is with a stacked bar chart. A stacked column chart would have vertical bars, and if you look carefully, the vertical screen space available is much smaller than the horizontal screen space, so horizontal bars make better use of the screen. With a simple selection you can change the graph type for the same data, and the visualization would look like the below screenshot. If you wish to study only selected geographies, you always have the option to filter the legend and categories.
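The idea of merging the individual bars of a year into one stacked bar can be sketched in a few lines of Python. This is only an illustration of the arithmetic behind a stacked bar, not of how Analyzer draws it: each segment starts where the previous one ends. The figures are hypothetical and are not taken from the Adventureworks cube.

```python
from itertools import accumulate


def stack_segments(values: dict) -> list:
    """Return (label, start, end) for each segment of a single stacked bar."""
    labels = list(values)
    # Running totals give the end position of every segment.
    ends = list(accumulate(values[label] for label in labels))
    # Each segment starts where the previous one ended; the first starts at 0.
    starts = [0] + ends[:-1]
    return list(zip(labels, starts, ends))


# Hypothetical sales for one year, by country.
one_year = {"United States": 600, "Canada": 250, "France": 150}

for label, start, end in stack_segments(one_year):
    print(f"{label}: {start} to {end}")
```

The end of the last segment is the total for the year, which is why a stacked bar shows both the whole and its composition in one entity.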

Pie charts are widely used for distribution analysis. As a time series is involved here, we need a pie chart for every year. This is as easy as selecting the pie chart graph type, but each pie shows the distribution for a specific year, so the pies should not really be compared across the time series. For example, in the below screenshot, you should not compare the weight of the United States from one year to another, as each distribution is shown for a specific year. You might feel that CY 2001 has the highest US weight across all years, but this is not correct. CY 2003 has the highest value, which you can see by hovering over any slice of the chart. Cross-series distribution analysis cannot be done this way, but you can clearly see that in every year the US had the highest weightage compared to the others, which is not as clearly visible in a column chart. The point to keep in mind in distribution analysis is that composition should be studied within the same entity and not across entities.
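A small Python sketch shows why a share within one pie cannot be read as a value across pies. The numbers below are hypothetical (they are not the Adventureworks figures) and are chosen only so that the year with the largest US share differs from the year with the largest US value.

```python
# Hypothetical sales by year and country (not Adventureworks data).
sales = {
    "CY 2001": {"United States": 600, "Canada": 250, "France": 150},
    "CY 2003": {"United States": 1500, "Canada": 900, "France": 600},
}


def shares(year_values: dict) -> dict:
    """Return each country's fraction of that year's total, i.e. its pie slice."""
    total = sum(year_values.values())
    return {country: value / total for country, value in year_values.items()}


# Slice size of the US within each year's pie, and its absolute value per year.
us_share = {year: shares(values)["United States"] for year, values in sales.items()}
us_value = {year: values["United States"] for year, values in sales.items()}

largest_share_year = max(us_share, key=us_share.get)
largest_value_year = max(us_value, key=us_value.get)

print("Largest US share:", largest_share_year)
print("Largest US value:", largest_value_year)
```

With this data the US slice is larger in CY 2001 (60% of that year's total) than in CY 2003 (50%), even though the CY 2003 value is more than double. The slice only describes composition within its own year.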

Senior business management typically uses pyramids to study the composition of different entities, for example a resource pyramid for a particular project. By selecting a pyramid chart, you can easily achieve the same visualization. You might wonder why all countries are listed in the same order in every pyramid, even though the performance of each country varies from year to year. The reason is that the Country attribute hierarchy in the Geography user hierarchy of the Adventureworks cube is sorted by name. Since Analyzer uses AMO behind the scenes, it retrieves data in the same order. This works to the benefit of the user. If the sort order of the hierarchy were based on reseller sales amount, the order within the pyramid would also change, which is desirable because the configuration is left to the discretion of the user. Analyzer also has built-in capabilities for defining your own MDX queries without very detailed knowledge of MDX.
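The effect of the sort order on the pyramid can be illustrated with a short Python sketch. It only mimics the two orderings discussed above, by name and by reseller sales amount, using hypothetical figures. It does not query the cube or use any Analyzer or Analysis Services API.

```python
# Hypothetical reseller sales for one year, by country.
reseller_sales = {
    "Canada": 900,
    "France": 600,
    "Germany": 300,
    "United States": 1500,
}

# Order by name: the same order in every year, whatever the figures are.
by_name = sorted(reseller_sales)

# Order by sales amount, largest first: the order follows the data for that year.
by_sales = sorted(reseller_sales, key=reseller_sales.get, reverse=True)

print("By name: ", by_name)
print("By sales:", by_sales)
```

Sorting by name keeps each country at the same level across all pyramids, which makes it easy to locate, while sorting by the measure makes each pyramid reflect the ranking within its own year.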

Intelligence in any form of analysis sits in the brain of the individual analyzing the data. With Analyzer, one can leave a reasonable share of the onus of representing data intelligently to the tool itself, and analyze data using intelligent, interactive visualizations suited to different forms of analysis. The above examples of distribution analysis cover only higher-level data, but there are more chart options for quantitative distribution analysis, such as scatter charts. You can find out more about such options on the Analyzer website.
