Tuesday, May 11, 2010
I'm reading: Enterprise Data Quality and SSISTweet this !
A general perception of Business Intelligence is Data Warehousing and using Analytical solutions. But this is a 10,000 Ft picture. If an end-to-end BI solution is considered, ETL, Operational Reporting, Relational and Analytical Data Marts or Data Warehouse development, Dashboard Development and deployment over a collaborative platform like Sharepoint becomes visible from a 1,000 Ft. I find a major part of the MS BI Community busy in taking steroids of T-SQL Performance Tuning or engrossed in wrestle mania of re-blogging content that has been published tonnes of times and is available even on MSDN, many community websites are even busy in saving the legacy of SQL Server 7.0 / SQL Server 2000 and entertaining blind community challenges on the same. If you are working with a solution provider where you get all different kinds of projects from a vast client base, you would find that the world is big enough and there is more spectrum in BI and more sensible exercises to carry out than the typical ones or the obsolete ones. When you take a 100 Ft level closer look at any of the development aspect like ETL for example, you would find that non-microsoft players would also come into picture that would integrate with MS BI technology stack, aspects like Data Quality, Data Cleansing, Systems Integration would start popping up. This would feel like I am describing a wonderland, but the fact is that you realize this only when you work in that role.
As usual, enough of theory and coming down to the subject of this post, I just have two points to mention. Firstly, Data Quality is a very important aspect of any Data Migration or Data Cleansing or Systems Integration Programme. Even when compliance or audits becomes the main objective, more than 80% of the project depends upon data quality. Perception of compliance is limited to auditing like C2, SOX, HIPPA and others. But auditing is one of the aspect of these compliance and there's more to it which also includes data quality compliance. Secondly, in SSIS, Data Profiling task is one of the best improvements after 2005 version. It can be used to make intelligent decisions based on the quality of the data, when the same is processed.
If you really intend to learn more on data quality, I suggest to read this whitepaper on Enterprise Data Compliance. This whitepaper explains the legal complexities and implications of data quality, business process that affects data quality, and technical architecture and components needed to establish enterprise data quality. After you are able to grasp this architecture, you would definitely find that SSIS can be used exhaustively as the primary tool to craft a data quality solution. Though just the Data Profiling task would fall short and rest of the requirements would have to be tailored manually using different tasks and transforms, but I am very sure that SSIS still has a large scope to evolve and the present potential is reasonable enough to consider it's use for data quality exercises.