Wednesday, December 29, 2010
Tuesday, December 28, 2010
Monday, December 20, 2010
Once you have the blueprint ready, the artifacts that form the anatomy of the data warehouse need to be built, and two of the major ones are:
1) ETL routines that shape your data to comply with the data mart design.
2) The relational data warehouse / data mart, i.e. the dimension and fact tables and other database objects that hold the data transformed by ETL.
The process sounds crystal clear, but when you are developing from scratch and your data warehouse and dimensional model are still evolving, there is one tool that can be very instrumental in designing them. The wonderful part is that this tool / template comes for free, courtesy of the Kimball Group, and it's called the Dimensional Modeling Spreadsheet.
Dimensional Modeling Spreadsheet: This template spreadsheet helps you create the entire data dictionary / data map for your dimensional model, and it contains samples of some of the basic dimensions used in almost any dimensional model. The unique thing about this spreadsheet is that once you have keyed in your design, it can generate SQL out of your model. You can run this SQL script against your database and create the dimension and fact tables right out of it, which means your data mart / relational data warehouse is ready to store the data. The spreadsheet can also form the base for your ETL routines. The only other tool I know of that comes near this functionality is WhereScape RED, and of course it's not free, as it does a lot more than just this.
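As a rough illustration, the generated script is ordinary DDL. The table and column names below are hypothetical, not taken from the spreadsheet itself, but a dimension and fact table produced from such a model would look something like this:

```sql
-- Hypothetical date dimension with a surrogate key
CREATE TABLE DimDate (
    DateKey        INT          NOT NULL PRIMARY KEY,  -- surrogate key, e.g. 20101220
    FullDate       DATE         NOT NULL,
    CalendarYear   SMALLINT     NOT NULL,
    CalendarMonth  TINYINT      NOT NULL,
    MonthName      VARCHAR(10)  NOT NULL
);

-- Hypothetical fact table referencing the dimension by its surrogate key
CREATE TABLE FactSales (
    DateKey      INT           NOT NULL REFERENCES DimDate (DateKey),
    ProductKey   INT           NOT NULL,
    SalesAmount  DECIMAL(19,4) NOT NULL,
    OrderQty     INT           NOT NULL
);
```

The same column list that drives this DDL doubles as the source-to-target map for the ETL routines, which is why the spreadsheet can serve as the base for both.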
You can read more about this spreadsheet in the book "The Microsoft Data Warehouse Toolkit: With SQL Server 2005 and the Microsoft Business Intelligence Toolset". Those who are new to dimensional modeling concepts can read this article to gain a basic idea of the lifecycle.
Sunday, December 19, 2010
- Parameterized reports
- Linked reports
- Snapshot reports
- Cached reports
- Ad hoc reports
- Clickthrough reports
- Drilldown reports
- Drillthrough reports
The categories of reports that I see from a development / production aspect are as below:
- Business user authored reports: Ad hoc reports and clickthrough reports fall in this category.
- Performance boosters: Cached reports and snapshot reports fall in this category.
- Data / problem decomposition: Drill-down and drill-through reports fall in this category. These are probably the reports of maximum value for business users.
- Business as usual: Parameterized reports are the most common kind of reports and fall in the BAU category.
- Reusable / versioned reports: Subreports and linked reports fall in this category. This is probably the category that gets the most focus from a development perspective.
A very detailed explanation of the types of reports in SSRS 2008 R2 can be read on MSDN from here.
Tuesday, December 14, 2010
The order of this process can be as follows:
Wednesday, December 08, 2010
1) What is the difference between scale out and scale up?
2) How would you ensure 24 x 7 availability of a cube, considering that users access the cube globally and it should always remain available for querying?
The answer to the first question: scaling out means distributing / replicating processing operations and/or data across multiple nodes to achieve parallelism for concurrent querying or processing. Scaling up usually means increasing the capacity of the host so the server can cater to the incoming load. When the capacity for scaling up ends, scaling out steps in.
The next question was quite interesting, and the challenge was that I had to think of a design instantly and answer on the spot. I answered the question correctly, and to my delight, I later found this whitepaper, which describes exactly what I answered. Such moments bring a lot of happiness and confidence that my knowledge has not gone stale and I can continue to provide consulting in the MS BI business.
The presentation layer is coupled with the SSAS query server. Data is read from the relational engine and the cube is processed on a separate server, which can be considered another layer altogether. After the cube is processed, the query server and processing server are synchronized. For multi-server synchronization, SSIS is used. The two diagrams below demonstrate this design. The entire whitepaper can be read from here.
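The synchronization step in this design is typically driven by the SSAS Synchronize XMLA command, which copies the processed database from the processing server to a query server. A minimal sketch follows; the server and database names are hypothetical:

```xml
<Synchronize xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Source>
    <ConnectionString>Provider=MSOLAP;Data Source=ProcessingServer</ConnectionString>
    <Object>
      <DatabaseID>SalesCube</DatabaseID>
    </Object>
  </Source>
  <SynchronizeSecurity>SkipMembership</SynchronizeSecurity>
  <ApplyCompression>true</ApplyCompression>
</Synchronize>
```

In an SSIS package this command can be issued from an Analysis Services Execute DDL Task, one per query server, which is how the multi-server fan-out described above can be orchestrated.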
Tuesday, December 07, 2010
Thursday, December 02, 2010
Monday, November 29, 2010
- Types of Dimensions
- Types of Measures
- Types of relationships between dimensions and measuregroups: None (IgnoreUnrelatedDimensions), Fact, Regular, Reference, Many to Many, Data Mining
- Star Vs Snowflake schema and Dimensional modeling
- Data storage modes - MOLAP, ROLAP, HOLAP
- MDX Query syntax
- Functions used commonly in MDX like Filter, Descendants, Ascendants and others
- Difference between EXISTS and EXISTING, NON EMPTY keyword and function, NON_EMPTY_BEHAVIOR, ParallelPeriod, AutoExists
- Difference between static and dynamic set
- Difference between natural and unnatural hierarchy, attribute relationships
- Difference between rigid and flexible relationships
- Difference between attribute hierarchy and user hierarchy
- Dimension, Hierarchy, Level, and Members
- Difference between database dimension and cube dimension
- Importance of CALCULATE keyword in MDX script, data pass and limiting cube space
- Effect of materialize
- Partition processing and Aggregation Usage Wizard
- Perspectives, Translations, Linked Object Wizard
- Handling late arriving dimensions / early arriving facts
- Proactive caching, Lazy aggregations
- Partition processing options
- Role playing Dimensions, Junk Dimensions, Conformed Dimensions, SCD and other types of dimensions
- Parent Child Hierarchy, NamingTemplate property, MemberWithLeafLevelData property
- Cube performance, MDX performance
- How to pass parameter in MDX
- SSAS 2005 vs SSAS 2008
- Dimension security vs Cell security
- SCOPE statement, THIS keyword, SUBCUBE
- CASE (CASE, WHEN, THEN, ELSE, END) statement, IF THEN END IF, IS keyword, HAVING clause
- CELL CALCULATION and CONDITION clause
- RECURSION and FREEZE statement
- Common types of errors encountered while processing a dimension / measure groups / cube
- Logging and monitoring MDX scripts and cube performance
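Several of the MDX items above can be illustrated in a single query. The sketch below assumes the familiar Adventure Works sample cube (the measure and hierarchy names come from that sample); it combines NON EMPTY, Filter, and ParallelPeriod to return the months whose sales exceeded the same month of the prior year:

```mdx
SELECT
    NON EMPTY { [Measures].[Internet Sales Amount] } ON COLUMNS,
    FILTER(
        [Date].[Calendar].[Month].MEMBERS,
        [Measures].[Internet Sales Amount] >
        ( [Measures].[Internet Sales Amount],
          PARALLELPERIOD([Date].[Calendar].[Calendar Year], 1) )
    ) ON ROWS
FROM [Adventure Works]
```

NON EMPTY suppresses months with no sales at all, while the tuple inside FILTER shifts the current month back one calendar year for the comparison.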
Thursday, November 25, 2010
How SSIS Performance can be optimized using Change Data Capture and/or Change Tracking, MERGE, and Filtered Indexes
Tuesday, November 23, 2010
Master Data Services Video Tutorial
SQL Server Integration Services Denali Video Tutorials
Migration and Deployment
- Working with Integration Services projects in Business Intelligence Development Studio (BIDS)
- Using the Deployment Wizard to deploy an Integration Services project
- Import an Integration Services Project from the Project Deployment File
- Import an Integration Services Project from an Integration Services Catalog
- Migrate an Integration Services 2008 Project to the Project Deployment Model
- Configure an Integration Services project using SQL Server Management Studio (SSMS)
- Running a package using SQL Server Management Studio (SSMS)
Impact Analysis and Data Lineage
- Extract Dependency Information Using the DependencyExtractor Command Prompt Tool
- Extract and View Dependency Information
- Extract and View Dependency Information for SQL Server Objects Using the MSSQL Provider
SSIS Designer Enhancements
Monday, November 22, 2010
Thursday, November 18, 2010
Monday, November 15, 2010
Sunday, November 14, 2010
The BI Semantic Model can be viewed as a 3-layer model:
- The Data Model layer that is exposed to client applications. Even though the BISM model is fundamentally relational, it can expose itself using both relational as well as multidimensional interfaces. OLAP-aware client applications such as Excel can consume the multidimensional interface and send MDX queries to the model. On the other hand, a free-form reporting application such as Crescent can use the relational interface and send DAX queries.
[Siddharth]: This can be seen as the view part of the MVC pattern. If I were an application developer I would call this layer the presentation interface / facade. As of now, Crescent supports only BISM models, and SQL Azure Reporting supports only SQL Azure databases; both are starting off on a conservative basis.
- The Business Logic layer that encapsulates the intelligence in the model. The business logic is created by the model author using DAX (Data Analysis Expressions) or MDX (Multidimensional Expressions). DAX is an expression language based on Excel formulas that was introduced in PowerPivot and built on relational concepts. It does not offer the power and flexibility that MDX does, but it is simpler to use and requires minimal tuning. There will always be sophisticated BI applications that need the power of MDX calculations and we envision that the BI Semantic Model will offer the choice of MDX as well, but this will likely come in a release after Denali.
[Siddharth]: This layer can be seen as the controller part of the MVC pattern. Saying that the BI Semantic Model would offer the choice of MDX in a release after Denali means that it is definitely not a priority right now. In my view, not just sophisticated applications but almost all existing applications need MDX. It would have been nicer the other way round: MDX support provided now, along with partial DAX, and the rest in service packs / future releases.
- The Data Access layer that integrates data from various sources – relational databases, business applications, flat files, OData feeds, etc. There are two options for data access – cached and realtime. The cached mode pulls in data from all the sources and stores in the VertiPaq in-memory column store. VertiPaq is a breakthrough technology that encapsulates state-of-art data compression algorithms along with a sophisticated multi-threaded query processor that is optimized for the latest multi-core chipsets, thereby delivering blazing fast performance with no need for indexing, aggregates or tuning. The realtime mode, on the other hand, is a completely passthrough mode that pushes the query processing and business logic evaluation down to the data source, thereby exploiting the capabilities of the source system and avoiding the need to copy the data into the VertiPaq store. Obviously there is a tradeoff between these two modes (the high performance of VertiPaq versus the latency and overhead associated with copying the data into VertiPaq) and that choice is left to the model author.
[Siddharth]: This layer can be seen as the model part of the MVC pattern. If I am getting this right, this means MOLAP evaporates here. No doubt VertiPaq is the demand of the time, but legacy applications cannot make an entry into this layer, nor source data from their data marts into this layer and adopt this model.
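To make the DAX-versus-MDX contrast in the Business Logic layer concrete, here is a hedged sketch of the same business rule in both languages; the table, column, and measure names are hypothetical. A year-over-year comparison is an Excel-style formula in DAX, while the MDX equivalent lives in the cube's calculation script:

```
-- DAX: year-over-year delta, written like an Excel formula over a table
Sales YoY :=
    SUM(Sales[Amount])
  - CALCULATE(SUM(Sales[Amount]), SAMEPERIODLASTYEAR('Date'[Date]))

-- MDX: the same idea expressed in the multidimensional calculation script
CREATE MEMBER CURRENTCUBE.[Measures].[Sales YoY] AS
    [Measures].[Sales Amount] -
    ( [Measures].[Sales Amount],
      PARALLELPERIOD([Date].[Calendar].[Calendar Year], 1) );
```

The DAX version leans on relational concepts (tables and columns), whereas the MDX version reasons in tuples and hierarchy levels, which is exactly the power-versus-simplicity tradeoff described above.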
In my view, neither Microsoft nor the developer community can decide what the industry has to adopt. Only customer needs, and the market standing of BISM in comparison to its competition, can decide the success of BISM as the probable successor of its predecessor.
Friday, November 12, 2010
Thursday, November 11, 2010
Wednesday, November 10, 2010
Coming to the topic of this post: after the availability of Database Manager for SQL Azure, SQL Azure Reporting is one step forward in bringing the SaaS model into the SQL Server ecosystem. Project Atlanta and Quest Software's Azure services are two of the cloud-based, SQL Server related SaaS / PaaS offerings (I have not analyzed them thoroughly enough to say which label fits).
Microsoft's official description for Project Atlanta is as follows: "Microsoft Codename Atlanta is a cloud-based service that analyzes installations of Microsoft SQL Server 2008 (and later) and provides proactive alerts to help you avoid downtime and adhere to Microsoft's best practices. Detection of potential issues is based on knowledge provided by experienced Microsoft Support engineers who keep it up to date to leverage the most recent experience in supporting SQL Server deployments around the world". Though it sounds very exciting, there is another dimension to this offering, which Brent has described in his post. Do not miss the conversation between Brent and the Program Manager of Project Atlanta. This is one case study from which one can take a few lessons that can be helpful in developing proprietary solution offerings on similar lines.
If you analyze which way the wind is blowing, you can easily make out that SQL Server product vendors are gearing up to provide cloud-based solutions, some of which might be effective and some might not. From a solution provider's point of view, when a client requests a PaaS / SaaS solution on Windows Azure or SQL Azure, case studies like these can provide good insight from an architecture as well as a solution design standpoint. I am wondering what Red Gate's answer to these services would be, as the SQL Toolbelt might not be sufficient once the cloud comes into the picture.
Tuesday, November 09, 2010
2) Has the Report Viewer control in SSMS been fixed to support SSRS 2008 and higher formats?
3) Has Silverlight support been added in SSRS?
4) What are the new enhancements in SSIS toolset as well as SSIS BIDS Designer?
5) Has better support now been provided in the SSRS report designer for report debugging, or do we still need to use Fiddler?
6) Have any enhancements been provided for SSRS report deployment, or do we still need to depend upon RSScripter?
7) Is there any improvement in SQL Server Agent to support better scheduling and triggering of services and instruction sets like SSIS packages, for example FileWatcher support in the scheduler?
8) An SSAS reference relationship between a dimension and measure groups can corrupt aggregations when processing is triggered through partition processing. Has this been improved?
9) BIDS Helper covers lots of gaps that BIDS does not provide. Has this support been incorporated as a part of BIDS?
10) Freeware utilities like SSMS Tools Pack, the SQL Search add-in from Red Gate, and DB Pro in Visual Studio are some of the distinct utilities that DB professionals rely on a lot during development. Has SSMS managed to spare us from using any of these utilities?
Saturday, November 06, 2010
1) SQL Server
3) IBM DB2
8) IBM UniVerse
9) IBM Unidata
15) Mimer SQL
18) Intersystems Cache
24) Visual FoxPro
27) Sharepoint (Document lists and libraries)
29) IBM AS/400
Thursday, November 04, 2010
Wednesday, November 03, 2010
Sunday, October 31, 2010
Saturday, October 30, 2010
DataMarket, which was formerly known as Microsoft Codename "Dallas", is now commercially available; this announcement was made very recently, along with many other significant announcements for the Windows Azure Marketplace, which can be read from here. App Market, another section of services related to application development and hosting, would be commercially released, probably early next year.
To have a first look at the DataMarket section of Windows Azure Market Place, check out this video.