Friday, July 31, 2009
Monday, July 27, 2009
Monday, July 20, 2009
SSIS Performance , SSIS Debugging , SSIS Development , SSIS Tools , SSAS Tools , SSRS Tools , MS BI Tools - BIDS HelperI'm reading: SSIS Performance , SSIS Debugging , SSIS Development , SSIS Tools , SSAS Tools , SSRS Tools , MS BI Tools - BIDS HelperTweet this !
BIDS Helper is such a great tool that needs no introduction for MS BI Professionals , or rather I would like to used the word "Product" for it to honour its value. I feel that sooner or later, Microsoft should incorporate the functionality that it provides within BIDS or should flag it as an official Microsoft Recommended Add-In and ship it with some service packs.
I would like to present below, some of the very useful functionalities that it provides for the SSIS environment, functionalities that in particular I have found useful. I don't mean to say that rest of the features are not useful, but it means that I have not required to use the features not listed below in my development knowledge.
1) dtsConfig File Formatter: Many of us hate the way config data is written to dtsConfig file and keep wondering why there's no reader or a micro IDE kind of thing for it. It can be read in Visual Studio which would format the excel, but the file formatter in this tool, does it all for the developer.
2) Expression and Configuration Highlighter: This is a super brilliant feature, and is very necessary in large projects where you develop tens of packages every week. Especially this is helpful in setting package configurations in a master - child package kind of scenario where the child packages read their configuration setting from master packages. This feature can work as a proof-reader to have a quick visual check on whether the configuration and expression setting for controls have been implemented or not.
3) Expression List: This is one of the much required report in any package. This feature combined with the previous feature, gives a nice proof-reading (verification) and reporting of the expressions implemented in a package. Also it comes handy to have a list of expressions used in the package, as it gives a better track of points where changes needs to be made.
4) Non-Default Properties Report: This is another brilliant report that becomes almost inevitable while working in a maintenance kind of project, and is also helpful in development projects for documentation and reconciliation purpose. As relevant from the name, it provides a report of all the sensible properties which has non-default values. This helps to verify that any property has not been changed inadvertently.
5) Pipeline Component Performance Breakdown: This is one of the key helpful debugging help, needed in a complex project. Thou from the theory and definition that the product page explains, I am quite convinced that this would help, but still I have skeptic about precision of its report. After all it's just a program and a mechanism of measuring the performance, and can't be more analytical than human brain !! I wish that it was perhaps possible to combine the replay from trace feature of SSIS Log Analyzer with this feature, it would had been a silver bullet in arena of SSIS debugging.
6) Reset GUIDs: This feature is a developer's hammer, and saves quite some time when a package is copied in the same or across environments, where duplication of a package is done for templating purpose.
7) Sortable package properties report: This is another nice report that this tool draws out of the SSIS solution. It's a good report to attach to a build, but it's not comprehensive. It can get better, if it provides options to select properties that are required in the report which becomes useful for a real-time reporting in a complex project. Also I believe this would work for only standard controls that come out of the box of SSIS, but not for custom or third-party controls. To summarize this report is good to attach to any build, thou just for a formality.
8) SSIS Performance Visualization: This is a nice analytical and pictorial representation of execution trees logs available in SSIS. At the first glance, this symbolic language got me confused and thinking about what it means as and requires a thorough reading of how to interpret the meaning of the charts. But it's a swiss knife, if used with a deep understanding. It's not a developer feature, it's a feature that a performance analyst would appreciate and love to use. I like this performance visualization feature 10 times more than what SSIS log analyzer tool shows in the form of pie charts with almost no technical details of how it was derived and the stats behind it.
9) SSIS Variables Windows extensions: This is again a developer's hammer and making your vision more deep and sharper into the package composition. I am not that impressed with the feature that it provides for copying the variables across tasks and containers, compared to the feature where it highlights the icons of the variables impacted by package configuration and expressions.
All in all, I feel that it's one of the must have utility after installing BIDS for any serious SSIS development project, as sooner or later eventually every SSIS developer or performance analyst finds a use of one or more features of this tool.
Monday, July 13, 2009
A new utility has recently got floated on codeplex, and this caught my attention as it's related to SSIS. This utility is known as SSIS Log Analyzer. It works for both SQL Server 2005 and SQL Server 2008. It offers different features (see mentioned below), but the one I like is "Rerun (debug) the entire sequence of events which happened during package execution showing the flow of control in graphical form, changes in runtime values for each task like execution duration etc". This is synonymous to the what profiler offers while recording the trace. The trace file can then be executed again in profiler to measure & analyze database performance.
The features that are listed for this utility on it's product page can be classified into two basic categories:
1) Reasonable user interface to analyze the log or I would use the word "View" the log in a organized fashion than viewing it in the execution results pane.
2) Analysis that this utility claims to provide, in terms of package analysis and log analysis as well.
The features that the product page lists are as below:
- Option to analyze the logs manually by applying row and column filters over the log data or by using queries to specify more complex criterions.
- Automated Performance Analysis which provides a quick graphical look on which tasks spent most time during package execution.
- Rerun (debug) the entire sequence of events which happened during package execution showing the flow of control in graphical form, changes in runtime values for each task like execution duration etc .
- Support for Auto Analyzers to automatically find out issues and provide suggestions for problems which can be figured out with the help of SSIS logs and/or package.
- Option to analyze just log file or log and package together.
- Provides a lightweight environment to have a quick look at the package. Opening it in BIDS takes some time as being an authoring environment it does all sorts of validations resulting in some delay.
It's an interesting tool, and has a good potential of gaining maturity in terms of providing package analysis. This utility used with DTLoggedExec, can be a very useful and handy toolset that too for free.
Friday, July 10, 2009
Reference: Data Platform Insider
Thursday, July 09, 2009
While working in development projects or especially on data migration kind of projects, the requirements are very volatile and can change a few times in a single day. There can be flying metadata, i.e. starting from column names to end mapping destination, anything could change. In such an environment, some of the practices which I am listing below have come very useful to me in my personal development experience.
1) Always use annotations, else there is a very good chance that you would forget why you placed a particular control.
2) Always use meaningful names, if possible the name of the column for which transformation is being used. If you are using a Derived column, where you are defining set of columns, add an annotation besides it specifying names of columns.
3) Always respect your source columns, and bear in mind that the joining keys i.e. the primary keys or the lookup keys should of the same/comparable data-type else you might run into serious mess at a later stage. This analysis should be done and reported to your DBA or design architect if there's scope of correction. Else the first step after taking out data from the source should be to make data-types equal.
4) Always put a multicast transformation after the source adapter, and after every stage where data takes a different form or an important column has been derived. You would never know when this source would be required at a different stream or transformation.
5) Before you start your development, find out common columns used across your flow provided your flow has more than one outputs / destinations or you are working in a multi data-flow kind of environment. This would help you centralize the change and require minimum effort for maintenance or change management.
6) Try to keep ValidateExternalMetadata property to False as far as possible during your development, especially where lots of columns are coming from sources (in hundreds) and dozens of transformations are being used to stop your BIDS from hanging. This would be helpful as even if you change a simple column at any level, the entire flow would change and it would go back to your source to validate the metadata, normally in case of a database if the property is set to True.
7) In case if you have developed your package to a great extent, and then you encounter an Einstein with an update that the column name has changed, use the new column and set the Output Alias property as the name of older column name. This would save you from correcting the issues in other transformations due to a new column induced from top to bottom of the flow.
8) Take backup of your package every 2 hrs. If you mess up seriously with your package, there is no Ctrl-Z that really helps.
Wednesday, July 08, 2009
How Do I: Import Text Files with SQL Server Integration Services?
How Do I: Import Data from Oracle Databases?
Creating a Basic Package
How to: Call a Web Service by Using the Web Service Task
How to: Implement a Lookup Transformation in Full Cache Mode
How to: Use the Data Profiling Task
Provide an Up-to-Date Data Warehouse with Change Data Capture
Create and Publish a Data Cube for SQL Server Analysis Server 2008
Creating an Analysis Services Project
Defining and Deploying a Cube
Modifying Measures, Attributes and Hierarchies
Configure a Report Portal in SSRS
Automate Report Delivery in SQL Server Reporting Services Using Subscriptions
Create and Modify Reports Using the Report Builder Tool
Configure Report Security in SSRS
Create a Report with Visual Studio 2008 and Deploy it to SQL Server Reporting Services 2008
Monday, July 06, 2009
Take example of For Each Loop container in SSIS 2008. By default the For Each File Enumerator is the default selection. But if you look at the details section, you won't find any details specific to this enumerator. In order to get the details in this section, you need to select some other enumerator and then re-select it to see the details. Ideally, there should be nothing selected in the Enumerator list or in case the details section should have the details of the Enumerator selected by default.
One another example is of SQL Server Management studio. By default the setting remains that it allows you to create a table, but due to a default setting, it won't allow you to save changes. It is very common and obvious for anyone to make changes to the design of a database object. It is very strange that on the first hand it allows you to create a table or database object in general, which in my opinion it should not if it has then to stop you from modifying the design thou the person being the owner of the database object.
Thursday, July 02, 2009
Wednesday, July 01, 2009
PTPower is an Excel 2007 Add-In that lets you view the MDX and/or create Calculated Members for pivot tables based on Analysis Services (2005 or 2008) cubes. Thou I have not used/tested this add-in, but if it lives upto what it states, I give it a 5-star.
Reference: Andrew Fryer's Blog