Tuesday, December 28, 2010
SSIS Architecture Design Document
When I was quite junior in the IT industry, I used to wonder how one designs an architecture document, and I collected templates of such documents from different sources. Now that I have matured enough to design one myself, with my own sense and understanding of the system, I feel that a basic outline of the sections an SSIS architecture design document should contain would be of interest to everyone growing into a future architect.
Delta Detection - This section would contain details about incremental load and Change Data Capture.
Extraction & Staging - This section would contain details regarding the treatment of extracted data and whether permanent or temporary staging is being used. If a staging area is used, it would need to be described in more detail here.
Facades - Interfaces and contracts are a vital part of any architecture design. This section would contain the views and stored procedures (SPs) that you would create in your OLTP system. Even your CDC SPs can act as your facade. This section would also describe whether you are using a push or a pull model.
Trigger mechanism - Many solutions contain an application which triggers the ETL cycle. Whether you would use a scheduler such as Autosys or SQL Server Agent should be described here.
Process Control Structure - Each category of packages, classified by functionality, should log which package was doing what and when. This would provide a monitoring layer and control over the entire ETL execution.
Environment configuration - All details about environment variables, package configurations, and the storage locations of those configurations should be described in this section.
Traceability - Logging details should be mentioned in this section. Auditing can be of huge importance to support teams, so this section should be designed and documented with support teams in mind.
Transaction Handling - Details about transaction handling within and across packages belong in this section.
Error Handling - This section is very important to almost all teams reviewing your architecture document. Also, if your error handling is weak, your transaction handling might suffer, and this can be catastrophic for any kind of data loading.
Automation - Particularly in data warehousing projects, automation is one of the regular requirements. The trigger mechanism falls under the automation umbrella to an extent.
Scalability - The more loosely you couple your package structure to take advantage of parallelism, and the more you design your logic with memory in mind (synchronous versus asynchronous transformations), the more you gain in scalability. These details should be mentioned in this section.
Hardware Configuration - Server configuration details, from the number of cores to the amount of RAM that you intend to use, belong in this section.
Change Control & Deployment - Deployment methodology and location should be elaborated in this section. Though change control is part of the configuration exercise, it is linked to how you would manage deployment, so it makes more sense to elaborate the change control methodology in this section.
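To make a few of these sections more concrete, here is a minimal sketch of how Delta Detection, Process Control Structure and Transaction Handling might fit together. It is not SSIS code: it uses Python with an in-memory SQLite database as a stand-in for the real source and staging systems, and every table, column and package name in it (source_orders, staging_orders, etl_watermark, etl_process_log, LoadOrders) is a hypothetical name chosen for illustration. It assumes the source table carries a last-modified timestamp stored as an ISO-formatted string, which is only one of several ways to detect deltas; Change Data Capture would replace the watermark query.

```python
import sqlite3
from datetime import datetime, timezone


def create_schema(conn):
    # All table and column names here are hypothetical, chosen for illustration.
    conn.executescript("""
        CREATE TABLE source_orders   (id INTEGER PRIMARY KEY, amount REAL, modified_at TEXT);
        CREATE TABLE staging_orders  (id INTEGER PRIMARY KEY, amount REAL, modified_at TEXT);
        CREATE TABLE etl_watermark   (table_name TEXT PRIMARY KEY, last_loaded TEXT);
        CREATE TABLE etl_process_log (package TEXT, status TEXT, rows_loaded INTEGER, logged_at TEXT);
    """)


def run_incremental_load(conn, package="LoadOrders"):
    """Copy rows changed since the last watermark into staging and log the run."""
    row = conn.execute(
        "SELECT last_loaded FROM etl_watermark WHERE table_name = 'source_orders'"
    ).fetchone()
    watermark = row[0] if row else ""  # empty string sorts before any ISO timestamp

    try:
        with conn:  # one transaction: staging rows and watermark commit together
            changed = conn.execute(
                "SELECT id, amount, modified_at FROM source_orders "
                "WHERE modified_at > ? ORDER BY modified_at",
                (watermark,),
            ).fetchall()
            conn.executemany(
                "INSERT OR REPLACE INTO staging_orders VALUES (?, ?, ?)", changed
            )
            if changed:
                # Advance the watermark to the newest timestamp just loaded.
                conn.execute(
                    "INSERT OR REPLACE INTO etl_watermark VALUES ('source_orders', ?)",
                    (changed[-1][2],),
                )
        status, rows_loaded = "Succeeded", len(changed)
    except sqlite3.Error:
        status, rows_loaded = "Failed", 0  # the with-block has already rolled back

    with conn:  # the process log entry is written even when the load failed
        conn.execute(
            "INSERT INTO etl_process_log VALUES (?, ?, ?, ?)",
            (package, status, rows_loaded, datetime.now(timezone.utc).isoformat()),
        )
    return rows_loaded
```

Calling create_schema once and then run_incremental_load on each ETL cycle loads only the rows modified since the previous run, so a second call with no new source changes loads nothing but still writes a log entry. Keeping the staging write and the watermark update in the same transaction is a deliberate choice in this sketch: if either fails, neither is committed, and the next cycle simply retries from the old watermark. One known limitation of the strict "greater than" comparison is that a row arriving later with exactly the same timestamp as the watermark would be missed.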
Feel free to add your comments to make this section more complete.