Sunday, March 08, 2009

Metadata information storage for Extraction or SSIS Extract Packages : SSIS Package Configuration

I'm reading: Metadata information storage for Extraction or SSIS Extract Packages : SSIS Package ConfigurationTweet this !
Ideally, extraction process should be guided or driven by metadata. There always remains a margin of change in any OLTP source system though the change may not be that huge or frequent. In case, if you are into a project that follows a RUP model and OLTP development plus ETL or BI development in general is carried out in parallel, metadata inventory is the key guideline to keep the BI development and delivery in control.

Some of the different kind of metadata information that is generally required, at least from the extract perspective of SSIS are as below:

o List of tables to be extractedo Columns that are to be extracted for each table
o Delta detection of each table/synch
o Extraction Start and End Date Times
o Status of each table extract
o System parameters such as folder location, connection parameters
o Schedule durations
o Number of retries incase of failure
o Logging Parameters

The metadata information could be stored in configuration files, database tables, DSVs etc. Based on the complexity of configuration and flexibility for making changes you could choose different stores.

As per my viewpoint, a convenient and advisable option for the metadata storage would be to start from an indirect XML file, the location of which can be taken from an environment variable. This methodology has advantages like, during the development phase when debugging can be expected to be frequent, editing an XML file is more easy than updating a table as it would require database permissions and the complexities associated with it. Also just having the repository in the table won't suffice as the end-user generally can't be expected to change data manually by opening and editing tables. A user-interface would be required to be developed, to faciliate data manipulation within those configuration tables.

As the iterations of RUP progresses and moves towards the final iterations of the construction phase, storage location could be replaced by database tables and these should not take more than two man days efforts to have this change in place.

No comments:

Related Posts with Thumbnails