I just read a post by Todd McDermid on the release of Kimball Method Slowly Changing Dimension v1.5. Todd gives a very nice explanation of how this component works internally. In theory, it is claimed to be 100 times faster than the regular SSIS SCD component, and I feel the reason is a particular design choice made for this component. I will not summarize the features of this new version, as I cannot do a better job than Todd has done in his post; my interest is in this design choice.
After reading this post, I realized that this component uses a Cached Lookup kind of implementation internally, which makes it perform like a bullet but also makes it a heavy memory consumer. So, in theory, if it runs on a data warehouse or ETL server where other ETL operations are running in parallel, one can expect a sudden drop in available memory when this component starts processing a huge SCD. I by no means intend to say that one should not use this component; my point is that it should be used with a little caution. It's like boarding a supersonic carrier: you are guaranteed to reach space faster than any regular aircraft would, but be ready to carry a huge fuel tank and bear the thrust for the minutes it takes to get there.
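To make the trade-off concrete, here is a minimal Python sketch of the general cached-lookup idea. It is not the component's actual code, which is an SSIS data flow component. The function names (`build_cache`, `classify`), the column names, and the sample rows are all hypothetical and exist only for illustration. The point it shows is that the entire existing dimension is held in memory so each incoming row can be compared without a round trip to the database.

```python
def build_cache(dimension_rows, key):
    """Load the whole existing dimension into memory, keyed by business key.

    This is where the memory cost comes from: the dictionary grows with
    the number of rows in the dimension.
    """
    return {row[key]: row for row in dimension_rows}


def classify(source_rows, cache, key, tracked_columns):
    """Split incoming rows into new, changed, and unchanged using only the cache."""
    outcome = {"new": [], "changed": [], "unchanged": []}
    for row in source_rows:
        existing = cache.get(row[key])  # in-memory lookup, no database query
        if existing is None:
            outcome["new"].append(row)
        elif any(row[col] != existing[col] for col in tracked_columns):
            outcome["changed"].append(row)
        else:
            outcome["unchanged"].append(row)
    return outcome


# Hypothetical sample data: an existing dimension and an incoming source batch.
dimension = [
    {"customer_id": 1, "city": "Pune"},
    {"customer_id": 2, "city": "Mumbai"},
]
source = [
    {"customer_id": 1, "city": "Pune"},    # same as the dimension row
    {"customer_id": 2, "city": "Delhi"},   # city differs from the dimension row
    {"customer_id": 3, "city": "Chennai"}, # business key not in the dimension
]

cache = build_cache(dimension, "customer_id")
outcome = classify(source, cache, "customer_id", ["city"])
print({name: len(rows) for name, rows in outcome.items()})
```

The speed comes from replacing a database query per row with a dictionary lookup, and the memory cost comes from `cache` holding every dimension row at once, which is why a very large dimension can squeeze other packages running on the same server.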
Considering all the advantages it brings to the table compared to the regular SSIS SCD component, and considering the RAM size I have mostly seen on ETL servers (normally 8 GB), this component is a must-go selection. Read Todd's entire post to understand the various features of this component; it also includes a video tutorial.