The typical, well-known use of the cloud for ETL is to harness the elasticity of computing resources. ETL in the cloud is ideal for applications whose data is already stored in the cloud. Regular line-of-business applications would process data from OLTP sources and load it into another data repository in the cloud. For ETL to source and load data from on-premises data sources, WCF and related RIA services are employed. Using Amazon's cloud (EC2 instances with EBS storage) and customized VM images, you can set up your ETL in the cloud, with the Standard edition of SSIS supported out of the box. Back when Azure was in its CTP, I authored two articles on SQL Azure, covering how to read and write data to SQL Azure using SSIS and SSRS 2008 R2. But all of this is already well known. What's new?
The Semantic Web, unstructured data, and the technologies that store, process, analyze, and warehouse such data and extract intelligence from it are the new challenge on the horizon of the IT industry. A few front-line IT majors have tsunami-sized volumes of data generated every day thanks to the popularity of social media, and organizations such as Google, Yahoo, and Facebook have already started wrestling with these challenges. The benefits of processing unstructured data and driving your business on the extracted intelligence are clear from the example of these companies: in less than a decade, their advertising revenues have grown to billions and are still skyrocketing. Every standard business house has lots of unstructured data, such as emails, discussion forums, corporate blogs, and recorded chat conversations with clients. Organizations generally aspire to build a knowledge base covering all their business functions, but when they start seeking consulting on a strategy to implement it, they find themselves in a whirlpool of processes and financial burdens. Unstructured data has the potential to feed such a knowledge base.
The question in your mind would still be: what has this to do with SSIS and Azure? The regular applications of SSIS are known to everyone, and professionals would also know that SSIS is not supported on the Azure cloud platform, which might be your motivation for reading this post, since the title suggests that it might be supported now. SSIS is an in-memory processing architecture, and implementing it on shared or dedicated cloud environments has its own challenges. I am sure the SSIS team is on its way to bringing it to the cloud in the time to come. But my interest is in the future application of SSIS to unstructured data, and in hosting it on the Azure cloud platform.
Once SSIS gains such capability, applications like Extractiv would become common across enterprises. Building an application like Extractiv with SSIS today would make for a crippled solution design, as SSIS is not inherently integrated with the cloud. One day I would like to see SSIS packages executed as a crawler service in SharePoint, crawling the entire site and the data on SharePoint portals as FAST Server does, extracting entities from unstructured data, and populating the next generation of warehouse on the Azure cloud platform, which would then be queried using technologies like LINQ for HPC. For me, applications like Extractiv are fascinating, as they inspire ideas that are windows to opportunities most people are not able to envision right now.
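To make "extracting entities from unstructured data" a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not how Extractiv or SSIS works: the regex patterns, the function names (`extract_entities`, `build_fact_rows`), and the row layout are all hypothetical, and a real extractor would rely on proper NLP rather than regular expressions.

```python
import re

# Hypothetical entity patterns (assumption for illustration only);
# a real entity extractor would use NLP models, not regexes.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "money": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?"),
}


def extract_entities(doc_id, text):
    """Return (doc_id, entity_type, value) rows for one unstructured document."""
    rows = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            rows.append((doc_id, entity_type, match.group()))
    return rows


def build_fact_rows(documents):
    """Flatten a {doc_id: text} mapping into rows ready to load into a table."""
    rows = []
    for doc_id, text in documents.items():
        rows.extend(extract_entities(doc_id, text))
    return rows


if __name__ == "__main__":
    docs = {"mail-1": "Contact jane@example.com about the $1,200 invoice."}
    for row in build_fact_rows(docs):
        print(row)
```

The point of the sketch is the shape of the output: free text goes in, and structured rows come out that an ETL step could load into a warehouse table.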