Thursday, June 16, 2011

data consolidation (ETL) and data federation (EII)

Operational IT systems support business operations: they capture, validate, store and present transactional data during the normal running of the business. They hold the latest view of the organization's operational state.

Traditionally, the data from various operational systems is extracted, transformed and loaded into a central warehouse for historical trending and analytic purposes. This ETL process needs separate IT infrastructure to hold the data, and it introduces a time lag before information in the OLTP systems becomes available in the central data warehouse.
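To make the consolidation flow concrete, here is a minimal sketch in Python. It assumes two in-memory SQLite databases standing in for operational systems and a third standing in for the warehouse; the table names (orders, fact_orders), the source names (erp, webshop) and the helper functions are all made up for illustration and do not come from any particular ETL tool.

```python
import sqlite3

def make_source(rows):
    """Create an in-memory stand-in for an operational (OLTP) database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, amount_cents INTEGER, region TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn

def run_etl(sources, warehouse):
    """Extract rows from each source, normalise them, and load them centrally."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(source TEXT, order_id INTEGER, amount_cents INTEGER, region TEXT)"
    )
    for name, conn in sources.items():
        # Extract: read the current contents of the operational table.
        rows = conn.execute(
            "SELECT order_id, amount_cents, region FROM orders"
        ).fetchall()
        # Transform: tag each row with its source and normalise region codes.
        cleaned = [(name, oid, amt, region.strip().upper()) for oid, amt, region in rows]
        # Load: append the cleaned rows to the warehouse table.
        warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", cleaned)
    warehouse.commit()

# Hypothetical operational systems with slightly inconsistent region codes.
sources = {
    "erp": make_source([(1, 1000, "emea"), (2, 2500, "apac ")]),
    "webshop": make_source([(1, 700, "EMEA")]),
}
warehouse = sqlite3.connect(":memory:")
run_etl(sources, warehouse)

# A row written to a source after the batch ran is not in the warehouse
# until the next ETL run; this is the time lag described above.
sources["erp"].execute("INSERT INTO orders VALUES (3, 400, 'emea')")

totals = dict(
    warehouse.execute(
        "SELECT region, SUM(amount_cents) FROM fact_orders GROUP BY region"
    ).fetchall()
)
```

The warehouse answers analytical queries without touching the sources, but only as of the last load, and it needs its own storage for the copied rows.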

When the cost and resources required to consolidate data in the traditional way are not justified, for example after a series of acquisitions, a different mechanism of data integration is needed. An alternative is to provide a semantic layer that can be used to access data across heterogeneous sources for analytical purposes, without first copying it into a central store. This approach is called "Data Federation", "Data Virtualization" or EII (Enterprise Information Integration).
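The idea of a semantic layer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the FederatedOrders class, the two company schemas and the per-source mapping queries are hypothetical, and real federation products add query optimization, security and much more.

```python
import sqlite3

def make_source(ddl, insert_sql, rows):
    """Create an in-memory stand-in for one company's existing database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn

class FederatedOrders:
    """Hypothetical semantic layer: one logical 'orders' view over many sources."""

    def __init__(self):
        self._sources = []  # list of (name, connection, mapping query)

    def register(self, name, conn, query):
        # The query maps the source's own schema to (order_id, amount_cents).
        self._sources.append((name, conn, query))

    def rows(self):
        # Nothing is copied ahead of time; every call reads the live sources.
        for name, conn, query in self._sources:
            for order_id, amount_cents in conn.execute(query):
                yield name, order_id, amount_cents

    def total_cents(self):
        return sum(amount for _, _, amount in self.rows())

# Two sources with different schemas, as might exist after an acquisition.
company_a = make_source(
    "CREATE TABLE orders (order_id INTEGER, amount_cents INTEGER)",
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 1000), (2, 2500)],
)
company_b = make_source(
    "CREATE TABLE sales (id INTEGER, amount_dollars INTEGER)",
    "INSERT INTO sales VALUES (?, ?)",
    [(10, 7)],
)

federated = FederatedOrders()
federated.register("company_a", company_a, "SELECT order_id, amount_cents FROM orders")
federated.register("company_b", company_b, "SELECT id, amount_dollars * 100 FROM sales")

total_before = federated.total_cents()
company_b.execute("INSERT INTO sales VALUES (11, 3)")  # new operational row
total_after = federated.total_cents()  # visible immediately, no reload step
```

Unlike a warehouse load, the new row shows up in the very next query, but each query now costs a round trip to every registered source.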

The key advantages of EII are quick delivery and lower cost. The key disadvantages are query performance and dependence on the source systems: every query runs against the live sources, so it is only as fast and as available as they are.

In my view, a good use case for data virtualization is consolidating different enterprise data warehouses after mergers and acquisitions.

Traditional ETL and data warehouse technology vendors are coming up with data federation tools. Informatica Data Services uses a consolidated data integration philosophy, whereas Business Objects Data Federator uses virtual tables in the BO universes to provide the same functionality. Composite Integration Server is an independent technology provider in this area.

Key considerations in selecting data federation and associated technologies are:
1. native access to the heterogeneous source systems
2. capabilities of access method optimization
3. caching capabilities of the federation platform
4. metadata discovery capabilities from various sources
5. ease of development
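Of the considerations above, caching is the easiest to illustrate in code. The sketch below is an assumption about how a federation layer might cache source results with a time-to-live; the CachedSource class and its parameters are hypothetical and are not taken from any vendor's product. The clock is injectable so the behaviour can be shown without waiting.

```python
import time

class CachedSource:
    """Hypothetical wrapper that reuses a source result for ttl_seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # callable that queries the source system
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None      # None means nothing is cached yet
        self.source_calls = 0        # how often the source was actually hit

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache miss or expired entry: go back to the source system.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
            self.source_calls += 1
        return self._value

# A fake clock and a list standing in for a slow operational source.
fake_now = [0.0]
source_rows = [1, 2, 3]
cached = CachedSource(lambda: sum(source_rows), ttl_seconds=60, clock=lambda: fake_now[0])

first = cached.get()       # first call reads the source
source_rows.append(4)      # the source changes
second = cached.get()      # served from cache, so the change is not seen yet
fake_now[0] = 61.0         # move the fake clock past the time-to-live
third = cached.get()       # entry expired, so the source is read again
```

The trade-off is the same one that separates federation from consolidation: a longer time-to-live takes load off the source systems but brings back some of the staleness that federation was meant to avoid.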

A carefully chosen hybrid approach of consolidation and federation of data is required for a successful enterprise in the modern world.