Case Study Slides
Problem Statement
1. Multiple source systems holding different kinds of data from different lines of business
2. Data extraction from multiple source systems with multiple (23) branch database schemas
3. Data for 23 branches, residing in 23 different systems, had to be integrated and loaded into the Enterprise Data Warehouse
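The integration problem above can be illustrated with a minimal sketch. It assumes each branch extract arrives as a list of row dictionaries and that a per-branch column mapping to a shared warehouse schema has been agreed. The branch names, column names, and the functions `to_warehouse_rows` and `integrate` are hypothetical and are not taken from the engagement.

```python
# Hypothetical per-branch mappings: source column name -> warehouse column name.
# In the case study there would be one entry per branch schema (23 in total).
BRANCH_MAPPINGS = {
    "branch_01": {"cust_no": "customer_id", "amt": "amount"},
    "branch_02": {"CustomerID": "customer_id", "TxnAmount": "amount"},
}


def to_warehouse_rows(branch_id, rows):
    """Rename one branch's columns to the shared schema and tag each row with its branch."""
    mapping = BRANCH_MAPPINGS[branch_id]
    converted = []
    for row in rows:
        # Keep only mapped columns; unmapped source columns are dropped.
        target = {mapping[col]: value for col, value in row.items() if col in mapping}
        target["branch_id"] = branch_id
        converted.append(target)
    return converted


def integrate(extracts):
    """Combine the extracts of all branches into one list of warehouse rows."""
    combined = []
    for branch_id, rows in extracts.items():
        combined.extend(to_warehouse_rows(branch_id, rows))
    return combined
```

Keeping the mappings as data rather than code means a further branch schema could be added without changing the load logic, which is one plausible way to handle many similar but non-identical sources.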
Infosys Understanding
• Infosys understood that a significant amount of data modeling work had already been done by the client team. It was our
understanding that the client had already put logical and physical data models in place for the project
• Infosys understood that the data in the central repository provided a single “source of truth” for operational reporting
requirements and that users could access information on demand
• Infosys proposed to execute the project on a Time & Material basis, with an elapsed
timeframe of sixteen weeks. This was an initial estimate based on our understanding of the scope at the time, and the
effort estimate could have varied for the following reasons:
  • Possibility of data model changes and additional scalable data
  • Data sources could be different or more than anticipated
  • Extensive data quality & data cleansing mechanism (as per the standards laid down by the data architecture group)
  • Complexity in ETL design
Infosys Approach
Phases: Solution Definition Phase, Execution Phase
Architecture
Project Scope
• ETL Solution Definition
  • High-level source system study
  • Review and evaluate existing data model
  • ETL requirements gathering & analysis
• Design
  • Design high-level ETL strategy
  • Validate source-target mapping
  • Specifications for transformation rules
  • Define the loading processes
• Execution
  • Development of extraction and loading processes
  • Testing for data quality issues
  • Testing for performance issues
  • Design of test cases and test plans
  • UAT support
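
Two of the scope items above, validating the source-target mapping and testing for data quality issues, can be sketched as simple checks. This is only an illustration under stated assumptions: mappings are plain dictionaries, rows are dictionaries, and the functions `validate_mapping` and `find_quality_issues` are hypothetical names, not part of the delivered solution.

```python
def validate_mapping(mapping, source_columns, target_columns):
    """Return a list of problems found in a source -> target column mapping."""
    problems = []
    for src, tgt in mapping.items():
        if src not in source_columns:
            problems.append(f"source column missing: {src}")
        if tgt not in target_columns:
            problems.append(f"target column missing: {tgt}")
    # Every target column should be fed by at least one source column.
    mapped_targets = set(mapping.values())
    for tgt in target_columns:
        if tgt not in mapped_targets:
            problems.append(f"target column unmapped: {tgt}")
    return problems


def find_quality_issues(rows, key, required):
    """Report null required fields and duplicate keys as (row_index, message) pairs."""
    issues = []
    seen_keys = set()
    for index, row in enumerate(rows):
        for col in required:
            if row.get(col) is None:
                issues.append((index, f"null {col}"))
        value = row.get(key)
        if value is not None and value in seen_keys:
            issues.append((index, f"duplicate {key}"))
        seen_keys.add(value)
    return issues
```

Checks of this kind could run once per branch before loading, so that mapping gaps and dirty rows surface during testing rather than in the warehouse.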