Day 2 (1) .1.2 DataStage Projects Life Cycle

Uploaded by

Rahul Verma
Copyright
© Attribution Non-Commercial (BY-NC)

DataStage Projects – Life Cycle Stages


Agenda

 Introduction

 Requirements

 Design

 Build

 Testing

 Implementation

 Support

© 2002. Infosys Technologies Ltd. 2


Introduction
DataStage projects follow the same life cycle stages as other projects.

A typical DataStage project passes through the following life cycle phases:

Requirements → Design → Build → Test → Implement → Support



Requirements


 The warehouse needs to cater to a wide range of user analytics, so requirements should be well documented, detailed and tightly scoped.
 Clearly identify the interface points and define the communication protocol

 User views need to be modeled and aligned more closely to meet business needs

 Identify the dependencies between all aspects of the project like ETL feeds, User
Views etc. to facilitate better control over project execution
 Performance related requirements need to be identified and documented.

 Source data analysis needs to be done to understand the type of data that needs to be processed.
 A detailed analysis/high-level design phase is required to drill down into the requirements.
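
The dependency bullet above can be made concrete by recording which feeds and views depend on which, and deriving a safe execution order from that record. The following is a minimal sketch in Python and is not part of DataStage. The feed and view names are hypothetical, and it assumes Python 3.9 or later for the standard `graphlib` module.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency map: each key depends on the items in its set.
DEPENDENCIES = {
    "stg_customer": set(),
    "stg_sales": set(),
    "dw_customer_dim": {"stg_customer"},
    "dw_sales_fact": {"stg_sales", "dw_customer_dim"},
    "vw_sales_by_customer": {"dw_sales_fact"},
}


def execution_order(dependencies):
    """Return one valid order in which every item runs after its prerequisites."""
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # The second element of args holds the nodes that form the cycle.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc


order = execution_order(DEPENDENCIES)
print(order)
```

A circular dependency (for example, a view that feeds one of its own sources) surfaces as a `ValueError` during planning rather than as a stalled schedule during execution.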



Steps to effective Requirement gathering


 Identify the source system tables required.


 Identify the data flow
 Identify the data process.
 Identify Views to be created, Reports to be generated etc.
 Create Requirement Traceability and Test Matrix
 State the assumptions clearly.
 Define implementation Considerations.
 Document Design Solution.
 Identify Transformations -- Define data mapping.
 Gather Volumetrics
 Start Data Analysis.
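
One possible shape for the requirement traceability and test matrix named above is a mapping from each requirement to the test cases that cover it. The sketch below is illustrative only: the requirement IDs, descriptions and test case IDs are hypothetical, and a real project might keep the matrix in a spreadsheet or a test management tool instead.

```python
# Hypothetical requirement IDs and test cases, for illustration only.
REQUIREMENTS = {
    "REQ-01": "Load daily sales feed",
    "REQ-02": "Reject records with missing keys",
    "REQ-03": "Publish sales summary view",
}

TEST_CASES = {
    "TC-01": ["REQ-01"],
    "TC-02": ["REQ-01", "REQ-02"],
}


def build_traceability(requirements, test_cases):
    """Map every requirement ID to the sorted list of tests that cover it."""
    matrix = {req_id: [] for req_id in requirements}
    for test_id, covered in test_cases.items():
        for req_id in covered:
            if req_id not in matrix:
                raise KeyError(f"{test_id} references unknown requirement {req_id}")
            matrix[req_id].append(test_id)
    return {req_id: sorted(tests) for req_id, tests in matrix.items()}


def uncovered(matrix):
    """Return requirement IDs that no test case covers yet."""
    return [req_id for req_id, tests in matrix.items() if not tests]


matrix = build_traceability(REQUIREMENTS, TEST_CASES)
for req_id, tests in matrix.items():
    print(req_id, tests or "NOT COVERED")
```

Listing the uncovered requirements early gives the test planning step in the design phase a concrete starting point.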



Design


 A fluid data model will result in a lot of rework. Each change might be small, but it may be required in multiple places, which increases the volume of rework.
 A changing data model makes metadata management difficult, and metadata management is critical for an enterprise data warehouse. Metadata needs to be extracted and loaded into DataStage every time there is a change, and this process needs significant lead time.
 The design should be robust and accommodate process health features such as auditing, ACR balancing, error processing and reprocessing, restartability, recovery etc.
 Perform POC on critical requirements and identify performance bottlenecks upfront

 ACR checkpoints in the data flow will help in identifying the data problems early in the process
before data is loaded to warehouse.
 Design patterns should be reusable across projects to reduce development time

 Brainstorm the various aspects of the framework, then finalize it and bring clarity.

 A flexible framework design which takes care of recovery in case of a downtime is very critical from
application support perspective.
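
The ACR checkpoint idea above can be sketched as a comparison of control totals on both sides of a processing step. The slides do not expand "ACR", so this sketch assumes that balancing means reconciling record counts and an amount total between a source and a staged data set. The field name `amount` and the checkpoint name are hypothetical.

```python
from decimal import Decimal


def control_totals(rows, amount_field):
    """Return (record count, sum of amount_field) for an iterable of dict rows."""
    count = 0
    total = Decimal("0")
    for row in rows:
        count += 1
        total += Decimal(str(row[amount_field]))
    return count, total


def balance_checkpoint(name, source_rows, staged_rows, amount_field="amount"):
    """Compare control totals on both sides of a step; raise if they differ."""
    src = control_totals(source_rows, amount_field)
    tgt = control_totals(staged_rows, amount_field)
    if src != tgt:
        raise ValueError(
            f"checkpoint {name!r} out of balance: source={src}, staged={tgt}"
        )
    return src


source = [{"id": 1, "amount": "10.50"}, {"id": 2, "amount": "4.25"}]
staged = list(source)
print(balance_checkpoint("after_extract", source, staged))
```

Raising at the checkpoint stops the flow before an out-of-balance batch reaches the warehouse, which is the early detection the bullet describes. `Decimal` is used so that monetary totals compare exactly.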



Steps to a Good Design


 Re-validate Data Mapping.

 Define General programming specifications.

 Define Development objects.

 Define miscellaneous processes such as error processing, re-processing, ACR balancing, auditing etc.
 Create a POC for all the critical or complicated points, and make it end to end so that there are no surprises during build.
 Identify common functionality, jobs, scripts etc., keeping reusability in mind.

 Prepare Test plans, map them to requirements.

 Define Programming standards, directory structure.

 Explore different options/possibilities for Data Extraction
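
The error processing and re-processing step listed above can be illustrated with a small reject-handling sketch. It is written in Python for readability and is not how a DataStage job is built; the field names and validation rules are hypothetical.

```python
def validate(record):
    """Return a list of reasons the record is invalid (empty list means valid)."""
    reasons = []
    if not record.get("customer_id"):
        reasons.append("missing customer_id")
    try:
        float(record.get("amount", ""))
    except (TypeError, ValueError):
        reasons.append("amount is not numeric")
    return reasons


def split_records(records):
    """Separate records into (accepted, rejected); rejects carry their reasons."""
    accepted, rejected = [], []
    for record in records:
        reasons = validate(record)
        if reasons:
            rejected.append({"record": record, "reasons": reasons})
        else:
            accepted.append(record)
    return accepted, rejected


def reprocess(rejected, fix):
    """Apply a correction function to each reject and validate it again."""
    corrected = [fix(item["record"]) for item in rejected]
    return split_records(corrected)


batch = [
    {"customer_id": "C1", "amount": "12.0"},
    {"customer_id": "", "amount": "x"},
]
accepted, rejected = split_records(batch)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

Keeping the reject reasons alongside each record means a later re-processing run can correct the data and pass it through the same validation without special handling.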



Build


 Several stages can perform the same or a similar function. Choosing the right stage and configuration is key to developing a quality solution.
 Encryption routines that use the OpenSSL library (AES encryption/decryption, SHA-1 hashing etc.) should be implemented at the start of the phase.
 Metadata is a key aspect of a successful data warehouse implementation. Standards
need to be clearly defined and followed
 Accessing DataStage over a Citrix server has improved productivity to a large extent. It has also given the flexibility to try out multiple options and provide the best solution. Hence a Citrix server should be used for accessing DataStage.
 Knowledge management practices capture and disseminate information. A repository of knowledge articles, learnings and checklists should be built from experience.
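
To give a feel for the SHA-1 hashing mentioned above, here is a minimal sketch that uses Python's standard `hashlib` module as a stand-in. It is not the OpenSSL-based routine the slides refer to, and it omits AES because the Python standard library has no AES implementation. The helper names and the column names in the usage are hypothetical.

```python
import hashlib


def sha1_hex(value, encoding="utf-8"):
    """Return the SHA-1 digest of a string as 40 lowercase hex characters."""
    return hashlib.sha1(value.encode(encoding)).hexdigest()


def mask_columns(row, sensitive):
    """Return a copy of row with the named columns replaced by their SHA-1 digest."""
    return {
        key: sha1_hex(str(val)) if key in sensitive else val
        for key, val in row.items()
    }


print(sha1_hex("abc"))  # a9993e364706816aba3e25717850c26c9cd0d89d
print(mask_columns({"card_no": "abc", "store": 12}, {"card_no"}))
```

Hashing is one-way, so it suits columns that only need to be matched or compared; columns that must be read back later need reversible encryption such as AES.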



Tips to Efficient Build


 Categorize similar jobs.

 Define framework for each category.

 Define framework for each process (like error processing, record processing,

 Finalize job parameters.

 Build re-usable components, frameworks and custom stages.

 Prepare necessary check list for Build.

 Get Metadata ready.

 Build DataStage jobs.

 Perform Usage Analysis for Metadata Compliance.

 Finalize sequencing and scheduling (either Control-M or Sequencer).
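
Once job parameters are finalized, a scheduler script typically passes them to the job at run time. The sketch below only assembles a command line and executes nothing. It assumes the commonly documented `dsjob -run` syntax (`-param name=value`, `-jobstatus`, then project and job), which should be verified against the installed DataStage version. The project, job and parameter names are hypothetical.

```python
def check_parameters(params, required):
    """Return the names of required parameters that are missing or empty."""
    return sorted(name for name in required if params.get(name) in (None, ""))


def build_dsjob_command(project, job, params, wait_for_status=True):
    """Build the argument list for a dsjob run; nothing is executed here."""
    command = ["dsjob", "-run"]
    for name in sorted(params):
        command += ["-param", f"{name}={params[name]}"]
    if wait_for_status:
        # Assumed behaviour: wait for the job and reflect its status in the exit code.
        command.append("-jobstatus")
    command += [project, job]
    return command


params = {"SRC_DIR": "/data/in", "RUN_DATE": "2002-01-31"}
missing = check_parameters(params, ["SRC_DIR", "RUN_DATE", "TGT_SCHEMA"])
print("missing parameters:", missing)
print(" ".join(build_dsjob_command("DW_PROJECT", "LoadSalesFact", params)))
```

Checking for missing parameters before the command is built keeps a scheduling mistake from turning into a failed job run.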



Testing – System/ Volume/ Performance/ Integration/ Acceptance


 Experience in handling large volumes of data in multiple projects, including the huge
CSPAM volumes from Target Stores
 Broader understanding and good experience from innumerable challenges that we
have overcome across projects and environments, old as well as new.
 Understanding the role of the various teams involved. Ability to
partner/coordinate/collaborate with multiple teams.
 Testing DataStage jobs requires a considerable amount of time. Adequate testing time should be planned.
 Preparing a good test data bed is often complex and difficult. Plan well in advance.
 Plan to have enough database capacity and test schemas to have a smooth testing
phase.
 Learnings from the Target DataStage/UDB environment are critical to a successful testing phase.



Testing – System/ Volume/ Performance/ Integration/ Acceptance


 Obtain/ prepare source data matching all the scenarios.


 Try to obtain production source data if possible for testing.
 If possible, run several rounds of testing.
 Perform Unit testing
 Test for negative cases too.
 If there are changes, do regression testing.
 Ensure the test configuration is similar to the production environment.
 Identify system-related issues and include them in system testing. Configure and use schedulers.
 Perform volume testing with a variety of data. Use source data from production if available; otherwise, generate data using tools.
 Identify all other external components and include them for Integration testing.
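
When production data is not available, volume test data can be generated, as the steps above suggest. The following is a minimal sketch of such a generator that also injects negative cases; the column names, value ranges and the 5% bad-record ratio are hypothetical choices, not figures from the slides.

```python
import csv
import io
import random


def generate_rows(count, bad_ratio=0.05, seed=42):
    """Yield synthetic sales rows; roughly bad_ratio of them are deliberately invalid."""
    rng = random.Random(seed)  # fixed seed keeps test runs repeatable
    for i in range(1, count + 1):
        row = {
            "sale_id": i,
            "customer_id": f"C{rng.randint(1, 1000):04d}",
            "amount": f"{rng.uniform(1, 500):.2f}",
        }
        if rng.random() < bad_ratio:
            # Negative case: blank out the key so the reject path is exercised.
            row["customer_id"] = ""
        yield row


def write_csv(rows, handle):
    """Write rows to an open text handle as CSV with a header; return rows written."""
    writer = csv.DictWriter(handle, fieldnames=["sale_id", "customer_id", "amount"])
    writer.writeheader()
    written = 0
    for row in rows:
        writer.writerow(row)
        written += 1
    return written


buffer = io.StringIO()
print(write_csv(generate_rows(1000), buffer), "rows generated")
```

Because the generator is lazy, the same code can stream millions of rows to a file for volume testing without holding them in memory, and the fixed seed makes a failing run reproducible.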



Implementation


 Plan in advance for the implementation phase. Collaborate with the different stakeholders to successfully implement the various aspects of the application, such as DataStage jobs, the Control-M schedule, Unix scripts, the ACR application etc.
 For a new environment such as grmetlprod01, include a test implementation phase to iron out any environment-related surprises.
 Be aware of the new processes in place for DataStage implementation, such as deployment using WBSD. This helps in resolving problems and reducing delays.
 Maintain a well-developed deployment checklist that can be reused across projects.



Support


 We have supported DataStage applications after implementation and successfully turned over a few applications to ESS. Plan well in advance for the involvement of TOC in the application turnover. A comprehensive knowledge article listing all the issues faced, and their resolutions, from UAT through the support phase is also critical for the support team.
 Based on the criticality of the application, a clear escalation procedure/support plan
should be put in place to address environment related issues. This should be planned
in advance with the DataStage support team as well as the DB hosting team.
 The ability and experience to provide 24x7 support is critical for most high-volume data warehouse ETL applications. The Infosys Global Delivery Model is suited to round-the-clock support with onsite and offshore resources.
 Familiarity with all the support/turnover activities and systems such as Remedy is needed to manage the post-implementation turnover effectively.
