Capacity Planning PDD Final


1

Capacity Planning -
Version 8.6

Murat Yesilsirt, Principal Consultant

2
Agenda

• Key Architecture Points


• Environment Sizing vs. Capacity Planning
• Environment Sizing
• Capacity Planning

• Tools
• Sizing Exercise
• Customer Examples

3
Capacity Planning

• Ensures that sufficient capacity is available at all
times to meet business requirements
• Integration capacity is not simply the sum of
capacity needs of each application
• Time dimension - Involves more than
performance of the system’s components,
individually or collectively
• Also deals with resolving incidents and
identifying problems relating to capacity issues

4
Have you ever asked yourself?

• Can I save money with server consolidation?


• Could I move my data faster with an expanded
environment?
• Is my Informatica Server ‘big’ enough?
• How much more data could I move with my existing
system configuration?
• How much faster could I execute my existing loads?
• If I had to add 1 more project – do I have sufficient
capacity?
• How about x more projects?

5
Server Sizing/Capacity Planning Goals

• Meet future requirements


• Meet performance requirements
• Satisfy load window requirements
• Minimize contentions due to lack of resources
• Lower maintenance cost and cost of ownership
• Optimize capital expenditures

6
Key Architecture Points

7
Data Integration Environment:
Key Architecture Points
[Diagram: Sources (source file server and RDBMS, each with CPU/RAM) -> Network -> PowerCenter Server (CPU/RAM) -> Network -> Targets (target file server and RDBMS, each with CPU/RAM)]

8
Informatica Real Time Data Integration
[Diagram labels: PowerCenter Orchestration Engine; Transactional System – Relational Source (Oracle, DB2, etc.); Customer Portal Web Application Server; Integrated Customer Portal Database; Mainframe System; Acquired Mainframe System; Acquired Mid-Range AS/400 System; Exception Management Database; Administration Portal; Data Steward Portal]

9
PowerCenter Data Integration
System Characteristics
• Block processing
• Parallelism – Multiple threads, partitioning
• 64-bit Option and Caching
• Random Reads and Sequential Reads
• Database and File processing
• String Manipulation and Unicode
• Pushdown Optimization
• Shared File System
• Checkpoint Recovery
• Web Services

10
Environment Sizing

11
Environment Sizing vs. Capacity Planning

• Environment Sizing
• New software implementation/install
• Extremely rudimentary models to predict estimated need
• Rarely perfect

• Capacity Planning (Existing Environment)


• Accuracy of exercise is based on statistics from existing
environment
• New projects on existing environment
• Ensure existing projects are not affected
• Load window
• Load times
• Performance
• PowerCenter upgrade on existing environment
• Use of new PowerCenter enhancements for performance gains on
existing hardware after upgrade

12
Environment Sizing

• New Environment
• Conversion of custom code and stored procedures
• Estimation process because there is no existing environment
• Consider various architectural options and how they will affect the
sizing
• GRID/HA
• Windows vs. UNIX/LINUX
• Shared PowerCenter Environment vs. Dedicated Environment
• Virtualization
• Hardware sizing considerations
• CPU
• Memory
• I/O Capacity
• Disk Space
• Network Bandwidth
• Repository Database

13
Environment Sizing Inputs

• Data volumes
• Mapping complexity
• Number of mappings
• Concurrent work load
• Peak work load
• Expected growth

14
Environment Sizing Methodology
• Gather performance requirements (Volume, load window etc.)
• Document assumptions e.g. planned architecture, usage period,
geographical distribution of data & users
• Evaluate alternatives – Commodity hardware vs. High-End SMP
• 75% CPU Utilization or less
• Minimize memory paging
• Consider future growth
• Use proof of concept benchmark testing to validate
• Based on high level estimation factors –
• 2MB per CPU per second avg.
• 2-4 GB Memory per Core
• Cross check with other implementations
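
The high-level estimation factors above can be turned into a rough first-pass calculation. The following is a minimal sketch only, assuming the 2 MB per CPU per second average, the 75% utilization ceiling, and the 2-4 GB per core range from this slide; the function names and the choice of 1 GB = 1024 MB are illustrative assumptions, and this is not the sizing spreadsheet used later in the deck.

```python
import math

def estimate_cores(gb_per_hour, mb_per_cpu_per_sec=2.0, target_utilization=0.75):
    """Rough core count for a sustained data rate (hypothetical helper)."""
    mb_per_sec = gb_per_hour * 1024 / 3600       # assumes 1 GB = 1024 MB
    raw_cores = mb_per_sec / mb_per_cpu_per_sec  # cores needed at 100% busy
    # Leave headroom so planned utilization stays at or below the target.
    return math.ceil(raw_cores / target_utilization)

def memory_range_gb(cores, low_per_core=2, high_per_core=4):
    """Memory range implied by the 2-4 GB per core rule of thumb."""
    return cores * low_per_core, cores * high_per_core

cores = estimate_cores(gb_per_hour=8)
print(cores, memory_range_gb(cores))
```

A result like this is only a starting point; the slide's remaining steps (proof of concept benchmark, cross check with other implementations) still apply.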

15
Environment Sizing Example
Informatica PowerCenter Sizing Questions

Data Volume Rate
Data volumes are a critical aspect as the CPU cycles must be available to handle the
data volume in the appropriate timeframe. Getting a reasonable estimate of the volume
of data to be moved on a nightly/daily basis is the cornerstone of a sizing effort.
Method 1 (Volume based)
• Number of gigabytes per hour: 8
• Number of simultaneous jobs on average: 5
Method 2 (Existing load process)
• How many loads? 10
• How much data is being moved (in GB)? 0.5
• What is your load window in minutes? 720
Method 3 (Expected load process)
• How many target tables do you have to load? 250
• Size of data to load (source data) in GB? 10
• What is the load window in minutes? 480

Aggregates and Sorting
What is the expected use of aggregates/sort? (Enter a "1")
• Low (25% or less): 0
• Medium (25% to 75%): 1
• High (75% or more): 0

Data Volume Growth
• What is the expected yearly data growth (%)? 20%

Operating System
• Unix or NT: Unix
• 64 bit or 32 bit: 64

Continuous and/or Real Time
The assumption is that continuous and/or real-time workloads will require more CPU
and memory. This is because there is less flexibility in workload management. RT
sessions must run, they must run now, and they should not be slowed by other
processes.
What percent of sessions/loads will be real time? (Enter a "1")
• 25% or less: 1
• 25-60%: 0
• 60-90%: 0
• 90%+: 0

Lookup Sizing
Lookups (caching data tables to match values) require additional CPU and
RAM. It is an important factor in sizing the box. Use an educated guess as to
the size of your lookup requirements. If you are loading a warehouse, think in
terms of the size of th [...]
Percent of lookups with > 250k rows? (Enter a "1")
• Low (25% or less): 0
• Medium (25% to 75%): 1
• High (75% or more): 0

Load Window Criticality
How critical is it that the load window is always met? (Enter a "1")
• Not at all important: 0
• Somewhat important: 0
• Very important: 0
• Critical: 1

Application Type(s)
What sort of application(s) will PowerCenter be used for?

Other Considerations
Please include any other considerations you feel are important to the sizing effort. Any environmental information, restrictions,
or needs should be listed here.

16
Capacity Planning

17
Capacity Planning

• Existing Environment
• Measure actual performance in YOUR environment
• Use real world performance information to understand
current unused capacity
• Use linear scalability to predict future needs
• Key review points :
• Current performance
• Data growth projections
• Future integration needs
• Consider Impacts of any technology shift/change
• Web Services
• Grid/HA
• XML Processing etc.
• New server technologies

18
Capacity Planning Methodology
• Gather performance information
• Volume (data/records)
• CPU Usage
• Memory Usage
• Network Usage
• File System Usage
• System Characteristics (CPU speed etc.)
• Document future assumptions e.g. planned architecture, usage period,
geographical distribution of data & users
• Review future growth needs
• Review data growth projections
• Plan for 75% CPU utilization or less
• Determine required capacity
• Update/expand environment as needed
• Use benchmark testing with real production like data

19
Roles

• DBA
• Operations
• Informatica Administrator
• System Administrator
• Network Specialist
• Developer
• Business Analyst

20
Tools

21
Tools

• Monitoring tools to help determine how the
servers are performing
• Reports to provide metrics about how
PowerCenter is being used
• Analysis to find out your current maximum
capacity
• Estimation to determine required capacity for
future growth

22
Tools

• Repository Reports
• Repository Queries
• OPB_SWIDGINST_LOG, OPB_TASK_INST_RUN,
OPB_WFLOW_RUN, OPB_TASK_STATS

• Key Results
• Number of records from SQ per node
• Number of session runs per node per day
• Number of concurrent session runs per node per hour
• CPU/Memory used per session
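
As an illustration of the "concurrent session runs" result, peak concurrency can be derived from session start and end times once they have been exported from the repository. This is a sketch under assumptions: the list of (start, end) pairs is hypothetical sample data, and no repository column names are implied.

```python
from datetime import datetime

def peak_concurrency(runs):
    """Return the largest number of runs active at the same moment.

    runs: iterable of (start, end) datetime pairs (hypothetical export).
    """
    events = []
    for start, end in runs:
        events.append((start, 1))    # a session starts
        events.append((end, -1))     # a session ends
    # At identical timestamps, process ends (-1) before starts (+1).
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak

t = lambda hhmm: datetime.strptime("2007-10-25 " + hhmm, "%Y-%m-%d %H:%M")
sample_runs = [(t("01:00"), t("01:10")),
               (t("01:05"), t("01:20")),
               (t("01:10"), t("01:15"))]
print(peak_concurrency(sample_runs))
```

Grouping the input by node and by hour before calling the function would give the per-node, per-hour figures listed above.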

23
Tools
vmstat
• Reports information about processes, memory, paging, block IO, traps, and cpu activity
• vmstat 5 10 – run with 5 sec delay 10 times
• Processes in the run queue (procs r): procs r consistently greater than the number of CPUs
indicates a CPU bottleneck
• Idle time (cpu id): cpu id consistently at 0 indicates a CPU issue
• Scan rate (sr): sr continuously over 200 pages per second indicates a memory shortage
• Key Results
• Memory usage statistics
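
The vmstat rules of thumb above could be applied to collected samples along these lines. This is a sketch only: it assumes the samples have already been parsed into dictionaries with the keys r, id, and sr (parsing is not shown, and not every platform's vmstat reports sr), and it interprets "consistently" as "true in every sample".

```python
def vmstat_findings(samples, cpu_count, scan_rate_limit=200):
    """Flag conditions that hold in every sample (hypothetical helper)."""
    findings = []
    if not samples:
        return findings
    if all(s["r"] > cpu_count for s in samples):
        findings.append("run queue exceeds CPU count: CPU bottleneck")
    if all(s["id"] == 0 for s in samples):
        findings.append("idle time is 0: CPU issue")
    if all(s["sr"] > scan_rate_limit for s in samples):
        findings.append("scan rate over limit: memory shortage")
    return findings

# Made-up sample values for a 4-CPU server.
samples = [{"r": 6, "id": 0, "sr": 50},
           {"r": 5, "id": 0, "sr": 300}]
for finding in vmstat_findings(samples, cpu_count=4):
    print(finding)
```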
iostat
• Report on CPU, input/output statistics for devices and partitions
• iostat 5 10 – run with 5 sec delay 10 times
• Reads/writes per second (r/s, w/s): consistently high reads/writes indicate disk issues
• Percentage busy (%b): %b > 5 may point to an I/O bottleneck
• Service time (svc_t): svc_t > 30 milliseconds calls for a faster disk/controller
• Key Results
• Disk usage results

24
Tools

netstat
• Displays information about network interfaces on the
system
• Network connections, routing tables, and interface
statistics
ntop
• Shows a list of hosts using the network
• Provides information about traffic generated by each host

25
Tools

sar – System Activity Reporter


• Exists on many UNIX platforms
• Examine live statistics
• sar [options…] t n
• t is number of seconds per sample
• n is number of samples

• Save sar data for later analysis


• sar -o filename t n
• Recall CPU usage: sar -u -f filename
• Recall Disk usage: sar -d -f filename
• You can also specify time windows (-s, -e) and an alternate interval with -i

• Key Results
• Consolidated CPU/Memory/Disk usage statistics

26
Tools

sar – Disk Utilization


• sar -d t n
• Average I/O size in bytes = (blks/s*512 bytes)/(r+w/s)
• % busy is a good indicator of disk bottleneck
• Shows disk devices -- can be tough to trace back to specific logical
volume

vega7077-root-># sar -d 60 1

HP-UX vega7077 B.11.23 U ia64    10/25/07

10:25:24    device   %busy   avque   r+w/s   blks/s   avwait   avserv
10:26:24    c2t6d0    0.65    0.50       1       23     0.00     9.14
           c76t4d3    0.02    0.50       0        0     0.01    10.03
          c140t2d0    3.13    0.50       2      180     0.00    18.21
          c142t2d0    3.88    0.50       2      180     0.00    22.38
          c148t2d0    0.28    0.50       2      180     0.00     1.69
          c150t2d0    0.42    0.50       2      176     0.00     2.37
          c108t2d0    3.03    0.50       2      179     0.00    17.67
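
The average I/O size formula from this slide can be checked against the sample output. A minimal sketch, assuming 512-byte blocks as stated in the formula; the function name is illustrative, and because sar prints r+w/s as a rounded integer the result is approximate.

```python
def avg_io_size_bytes(blks_per_sec, ops_per_sec, block_bytes=512):
    """Average I/O size = (blks/s * 512 bytes) / (r+w/s)."""
    if ops_per_sec == 0:
        return None  # no I/O in the interval, size is undefined
    return blks_per_sec * block_bytes / ops_per_sec

# Values taken from the sar -d sample: device, r+w/s, blks/s
for device, ops, blks in [("c2t6d0", 1, 23), ("c76t4d3", 0, 0), ("c140t2d0", 2, 180)]:
    print(device, avg_io_size_bytes(blks, ops))
```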

27
Tools
sar – CPU utilization
• sar -u t n
• %sys is system/kernel time
• %usr is user space time
• %wio is the percentage of time “waiting on I/O”
• %wio is the best indicator of whether I/O is a bottleneck
• Directly reflects how much performance is lost waiting on I/O
operations

vega7077-root-># sar -u 60 1

HP-UX vega7077 B.11.23 U ia64    10/25/07

10:49:31    %usr    %sys    %wio   %idle
10:50:31       1       5       6      87

28
Tools

top
• Provides a dynamic real-time view of a running
system
• Displays system summary information as well as
a list of tasks currently being managed
• Useful for shared environments to identify each
application process and their CPU/memory
consumption

29
Tools
Windows perfmon

30
Example

31
Capacity Planning Example

• Before upgrade to PowerCenter 8.6, planning for the new
environment is initiated
• Current hardware on Unix
• Business activity is expected to increase 20% annually
• Two new Business Units are expected to use Informatica
platform
• Explore PowerCenter 8.6 performance enhancements

32
Capacity Planning Example
• Peak Load Time – 1am to 1:35 am
• Number of Sessions – 45
• Most concurrent sessions – 15
• Total Data Processed – 10 GB
• Primarily flat file to DBMS and DBMS to DBMS data load
• Server is 4 CPU with 16gb of RAM
• Most sessions include lookups, but with fairly reasonable
cache size (ie. no 8gb customer master)
• Total Load Window requirement is 2 hrs (done by 3am)

33
Capacity Planning Steps
• Using repository reports
establish a timeline for loads
• Daily
• Weekly
• Monthly
• Determine the complexity of
mappings
• High: Multiple sources or
targets, 5 or more lookups,
complex logic
• Medium: Multiple sources
or targets, 2-5 lookups or
an aggregator, full update
strategy
• Low: Straight Thru Mapping
less than 3 lookups

[Chart: load timeline from 1:00 AM to 3:00 AM in 10-minute steps, with phases labeled Extract Files, Validations, Dimension Loads, Validations, Daily Fact Loads, Validations, and Extract + Audit Completed]

34
Capacity Planning Steps
• Link the results of system metrics to the load timeline
• Identify the peaks in CPU/memory/disk utilizations
Time    CPU 1   CPU 2   CPU 3   CPU 4   Avg    RAM    I/O
1:01    95%     90%     85%     25%     74%    90%    Ok
1:11    90%     90%     65%     3%      62%    35%    Good
1:21    90%     50%     10%     3%      38%    50%    Good
1:31    75%     25%     3%      3%      25%    25%    Good
Avg     87%     64%     41%     9%      50%    50%    Good

Data     Seconds   Data/Sec   Data/CPU/Sec   Max Expected
10 GB    2100      4.8 MB     1.2 MB         2.4 MB/CPU
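
The throughput row above can be reproduced approximately from the example's inputs (10 GB in 35 minutes on 4 CPUs at roughly 50% average utilization). This is a sketch under assumptions: 1 GB is taken as 1024 MB, and "Max Expected" is interpreted here as the per-CPU rate scaled linearly to 100% utilization, which the slide does not state explicitly.

```python
def throughput_summary(data_gb, seconds, cpus, avg_cpu_utilization):
    """Observed and linearly extrapolated throughput (hypothetical helper)."""
    mb_per_sec = data_gb * 1024 / seconds       # assumes 1 GB = 1024 MB
    mb_per_cpu_per_sec = mb_per_sec / cpus
    # Linear scalability assumption: same efficiency at full utilization.
    max_expected = mb_per_cpu_per_sec / avg_cpu_utilization
    return mb_per_sec, mb_per_cpu_per_sec, max_expected

total, per_cpu, max_per_cpu = throughput_summary(10, 35 * 60, 4, 0.50)
print(f"{total:.1f} MB/s, {per_cpu:.1f} MB/CPU/s, {max_per_cpu:.1f} MB/CPU/s max")
```

The computed values come out close to the 4.8, 1.2, and 2.4 shown in the table; small differences depend on rounding and on how a gigabyte is counted.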

35
Capacity Planning Steps

• Review bottlenecks to reveal areas of improvement with
additional CPU/memory/disks
• This may also result in code fixes, but performance tuning is
only a short term fix
• Value of new Informatica features e.g. using OS profiles for
more granular information and process ownership
• Consider architectural changes in the new environment such as
Enterprise Grid Option
• Start making projections based on the input available
• My current peak CPU/memory utilization is at 50% and I am expecting
20% growth per group and 2 new groups will join
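
A projection of this kind might be sketched as follows. The slide gives the 50% current peak, the 20% growth, and the two new groups, but not how much load each new group adds; the 10% per new group used here is a purely hypothetical placeholder, as is the simple additive model.

```python
def projected_utilization(current, growth_rate, new_groups, load_per_new_group):
    """Projected peak utilization as a fraction (illustrative additive model)."""
    grown_existing = current * (1 + growth_rate)   # existing groups after growth
    added = new_groups * load_per_new_group        # hypothetical load per new group
    return grown_existing + added

# 0.10 per new group is an assumption, not a figure from the example.
projected = projected_utilization(current=0.50, growth_rate=0.20,
                                  new_groups=2, load_per_new_group=0.10)
print(f"projected peak: {projected:.0%}, over 75% target: {projected > 0.75}")
```

In practice the per-group load would come from measurements of comparable existing workloads instead of a guess.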

36
Questions for the Example :
• Do you need more CPU?
• Do you need more RAM?
• How much more expected capacity do you have without
extending the current load window?
• How much more capacity do you have until you no longer
meet load window?
• What could you do to ‘free up’ more capacity?

37
Pitfalls and Common Mistakes
• Apples to Apples
• “I talked to <customer> at the user group and they are moving 1,000 rows a second –
why aren’t we experiencing the same?”
• “I read an Informatica benchmark and they moved a terabyte in 38 min, which showed
4mb a second per processor – mine should be the same performance right?”

• Growth Projections
• “Every day we process 100,000 records that equal 5mb of data thus our warehouse is
increasing by 5mb a day.”
• “Every year our warehouse grows by 25% so our daily capacity must be growing by
25%.”

• Adding Horsepower
• “If I add more CPU and RAM my loads will be faster.”
• “My hardware vendor promised their new CPU’s are 2x faster so my load should finish
in ½ the time.”

• Root Cause
• “My performance is poor, it must be the Informatica Platform.”
• “I’m seeing very low rows per second processed, I must have a slow server”

38
Capacity Planning Results

• Better to start low, observe the adoption rates and usage
and then adjust upward as necessary
• Vertical – Expandable servers
• Horizontal – Grid Architecture

• Start with adding CPU and memory to existing server


• Then increase number of servers with Grid Architecture
• Allocate abundant storage for infa_shared directory
• Use flexible storage architecture e.g. start with 4 stripes
over 4 LUN’s, then grow to 4 stripes over 8 LUN’s to
expand from 100 GB to 200 GB

39
Customer Examples

40
Customer Example 1
Scenario
• Planning for release to production for PowerExchange CDC
• First a benchmark test was conducted with a subset of the data
• Projected data volumes were used for the estimation
• Assumptions were documented: Projected data volumes and benchmark
results
• The disk space used for the file system during the benchmark test was
recorded
Recommendations
• For actual data volumes, session logs are expected to use about 13 GB
daily
• There will be a process to purge log files older than two days
• Based on this, 26 GB will be allocated for the Session Log directory
• Based on the number of lookups, the sizes of the lookup tables, and
concurrent sessions, 20 GB should be allocated for the Cache directory
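
The session log allocation above follows from daily usage multiplied by the retention period. A minimal sketch of that arithmetic, with an illustrative function name:

```python
def log_directory_gb(daily_log_gb, retention_days):
    """Space needed to hold logs until the purge process removes them."""
    return daily_log_gb * retention_days

# 13 GB of session logs per day, purged after two days.
print(log_directory_gb(13, 2))
```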

41
Customer Example 2
Scenario
• Provide capacity planning assistance for upgrade and server purchase
Some key questions
What is the total volume of data to move?
• Current task – Data volume is less than 2 GB per month.
What is the largest table (bytes and rows)? Is there any key on this table that could
be used to partition load sessions, if needed?
• Existing task – Largest table has 6 M rows with an average record length of 1000 bytes
What is the batch window available for the load?
• Existing task – Batch window is around 6 hours.
• Future task – 3 hours on weekdays, 10 hours on weekends
What is the expected growth?
• The percentage of data volume growth has been projected to be 25% each year.
• Currently there are 50 interfaces loading an approximate average of 200 MB of data each
Recommendation
• Compute a “base size” using the key driving factors for CPU and RAM. Then, adjust this base size
according to some key attributes of the job load
• The key driving factors for calculating the base CPU size are “cpu mb per sec” (data rate) and “cpu per
session” (job load)
• Data Rate: CPU mb per sec = cpu mb per sec factor * number of GB/hour
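
The growth answers above can feed a simple volume projection. This sketch assumes the 25% yearly growth compounds and takes the 50 interfaces at 200 MB each as the starting volume; the function name is illustrative, and the "cpu mb per sec factor" from the recommendation is not modeled because its value is not given.

```python
def projected_volume_mb(current_mb, yearly_growth, years):
    """Data volume after compounding yearly growth (assumed model)."""
    return current_mb * (1 + yearly_growth) ** years

current_mb = 50 * 200   # 50 interfaces, about 200 MB each
for year in range(4):
    print(year, projected_volume_mb(current_mb, 0.25, year))
```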

42
Customer Example 3
Scenario
• Upgrade, Server Consolidation, and ICC Organization

Recommendation
Informatica PowerCenter Sizing Results

Component                     Details                        RAM Factor   Initial CPU   Adjusted CPU
Data Volume Rate Method 1     Using CPU/MB sec factor        4.9          2.9           4.9
Data Volume Rate Method 2     Using CPU/MB sec factor        4.9          2.9           4.9
Data Volume Rate Method 3     Using CPU/MB sec factor        4.9          2.9           4.9
Base Size                                                    4.9          2.9           4.9
Continuous And/Or Real Time   Ranges (0,40%,60%,100%)        0%           0%            0%
Load Window Criticality       Ranges (-30%,0,50%,75%)        0%           0%            0%
Aggregates and Sorting                                       0%           0%            0%
Data Volume Growth                                           0%           0%            0%
Operating System              Unix vs NT and 32 vs 64 bit    100%         -25%          -25%
Lookup Sizing                 Ranges (Ram = -20%,0,50%)      0%           0%            0%
Application Types + Other     Subjective Factor (in %)
Total Adjustment Factor                                      100%         -25%          -25%

Final Sizing Raw                                             9.8          2.175         3.675
Final Sizing Adjusted                                        10           2             4
Sizing Upper Range                                           15           4             6

Informatica would recommend a PowerCenter server(s) with 4 to 6 CPUs and 10 to 15 GB of RAM.
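
The "Final Sizing Raw" and "Final Sizing Adjusted" rows are consistent with multiplying each base size by one plus the total adjustment factor and then rounding. The sketch below reproduces those two rows under that assumption; it does not attempt the "Sizing Upper Range" row, whose derivation is not shown in the table.

```python
def apply_adjustment(base_size, total_adjustment):
    """Raw sizing = base size * (1 + total adjustment factor)."""
    return base_size * (1 + total_adjustment)

# (base size, total adjustment factor) per column, read from the table.
columns = {
    "RAM Factor":   (4.9, 1.00),
    "Initial CPU":  (2.9, -0.25),
    "Adjusted CPU": (4.9, -0.25),
}
raw = {name: apply_adjustment(base, adj) for name, (base, adj) in columns.items()}
adjusted = {name: round(value) for name, value in raw.items()}
print(raw)
print(adjusted)
```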

43
Customer Example 4
Scenario
• Upgrade, High Business Growth, and End of Life for Servers

Recommendation
4 Nodes with 2 Dual Core CPU and 32 GB Memory each

44
Summary

• Capacity planning is a complicated process that requires
input from various sources
• Testing the PowerCenter loads in your environment is the
most effective way to estimate system behavior
• Choose a flexible architecture to allow incremental growth
• Validate your conclusions with Informatica Professional
Services
• Informatica HACOE at your service for reference
architecture and testing

45
Questions?

46
Thank you

47
