Nestlé achieves Global Excellence with SAP NetWeaver BI and IBM
“The proof of concept performed with IBM and SAP has been of real value to Nestlé. It has demonstrated that it is possible to scale to the very large volumes that we expect to reach. The fact that the test combined the skills of IBM, SAP and Nestlé and included hardware, database and applications gives us a lot of confidence in the results. Secondly, it has enabled us to establish the target architecture we believe is required and to have confidence in that architecture. Thirdly, because of the testing carried out at various stages, it has also helped us in determining our road map and steps for moving to our target architecture.”
Chris Wright
Nestlé GLOBE Business Warehouse Manager
Nestlé
About this paper
This technical brief describes a proof of concept for the implementation of a 60TB SAP® NetWeaver Business Intelligence (SAP
NetWeaver BI) system on behalf of the Nestlé worldwide Global Excellence project. In the Nestlé GLOBE project, the SAP NetWeaver
BI system plays a strategic role as it is the focal point for business process data over multiple markets and geographies, and provides
the consistent view necessary for global decision making.
This proof of concept was completed jointly by SAP and IBM to prove to Nestlé that a system of the magnitude proposed is feasible,
can be administered with current tools, and can provide the performance Nestlé requires for the rollout of the GLOBE project.
“For industry leaders like Nestlé, the rapid delivery of business intelligence is becoming a critical part of operations. This outstanding Proof-of-Concept has helped Nestlé gain a more strategic view of its business data – and demonstrates that IBM solutions can deliver the levels of performance and scalability required for even the largest and fastest-growing SAP NetWeaver BI environments.”
Volker Loehr
General Manager, Global IBM SAP Alliance
IBM
Background, starting point and objectives
Nestlé strategy
Nestlé is the largest worldwide company in the food and beverages sector, with sales for 2006 of CHF 98.5 billion and a net profit of CHF 9 billion. Headquartered in Vevey, Switzerland, the group employs more than 260,000 people and has factories or logistics operations in almost every country. In recent years, Nestlé launched the world’s largest SAP implementation project, Nestlé GLOBE, designed to unlock Nestlé’s business potential by creating and adopting common global business processes while allowing each market to make its specific analysis and take local decisions.

Implementing and running a business intelligence system supports the Nestlé business strategy. This repository of consolidated data, combining production data extracted from day-to-day operations with sales and financial data representing market trends, is considered a strategic tool for informed business decisions and further business development.

For international companies such as Nestlé that run businesses all over the world, these consolidated data warehouses can quickly become extremely large. Positioned at the heart of the strategy, the reliability, scalability and manageability of the system’s underlying solution are vital in every sense. Data management is a critical aspect of the whole system, as Nestlé is forecasting that strategic data consolidation will increase significantly in the next few years.

Starting Point at Nestlé
Nestlé had already decided to implement IBM DB2 Universal Database, using DPF (Database Partitioning Function). The Nestlé team was convinced that this partitioned database was the technology most likely to be capable of scaling to 60TB and beyond.

Nestlé had already begun to implement a limited DB2 DPF in its SAP NetWeaver BI production environment of 7TB (status December 2005). The production hardware configuration, an IBM System p model p5 595 and a single IBM System Storage model DS8300 storage server, was considered capable of scaling to 20TB, the expected data volume by December 2006. At that point new hardware would be required, and Nestlé needed a proven design for both hardware and database, and the certainty that the proposed infrastructure and the SAP applications were capable of meeting the business requirements at 60TB.

IBM & SAP working together
Beginning in December 2005, IBM and SAP engaged in a cooperation to perform a proof of concept for the next generation of very high-end business intelligence requirements.

Using the infrastructure building blocks already successfully implemented by Nestlé – IBM p5 595, IBM DS8300 storage, DB2 DPF, and Tivoli – IBM and SAP proposed to demonstrate the following:

• Optimal DB2 DPF design for high scalability
• Optimal storage server design for performance and administration of extremely large databases
• Optimal Tivoli design for highly parallel administration of extremely large databases
• Proof of high-end scalability for the SAP NetWeaver BI application
• Proof of infrastructure and application design for “high-end” performance scalability
• Proof of infrastructure flexibility supporting a broad mix of workload types.
“The major challenges we had to overcome in terms of architecture were the selection of the right component characteristics, to both deliver the end-to-end bandwidth to access the 60TB of active data and to support the broad mix of workload types. Thanks to the high-end technology, combined into the largest DB2 DPF on System p and DS8300 that has ever been designed and tested, and thanks to our unmatched team expertise, we successfully delivered a ‘near real-time’ SAP Business Intelligence system that should help to meet Nestlé’s predicted requirements for two to three years from the start of the project.”
Francois Briant
Executive IT Architect and SAP BI PoC Lead Architect

The objective was to prove that, by using a solution based on massive parallelism, extremely large databases can be managed using the tools available today, and within the maintenance windows typical for business-critical production systems. This was a proof of concept for the largest SAP database built to date, and it was to prove that the infrastructure and application could be designed to achieve extreme workload scalability and variation. This proof of concept was a “first of a kind”: it had no precedent at any component level.

The proof of concept would chart the future direction that Nestlé’s infrastructure should take in order to support the predicted growth of the BI system in terms of size and load. The key performance indicators (KPIs) of the proof of concept were modeled to fulfill the expectations of the Nestlé workload as it is expected to evolve over the next two to three years, based on actual business processes and design criteria. The challenge was to demonstrate the overall scalability, manageability, and reliability of the BI application and the infrastructure design as BI requirements begin to necessitate […]
geographies, a business environment common to many large enterprises.

[Figure: scope of the PoC, indicating which parts of the reporting landscape were covered in the PoC and which were not]
The query design for the proof of concept was implemented to reproduce behavior observed in the real production system. The reporting queries ranged over 50 different reporting cubes; 50% of the queries could take advantage of the application server OLAP cache, while 50% could not. Of the queries, 80% used the BI aggregates and 20% went directly to the fact tables, and these latter 20% were the major challenge. The database load from these reports using the fact tables was also affected as the database grew and the number of rows in the fact tables increased from 20 to 200 million rows.

[Figure 2 - 3: the 3 critical online load requirements – the combined data load and aggregation rates (million records/hour) and query throughput (0.8, 1.25 and 2.08 queries/sec) for KPI-A, KPI-D and KPI-G, together with the size of the F-fact tables used for the “ad hoc” queries at each KPI]
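This mix determines how much of the reporting load actually reaches the database, and why the small share of fact table queries dominates. A back-of-the-envelope Python sketch (the relative cost weights are assumptions for illustration, and the 80/20 split is read here as applying to the database-bound queries):

# Expected database cost per reporting query under the PoC workload mix.
# The mix percentages are from the PoC; the cost weights are assumed.
CACHE_HIT = 0.50   # share of queries served from the application server OLAP cache
USE_AGGR  = 0.80   # of database-bound queries, share answered from BI aggregates
COST_AGGR = 1.0    # assumed relative DB cost of an aggregate-backed query
COST_FACT = 25.0   # assumed relative DB cost of an "ad hoc" fact table query

db_share  = 1.0 - CACHE_HIT
expected  = db_share * (USE_AGGR * COST_AGGR + (1 - USE_AGGR) * COST_FACT)
fact_part = db_share * (1 - USE_AGGR) * COST_FACT
print(f"expected DB cost per query: {expected:.1f}")
print(f"share caused by fact table queries: {fact_part / expected:.0%}")

Under these assumptions, the 10% of all queries that bypass both cache and aggregates cause roughly 86% of the database query load, which is why they were the major challenge and why their cost grows with the fact tables.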
Solution architecture
[Figure 2 - 6: 33 DB2 DPF partitions with shared nothing infrastructure – Phase 1, 20TB. Each database LPAR (e.g. LPAR 3 with DB2 partitions 22–29 on arrays A0–A31, LPAR 4 with DB2 partitions 30–37 on arrays A32–A63) attaches to its dedicated arrays over 4 Fibre Channel paths]
[Figure: DB2 database layout on the p595 (“Squadron”) database server, a 64-way machine with 256GB RAM, hosting DB2 partitions 0 to 37. Partition 0 holds the small objects (basis tables, master data and dimension tables; largest table 250GB, about 400 tables over 1GB, about 500 tablespaces). The large objects (ODS/PSA) are spread over 12 partitions and the midsize (fact) objects over 8 partitions]

[Chart: data volume in gigabytes per DB2 partition (partition 0 and partitions 6 to 37), showing a near-uniform distribution across the data partitions]
Evolution of DB2 in the proof of concept
The database used for the proof of concept was an actual clone of the Nestlé system. At the time it was cloned, DPF was already being used, but only in a small configuration.

There were six active DB2 partitions housed in a single database server with shared file systems. The first steps in the proof of concept were to implement a new storage design with a dedicated file system and storage per DB partition, and to redistribute the database over five servers, which are logical partitions on a p595 server.

In Phase 1, the database was limited (by Nestlé specification) to a single p595 to reflect the current Nestlé hardware landscape. Phase 1 proved that the current landscape could scale to 20TB by using an optimal database and storage design.

The focus of Phase 2 was scalability of the design to 60TB. Phase 1 proved that the DB2 design of eight partitions per LPAR scaled well, and this design was taken into Phase 2. The LPARs were distributed over five p595 servers to allow for the necessary memory increase required by the larger database, and to take advantage of the p595 resource virtualization. In the final layout, the database shares p595 CPU resources with the application servers, the Tivoli Storage Manager agents and the administration servers.
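DB2 DPF achieves this balance by hashing each row’s distribution key to one of the partitions, so that data, and therefore I/O and CPU load, spreads almost uniformly. The following Python sketch is illustrative only: the partition numbering mirrors the PoC layout, but the hash function is a stand-in for DB2’s internal partitioning map, not its actual algorithm.

import hashlib
from collections import Counter

N_DATA_PARTITIONS = 32       # PoC layout: partitions 6-37 hold the table data
FIRST_DATA_PARTITION = 6     # partition 0 is reserved for the small/basis objects

def partition_for(key: str) -> int:
    # Stand-in for DB2's hash-based partitioning map.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return FIRST_DATA_PARTITION + (h % N_DATA_PARTITIONS)

# Distribute one million synthetic fact rows and measure the balance.
counts = Counter(partition_for(f"order-{i}") for i in range(1_000_000))
expected = 1_000_000 / N_DATA_PARTITIONS
skew = (max(counts.values()) - min(counts.values())) / expected
print(f"largest and smallest partition differ by {skew:.1%} of the mean")

A near-uniform spread is what makes the shared nothing design work: backups, roll-forwards and parallel queries all finish at roughly the same time on every partition.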
[Figure: DB2 evolution during the PoC – as received from Nestlé, a single DB LPAR with 6 DB2 DPF partitions; redistribution of the data over 33 DB2 DPF partitions (e.g. partitions 22–29 and 30–37) in 5 LPARs; Phase 2 scalability with DB2 distributed over 5 p5-595s; final phase, p5 virtualization with a shared processor pool]

DS8300 layout design decisions
There were two major design decisions for the storage layout. Although there are many successful high-performance storage implementations in production, this “first of a kind” meant that new criteria had to be considered.
Data distribution options
a) Spread all the DB2 partitions over all the arrays, using small LUNs with one LUN per array for each DB2 partition
b) Dedicate a group of arrays to each DB2 partition, using big LUNs in the group of dedicated arrays

In a parallel database, where the data is spread very evenly across all the parallel partitions, it is assumed that the I/O activity for each partition will be similar and that the load will be concurrent. By using dedicated storage arrays, the likelihood of I/O contention is reduced. Dedicated arrays allow the use of larger LUNs, as the data does not need to be distributed over as many arrays, and fewer LUNs are needed. Using fewer LUNs is advantageous for the backup/restore and failover scenarios, as there is some administration overhead per LUN.

Positioning of source and target LUNs for FlashCopy options
c) Dedicated arrays for FlashCopy LUNs: half the arrays for source, half for target
d) Spread source and target LUNs over all arrays

By separating the target and source LUNs on dedicated arrays, we can ensure that there will be no influence on the production disks resulting from activity on the FlashCopy targets used for backup. At the same time, however, the number of spindles available for the production I/O load is reduced. As the FlashCopy activity was not expected to have a major influence on performance, the decision was to place both target and source LUNs on all ranks, while ensuring that source and target pairs were placed on different ranks. This design allowed all spindles to be available for production requirements.
[Figure: FlashCopy source and target LUN placement across device adapters DA 0–DA 6; each partition pair (P6–P22, P7–P23, … P20–P36, P21–P37) has its source and target LUNs placed on different ranks]
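The chosen option (d) can be stated as a placement constraint: every source/target pair must land on two different ranks, while sources and targets together still cover all ranks. A hypothetical Python sketch of such an assignment (the rank count and the half-rotation rule are assumptions for illustration, not the DS8300 configuration syntax):

# Option (d): spread source and target LUNs over all ranks, but never
# place a FlashCopy source and its target on the same rank.
N_RANKS = 8                        # assumed number of ranks behind the device adapters
DATA_PARTITIONS = range(6, 38)     # DB2 data partitions 6..37

placement = {}
for i, part in enumerate(DATA_PARTITIONS):
    src = i % N_RANKS
    tgt = (src + N_RANKS // 2) % N_RANKS   # half-rotation guarantees src != tgt
    placement[part] = (src, tgt)

assert all(src != tgt for src, tgt in placement.values())
ranks_used = {r for pair in placement.values() for r in pair}
assert ranks_used == set(range(N_RANKS))   # every spindle still serves production I/O
print("sample:", {p: placement[p] for p in list(DATA_PARTITIONS)[:4]})

The half-rotation is just one simple way to satisfy the constraint; the figure shows the equivalent pairing chosen for the real arrays.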
Evolution of the storage server during the proof of concept
Phase 1 was limited to a single p595 database server and a single DS8300 storage server to achieve scalability to 20TB. The first step in the proof of concept was to implement the storage design discussed above: dedicated storage arrays, dedicated LUNs, dedicated file systems. The storage server was upgraded to the newest technology, the DS8300 Turbo (2.3GHz), for the KPI-D load scaling stage. Phase 1 proved the performance of the single storage server to 20TB and three times the current Nestlé high-load requirements.

“This Proof of Concept demonstrates the strength of the two companies’ combined technologies and skills, and the mutual commitment to excellence. We brought together a team of 25 experts, spanning 5 countries over two continents, for more than 14 months.”
Herve Sabrie
Server Specialist and Nestlé SAP BI PoC Project Manager
IBM
[Figure 2 - 12: Evolution path of the DS8300 – Phase 1: redistribution of DB2 data from shared to non-shared file systems on a single DS8300 (7–20TB), then growth to 20TB on a DS8300 Turbo; Phase 2: redistribution of the DB2 nodes over 5 LPARs and 4 storage servers, growing to 60TB on 4 x DS8300 Turbo]
Proof of design
Figure 2 - 13 shows the CPU distribution within one database server with eight DB2 partitions. Each partition is represented here by a Workload Manager (WLM) class which monitors its CPU utilization. All partitions are equally active and the load is evenly distributed.

[Figure 2 - 13: Load distribution over eight DB2 partitions in a p595 LPAR, shown as CPU utilization per WLM class]

The two graphs in Figure 2 - 14 show the I/O behavior of two DB2 partitions over the same time frame. The two individual DB2 partitions show similar I/O behavior and peak load requirements. This demonstrates a good database load distribution.

[Figure 2 - 14: Total I/O rate on the individual data volumes of two separate DB2 partitions over the same morning time frame]
Backup/Restore architecture
The Tivoli design implements five storage agents for better CPU
utilization via cross-LPAR load balancing. As partition 0 must be
backed up as the first step, four storage agents would have been
sufficient. Four CIM agents were also implemented for faster and
more reliable execution of FlashCopy commands. The storage
agents and CIM agents were implemented on AIX 5.3 in shared
processor LPARs.
[Figure: backup architecture – FlashCopy source and target volumes]
Proof of Design: Restore 60TB Database from Tape (KPI-2)
Figure 2 - 16 shows one stage of the disaster recovery KPI, a full restore of the 60TB database from tape. For a restore, partition 0 must be restored first, and therefore two tape drives are used for this step with a throughput of 700GB per hour.

The remaining 32 partitions are restored in parallel, using one tape drive each with an individual throughput of 470GB per hour. The total restore time is 6 hours, 40 minutes.

Prior to the restore, a full backup was completed using the same configuration. The backup was achieved in 5 hours, 27 minutes.

For KPI-2, a final step is required: the roll-forward of 2TB of logs, simulating the recovery of several days’ work at the DR site. The roll-forward is shown in the chart below. The recovery time is dominated by partition 0; the remaining 32 partitions roll forward in parallel.

With a non-parallel database, the recovery time would be the sum of all 33 individual recovery times (the time for partition 0 + (32 × the individual partition time)). Instead, using the shared nothing parallel database design, the roll-forward of 2TB is completed in 2 hours, 52 minutes.
[Figure 2 - 16: Restore throughput per partition – 700GB/hour for partition 0, about 470GB/hour (133 MB/sec) for each of partitions 6–37]
[Chart: Rollforward recovery per partition – logfiles recovered (GB) over time; DB2 partition 0 dominates the recovery time while the other 32 partitions roll forward in parallel]
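The arithmetic behind these timings is worth spelling out. A worked Python example (the per-partition data volumes are assumptions chosen to roughly reproduce the published figures, not measured values):

# Parallel restore arithmetic for the 33-partition database.
# Partition sizes are assumptions that roughly reproduce the published timings.
P0_GB     = 1_900    # assumed data volume of partition 0
PART_GB   = 1_830    # assumed data volume of each of the 32 data partitions
P0_RATE   = 700      # GB/hour: two tape drives restore partition 0 first
PART_RATE = 470      # GB/hour: one tape drive per remaining partition

t0    = P0_GB / P0_RATE       # partition 0 must complete before the rest start
t_par = PART_GB / PART_RATE   # all 32 data partitions then restore concurrently

parallel = t0 + t_par         # shared nothing: pay the longest partition once
serial   = t0 + 32 * t_par    # non-parallel: pay every partition in sequence
print(f"parallel restore: {parallel:.1f} h (published: 6 h 40 min)")
print(f"serial restore: {serial:.0f} h -- the cost the parallel design avoids")

The same reasoning applies to the roll-forward: the 2TB of logs are replayed on all partitions concurrently, so the elapsed time is set by partition 0 rather than by the sum of 33 individual replays.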
P5 595 server design
The resource distribution and logical partition design of the p595 was completed in several steps. The evolution of the server design was determined by the proof of concept starting point, the DB2 evolution, and the scaling requirements. The basic plan is depicted in Figure 2 - 19.

The Nestlé clone was delivered on AIX 5.2 and DB2 V8. The objective was to introduce the newest technology, paving a migration path and showing proof of benefit. Phase 1 introduced AIX 5.3 and the benefits of hardware multi-threading (SMT), along with the new DB2 V9, which brought scalability and performance benefits as well as new technology such as data compression. Phase 2 introduced micro-partitions and processor sharing.

To determine the initial design for the application servers, and to begin the capacity planning for Phase 2, a series of load profiling tests was done. These provided the first “rule of thumb” for resource distribution. The graph in Figure 2 - 20 shows the DATABASE:APPS ratio for the individual load profiles and for the combined load for KPIs A and D at 20TB.
[Figure 2 - 19: Server design evolution – customer basis (AIX 5.2, DB2 V8), Phase 1 (AIX 5.3, DB2 V8 then DB2 V9, 7TB–20TB), Phase 2 (AIX 5.3, DB2 V9, micro-partitions, 20TB–60TB)]

[Figure 2 - 20: DATABASE:APPS physical CPU ratios for the individual load profiles (query, aggregate, load) and the combined loads KPI-A and KPI-D at 20TB]
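A ratio like DATABASE:APPS turns directly into a sizing rule: measure the database CPUs a profile needs, then multiply to get the application server tier. A hypothetical Python sketch (the ratio values below are placeholders for illustration, not the measured PoC numbers from Figure 2 - 20):

# Hypothetical capacity-planning helper built on a DATABASE:APPS rule of thumb.
# Ratio values are illustrative placeholders, not the measured PoC figures.
APPS_PER_DB_CPU = {
    "query":     1.0,   # assumed: reporting balances DB and app server work
    "aggregate": 0.5,   # assumed: aggregation is more database-heavy
    "load":      2.0,   # data loading was the most app-server-intensive profile
}

def size_apps_tier(profile: str, db_cpus: float) -> float:
    # Derive application server CPUs from the measured database CPUs.
    return db_cpus * APPS_PER_DB_CPU[profile]

for profile, db_cpus in [("query", 4.0), ("aggregate", 6.0), ("load", 3.0)]:
    print(f"{profile:>9}: {db_cpus:.0f} DB CPUs -> "
          f"{size_apps_tier(profile, db_cpus):.0f} app server CPUs")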
Virtualization design for shared processor pool
Figure 2 - 21 shows the virtualization design for the final implementation at 60TB. The five database LPARs are distributed across the five p595 servers, one per machine. The first p595 is dedicated to administration-type work and load drivers. It contains the DB2 partition 0, the SAP software central instance, the data aggregation driver, and the data load extractors.

There is an online partition on each system to support the reporting. The HTTP-based online reporting load is distributed via an SAP load balancing tool, SAP Web Dispatcher®.

A Tivoli Storage Manager for ACS agent is placed on each of the p595 servers to manage the parallel LANFREE backup. Using the shared processor technology of the p595 servers, a priority schema was implemented which gave highest priority to the database, medium priority to the online workload, and lowest priority to all batch activity. The storage agents were given a fixed CPU entitlement to ensure compliance with the SLA.

This entitlement can be used by other workloads when no backup is active. The highly CPU-intensive data loading scenario on the application servers was given very low priority, but was allowed to access all resources not used by higher priority systems. This allowed these application servers to be driven to full capacity while not impacting the response time KPIs of reporting.
[Figure 2 - 21: Virtualization design for the shared processor pool – each p595 hosts a DB, online and upload LPAR, with the first machine also hosting the SAP CI, the aggregate driver and the extractor; indicative entitlements: DB CE = 10 (capped at 10), online CE = 5, upload CE = 16, backup CE = 7, pool maximum 48 CPUs]
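The effect of this scheme can be modeled with a toy allocator: every LPAR first receives up to its entitlement, and the uncapped LPARs then soak up spare pool capacity in priority order. This Python sketch is conceptual only; the entitlement values follow the figure, and it does not represent POWER Hypervisor configuration:

# Toy model of the shared processor pool priority scheme (conceptual only).
POOL = 48  # maximum CPUs in one p595 shared pool, per the figure

def allocate(demand, entitlement, priority):
    # Honor entitlements first, then hand spare capacity out in priority order.
    alloc = {lp: min(demand[lp], entitlement[lp]) for lp in demand}
    spare = POOL - sum(alloc.values())
    for lp in priority:                  # uncapped LPARs take leftovers in order
        extra = min(demand[lp] - alloc[lp], spare)
        if extra > 0:
            alloc[lp] += extra
            spare -= extra
    return alloc

# No backup running: the low-priority upload LPAR may exceed its entitlement.
demand      = {"db": 10, "online": 5, "backup": 0, "upload": 40}
entitlement = {"db": 10, "online": 5, "backup": 7, "upload": 16}
print(allocate(demand, entitlement, priority=["db", "online", "upload"]))
# upload ends at 33 CPUs: it absorbs the idle backup entitlement and all
# spare pool capacity without taking anything from db or online.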
Infrastructure scaling
Figure 2 - 23 depicts the hardware scaling on the p5 595 as the load scenario was scaled to five times the baseline load. The scaling diagram separates the application servers from the database to show the load effects on the different components.

Figure 2 - 24 shows the physical resource scaling in relation to the load increase: for a load scaling factor of five, the physical resource increase was only a factor of 3.5. In Figure 2 - 24 the physical CPUs have been normalized to relative GHz to factor in the speed increase from 1.9GHz to 2.3GHz. Between KPI-D at 20TB and KPI-D at 60TB, the system was moved into the final distributed configuration; here the increase relates to the same load applied to a much larger database. The final point is five times the baseline load at 60TB. The increase in application server requirements is a factor of 3.8, and the scaling factor for the database is 2.8 for a factor-of-five load increase. The database design scales very efficiently.
[Figure 2 - 23: Physical CPU utilization per component (application servers and database, with linear database trend) at KPI-A 20TB, KPI-D 20TB, KPI-D 60TB and KPI-G 60TB]

[Figure 2 - 24: Load scale factor versus physical CPU scale factor across the same four measurement points]
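Expressed as CPU growth per unit of load growth, these published factors make the sub-linear scaling explicit; a short worked example in Python:

# Scaling efficiency from the published PoC factors.
LOAD_SCALE = 5.0    # load driven to five times the baseline
factors = {
    "application servers": 3.8,   # published CPU growth factor
    "database":            2.8,
    "whole stack":         3.5,
}
for component, cpu_scale in factors.items():
    # Values below 1.0 mean the component scales better than linearly with load.
    print(f"{component:>19}: {cpu_scale / LOAD_SCALE:.2f} x CPU per x load")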
Online KPI achievements
Figure 2 - 25 follows the load throughput over the landscape evolution until achieving the final high-load combined KPI, represented by KPI-G: 125 million records/hour in loading and 75 million records/hour in aggregation, with a concurrent reporting rate of 2.08 queries/sec, at 60TB.

The steps through the hardware were completed primarily using the KPI-D load requirements. In the final runs, the parallel landscape, implemented in SPLPARs (shared processor LPARs), was used to scale the load to meet the KPI-G requirements.

The online achievements were the result of the p5 595 server flexibility in handling diverse concurrent load requirements, the DB2 parallel database functionality, which supports a very broad-based scaling of the database, and the dedicated “shared nothing” storage design. The project ended with the high-end KPI achievements for Nestlé, and there was still considerable scalability potential in this solution.
[Figure 2 - 25: Evolution of the online KPI tuning and scale-up throughout the proof of concept – load and aggregation rates (million records/hour), response time and 10x transaction rate across the KPI-A, KPI-D and KPI-G runs]
“The combination of SAP NetWeaver® BI and IBM infrastructure components shows unbeatable scalability, addressing and fulfilling the requirements of high-end BI scenarios in the industry today.”
Flexibility
IBM System p virtualization features enable the architecture to support a very broad mix of workloads. This both capitalizes on Nestlé’s investments in skills and technology and resolves the problems Nestlé experienced with large variations in workload volume and type. Virtualization makes it possible to prioritize sensitive load types, such as the reporting queries, while utilizing the full capacity of the available resources.
Manageability
The proven and extendable architecture, based on parallelization, allows an extremely large database to be managed well within the maintenance window allowed by the business. This design covers the spectrum from “business as usual” maintenance to full disaster recovery.
© Copyright IBM Corp. 2008. All Rights Reserved.

IBM Deutschland GmbH
D-70548 Stuttgart
ibm.com
Produced in Germany

For more information:
To learn more about the solutions from IBM and SAP, visit: ibm-sap.com
For more information about SAP products and services, contact an SAP representative or visit: sap.com

IBM, the IBM logo, IBM System z, IBM System p, IBM System i, IBM System x, BladeCenter, z/OS, z/VM, i5/OS, AIX, DB2, Domino, FlashCopy, Lotus, POWER, POWER5, QuickPlace, SameTime, Tivoli and WebSphere are trademarks of International Business Machines Corporation in the United States, other countries, or both.

SPC03021-DEEN-00 (03/08)