Nestlé achieves Global Excellence with SAP NetWeaver BI and IBM
“The proof of concept performed with IBM and SAP has been of real value to Nestlé. It has demonstrated that it is possible to scale to the very large volumes that we expect to reach. The fact that the test combined the skills of IBM, SAP and Nestlé and included hardware, database and applications gives us a lot of confidence in the results. Secondly, it has enabled us to establish the target architecture we believe is required and to have confidence in that architecture. Thirdly, because of the testing carried out at various stages, it has also helped us in determining our road map and steps for moving to our target architecture.”
Chris Wright
Nestlé GLOBE Business Warehouse Manager
Nestlé
About this paper
This technical brief describes a proof of concept for the implementation of a 60TB SAP® NetWeaver Business Intelligence (SAP
NetWeaver BI) system on behalf of the Nestlé worldwide Global Excellence project. In the Nestlé GLOBE project, the SAP NetWeaver
BI system plays a strategic role as it is the focal point for business process data over multiple markets and geographies, and provides
the consistent view necessary for global decision making.
This proof of concept was completed jointly by SAP and IBM to prove to Nestlé that a system of the magnitude proposed is feasible,
can be administered with current tools, and can provide the performance Nestlé requires for the rollout of the GLOBE project.
“For industry leaders like Nestlé, the rapid delivery of business intelligence is becoming a critical part of operations. This outstanding Proof-of-Concept has helped Nestlé gain a more strategic view of its business data – and demonstrates that IBM solutions can deliver the levels of performance and scalability required for even the largest and fastest-growing SAP NetWeaver BI environments.”
Volker Loehr
General Manager, Global IBM SAP Alliance
IBM
Background, starting point and objectives
Nestlé strategy
Nestlé is the largest worldwide company in the food and beverages sector, with sales for 2006 of CHF 98.5 billion and a net profit of CHF 9 billion. Headquartered in Vevey, Switzerland, the group employs more than 260,000 people and has factories or logistics operations in almost every country. In recent years, Nestlé launched the world’s largest SAP implementation project, Nestlé GLOBE, designed to unlock Nestlé’s business potential by creating and adopting common global business processes while allowing each market to make its specific analysis and take local decisions.

Implementing and running a business intelligence system supports the Nestlé business strategy. This repository of consolidated data, combining production data extracted from day-to-day operations with sales and financial data representing market trends, is considered a strategic tool for informed business decisions and further business development.

For international companies such as Nestlé that run businesses all over the world, these consolidated data warehouses can quickly become extremely large. Positioned at the heart of the strategy, the reliability, scalability and manageability of the system’s underlying solution are vital in every sense. Data management is a critical aspect of the whole system, as Nestlé is forecasting that strategic data consolidation will increase significantly in the next few years.

Starting Point at Nestlé
Nestlé had already decided to implement IBM DB2 Universal Database, using DPF (Database Partitioning Function). The Nestlé team was convinced that this partitioned database was the technology most likely to be capable of scaling to 60TB and beyond.

Nestlé had already begun to implement a limited DB2 DPF in its SAP NetWeaver BI production environment of 7TB (status December 2005). The production hardware configuration, an IBM System p model p5 595 and a single IBM System Storage model DS8300 storage server, was considered capable of scaling to 20TB, the expected data volume by December 2006. At that point new hardware would be required, and Nestlé needed a proven design for both hardware and database, and the certainty that the proposed infrastructure and the SAP applications were capable of meeting the business requirements at 60TB.

IBM & SAP working together
Beginning in December 2005, IBM and SAP engaged in a cooperation to perform a proof of concept for the next generation of very high-end business intelligence requirements.

Using the infrastructure building blocks already successfully implemented by Nestlé – IBM p5 595, IBM DS8300 storage, DB2 DPF, and Tivoli – IBM and SAP proposed to demonstrate the following:

• Optimal DB2 DPF design for high scalability
• Optimal storage server design for performance and administration of extremely large databases
• Optimal Tivoli design for highly parallel administration of extremely large databases
• Proof of high-end scalability for the SAP NetWeaver BI application
• Proof of infrastructure and application design for “high-end” performance scalability
• Proof of infrastructure flexibility supporting a broad mix of workload types.
“The major challenges we had to overcome in terms of architecture were the selection of the right component characteristics, to both deliver the end-to-end bandwidth to access the 60TB of active data and to support the broad mix of workload types. Thanks to the high-end technology, combined into the largest DB2 DPF on System p and DS8300 that has ever been designed and tested, and thanks to our unmatched team expertise, we successfully delivered a ‘near real-time’ SAP Business Intelligence system that should help to meet Nestlé’s predicted requirements for two to three years from the start of the project.”
Francois Briant
Executive IT Architect and SAP BI PoC Lead Architect

The objective was to prove that, by using a solution based on massive parallelism, extremely large databases can be managed using the tools available today, and within the maintenance windows typical for business-critical production systems. This was a proof of concept for the largest SAP database built to date, and it was to prove that the infrastructure and application could be designed to achieve extreme workload scalability and variation. This proof of concept was a “first of a kind”: it had no precedent at any component level.

The proof of concept would chart the future direction that Nestlé’s infrastructure should take in order to support the predicted growth of the BI system in terms of size and load. The key performance indicators (KPIs) of the proof of concept were modeled to fulfill the expectations of the Nestlé workload as it is expected to evolve over the next two to three years, based on actual business processes and design criteria. The challenge was to demonstrate the overall scalability, manageability, and reliability of the BI application and the infrastructure design as BI requirements begin to necessitate […]
geographies, a business environment common to many large enterprises.

[Figure: scope of the PoC, indicating which parts of the reporting landscape were covered in the PoC and which were not]
The query design for the proof of concept was implemented to reproduce behavior observed in the real production system. The reporting queries ranged over 50 different reporting cubes; 50% of the queries could take advantage of the application server OLAP cache, while 50% could not. Of the queries, 80% used the BI aggregates and 20% went directly to the fact tables, and these latter 20% were the major challenge. The database load from these reports using the fact tables was also affected as the database grew and the number of rows in the fact tables increased from 20 to 200 million rows.

[Figure 2 - 3: the 3 critical online load requirements – the combined data load and aggregation rates (million records/hour) and query throughput (0.8, 1.25 and 2.08 queries/sec) for KPI-A, KPI-D and KPI-G, together with the size of the F-fact tables used for the “ad hoc” queries at each KPI]
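This mix determines how much of the reporting load actually reaches the database, and why the small share of fact table queries dominates. A back-of-the-envelope Python sketch (the relative cost weights are assumptions for illustration, and the 80/20 split is read here as applying to the database-bound queries):

# Expected database cost per reporting query under the PoC workload mix.
# The mix percentages are from the PoC; the cost weights are assumed.
CACHE_HIT = 0.50   # share of queries served from the application server OLAP cache
USE_AGGR  = 0.80   # of database-bound queries, share answered from BI aggregates
COST_AGGR = 1.0    # assumed relative DB cost of an aggregate-backed query
COST_FACT = 25.0   # assumed relative DB cost of an "ad hoc" fact table query

db_share  = 1.0 - CACHE_HIT
expected  = db_share * (USE_AGGR * COST_AGGR + (1 - USE_AGGR) * COST_FACT)
fact_part = db_share * (1 - USE_AGGR) * COST_FACT
print(f"expected DB cost per query: {expected:.1f}")
print(f"share caused by fact table queries: {fact_part / expected:.0%}")

Under these assumptions, the 10% of all queries that bypass both cache and aggregates cause roughly 86% of the database query load, which is why they were the major challenge and why their cost grows with the fact tables.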
Solution architecture
[Figure 2 - 6: 33 DB2 DPF partitions with shared nothing infrastructure – Phase 1, 20TB. Each database LPAR (e.g. LPAR 3 with DB2 partitions 22–29 on arrays A0–A31, LPAR 4 with DB2 partitions 30–37 on arrays A32–A63) attaches to its dedicated arrays over 4 Fibre Channel paths]
[Figure: DB2 database layout on the p595 (“Squadron”) database server, a 64-way machine with 256GB RAM, hosting DB2 partitions 0 to 37. Partition 0 holds the small objects (basis tables, master data and dimension tables; largest table 250GB, about 400 tables over 1GB, about 500 tablespaces). The large objects (ODS/PSA) are spread over 12 partitions and the midsize (fact) objects over 8 partitions]

[Chart: data volume in gigabytes per DB2 partition (partition 0 and partitions 6 to 37), showing a near-uniform distribution across the data partitions]
Evolution of DB2 in the proof of concept
The database used for the proof of concept was an actual clone of the Nestlé system. At the time it was cloned, DPF was already being used, but only in a small configuration.

There were six active DB2 partitions housed in a single database server with shared file systems. The first steps in the proof of concept were to implement a new storage design with a dedicated file system and storage per DB partition, and to redistribute the database over five servers, which are logical partitions on a p595 server.

In Phase 1, the database was limited (by Nestlé specification) to a single p595 to reflect the current Nestlé hardware landscape. Phase 1 proved that the current landscape could scale to 20TB by using an optimal database and storage design.

The focus of Phase 2 was scalability of the design to 60TB. Phase 1 proved that the DB2 design of eight partitions per LPAR scaled well, and this design was taken into Phase 2. The LPARs were distributed over five p595 servers to allow for the necessary memory increase required by the larger database, and to take advantage of the p595 resource virtualization. In the final layout, the database shares p595 CPU resources with the application servers, the Tivoli Storage Manager agents and the administration servers.
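DB2 DPF achieves this balance by hashing each row’s distribution key to one of the partitions, so that data, and therefore I/O and CPU load, spreads almost uniformly. The following Python sketch is illustrative only: the partition numbering mirrors the PoC layout, but the hash function is a stand-in for DB2’s internal partitioning map, not its actual algorithm.

import hashlib
from collections import Counter

N_DATA_PARTITIONS = 32       # PoC layout: partitions 6-37 hold the table data
FIRST_DATA_PARTITION = 6     # partition 0 is reserved for the small/basis objects

def partition_for(key: str) -> int:
    # Stand-in for DB2's hash-based partitioning map.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return FIRST_DATA_PARTITION + (h % N_DATA_PARTITIONS)

# Distribute one million synthetic fact rows and measure the balance.
counts = Counter(partition_for(f"order-{i}") for i in range(1_000_000))
expected = 1_000_000 / N_DATA_PARTITIONS
skew = (max(counts.values()) - min(counts.values())) / expected
print(f"largest and smallest partition differ by {skew:.1%} of the mean")

A near-uniform spread is what makes the shared nothing design work: backups, roll-forwards and parallel queries all finish at roughly the same time on every partition.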
[Figure: DB2 evolution during the PoC – as received from Nestlé, a single DB LPAR with 6 DB2 DPF partitions; redistribution of the data over 33 DB2 DPF partitions (e.g. partitions 22–29 and 30–37) in 5 LPARs; Phase 2 scalability with DB2 distributed over 5 p5-595s; final phase, p5 virtualization with a shared processor pool]

DS8300 layout design decisions
There were two major design decisions for the storage layout. Although there are many successful high-performance storage implementations in production, this “first of a kind” meant that new criteria had to be considered.
Data distribution options
a) Spread all the DB2 partitions over all the arrays, using small LUNs with one LUN per array for each DB2 partition
b) Dedicate a group of arrays to each DB2 partition, using big LUNs in the group of dedicated arrays

In a parallel database, where the data is spread very evenly across all the parallel partitions, it is assumed that the I/O activity for each partition will be similar and that the load will be concurrent. By using dedicated storage arrays, the likelihood of I/O contention is reduced. Dedicated arrays allow the use of larger LUNs, as the data does not need to be distributed over as many arrays, and fewer LUNs are needed. Using fewer LUNs is advantageous for the backup/restore and failover scenarios, as there is some administration overhead per LUN.

Positioning of source and target LUNs for FlashCopy options
c) Dedicated arrays for FlashCopy LUNs: half the arrays for source, half for target
d) Spread source and target LUNs over all arrays

By separating the target and source LUNs on dedicated arrays, we can ensure that there will be no influence on the production disks resulting from activity on the FlashCopy targets used for backup. At the same time, however, the number of spindles available for the production I/O load is reduced. As the FlashCopy activity was not expected to have a major influence on performance, the decision was to place both target and source LUNs on all ranks, while ensuring that source and target pairs were placed on different ranks. This design allowed all spindles to be available for production requirements.
[Figure: FlashCopy source and target LUN placement across device adapters DA 0–DA 6; each partition pair (P6–P22, P7–P23, … P20–P36, P21–P37) has its source and target LUNs placed on different ranks]
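The chosen option (d) can be stated as a placement constraint: every source/target pair must land on two different ranks, while sources and targets together still cover all ranks. A hypothetical Python sketch of such an assignment (the rank count and the half-rotation rule are assumptions for illustration, not the DS8300 configuration syntax):

# Option (d): spread source and target LUNs over all ranks, but never
# place a FlashCopy source and its target on the same rank.
N_RANKS = 8                        # assumed number of ranks behind the device adapters
DATA_PARTITIONS = range(6, 38)     # DB2 data partitions 6..37

placement = {}
for i, part in enumerate(DATA_PARTITIONS):
    src = i % N_RANKS
    tgt = (src + N_RANKS // 2) % N_RANKS   # half-rotation guarantees src != tgt
    placement[part] = (src, tgt)

assert all(src != tgt for src, tgt in placement.values())
ranks_used = {r for pair in placement.values() for r in pair}
assert ranks_used == set(range(N_RANKS))   # every spindle still serves production I/O
print("sample:", {p: placement[p] for p in list(DATA_PARTITIONS)[:4]})

The half-rotation is just one simple way to satisfy the constraint; the figure shows the equivalent pairing chosen for the real arrays.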
Evolution of the storage server during the proof of concept
Phase 1 was limited to a single p595 database server and a single DS8300 storage server to achieve scalability to 20TB. The first step in the proof of concept was to implement the storage design discussed above: dedicated storage arrays, dedicated LUNs, dedicated file systems. The storage server was upgraded to the newest technology, the DS8300 Turbo (2.3GHz), for the KPI-D load scaling stage. Phase 1 proved the performance of the single storage server to 20TB and three times the current Nestlé high-load requirements.

“This Proof of Concept demonstrates the strength of the two companies’ combined technologies and skills, and the mutual commitment to excellence. We brought together a team of 25 experts, spanning 5 countries over two continents, for more than 14 months.”
Herve Sabrie
Server Specialist and Nestlé SAP BI PoC Project Manager
IBM
[Figure 2 - 12: Evolution path of the DS8300 – Phase 1: redistribution of DB2 data from shared to non-shared file systems on a single DS8300 (7–20TB), then growth to 20TB on a DS8300 Turbo; Phase 2: redistribution of the DB2 nodes over 5 LPARs and 4 storage servers, growing to 60TB on 4 x DS8300 Turbo]
Proof of design
Figure 2 - 13 shows the CPU distribution within one database server with eight DB2 partitions. Each partition is represented here by a Workload Manager (WLM) class which monitors its CPU utilization. All partitions are equally active and the load is evenly distributed.

[Figure 2 - 13: Load distribution over eight DB2 partitions in a p595 LPAR, shown as CPU utilization per WLM class]

The two graphs in Figure 2 - 14 show the I/O behavior of two DB2 partitions over the same time frame. The two individual DB2 partitions show similar I/O behavior and peak load requirements. This demonstrates a good database load distribution.

[Figure 2 - 14: Total I/O rate on the individual data volumes of two separate DB2 partitions over the same morning time frame]
Backup/Restore architecture
The Tivoli design implements five storage agents for better CPU
utilization via cross-LPAR load balancing. As partition 0 must be
backed up as the first step, four storage agents would have been
sufficient. Four CIM agents were also implemented for faster and
more reliable execution of FlashCopy commands. The storage
agents and CIM agents were implemented on AIX 5.3 in shared
processor LPARs.
[Figure: backup architecture – FlashCopy source and target volumes]
Proof of Design: Restore 60TB Database from Tape (KPI-2)
Figure 2 - 16 shows one stage of the disaster recovery KPI, a full restore of the 60TB database from tape. For a restore, partition 0 must be restored first, and therefore two tape drives are used for this step with a throughput of 700GB per hour.

The remaining 32 partitions are restored in parallel, using one tape drive each with an individual throughput of 470GB per hour. The total restore time is 6 hours, 40 minutes.

Prior to the restore, a full backup was completed using the same configuration. The backup was achieved in 5 hours, 27 minutes.

For KPI-2, a final step is required: the roll-forward of 2TB of logs, simulating the recovery of several days’ work at the DR site. The roll-forward is shown in the chart below. The recovery time is dominated by partition 0; the remaining 32 partitions roll forward in parallel.

With a non-parallel database, the recovery time would be the sum of all 33 individual recovery times (the time for partition 0 + (32 × the individual partition time)). Instead, using the shared nothing parallel database design, the roll-forward of 2TB is completed in 2 hours, 52 minutes.
[Figure 2 - 16: Restore throughput per partition – 700GB/hour for partition 0, about 470GB/hour (133 MB/sec) for each of partitions 6–37]
[Chart: Rollforward recovery per partition – logfiles recovered (GB) over time; DB2 partition 0 dominates the recovery time while the other 32 partitions roll forward in parallel]
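The arithmetic behind these timings is worth spelling out. A worked Python example (the per-partition data volumes are assumptions chosen to roughly reproduce the published figures, not measured values):

# Parallel restore arithmetic for the 33-partition database.
# Partition sizes are assumptions that roughly reproduce the published timings.
P0_GB     = 1_900    # assumed data volume of partition 0
PART_GB   = 1_830    # assumed data volume of each of the 32 data partitions
P0_RATE   = 700      # GB/hour: two tape drives restore partition 0 first
PART_RATE = 470      # GB/hour: one tape drive per remaining partition

t0    = P0_GB / P0_RATE       # partition 0 must complete before the rest start
t_par = PART_GB / PART_RATE   # all 32 data partitions then restore concurrently

parallel = t0 + t_par         # shared nothing: pay the longest partition once
serial   = t0 + 32 * t_par    # non-parallel: pay every partition in sequence
print(f"parallel restore: {parallel:.1f} h (published: 6 h 40 min)")
print(f"serial restore: {serial:.0f} h -- the cost the parallel design avoids")

The same reasoning applies to the roll-forward: the 2TB of logs are replayed on all partitions concurrently, so the elapsed time is set by partition 0 rather than by the sum of 33 individual replays.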
P5 595 server design
The resource distribution and logical partition design of the p595 was completed in several steps. The evolution of the server design was determined by the proof of concept starting point, the DB2 evolution, and the scaling requirements. The basic plan is depicted in Figure 2 - 19.

The Nestlé clone was delivered on AIX 5.2 and DB2 V8. The objective was to introduce the newest technology, paving a migration path and showing proof of benefit. Phase 1 introduced AIX 5.3 and the benefits of hardware multi-threading (SMT), along with the new DB2 V9, which brought scalability and performance benefits as well as new technology such as data compression. Phase 2 introduced micro-partitions and processor sharing.

To determine the initial design for the application servers, and to begin the capacity planning for Phase 2, a series of load profiling tests was done. These provided the first “rule of thumb” for resource distribution. The graph in Figure 2 - 20 shows the DATABASE:APPS ratio for the individual load profiles and for the combined load for KPIs A and D at 20TB.
[Figure 2 - 19: Server design evolution – customer basis (AIX 5.2, DB2 V8), Phase 1 (AIX 5.3, DB2 V8 then DB2 V9, 7TB–20TB), Phase 2 (AIX 5.3, DB2 V9, micro-partitions, 20TB–60TB)]

[Figure 2 - 20: DATABASE:APPS physical CPU ratios for the individual load profiles (query, aggregate, load) and the combined loads KPI-A and KPI-D at 20TB]
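A ratio like DATABASE:APPS turns directly into a sizing rule: measure the database CPUs a profile needs, then multiply to get the application server tier. A hypothetical Python sketch (the ratio values below are placeholders for illustration, not the measured PoC numbers from Figure 2 - 20):

# Hypothetical capacity-planning helper built on a DATABASE:APPS rule of thumb.
# Ratio values are illustrative placeholders, not the measured PoC figures.
APPS_PER_DB_CPU = {
    "query":     1.0,   # assumed: reporting balances DB and app server work
    "aggregate": 0.5,   # assumed: aggregation is more database-heavy
    "load":      2.0,   # data loading was the most app-server-intensive profile
}

def size_apps_tier(profile: str, db_cpus: float) -> float:
    # Derive application server CPUs from the measured database CPUs.
    return db_cpus * APPS_PER_DB_CPU[profile]

for profile, db_cpus in [("query", 4.0), ("aggregate", 6.0), ("load", 3.0)]:
    print(f"{profile:>9}: {db_cpus:.0f} DB CPUs -> "
          f"{size_apps_tier(profile, db_cpus):.0f} app server CPUs")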
Virtualization design for shared processor pool
Figure 2 - 21 shows the virtualization design for the final implementation at 60TB. The five database LPARs are distributed across the five p595 servers, one per machine. The first p595 is dedicated to administration-type work and load drivers. It contains the DB2 partition 0, the SAP software central instance, the data aggregation driver, and the data load extractors.

There is an online partition on each system to support the reporting. The HTTP-based online reporting load is distributed via an SAP load balancing tool, SAP Web Dispatcher®.

A Tivoli Storage Manager for ACS agent is placed on each of the p595 servers to manage the parallel LANFREE backup. Using the shared processor technology of the p595 servers, a priority schema was implemented which gave highest priority to the database, medium priority to the online workload, and lowest priority to all batch activity. The storage agents were given a fixed CPU entitlement to ensure compliance with the SLA.

This entitlement can be used by other workloads when no backup is active. The highly CPU-intensive data loading scenario on the application servers was given very low priority, but was allowed to access all resources not used by higher priority systems. This allowed these application servers to be driven to full capacity while not impacting the response time KPIs of reporting.
[Figure 2 - 21: Virtualization design for the shared processor pool – each p595 hosts a DB, online and upload LPAR, with the first machine also hosting the SAP CI, the aggregate driver and the extractor; indicative entitlements: DB CE = 10 (capped at 10), online CE = 5, upload CE = 16, backup CE = 7, pool maximum 48 CPUs]
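The effect of this scheme can be modeled with a toy allocator: every LPAR first receives up to its entitlement, and the uncapped LPARs then soak up spare pool capacity in priority order. This Python sketch is conceptual only; the entitlement values follow the figure, and it does not represent POWER Hypervisor configuration:

# Toy model of the shared processor pool priority scheme (conceptual only).
POOL = 48  # maximum CPUs in one p595 shared pool, per the figure

def allocate(demand, entitlement, priority):
    # Honor entitlements first, then hand spare capacity out in priority order.
    alloc = {lp: min(demand[lp], entitlement[lp]) for lp in demand}
    spare = POOL - sum(alloc.values())
    for lp in priority:                  # uncapped LPARs take leftovers in order
        extra = min(demand[lp] - alloc[lp], spare)
        if extra > 0:
            alloc[lp] += extra
            spare -= extra
    return alloc

# No backup running: the low-priority upload LPAR may exceed its entitlement.
demand      = {"db": 10, "online": 5, "backup": 0, "upload": 40}
entitlement = {"db": 10, "online": 5, "backup": 7, "upload": 16}
print(allocate(demand, entitlement, priority=["db", "online", "upload"]))
# upload ends at 33 CPUs: it absorbs the idle backup entitlement and all
# spare pool capacity without taking anything from db or online.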
Infrastructure scaling
Figure 2 - 23 depicts the hardware scaling on the p5 595 as the load scenario was scaled to five times the baseline load. The scaling diagram separates the application servers from the database to show the load effects on the different components.

Figure 2 - 24 shows the physical resource scaling in relation to the load increase: for a load scaling factor of five, the physical resource increase was only a factor of 3.5. In Figure 2 - 24 the physical CPUs have been normalized to relative GHz to factor in the speed increase from 1.9GHz to 2.3GHz. Between KPI-D at 20TB and KPI-D at 60TB, the system was moved into the final distributed configuration; here the increase relates to the same load applied to a much larger database. The final point is five times the baseline load at 60TB. The increase in application server requirements is a factor of 3.8, and the scaling factor for the database is 2.8 for a factor-of-five load increase. The database design scales very efficiently.
[Figure 2 - 23: Physical CPU utilization per component (application servers and database, with linear database trend) at KPI-A 20TB, KPI-D 20TB, KPI-D 60TB and KPI-G 60TB]

[Figure 2 - 24: Load scale factor versus physical CPU scale factor across the same four measurement points]
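Expressed as CPU growth per unit of load growth, these published factors make the sub-linear scaling explicit; a short worked example in Python:

# Scaling efficiency from the published PoC factors.
LOAD_SCALE = 5.0    # load driven to five times the baseline
factors = {
    "application servers": 3.8,   # published CPU growth factor
    "database":            2.8,
    "whole stack":         3.5,
}
for component, cpu_scale in factors.items():
    # Values below 1.0 mean the component scales better than linearly with load.
    print(f"{component:>19}: {cpu_scale / LOAD_SCALE:.2f} x CPU per x load")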
Online KPI achievements
Figure 2 - 25 follows the load throughput over the landscape evolution until achieving the final high-load combined KPI, represented by KPI-G: 125 million records/hour in loading and 75 million records/hour in aggregation, with a concurrent reporting rate of 2.08 queries/sec, at 60TB.

The steps through the hardware were completed primarily using the KPI-D load requirements. In the final runs, the parallel landscape, implemented in SPLPARs (shared processor LPARs), was used to scale the load to meet the KPI-G requirements.

The online achievements were the result of the p5 595 server flexibility in handling diverse concurrent load requirements, the DB2 parallel database functionality, which supports a very broad-based scaling of the database, and the dedicated “shared nothing” storage design. The project ended with the high-end KPI achievements for Nestlé, and there was still considerable scalability potential in this solution.
[Figure 2 - 25: Evolution of the online KPI tuning and scale-up throughout the proof of concept – load and aggregation rates (million records/hour), response time and 10x transaction rate across the KPI-A, KPI-D and KPI-G runs]
“The combination of SAP NetWeaver® BI and IBM infrastructure components shows unbeatable scalability, addressing and fulfilling the requirements of high-end BI scenarios in the industry today.”
Flexibility
IBM System p virtualization features enable the architecture to support a very broad mix of workloads. This both capitalizes on Nestlé’s investments in skills and technology and resolves the problems Nestlé experienced with large variations in workload volume and type. Virtualization makes it possible to prioritize sensitive load types, such as the reporting queries, while utilizing the full capacity of the available resources.
Manageability
The proven and extendable architecture, based on parallelization, allows an extremely large database to be managed well within the maintenance window allowed by the business. This design covers the spectrum from “business as usual” maintenance to full disaster recovery.
© Copyright IBM Corp. 2008. All Rights Reserved.

IBM Deutschland GmbH
D-70548 Stuttgart
ibm.com
Produced in Germany

For more information:
To learn more about the solutions from IBM and SAP, visit: ibm-sap.com
For more information about SAP products and services, contact an SAP representative or visit: sap.com

IBM, the IBM logo, IBM System z, IBM System p, IBM System i, IBM System x, BladeCenter, z/OS, z/VM, i5/OS, AIX, DB2, Domino, FlashCopy, Lotus, POWER, POWER5, QuickPlace, SameTime, Tivoli and WebSphere are trademarks of International Business Machines Corporation in the United States, other countries, or both.

SPC03021-DEEN-00 (03/08)