SAP HANA Data Management and Performance on IBM Power Systems
Redpaper
IBM Redbooks
March 2020
REDP-5570-00
Note: Before using this information and the product it supports, read the information in “Notices” on
page vii.
Contents
Notices  vii
Trademarks  viii
Preface  ix
Authors  ix
Now you can become a published author, too!  x
Comments welcome  x
Stay connected to IBM Redbooks  xi
Chapter 1. Introduction  1
1.1 How to approach SAP HANA on IBM Power Systems  2
1.1.1 Memory footprint  3
1.1.2 Startup times  4
1.1.3 Backup  4
1.1.4 High availability  4
1.1.5 Hardware exchange  5
1.1.6 Remote database connectivity  5
1.1.7 Conclusion  5
Related publications  59
IBM Redbooks  59
Online resources  59
Help from IBM  59
Notices
This information was developed for products and services offered in the US. This material might be available
from IBM in other languages. However, you may be required to own a copy of the product or product version in
that language in order to access it.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user’s responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, MD-NC119, Armonk, NY 10504-1785, US
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the
materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you provide in any way it believes appropriate without
incurring any obligation to you.
The performance data and client examples cited are presented for illustrative purposes only. Actual
performance results may vary depending on specific configurations and operating conditions.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
Statements regarding IBM’s future direction or intent are subject to change or withdrawal without notice, and
represent goals and objectives only.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to actual people or business enterprises is entirely
coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are
provided “AS IS”, without warranty of any kind. IBM shall not be liable for any damages arising out of your use
of the sample programs.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines
Corporation, registered in many jurisdictions worldwide. Other product and service names might be
trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at “Copyright
and trademark information” at http://www.ibm.com/legal/copytrade.shtml
The following terms are trademarks or registered trademarks of International Business Machines Corporation,
and might also be trademarks or registered trademarks in other countries.
AIX®, DB2®, IBM®, IBM Cloud™, POWER®, POWER9™, PowerVM®, Redbooks®, Redbooks (logo)®, System z®, XIV®
Intel, Intel logo, Intel Inside logo, and Intel Centrino logo are trademarks or registered trademarks of Intel
Corporation or its subsidiaries in the United States and other countries.
The registered trademark Linux® is used pursuant to a sublicense from the Linux Foundation, the exclusive
licensee of Linus Torvalds, owner of the mark on a worldwide basis.
Windows and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Red Hat is a trademark or registered trademark of Red Hat, Inc. or its subsidiaries in the United States and other countries.
Other company, product, or service names may be trademarks or service marks of others.
Preface
This IBM® Redpaper publication provides information and concepts on how to take
advantage of HANA and Power Systems features to manage data and performance
efficiently.
The target audiences of this book are architects, IT specialists, and systems administrators deploying SAP HANA who often spend much time and effort managing data and SAP system performance.
Authors
This paper was produced in close collaboration with the IBM SAP International Competence
Center (ISICC) in Walldorf, SAP Headquarters in Germany and IBM Redbooks®.
Damon Bull is a Senior SAP Software Performance Engineer who has worked with different SAP versions from 2.2F onwards. His focus is SAP performance; he has conducted and published over 50 SAP benchmarks over the last 25 years. Damon has experience with IBM DB2®, Oracle, SQL Server, and HANA databases, as well as SAP workloads SD, BW, TRBK, FI, ATO, and many others, running on IBM AIX®, Linux, and Windows Server. Damon holds a Bachelor of Science in Computer and Information Sciences from the University of Oregon.
Vinicius Cosmo Cardoso is an IBM Certified Expert IT Specialist and Senior SAP Basis Administrator working for IBM Brazil. He has 12 years of experience in SAP, working on complex projects such as new implementations, upgrades, and platform migrations. He is an SAP certified professional for SAP NetWeaver, SAP OS/DB Migration, and SAP HANA. Vinicius holds a Master's degree in IT Solutions Architecture (2018) and a Bachelor's degree in Computer Science (2008). Throughout these 12 years, he has worked with several Brazilian and global clients across several industries.
Cleiton Freire is an IBM Cloud™ Solution Leader for the Europe, Latin America, and Canada cluster. Cleiton has almost 20 years in the information technology industry, and throughout the past 14 years at IBM, he has served as a subject matter expert on SAP on IBM Power platform migrations, installations, and administration for large customers in Brazil, the United States, and Europe. For the past couple of years, Cleiton has helped customers find the best technology solutions to address their business problems. Cleiton is a certified Technology Consultant for the SAP HANA platform and holds a Bachelor of Science degree from the Faculty of Technology in Sao Paulo, Brazil.
Eric Kass is an IBM Developer Architect within Worldwide SAP Technical Enablement in the IBM Systems group, based in Boeblingen, Germany. Eric started at IBM in 1997 in operating system network and communication development, writing device control and network routing code. In 2000, he took a year-long assignment with IBM Germany at SAP to write the database layer that SAP uses to communicate between Windows and IBM i, and he never went back; he became a successful IBM developer and architect working within the SAP development lab, designing and coding database drivers, SAP kernel components, SAP and database high availability engines, network drivers, and security protocols. Eric has over 20 patents in related fields and supports SAP customers worldwide on IBM i and AS/400 platforms with complicated bugs.
Wade Wallace
IBM Redbooks, Poughkeepsie Center
Find out more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
We want our papers to be as helpful as possible. Send us your comments about this paper or
other IBM Redbooks publications in one of the following ways:
Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
Send your comments in an email to:
[email protected]
Chapter 1. Introduction
This chapter introduces the features of IBM Power Systems for SAP HANA that help manage data and performance efficiently.
Our team of architects and engineers, who have been implementing HANA systems for years, are often asked to provide their insights about designing systems. The inquiries encountered are diverse, the conditions are various, and the answers are individual; nevertheless, in designing this book, the team anticipated the questions most likely to be encountered.
The authors intend that this book be used in a task-based fashion to find answers to questions such as: which hardware to choose, how to plan for backups, how to minimize startup time, how to migrate data to HANA, how to connect to remote systems, how much hardware you need (memory, CPU, network adapters, storage), how to reduce the memory footprint, what high availability options exist, and where to get support. Consider this book a starting guide to learn what questions to ask; and, to answer the last question first, for any questions that remain, contact the ISICC using the following website:
https://www.ibm.com/services/sap/centers
The authors understand the goal of every SAP HANA installation is to be as resilient,
available, inexpensive, fast, simple, extensible, and as manageable as possible within given
constraints; even if some of the goals happen to be contradictory. This publication is unique in
its intention to exist as a living document. SAP HANA technology and IBM Power Systems
technology supporting HANA change so quickly that any static publication is out of date
months after distribution. The development team intends to keep this book up to date with answers to the present questions.
Where does one begin to define the requirements of an SAP HANA system? Sizing - Yes, but
first it is necessary to establish your changing business goals brought forth by the changing IT
industry. The typical motivation for moving forward to a HANA based system is the path
towards digital transformation; a transformation requiring systems to perform real-time
digitalization processing. The requirements for processing pervasive user applications utilizing SAP Fiori and real-time analytics, for example, have a significantly different processing footprint (OData based) than classical form-based SAP GUI applications (process before output, process after input).
Sizing is both a science and an art. IBM bases its sizing on sizing tables established by the benchmark team, people who are highly proficient in SAP performance analysis. Conceptually, calculating HANA sizing from a classical AnyDB system begins by determining the SAPS (a standard measure of workload capacity) of your present system, then using that value as a reference into the IBM sizing tables to determine the hardware requirements for an IBM Power Systems server to run HANA with equivalent SAPS performance.
Soon after the task of sizing memory, it becomes apparent that the number of hardware permutations fitting your requirements is overwhelming; at this point, the scale-up versus scale-out decision becomes your new primary concern. Let us help most of you. Learning from a wealth of customer experience, we have a suggestion: scale up. Even though a degree of automation is available to assist in selecting which data can best be spanned across multiple nodes, significant manual management is typically required when implementing scale-out. In contrast, when selecting scale-up hardware, take into account that different hardware implementations have different performance degradation characteristics as memory utilization grows; as systems become larger, the memory-to-CPU architecture plays an important role in:
1. How distant memory is from the CPU (affinity)
2. How proficiently the system is able to manage cache to memory consistency
3. How well the CPU performs virtual address translation of large working sets of memory
IBM Power Systems are designed to scale up by way of enterprise-grade, high-bandwidth memory interconnects to the CPU and a relatively flat NUMA topology. Scaling up with Power Systems is the recommended way to support large memory footprint HANA installations (see 2.2.5, "Advantages of IBM Power Systems for HANA and HANA Data Tiering solutions" on page 18). The positive HANA on Power Systems scaling benefits are complemented by the advantage of consolidating multiple HANA, Linux, AIX, and IBM i workloads on the same machine.
Note: Acquisition costs and licensing become complicated when running mixed workloads.
Have your customer representative contact the ISICC to have them assist in providing an
optimal cost configuration. Contacting the ISICC is a good idea in general because we want to know what you have planned, and it is our job to help make the offering as attractive as possible to fit your needs.
Customers who raise SAP support incidents for SAP on IBM can take comfort from past support experience. HANA on Power Systems support channels are intricately integrated into SAP development and support. In fact, the people supporting SAP on AIX, SAP with DB2, or SAP on IBM i and System z® are members of the same team that supports HANA on Power Systems. By migrating to HANA but remaining with IBM, you have not really changed support teams. For questions regarding anything SAP, if all else fails, open an SAP support ticket to an IBM-coordinated support channel such as BC-OP-AIX, BC-OP-AS4, BC-OP-LNX-IBM, or BC-OP-S390; for issues regarding interaction with other databases, use BC-DB-DB2, BC-DB-DB4, or BC-DB-DB6.
An archiving strategy is your primary application-level control for data growth, but some applications, like BW, support near-line storage as an alternative. For data that is averse to archiving for various reasons, HANA supports a selection of hardware technologies for offloading less frequently used data (warm data or cold data) to places other than expensive RAM with expensive SAP licensing. The currently available technologies to
alleviate large data volumes and data growth are Native Storage Extension (NSE, a method of persisting explicit tables on disk and loading data into caches as necessary, similar in function to classical database engines), Extension Nodes (slower scale-out nodes), and HANA Dynamic Tiering (HANA-to-HANA near-line storage).
Note: Some of the mentioned options offered by HANA are not available for all
applications. S/4HANA has different restrictions than BW, and both S/4HANA and BW have more restrictions than HANA native applications.
SAP provides reports to run on your original system to assist with data placement, but the results are suggestions only. You must be prepared to distribute your data differently if prototypes demonstrate that other configurations are necessary. Chapter 2, "SAP HANA data growth management" on page 7 discusses options for planning SAP archiving and managing various technologies for data of different temperatures.
Even though the hardware and operating system methods of retaining persistent data in memory vary, HANA has one understanding: from HANA's perspective, persistent memory is referenced as memory-mapped files. Either the operating system or the hardware provides a memory-based file system (think RAM disk) that is exposed as a file system mounted in a path. Instructed by a configuration parameter, HANA utilizes as much persistent memory as possible and reverts to regular memory for the remainder. HANA can be instructed to attempt to store everything in persistent memory, or each table can be specifically altered to prefer persistent or non-persistent RAM.
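As a minimal sketch of how this is configured (the volume paths and table name are illustrative assumptions, not values from this paper), the mount points of the persistent memory file systems are listed in the basepath_persistent_memory_volumes parameter in the [persistence] section of global.ini, and individual tables can override the global preference with a DDL statement:
basepath_persistent_memory_volumes = /hana/pmem/HDB/vol0;/hana/pmem/HDB/vol1
-- hypothetical table; prefer persistent memory for this table only
ALTER TABLE "SAPABAP1"."ZSALES_HISTORY" PERSISTENT MEMORY ON IMMEDIATE CASCADE;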
Even with hardware support for a quick start, consider that a high-availability solution
alleviates the usual need to wait for HANA to restart; HANA is always available. That is,
with a high-availability solution in place, the preferred method for scheduled maintenance is to move to the backup site.
1.1.3 Backup
SAP HANA provides an interface for backup tools called Backint. SAP publishes a list of
certified third-party tools that conform to the standard. If you plan on using methods like
storage-based snapshot or flash copies, be certain to quiesce the HANA system before
taking a storage-based snapshot. A quiesce state is necessary to be able to apply
subsequent logs (the record of transactions that occur since the time of the quiesce) to an
image restored from the flash-copy.
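As an illustration of such a quiesce, assuming an SAP HANA 2.0 multi-tenant system and statements run in the system database (the comment strings and <backup_id> are placeholders), the prepare and confirm steps look roughly like this:
-- prepare (quiesce) the data snapshot before triggering the storage flash copy
BACKUP DATA FOR FULL SYSTEM CREATE SNAPSHOT COMMENT 'storage flash copy';
-- after the storage snapshot is taken, confirm it so that logs can later be applied to a restored image
BACKUP DATA FOR FULL SYSTEM CLOSE SNAPSHOT BACKUP_ID <backup_id> SUCCESSFUL 'external snapshot taken';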
HANA System Replication differs from storage-based replication, in which pages on disk are duplicated without consideration of the logical data structure of the change. HANA System Replication accomplishes the task of transferring changes between hosts by transferring segments of the log (segments of the record of the changes to the database that are used for transactional recovery). Changes received by backup hosts are replayed to the local database in a fashion similar to forward recovery.
The SAP standard method of exchanging hardware (for example, moving to any hardware with the same endianness) is by way of SAP HANA System Replication. See SAP Note 1984882 - Using
HANA System Replication for Hardware Exchange with minimum or zero downtime at
the following website:
https://launchpad.support.sap.com/#/notes/1984882
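A rough sketch of the two replication setup commands follows; the site names, host name, and instance number are placeholders, and the options should be checked against the SAP HANA administration documentation for your revision:
# on the current (source) system, as the <sid>adm user, enable system replication
hdbnsutil -sr_enable --name=SITE_A
# on the new hardware, with the same HANA version installed and the instance stopped, register it as the secondary
hdbnsutil -sr_register --remoteHost=<primary_host> --remoteInstance=<instance_number> --replicationMode=sync --operationMode=logreplay --name=SITE_B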
Although Smart Data Access (SDA) and Smart Data Integration (SDI) appear in the title to be specific to connectivity from SAP HANA to IBM i, this specificity is not the case. The IBM i case is a prime example of using the generic ODBC adapter for SDA and the generic Camel JDBC adapter for SDI.
1.1.7 Conclusion
Facilitating designs of superior SAP HANA systems is the goal of this publication. Your production, development, and test systems have different reliability, availability, connectivity, and security requirements, but the aspects to be considered are universal. Where tradeoffs need to be made between cost and reliability, or availability and complexity, the decisions are unique to your situation, but the intent of this book, and of the service and support you have from the ISICC and the IBM SAP development team, is to assist in optimizing your decisions so that you feel comfortable and confident with the final architecture of your design.
This chapter introduces the data temperature concept, which is used as a guide to decouple data types based on their criticality and helps companies decide when to move their data to different, but still accessible, data tiers.
The different SAP data tiering solutions fully supported on IBM Power Systems servers are also presented in this chapter, with an overview to help you decide which solution is best suited to the challenge at hand.
The purpose of this chapter is not to define the best solution for the client's needs, but to support that decision by presenting the different available solutions.
The more information is collected, the more IT resources are consumed over time, increasing
the costs for organizations because of the need for data scaling.
With SAP HANA as the database, both main memory and disk areas are consumed, which increases the TCO and impacts performance over time.
Instead of simply scaling up or scaling out the HANA database, companies are encouraged to think about options for decoupling their data location. This can be achieved by defining
what data must be in memory all the time and thus available with the highest performance for
applications and end users, and what data is less frequently accessed so it can be available
from a lower performance data tier with no impact to the business operations.
The organization can define which data is accessed infrequently, so that it can be made available to users from a reasonably performing, cheaper storage tier. This is the data temperature concept. Figure 2-1 shows how data temperature is classified.
There are benefits to using data tiering options for your HANA database:
Reduce data volume and growth in the hot store and HANA memory
Avoid performance issues on HANA databases caused by too much data being loaded into main memory
Avoid the need to scale up or scale out over time
Lower total cost of ownership (TCO)
SAP offers four data tiering solutions fully supported on IBM Power Systems for HANA:
Near-Line Storage (cold data)
SAP HANA Extension Node (warm data)
SAP HANA Native Storage Extension – NSE (warm data)
SAP HANA Dynamic Tiering (warm data)
2.2 SAP HANA data tiering options for SAP HANA databases
This section describes the different data tiering options for SAP HANA databases.
SAP IQ is a column-based, petabyte scale, relational database software system used for
business intelligence, data warehousing, and data marts. Produced by Sybase Inc., now an
SAP company, its primary function is to analyze large amounts of data in a low-cost, highly
available environment.
SAP has developed NLS (near-line storage) for use with SAP NetWeaver BW and SAP BW/4HANA only. The required SAP BW/4HANA version and support package level is BW/4HANA 1.0 SPS 00 or later.
For more information about the minimum release level for SAP BW, refer to SAP Note
1796393 - SAP BW near-line solution with SAP IQ at the following website:
https://launchpad.support.sap.com/#/notes/1796393
Implementing HANA Smart Data Access (SDA) for accessing the NLS data is optional. HANA
SDA optimizes execution of queries by moving processing as far as possible to the database
connected by way of SDA. The SQL queries then work in SAP HANA on virtual tables. The
HANA Query Processor optimizes the queries and executes the relevant part in the
connected database, then returns the result to SAP HANA and completes the operation.
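As a brief, hedged illustration at the SQL level (the remote source, schema, and table names are hypothetical, and the remote source pointing to the SAP IQ NLS database is assumed to already exist), a virtual table is created over the remote object and then queried like a local table:
CREATE VIRTUAL TABLE "SAPBW"."VT_NLS_SALES" AT "NLS_IQ"."<NULL>"."dba"."SALES_ARCHIVE";
SELECT COUNT(*) FROM "SAPBW"."VT_NLS_SALES";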
Usage of the NLS solution with HANA SDA is supported as of SAP NetWeaver BW 7.40 SP8 or higher, or SAP BW/4HANA 1.0 or higher.
Note: To make use of the SDA solution, the BW application team has to configure the
BW objects accordingly.
For more detailed information about the usage of HANA SDA for NLS and the performance
benefits, refer to SAP Notes 2165650, 2156717 and 2100962 at the following websites:
https://launchpad.support.sap.com/#/notes/2165650
https://launchpad.support.sap.com/#/notes/2156717
https://launchpad.support.sap.com/#/notes/2100962
The architecture of NLS implementation with SAP NetWeaver BW on HANA and SAP
BW/4HANA with Smart Data Access is shown in Figure 2-3.
For the Smart Data Access implementation, SAP develops and provides the packages, which are fully supported on Power Systems servers.
For all details about the implementation of NLS for SAP BW, refer to SAP Note 2780668 -
SAP First Guidance - BW NLS Implementation with SAP IQ at the following website:
https://launchpad.support.sap.com/#/notes/2780668
Note: As a rule of thumb, this means that the SAP recommendation for memory sizing is RAMdynamic = RAMstatic.
For example, if you have a footprint of 500 GB, your HANA database memory size must be at
least 1 TB.
The SAP HANA Extension Node can handle twice as much data with the same amount of memory and fewer cores.
For example, if you expect to have up to 1 TB of footprint in your Extension Node, you can
have up to 250 GB of memory for the Extension Node.
Be aware that the ideal use case for the Extension Node is BW on HANA or BW/4HANA, as the BW application fully controls the data distribution, correct partitioning, and access paths. For SAP HANA native applications, all data categorization and distribution must be handled manually.
The SAP HANA Extension Node is fully built into the SAP HANA platform, and it is fully supported on Power Systems.
For additional information about the implementation of Extension Node, see the FAQ SAP
Notes 2486706 and 2643763 at the following websites:
https://launchpad.support.sap.com/#/notes/2486706
https://launchpad.support.sap.com/#/notes/2643763
To activate NSE, you need to configure your warm-data-related tables, partitions, or columns to be page loadable by way of SQL DDL commands.
After the table, partition, or column is configured to use NSE, it is not fully loaded into memory anymore, but it is readable using the buffer cache, and, of course, the performance for accessing data in that table is significantly slower than accessing it in memory.
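A minimal sketch of such DDL follows; the schema, table, and partition number are hypothetical, and the exact syntax for your HANA revision should be verified against the SAP HANA Administration Guide:
-- make an entire table (and dependent structures) page loadable
ALTER TABLE "SAPABAP1"."ZSALES_HISTORY" PAGE LOADABLE CASCADE;
-- make only one partition of the table page loadable
ALTER TABLE "SAPABAP1"."ZSALES_HISTORY" ALTER PARTITION 2 PAGE LOADABLE;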
Because the data is not kept fully in memory, the buffer cache is used instead. This can drastically reduce TCO, because much of the TCO is related to the amount of memory (DRAM) necessary in the server.
Also, depending on the amount of data moved to NSE, HANA startup can be faster during scheduled and unexpected maintenance.
Supportability
NSE is supported for SAP HANA native applications, for use with S/4HANA, and also with SAP Suite on HANA (SoH).
Note: SAP recommends using NSE with S/4HANA or SoH in the context of Data Aging
only. If Data Aging is used in S/4HANA or SoH with SAP HANA 2.0 SPS4, NSE is used for
storing the historical partitions of tables in aging objects. Refer to SAP Note 2816823 -
Use of SAP HANA Native Storage Extension in SAP S/4HANA to learn more about the
use case for S/4HANA at the following website:
https://launchpad.support.sap.com/#/notes/2816823
Based on the workload on the table, partition, or column over time, the NSE Advisor helps to identify frequently accessed and rarely accessed objects so that system administrators can decide which objects can be moved to NSE. Figure 2-5 shows the NSE architecture perspective.
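A hedged sketch of how the NSE Advisor is typically enabled and read follows; the parameter and view names are as documented for SAP HANA 2.0 SPS04 and should be verified against SAP Note 2799997 for your revision:
-- enable collection of column store access statistics for the NSE Advisor
ALTER SYSTEM ALTER CONFIGURATION ('indexserver.ini', 'SYSTEM') SET ('cs_access_statistics', 'collection_enabled') = 'true' WITH RECONFIGURE;
-- after a representative workload has run, review the load-unit recommendations
SELECT * FROM "SYS"."M_CS_NSE_ADVISOR";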
As NSE comes with HANA 2.0 SPS4, it is fully supported on IBM Power Systems. For
detailed information about using NSE, see SAP Note 2799997 at the following website:
https://launchpad.support.sap.com/#/notes/2799997
SAP HANA Dynamic Tiering does not come with the standard installation package; you need to download it as an additional component and install it on SAP HANA.
In SAP HANA Dynamic Tiering, you can create two types of warm tables: the extended table and the multistore table. The extended table type is totally disk-based, therefore all of its data is stored on disk. The multistore table is an SAP HANA partitioned table with some partitions in memory and some partitions on disk.
The distribution of the data among the in-memory store tables, the extended tables, and the
multistore tables is shown in Figure 2-6.
SAP Dynamic Tiering can be installed in the same server hosting SAP HANA or in a separate
dedicated host server. You can also install a second dynamic tiering host as a standby host
for high availability purposes.
The operating system process for the dynamic tiering host is hdbesserver, and the service
name is esserver.
SAP Dynamic Tiering only supports low tenant isolation. Any attempt to provision the
dynamic tiering service (esserver) to a tenant database with high-level isolation fails. After the
dynamic tiering implementation, if you try to configure the tenant database with high-isolation,
the dynamic tiering service stops working.
For a complete list of the parameters, refer to the SAP official guide SAP HANA Dynamic
Tiering: Administration Guide at the following website:
https://bit.ly/2JTQZGK
Note: Same host deployments are primarily designed for small, nonproduction
environments, but are supported in production environments.
This is a deployment in which organizations can take advantage of all Power Systems benefits, with the flexibility of LPAR support and low network latency.
Figure 2-9 Dynamic tiering: More than one dynamic tiering server deployment
2.2.5 Advantages of IBM Power Systems for HANA and HANA Data Tiering
solutions
With Power Systems, it is possible to implement a scale-up or scale-out SAP HANA database environment, taking advantage of its flexibility.
With support for multiple LPARs, organizations can consolidate multiple SAP HANA instances, or SAP HANA nodes of the same instance (multi-host), on a single Power Systems server, leveraging its simplicity of management with low-latency networking.
Using IBM PowerVM®, businesses can virtualize up to 16 production SAP HANA instances on the logical partitions (LPARs) of a single Power Systems server (on E950 and E980 IBM POWER9™ servers). It is also possible to move memory and CPUs between the LPARs with flexible granularity (see SAP Note 2230704 - SAP HANA on IBM Power Systems with multiple LPARs per physical host at https://launchpad.support.sap.com/#/notes/2230704).
IBM PowerVM (the Power Systems hypervisor) allows for more granular scaling and dynamically changing allocation of system resources. This means businesses avoid adding new hardware that can cause higher energy, cooling, and management needs.
The HANA on IBM POWER® solution runs the same SUSE or Red Hat Linux distributions as x86 servers, with the flexibility, scalability, resiliency, and performance advantages of POWER servers.
IBM Power Systems is the best solution for implementing SAP HANA scale-up and scale-out,
and the data tiering solutions presented in this section. Refer to the IBM Power Systems web
page for additional and updated information:
https://www.ibm.com/it-infrastructure/power/sap-hana
Before starting
This section describes a high-level, step-by-step installation of dynamic tiering on SAP HANA 2.0 on Power Systems (little endian), and demonstrates how to create extended store and multistore tables and manipulate data in them.
Using the console interface (CI), you are introduced to the installation of SAP HANA Dynamic Tiering in a same host deployment, that is, on the same server where SAP HANA is installed, with no additional HANA nodes.
Note: Some data manipulation is demonstrated using the SAP HANA Studio.
For detailed procedures, refer to the SAP official documents SAP HANA Dynamic Tiering:
Master Guide, SAP HANA Dynamic Tiering: Installation and Update Guide, and SAP HANA
Dynamic Tiering: Administration Guide at the following website:
https://help.sap.com/viewer/product/SAP_HANA_DYNAMIC_TIERING/1.11.0.0/en-US
Note: IBM Power Systems environments require the appropriate IBM XL C/C++
redistributable libraries. Download and install the appropriate runtime environment from
the latest updates from the supported IBM C and C++ Compilers page on the IBM Support
Portal (https://ibm.co/32gjMvb). Install the libraries on both the SAP HANA and dynamic
tiering hosts. These libraries are not required for an Intel-based hardware platform
environment.
Check the SAP HANA compatibility with the dynamic tiering version
Check SAP Note 2563560 - SAP HANA Dynamic Tiering 2.0 SP 03 Release Note for a
matrix of compatible SAP HANA and SAP HANA Dynamic Tiering versions at the following
website:
https://launchpad.support.sap.com/#/notes/2563560
Note: This is the latest compatibility matrix SAP Note at the time this publication was
written.
Support Packages and Patches → By Alphabetical Index (A-Z) → H → SAP HANA DYNAMIC
TIERING → SAP HANA DYNAMIC TIERING 2.0 → COMPRISED SOFTWARE
COMPONENT VERSIONS → SAP HANA DYNAMIC TIERING 2.0.
Now you can download the SAP HANA Dynamic Tiering revision compatible with the SAP
HANA 2.0 SP level you have in place or will be installing.
Choose an action
----------------------------------------------------------------------------------
1 | add_host_roles | Add Host Roles
2 | add_hosts | Add Hosts to the SAP HANA Database System
3 | check_installation | Check SAP HANA Database Installation
4 | configure_internal_network | Configure Inter-Service Communication
5 | configure_sld | Configure System Landscape Directory
Registration
6 | extract_components | Extract Components
7 | print_component_list | Print Component List
8 | remove_host_roles | Remove Host Roles
9 | rename_system | Rename the SAP HANA Database System
10 | uninstall | Uninstall SAP HANA Database Components
11 | unregister_system | Unregister the SAP HANA Database System
12 | update | Update the SAP HANA Database System
13 | update_component_list | Update Component List
14 | update_components | Install or Update Additional Components
15 | update_host | Update the SAP HANA Database Instance Host
integration
16 | exit | Exit (do nothing)
4. At the Choose components to be installed or updated prompt, select Install SAP HANA
Dynamic Tiering as shown in Example 2-3.
Example 2-3 Dynamic tiering installation: Choose components to be installed or updated window
Choose components to be installed or updated:
5. You are prompted to add an additional host. Enter no as shown in Example 2-4.
6. You are prompted to enter the System Database User Name and password. Enter
SYSTEM and its password as shown in Example 2-5.
7. You are now asked to enter the dynamic tiering data and log volume paths. In this case, the paths are /hana/data/dt_es/H01 for the data volumes and /hana/log/dt_es/H01 for the log volumes, as shown in Example 2-6 on page 22.
8. Finally, you are asked to confirm all added parameters. Enter y as shown in Example 2-7.
During the installation process and until it is completed, you see messages similar to those shown in Example 2-8.
Example 2-9 Dynamic tiering role: Execution of HANA Lifecycle Management tool
./hdblcm
SAP HANA Lifecycle Management - SAP HANA Database 2.00.044.00.1571081837
************************************************************************
Choose an action
----------------------------------------------------------------------------------
1 | add_host_roles | Add Host Roles
2 | add_hosts | Add Hosts to the SAP HANA Database System
3 | check_installation | Check SAP HANA Database Installation
4 | configure_internal_network | Configure Inter-Service Communication
5 | configure_sld | Configure System Landscape Directory
Registration
6 | extract_components | Extract Components
7 | print_component_list | Print Component List
8 | remove_host_roles | Remove Host Roles
9 | rename_system | Rename the SAP HANA Database System
10 | uninstall | Uninstall SAP HANA Database Components
11 | unregister_system | Unregister the SAP HANA Database System
12 | update | Update the SAP HANA Database System
13 | update_component_list | Update Component List
14 | update_components | Install or Update Additional Components
15 | update_host | Update the SAP HANA Database Instance Host
integration
16 | exit | Exit (do nothing)
Select the SAP HANA host to be assigned the additional dynamic tiering role. In this case,
there is just one host as shown in Example 2-10.
Example 2-10 Dynamic tiering role: Host selection for adding role
System Properties:
H01 /hana/shared/H01 HDB_ALONE
HDB00
version: 2.00.044.00.1571081837
host: pdemo1 (Database Worker (worker))
edition: SAP HANA Database
In Select additional host roles for host '<host>', select the Dynamic Tiering Worker role (extended_storage_worker), and add the <sid>adm ID password for it as shown in Example 2-11.
----------------------------------------------------------------------------------
1 | extended_storage_worker | Dynamic Tiering Worker
(extended_storage_worker)
You are prompted to confirm all added parameters. Confirm and enter y as shown in Example 2-12.
At the end of the process, you see a summary of the installation as shown in Example 2-13.
Using HANA Studio, log into the tenant database as SYSTEM ID. In the Overview tab
window, you see the SAP HANA Dynamic Tiering status as Installed but not running yet for
that tenant as shown in Figure 2-10.
Click the Landscape tab to see the Dynamic Tiering Server service esserver, as shown in Figure 2-11.
The extended store consists of a database space, which is the file on disk where dynamic tiering stores its tables and partitions.
Note: The extended storage is created with the SYSTEM ID at this time, but for all subsequent activities, you need another user ID that has all the necessary privileges; that user is the owner of the extended store and multistore tables.
In HANA Studio, connect with the SYSTEM ID to the HANA tenant database to which the dynamic tiering service has been provisioned.
Note: In our case, there is a single tenant as the initial tenant and the SAP HANA instance
never previously contained additional tenants, hence the dynamic tiering service is
automatically provisioned to the tenant database.
For this demonstration, create one extended store with 1 GB of allocated space. Right-click the tenant database and click Open SQL Console to open the SQL console. Then enter the command shown in Example 2-14.
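A plausible form of this statement, based on the host name and size used in this section (it may differ from the exact listing in Example 2-14), is:
CREATE EXTENDED STORAGE AT 'pdemo1' SIZE 1000 MB;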
Notice that pdemo1 is the location (host) used in this lab. Adjust it according to your host.
Figure 2-12 Dynamic tiering: Result for extended storage creation command line
Now click the Overview tab in HANA Studio; you see the status of SAP HANA Dynamic Tiering as Running, as shown in Figure 2-13.
Your Dynamic Tiering is now ready for you to create an Extended Table or multistore table.
To create the user ID using HANA Studio, expand the Security folder of the tenant tree, right-click Users, and then click New User.
Define a name for the user and add the System Privileges CATALOG READ, EXTENDED
STORAGE ADMIN and IMPORT. You can choose to not force the password change in the
first logon if you prefer. The parameters are shown in Figure 2-14.
In this case, the user ID is RPDT. Log in to the tenant with the user ID that you defined.
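The SQL equivalent of these HANA Studio steps would look roughly like the following; the password is a placeholder, and skipping the forced password change is optional:
CREATE USER RPDT PASSWORD "<InitialPassword>" NO FORCE_FIRST_PASSWORD_CHANGE;
GRANT CATALOG READ TO RPDT;
GRANT EXTENDED STORAGE ADMIN TO RPDT;
GRANT IMPORT TO RPDT;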
The same table has been created in this demonstration, as shown in Figure 2-15. As you can see on the left side of the window, the table has been identified as an EXTENDED table in the HANA catalog for user RPDT.
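A hedged sketch of such an extended table definition, with assumed data types matching the columns used in the INSERT statement in Example 2-16 (it may differ from the exact definition shown in Figure 2-15), is:
CREATE TABLE "RPDT"."CUSTOMER_ES" (
C_CUSTKEY INTEGER,
C_NAME VARCHAR(40),
C_ADDRESS VARCHAR(60),
C_PHONE BIGINT
) USING EXTENDED STORAGE;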
Restriction: Foreign keys between two extended tables or between an extended table and
an in-memory table are not supported.
For inserting data, you can use the same syntax as a common in-memory column store table
as shown in Example 2-16 and in Figure 2-16.
Example 2-16 Dynamic tiering: Insert data into table CUSTOMER_ES command line
INSERT INTO "RPDT"."CUSTOMER_ES"
(C_CUSTKEY, C_NAME, C_ADDRESS, C_PHONE)
VALUES
(1,'CUSTOMER 1','ADDRESS 1', 19999999999);
Figure 2-16 Dynamic tiering: Insert data into table CUSTOMER_ES - HANA Studio window
For reading data from the table, the SQL syntax is exactly the same as for reading data from any other table. Hence, you can select data in the same way as from any other in-memory column store table. Refer to Figure 2-17.
Figure 2-17 Dynamic tiering: See contents of the extended table CUSTOMER_ES in HANA Studio
In this case, create a table called SALES_ORDER and partition it by RANGE, using a date
type field as the partition field. In the default store, the values range from 2010/12/31 to
9999/12/31, and in the extended store, the values range from 1900/12/31 to 2010/12/31, as
shown in Example 2-17.
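A hedged sketch of such a multistore table definition follows; the column data types are assumptions, and the exact placement of the USING DEFAULT STORAGE and USING EXTENDED STORAGE clauses varies by dynamic tiering version, so verify it against the SAP HANA Dynamic Tiering: Administration Guide (it may differ from the exact listing in Example 2-17):
CREATE TABLE "RPDT"."SALES_ORDER" (
S_SALESOKEY INTEGER,
S_CUSTOMER INTEGER,
S_VALUE DECIMAL(15,2),
S_DATE DATE
)
PARTITION BY RANGE (S_DATE)
((PARTITION '2010-12-31' <= VALUES < '9999-12-31') USING DEFAULT STORAGE,
(PARTITION '1900-12-31' <= VALUES < '2010-12-31') USING EXTENDED STORAGE);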
In HANA Studio, you can see that the multistore table symbol differs from the extended table
symbol as shown in Figure 2-18.
The Data Manipulation Language (DML) operations on the table do not differ from those on any other table type. So, from the user's (application and developer) perspective, there is no concern about the table type.
Query the TABLE_PARTITIONS system view as shown in Example 2-18; you can see that the new table has two partitions, one in the default store and another in the extended store, as shown in Figure 2-19.
Example 2-18 Dynamic tiering: Query SALES_ORDER table from TABLE_PARTITIONS table
SELECT SCHEMA_NAME, TABLE_NAME, PART_ID, STORAGE_TYPE FROM TABLE_PARTITIONS WHERE
TABLE_NAME = 'SALES_ORDER' AND SCHEMA_NAME = 'RPDT'
Now insert a row in the table so that it is stored in the default store with the command as
shown in Example 2-19.
Example 2-19 Dynamic tiering: Inserting a row in the default store partition of SALES_ORDER table
INSERT INTO "RPDT"."SALES_ORDER"
(S_SALESOKEY, S_CUSTOMER, S_VALUE, S_DATE)
VALUES
(1,1,120,'2011-12-11');
From table M_CS_TABLES (table that shows data stored in the default store) as shown in
Example 2-20, you can see that there is one record in the default store as part of
SALES_ORDER table as shown in Figure 2-20 on page 32.
Example 2-20 Checking the count of default store rows of table SALES_ORDER
SELECT RECORD_COUNT FROM M_CS_TABLES WHERE TABLE_NAME = 'SALES_ORDER' AND
SCHEMA_NAME = 'RPDT';
Figure 2-20 Dynamic tiering: Results for the count of default store rows of table SALES_ORDER
Now insert one more row in the extended store of table SALES_ORDER as shown in
Example 2-21.
Example 2-21 Inserting a row in the extended store partition of SALES_ORDER table
INSERT INTO "RPDT"."SALES_ORDER"
(S_SALESOKEY, S_CUSTOMER, S_VALUE, S_DATE)
VALUES
(1,1,120,'2009-12-11');
From table M_ES_TABLES (table that shows data stored in the extended store) as shown in
Example 2-22, you can now see that there is one record in the extended store as part of
SALES_ORDER table as shown in Figure 2-21.
Example 2-22 Checking the count of extended store rows of table SALES_ORDER
SELECT * FROM M_ES_TABLES WHERE TABLE_NAME = 'SALES_ORDER' AND SCHEMA_NAME =
'RPDT';
Figure 2-21 Dynamic tiering: Results for the count of extended store rows of table SALES_ORDER
When making a normal DML select, you can see the two rows normally, with no distinction
between default store and extended store as shown in Figure 2-22.
Figure 2-22 Dynamic tiering: Normal select statement from a multistore table
This chapter discusses different solutions that can help speed up the start of large HANA databases to help minimize downtime. This chapter contains the following sections:
Virtual Persistent Memory (vPMEM)
RAM disk (tmpfs): SAP HANA Fast Restart Option
Persistent disk storage using native NVMe devices
NVMe Rapid Cold Start Mirror
Comparison to Intel Optane
Impact to LPM capabilities
The PowerVM Persistent Memory architecture allows for multiple types of memory to be
defined and deployed for different use cases. Currently, the vPMEM solution creates
persistent storage volumes from standard system DRAM, providing high speed persistent
access to data for applications running in an LPAR. For this current solution, no special
memory or storage devices are required, just unused available system DRAM. There is the intention to support enhancements that will allow other types of memory to be used for different use cases.
The IBM vPMEM solution is incorporated into the PowerVM hypervisor management interface for POWER9 and later systems. vPMEM volumes are created as part of a specific LPAR definition.
Each defined LPAR on a system can have a dedicated vPMEM volume. Individual vPMEM
volumes are not sharable between LPARs, and vPMEM volumes are not transferable to
another LPAR.
The PowerVM hypervisor allocates an independent segment of system memory for the vPMEM volume and associates it with the LPAR. This system memory segment is separate from the DRAM memory that is defined in the LPAR profile for the operating system and its applications to use. Because the application uses this persistent system memory volume as a disk resource, any data stored in the vPMEM device persists if the LPAR is restarted.
Access to vPMEM volumes by the Linux operating system is provided by the standard non-volatile memory device (libnvdimm) subsystem in the Linux kernel and the corresponding ndctl utilities. The resulting vPMEM volumes are then mounted on the Linux file system as Direct Access (DAX) type volumes.
When HANA detects the presence of DAX vPMEM volumes, it starts the copy of main column
store table data into these defined persistent volumes. Through its default settings, HANA
attempts to copy all compatible column store table data into the vPMEM volumes, maintaining
a small amount of space in LPAR DRAM for column store table metadata. This creates a
persistent memory copy of the table data that HANA then uses for query and transactional
processing. HANA can also be configured to copy into vPMEM only desired column store
tables, or even just specific partitions of individual column store tables.
To access the column store table data on the vPMEM volumes, HANA creates memory
mapped pointers from the DRAM memory structures to the column store table data.
Considering these vPMEM volumes are allocated from memory, accessing the table data is
done at memory speeds with no degradation in performance as compared to when the
column store data is stored without vPMEM in DRAM.
Any data changed in or added to tables loaded into the vPMEM device is synchronized to disk-based persistent storage with normal HANA savepoint activity. When HANA is shut
down, all unsaved data stored in the LPAR DRAM and the vPMEM volumes will be
synchronized to persistent disk.
When HANA is shut down, the column store table data persists in the vPMEM volumes. The
next time HANA starts, it detects the presence of the column store table data in the vPMEM
volumes and skips loading that data. HANA instead just re-creates memory structure pointers
to the columnar table data stored on the vPMEM volumes. This results in a significant
reduction in HANA startup times as shown in Figure 3-1.
Figure 3-1 Start and shutdown times for a large HANA OLAP database with and without vPMEM
This chart shows the substantial time savings for HANA start for a large OLAP database when all of the database's column store data has been allocated in vPMEM volumes. There are also time savings at HANA shutdown, because not as much DRAM memory needs to be programmatically tagged as freed and returned to the operating system.
It is recommended to size the vPMEM volume 10-15% larger than required, to allow for growth of the database over time.
Because less LPAR DRAM memory is used by HANA to store the column store table data that resides on the vPMEM volume, the RAM memory allocation for the LPAR can be reduced by a similar amount to avoid using more system memory than is required for the LPAR. This memory definition adjustment is done in the LPAR profile on the HMC.
The resource allocation of CPU and memory for a system's LPARs can be queried by performing a resource dump from the HMC. There are two methods: one within the HMC GUI, which is detailed at the IBM Support page https://www.ibm.com/support/pages/how-initiate-resource-dump-hmc-enhanced-gui, and one from the HMC command line, by logging on with an appropriately privileged HMC user account, such as hscroot, and executing the command shown in Example 3-2.
Example 3-2 Start a Resource Dump from the HMC command line
startdump -m <system name> -t resource -r 'hvlpconfigdata -affinity -domain'
Substitute <system_name> with the system name defined on the HMC that is hosting the
desired LPAR.
Both methods create a time-stamped resource dump file in the /dump directory. Depending on the size of the system, it can take a few minutes before the dump file is ready.
View the list of resource dump files on the HMC, listed in chronological order using the
command as shown in Example 3-3.
The last in the list can be viewed with the less command as shown in Example 3-4.
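A plausible form of these two steps on the HMC command line (the dump file name components are placeholders, and the exact commands in Examples 3-3 and 3-4 may differ) is:
# list the resource dump files in chronological order, newest last
ls -ltr /dump/RSCDUMP*
# view the newest dump file
less /dump/RSCDUMP.<serial>.<dump_id>.<timestamp>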
The less command might report that the file is binary, as some of the data in the file is in this format, but the details of interest are in text form, as shown in Example 3-5 for an E950 system with four sockets and 2 TB of RAM, with a single running LPAR that has been allocated 48 cores and 1 TB of the available memory.
Example 3-5 Main section of the RSCDUMP file list CPU and memory resources assigned to LPARs
Domain (SEC/PRI) | Procs Total | Procs Free | Units Free | Memory Total | Memory Free | LP | Proc Units Tgt | Proc Units Aloc | Memory Tgt | Memory Aloc | Ratio
0 / 0            | 1200        | 0          | 0          | 2048         | 311         | 1  | 1200           | 1200            | 1023       | 1023        | 0
1 / 1            | 1200        | 0          | 0          | 2048         | 269         | 1  | 1200           | 1200            | 1024       | 1024        | 0
2 / 2            | 1200        | 0          | 0          | 2048         | 312         | 1  | 1200           | 1200            | 1024       | 1024        | 0
3 / 3            | 1200        | 0          | 0          | 2048         | 311         | 1  | 1200           | 1200            | 1025       | 1025        | 0
In Example 3-5, the columns of data that are of interest in this context have the following
meanings:
Domain SEC - This is the socket number where the cores and memory are installed. In Example 3-5, the system has four sockets, 0 - 3.
Domain PRI - This is the NUMA domain number. Example 3-5 has 4 NUMA domains, 0 -
3, and each NUMA domain aligns to a socket number. Some Power Systems have two
NUMA domains per socket.
Procs Total - This is the number of processors in 1/100-of-a-processor increments. As PowerVM can allocate sub-processor partitions in 1/100ths of a single core, this number is 100 times larger than the actual number of cores on the NUMA domain. Example 3-5 shows that each socket has a total of 12 cores.
Procs Free / Units Free - The total number of 1/100th of a core processor resources that
are available.
Memory Total - This is the total amount of memory available on the NUMA domain. This
number is four times larger than the actual memory in GB that is available. Example 3-5
shows each socket has 512 GB of RAM installed, for a total system capacity of 2 TB.
Memory Free - This is the amount of memory that is not in use by assigned LPARs on the
NUMA domain. Again, this value is four times larger than the actual memory in GB of
available memory. This is the important detail in determining the amount of memory that
can be used for a vPMEM volume considering this value decreases after the creation of
the vPMEM volume. Example 3-5 on page 39 shows sockets 0, 2 and 3 all have about 75
GB of available memory, and socket 1 has about 65 GB of available memory. This system
already has a vPMEM volume of 700 GB assigned to the running LPAR.
LP - This is the LPAR number, as defined in the HMC. Example 3-5 on page 39 shows
only one LPAR running, LPAR number 1, it is assigned all CPU resources, and a subset of
the available RAM resources.
Proc Units Tgt - This is the number of sub-processor units that are assigned to the LPAR from the NUMA domain. This is allocated from the value in the Procs Total column. Example 3-5 on page 39 shows the target allocation of processing units is 1200.
Proc Units Aloc - This is the number of subprocessor units that have been allocated to
the LPAR. Example 3-5 on page 39 shows all 1200 units per socket are assigned and
activated to the LPAR across all four NUMA domains or sockets.
Memory Tgt - This is the amount of desired memory assigned to the LPAR's DRAM configuration as defined in the LPAR profile. Again, this value is four times larger than the actual memory (GB) assigned, and the hypervisor allocates this memory per NUMA domain in the same ratio as the processing unit allocation across the assigned NUMA domains. Example 3-5 on page 39 shows around 256 GB is targeted to be allocated to each NUMA domain, in the same ratio as the processing units. This means that memory is evenly distributed, just as the processing units are evenly distributed.
Memory Aloc - This is the real allocation of memory to the LPAR per NUMA domain.
Example 3-5 on page 39 shows all memory requested has been allocated to the LPAR.
Summing up these values across the system reflects the LPAR DRAM memory allocation
as seen by the operating system.
If the system has vPMEM volumes already assigned, this memory allocation is not explicitly listed in this output. The memory values for the LPARs are those that are assigned to the LPAR's memory allocation in the profile. To determine the approximate amount of memory that vPMEM volumes are taking on a specific socket, add up the memory allocations for the LPARs on that socket and subtract that value from the Memory Total. Taking this result and subtracting the Memory Free value from it gives the amount of RAM used by the vPMEM volume, as shown in Example 3-6.
Example 3-5 on page 39 for socket 0 used the values as shown in Example 3-7.
Considering the same memory allocation is assigned across all four nodes, the total vPMEM
device allocated to this LPAR is about 714 GB.
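A minimal sketch of this calculation, using the approximate socket 0 values quoted above (the exact figures in Example 3-5 and Example 3-7 differ slightly, and the raw command output is in 1/4 GB units rather than GB):
# Approximate socket 0 values in GB: Memory Total = 512, LPAR Memory Tgt = 256, Memory Free = 75
# vPMEM usage on the socket = (Memory Total - sum of LPAR allocations) - Memory Free
echo $(( (512 - 256) - 75 ))    # roughly 181 GB of RAM backing the vPMEM volume on this socket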
Figure 3-3 Persistent Memory pane: Empty list of defined vPMEM volumes
Add a descriptive volume name, enter the total size in megabytes of the desired vPMEM device,
and select the Affinity option. Clicking OK creates a single vPMEM device for the LPAR.
Figure 3-5 An 8 TB system with memory allocated across four NUMA domains
Figure 3-5 on page 42 shows an 8 TB system with memory allocated across four NUMA
domains. Creating a 4 TB vPMEM device with NUMA affinity creates one vPMEM device per
NUMA node, each 1 TB in size.
Dividing the vPMEM volume into segments and affinitizing them to NUMA
boundaries enables applications to access data in physically aligned NUMA node memory
ranges. Because data is usually accessed sequentially, storing data in a NUMA-optimized way
gives the best throughput and access latency.
Affinitized vPMEM volumes are the only option that is supported for use with HANA.
Figure 3-6 An 8 TB system with memory allocated across four NUMA nodes
Figure 3-6 shows an 8 TB system with memory allocated across four NUMA nodes. Creating a
4 TB non-affinitized vPMEM device results in a single 4 TB device that is striped across all
NUMA nodes.
Currently, this virtual persistent memory device option is not supported for use with HANA.
The persistent memory volumes are then initialized, enabled and activated with the standard
operating system non-volatile DIMM control (ndctl) commands. These utilities are not
provided by default in a base-level Linux installation, but they are included in the Linux distribution.
Install them with the distribution's package management commands; for example, on Red Hat,
install the package with the command shown in Example 3-8.
Example 3-8 Red Hat command line installation of the ndctl package
yum install ndctl
On Power Systems, each vPMEM volume is initialized and activated automatically. There are as
many /dev/nmem and /dev/pmem devices as there are NUMA nodes assigned to the LPAR, as
shown in Example 3-10.
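As an illustration of what this looks like, a hedged sketch for an LPAR with four NUMA nodes (device numbering is system specific):
# List the persistent memory devices created for the LPAR
ls /dev/nmem* /dev/pmem*
/dev/nmem0  /dev/nmem1  /dev/nmem2  /dev/nmem3
/dev/pmem0  /dev/pmem1  /dev/pmem2  /dev/pmem3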
If the /dev/pmem devices are not created automatically by the system during the initial
operating system start, then these need to be created. Example 3-11 shows the set of ndctl
commands to initialize the raw /dev/nmem device, where X is the device number (for example,
/dev/nmem0).
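A minimal sketch of the kind of ndctl invocation typically used for this step (the exact commands and options in Example 3-11 may differ):
# List the raw nmem devices and their namespaces
ndctl list -N -u
# Reconfigure the namespace on device X into fsdax mode so that a /dev/pmemX block device is created
ndctl create-namespace -f -e namespaceX.0 --mode=fsdax --map=mem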
New device definitions are then created as /dev/pmemX. These are the new disk devices that
need to be formatted as shown in Example 3-12 on page 45.
Example 3-12 Creating the XFS file system on the vPMEM device
# mkfs -t xfs -b size=64k -s size=512 -f /dev/pmemX
When mounting the vPMEM volumes, it is advised to use the /dev/disk/by-uuid identifiers for
the volumes. These identifiers remain stable even if the OS renames the devices on a restart. Also,
the volumes must be mounted with the -o dax option, as shown in Example 3-13.
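A minimal sketch of such a mount command, reusing the UUID and mount point that appear in Example 3-14 below:
mount -o dax /dev/disk/by-uuid/34cb1120-1a61-47e5-9bcc-5b60e6d8e1d /hana/data/vPMEM/vPMEM0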
To mount the volumes automatically on system restart, add an entry in the /etc/fstab file for
each vPMEM volume, using the corresponding uuid name and adding the option dax in the
options column. Using the uuid name of the volume guarantees correct remounting because
the /dev/pmemX device number can change after an OS restart. Example 3-14 shows an entry in
the fstab file for one vPMEM volume.
Example 3-14 Add vPMEM devices into the /etc/fstab file for automatic mounting on OS start
/dev/disk/by-uuid/34cb1120-1a61-47e5-9bcc-5b60e6d8e1d /hana/data/vPMEM/vPMEM0 xfs
defaults,dax 0 0
vPMEM volumes are not traditional block devices. Therefore, the normal block device
monitoring tools, for example iostat and nmon, cannot monitor the I/O to the vPMEM
devices. However, normal directory monitoring tools, for example du, do work because the files
consume the available storage space of the vPMEM volume.
Example 3-15 Entry in the global.ini file defining the paths to the vPMEM volume directories
[persistence]
basepath_persistent_memory_volumes =
/path/to/first/directory;/path/to/second/directory;…
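As an illustration only, with the four NUMA-affinitized mount points used earlier in this chapter (these directory names are an assumption, not values from the original example):
[persistence]
basepath_persistent_memory_volumes = /hana/data/vPMEM/vPMEM0;/hana/data/vPMEM/vPMEM1;/hana/data/vPMEM/vPMEM2;/hana/data/vPMEM/vPMEM3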
This parameter is an offline change only; a restart of HANA is required for it to take effect.
On first startup, HANA by default copies all column store table data, or as much as possible,
from persistent disk into the newly configured vPMEM volumes. With partitioned
column store tables, HANA assigns partitions to the vPMEM volumes in a round-robin fashion
to evenly distribute the column store table partitions across the entire vPMEM NUMA
assignment. When all column store data for a table is loaded into the vPMEM volumes, HANA
maintains only a small amount of column store table metadata in normal LPAR DRAM.
If this default behavior is not wanted, change the indexserver.ini parameter that turns off the
loading of all column store tables into persistent memory, as shown in Example 3-16.
Example 3-16 Change default behavior of HANA to not load all tables on HANA startup
[persistent memory]
table_default = OFF
The table_default parameter's default value is DEFAULT, which is synonymous with the value
ON. This parameter, together with the global.ini parameter
basepath_persistent_memory_volumes, makes loading all table data into the vPMEM devices
HANA's default behavior.
This is a dynamic parameter. If there is currently column store table data in the vPMEM
volumes, performing a HANA UNLOAD of the unneeded columnar table data removes that
data from the persistent device, after which a LOAD operation is needed to reload the table
data into system DRAM. Alternatively, HANA can be shut down so that the old column store
table data can be removed from the vPMEM volumes. Then, upon the next startup, HANA
loads all column store table data into DRAM.
Individual column store tables that are to use the vPMEM volumes can be moved with the
SQL command as shown in Example 3-17.
Example 3-17 Altering a table to move the column store table to PMEM
ALTER TABLE "<schema_name>.<table_name>" PERSISTENT MEMORY ON IMMEDIATE CASCADE;
This immediately moves the table from LPAR DRAM to the vPMEM devices and is set for all
future restarts of HANA.
For individual columns and partitions, currently the only way to place data into vPMEM volumes
is by using the CREATE COLUMN TABLE commands as shown in Example 3-18.
Example 3-18 Specifying column store table columns or partitions to reside in vPMEM
-- create a table where the named column uses persistent memory
CREATE COLUMN TABLE … <column> … PERSISTENT MEMORY ON
-- create a table where the named partition uses persistent memory
CREATE COLUMN TABLE … PARTITION .. PERSISTENT MEMORY ON
Column store table data can also be unloaded from the vPMEM devices and removed from the
vPMEM volumes. Example 3-19 shows the commands that remove the column store table data
from the vPMEM volume and unload the data from all memory.
After the command in Example 3-19 runs, the column store table data is no longer
loaded into any memory area (DRAM or vPMEM). The table data is reloaded either by
future query processing, or by manually executing the SQL command shown in
Example 3-20 on page 47.
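A hedged sketch of the kind of statements involved (the actual commands in Example 3-19 and Example 3-20 may differ; schema and table names are placeholders):
-- remove the table from persistent memory and unload it from all memory
ALTER TABLE "<schema_name>.<table_name>" PERSISTENT MEMORY OFF IMMEDIATE CASCADE;
UNLOAD "<schema_name>.<table_name>";
-- manually reload the table data into DRAM
LOAD "<schema_name>.<table_name>" ALL;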
If the table persistent memory setting has been changed to either OFF or ON, it can be reset
back to the default value of DEFAULT with the SQL command as shown in Example 3-21.
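A minimal sketch of such a statement (placeholder names; the exact command in Example 3-21 may differ):
ALTER TABLE "<schema_name>.<table_name>" PERSISTENT MEMORY DEFAULT IMMEDIATE CASCADE;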
Table 3-1 Software version dependencies for using vPMEM on POWER with HANA
Component Version
https://launchpad.support.sap.com/#/notes/2700084
2618154 - SAP HANA Persistent Memory - Release Information
https://launchpad.support.sap.com/#/notes/2618154
2786237 - Sizing SAP HANA with Persistent Memory
https://launchpad.support.sap.com/#/notes/2786237
2813454 - Recommendations for Persistent Memory Configurations with BW/4HANA
https://launchpad.support.sap.com/#/notes/2813454
Just like with vPMEM persistent memory volumes, HANA can take advantage of tmpfs
volumes to store columnar table data in configured LPAR DRAM volumes. The mechanisms
by which HANA is configured to use tmpfs volumes are identical to vPMEM volume usage.
Like vPMEM volumes, access to the files stored in a tmpfs file system is performed at
DRAM speed. In this regard, accessing data from tmpfs and vPMEM volumes has the
same performance. Also, no special hardware is needed to create tmpfs volumes because
the DRAM memory that is already allocated to the LPAR is used.
Unlike vPMEM, because the memory for tmpfs volumes is allocated from within the
memory assigned to the LPAR, tmpfs volumes do not persist across operating system or LPAR
restarts. Data stored in tmpfs LPAR DRAM disks is volatile, and the contents are erased when
the operating system is shut down or rebooted. Upon restart of the operating system, the
RAM disk volumes must be recreated, and HANA columnar table data is reloaded from the
persistent disk volumes.
Also, unlike vPMEM volumes that are created with the Affinity option, creation of a tmpfs
volume is not automatically aligned to any specific NUMA node. NUMA node memory
alignment details are gathered as a preparatory step and used in creating the tmpfs volumes
at the operating system level.
One benefit that tmpfs file systems have over vPMEM volumes is that tmpfs volumes can be
created to grow dynamically as they fill. This allows the volumes to
accommodate larger than expected data growth. However, this dynamic characteristic has the
side effect of potentially using more LPAR DRAM than expected, which takes
memory away from applications that need it to function. Hence, correctly sizing
the tmpfs volumes is still important. Alternatively, the tmpfs volumes can be created to
consume a fixed amount of LPAR DRAM.
A quick check at the operating system shows the available NUMA nodes and how much
memory is allocated to each node as shown in Example 3-22.
Example 3-22 Determining the amount of RAM allocated to each NUMA node of an LPAR
grep MemTotal /sys/devices/system/node/node*/meminfo
This produces output showing an amount of memory available on each of the NUMA nodes
that the operating system has assigned as shown in Example 3-23.
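A hedged illustration of the form of that output (node numbers and exact values are hypothetical; 536870912 kB corresponds to 512 GB):
/sys/devices/system/node/node0/meminfo:Node 0 MemTotal:  536870912 kB
/sys/devices/system/node/node1/meminfo:Node 1 MemTotal:  536870912 kB
/sys/devices/system/node/node2/meminfo:Node 2 MemTotal:  536870912 kB
/sys/devices/system/node/node3/meminfo:Node 3 MemTotal:  536870912 kB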
In this output (Example 3-23), the system has four NUMA nodes, each installed with roughly
512 GB of system DRAM.
For the configuration in Example 3-23, allocate four different tmpfs devices,
one for each NUMA node. The mount command has an option to assign the memory for the
tmpfs to a named NUMA node. Create the four directories where the tmpfs file
systems will be mounted, then create the file systems by using the mount command options
shown in Example 3-24.
In Example 3-24:
<tmpfs file system name> is the operating system device name. Use any descriptive name
desired.
-t tmpfs is the file system type, in this case tmpfs.
-o mpol=prefer:X is, as mentioned, the NUMA node number to assign the memory for
the tmpfs.
/<directory to mount file system> is the location in the OS file system path where the
tmpfs file system is mounted. This directory is accessible and readable from the OS level.
Verify that this directory exists, just as for any normal mount command.
In Example 3-23 on page 50, the system has four NUMA nodes, so creating four directories
and mounting four different tmpfs file systems can be done as shown in Example 3-25,
substituting the <SID> with the SID of the HANA database.
Example 3-25 Sample script to create, mount, and make available tmpfs volumes for use by HANA
for i in 0 1 2 3; do
mkdir -p /hana/data/<SID>/tmpfs${i}
mount tmpfs_<SID>_${i} -t tmpfs -o mpol=prefer:${i} /hana/data/<SID>/tmpfs${i}
done
chown -R <db admin user>:<db group> /hana/data/<SID>/tmpfs*
Using these options (Example 3-25), the amount of memory allocated to each tmpfs file system is
dynamically sized based on what HANA stores in it. This is the
preferred option because the file system grows as HANA table data is migrated from RAM to the
tmpfs file system.
If there is a need to statically allocate an amount of memory to the tmpfs file system, the
-o size=<size in GB> option allocates a fixed amount of LPAR DRAM to the tmpfs file
system.
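A minimal sketch of a fixed-size mount, following the pattern of Example 3-25 (the size value of 480g is an arbitrary illustrative figure):
mount tmpfs_<SID>_0 -t tmpfs -o mpol=prefer:0,size=480g /hana/data/<SID>/tmpfs0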
Note: The tmpfs volumes are not formatted as an XFS file system as is done for vPMEM
volumes, nor are the volumes mounted with the -o dax option. This is due to the
differences in file system format for tmpfs volumes, and the ability of HANA to differentiate
and support both types of file system formats to persistently store columnar table data.
The following comparison summarizes the differences between vPMEM and tmpfs memory volumes:
Multiple LPARs on a system can be assigned memory volumes: vPMEM - Yes, with the volume DRAM taken from spare system memory; tmpfs - Yes, with the tmpfs volume created within each LPAR's memory.
Partitioned columnar table data is round-robin assigned to the memory volumes: vPMEM - Yes, with Affinity enabled; tmpfs - Yes, when each file system is aligned to a NUMA node in the mount command.
Memory volume is Live Partition Migration capable: vPMEM - Not yet, but coming in a future release; tmpfs - Yes.
Dependency on system firmware, HMC code, or OS release: vPMEM - Yes, as outlined in the previous section; tmpfs - Can be used with any POWER9 supported versions.
Some of the key benefits NVMe devices provide over SAN disk-based storage devices:
Increased queue depth, which provides decreased I/O times
Significantly lower latencies for read and write operations
Higher I/O throughput than traditional disk fabric systems (for example, Fibre Channel)
due to the location of the adapters on a PCIe adapter slot
NVMe adapters are made up of multiple individual FLASH memory modules. This
architecture allows the operating system to access multiple storage modules per NVMe
adapter independently. Figure 3-7 shows a sample output listing the devices on an individual
NVMe adapter.
Figure 3-7 shows there are four modules of 745 GB per module and 3 TB total storage for the
adapter.
Figure 3-8, Figure 3-9, and Figure 3-10 show a few examples.
Figure 3-8 Raid1 storage volumes for SAP HANA data and log using NVMe adapters
Figure 3-9 Raid1 arrays for SAP HANA data and log created from individual NVMe adapters
Figure 3-10 Raid0 and Raid1 arrays for HANA data and log using multiple NVMe storage modules
As FLASH modules are subject to long-term degradation in high write and data change
environments, caution must be taken regarding which data is assigned to FLASH storage volumes.
Preference must be given to data profiles that do not have excessive data modification rates.
For environments that must use FLASH modules with high write and change activity
characteristics, mirroring an NVMe volume with a traditional disk-based storage volume is a
technique to help preserve data integrity in the case of an NVMe FLASH wear failure.
For read performance, testing shows that NVMe adapter volumes are up to 2-4x faster than
disk-based storage solutions, depending on I/O block sizes, as shown in Table 3-3.
Creating a RAID0 volume over multiple NVMe devices increases write throughput by a factor of
nearly 1.7 for block sizes of 64 KB and larger. For block sizes greater than 256 KB, the factor is
nearly 2.
Creating a RAID0 volume across the multiple memory modules of a single NVMe device has no
positive performance effect compared to storage on a single memory module.
3.3.5 Summary
NVMe adapters with their fast flash modules can provide a significant increase in I/O
performance for HANA when compared to storage on traditional SAN storage options, either
SSD or spinning disk. The increase in speed and the lower latencies are provided by the much faster
flash technology in the NVMe adapter versus SSD flash storage, and by accessing the NVMe
storage through the PCIe 3.0 protocol rather than over Fibre Channel protocols.
As stated, large databases take quite some time to load from SAN disk-based storage through
connectivity solutions like Fibre Channel connected to traditional disk-based or SSD-based
volumes. NVMe storage that is hosted in the server itself provides faster access to the HANA
data for loading and writing.
But FLASH modules in general are subject to wear due to data writes and changes. To protect
valuable data, a hybrid of traditional SAN disk-based storage and NVMe storage can be
used to provide faster HANA startup along with protection of the data on disk-based storage.
In Figure 3-11, the NVMe adapter FLASH modules (inside the green box) are all added into a
RAID0 array. On the SAN side, RAID0 arrays of the same size as the NVMe RAID0 array are
created (inside the red box). Each of these four RAID0 volumes is represented by a blue
box on its respective storage platform.
Then, a mirrored RAID1 volume is created by assigning one SAN RAID volume to one NVMe
RAID0 volume, represented by each of the grey boxes.
When creating the RAID1 mirror between the NVMe volumes and the SAN storage volumes,
a preference for the OS to read from the NVMe volumes is set by passing the
--write-mostly option of the mdadm array utility, applying it to the RAID0 device name of the
external SAN volume. Example 3-26 shows the Linux man page excerpt for mdadm.
Example 3-26 mdadm --write-mostly option to favor one RAID device for reading
-W, --write-mostly
subsequent devices listed in a --build, --create, or --add command will be
flagged as 'write-mostly'. This is valid for RAID1 only and means that the
'md' driver will avoid reading from these devices if at all possible. This
can be useful if mirroring over a slow link.
Hence, in the building of the RAID1 device, specify the SAN storage device as
--write-mostly in the mdadm --create command as shown in Example 3-27.
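A minimal sketch of such a command, assuming hypothetical md device names for the two underlying RAID0 arrays:
# /dev/md/nvme0 = RAID0 array over the NVMe modules, /dev/md/san0 = RAID0 array over the SAN volume (names are assumptions)
mdadm --create /dev/md/hana_mirror --level=1 --raid-devices=2 /dev/md/nvme0 --write-mostly /dev/md/san0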
3.4.2 Summary
Mirroring an internally installed NVMe adapter to an external SAN volume of the same size
provides the benefit of a rapid HANA startup by reading data from the NVMe adapter, while the
RAID1 mirroring of the data to an external SAN disk provides data protection.
Optane DCPMM memory is implemented by installing new Optane DCPMM memory cards into
existing system DRAM DIMM slots. This means that real DRAM capacity must be
sacrificed to use the Optane memory solution. In contrast, the vPMEM option for Power
Systems uses standard system DRAM that is already installed in the system. Implementing
vPMEM is as simple as defining the vPMEM memory segment, starting the LPAR, and configuring
HANA to use the persistent memory segments. No hardware downtime needs to be
experienced, unless additional DRAM is required to support the vPMEM solution.
Optane memory capacities are currently provided in 128 GB, 256 GB and 512 GB per
DCPMM module. This is a much higher capacity than currently exists for DIMM memory
modules, which currently have a maximum capacity of 64 GB per DIMM. Rules for using
DCPMM modules are complicated and vary depending on the memory mode used, but with a
maximum of 12 memory modules per socket, using six 64 GB DIMM modules and the
maximum of six 512 GB DCPMM memory modules, the socket maximum memory configuration
is 3.4 TB. This compares to a maximum memory configuration of 1.5 TB when using only
DIMM memory modules. POWER9 systems support up to 4 TB of DIMM system memory per
socket using currently available 128 GB DIMMs. Future DIMM sizes will increase this memory
footprint.
From a memory latency point of view, Optane DCPMM memory modules have a higher read
and write latency when compared to standard DIMM memory technologies. This is attributed
to the technology implemented to provide data persistency in the DCPMM module. Higher
latencies can have an impact on application performance and must be evaluated when
implementing the Optane solution.
On the other hand, POWER9 vPMEM and tmpfs use DIMM-backed RAM and perform
read and write operations at full DIMM throughput capabilities.
Optane has three memory modes in which the DCPMM modules can be used:
Memory Mode: In this mode, the DCPMM memory modules are installed alongside
standard DIMM modules and are used as a regular memory device. One advantage of
using the DCPMM modules in this mode is that greater overall memory capacities can be
achieved than with the standard DIMM modules available for x86 systems. But enabling the
DCPMM modules in this mode puts the regular DIMMs in the system into a caching
function, which makes their capacity invisible to the host operating system. Therefore, only
the capacity of the DCPMM memory can be used by the host operating system, and the
regular DIMM memory capacity is unavailable for operating system and application use.
App Direct Mode: In this mode, the DCPMM memory modules are used as persistent
storage for operating systems and applications that can take advantage of this technology.
The DCPMM memory modules are recognized by the operating system as storage
devices and are used to store copies of persistent disk data, making access to that
data faster after an OS or system restart. The standard DIMMs are used normally as
available RAM to the operating system.
Mixed Mode: This mode is a mixture of Memory and App Direct modes. When using
DCPMM modules in this mode, a portion of the capacity of the module is used as memory
for the host operating system, and the remaining capacity is used for persistent storage.
But, just like in Memory Mode, any DIMM memory is unavailable for use by the host
operating system and is instead converted into a memory cache subsystem.
Currently, Optane persistent memory is not supported for use by HANA in virtualized
environments.
Using tmpfs for persistent memory uses the memory assigned to the LPAR and available to
the LPAR. Therefore, moving an LPAR from one system to another preserves the use of the
tmpfs persistent memory volumes at the destination system.
Currently, vPMEM volumes that are assigned to an LPAR are defined outside the LPAR
configuration. Due to this implementation, LPM operations are not currently supported for
vPMEM-enabled LPARs. Prior to the LPAR migration, the vPMEM device must be
removed from the LPAR. Then, at the destination system, a new vPMEM volume can be
created to support the application. vPMEM LPM is intended to be supported in a future
firmware release.
Related publications
The publications listed in this section are considered particularly suitable for a more detailed
discussion of the topics covered in this paper.
IBM Redbooks
The following IBM Redbooks publications provide additional information about the topic in this
document. Note that some publications referenced in this list might be available in softcopy
only.
SAP HANA on IBM Power Systems: High Availability and Disaster Recovery
Implementation Updates, SG24-8432
IBM Power Systems Virtualization Operation Management for SAP Applications,
REDP-5579
IBM Power Systems Security for SAP Applications, REDP-5578
SAP Landscape Management 3.0 and IBM Power Systems Servers, REDP-5568
You can search for, view, download or order these documents and other Redbooks,
Redpapers, Web Docs, draft and additional materials, at the following website:
ibm.com/redbooks
Online resources
These websites are also relevant as further information sources:
IBM Power Systems rapid cold start for SAP HANA
https://www.ibm.com/downloads/cas/WQDZWBYJ
SAP Support Portal
https://support.sap.com/en/index.html
Software Logistics Tools
https://support.sap.com/en/tools/software-logistics-tools.html
Guide Finder for SAP NetWeaver and ABAP Platform
https://help.sap.com/viewer/nwguidefinder
Welcome to the SAP Help Portal
https://help.sap.com