SAP HANA’s Defining Capabilities

Table of Contents

SAP HANA UNIQUE CAPABILITIES


Overview
The SAP HANA Difference
Architecture
Summary

The information in this document is confidential and proprietary to SAP and may not be disclosed without the permission of SAP. Except for your obligation to protect
confidential information, this document is not subject to your license agreement or any other service or subscription agreement with SAP. SAP has no obligation to
pursue any course of business outlined in this document or any related document, or to develop or release any functionality mentioned therein. This document, or any
related document and SAP’s strategy and possible future developments, products and/or platforms directions and functionality are all subject to change and may be
changed by SAP at any time for any reason without notice. The information in this document is not a commitment, promise or legal obligation to deliver any material, code
or functionality. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability,
fitness for a particular purpose, or non-infringement. This document is for informational purposes and may not be incorporated into a contract. SAP assumes no
responsibility for errors or omissions in this document, except if such damages were caused by SAP’s intentional or gross negligence.

All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned
not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.

© 2013 SAP AG or an SAP affiliate company. All rights reserved.
Overview

SAP HANA, with its new vision and approach, can be met with resistance and disbelief, as with many things novel and paradigm-shifting. Change is not always easy, and many who are comfortable with existing approaches try to compare SAP HANA with, and to see it in the light of, those older paradigms. This, however, cannot successfully be done, given that SAP HANA represents a completely new way to address traditional computing issues. There is a SAP HANA difference, and this paper will focus on some of the unique capabilities of SAP HANA’s revolutionary approach, especially the technological underpinnings, and how these translate into a very different way of assessing value for those who deploy it. Some of the unique SAP HANA capabilities that will be explored here include:

• SAP HANA’s Data Locality Aware Processing. SAP HANA’s CPU-aware optimizer leverages knowledge of the data’s locality to the CPUs and their caches to reduce the latency of data compute. SAP HANA ensures that data resides in DRAM and the CPU caches to avoid cache misses, and applies knowledge of all the bandwidths and latencies between every cache, every CPU, and every node in its execution plans.
• SAP HANA can run real-time OLTP and OLAP on a single copy of the data. There is no need for separate OLTP and OLAP copies. Data is stored only once for both transactional and analytical applications.
• Massively Parallel Processing (MPP) on a shared-nothing architecture, along with support for Single Instruction, Multiple Data (SIMD).
• Even though the SAP HANA true in-memory platform is a paradigm-shifting new approach, nothing has changed with regard to ACID compliance or Reliability, Full High Availability, Disaster Recovery, and Supportability. For all its cutting-edge innovations, it is still ACID compliant and it supports Disaster Recovery (DR) via synchronous or asynchronous native replication.
• Unique on-the-fly schema extension capability allows for flexible business model changes.
• Dynamic Data Tiering optimizes the balance between data processing and data storage. Data identified as frequently used is kept in memory. Any data not recently accessed is purged from memory but remains persisted on disk, so no write-back is needed.
• It is more than just a database. SAP HANA converges platform, database, and data processing capabilities, handles spatial and textual data analysis, and provides libraries for predictive, planning, and business analytics, all in a single platform with a built-in application and web server.
• SAP HANA provides the most comprehensive data provisioning for any data from any source.
• It is open and agnostic, meaning any application and any data. SAP HANA runs on commodity x86-based hardware from SAP partners. It supports a wide variety of languages, certified 3rd party tools, and custom apps created in almost any language. It is data agnostic, with support for structured, unstructured (i.e. text), spatial, document, and sparse data.
• SAP HANA has extreme linear scalability, scaling up and scaling out: adding more nodes to handle more data keeps response time relatively unchanged.

Let’s examine these unique and game-changing capabilities in more detail.

The SAP HANA Difference

SAP HANA is more than just a database, but at its heart there is a columnar database with a data processing engine at its core, and one that is constantly evolving. Two years ago, when it first came out, SAP HANA was capable of scanning data at 2.0 billion integers per second per core, already an impressive rate. Today, this has increased to 3.2 billion integers per second per core. This continued drive for improvement is important because all data in SAP HANA is represented as integers, which means all data can be scanned at this incredible speed. Scanning, while a good speed test, does not indicate actual processing speed. SAP HANA is also capable of handling 12.5 million aggregations per second per core. A standard SAP HANA node (i.e. a single server) has 40 cores, meaning that a single SAP HANA node can scan data at 128 billion integers per second and aggregate data at 158 million integers per second. Where this gets really interesting is realizing that SAP HANA can ingest data at 1.5 million records (i.e. rows of data) per second per node, all the while maintaining and supporting all the ACID properties:

• Atomicity: Requires that each transaction is “all or nothing”. If one part of the transaction fails, the entire transaction fails, and the database state is left unchanged.
• Consistency: Ensures that any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to all defined rules, including but not limited to constraints, cascades, triggers, and any combination thereof.
• Isolation: Ensures that the concurrent execution of transactions results in a system state that would be obtained if transactions were executed serially.
• Durability: Means that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors.

So while it can be said that SAP HANA is a database capable of running any application that a relational database can run, that is just the beginning. Before you read any further, please look at the ‘Durability’ point in the ACID list just above and recognize that SAP HANA is ACID compliant and does not lose data in the event of a power loss (or any other type of failure). Just like any other ACID compliant database, SAP HANA writes any changes to its data to non-volatile memory before acknowledging success. SAP HANA has ‘Durability’ and it is fully recoverable.

SAP HANA is more than just a database. The danger of viewing SAP HANA simply as a database is that it associates SAP HANA with the complexity of most databases. Most databases need tuning with indexes, specially purchased and task-configured hardware, and the creation of aggregate tables to maintain the speed of transactional systems or to “pre-answer” queries in analytic ones. But SAP HANA has none of these drawbacks. The speed of SAP HANA, while game-changing and beneficial (the cost of FTEs staring at hourglasses adds up), allows it to run without indexes, aggregates, or specially configured hardware and software. You don’t need to create indexes to make SAP HANA achieve the astounding read speeds noted earlier in
the paper. You don’t need to create materialized aggregates to make the system achieve acceptable performance. You don’t need to configure your hardware and database based on whether the application will be used for OLTP versus OLAP (a.k.a. analytics). Just load data into SAP HANA (or create virtualized tables to access data in remote source systems locally) and it will run faster than any other database, period. This also means that it costs less to run, but more on that later.

One of the benefits of SAP HANA’s speed is that it is schema agnostic. What this means is that your OLTP schema doesn’t need denormalization in order to perform. This reduces development time and makes the schema self-describing and thus supportive of agile methodologies. Schema agnostic in SAP HANA also means that you can do full analytics on this same schema. Let this sink in. You no longer need to ETL data into an operational data store in order to off-load processing from the OLTP system. That is half the hardware costs and requires no administrators to monitor and maintain a second system. Additionally, analytics data is now 100% real-time, with no batch. Need to know what orders are open? Just ask.

[Figure: Traditional: Separated OLTP + OLAP compared with OLTP + OLAP in SAP HANA. Labels in the figure: Server 1 through Server 6, Shared Storage, High Availability, Cold Standby Server, 24 HR Old Data, Current Data, 6 Hours, Immediate.]

Note that other vendors entering the market are making similar-sounding claims, but with dramatically fewer benefits. They are putting very fast columnar tables alongside their OLTP row tables. This is good. A second server for operational reporting is no longer needed. That said, the rest of the benefits SAP HANA provides by being schema agnostic are lost in these competing systems. The OLTP data still needs to be migrated to the read-optimized columnar tables, thus losing the real-time benefit. Additional storage is needed as the data is duplicated. Administrative costs are higher as you need to monitor and administer the change data capture process on the row tables and the delta data activation process on the columnar store. How does SAP HANA avoid all this and provide superior benefits? Simple. SAP HANA’s table structure has the OLTP speed of row tables and the read speed of columnar tables. There is no cut-over needed, no row table to
columnar table conversion needed, no batch building for columnar compression, no nothing. The same table supports OLTP and OLAP analytics at the same time, with full ACID properties. This means real real-time and real value.

Another advantage SAP HANA has over traditional databases is openness. SAP HANA is open and agnostic. It supports ANSI compliant SQL, ODBC and JDBC protocols, and a wide range of certified 3rd party query tools, ETL tools, back-up and recovery tools, monitoring tools, and even applications. Yes, even ones that compete directly with SAP applications. SAP HANA supports custom applications built on SAP HANA in any language including PHP, Ruby, Python, Java, C, and of course, ABAP. And SAP HANA also supports MDX over ODBO and XMLA over HTTP if you want to access SAP HANA as a multi-dimensional database.

SAP HANA is data agnostic as well, not only supporting tables, but unstructured data (i.e. text), spatial data, document data, and sparse data as well. And SAP is getting ready to release support for property graph data in SAP HANA as well. With unstructured data, SAP HANA supports ingestion of a huge number of document types including all the usual suspects, language identification for 31 languages, entity extraction for three languages, and sentiment analysis for the top five spoken languages. For spatial data, SAP HANA is not only screaming fast but also ships with a geo-coding service, polygons for all major political and geographic areas, and an industry compliant integration layer supporting ESRI, NAVTEQ, and Google. As for document data, SAP HANA supports complete transactions using JSON or XML. For sparse data with varying attributes, SAP HANA supports flexible tables in which columns can be dynamically added on Insert or Update. All of this is vitally important as non-structured data types explode.

[Figure: SAP HANA as an open platform. Browser / mobile clients (web JS library, data visualization library) and third-party and custom applications (e.g. Java, PHP, .NET) connect over HTTP(S), OData/JSON, XML/A, ODBC, JDBC, ADSC, ODBO, MDX, and SQL to the web app server (app services, any HTML5/JS library), SQLScript, and DB services of SAP HANA. Any data: text data, geospatial data, smart meter, mobile, point of sale, RFID, machine data, connected vehicles, structured data, click-stream, social network, customer data. Callouts in the figure:]

• Easily migrate your applications in almost any language: PHP, Ruby, Java, C, and the list goes on.
• Support for ANSI SQL, ODBC, JDBC, OData/JSON, and certified 3rd party tools.
• Support for more standards: JSON and XMLA over HTTP, so it is truly a multi-dimensional platform.
• Build new web applications with any open source HTML5/JS libraries and server-side JavaScript.
• Open Cloud Partner Program: allows you to select the best SAP HANA cloud option from partners.
• Support advanced text analytics: analyze text in all columns of a table and text inside binary files with advanced text analytic capabilities such as automatic detection of 31 languages and fuzzy, linguistic, and synonym search, using SQL. Sentiment fact extraction, geospatial processing, and fuzzy text search can be embedded in the same SQL:

    CREATE FULLTEXT INDEX i1 ON PSA_TRANSACTION(
        AMOUNT, TRAN_DATE, POST_DATE, DESCRIPTION,
        CATEGORY_TEXT ) FUZZY SEARCH INDEX ON SYNC;

    SELECT SCORE() AS SCR, * FROM
        "SYSTEM"."PSA_TRANSACTION" WHERE CONTAINS (*,
        'Sarvice', fuzzy) ORDER BY SCR DESC;

• Analyze streaming data from integrated ESP in combination with data in SAP HANA.
• Process geospatial data.

There are several new specialized databases built specifically for these data types that have allowed applications to address the 80% of all data that isn’t structured (i.e. not relational). These purpose-built databases have provided huge innovation leaps. But using them is still a separate database approach with all its inherent issues: the need for separate servers, data to be copied between the systems, and applications having to merge the data coming from the different systems via in-app code. All a lot of work and effort. This may be fine if you are a specialized web start-up, but definitely not if you are an enterprise. Additionally, these complex infrastructures reduce agility. By being in the single SAP HANA data platform, developers can easily take advantage of data of any kind to build next generation applications. Data of heterogeneous data types can be easily combined, processed, and leveraged.

SAP HANA also brings the analytics code closer to the data. In and of itself, this isn’t new or revolutionary. Stored procedures have been around forever. The unique aspect is the offering and provision
of pre-built libraries and the speed with which SAP HANA can run the analytics. There is a Predictive Analysis Library (PAL) which supports 27 data preparation and data mining functions including clustering and classification algorithms, regression methods, decision trees, and forecasting methods. These are all natively compiled and parallelized as part of the query execution plan determined by the SAP HANA optimizer, providing blinding speed. There is also a Business Function Library (BFL) that contains 55 data-intensive functions that would slow down traditional three-tier applications given their need for data movement from the database tier to the app tier: things like currency conversion, depreciation, and moving median calculations. And new libraries are being added to SAP HANA for data quality and data transformations. There is also a high speed integration to “R” that lets you utilize the over 3500 algorithms that have been built on this open source analytics platform to leverage the real-time capability.

[Figure: The SAP HANA platform. Any apps on any app server, and SAP Business Suite and BW on the ABAP app server, support any device and connect through open connectivity (SQL, MDX, R, JSON). The HANA platform exposes SQL, SQLScript, and JavaScript over spatial, search, text mining, stored procedure and data models, application and UI services, Business Function Library, Predictive Analysis Library, database services, planning engine, and rules engine. Integration services bring in transaction, unstructured, machine, Hadoop, real-time, location, and other apps data.]

Individually these innovations are great, but what is really amazing is that you can use them together. SAP HANA makes this not only possible, but even easy. Want to show a billion pieces of data on a map showing visualizations based on clustering analysis of that data? How about a financial system that takes every transaction for the quarter up until this last second, does a synthetic close on the books, and then forecasts where the financials will be at quarter end?

So SAP HANA is a fully ACID compliant database that supports real-time OLTP and OLAP (analytics) without duplicate copies of the data. It supports every major data type along with text analytics and geo-spatial capabilities. It supports all major industry standard query languages, and comes with highly optimized industry functions and algorithms to build best-in-class performance (think 1000s of times faster). And it does this without indexes or aggregates. When these are combined in totality, you have a system that uses less hardware, requires less maintenance, and saves you money while providing game-changing capabilities and agility.
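The per-node throughput quoted at the start of this section is a straightforward multiple of the per-core rate. The following Python sketch is a back-of-the-envelope check using only numbers stated in the paper (3.2 billion integers per second per core, 40 cores per node, 1.5 million records ingested per second per node). It is not an SAP sizing tool, it assumes perfect parallelism across cores, and all function and constant names are illustrative.

```python
# Back-of-the-envelope check of figures quoted in this paper.
# Names are illustrative; the input numbers come from the text.

CORES_PER_NODE = 40                    # "a standard SAP HANA node has 40 cores"
SCAN_PER_CORE_LAUNCH = 2_000_000_000   # integers/second/core at first release
SCAN_PER_CORE_TODAY = 3_200_000_000    # integers/second/core at time of writing
INGEST_PER_NODE = 1_500_000            # records/second/node


def per_node(rate_per_core: int, cores: int = CORES_PER_NODE) -> int:
    """Scale a per-core rate to a whole node, assuming perfect parallelism."""
    return rate_per_core * cores


def improvement_percent(old: int, new: int) -> float:
    """Relative improvement from old to new, in percent."""
    return (new - old) * 100 / old


def seconds_to_ingest(records: int, rate: int = INGEST_PER_NODE) -> float:
    """Time for one node to ingest the given number of records."""
    return records / rate


if __name__ == "__main__":
    print(per_node(SCAN_PER_CORE_TODAY))   # 128000000000
    print(improvement_percent(SCAN_PER_CORE_LAUNCH, SCAN_PER_CORE_TODAY))
    print(seconds_to_ingest(1_000_000_000))
```

The first result matches the 128 billion integers per second per node stated in the text; the other two values simply restate the paper's numbers as a growth rate and as the time one node would need to ingest a billion rows.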

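The durability point made earlier in this section (changes are written to non-volatile storage before success is acknowledged) can be illustrated with a toy key-value store. This is a minimal sketch of the general log-before-acknowledge idea, not a description of SAP HANA's actual logging mechanism, which this paper does not detail. The class name `DurableStore`, the JSON log format, and the file name are all assumptions made for illustration, and torn or partial log lines are not handled.

```python
import json
import os
import tempfile


class DurableStore:
    """Toy in-memory key-value store that logs each change before acknowledging it."""

    def __init__(self, log_path: str) -> None:
        self.log_path = log_path
        self.data = {}
        self._replay()  # rebuild the in-memory state after a restart
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self) -> None:
        """Re-apply every logged change in order."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                key, value = json.loads(line)
                self.data[key] = value

    def put(self, key: str, value) -> bool:
        self._log.write(json.dumps([key, value]) + "\n")
        self._log.flush()               # push Python's buffer to the OS
        os.fsync(self._log.fileno())    # ask the OS to write to stable storage
        self.data[key] = value          # update memory only after the log is safe
        return True                     # only now acknowledge success

    def close(self) -> None:
        self._log.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "redo.log")
        store = DurableStore(path)
        store.put("order-1", "open")
        store.close()                   # simulate a shutdown; memory is gone
        recovered = DurableStore(path)  # a fresh instance replays the log
        print(recovered.data)           # {'order-1': 'open'}
        recovered.close()
```

The ordering inside `put` is the point of the example: the caller is told the write succeeded only after the change has been forced to disk, so an in-memory system can lose its RAM contents and still recover every acknowledged change.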
Architecture

SAP HANA’s speed is often referred to as being “in-memory”, where all data can be placed in the computer’s RAM and thus avoid slow disk reads. This may be true, but it overshadows other huge innovations developed by the HANA team in partnership with Intel. These innovations deliver orders of magnitude faster performance than just putting data in a big cache.

SAP HANA’s version of in-memory works by:
1. Having a SAP HANA optimizer that leverages knowledge of the data’s locality to the CPUs and their caches. It leverages this information along with knowledge of all the bandwidths and latencies between every cache, every core, every CPU, and every node to formulate the most performant execution plans.
2. Aligning all data in-memory with the processor’s cache-line size (the amount of data always fetched by the processor’s memory controller). This allows data to be pipelined from RAM, through the caches, and into a core’s registers.
3. Aligning data in-memory in vectors (columns), allowing data to be compressed highly efficiently by data type. Therefore SAP HANA is able to pipeline much larger amounts of data to the core’s registers.
4. Acting on compressed data whenever possible, including scans, aggregates, and joins.

When processing data, SAP HANA leverages Intel’s SSE vector processing technology (i.e. the SIMD instruction set) to process multiple pieces of data in a single instruction cycle. Since SAP HANA uses variable length integer encoding, this means that for lower cardinality columns, it can process 10s of integers per register per instruction cycle.

[Figure: Single Node SIMD Operation. A 128-bit source register (bits 127 to 0) holding X4, X3, X2, X1 and a destination register holding Y4, Y3, Y2, Y1 are combined by a single SSE2 operation, producing X4 op Y4, X3 op Y3, X2 op Y2, X1 op Y1 in clock cycle 1.]

Now that we’ve established where SAP HANA’s speed comes from, we can address perceived concerns around out-of-memory conditions, scalability, and data center readiness. SAP HANA is resilient. When available memory is filling up, SAP HANA purges from memory any column partition, column, or table that hasn’t been accessed recently. It is not lost. This is efficient, as the data is already persisted on disk and no write-back is needed. BLOBs really don’t need to be in-memory once their entities have been extracted, so these can be set to reside only on disk, saving memory. In most instances SAP recommends leaving all data in memory, but you don’t have to, and less frequently accessed data can be stored in systems such as Hadoop and SAP Sybase IQ.

Further, SAP HANA doesn’t need that much memory to process very large amounts of data. Most DBAs quote their databases in total size, including indexes and aggregates. So a 1 TB database typically has only approximately 500 GB of raw data. SAP HANA achieves compression of OLTP data in the 3-6X range and data warehousing data in the 10-20X range. To be conservative, let’s assume 5X compression. This makes the 500 GB of raw data 100 GB. Now, SAP HANA doesn’t only store data in-memory, it also processes all its interim datasets in-memory, so we recommend a SAP HANA instance memory of 2X the compressed raw data, or 200 GB in our example. This means that you can get all the benefits SAP HANA offers for a 1 TB existing database in a 200 GB SAP HANA node. Said another way, a standard 1 TB SAP HANA node is capable of handling a 5 TB database on migration.

SAP HANA is also fully elastically scalable, allowing you to add nodes to accommodate more memory and data. There are certified configurations up to 56 nodes and 56 TB of memory, and SAP has built a 100 node, 100 TB system on which a 1 PB benchmark (20x compression) was run. Conceive of needing even more than 56 nodes? Just ask and SAP will certify more. What is truly incredible is the linearity SAP HANA exhibited during scaling tests
on the 100 node system noted above. As the scalability graph shows, SAP HANA’s response time was essentially flat as data and nodes were doubled. Simply stated, scaling SAP HANA just takes math. Want to double the speed of your queries? Double the nodes. Want to double your users and maintain response times on the same data size? Double the nodes. This is because SAP HANA is able to fully parallelize any query and fully saturate its CPU cores. This means that SAP HANA is fully utilizing all available resources while wasting none. It means that SAP HANA is always going to solve your query as fast as is possible. If you have large concurrency on SAP HANA, its execution plans will use fewer cores and less memory to allow all users to be serviced. And a planned innovation will make SAP HANA even faster. As noted by Hasso Plattner at this past year’s Sapphire conference in Orlando, FL, SAP has developed a new object-aware cache that is capable of handling delta changes. This means that similar queries should get millisecond response times even under load once this innovation is released.

[Figure: Extreme Linear Scalability. Query processing time (in seconds) for Query 1, Query 2, and Query 3 at 16 nodes (100 billion rows), 51 nodes (650 billion rows), and 95 nodes (1,200 billion rows). Bar labels, in extraction order: 3.816, 3.249, 3.102, 0.7, 0.502, 0.491, 0.425, 0.266, 0.142. Sales and Distribution Reports. Query 1: Single Customer and Material for One Month. Query 2: Range of Customers and Materials for Six Months. Query 3: Year-Over-Year Trending Report for Top 100 Customers for Five Years.]

SAP HANA offers reliability, full high availability, disaster recovery (DR), and supportability. So from a data center readiness perspective, SAP HANA is complete. SAP HANA fully supports fail-over nodes so that if a node fails for any reason a backup node will kick in. Since SAP HANA is in-memory, performance will be slowed to disk-database speeds until all data is moved from disk to memory. Note that the fail-over node is available immediately and data will load as queries require it. SAP HANA also supports DR via either synchronous or asynchronous native replication. This allows secondary SAP HANA systems to be located continents away, providing protection from any natural or political disasters. There are many options for DR that can allow for near-zero downtime upgrades or use of the secondary system for test and dev. Our partners are also offering log shipping as another DR option and efficient use of hardware.

As noted earlier, SAP HANA is open and agnostic, and that goes for hardware as well. SAP HANA runs on “commodity” components assembled by our hardware partners to the minimum specifications given to them by SAP. These specifications allow very different architectures by each partner, providing our customers both choice and a competitive market. We recently announced an initiative allowing customers to use their own storage systems as long as they meet SAP HANA’s specifications. And SAP HANA can also be run virtualized to support testing and dev. All these innovations are designed to keep hardware costs down. Lower cost does not mean cheap or chintzy. SAP HANA hardware is not inexpensive, as standard components are enterprise class and the servers do have lots of memory. But, using compression, hardware and storage can be conservatively sized. And you do get what you pay for. As an instance can handle mixed workloads of OLTP and operational reporting, one less server is required. And as SAP HANA doesn’t require indexes, aggregates, or duplicate data in row and columnar structures, customers report lower operating costs. So you get good value and good TCO. Take SAP’s own experience in migrating its CRM system. SAP saw a 30% reduction in hardware costs due to no longer needing a separate system for operational reporting.
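The memory sizing rule of thumb described earlier in this section can be written out as a small calculation. The Python sketch below follows the paper's own arithmetic: raw data is roughly half of the quoted database size, compression is a conservative 5X, and instance memory is 2X the compressed raw data. It is an illustration of that reasoning only, not an official sizing method, and the function name and parameter names are assumptions.

```python
def hana_memory_estimate_gb(
    database_size_gb: float,
    raw_fraction: float = 0.5,    # share of the quoted size that is raw data
    compression: float = 5.0,     # conservative compression factor from the text
    working_factor: float = 2.0,  # memory recommended = 2X compressed raw data
) -> float:
    """Estimate SAP HANA instance memory (GB) for an existing database size (GB)."""
    raw_gb = database_size_gb * raw_fraction
    compressed_gb = raw_gb / compression
    return compressed_gb * working_factor


if __name__ == "__main__":
    # A 1 TB existing database (taken as 1000 GB) fits in about 200 GB.
    print(hana_memory_estimate_gb(1000))   # 200.0
    # A 5 TB existing database needs about one standard 1 TB node.
    print(hana_memory_estimate_gb(5000))   # 1000.0
    # Reading the 1 PB benchmark as raw data at 20X compression gives 100 TB.
    print(hana_memory_estimate_gb(1_000_000, raw_fraction=1.0, compression=20.0))
```

With the defaults, memory comes out at one fifth of the quoted database size, which is why the text can say that a 1 TB node handles a 5 TB database on migration. The last call assumes the 1 PB benchmark figure refers to raw data; under that reading the result equals the 100 TB of memory in the 100 node system.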

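The statement that low-cardinality columns allow tens of integers per register per instruction can also be checked with simple arithmetic. The sketch below assumes that each distinct column value is replaced by an integer code stored at the minimum fixed bit width, and that codes are packed into a 128-bit SSE register. That packing scheme is an assumption made for illustration; the paper only says that SAP HANA uses variable length integer encoding. The `lanewise` helper mimics the "X op Y" picture of a single SIMD operation in plain Python and executes no real vector instructions.

```python
import operator
from typing import Callable, List

REGISTER_BITS = 128  # width of an SSE register, as in the SIMD figure


def bits_per_value(cardinality: int) -> int:
    """Minimum bits needed to give each distinct value its own integer code."""
    if cardinality < 1:
        raise ValueError("cardinality must be at least 1")
    return max(1, (cardinality - 1).bit_length())


def values_per_register(cardinality: int, register_bits: int = REGISTER_BITS) -> int:
    """How many packed codes fit into one register."""
    return register_bits // bits_per_value(cardinality)


def lanewise(op: Callable[[int, int], int], xs: List[int], ys: List[int]) -> List[int]:
    """Apply one operation to every pair of lanes, as one SIMD instruction would."""
    if len(xs) != len(ys):
        raise ValueError("both registers must have the same number of lanes")
    return [op(x, y) for x, y in zip(xs, ys)]


if __name__ == "__main__":
    for cardinality in (16, 256, 1000, 2**32):
        print(cardinality, values_per_register(cardinality))
    # Four 32-bit lanes combined in one conceptual step.
    print(lanewise(operator.add, [1, 2, 3, 4], [10, 20, 30, 40]))  # [11, 22, 33, 44]
```

Under these assumptions a column with 16 distinct values needs 4 bits per code, so 32 codes fit in one register, while a full 32-bit integer column fits only 4, which is the four-lane case drawn in the figure.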
Summary

SAP HANA represents such a huge shift in data management, application development, and analytics that it is a paradigm shift. And like most things new and earth-shattering, it is often hard to believe or accept at face value, especially when looked at through the lenses of existing paradigms. SAP HANA takes the most important innovations in hardware architectures, columnar and row databases, text analytics, parallel processing, and much more and blends them together in an elegantly designed software platform and appliance. SAP HANA is convergence. SAP HANA represents a huge shift for enterprise computing. SAP HANA is the first technology in decades that will simultaneously allow landscape simplification (i.e. fewer servers) and rapid innovation. Said another way, SAP HANA is the next generation data platform.

www.sap.com/contactsap

13/08
© 2013 SAP AG or an SAP affiliate company. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice.

Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. National product specifications may vary.

These materials are provided by SAP AG and its affiliated companies (“SAP Group”) for informational purposes only, without representation or warranty of any kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.

Please see http://www.sap.com/corporate-en/legal/copyright/index.epx#trademark for additional trademark information and notices.
