Introduction To HANA - Deep Dive
Features Needed in Today's Databases
➢ Ability to connect to different data sources and report on the same data with
no data redundancy
History of SAP HANA
• SAP HANA is the synthesis of three separate products – TREX search engine,
P*Time in-memory OLTP database, and MaxDB in-memory liveCache engine.
• In-memory attributes were added to TREX in 2002 and a columnar data store was added
in 2003, both as ways to enhance performance.
• In 2005 SAP acquired Menlo Park-based Transact In Memory, Inc. With the acquisition
came P*Time, an in-memory, lightweight online transaction processing (OLTP) RDBMS
technology with a row-based data store.
• MaxDB (formerly SAP DB), a relational database, came from (first) Nixdorf, (second)
Software AG (where it was named Adabas D), and then (third) SAP. It was added to TREX and
P*Time to provide persistence and more traditional database features, like backup.
• In 2008, teams working from Hasso Plattner Institute and Stanford University developed
this “New Database” as it was called.
• First shipment was in November 2010, support for BW became available in November 2011,
and support for ERP became available in May 2013.
What is HANA - Is it a Database, an ERP, or Software?
• HANA is a solution for in-memory computing. The acronym HANA stands for “High Performance
Analytic Appliance”.
• SAP HANA is a flexible data source-agnostic appliance that enables customers to analyze
large volumes of data in real time.
• HANA DB takes advantage of the low cost of main memory (RAM), data processing
abilities of multi-core processors, and the fast data access of solid-state drives relative to
traditional hard drives to deliver better performance of analytical and transactional
applications.
• The following are a few reasons to choose SAP HANA:
• Real Time – SAP HANA provides real-time data provisioning and real-time reporting.
• Speed – SAP HANA provides high-speed processing of massive data volumes thanks to
in-memory technology.
• Any Data/Source – SAP HANA can access various data sources, including structured and
unstructured data from SAP or non-SAP sources.
• Cloud – The SAP HANA database and applications can be deployed to a cloud environment.
• Simplicity – SAP HANA reduces the effort behind ETL processes, data aggregation, indexing,
and mapping.
• Cost – SAP claims that SAP HANA software can reduce a company's total IT cost.
• Choice – SAP HANA is supported by different hardware vendors and software providers,
so the user can choose the best option based on requirements.
Advantages of SAP HANA
• Mobile experience – SAP Fiori can be used, and a good number of apps are
available
HANA Live
Spatial tables on HANA and its uses
SAP HANA Platform
Hardware Architecture trends
•SAP HANA is the foundation for SAP
S/4HANA and provides many of its critical
services, so it is worth taking the time to
learn a little about it.
•Historically, the high cost of memory meant that only small amounts were available. This
caused a serious bottleneck in the flow of data from disk to CPU (see the figure), with the
CPU waiting idle for data to arrive through the tiny gateway.
•Now with memory prices falling, we have access to huge amounts. SAP HANA runs on
hardware with many terabytes of memory. In fact, with so much memory available, the
entire database of even a large organization can be stored completely inside memory, so
there is instant access to all data and wait times are eliminated. Memory is no longer the
bottleneck it once was.
•In addition to huge memory, the processors continue to improve at a phenomenal rate. We
have high-speed, multi-core processors that can take on complex tasks and process them
in parallel. This means even the most complex analytical tasks, such as
predictive analysis, can be carried out in real time.
•SAP could have kept the same business applications produced 20 years ago, along with
the traditional databases that supported them, and installed all of these on the new
hardware. There would be some gains, but traditional databases and applications were
designed around old, restricted hardware architecture. This means they would not be able
to fully exploit the power of the new hardware.
•Put simply, the business software needed to catch up with advances in hardware
technology, so a complete rewrite of the business suite was required.
Column Store and Row Store
SAP HANA Database Properties
•The SAP HANA database is fully in-memory, so it is very fast.
•The downside to column store is the cost of reconstructing complete records from
the columns if all data is required by the application. This is the case when the
application is transactional and all fields are needed for an update, insert, and delete.
Additionally, for write-intensive applications, column store tables are not optimal
compared to row store tables.
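The trade-off described above can be sketched with a toy model. This is only an illustration in plain Python, not how SAP HANA stores data internally; the table contents and the function names are hypothetical.

```python
# Toy model: the same three records held row-wise and column-wise.
rows = [
    {"order_id": 1, "customer": "ACME", "amount": 120.0},
    {"order_id": 2, "customer": "Globex", "amount": 75.5},
    {"order_id": 3, "customer": "ACME", "amount": 310.0},
]

# Row store: each record is kept together.
row_store = rows

# Column store: one list per column.
column_store = {name: [r[name] for r in rows] for name in rows[0]}


def total_amount(store):
    """Analytical query on the column store: touches a single column."""
    return sum(store["amount"])


def fetch_record_from_columns(store, position):
    """Transactional read on the column store: every column must be
    visited to reconstruct one complete record."""
    return {name: values[position] for name, values in store.items()}


def fetch_record_from_rows(store, position):
    """Transactional read on the row store: one lookup returns the record."""
    return store[position]


print(total_amount(column_store))
print(fetch_record_from_columns(column_store, 1))
print(fetch_record_from_rows(row_store, 1))
```

The aggregate reads one list, while rebuilding a full record from the column layout needs one access per column, which is the reconstruction cost mentioned above.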
Memory is now huge and relatively affordable. It is technically possible to store an entire
enterprise database in memory, especially if you implement multi-terabyte memory.
However, for most organizations, most of the data that they own is not frequently used,
so they really do not need to implement such huge memory sizes. Only recent data is
frequently used.
This may well be only 5-10% of the entire company's data, which is called hot data. The
rest of the data, which makes up 90-95%, is called warm data.
With SAP HANA, hot data is stored in memory, and warm data is stored on disk.
Whenever older data is needed by an application, it is loaded from disk to memory and the
application reads the data from memory. This data may not be needed again for a long
time, so it is displaced from memory at the moment when the memory is full and other,
more recent data, replaces it. The older data then goes back to disk until it is needed again.
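The load-and-displace cycle described above can be sketched as a small cache in front of a disk store. This is a minimal sketch that assumes a least-recently-used displacement rule; the source only says that older data is displaced when memory is full, so the exact policy, the class name `MemoryTier`, and the sample keys are assumptions.

```python
from collections import OrderedDict


class MemoryTier:
    """Hypothetical fixed-capacity memory tier in front of a 'disk' dict."""

    def __init__(self, disk, capacity):
        self.disk = disk             # warm data: everything lives here
        self.capacity = capacity     # maximum number of items held in memory
        self.memory = OrderedDict()  # hot data, least recently used first
        self.disk_reads = 0

    def read(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)     # mark as recently used
            return self.memory[key]
        value = self.disk[key]               # load from disk on demand
        self.disk_reads += 1
        if len(self.memory) >= self.capacity:
            self.memory.popitem(last=False)  # displace least recently used
        self.memory[key] = value
        return value


disk = {"2021": "orders 2021", "2023": "orders 2023", "2024": "orders 2024"}
tier = MemoryTier(disk, capacity=2)
for key in ["2023", "2024", "2024", "2021"]:
    tier.read(key)

print(list(tier.memory))  # keys currently held in memory
print(tier.disk_reads)    # number of loads from disk
```

After the four reads, the older "2023" entry has been displaced to make room for "2021", and the repeated read of "2024" was served from memory without touching the disk.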
For row store tables, loading and displacement happen at the row level.
This means all columns in the row, whether they are needed or not, are
loaded to memory. For analytical applications that require only a few columns,
this is inefficient, as it involves moving all columns to memory, even those
not used.
For column store tables, loading and displacement happen at the column
and partition level. This means that only the required columns, and even
better, only the required partitions in the columns, are loaded to memory.
This is very efficient for analytical applications, which often only ask for
small portions of data.
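To make the difference in loading granularity concrete, the sketch below counts how many values are moved in each case. The table, its partitioning by year, and the function names are hypothetical and only serve the illustration.

```python
# Hypothetical table with four columns, partitioned by year.
table = {
    2023: {"order_id": [1, 2], "customer": ["A", "B"],
           "amount": [10, 20], "note": ["x", "y"]},
    2024: {"order_id": [3, 4], "customer": ["A", "C"],
           "amount": [30, 40], "note": ["z", "w"]},
}


def load_row_wise(table, year):
    """Row-level loading: every column of every row in the partition moves."""
    part = table[year]
    n_rows = len(part["order_id"])
    return [{col: part[col][i] for col in part} for i in range(n_rows)]


def load_column_wise(table, year, columns):
    """Column/partition-level loading: only the requested columns of the
    requested partition move."""
    return {col: table[year][col] for col in columns}


def values_moved(loaded):
    """Count the individual values contained in a loaded result."""
    if isinstance(loaded, dict):
        return sum(len(values) for values in loaded.values())
    return sum(len(row) for row in loaded)


row_loaded = load_row_wise(table, 2024)
col_loaded = load_column_wise(table, 2024, ["amount"])
print(values_moved(row_loaded), values_moved(col_loaded))
```

For a query that only needs the amounts for 2024, the row-wise load moves all four columns of both rows, while the column-wise load moves just the one column of that partition.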
Reducing the Data Footprint
Benefits of Reduced Data Footprint
•The data in the SAP HANA column store tables is automatically compressed in order to
reduce the data footprint.
•The following are a number of benefits associated with a reduced data footprint:
•You can get more data into the CPU cache, and therefore reduce main memory access, in
order to maintain high performance.
•You can fit entire enterprise databases into memory and avoid disk access.
•Operations such as backup and restore are speeded up as data sizes decrease.
•The amount by which data reduction can take place is driven by the shape of the business
data. Compression is most impressive when there is a lot of data repetition in the tables. An
example is a huge sales order table, in which the customer type is stored on each customer
order, but there are only three customer types. The customer type is repeated many times
across the table.
•Compression strips out the repetition and uses integers to represent the business values.
Then it uses special dictionary tables to hold the distinct list of business values and the
corresponding integers. This all happens in the background, and is not visible to the
business user. It is also not something with which the developer needs to be concerned.
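The dictionary approach described above can be sketched in a few lines. This is a simplified illustration, not SAP HANA's actual implementation; the customer type values and function names are made up for the example.

```python
def dictionary_encode(values):
    """Replace each business value with a small integer and keep the
    distinct values in a separate dictionary."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    codes = [code_of[value] for value in values]
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


# Hypothetical customer type column with heavy repetition.
customer_type = ["RETAIL", "WHOLESALE", "RETAIL", "ONLINE", "RETAIL", "ONLINE"]
dictionary, codes = dictionary_encode(customer_type)

print(dictionary)
print(codes)
```

Each repeated string is stored once in the dictionary, and the column itself holds only small integers. The more repetition there is, the larger the saving.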
Parallel Processing
A key theme of SAP HANA is parallel processing.
With the new hardware architecture, especially
utilizing the new multi-core processors, you can
ensure instant responses by spreading out the
processing tasks across the cores.
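The idea of spreading work across cores can be sketched as splitting a workload into batches and handing each batch to a worker. The request count, the worker count, and the function names are chosen for illustration. Note that this sketch uses threads, which in CPython show the partitioning pattern but do not give true CPU parallelism; a process pool would be the usual choice for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_batch(batch):
    """Stand-in for real work: returns how many requests were handled."""
    return sum(1 for _ in batch)


def process_in_parallel(requests, workers):
    """Split the requests into one batch per worker and run the batches
    concurrently."""
    size = max(1, -(-len(requests) // workers))  # ceiling division
    batches = [requests[i:i + size] for i in range(0, len(requests), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_batch, batches))


per_worker = process_in_parallel(list(range(1000)), workers=2)
print(per_worker)
```

With two workers, each one handles half of the 1000 requests.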
•SAP HANA is capable of taking over many of the processing tasks from the
application server. All data-related tasks, such as aggregation, filter, sort,
calculate, and predict can be handled by SAP HANA.
•Now the application layer simply needs to tell SAP HANA what is to be done on the
data, and SAP HANA processes the data and sends only the results. This is done in
memory, so speeds can be impressive. We call this code-to-data, as opposed to the
traditional way, which was data-to-code.
•The application layer is still needed with SAP S/4HANA to handle the complex
business logic that must be programmed in a business programming language. In the
case of S/4HANA, this is ABAP. However, many simpler applications can be built
directly on SAP HANA, with no need for an additional application server.
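The difference between data-to-code and code-to-data can be illustrated with any SQL database. The sketch below uses Python's built-in SQLite module purely as a stand-in for the database layer. It is not SAP HANA, and the table and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EMEA", 100.0), ("APJ", 50.0), ("EMEA", 25.0), ("AMER", 80.0)],
)


def totals_data_to_code(conn):
    """Data-to-code: fetch every row, then aggregate in the application."""
    rows = conn.execute("SELECT region, amount FROM sales").fetchall()
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount
    return totals, len(rows)


def totals_code_to_data(conn):
    """Code-to-data: the database aggregates and returns only the results."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)


app_totals, app_rows = totals_data_to_code(conn)
db_totals, db_rows = totals_code_to_data(conn)
print(app_totals == db_totals, app_rows, db_rows)
```

Both functions produce the same totals, but the second transfers one row per region instead of one row per sale. On a large table that difference dominates the response time.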
Multitenancy
•SAP HANA has an extensive built-in library of powerful predictive algorithms and
business functions to suit different analysis scenarios, as shown in the figure.
•In addition to the built-in algorithms, SAP HANA is integrated with the public R
libraries, where thousands of additional algorithms can be found.
•With SAP HANA's ability to manage huge data volumes, and at speed, real-time
predictive analysis is possible and can add huge value to business transactional
processing to offer decision support in-line. You can find many examples of
embedded predictive analysis in S/4HANA applications.
Reasons to Choose SAP HANA - IoT, Big Data, Social Media, Real-Time Data
Row/Column – OLTP/OLAP
• Row stores are a good fit for OLTP
• Reading small portions of a table, but often many of the columns
• Frequent changes to data
• Small (<2TB) amount of data (typically the working set must fit in RAM)
• “Nested loops” joins are a good fit for OLTP
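For readers unfamiliar with the term, a nested loops join compares each row of one table with each row of the other. The sketch below shows the basic technique on small in-memory lists; the table contents and the function name are hypothetical, and real databases usually use an index on the inner table instead of a full scan.

```python
def nested_loop_join(outer, inner, key):
    """For each outer row, scan the inner rows and combine the matches."""
    result = []
    for outer_row in outer:
        for inner_row in inner:
            if outer_row[key] == inner_row[key]:
                result.append({**outer_row, **inner_row})
    return result


orders = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": 20},
]
customers = [
    {"customer_id": 10, "name": "ACME"},
    {"customer_id": 20, "name": "Globex"},
    {"customer_id": 30, "name": "Initech"},
]

joined = nested_loop_join(orders, customers, "customer_id")
print(joined)
```

The approach works well when the outer side is small, as is typical for OLTP requests that touch only a handful of records.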
Row for OLTP and Column for OLAP
Row/Column – OLTP/OLAP
1. The application needs to process only a single record at a time (many selects and/or updates
of single records).
3. The columns contain mainly distinct values so that the compression rate would be low.
The SAP HANA database allows joining row-based tables with column-based tables. However, it is
more efficient to join tables that are located in the same row or column store. For example, master
data that is frequently joined with transaction data should also be stored in column-based tables.
HANA Row vs Column
Limitations of Standard Databases and In-Memory Databases
• Potential loss of data and a limit on database size. When this is acceptable, you should
certainly use RDM’s in-memory solution.
• There are three schemes for dealing with the durability issue. Each has advantages and
challenges:
• RAM Disk vs SSD – ten times faster read and write speed
via RAM virtual disk
[Figure: 1000 requests processed by a single CPU, compared with multi-core processing where the 1000 requests are split 500/500 between Core 1 and Core 2]
• A simple example scenario: while processing a sales order, the business
wants to give special discounts to customers based on their transaction
history (say, for the last 30 years). The recent transaction data is available in
HANA and the very old data has been moved to Hadoop.
• The analysis has to be done over data made by correlating the
recent data residing in HANA with the old data residing in Hadoop, and it has to
be processed on the fly and quickly. Being an in-memory database,
HANA runs very fast. SAP Vora is also an in-memory query engine, and it
can access data from Hadoop and process some of it quickly.
• SAP Vora builds structured data hierarchies for the unstructured data in
Hadoop and integrates it with data from HANA, and then through the
Apache Spark SQL interface it enables OLAP-style in-memory analysis on
the combined data.
• Typically, all this data in Hadoop is unstructured, and SQL cannot be run
immediately on top of it. This is where Vora adds value, and it
can also act as a bridge between HANA and Hadoop.
How an In-Memory Database Will Help
• Faster processing
• Ability to provide real-time analysis
• Live dashboards
• Smaller data footprint because there is no data redundancy
• Single source of truth
Compression
• However, the user can load and unload the data of an individual table
manually by selecting the table in its schema in SAP HANA studio,
right-clicking it, and selecting the option "Unload/Load".
What is compression ratio
Compression Technologies
Multi-Core Parallel
Multi-User Parallel
HANA Streaming
HANA IoT
HANA IoT Application
Architecture
HANA Architecture
1. Index Server
2. Pre-processor Server
3. Name Server
4. Statistics Server
5. XS Engine
Architecture Discussion
• Key points of the HANA architecture:
• Not only is data stored in HANA memory, but what makes it faster is that
the calculations are made in the database and only the results transfer to
the application layer
Data Loading to HANA
• SAP Landscape Transformation (SLT)
• Data Services (DS)
• Smart Data Access (SDA)
• For SAP HANA, the major focus is real-time.
• Data Services can do near-real-time and batch processing.
Data Provisioning for SAP HANA
HOT DATA
• This is the area where 100% of your PRIMARY IMAGE DATA is in the
HANA in-memory space (RAM) and is instantly available for all
operations.
• In the BW world, this is typically the InfoCubes and Standard DSOs, as they
constitute the reporting and harmonization (EDW) areas respectively, as
shown below. They are very frequently accessed for reporting and
harmonization purposes and hence are the ideal candidates for being fully
in-memory and to fully benefit from the HANA capabilities.
COLD DATA
• This is the area where 100 % of your
PRIMARY IMAGE DATA is in a
SECONDARY DATABASE (ON DISK) and
the response is slightly slower than
HANA but still offers reasonably fast
READ ONLY access to data for reporting
purposes, as if they were in one
database.
WARM DATA
• SAP HANA Live provides SAP-delivered content (similar in concept to SAP BW
content) in the form of SAP HANA calculation views for real-time operational reporting.
The calculation views span the majority of ECC modules (FI, CO, MM, PP, SD, PS,
CRM, GTS, AM and GRC). The content is represented as a virtual data model (VDM),
which is based on the transactional and master data tables of the SAP Business Suite.
Currently more than 2000 views are delivered in the HANA Live package.
HANA Fuzzy Search
• Full Text Search
• Full Text Indexing
• Fuzzy Search
• Which of the hotels in India get great reviews for the room service?
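Fuzzy search means that a query still finds a match when it contains typos or small variations. The sketch below illustrates the general idea with Python's standard `difflib` module. It does not use SAP HANA's fuzzy search engine, and the hotel names and the similarity cutoff are hypothetical.

```python
import difflib

# Hypothetical hotel names, used only for this illustration.
hotels = ["Seaside Palace", "Grand Heritage", "Lakeview Residency", "Hilltop Inn"]


def fuzzy_search(query, candidates, cutoff=0.6):
    """Return the candidates whose similarity to the query is at least
    `cutoff`, best match first, ignoring upper/lower case."""
    by_lower = {name.lower(): name for name in candidates}
    matches = difflib.get_close_matches(
        query.lower(), list(by_lower), n=3, cutoff=cutoff
    )
    return [by_lower[match] for match in matches]


print(fuzzy_search("Grand Heritge", hotels))  # query with a typo
```

An exact-match lookup would return nothing for the misspelled query, while the similarity-based lookup still finds the intended hotel.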
[Table 43: Throughput - Project (© SAP SE). Columns: Sizing Element, TI, A/P, Objects, WBS, Netw., Act/nw, Changes, Displays, Mon., Arch., S.t., E.t., ID, Fiori, Short text. Example rows: PS P Y A _ 9 18 _; PS P P P _ 12 13 _]
T-Shirt Sizing