Getting Started Building An On-Premise Data Warehouse Using SAP BW/4HANA
We have access to more computing power than ever before. In recent years, we have seen memory
sizes increase dramatically. Also, CPUs have become incredibly powerful with multiple cores
running on each CPU, which means that we can distribute workloads across the cores to achieve very
high parallelization.
Many databases rely on disk for data storage, but SAP HANA uses very large memory for data
storage.
SAP HANA was built to run on the latest hardware that uses multi-core processors and huge
memory. The power of SAP HANA has a direct impact on the performance of SAP BW/4HANA.
By utilizing multiple cores across many CPUs, data is loaded faster and queries run at very high
speed.
Figure 2: Row and column store tables
SAP HANA supports the traditional row table architecture found in most databases, but also
supports column store tables. Column store tables are optimal for analytical use cases. Data
warehousing, and in particular the querying function, is the ideal use case for column tables. SAP
BW/4HANA uses column store tables.
There is a very good reason for using column store instead of row store, especially for querying data.
A typical query usually requests only a few columns of a table. A query on a row table processes all
columns of the table, even those columns you did not request in the query. But a query on a column
table processes only the requested columns. Processing only the columns that were requested
improves query performance.
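To make the idea concrete, here is a tiny sketch (in plain Python, with invented data) of why the column layout helps: the aggregation below touches only the two columns the query needs, while a row store would scan whole records.

```python
# Toy comparison of row-oriented vs column-oriented storage.
# Names and numbers are illustrative, not SAP HANA internals.

# Row store: each record is kept together.
row_store = [
    {"customer": "C1", "product": "P9", "amount": 100.0, "currency": "EUR"},
    {"customer": "C2", "product": "P3", "amount": 250.0, "currency": "EUR"},
    {"customer": "C1", "product": "P3", "amount": 75.0,  "currency": "EUR"},
]

# Column store: each column is kept as its own contiguous array.
column_store = {
    "customer": ["C1", "C2", "C1"],
    "product":  ["P9", "P3", "P3"],
    "amount":   [100.0, 250.0, 75.0],
    "currency": ["EUR", "EUR", "EUR"],
}

# Query: total amount per customer. The column store reads only the
# two requested columns; unrequested columns are never touched.
totals = {}
for cust, amt in zip(column_store["customer"], column_store["amount"]):
    totals[cust] = totals.get(cust, 0.0) + amt

print(totals)  # {'C1': 175.0, 'C2': 250.0}
```

With millions of rows and wide tables, skipping the unrequested columns is what makes analytical queries on column tables fast.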
One of the additional benefits of using column store tables is that the data in the table is
automatically compressed. Compression of data massively reduces the footprint of the table by as
much as 90%. This means we can fit more data in SAP BW/4HANA compared to row storage.
Compression is taken care of automatically by SAP HANA.
Key Features of SAP BW/4HANA
SAP BW/4HANA is a software solution for building an on-premise data warehouse using the
out-of-the-box approach.
Historically, data warehouses were built from scratch, typically using SQL code. This approach is
still favored by many organizations who have very specific needs that can't be met using standard
software. SAP provides tools to support the custom approach.
But for many organizations, especially those who run SAP applications, an out-of-the-box approach,
where no coding is needed, is preferred.
The out-of-the-box approach means customers can immediately extract data from any SAP source
application and quickly build analytics on top. SAP BW/4HANA also includes tools to extend the
ready-made content and create brand new content.
SAP BW/4HANA can connect to any data source from any application, SAP and non-SAP. SAP
BW/4HANA includes ready-to-go connectivity to most SAP applications. There are tools to create
new connections to any non-SAP source.
We already learned that a data warehouse includes three basic layers: data acquisition, data storage
and data modeling. SAP BW/4HANA includes tools to quickly build all three layers. A data
warehouse requires tools for handling operations such as scheduling and monitoring, defining
security access, and managing the life-cycle of models and data. SAP BW/4HANA includes all of
those.
Learn more about each of the capabilities of SAP BW/4HANA:
Data Acquisition and Storage
• Acquire data periodically or in real-time
• Load and store data
• Delta load (load only what has changed)
• Error handling of data loads
• Transform data (clean bad data, fill in missing values, etc.)
Data Modeling
• Develop physical and virtual data models and combine them
• Build data marts and star schemas (facts and dimensions)
• Use or adapt SAP-provided data models for fast start
• Built-in planning functions
Operations
• Define error handling within jobs
• Tools provided for monitoring scheduled jobs
• Collect statistics from operations and business users' analytics for performance monitoring
Life-cycle Management
• Multi-temperature data management (hot, warm, cold) to manage data growth
• Propagation of meta-data changes (e.g. when source tables change) across the entire
development landscape (DEV → QA → PROD)
• Impact analysis of adjustments to models
• Tools to manage upgrades of SAP BW/4HANA
SAP BW/4HANA has its roots in SAP BW, which was the first generation data warehouse solution
from SAP that was launched in the late nineties. Many customers implemented SAP BW and are
now on a path to migrate to the more powerful SAP BW/4HANA.
Figure 3: SAP BW/4HANA: Three-Layer Architecture
SAP BW/4HANA manages these layers using various technical objects, described below.
Data Acquisition
SAP BW/4HANA provides tools that support the connectivity of any source system. The crucial
object that supports this connectivity is the DataSource. The DataSource defines the connection to the
source system and the fields that are required from the source tables for data transfer to
SAP BW/4HANA.
Data can be extracted, transformed, and loaded to SAP BW/4HANA either periodically - sometimes
referred to as batch loading - or in real-time. Many source systems support the loading of only the
data that has changed, or is new, since the last load. This is known as a delta load.
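A delta load can be sketched as follows; the timestamp-pointer mechanism and the field names are simplifying assumptions for illustration, not the actual delta queue implementation:

```python
# Hedged sketch of a delta load: extract only records created or
# changed since the last successful load, tracked by a "pointer".
from datetime import datetime

source_table = [
    {"order": "1001", "changed_at": datetime(2024, 1, 10)},
    {"order": "1002", "changed_at": datetime(2024, 2, 5)},
    {"order": "1003", "changed_at": datetime(2024, 2, 20)},
]

last_load = datetime(2024, 2, 1)  # pointer from the previous run

# Delta load: only what is new or changed since the pointer.
delta = [r for r in source_table if r["changed_at"] > last_load]
print([r["order"] for r in delta])  # ['1002', '1003']

# Advance the pointer so the next run skips what was just loaded.
last_load = max(r["changed_at"] for r in source_table)
```

Compared with a full load of all three records, only the two changed records cross the network, which is why delta loading matters for large source tables.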
One of the most popular sources of data to SAP BW/4HANA is from the SAP ERP systems such as
SAP Business Suite or SAP S/4HANA. For these types of sources, a large number of ready-made
DataSources are provided by SAP so that customers can get started quickly. The ready-made
DataSources could be considered as pre-wiring of the SAP application source data to SAP
BW/4HANA. That is one of the most appealing aspects of SAP BW/4HANA for customers who
already run SAP systems, and why most SAP ERP implementations also include SAP BW/4HANA.
For non-SAP sources, it is possible to develop custom DataSources using the provided SAP
BW/4HANA tools.
Many DataSources support a connection to data that is stored remotely. This means a real-time view
of fast-moving, operational data is possible. Loading and storing of data to SAP BW/4HANA is not
always necessary.
Data Modeling
SAP BW/4HANA provides tools for the development of dedicated objects that support advanced
data modeling. A modeler creates a data flow that moves data from the source systems into target
storage objects. The target storage objects are collectively known as InfoProviders. InfoProviders
include DataStore Objects (advanced) for storing transactional data, and Characteristic InfoObjects
for storing master data.
At various points in the data flow we can pass data through Transformations. Transformations clean
up data. Examples of data cleansing include filling in missing values, or adjusting or correcting data.
During the load, data can be checked for quality, and bad or suspect data can be flagged and even
quarantined. Once raw data is loaded, it can be modeled using tools that add business
semantics. Business semantics might include assigning a currency to a measure, or assigning
useful attributes to a characteristic, such as adding a weight to a product.
Beside the DataStore Objects (advanced) and Characteristic InfoObjects, which are physical objects
that store the data, the modeling layer can contain the modeling object Open ODS View which
supports a connection to remote data without storing data. Another type of object that doesn't store
data is the CompositeProvider. A CompositeProvider combines data from a number of sources to
make it useable for analysis.
Data Analysis
In order to consume the data model, a developer creates a BW Query on top of an InfoProvider. The
BW Query defines the specific layout of the final report and any special requirements such as sub-
totals, sorting sequence, additional calculations, and filters.
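Conceptually, the extra layout work a BW Query adds on top of the data model (filters, sort order, subtotals) looks like the plain-Python sketch below; the data is invented and this is not the BW Query runtime:

```python
# Conceptual sketch of BW Query-style layout logic over raw rows.
rows = [
    {"region": "North", "product": "P1", "net": 100},
    {"region": "North", "product": "P2", "net": 40},
    {"region": "South", "product": "P1", "net": 70},
    {"region": "South", "product": "P3", "net": 30},
]

# Filter: keep rows with net >= 40; sort by region, then product.
selected = sorted(
    (r for r in rows if r["net"] >= 40),
    key=lambda r: (r["region"], r["product"]),
)

# Subtotal per region, plus an overall total.
subtotals = {}
for r in selected:
    subtotals[r["region"]] = subtotals.get(r["region"], 0) + r["net"]
overall = sum(subtotals.values())

print(subtotals, overall)  # {'North': 140, 'South': 70} 210
```

The point is that none of this logic lives in the reporting tool: the query definition carries it, so every front end consuming the query gets the same layout and numbers.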
A BW Query is not a reporting tool. A BW Query sits between the data model and the final report. A
reporting tool is not provided with SAP BW/4HANA. Customers usually have their own preferences
for which reporting tool they would like to deploy. SAP offers cloud and on-premise reporting tools
such as SAP Analytics Cloud, SAP Analysis for Microsoft Office, SAP Crystal Reports, and many
more. SAP BW/4HANA also supports non-SAP reporting tools using open standards.
The modeling of the SAP BW/4HANA objects described above is performed in the BW Modeling
Tools (BWMT), which are plug-ins for Eclipse, a well-known, open-source integrated development
environment. SAP developed these plug-ins so that the tool includes a perspective,
multiple views, menus, and icons that support the SAP BW/4HANA modeler.
Using InfoAreas it is possible to define a hierarchical structure to organize the SAP BW/4HANA
objects so that the developers can easily locate the objects.
Figure 6: SAP BW/4HANA Cockpit
Figure 7: SAP BW/4HANA Content
SAP BW/4HANA and SAP S/4HANA embedded analytics
Many SAP customers use SAP S/4HANA to run their business. SAP S/4HANA already includes
powerful tools to generate analytics. These included tools are collectively known as SAP S/4HANA
embedded analytics. SAP S/4HANA embedded analytics does not replace SAP BW/4HANA; rather,
the two complement each other.
Many SAP BW/4HANA customers also use SAP S/4HANA embedded analytics, so we should briefly
look at how they work together.
A data warehouse is a core part of a business intelligence (BI) solution and has these main functions:
• Extract data from all data sources across an organization in real-time or periodically
• Manage the data in a central repository
• Provide sophisticated data modeling capabilities
• Provide a view of data at any level of granularity
• Provide a trusted source of data used by the entire organization
A data warehouse should be capable of extracting data from any data source on any technology
platform, regardless of whether it is cloud or on-premise.
During data extraction, data cleansing takes place. Data cleansing is the process of analyzing
incoming data and thoroughly checking it for invalid or missing values. Incorrect values are
immediately corrected and missing values are added. Invalid or incomplete data can be rejected.
Once data has passed through the cleansing stage it is then combined with related data from other
sources. For example, sales data might be combined with delivery and billing data to provide a
complete picture of the entire sales process.
Modern data warehouses not only collect historic data but are also able to access live data. This
means an up-to-the-second view of business data is supported in conjunction with the possibility to
navigate to any point in history.
A data warehouse is part of a continuous cycle in which data is generated by transactional systems
and is then collected by the data warehouse. The data warehouse enriches the data with additional
calculations, aggregations, and conversions before it is consumed by analytical applications.
Data from all sources across the organization is combined in the data warehouse to provide the
complete picture of business performance. The analytical applications generate insights which can be
used by business leaders to act quickly on opportunities or to respond to risks that are highlighted.
Data warehouse architecture
A data warehouse architecture can be described using the following layers:
Data Modeling Layer
• Build reusable models ready for analytics consumption.
• Enrich the raw data with extra information useful for analytics.
Data Storage Layer
• Create a trusted source of data for the modeling layer to consume.
• Cleanse data, harmonize, check values, and add missing data.
Data Acquisition Layer
• Extract data from all source systems periodically or in real-time.
• Load data that has changed since the last load.
Building a data warehouse requires a well-trained team of specialists who understand the data
sources and the analytical needs of the organization.
There are many questions to ask that will determine how the data warehouse will be built and the
type of analytical services it will provide to the business.
Figure 10: Important questions that need to be answered during the design phase of a data warehouse
Positioning Enterprise Data Solutions
Positioning SAP BW/4HANA
SAP BW/4HANA is the enterprise data warehousing solution of SAP for on-premise or private
cloud deployment. Customers with SAP BW/4HANA benefit from future improvements and
investment protection until 2040 at least.
SAP Business Warehouse (BW) has been available since 1998 and is in use by a large customer base.
It represents an integrated data warehouse application which provides predefined content (line of
business and industry-specific), tools and processes to design and operate an enterprise data
warehouse end-to-end. This approach simplifies the development, administration, and user interfaces of
your data warehouse, resulting in enhanced business agility for building a data warehouse in a
standardized way.
SAP BW/4HANA was released in 2016 as the logical successor of SAP BW (which was based on
SAP NetWeaver), with a strong focus on integrating existing SAP ABAP application components
into the SAP HANA platform. It follows a simplified data warehouse approach, with agile and
flexible data modeling, SAP HANA-optimized processes, and user-friendly interfaces.
The figure below gives an overview of some core features of the application-driven data warehousing
approach provided by SAP BW and SAP BW/4HANA.
Figure 11: Core features of the application-driven data warehousing approach provided by SAP BW
and SAP BW/4HANA
What makes SAP Business Data Cloud so powerful is that it offers the tools and technologies to
meet all data and analytics requirements of a modern and agile organization. It uses the latest
technology to support scenarios such as:
• Out-of-the-box reporting.
• Machine learning and artificial intelligence.
• Advanced data modeling and data warehousing.
• Powerful planning and reporting.
• Intelligent data management.
SAP Business Data Cloud provides data warehousing features including predefined standard content
for insights on business data, a manual data integration and data modeling approach, AI- and
machine-learning-based extensions of data models, and innovative out-of-the-box reporting
capabilities side by side. With this wide range of functions, it covers all the requirements of a
modern data and analytics solution and thus serves different target audiences with different
requirements.
SAP centralizes data from SAP and non-SAP sources into a unified semantic layer, unlocking a new
dimension of insights, advanced analytics, and AI capabilities. By integrating cross-company data,
businesses gain actionable intelligence to bridge transactional processes and drive AI-powered
growth. SAP’s AI agents leverage accurate, context-rich data from both SAP and non-SAP systems
to deliver advanced automation, seamless cross-solution collaboration, and innovative decision-
making, enabling businesses to adapt, innovate, and thrive at scale. Every part of the business is
deeply connected in today's digital-first world.
A key highlight of SAP Business Data Cloud is its out-of-the-box reporting capability, featuring
Insight Apps, which create business insights with a single click, empowering informed decision-
making. The concept of this feature is based on predelivered artifacts and objects that may remind
you at first glance of the well-known business content in SAP BW/4HANA, SAP Datasphere, or
SAP Analytics Cloud. But there's a significant difference from the existing concepts. Unlike before, all
artifacts, objects, and process steps for a ready-to-consume full-stack application are included and
fully managed by SAP.
By deploying Insight Apps, you can already gather many insights into your SAP data. However, what
if you needed to execute some machine learning algorithms on top of that data? To save you the
trouble and the cost of copying the data into another Machine Learning Platform, SAP decided to
provide Databricks as a service with SAP Business Data Cloud. As it's been tailored to the specific
needs of SAP, and does not include the complete Databricks architecture and capabilities, this
component is called SAP Databricks.
In SAP BW/4HANA, we define InfoObjects to represent each individual business entity. Examples
of business entities include: customer, product, delivery quantity, or billing amount. InfoObjects are
used throughout SAP BW/4HANA to provide essential business and technical information and the
handling rules for each entity. When you implement SAP BW/4HANA, one of the very first tasks is
to define all InfoObjects that you require.
To save you time, SAP provides, within the SAP BW/4HANA Content, thousands of InfoObjects
already created within SAP BW/4HANA. Each InfoObject has a technical name (the unique
identifier of the object), and a description (what the business user sees in a report). The technical
names of all InfoObjects delivered by SAP begin with 0 (zero).
If you cannot find a ready-made InfoObject that fits your requirement, you can define your own
custom InfoObject.
Characteristic InfoObjects
Characteristic InfoObjects are the main business entities that provide context in a report. Examples
are:
• Customer (0CUSTOMER)
• Material (0MATERIAL)
Key Figure InfoObjects
Key Figure InfoObjects represent the measurable values in a report, such as amounts and quantities.
Examples are:
• Quantity (0QUANTITY)
• Amount (0AMOUNT)
Unit InfoObjects
Unit InfoObjects must be assigned when defining a key figure of the type amount or quantity to
provide meaning to the key figure. SAP provides a ready-made unit InfoObject for each type:
• Unit of Measure (0UNIT)
Time Characteristic
Time Characteristic InfoObjects are used to represent units of time, such as day or month. SAP
provides most of the common units of time, but you can create your own if needed. The following
are examples of this type of InfoObject:
• Calendar day (0CALDAY)
SAP BW/4HANA provides different types of InfoProviders to support various scenarios. Let's take a
look at some of them:
DataStore Object (advanced)
• This is the main InfoProvider used to store the transaction data that has been loaded.
• There are 3 main types of DataStore Object (advanced) to support different scenarios:
a) DataStore Object (advanced) - Staging: used to temporarily hold data while it is in flow and
also to retain a copy of the data in case it is needed for future data flows. It usually appears
early in the data flow and is usually loaded directly from the source system. This raw data is
usually unchanged from the source system.
b) DataStore Object (advanced) - Standard: used as a permanent store of data that has been
integrated, harmonized from multiple sources, and enriched. Data is usually loaded from a
Staging DataStore Object.
c) DataStore Object (advanced) - Data Mart: also used as a permanent store of data that is
typically aggregated and calculated for a specific business case. Data is usually loaded from a
Staging DataStore Object or from a Standard DataStore Object.
Open ODS View
• An Open ODS View is a modeling object that can be used to bypass the data staging and
storage layers if they are not needed.
• An Open ODS View is often used to integrate data that is managed outside of SAP
BW/4HANA without having to store it in SAP BW/4HANA.
CompositeProvider
• A CompositeProvider combines data from multiple sources, such as sales and delivery
data stored in DataStore Objects (advanced) or provided by Open ODS Views. Additional
calculations can be added to CompositeProviders.
• A CompositeProvider does not persist data but always processes data at run-time. A query
is usually built on top of a CompositeProvider.
Characteristic InfoObject
• A characteristic InfoObject can store master data such as attributes, texts, and hierarchies.
Master data provides additional information in a business report.
• It is possible to build a query directly on a characteristic InfoObject to directly report on
master data values. For example, a list of customers in a specified region.
Figure 17: SAP BW/4HANA Layered Scalable Architecture (LSA)
Figure 18: SAP BW/4HANA data flow
Master Data
• To load master data you need to define a characteristic InfoObject. There are three types of
master data in SAP BW/4HANA: attributes, texts, and hierarchies. You can choose to load
any one of these types of master data or all of them.
• The main purpose of master data is to enrich the transactional data with additional
information in a report. On its own, master data is not especially interesting to the business
user who monitors business performance. Equally, transaction data without master data does
not provide enough information to the business user.
• In our example we saw the characteristic InfoObject ProductID with the value HT-1000 has
the text Notebook Basic 15 and the Product ABC Category attribute value is C. These values
are stored in the master data tables of SAP BW/4HANA and are loaded from the source
system master data tables.
• Optionally, hierarchy master data can be loaded to enable the hierarchical presentation of
multiple characteristic InfoObject values, instead of presenting the values in an endless flat
list. This makes navigation in reports much easier.
• All three types of master data must be loaded into SAP BW/4HANA using data flow objects
such as DataSources, Transformations, and Data Transfer Processes (DTPs).
Transactional Data
• Transactional data is generated in source systems such as SAP S/4HANA by the execution of
various business processes such as sales order processing.
• The transactional data is stored in source system tables that include many fields.
DataSource
• A DataSource is the object in SAP BW/4HANA that is created to support data extraction
from the source system.
• A DataSource can be regarded as the entry point for data arriving in SAP BW/4HANA.
• A DataSource is simply a structure of fields that correspond to the source system fields.
• The DataSource defines which fields are transferred from the source tables to SAP
BW/4HANA.
• No data is stored in a DataSource. It is a pass-through object used to convey data to the next
layer in SAP BW/4HANA.
• A DataSource is used in a transactional data flow and a master data flow.
DataStore Object (advanced) - Type: Staging and Field-based
• A DataStore Object (advanced) - Type: Staging stores the data in the same format as the
source system, with no changes to the values.
• This layer retains all changes to the loaded data so that we have a complete history.
• This layer is also known as Corporate Memory because it can be regarded as the permanent
copy of the source data that we preserve in case we need to reuse the data in the future.
• This type of DataStore Object is not used for reporting because, with its history-preservation
features, its focus is on staging data rather than reporting.
Transformation
• A transformation connects the DataSource with the DataStore Object (advanced).
• A transformation defines the data movement action for each field. Actions can include direct
mapping (no change to the data), adjustment to an incorrect data value, adding a missing
value, formatting a value, concatenating two fields, splitting a field into two fields, and so on.
• A transformation always sits between a source object, such as a DataSource, and a target
object, such as a DataStore Object.
• A transformation is used in a transactional data flow and a master data flow.
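The field-level actions listed above can be sketched as a simple record-mapping function; all field names below are invented for illustration and are not a real transformation definition:

```python
# Illustrative sketch of transformation actions: direct mapping,
# filling a missing value, formatting a value, and concatenation.
def transform(record):
    out = {}
    out["order_id"] = record["order_id"]               # direct mapping (no change)
    out["currency"] = record.get("currency") or "EUR"  # fill in a missing value
    out["country"] = record["country"].upper()         # format a value
    # Concatenate two source fields into one target field.
    out["full_name"] = f'{record["first_name"]} {record["last_name"]}'
    return out

raw = {"order_id": "42", "currency": "", "country": "de",
       "first_name": "Ada", "last_name": "Lovelace"}
print(transform(raw))
# {'order_id': '42', 'currency': 'EUR', 'country': 'DE', 'full_name': 'Ada Lovelace'}
```

In SAP BW/4HANA such rules are defined per target field in the Transformation object rather than written as free-form code, but the mapping idea is the same.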
Data Transfer Process (DTP)
• A DTP provides the loading parameters such as the filters that determine which data to load
and settings that determine how loading errors are handled.
• A DTP always sits between a source object, such as a DataSource or a DataStore Object, and
a target object.
• A DTP is the object which you execute to start the data load.
• A DTP is used in a transactional data flow and a master data flow.
BW Query
• A BW Query defines which InfoObjects from the CompositeProvider should be used in a
report.
• It is always good practice to build a query on top of a CompositeProvider and not directly on
top of DataStore Objects (advanced). This is because the relationship between the BW Query
and the underlying CompositeProvider remains stable even if the underlying sources of the
CompositeProvider change.
• The query defines the initial layout of the report, filters, calculations, and so on.
Report
• SAP BW/4HANA does not provide tools to create reports. Customers can choose their own
reporting tool such as SAP Analysis for Microsoft Office or SAP Analytics Cloud.
• One or more BW Queries are included in the report.
• When the transactional data is modeled using master data characteristic InfoObjects, the
master data can be displayed in the report.
• This happens because the characteristic values in the transactional data, such as product
numbers, are automatically connected to the central master data of the characteristic.
See the diagram below for an example.
Shared Master Data
We learned that master data is automatically combined with transaction data. In our example we saw
that the product ID in the transaction connects to the product master data to provide additional
attributes and descriptions. If we have loaded customer master data then we would also have
additional information about each customer such as the country and discount %.
Figure 19: Shared Master Data
One of the main reasons we load and store master data separately from the transaction data is so we
can share the common master data across all transactions where the characteristic is present.
For example, if we load the master data for product we can then share the product attributes and
texts with multiple DataStore Objects where product is included in the record. See the diagram
above for an example.
In terms of sequence, it is good practice to load the master data before the transaction data. This
is because during transaction data loading, we often refer to the loaded master data to check that the
transaction values are valid. For example, is this a real customer number? If not, we can reject the
transaction record or at least mark it to be checked.
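The load sequence described above can be sketched as follows; the identifiers are invented, and real SAP BW/4HANA performs such checks via master data lookups in the transformation and DTP error handling rather than in plain Python:

```python
# Master data is loaded first, then incoming transactions are
# validated against it.
customer_master = {"C100": "ACME", "C200": "Globex"}  # loaded first

transactions = [
    {"customer": "C100", "amount": 500},
    {"customer": "C999", "amount": 120},  # unknown customer number
]

valid, rejected = [], []
for t in transactions:
    # Reject (or flag for review) records whose customer number is
    # not present in the loaded master data.
    (valid if t["customer"] in customer_master else rejected).append(t)

print(len(valid), len(rejected))  # 1 1
```

Loading master data first is what makes this check possible: if the transaction data arrived first, there would be nothing to validate it against.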
Figure 20: Consolidating and Distributing Data
Acquiring Data, Modeling, and Analytical Functions of SAP BW/4HANA
Data flow scenario for transactional data
Scenario: The Journey of Data from SAP S/4HANA Source Tables to a Report
Transactional data in SAP S/4HANA is distributed across many tables. For example, to collect the
most important data relating to a sales order, you would need to access more than 20 database tables
and then combine the records.
But instead of extracting data from the individual tables, we define views to collect and combine the
data from the various tables so that during loading, SAP BW/4HANA can simply consume the view.
The view is created in the SAP S/4HANA source system as an ABAP CDS View.
In our example scenario, the sales transactional data provided by the ABAP CDS View
ZPEPMCDSVSO1 in SAP S/4HANA generates sales data from the columns of multiple tables:
Even though there are plenty of useful columns in this view, we are still missing some additional
information that would improve the final report that the business user would generate. For example:
• description (name) of the product
• product category and gross weight of products shown
• tax rate percentage
• number of business partners that products are sold to
Perhaps the business user would like to see the products displayed hierarchically within product
groups to facilitate drill-down.
Here is an example of a report that includes the basic information from the view that has been
complemented with additional information to produce an effective result:
This report contains key figures such as Gross Amount and Net Amount that relate to each
characteristic InfoObject Product ID. However, the report not only displays Product IDs but also
organizes the Product IDs using a hierarchy. Here we also see the product description, and additional
attributes such as Product ABC Category and Gross Weight.
Figure 23: Hierarchy in the report
In our scenario, the master data (texts, attributes, hierarchies) that relate to the Product ID has
already been loaded to SAP BW/4HANA.
However, the sales transactional data has not yet been loaded. To achieve this, a data flow has to be
created, starting with an ABAP CDS View in SAP S/4HANA and ending with the BW Query. On
top of the BW Query we will create a business report. The finished data flow is shown in the diagram
below:
ABAP CDS View: terminology and use cases
An ABAP CDS View is an ABAP object that is a part of SAP S/4HANA. An ABAP CDS View
exposes business data that is stored in database tables. SAP provides a very large number of ABAP
CDS Views covering all business functions. It is possible to customize these SAP-supplied ABAP
CDS Views or even create your own from scratch. An ABAP CDS View is created and maintained
in a development tool called Eclipse. We use code to define the ABAP CDS View. The code looks a
lot like SQL. The code can appear a little complicated at first, but with practice you will soon learn
how to understand the processing logic of an ABAP CDS View and to identify the most important
parameters used by the SAP BW/4HANA extraction process.
An ABAP CDS View can serve multiple purposes. It can be used by many different SAP systems
including SAP Analytics Cloud and SAP DataSphere. If you want an ABAP CDS View to support
data extraction and loading to SAP BW/4HANA, you must enable the setting:
@Analytics.dataExtraction.enabled: true in the definition.
In our scenario the ABAP CDS View ZPEPMCDSVSO1 combines several sales tables such as
SNWD_SO (Sales Order Header data), SNWD_SO_I (Sales Order Item Details) and SNWD_SO_SL
(Sales Orders Schedule Lines). This ABAP CDS View consists of source fields and their
corresponding meta data (data type, length, decimals, currency, etc) from the selected tables.
Our ABAP CDS View has the setting @Analytics.dataExtraction.enabled: true so it is
ready to extract data from SAP S/4HANA to SAP BW/4HANA.
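As a rough illustration, a simplified definition of such an extraction-enabled ABAP CDS View might look like the sketch below. The view name ZPEPMCDSVSO1 and the EPM tables SNWD_SO and SNWD_SO_I come from the scenario, but the selected fields, the join condition, and the SQL view name are assumptions for illustration only, not the actual view definition.

```abap
@AbapCatalog.sqlViewName: 'ZPEPMSQLVSO1'  // hypothetical SQL view name
@Analytics.dataExtraction.enabled: true   // makes the view extractable to SAP BW/4HANA
define view ZPEPMCDSVSO1 as
  select from snwd_so as so               // sales order header
    inner join snwd_so_i as item          // sales order items
      on item.parent_key = so.node_key
{
  key so.so_id          as SalesOrderId,
  key item.so_item_pos  as ItemPosition,
      item.gross_amount as GrossAmount,
      so.currency_code  as CurrencyCode
}
```

The key point is the `@Analytics.dataExtraction.enabled: true` annotation: without it, the view can still be consumed in other scenarios, but it cannot serve as an extraction source for SAP BW/4HANA.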
Defining the Inbound Layer of SAP BW/4HANA: DataSource
Scenario Overview
In the next step of the data flow we define a DataSource in SAP BW/4HANA.
Figure 27: DataSource
In our scenario the DataSource ZP_EPM_CDSV_SO_1_DS has been created and connects to a
transactional data ABAP CDS View that provides sales data. The DataSource has been created for
the source system with the technical name T41_CDS to extract ABAP CDS Views from SAP
S/4HANA system which has the system id (SID) T41.
You can start the following demo to explore the settings of the DataSource in SAP BW/4HANA:
Figure 29: SAP BW/4HANA DataStore Object (advanced)
Figure 30: Different types of DataStore Object (advanced)
Depending on the role of the DataStore Object (advanced), different types of DataStore Object
(advanced) can be defined. Here are the most important types:
• Staging DataStore Object: typically used to store a copy of source field data (Raw data
/ Corporate Memory Layer)
• Standard DataStore Object: typically used to handle delta records (Data Warehouse
Layer)
• Data Mart DataStore Object: typically used for reporting and analysis (Data Mart
Layer)
Field-based Staging DataStore Object: system settings
Figure 31: Staging DataStore Object with Inbound Queue Only
In our scenario the DataStore Object (advanced) is defined as Staging DataStore Object with Inbound
Queue Only selected. The setting Inbound Queue Only is selected when a data activation step is not
needed. In our scenario we have decided that we don't need an activation step. Our DataStore Object
(advanced) is built using fields instead of InfoObjects. We have chosen to use fields because in this
layer of the data flow we don't need the advanced features that are provided by an
InfoObject-based DataStore Object (advanced).
Transformation
Scenario Overview
The next step of the data flow is to define a Transformation between the DataSource and the field-
based Staging DataStore Object.
Figure32: Transformation in SAP BW/4HANA
Figure33: Transformation between the DataSource and the field-based Staging DataStore Object
In our scenario a Transformation is needed between the DataSource and the field-based Staging
DataStore Object. As our DataStore Object (advanced) is modeled as Corporate Memory, every field
of the DataSource is mapped 1:1 with the corresponding field of the target DataStore Object
(advanced). Corporate Memory always preserves the raw data from the source system without
making changes. We do this so that we can re-use this data in many other data flows.
Data Transfer Process
Scenario Overview
The next step of our data flow is to define a Data Transfer Process (DTP) between the DataSource
and the field-based Staging DataStore Object.
Furthermore, it's possible to define a Filter in a DTP to specify which data should be loaded. For
example, records for a specific calendar year, or records from a specific sales organization.
To load the data, a DTP has to be executed. For testing purposes execution can be manual, but for
production systems, the execution is usually automatic. You will learn later how we automate the
execution of multiple DTPs using a Process Chain.
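To make the filter idea concrete, here is a small Python sketch (purely illustrative: the field names CALYEAR and SALESORG and the sample records are invented, and a real DTP applies its filter during extraction rather than in application code):

```python
# Illustrative only: a DTP filter restricts which source records are loaded.
# Field names below are hypothetical stand-ins for DataSource fields.
source_records = [
    {"CALYEAR": 2022, "SALESORG": "1000", "QUANTITY": 10},
    {"CALYEAR": 2023, "SALESORG": "1000", "QUANTITY": 5},
    {"CALYEAR": 2023, "SALESORG": "2000", "QUANTITY": 7},
]

def apply_dtp_filter(records, calyear=None, salesorg=None):
    """Keep only records matching the filter criteria (None = no restriction)."""
    result = []
    for rec in records:
        if calyear is not None and rec["CALYEAR"] != calyear:
            continue
        if salesorg is not None and rec["SALESORG"] != salesorg:
            continue
        result.append(rec)
    return result

# Load only calendar year 2023 for sales organization 1000:
loaded = apply_dtp_filter(source_records, calyear=2023, salesorg="1000")
print(loaded)  # one record remains: CALYEAR 2023, SALESORG "1000", QUANTITY 5
```

With no filter arguments, all records pass through, which corresponds to a DTP without a Filter definition.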
Data Transfer Process between DataSource and field-based Staging DataStore Object:
system settings
Figure34: Data Transfer Process with the Extraction Mode setting Full
In our scenario, a Data Transfer Process with the Extraction Mode setting Full is defined. This means
it cannot be used to manage delta records. We did not define a filter in our DTP.
Figure35: InfoObject-based Standard DataStore Object
Unlike the previous object, which was defined as a Staging object type using fields, this one will
be defined as a Standard object type and will use InfoObjects instead of fields.
InfoObject-based Standard DataStore Object: system settings
Figure36: InfoObject-based Standard DataStore Object
In our scenario, the sales transactional data is loaded from the Corporate Memory which uses a field-
based Staging DataStore Object. The target will be the InfoObject-based Standard DataStore Object.
The object has modeling type Standard DataStore Object with Write Change Log option selected. The
setting Write Change Log is typically used for a DataStore Object (advanced) in the Data Warehouse
layer. This is because a change log enables the management of delta records in downstream data
flows.
With this type of DataStore Object (advanced), a data load has to be activated in order to apply the
changes and enable the records to be available for reporting. For testing purposes activation of
requests can be manually executed, but for production systems, activation of requests is part of an
automated process chain which can be scheduled.
In the InfoObject-based Standard DataStore Object, four InfoObjects have been defined as part of
the key. This key defines the uniqueness of records in the DataStore Object (advanced) and is used
to handle delta records. As this DataStore Object (advanced) may also be filled with sales
transactional data from other source systems, the InfoObject 0LOGSYS (Source System) is also
defined as part of the key and provides an indication of the origin of the data.
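The role of the key can be sketched in Python (illustrative only: apart from 0LOGSYS, which the text names, the key fields and sample values below are invented):

```python
# Illustrative sketch of key-based delta handling in a Standard DataStore Object.
# The key fields are hypothetical, except the source system (0LOGSYS).
active_table = {}

def upsert(record, key_fields=("LOGSYS", "SALESORDER", "ITEM", "CALDAY")):
    """New records are inserted; records with an existing key overwrite (delta)."""
    key = tuple(record[f] for f in key_fields)
    active_table[key] = record

upsert({"LOGSYS": "T41", "SALESORDER": "500001", "ITEM": "10",
        "CALDAY": "20230401", "QTY": 5})
upsert({"LOGSYS": "T41", "SALESORDER": "500001", "ITEM": "10",
        "CALDAY": "20230401", "QTY": 8})   # delta record overwrites the first
upsert({"LOGSYS": "T42", "SALESORDER": "500001", "ITEM": "10",
        "CALDAY": "20230401", "QTY": 3})   # same order, different source system

# The T41 order item was updated, not duplicated; the T42 record is kept
# separately because the source system is part of the key.
print(len(active_table))  # 2
```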
Transformation
Scenario Overview
The next step in our data flow is to define a Transformation between the field-based Staging
DataStore Object and the InfoObject-based Standard DataStore Object.
Figure37: Transformation between the field-based Staging DataStore Object and the InfoObject-based
Standard DataStore
Figure38: Transformation between the field-based Staging DataStore Object and the InfoObject-based
Standard DataStore
In our scenario, the fields from the field-based Staging DataStore Object are mapped to the
InfoObjects from the InfoObject-based Standard DataStore Object. Most source fields are mapped
1:1 to the target InfoObjects because for many source fields there is one target InfoObject and the
value should not be changed during loading. However, four of the InfoObjects use a different
transformation rule. We need these four rules because either the source does not provide the field, or
the field is provided but the value needs adjusting.
Here are our rules:
• 0CALYEAR: Transformation rule Formula (f) is used to derive the year from the source
field DELIVERYDATETIME.
• 0LOGSYS: Transformation rule Constant (c) is used to fill it with the value T41, because
this is not provided by the source system.
• 0CALDAY: Transformation rule Time is used to derive the date from the source field
CREATIONDATETIME.
• 0D_NW_ROLE: Transformation rule Constant (c) is used to fill it with the value 1, because
this is not provided by the source system.
Figure39: Data Transfer Process between Staging DataStore Object and the InfoObject-based Standard
DataStore
In our scenario, a Data Transfer Process of the type Full is defined. This type of DTP is not capable
of loading delta records. Additionally, once again, there is no Filter defined in our DTP.
Figure40
Figure41: Best practice in SAP BW/4HANA
Figure42: CompositeProvider with one source provider: the InfoObject-based Standard DataStore Object
Generating Analytics
BW Query
Scenario Overview
In the next step of the data flow we define a BW Query on top of the CompositeProvider.
Figure43: BW Query Scenario Overview
Figure44: BW Query: system settings
Presenting Analytics
Analysis for Microsoft Office Workbook
Scenario Overview
The final step is to present the results of the BW Query in a report that a business user would use.
BW Queries can be consumed by many different reporting tools, SAP and non-SAP. For our
scenario we have chosen the on-premise tool SAP Analysis for Microsoft Office.
Figure45: Analysis for Microsoft Office Workbook Scenario Overview
Analysis for Microsoft Office Workbook: system settings
We will see how the report presents the sales transactions and also integrates the master data for
products. In particular, look out for the following:
• Products are displayed in a hierarchy
• Description of each product is displayed
• Product ABC Category and Gross Weight of products are displayed
• Tax rate calculation is shown
• Number of business partners that products are sold to is displayed
Combine loaded and stored actual transactional sales data with plan sales data from an
SAP HANA table
The actual product sales quantities have been loaded from SAP S/4HANA to SAP BW/4HANA and
are stored in a Standard DataStore Object. The plan sales quantity is stored in an SAP HANA
database table. The plan data in the SAP HANA table looks like this:
Instead of setting up a data flow to physically load the plan data to SAP BW/4HANA, a different
approach is used:
Firstly, an Open ODS View is defined. This object is created in SAP BW/4HANA and reads the plan
quantity in a remote SAP HANA table.
In the next step, a CompositeProvider is defined to combine the actual sales data, which is stored in
the Standard DataStore Object, and the plan data which is read at run-time from the remote database
table using an Open ODS View. We will define a union of the providers so that for each product we
will have the actual and plan quantity.
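The union behavior can be sketched in Python (illustrative: product numbers, field names, and quantities are invented sample data):

```python
from collections import defaultdict

# Illustrative sketch of a CompositeProvider union: actual rows come from the
# Standard DataStore Object, plan rows from the Open ODS View.
actual = [{"PRODUCT": "P100", "SOLD_QTY": 40}, {"PRODUCT": "P200", "SOLD_QTY": 25}]
plan = [{"PRODUCT": "P100", "PLAN_QTY": 50}, {"PRODUCT": "P200", "PLAN_QTY": 20}]

def union(actual_rows, plan_rows):
    """Each source contributes its own rows; the key figure the other source
    does not provide stays 0."""
    rows = []
    for r in actual_rows:
        rows.append({"PRODUCT": r["PRODUCT"], "SOLD_QTY": r["SOLD_QTY"], "PLAN_QTY": 0})
    for r in plan_rows:
        rows.append({"PRODUCT": r["PRODUCT"], "SOLD_QTY": 0, "PLAN_QTY": r["PLAN_QTY"]})
    return rows

# The query then aggregates by product, so each product shows both quantities:
totals = defaultdict(lambda: {"SOLD_QTY": 0, "PLAN_QTY": 0})
for row in union(actual, plan):
    totals[row["PRODUCT"]]["SOLD_QTY"] += row["SOLD_QTY"]
    totals[row["PRODUCT"]]["PLAN_QTY"] += row["PLAN_QTY"]
print(dict(totals))
```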
We then create a BW Query on top of the CompositeProvider and finally display the results in
Analysis for Microsoft Office. The report should look like this:
Figure48: Results in Analysis for Microsoft Office
Open ODS View: terminology and use cases
An Open ODS View is an SAP BW/4HANA object that exposes remote source data to SAP
BW/4HANA without the need to create InfoObjects and without loading and storing data.
Open ODS Views enable you to define data models based on database tables and database views that
are located in remote databases. These data models allow flexible integration without the need to
create InfoObjects and without the need to load the data into SAP BW/4HANA. This flexible type of
data integration makes it possible to consume external data in SAP BW/4HANA without staging,
and afterward, using a CompositeProvider, to combine these external data with other SAP
BW/4HANA models.
Open ODS View: system settings
CompositeProvider
A CompositeProvider represents the virtual layer of SAP BW/4HANA.
CompositeProvider: system settings
Our CompositeProvider has two providers: the Standard DataStore Object that contains the actual
data and the Open ODS View that exposes the plan data.
Figure51: CompositeProvider: system settings
In our scenario we create a CompositeProvider to generate a union of the actual and plan sales data.
BW Query
A BW Query is defined on top of the CompositeProvider.
Figure52: BW Query Scenario Overview
BW Query: system settings
In our scenario the BW Query selects Sold Quantity from the Standard DataStore Object and
Planned Quantity from the Open ODS View. Two additional formulas are also defined in the BW
Query: Diff. Sold - Planned and % Variance Sold/Planned.
Combine loaded and stored actual transactional sales data with CO2 footprint data from
an SAP HANA calculation view
In another scenario we combine data relating to the CO2 footprint of each product with the actual
sales data to generate the total CO2 footprint for each sale. The CO2 footprint data is provided by an
SAP HANA calculation view and the actual sales data again comes from the Standard DataStore
Object.
The CO2 footprint data provided by the SAP HANA calculation view looks like this:
Figure54: CO2 footprint data provided by the SAP HANA calculation view
To set up this scenario, another CompositeProvider is created to combine the Standard DataStore
Object which stores the actual sales data and the SAP HANA calculation view. The results of a BW
Query, defined on top of this CompositeProvider, are then displayed in Analysis for Microsoft
Office. The final result should look like this:
Figure55: The final result
Figure56: The data flow showing the objects for this scenario
CompositeProvider
We define a CompositeProvider to combine the Standard DataStore Object with the SAP HANA
calculation view.
CompositeProvider: system settings
In our CompositeProvider we define a Left Outer Join between the Standard DataStore Object and
the SAP HANA calculation view.
!Note
We use a left outer join to ensure that if the CO2 data is missing for a product, we still display
the sales data. An inner join would not return the sales data if the CO2 data were missing.
BW Query
A BW Query is defined on the CompositeProvider.
BW Query: system settings
In the BW Query Gross Amount and Quantity are selected from the Standard DataStore Object and
the key figure CO2 Footprint (Each) is selected from the SAP HANA calculation view. Also a
formula is defined, CO2 Footprint (Total), which calculates the total CO2 Footprint when the BW
Query is executed.
Operating SAP BW/4HANA
Scheduling Routine Tasks and General Housekeeping
SAP BW/4HANA Cockpit
Once the data models and data flows have been created we need to build a schedule to load data. We
also need to perform ad-hoc housekeeping tasks. There are many tools provided to support data
loading and housekeeping. These tools are presented to the administrator using an easy-to-use,
customizable interface called SAP BW/4HANA Cockpit.
SAP BW/4HANA Cockpit Overview
The SAP BW/4HANA Cockpit is an intuitive, web-based SAP Fiori user interface that provides a
central entry point for the administration, monitoring, and modeling of an SAP BW/4HANA
system.
The SAP BW/4HANA Cockpit contains various SAP Fiori apps, organized in groups. Apps for
some of the most important operations related to SAP BW/4HANA are listed here:
Modeling
• Process Chain Editor:
The system displays a list of all process chains in the system. You can create, change, and delete
process chains here.
Learn more about Process Chains later in this lesson.
• Analysis Authorizations Editor:
You can create and change analysis authorizations.
Learn more about analysis authorizations later in this lesson.
Monitoring
• Process Chains - Display Dashboard:
The system displays various charts with an overview of important statistical analyses of process
chains and process chain runs.
• Process Chains - Display Latest Status:
The system displays a list of all process chains with status information.
• Process Chain Runs:
The system displays a list of all process chain runs in the system with status information. You can
repair a run here or call the log of a run in order to analyze errors.
Data Management
• DataStore Objects - Manage Requests
The system displays technical information about the content of the DataStore Object.
• InfoObjects - Manage Requests
The system displays technical information about the content of the InfoObject.
• DataStore Objects - Manage Data Tiering
The system displays an overview of all DataStore Objects and their temperatures. In addition, you
can change and manage the temperature for the partitions of the DataStore Objects.
Learn more about data tiering later in this lesson.
Process Chains
In SAP BW/4HANA there are many processes that need to be performed, such as loading master
data, loading transaction data, activating data, deleting old data, etc. Although these processes could
be executed on-demand by an administrator, it is more efficient to automate these processes.
In order to automate the processes, a Process Chain is created. A Process Chain defines a sequence
of processes and is scheduled to run in the background. Process Chains can even trigger other
Process Chains.
A Process Chain consists of three types of processes:
• start process
• application processes
• collector processes
A Process Chain always includes exactly one start process, at least one application process, and
optionally one or more collector processes.
Process Chains are created in the Process Chain Editor app of SAP BW/4HANA Cockpit.
You define the start condition of a Process Chain with the start process. All other processes in the
chain wait until their time has come to execute. Application processes are the actual work processes
that you want to run. These represent activities typically performed in the operation of SAP
BW/4HANA, for example load processes. Collector processes allow you to define which previous
processes must have completed before the next process can start.
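The sequencing logic can be sketched in Python (illustrative: the process names are invented, and a real Process Chain is modeled graphically in the Process Chain Editor app):

```python
# Illustrative sketch of process-chain sequencing: a start process triggers the
# chain, application processes do the work, and a collector waits for all of
# its predecessors before releasing the next process.
chain = {
    "START": [],                                  # start process: no predecessor
    "LOAD_MASTER_DATA": ["START"],                # application processes
    "LOAD_TRANSACTION_DATA": ["START"],
    # collector: both load processes must finish before activation may run
    "COLLECT": ["LOAD_MASTER_DATA", "LOAD_TRANSACTION_DATA"],
    "ACTIVATE_DATA": ["COLLECT"],
}

def run_chain(processes):
    """Run each process only after all of its predecessors have completed."""
    done, order = set(), []
    while len(done) < len(processes):
        for name, preds in processes.items():
            if name not in done and all(p in done for p in preds):
                order.append(name)   # a real system would execute the job here
                done.add(name)
    return order

order = run_chain(chain)
print(order)  # activation always comes after the collector, which comes last
              # among the loads
```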
The following slide includes examples of SAP delivered process types that you can use in your
Process Chain:
Data has a life-cycle. When data is new, it is usually very important to the business and used in
decision-making. But as data ages, its value usually decreases as it is used less and less. We should
move the less valuable data, from the highest performance / highest cost storage - which is memory -
to the lower performing / cheaper storage options - such as disk.
What is needed are tools to manage the data life-cycle so that we can move data from memory to
disk when the time is right. Such tools are included in SAP BW/4HANA.
SAP has defined a multi-temperature data storage concept. Data is classified as HOT, WARM, or
COLD according to the frequency of data access and the requirement for performance. The criteria
include the type of data involved, how useful it is for business purposes, the importance of the
related processes, frequency of access, and performance and security requirements. More factors that
influence the setup of the SAP multi-temperature strategy for storing data include:
• Budget restrictions that limit spend on hardware
• Technical restrictions regarding the capacity of the SAP HANA database
• Storage of historical data (due to data growth)
• Rules for storing data, such as the requirement to save all data for at least some years for
legal reasons
A large portion of the data managed in a large enterprise data warehouse such as SAP BW/4HANA
is processed frequently and needs fast access. This kind of data can be considered HOT.
In addition, there is often also a large volume of data that is not accessed frequently and perhaps
might not require fast access. For this reason, this type of data can be defined as WARM. Data that is
no longer needed but must still be kept, perhaps for legal reasons, can be classified as COLD.
SAP BW/4HANA data is classified by access frequency and use case into HOT, WARM, and
COLD.
In the context of the SAP BW/4HANA reference architecture for data warehousing (Layered
Scalable Architecture for BW/4HANA), the various areas in a data warehouse and the different
architectural layers of the EDW architecture can be assigned to these multi-temperature data
categories.
Let's learn more about these multi-temperature data categories:
HOT
• All layers related to mission-critical, day-to-day business analysis and planning
• Hot data is accessed frequently for reporting and planning purposes, or by regular SAP
BW/4HANA processes such as lookups during data loading. Examples of SAP BW/4HANA
objects that typically handle hot data, include the following:
◦ Data Mart DataStore Objects (advanced)
◦ Standard DataStore Objects (advanced)
• No functional restrictions: read and write actions are allowed on this data.
WARM
• All layers related to data acquisition
• Warm data is accessed less frequently. Performance is not the top priority for this data, and it
does not have to be permanently stored in main memory. Examples of SAP BW/4HANA
objects that handle warm data include the following:
◦ Objects in the Corporate Memory, typically Staging DataStore Objects (advanced).
◦ Objects in the Open Operational DataStore layer, typically Staging DataStore Objects
(advanced).
• No functional restrictions: read and write actions are allowed on this data.
COLD
• All layers related to the retention of historical data
• Cold data is accessed very rarely or not at all.
• Functional restrictions:
◦ This data is mainly read-only.
◦ It can only be made available for reporting by enabling a setting in the query.
◦ By default, this data is not accessed by queries.
◦ Writing to the data is possible only in exceptional cases, such as corrections.
◦ If reading this type of data is required, expectations regarding performance must be set
accordingly.
SAP BW/4HANA Data Tiering Optimization (DTO)
Tools to Manage Multi-Temperature Data Management in SAP BW/4HANA
For implementing multi-temperature data management in SAP BW/4HANA, SAP provides a tool
called Data Tiering Optimization (DTO).
Figure66: Data Tiering Optimization (DTO)
The physical storage locations for the various data temperatures are:
• HOT: Data is managed completely in-memory of SAP HANA.
• WARM: Data is managed on SAP HANA disk.
• COLD: Data is managed outside SAP HANA in an SAP IQ database.
In SAP BW/4HANA, the DataStore Object (advanced) is the object used to store transactional data.
Transactional data is where significant data growth typically occurs. For this reason, the DataStore
Object (advanced) is where you set up the Data Tiering Properties:
For each DataStore Object (advanced), you first choose which temperatures are relevant and
whether you would like to manage the data temperature by partition, or whether the temperature
applies to all data in the DataStore Object.
The next step is to define the planned temperature of each DataStore Object (advanced) or each
partition of a DataStore Object (advanced). The SAP BW/4HANA Cockpit provides a suitable app
for this purpose: DataStore Objects - Manage Data Tiering. The app provides all required functions to
define temperature for the DataStore Objects (advanced) partitions or, depending on the general
setup, for the entire DataStore Object (advanced), and to execute the data movement across
temperatures.
In addition to the manual definition of the temperature of a partition or object, you can also define
temperature change rules for selected objects. Rules can be defined for DataStore Objects
(advanced) with tiering at partition level and with an SAP time characteristic or a field of data type
DATS as the partition field. In the rule, starting from the current date, you can specify a relative time
based condition for the various possible temperatures. For example, you could specify a warm
temperature for data that is more than one year old, and cold temperature for data that is more than
five years old. Using Process Chains to execute the jobs, you can completely automate the movement
of data across temperatures.
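Such a relative, time-based rule can be sketched in Python (illustrative: the partition dates are sample values, and a real rule operates on a time characteristic or DATS partition field inside SAP BW/4HANA):

```python
from datetime import date

# Illustrative sketch of a relative temperature change rule: partitions older
# than one year become WARM, and older than five years become COLD.
def classify(partition_date, today):
    """Return the planned temperature for a partition based on its age."""
    age_days = (today - partition_date).days
    if age_days > 5 * 365:
        return "COLD"
    if age_days > 365:
        return "WARM"
    return "HOT"

today = date(2024, 6, 1)
print(classify(date(2024, 3, 1), today))   # HOT  (a few months old)
print(classify(date(2022, 3, 1), today))   # WARM (older than one year)
print(classify(date(2017, 3, 1), today))   # COLD (older than five years)
```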
Figure69: Define if the COLD data should be considered in reporting.
The last step is to define if the COLD data should be considered in reporting. You do this by
enabling the setting in the General properties of the BW Query definition that reads data from the
DataStore Object (advanced).
We need to ensure users do not perform activities that they are not authorized to perform, for
example, deleting data, executing data loads, or creating or executing queries.
Clearly, we need to protect these activities so we assign the authorizations to the relevant users in the
team.
An Authorization Object consists of fields in which the authorized settings are specified. SAP provides
many Authorization Objects to secure common activities across SAP BW/4HANA.
Here is an example of the Authorization Object S_RS_COMP, which is used to provide authorizations
for working with BW Queries:
Notice how we define the allowed actions, such as create and execute. For a business user to execute a
query, they simply need the Execute (16) authorization; they do not need Display (03) unless you
would like them to be able to open the query definition to view the settings.
Notice that we then specify the type of object the action applies to; for example, REP is the code for a
BW Query. There is a code for every type of object in SAP BW/4HANA. Finally, you see the value
of the object: in this case the * means any query. But notice that the InfoProvider P_V_SO_1 is
specified, which means any query can be displayed or executed as long as it was built on the specified
InfoProvider.
So that is the authorization of a task taken care of. We now come to the Analysis Authorization
setup. This is all about data access.
To set up an Analysis Authorization we need to perform three steps:
Not all data in SAP BW/4HANA is relevant for authorization, for example, the color or weight of a
product. In such cases you would not enable the setting for Authorization Relevance. Once you enable
this setting, each user must have an Analysis Authorization assigned to their user profile to grant
access to the data of the object.
So, with the combination of the two types of authorization we have seen how our user could now
execute any query (as long as it is based on the CompositeProvider P_V_SO_1) and display the data
only for company codes 1000, 2000, 2200, and 3000.
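The combined effect of the two authorization types can be sketched in Python (illustrative: the values named in the text, such as activity 16, object type REP, InfoProvider P_V_SO_1, and the company codes, are used, while everything else is invented):

```python
# Illustrative sketch combining the two authorization types: S_RS_COMP controls
# the activity (Execute = 16) on queries of a given InfoProvider, and an
# Analysis Authorization restricts the data (here: company codes).
user_auth = {
    "S_RS_COMP": {"activity": {"16"}, "object_type": {"REP"},
                  "infoprovider": {"P_V_SO_1"}, "object_name": {"*"}},
    "analysis_auth": {"COMP_CODE": {"1000", "2000", "2200", "3000"}},
}

def may_execute_query(auth, infoprovider):
    """Task authorization: may this user execute queries on this InfoProvider?"""
    comp = auth["S_RS_COMP"]
    return ("16" in comp["activity"] and "REP" in comp["object_type"]
            and infoprovider in comp["infoprovider"])

def authorized_rows(auth, rows):
    """Data authorization: keep only rows with an authorized company code."""
    allowed = auth["analysis_auth"]["COMP_CODE"]
    return [r for r in rows if r["COMP_CODE"] in allowed]

data = [{"COMP_CODE": "1000", "AMOUNT": 10}, {"COMP_CODE": "4000", "AMOUNT": 99}]
if may_execute_query(user_auth, "P_V_SO_1"):
    print(authorized_rows(user_auth, data))  # only the company code 1000 row
```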
GDPR can have a crucial impact on SAP systems, as those systems often manage processes with
reference to personal data and GDPR regulates how long this personal data must be maintained,
when it needs to be deleted and in what manner it can be kept in a depersonalized way.
In technical terms this means that whenever personal data is deleted in a source system of SAP
BW/4HANA, the replicated data in SAP BW/4HANA must be handled accordingly in order to
comply with legal requirements; as a consequence, the personal data in SAP BW/4HANA needs to
be deleted or depersonalized (anonymized) as well.
GDPR Adoption in SAP ERP Systems based on SAP ILM
SAP ERP systems are typical examples of source systems that replicate data to SAP BW/4HANA.
The recommended approach for GDPR adoption in these systems is based on SAP Information
Lifecycle Management (ILM). This component is a powerful and integrated solution able to
orchestrate archiving and deletion processes for sensitive data. It provides a broad range of advanced
capabilities, including blocking and deletion, residence and retention management, consolidation of
legacy data, and more, some of which are relevant to regulatory demands.
During the typical SAP data life-cycle from data creation to data destruction, the data is first active
in the database. When it becomes inactive, it qualifies for archiving once it surpasses the residence
time, and the archived data qualifies for destruction at the end of a retention period. GDPR adds the
critical requirement to delete data that is no longer required for the given business purpose. This
mandates an additional control in the system that allows defining an "End-of-Purpose" status. SAP
ILM addresses this requirement: it blocks and deletes data that is no longer required in the system
and has reached this "End-of-Purpose" status.
Once SAP ILM has been successfully configured for a given data object, a notification is triggered
for each SAP BW/4HANA characteristic value that has been deleted. These notifications are
collected and persisted centrally and can be extracted based on the technical content of the SAP
BW/4HANA Data Protection Workbench.
SAP BW/4HANA Data Protection Workbench
Data that is deleted in the SAP ERP systems at the end of the usage purpose, when the retention
period expires, must also be deleted or anonymized in the connected SAP BW/4HANA
environment. For this reason, SAP BW/4HANA has been integrated with the SAP ILM of the
connected SAP ERP systems. This means that whenever sensitive data is deleted in SAP sources,
SAP BW/4HANA can trigger corresponding follow-up activities to mirror this adjustment.
In SAP BW/4HANA, there is a component called the Data Protection Workbench that provides such
capabilities. This workbench is an integrated solution for the selective deletion of transactional data,
the deletion of master data, and the anonymization of personal data that was originally replicated
from supported connected SAP source systems.