Augmented Analytics
PUBLIC
Warning
This document has been generated from the SAP Help Portal and is an incomplete version of the official SAP product
documentation. The information included in custom documentation may not reflect the arrangement of topics in the SAP Help
Portal, and may be missing important aspects and/or correlations to other topics. For this reason, it is not for productive use.
This is custom documentation. For more information, please visit the SAP Help Portal.
6/30/2021
Augmented Analytics
Augmented Analytics comprises a set of SAP Analytics Cloud features that enhance the analytics process using machine
learning.
The Augmented Analytics features include Smart Insights, Search to Insight, Smart Discovery, and Smart Predict.
Smart Insights
Smart Insights allows you to quickly develop a clear understanding of complex aspects of your business data, by letting you see
more information about a particular data point in your visualization or table, as well as about a variance on your acquired data.
Search to Insight
Search to Insight is a natural language query interface used to query data.
Smart Discovery
By running machine learning algorithms, Smart Discovery uncovers new or unknown relationships between columns within your
dataset to help you understand the main business drivers behind your core KPIs.
Smart Predict
Smart Predict helps you answer business questions that need predictions or predictive forecasts to plan for the future business
evolution: it automatically learns from your historical data, and finds the best relationships or patterns of behavior to easily
generate predictions for future events, values, and trends. Additionally, you get easy-to-understand KPIs and visualizations that
help you evaluate the accuracy of the predictions. You can then leverage those predictions and predictive forecasts with confidence to
augment your planning model and stories.
Related Information
Smart Insights
Search to Insight
Smart Discovery
Smart Predict – Using Predictive Scenarios
A Predictive Scenario helps address a business question requiring predictions. It is a workspace where you create and compare
predictive models to find the one that brings the best predictions to address the business question.
The following types of predictive scenarios are available. You choose the one that best fits your business question.
Predictive scenario: Answers this type of business question...

Classification: What is the likelihood that a future event occurs? This event is observed at an individual level (customer, asset, product, ...) and at a certain horizon (in the year, before the end of the week, in the month after a customer contact, ...).
Example: Who is likely to buy or not buy your new product? Which client is or isn't a candidate for churn?

Regression: What could be the prediction of a business value, taking into account the context of its occurrence?
Example: What will be the revenue generated by a product line, based on planned transport charges and tax duties?

Time Series: What are the future values of a business value over time, at a certain granularity/place?
Example: How much ice cream will I sell daily over the next 12 months? I have my historical daily sales information, but I'd like to be able to include other factors such as vacation months, and the seasons.
You can create one or several predictive models within a predictive scenario. Each predictive model produces intuitive
visualizations of the results, making it easy to interpret its findings. Once you have compared the key quality indicators for
different models, you choose the one that provides the best answers to your business question, so you can apply this predictive
model to new data sources for predictions.
Restriction
To verify that Smart Predict is available in your SAP Analytics Cloud system, see SAP Note 2661746.
Restrictions

Predictive scenario migration (creation dates not kept): When predictive scenarios are moved from the Browse Predictive Scenario page to the Files area, the individual creation dates are not kept. The predictive scenario creation date that is displayed after the move to Files is the timestamp for the move operation.

Availability of Smart Predict: Smart Predict is available in most regions and for most tenant types. For more details on exceptions and general availability, refer to SAP Note 2661746.
Data Sources (acquired datasets and planning models): You can create predictive scenarios on datasets that use the following data sources:

Local file (.txt, .csv, .xlsx)
Note
Files with extension .xls are not supported.
SAP HANA
SAP S/4HANA
SAP SuccessFactors
SAP Qualtrics
OData Services
SQL Databases
SAP BW
Note
We recommend that you upgrade your SAP Analytics Cloud version to 1.0.43 to have the drop parent hierarchy nodes functionality. Although you can import a dataset with a lower C4AAgent version, hierarchy selection will be disabled and a corresponding message will be shown in the query builder.
Google Drive
Supported date formats:

YYYY-MM-DD
YYYY/MM/DD
YYYY/MM-DD
YYYY-MM/DD
YYYYMMDD
YYYY-MM-DD hh:mm:ss

Where YYYY stands for the year, MM for the month, and DD for the day of the
month, hh stands for hours from 0 to 23, mm stands for minutes from 0 to 59,
and ss stands for seconds from 0 to 59.
Example
January 25, 2018 will take one of the following supported formats:
2018-01-25
2018/01/25
2018/01-25
2018-01/25
20180125
The column name restrictions are the same as the SAP HANA ones. If some characters
are not supported, the column name is automatically converted to a supported name.
The original name is kept as a column description in the metadata.
Note
Time variables are currently not supported by Smart Predict. If your dataset (acquired or
live dataset) contains a column that contains only time variables, this column won't be
included in the training process.
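As a quick check before loading a dataset, the supported formats above can be validated with standard date parsing. This is an illustrative sketch only; the mapping of each documented format to a `strptime` pattern is our own assumption, not part of the product.

```python
from datetime import datetime

# Supported Smart Predict date formats (per the list above); the strptime
# patterns below are our own mapping of those formats and are an assumption.
SUPPORTED_FORMATS = [
    "%Y-%m-%d",
    "%Y/%m/%d",
    "%Y/%m-%d",
    "%Y-%m/%d",
    "%Y%m%d",
    "%Y-%m-%d %H:%M:%S",
]

def parse_supported_date(value: str) -> datetime:
    """Return the parsed date if `value` matches one of the supported formats."""
    for fmt in SUPPORTED_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unsupported date format: {value!r}")

# All spellings of January 25, 2018 from the example above parse to the same date:
dates = ["2018-01-25", "2018/01/25", "2018/01-25", "2018-01/25", "20180125"]
assert all(parse_supported_date(d).date().isoformat() == "2018-01-25" for d in dates)
```

Such a check can help catch unsupported formats (for example `.xls`-style serial dates) before the training step rejects them.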
Dataset Maximum Sizes and Limits: See System Sizing, Tuning, and Limits.
YYYY-MM-DD
YYYY/MM/DD
YYYY/MM-DD
YYYY-MM/DD
YYYYMMDD
YYYY-MM-DD hh:mm:ss

Note
While you can use this format in both live and acquired datasets, the seconds (ss) won't be taken into account during the training of your predictive models.

Where YYYY stands for years, MM stands for months, DD stands for the day of the month, hh stands for hours from 0 to 23, mm stands for minutes from 0 to 59, and ss stands for seconds from 0 to 59.

Note
Regardless of the date granularity you choose in your time series predictive scenarios with a dataset as your data source, every date format has to include years, months, and days. This means that even if you just want a quarterly or monthly forecast, the date format in your dataset still needs to include days. If your data source is a planning model, you can use the YYYY-MM date format.

Example
Let's say you use the YYYY-MM-DD date format. You can create time series predictive scenarios where the date granularity can be weekly, taking for instance the 1st day of the week as the characters DD (moving week).

Smart Predict expects a date per period to learn on: if you want to forecast your monthly sales, you provide a date per month representing the value of the corresponding month.
In a time series predictive scenario, you can define entities, each generating its specific predictive model simultaneously.
For example, if you define a column with countries as an entity, Smart Predict will generate as many predictive models as there are countries in your data source.
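Conceptually, defining an entity splits the data source into one sub-series per distinct entity value, and each sub-series gets its own predictive model. A minimal sketch, with hypothetical column values:

```python
from collections import defaultdict

# Hypothetical rows of (entity, date, value); the entity column holds countries,
# mirroring the example above. Column names and values are illustrative only.
rows = [
    ("France", "2021-01", 120.0),
    ("France", "2021-02", 135.0),
    ("Germany", "2021-01", 80.0),
    ("Germany", "2021-02", 95.0),
]

# Split the data source by entity: one sub-series (and hence one predictive
# model in Smart Predict) per distinct entity value.
series_by_entity = defaultdict(list)
for entity, date, value in rows:
    series_by_entity[entity].append((date, value))

# Two distinct countries, so two predictive models would be trained.
assert sorted(series_by_entity) == ["France", "Germany"]
```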
The following limits are recommended when using a time series forecast model:
If your predictive model is configured for a number of forecasts and/or entities beyond the recommended maximum limits, it is likely to create performance issues that can impact other users on the same SAP Analytics Cloud tenant. In the user interface, the maximum number of forecasts that can be set is restricted to 500.

Tip
When deciding how much historical data to use, use the following as a recommendation:
Time Series Forecasts: Smart Predict time series forecasts don't persist the settings for Number Formatting selected by the user in the User Preferences section of SAP Analytics Cloud Profile Settings.
Classification Predictive Scenario: In a classification predictive scenario, the target can only be a binary column that takes only two values, for example, true or false, yes or no, male or female, 0 or 1. For this type of scenario, Smart Predict considers that the positive target value, or positive target category of this column, is the least frequently occurring value in the training dataset. However, to make sure your trained predictive model is reliable, you need to make sure that you have a minimum representation in your training dataset. For example, if your dataset contains very few failures, your predictive model won't be able to predict the under-represented category of failures.
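The "least frequently occurring value is the positive category" rule can be sketched in a few lines; the target column below is hypothetical data for illustration.

```python
from collections import Counter

# Hypothetical binary target column. Per the rule above, Smart Predict treats
# the least frequent value as the positive target category.
target = ["no", "no", "no", "no", "yes", "no", "yes", "no"]

counts = Counter(target)
positive_category = min(counts, key=counts.get)  # least frequent value wins
assert positive_category == "yes"
```

If "yes" occurs in only 2 of 8 rows here, it is still represented; a dataset where the rare category is almost absent would, as the documentation warns, leave the model unable to predict it.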
Training a Predictive Model: Smart Predict currently excludes the following columns when training your predictive model:
Note
Date & Time is supported by Smart Predict.
Restrictions on: Information on Restrictions

SAP HANA SQL Views using row-level security: You should not allow the creation of live datasets on top of SAP HANA SQL Views using row-level security (see Structure of SQL-Based Analytic Privileges). In Smart Predict you access the dataset using the SAP HANA technical user configured at the data repository level, and not using the SAP Analytics Cloud user profile. This could result in a security issue as all SAP Analytics Cloud users would get access to the data accessible by the SAP HANA technical user. For more information, see Configuring a SAP HANA technical User in the On-Premise SAP HANA System.

Number of columns for live datasets: There is a limit of 1000 columns when using live datasets with predictive models.

Live Data Sources: You can create predictive scenarios on live datasets in the following on-premise SAP HANA systems:
SAP HANA 1.0 SPS 12 rev 122.04 and upwards
Note: Cloud deployments of SAP HANA systems are currently not supported.

Privileges for a SAP HANA technical user: A maximum of 4000 tables/SQL views are displayed for creating a live dataset through browsing. It is recommended that the SELECT privileges for a SAP HANA technical user are limited to only the tables/SQL views required for the predictive models. For more information, see Configuring a SAP HANA technical User in the On-Premise SAP HANA System.

BI story: You can't directly create a BI Story on top of a live dataset, whether or not this live dataset was created with Smart Predict. For more information, refer to Creating Calculation Views to Consume Live Output Datasets.

Train and Apply steps with live datasets: Train or Apply operations using live datasets that last longer than 8 hours don't complete.

Date Format: For live datasets, the following default SAP HANA date formats are supported:
DATE
SECONDDATE
TIMESTAMP
Type of predictive models: For Smart Predict - Predictive Planning, i.e. the integration of SAP Analytics Cloud Smart Predict with SAP Analytics Cloud planning models, only time series forecasting is supported.

Note
Smart Predict doesn't support predictive forecasting on calculated measures, including currency conversion measures, when your planning model is a new model type.
Tip
It's possible to add custom properties to group members in custom ways: you can use this mechanism to keep the number of entities under 1000 and perform an intermediate forecasting approach where the predictive forecast is run on intermediate nodes. For nodes above, predictive forecasts will be spread, and for nodes below, they will be summed.
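The intermediate-node approach above can be sketched numerically. The documentation says forecasts are spread below and summed above; the proportional-to-history spreading rule and all figures below are illustrative assumptions, not the product's actual policy.

```python
# Hypothetical hierarchy: one intermediate node with two leaf members, and
# historical leaf totals used to spread the node-level forecast downwards.
leaf_history = {"Store A": 300.0, "Store B": 100.0}
node_forecast = 500.0  # forecast produced at the intermediate node

# Spread the node forecast to the leaves, here proportionally to history
# (an assumption; the real spreading policy comes from the planning model).
total = sum(leaf_history.values())
spread = {leaf: node_forecast * hist / total for leaf, hist in leaf_history.items()}
assert spread == {"Store A": 375.0, "Store B": 125.0}

# Nodes above the forecast level are simply obtained by summing:
assert sum(spread.values()) == node_forecast
```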
Predictive Goal
These are the Smart Predict - Predictive Planning settings available:

Note
Only the Date dimension is used as an influencer. All other dimensions, attributes, or measures are ignored when you select Train & Forecast.
Outputs
Output versions:

Time aggregation / Time granularity: The time series predictive model is trained and applied based on the level of time granularity available in the planning model data source.

Example
The granularity of the date dimension of your planning model is defined as monthly.

Example
You have a planning model with daily granularity in the date dimension, from January 1st 2016 to December 31st 2021.

Note
To learn more about the date dimension, you can refer to the chapter called About Dimensions and Measures.
Publishing to PAi: It's not currently possible to publish a predictive model created from a planning model data source to a PAi application.

Spreading: The spreading policy is the default policy available in the planning model data source. It depends on how the dimension is used in the model:
Related Information
Setting up Live SAP HANA Data Access for Smart Predict
Understanding the Basic Concepts Used in Smart Predict
Planning Model as Data Source
The following tables detail what you can do in Predictive Scenarios, and which roles and permissions you need to perform the
action.
Connections
Users
Predictive Models
Predictive Scenarios
Data Sources
Data Repository
Note
If you are an admin, you can create custom roles. For more information, see Creating Custom Roles and Standard
Application Roles.
Connections
Delete a connection x
Users
Delete a user x
Predictive Models
Predictive Scenarios
Data Sources
Dataset
Note
These roles and permissions apply to both live and acquired datasets.
Delete a dataset x x
Planning Models
Note
You must have the relevant license to create planning models. For more information, see Features by License Type for
Planning Models.
Note
You need read access to the planning model. You can
create private versions once you have read access.
Note
This applies to the private version of the planning model
only. You must have write access to the private version
related to the planning model. You can always write to
private versions you own, but you need to be granted
write access for the versions you don't own (shared
versions).
Note
To publish a private version of your planning model with Smart Predict forecasts, you need Maintain permissions at a global level, and at the level of the specific model. Maintain permissions aren't enabled by default on the Predictive Content Creator and Predictive Admin roles. Planning Professional Admin, Planning Professional Modeler, and Planning Standard Reporter are the roles in the application that include Maintain permissions.
Data Repositories
* This means the Predictive Content Creator can effectively view the existing data repositories when creating a live dataset but
this does not mean the Predictive Content Creator can view the data repositories in the Administration User Interface.
** The Predictive Content Creator cannot access the Administration User Interface and therefore can't effectively see the data
repositories.
For more information, see Adding and Configuring the Data Repository in SAP Analytics Cloud.
To consume predictions saved in a planning model version, you can create story tables or charts on your planning model,
selecting the private version used for predictive model application. For more information on planning in tables, and creating a
chart, please refer to the following chapters:
Planning in Tables
Creating a Chart
For required permissions on datasets, models, and stories please refer to the following chapters:
Permissions
Note
The BI Admin and Planning Admin roles include all Predictive Admin permissions by default.
Related Information
Standard Application Roles
Security
Model and Version Security
Permissions
Sharing Private Versions
Features by License Type for Planning Models
Predictive Scenario: A workspace where you create and compare predictive models to find the one that provides the best insights to help solve a business question requiring predictions. Currently, you can choose between 3 different types: classification, regression, and time series forecasting.

Predictive Model: The result found by Smart Predict after exploring relationships in your data using SAP automated machine learning. Each predictive model produces visualizations and performance indicators based on certain requirements that you have set, so you can understand and evaluate the accuracy of the predictive results. You'll probably want to experiment a bit with different predictive models, varying the input data or the training settings, until you are satisfied with the accuracy and relevance of the results.

Data source: The form and origin of the data that you'll use to create a predictive model. This could be a dataset in a database or a planning model in an SAP Analytics Cloud story.

Target or Signal: The variable that you want to explain or predict values for. Depending on your data source, this could be the column or dimension that you're interested in knowing about. Target is used for classification and regression models, Signal for time series forecasting models.
Note
In the Smart Predict documentation the term variable is used to mean either column or dimension. However, in the user interface and messages, you'll see the specific term for the data source being used: columns in datasets and dimensions in planning model versions.

Entity: Only used in time series forecasting predictive scenarios. You can split up a population into distinct sections called entities. A predictive model is created for each entity, allowing you to get more accurate forecasts aligned with the entity's particular characteristics.
Influencers: The variables that have an influence on the target or signal. By default the predictive model considers all the columns or dimensions as influencers and, during training, will only retain the significant ones. You can choose to exclude influencers that you consider not worth including in the training. This is useful when dealing with large data sources.

Training: The process that uses SAP automated machine learning to explore relationships in your data and find the best combinations. The result is a formula, your predictive model, that can be applied to new data to obtain predictions.
Related Information
What Type of Predictive Scenario Do You Need?
Starting with a Predictive Model
Variables in Smart Predict
Training a Predictive Model
Partition Strategy
A Predictive Scenario is a preconfigured workspace that you use to create predictive models and reports to address a business question requiring the prediction of future events or trends. You choose the one that is relevant to the type of predictive insights you are looking for.

Restriction
Smart Predict is not available in all regions or for all tenant types.

You can find examples of how to create and use the predictive scenarios available in Smart Predict in our playlist on YouTube, or by looking at the individual videos:
Overview (Basic Concept in Smart Predict: The Predictive Scenarios available): Explore the different predictive scenarios currently available in Smart Predict: Classification, Regression, and Time Series Predictive Scenarios.

Classification (Smart Predict: Finding the best classification predictive model): Using a simple scenario, you will create a Classification Predictive Scenario. You will create a first predictive model and check the accuracy of your predictions. Then, you will duplicate your predictive model and train it using another dataset. You will finally compare the 2 predictive models to find the best one.

Classification (Smart Predict: Understanding the confusion matrix): In the video Smart Predict: Finding the best classification predictive model, you built a predictive model to answer your business issue. Now, you will interpret the results of the confusion matrix for this business case and understand how to set a threshold (or cut-off point) to best categorize your target group.

Classification (Smart Predict: Using the profit simulation): In the previous videos, you created a predictive model to answer your business issue, interpreted the confusion matrix, and set a threshold. Now, you will use the profit simulation of Smart Predict to calculate the return on investment using this predictive model, and set the ideal threshold that allows you to optimize your profit.

Classification (Smart Predict: Applying a classification predictive model): In a previous video, we created a classification predictive model to identify which customers to contact with a marketing campaign. Now, we'll use this predictive model on actual customer data to create the output dataset containing the answer to this question.

Classification (Smart Predict: Publishing a predictive model using a PAi Connection): Using a simple scenario (predict if passengers are likely to cancel their flight booking), you will go through the different steps to publish a predictive model to an S/4HANA system, using a Predictive Analytics integrator connection.

Regression (Smart Predict: Debriefing a Regression Predictive Model): Using a simple scenario, we will create a Regression Predictive Scenario and go through the generated reports to evaluate the accuracy of the predictive model.

Time Series Forecasting (Smart Predict: Creating a Time Series Predictive Scenario): Using a simple scenario, you will create a Time Series Predictive Scenario. You will create a first predictive model and go through the different settings to be filled in to create the scenario.

Time Series Forecasting (Smart Predict: Debriefing a Time Series Predictive Model): In the video Smart Predict: Creating a Time Series Predictive Scenario, you created a Time Series Predictive Model in your Predictive Scenario. Now you will go through the generated reports and evaluate the accuracy of your predictive model.

Time Series (Smart Predict: Creating a segmented time series predictive model): Using a simple scenario and the segmented variable of a Time Series predictive model, you will create several predictive models at once. You will get accurate predictions per predictive model, considering the characteristics of each individual segment.
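The confusion-matrix and threshold mechanics referenced in the videos above can be illustrated with a small sketch. The probabilities and outcomes below are invented for illustration; the threshold turns predicted probabilities into a yes/no classification, from which the four confusion-matrix cells follow.

```python
# Hypothetical predicted probabilities and actual outcomes (1 = positive).
scores = [0.9, 0.8, 0.65, 0.4, 0.3, 0.2]
actual = [1,   1,   0,    1,   0,   0]

def confusion_matrix(scores, actual, threshold):
    """Count true/false positives and negatives at a given cut-off point."""
    tp = fp = tn = fn = 0
    for s, a in zip(scores, actual):
        predicted = 1 if s >= threshold else 0
        if predicted and a:
            tp += 1
        elif predicted and not a:
            fp += 1
        elif not predicted and a:
            fn += 1
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}

m = confusion_matrix(scores, actual, threshold=0.5)
assert m == {"tp": 2, "fp": 1, "tn": 2, "fn": 1}
```

Raising or lowering the threshold trades false positives against false negatives, which is exactly the trade-off the profit simulation optimizes over.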
The predictive model is built using SAP automated machine learning algorithms that explore relationships in the data and find the best combinations. This is called training the predictive model, and the result is the predictive model that can be applied to new data to obtain predictions.
Each predictive model produces visualizations and KPIs that help you understand and evaluate the accuracy of a predictive model.
Depending on your business question, you will probably want to experiment a bit with different predictive models, varying the training data and settings to deliver a more accurate or relevant predictive output.
When you are confident that you have a trained predictive model that generates results that satisfy your business question, you can apply that predictive model to new data.
In the Smart Predict documentation the term variable is used to mean either column or dimension. However, in the user interface and messages, you'll see the specific term for the data source being used: columns in datasets and dimensions in planning model versions. Rows contain the observations for the variable. For example, in a database containing information about your customers, the <name> and <address> of those customers are variables.
In a predictive scenario, variables have different roles that you assign when defining the predictive goal and the training requirements for a predictive model. For example, a variable can be a target or signal, another can be an identifier for an entity, and others can be excluded from consideration by the predictive model, perhaps because you consider them to have no influence on the target.
Related Information
Variable Statistical Types
Variable Data Types
Understanding Predictive Goal and Training Roles for Variables
Editing Column Details
Define Settings and Train a Classification or Regression Predictive Model
Define Settings and Train a Time Series Predictive Model
Continuous: Values are numerical, continuous, and sortable. They can be used to calculate measures, for example, mean or variance. During modeling, a continuous variable may be grouped into significant discrete bins.
Example: The variable <salary> is both a numerical variable and a continuous variable. It may, for example, take on the following values: <$1,050>, <$1,700>, or <$1,750>. The mean of these values may be calculated.

Ordinal: Values are discrete. They can be regrouped into categories and are sortable. Ordinal variables may be:
Numerical: the values are numbers and they are ordered according to the natural number system (0, 1, 2, and so on).
Textual: the values are character strings. They are ordered according to alphabetic conventions.
Example: The variable <school grade> is an ordinal variable. Its values actually belong to definite categories and can be sorted. This variable can be:
numerical, if its values range between <0> and <20>,
textual, if its values are A, B, C, D, E and F.

Nominal: Values are discrete. They can be regrouped into categories.
Caution
Binary variables (variables with 2 distinct values only) are considered as nominal variables. They are the ones that can be used as target for classification predictive models.
Example: The variable <zip code> is a nominal variable. The set of values that this variable may assume are clearly distinct, non-ranked categories, although they happen to be represented by numbers. For example: <10111>, <20500> or <90210>. The variable <eye color> is a nominal variable. The set of values that this variable may assume are clearly distinct, non-ordered categories, and are represented by character strings. For example: <blue>, <brown>, <black>.

Textual: A type of nominal variable containing phrases, sentences, or complete texts.
Note
Textual variables are used for text analyses. These variables are currently not supported by Smart Predict, and are therefore excluded from the training of a predictive model.
Example: The variable <Bluetooth Headphones Customer Feedback> is a textual variable. The values for this variable can be <Durable cord, connect easy to phone and plug.>, <Great fit and great sound!> or <Great length and color. Super fast charging.>.
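A rough heuristic for distinguishing these statistical types can be sketched as follows. The thresholds and rules are illustrative assumptions, not Smart Predict's actual type-detection logic (ordinal detection, for instance, would need metadata this sketch doesn't have).

```python
# Illustrative heuristic mirroring the statistical types above; the rules
# here are our own assumptions, not the product's type-detection algorithm.
def statistical_type(values):
    distinct = set(values)
    if len(distinct) == 2:
        return "nominal"  # binary columns count as nominal (see Caution above)
    if all(isinstance(v, (int, float)) for v in values):
        return "continuous"  # numerical, sortable, usable for mean/variance
    return "nominal"  # distinct, non-ordered categories

assert statistical_type([1050.0, 1700.0, 1750.0]) == "continuous"   # salaries
assert statistical_type(["blue", "brown", "black"]) == "nominal"    # eye color
assert statistical_type([0, 1, 0, 1]) == "nominal"                  # binary target
```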
Note
During training, the values of the categorical variables are regrouped into homogeneous categories. These categories are then ordered as a function of their relative contribution with respect to the values of the target variable. For more information, see Category Influence.
String
Integer
Number
Boolean
Date
Time
Angle
Note
Variables with Time (not timestamp) and Textual storage formats aren't currently supported by Smart Predict. These
variables won't be considered when training the predictive model.
A variable corresponds to a column in a dataset or a dimension in a planning model. The observations relating to each variable correspond to the rows. Variables that have been specified as a target/signal, or an entity identifier, are not considered as influencers. Unless you exclude certain influencers, all other variables are treated as influencers. The training retains the most significant ones for the predictive model reports for debriefing.
Date: The variable used for the date values.
Note
This variable is mandatory for a time series predictive scenario.
The date formats that should be used in your dataset are the following:
YYYY-MM-DD
YYYY/MM/DD
YYYY/MM-DD
YYYY-MM/DD
YYYYMMDD
YYYY-MM-DD hh:mm:ss
Note
Let's say you use the YYYY-MM-DD date format. You can create Time Series Predictive Scenarios where the date granularity can be:
Example
If a predictive model has the target variable <has bought the product Yes/No>, you should exclude the influencer <Billing amount> if it contains the cost for the product.

Tip
If there is a variable that is influencing the prediction at a very high level, then there is a chance that it is a leak variable.

Note
Any updates you make in the dataset are permanent: if you (or another user) reuse this dataset in another predictive model or another predictive scenario, the changes will remain.
Field Values

Storage: Data type for the column.
String
Integer
Number
Boolean
Date
Time
Angle
Note that telephone or account numbers should not be considered numbers.

Type (statistical data type):
Continuous: columns whose values are numerical, continuous, and sortable. They can be used to calculate aggregations (such as min, median or max).
Nominal: columns that label data. They have no quantitative value (such as 1 and 2 to indicate male or female).
Tip
While creating a predictive model, if the column you want to select as target or entity (time series forecasting) isn't available, it is likely that this column wasn't assigned the right data type. You can correct this here in Edit Column Details.

Missing: A string specified here replaces a missing value in the column. For example, if you enter #Empty, then any rows with no entries will receive #Empty as a value.

Key: Specify one or multiple unique identifiers for observations in the dataset. Your dataset needs to have at least one key column if you use a regression predictive model.
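The Missing-field behavior described above amounts to substituting a placeholder for empty entries. A minimal sketch with hypothetical column data:

```python
# Replacing missing entries with the placeholder string, as described for the
# "Missing" field above. The column contents here are hypothetical.
column = ["red", None, "green", "", "blue"]
placeholder = "#Empty"

# Treat both None and empty strings as missing entries.
filled = [v if v not in (None, "") else placeholder for v in column]
assert filled == ["red", "#Empty", "green", "#Empty", "blue"]
```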
In Smart Predict, you need to provide data so that your predictive model can be trained or applied to generate the predictions. This data can be organized using different formats depending on the type of predictive model:

Time Series
Dataset: It can be an acquired or a live dataset.
Planning Model: Not all types of planning model are supported. See Restrictions.
Related Information
Planning Model as Data Source
About Datasets and Dataset Types
With Smart Predict you can go one step further and generate forecasts per entity to get accurate business-oriented insights, not only raw forecasts. You can use your planning model directly as the data source; there is no need to extract the data into a dataset.
Smart Predict uses the data available in your planning model to create and train a predictive model. You can then analyze predictive forecast accuracy across the combined dimension values and understand the signal breakdown in detail. Once you are satisfied with the accuracy of your predictive model, you can generate the predictive forecasts: they are saved back directly into the private version of your planning model. It's then easy for you to augment your story with actuals and predictive forecasts.
You can find a detailed use case using a planning model as data source in this blog.
Note
There are restrictions on using planning models as data sources. See Restrictions.
When you create a predictive model, you initially specify a training data source and a target or signal variable, and then define additional training settings. Training is a process that uses SAP automated machine learning algorithms to find the best relationships or patterns of behavior in the data source. The result is a predictive model that you can apply to a new data source to predict, with a probability, what the value of the target or signal could be for each element of the data source.
While creating the predictive model, you selected a training data source. As the values of the target variable are known in this data source, the data can be used to evaluate the accuracy of the predictive model's results. During the training process, the data source is cut into subsets using a process called the partition strategy, with a final partition used to validate the predictive model's performance using a range of performance indicators and graphical tools. For more information regarding the partition strategy, refer to the related link.
The following graphics summarize what happens when you click Train with a training dataset as data source:
If the training is successful, the predictive model produces a range of performance indicators and graphical charts that allow you to analyze the training results. Assessing the accuracy and robustness of the training is called debriefing the predictive model. You can find more information on debriefing by clicking the related link Looking for the Best Predictive Model at the end of this page.
If the training is not successful, use the Status panel (click the Settings panel icon) to access detailed information on why warnings and errors were generated during the training process.
Note
Smart Predict takes into account neither the textual variables nor the time variables contained in the columns of your training dataset. These variables are therefore excluded from the training of your predictive model. For more information, see the Restrictions.
Once you are satisfied with the accuracy and robustness of your predictive model, you can apply it to a new data source for predictive insights, ensuring that the new data source has the same information structure as the training data source. For more information on the apply step, click the related link below, Generating Your Predictions.
Related Information
Partition Strategy
Restrictions
Keeping Informed With The Status Panel
Looking for the Best Predictive Model
Generating Your Predictions
Partition Strategy
A partition strategy is a technique that decomposes a training data source into two distinct subsets:
A training subset
A validation subset
Thanks to this partition strategy, the application can cross-validate the predictive models generated to ensure the best
performance.
The following table defines the roles of the two data subsets obtained using the partition strategy.

Validation: Select the best predictive model among those generated using the training subset, the one that represents the best compromise between perfect quality and perfect robustness.
Note
For Time Series Forecast, the validation subset allows you to calculate the confidence interval (Error Min and Error Max) of the predictions.
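As an illustration, here is a minimal sketch of such a partition. The 75/25 ratio is an arbitrary assumption for the example; Smart Predict's actual partition strategy is internal to the application. The split is chronological, which matches the time series case where the most recent observations form the validation subset:

```python
def partition(observations, validation_ratio=0.25):
    """Split a training data source chronologically into a training
    subset and a validation subset."""
    cut = int(len(observations) * (1 - validation_ratio))
    return observations[:cut], observations[cut:]

observations = list(range(100))        # 100 ordered observations
train, validation = partition(observations)
print(len(train), len(validation))     # 75 25
```

The training subset is used to generate candidate predictive models; the validation subset, held out from training, is used to measure how well each candidate generalizes.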
Related Information
Training a Predictive Model
The selection of the best time series predictive model is based on the horizon-wide MAE: the time series predictive model is applied on the past observations found in the validation set. For each period, the predictive model calculates as many forecasted values as requested by the analyst; this is called the horizon of forecasts. Each of those forecasted values is compared to the corresponding actual one. Then, for each possible horizon, a per-horizon MAE can be calculated. The horizon-wide MAE is the mean of all per-horizon MAE values.
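The computation described above can be sketched as follows. The data layout (a list of actuals plus per-origin forecast lists) is an illustrative assumption, and the numbers are toy values:

```python
def horizon_wide_mae(actuals, forecasts, horizon):
    """Mean of the per-horizon MAE values.

    forecasts[t] holds the `horizon` values forecast from origin t,
    so forecasts[t][h] is compared with the actual at period t + h + 1.
    """
    per_horizon = []
    for h in range(horizon):
        errors = [
            abs(forecasts[t][h] - actuals[t + h + 1])
            for t in forecasts
            if t + h + 1 < len(actuals)
        ]
        per_horizon.append(sum(errors) / len(errors))   # per-horizon MAE
    return sum(per_horizon) / len(per_horizon)          # horizon-wide MAE

actuals = [10, 12, 11, 13, 14, 15]
# Forecasts with horizon 2, made at validation origins 2 and 3 (toy numbers).
forecasts = {2: [12.5, 13.0], 3: [14.5, 14.0]}
print(horizon_wide_mae(actuals, forecasts, horizon=2))  # 0.75
```

Here the MAE at horizon 1 is 0.5 and at horizon 2 is 1.0, so the horizon-wide MAE is their mean, 0.75.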
Context
You create a predictive scenario that corresponds to the type of information you need to answer a business question. The
predictive scenario is where you will create and store one or several predictive models. Refer to Related Information for a full
description of the types of predictive scenarios available.
Procedure
1. Select Main Menu > Create > Predictive Scenario.
Note
You can also create a predictive scenario from the Files page by clicking the Create icon in the My Files menu bar.
2. Select the type of predictive scenario that best fits your business question.
The New Predictive Scenario dialog appears. It lists files and folders in the Files repository. If you already have a user
folder, you can store your predictive scenario there, or you can create a new folder.
3. Browse to and select a folder for the predictive scenario, and then enter a meaningful and unique name.
4. Enter a description, if needed. It should describe the business intent of the predictive scenario; for example, what
problem is it trying to solve? Is its goal to predict who will buy a product? Or is it creating groups of customers?
5. Click OK.
Your predictive scenario is created and the Settings pane automatically opens. The pane contains the parameters and options that you'll use to define its first predictive model. The available settings depend on the type of predictive scenario, and these are described in Related Information.
Related Information
Defining the Settings of a Classification or Regression Predictive Model
Defining the Settings of a Time Series Predictive Model Using a Dataset as Data Source
What Type of Predictive Scenario Do You Need?
Depending on the type of predictive scenario you have created and the training data source type you will use to train your
predictive model, the predictive model settings can be different.
Note
Keep in mind the following when:
Using an acquired dataset as data source: An acquired dataset must be uploaded to SAP Analytics Cloud, under the Files area, before you can use it.
Using a live dataset as data source: Before you start using live datasets as data source, you need to check with your administrator that your live SAP HANA data repository works. For more information, see Setting up Live SAP HANA Data Access for Smart Predict.
Using a planning model as data source (Time Series predictive scenario only): Before you can use your planning model as a data source, you must ensure that you have at least read access to it. There are also some specific considerations when currency conversion is enabled. See How does Smart Predict Support Currencies Defined in Planning Model?.
Note
Keep in mind that your training and application data source must come from the same data source location. You can't apply
a predictive model on a live dataset if it was trained with an acquired dataset, nor can you apply a predictive model on an
acquired dataset if it was trained using a live one.
Regarding planning models: the predictive forecasts are saved back to the same planning model that you use as a source.
Related Information
About Preparing Datasets for Predictive Scenarios
Before you train your classification or regression predictive model, you need to specify how you want your predictive model to be trained through the Settings panel.
The following sections mirror the sections of the Settings pane you need to complete to create your predictive model.
General

Description: Enter what your predictive model is trying to do. For example, predict if a customer will churn or not.

Training Data Source: Browse and select the data source that contains your historical data. The data source can be an acquired dataset or a live dataset.

Edit Column Details: Check and update if necessary the columns contained in your data source. You might need to check the statistical type if you cannot select a column as your target at the next step.

Predictive Goal

Target: Select the column from your data source that contains the information you want to get predictions for. For a classification predictive model, the target column must contain binary values only (for example: yes or no). For a regression predictive model, the target column must contain numerical values.

Influencers

Exclude as influencer: Select the influencers that should not be taken into consideration by the predictive model. All of the influencers contained in your training data source can influence the target to a greater or lesser extent.
Example
Imagine that you want to launch a phone survey. You decide to limit the survey to 3 questions. In this case, as you need to focus on the questions that best influence the prediction, you check the option Limit Number Of Influencers and set Maximum Number of Influencers to 3.
Click the Train button. Thanks to the generated reports, you can analyze the predictive model's performance and decide whether you need to further refine your predictive model or can use it with confidence. For more information, see Looking for the Best Predictive Model.
Defining the Settings of a Time Series Predictive Model With a Planning Model as Data Source
There are some settings to specify before you train your time series predictive model using a planning model as data source.
To define how you want your predictive model to be trained, use the Settings panel as described in the tables below.
For more information about what is currently supported in Smart Predict, see the section Restrictions Using Planning Model as Data Source for Smart Predict in Restrictions.
General

Description: Enter a description that explains what your predictive model is trying to do. For example, you might want to forecast product sales by city.

Time Series Data Source: Browse and select the planning model you want to use as a data source. Smart Predict supports only standalone planning models (both new model types and classic account models).

Note
SAP Business Planning and Consolidation (SAP BPC) planning models are not supported, whether they are live or acquired.
Version: Browse and select the planning model version you want to use as data source. The input version must be a public version that is not in edit mode, or a private version. You must have at least read access to it. There are also some specific considerations when currency conversion is enabled. See How does Smart Predict Support Currencies Defined in Planning Model?.
Predictive Goal

Signal: Select the numeric value containing the data to be forecasted. Smart Predict doesn't support calculated measures when using a planning model, even if an inverse formula is provided. For more information on inverse formulas, you can refer to the chapter Inverse Formulas.

Note
If your signal is related to a currency, you also have the option to select Default Currency or Local Currency. The option you choose determines the currency used to forecast and report on your signal.
If your planning model is a new model type structured with an account dimension and one or multiple measures, to define your Signal you must select a value for both, using the Measure and Account selectors.

For more information about using a planning model as a data source, see the section Restrictions Using Planning Model as Data Source for Smart Predict in Restrictions. For more information about currency support, see the chapter How does Smart Predict Support Currencies Defined in Planning Model?. To learn more about the different model types, you can refer to the chapter called Getting Started with the New Model Type.

Time Granularity: By default, this refers to the level of date granularity available in the planning model data source. If the lowest level of the date hierarchy in the planning model is daily, then Smart Predict will create daily predictive forecasts.
Number Of Forecasts: Select the number of predictive forecasts you would like to get. For more information, see How Many Forecasts Can Be Requested?
Note
There are specific restrictions on entities. For more detailed information, see the section Restrictions Using Planning Model as Data Source for Smart Predict in Restrictions.
Train Using: Select whether you want to train your predictive model using all observations or a window of observations. If you choose to use a window of observations, you'll need to specify the size of the window you want to use. It can be useful to define the range of observations used to train the predictive model: you may want to ignore very old or inappropriate observations to avoid your predictive model learning from obsolete or inappropriate behavior.

Note
If the range of predictive forecasts overlaps existing data in the private version, the data will be overwritten.

Example
If you want to forecast travel costs for next year, you might want to ignore a couple of months in your past data where travel was frozen for budget reasons.

Until: Select whether you want to train your predictive model until the last observation or until another date of your choice. If you select a custom observation date, make sure it stays within the time range defined in the data source planning model.

Force Positive Forecasts: Switch the toggle on if you want to get positive forecasts only. This turns negative predictive forecasts to zero, which can be useful when predictive forecasts only make sense as positive values. For example, if you need to forecast the number of sales of one of your main products by major cities in a region, it makes no sense to get negative values: either you sell a number of products or you sell none of them. Negative values will be turned to 0.
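The Force Positive Forecasts behavior amounts to clamping negative forecasts to zero, as in this minimal sketch (the helper name and sample values are illustrative):

```python
def force_positive(forecasts):
    """Turn negative predictive forecasts into 0, keep the rest unchanged."""
    return [max(0.0, value) for value in forecasts]

# Example: unit sales can never be negative.
print(force_positive([120.0, -3.5, 48.2]))  # [120.0, 0.0, 48.2]
```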
Select the Train & Forecast button. Thanks to the generated reports, you can analyze the predictive model's performance and decide whether you need to further refine your predictive model or can use the predictive forecasts with confidence. For more information, see Analyzing the Results of Your Time Series Predictive Model.
Related Information
Getting Started with the New Model Type
The planning model used as a data source might contain currencies. But how does Smart Predict support these currencies? It depends on how they are configured in your planning model data source, and on what type of planning model you are using: a classic account model, or a new model type.

Classic account model: You can generate predictions in Smart Predict using the default or local currency to read and write data when your planning model is a classic account model and currency conversion is enabled. Depending on your selection, the report for all your model entities is consistently expressed in one default currency, or in multiple local currencies. For example, if the default currency is USD, you always see numbers expressed in USD in the Smart Predict report. If there are multiple local currencies in your planning model, the numbers in the Smart Predict report reflect these multiple local currencies.

When you select the Default Currency setting, you can only write to planning versions configured to receive default currencies.

When you select the Local Currency setting, you can only write to planning versions configured to receive local currencies.

Note
To understand the currency displayed in the report, you can check the Smart Predict settings and the currency definition in your planning model. The default currency is indicated in the planning model.
Example
If you deal in Japanese yen, you would understand that your report is in Japanese yen, as your Smart Predict currency setting is Local Currency.

New model type: Smart Predict doesn't support predictive forecasting on calculated measures, including currency conversion measures, when your planning model is a new model type.

To learn more about the different model types, you can refer to the chapter called Getting Started with the New Model Type. To learn more about how currencies can be set up in a planning model, see Planning on Data in Multiple Currencies.
For classic account models, Smart Predict can support currencies when they are set at the planning model level in the following
cases:
Caution
You must have at least read access to the input version, and write access to the output version (private only). Public versions in edit mode are available for selection, but aren't supported.
Related Information
Attributes of an Account Dimension
Attributes of an Organization Dimension
Attributes of a Dimension
Planning on Data in Multiple Currencies
Setting Up Model Preferences
Displaying Currencies in Tables
Restrictions
Getting Started with the New Model Type
How Can You Get Distinct Predictive Forecasts per Entities For Your Planning
Model?
Thanks to Smart Predict, you can create distinct predictive forecasts per entity using your planning model as data source, where the granularity of the predictive forecasts is determined by the aggregation level of the combined dimensions. But what does that mean?
Example
Let's take an example to better understand how it works: Imagine that you want to forecast your future sales by country and
by product.
To build a predictive model with distinct forecasts per entity, taking your planning model as data source, Smart Predict needs to match the data contained in your planning model (actuals are used as historical data) with the variable roles that are mandatory to generate the predictive forecasts:
Signal: In your planning model, this corresponds to a measure that does not involve calculation. In our example, it's the measure you want to forecast: <Sales>.
Once the training is done and the generated predictive forecasts are available, the data looks like this:
Once the predictive model is applied, the predictive forecasts are added to your planning model. In our example, this means that the generated forecasts for Sales are added for June and July.
Even if the time series predictive model is trained and applied at the lowest level of date granularity, you can still report data at an upper level in a story.
Going back to our example: the time series has been generated on a monthly basis. You can report by aggregating the sales by quarters or years (instead of months).
Entities help you create your predictive forecasts at different levels, depending on your business needs. They can also help you detect performance gaps in your predictive models in some cases.
Entities are subsets of your predictive forecasts that are calculated independently from a combination of one to five dimensions. Each entity can be seen as an individual predictive forecast. These individual predictive forecasts can be aggregated at a higher level if needed.
The level you need for your predictive forecasts depends on the insights you want to get from them and the data available in your source planning model. You can create entities by combining up to 5 dimensions for varying levels of high-detail predictive forecasts. You can also work without entities to keep your predictive forecasts high-level.
Let's take a car sales scenario to explore in more detail how entities can help you find the level of predictive forecasts you need. Your company wants insights on future car sales. The company sells five car brands across six countries. You have five months of data available in your source planning model, and you want two months of predictive forecasts.
In this case, you use car sales as your signal and Date as your date dimension, and you train your predictive model.
You get a single predictive model that generates two car sales predictive forecasts, one for June and one for July.
This high-level forecasting of the company's future car sales is useful if you just need an overview without looking into subset-specific trends.
In this case, you keep car sales as your signal and Date as your date dimension. You add Brand and Country dimensions as
entities and train your predictive model.
You get thirty predictive models that are calculated independently. The thirty predictive models generate sixty predictive
forecasts if all combinations have data available. This high-detail forecasting approach is useful if you need to focus on several
subsets such as how one brand performs in a given country or how one country performs by brand compared to other countries.
The sixty predictive forecasts can still be aggregated at a higher level by Country, Brand or Date if needed.
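The model and forecast counts in this scenario follow directly from the dimension cardinalities, as this small sketch shows (the numbers come from the car sales example above, assuming every Brand and Country combination has data available):

```python
brands, countries = 5, 6
forecast_periods = 2  # June and July

# One predictive model per Brand x Country combination (entity).
models = brands * countries
# Each model generates one forecast per requested period.
forecasts = models * forecast_periods

print(models, forecasts)  # 30 60
```

The same arithmetic gives the single-entity case below: six countries yield six models and twelve forecasts.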
In this case, you keep car sales as your signal and Date as your date dimension. You add the Country dimension as entity and
train your predictive model.
You get six predictive models (one per country) that are calculated independently. Twelve predictive forecasts are generated if
all combinations of Country and Date are possible.
These twelve predictive forecasts can also be aggregated at a higher level by Country or Date if needed. This mid-level forecasting approach is useful when you need to focus on trends and relationships in one specific subset, such as how each country will perform individually and compared to other countries.
Entities are a useful tool to tailor the level of your predictive forecasts to your different business needs. You may need to try a
few forecasting combinations to reach the level of accuracy you want.
Note
Hierarchies are currently not supported as entities. For more information, see Restrictions.
Defining the Settings of a Time Series Predictive Model Using a Dataset as Data Source
Before you train your Time Series predictive model using a dataset as data source, you need to specify how you want your predictive model to be trained through the Settings panel.
The following sections mirror the sections of the Settings pane you need to complete to create your predictive model.
Note
The date formats in your time series data source must be one of:
YYYY-MM-DD
YYYY/MM/DD
YYYY/MM-DD
YYYY-MM/DD
YYYYMMDD
YYYY-MM-DD hh:mm:ss
where YYYY stands for the year, MM stands for the month, DD stands for the day of the month, hh stands for hour, mm
stands for minutes, and ss stands for seconds.
Example
January 25, 2018 will take one of the following supported formats:
2018-01-25
2018/01/25
2018/01-25
2018-01/25
20180125
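A sketch of validating these formats with Python's standard library. The helper name is hypothetical; Smart Predict performs its own parsing internally:

```python
from datetime import datetime

# The supported formats listed above, as strptime patterns.
SUPPORTED_FORMATS = [
    "%Y-%m-%d", "%Y/%m/%d", "%Y/%m-%d", "%Y-%m/%d",
    "%Y%m%d", "%Y-%m-%d %H:%M:%S",
]

def parse_supported_date(value):
    """Return a datetime if `value` matches a supported format, else None."""
    for fmt in SUPPORTED_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    return None

print(parse_supported_date("2018-01-25"))  # 2018-01-25 00:00:00
print(parse_supported_date("01/25/2018"))  # None (unsupported format)
```

Note that a US-style date such as 01/25/2018 is rejected, since all supported formats lead with the four-digit year.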
General

Description: Enter what your predictive model is trying to do. For example, forecast product sales by city.

Time Series Data Source: Browse and select the data source that contains your historical data.

Edit Column Details: Check and update if necessary the columns contained in your data source. You might need to check the statistical type if you cannot select a column as your signal at the next step.

Predictive Goal

Number Of Forecasts: Select the number of predictive forecasts you want. See How Many Forecasts Can Be Requested?.

Entity: Select up to five columns from your data source for which you want to get distinct forecasts. This field is optional. This identifies each entity that you want to get predictive forecasts for. The predictive model will capture specific behaviors for each entity and will generate distinct predictive forecasts.
Train Using: Select whether you want to train your predictive model using all observations or a window of observations. If you choose to use a window of observations, you'll need to specify the window size you want to use. It can be useful to define the range of observations used to train the predictive model: you may want to ignore very old or inappropriate observations to avoid your predictive model learning from obsolete or inappropriate behavior.

Example
If you want to forecast travel costs for next year, you might want to ignore a couple of months in your past data where travel was frozen for budget reasons.

Until: Select whether you want to train your predictive model until the last observation or until another date of your choice.
Last Observation: Let the application use the last training reference date as a basis.
User-Defined Date: You select a specific date (available in the dataset).

Force Positive Forecasts: Switch the toggle on if you want to get positive forecasts only. This turns negative predictive forecasts to zero, which can be useful when predictive forecasts only make sense as positive values. For example, if you need to forecast the number of sales of one of your main products by major cities in a region, it makes no sense to get negative values: either you sell a number of products or you sell none of them. Negative values will be turned to 0.
Click the Train & Forecast button. Thanks to the generated reports, you can analyze the predictive model's performance and decide whether you need to further refine your predictive model or can use the predictive forecasts with confidence. For more information, see Analyzing the Results of Your Time Series Predictive Model.
You can set the number of predictive forecasts that corresponds to your business needs. However, this number is subject to some limits:
If your time series data source is a dataset that contains future values for influencers, your number of predictive forecasts must be less than or equal to the number of future value observations you have in your data source.
Example
If you have future values for the next six months, then the number of predictive forecasts requested cannot exceed six.
The number of predictive forecasts delivered with confidence intervals is determined as follows:
If the time series data source size is equal to or fewer than 12 periods, it is treated as a small data source case, and by default the number of predictive forecasts with confidence intervals is set to 1.
In the other cases, the number of predictive forecasts with confidence intervals is set to 1/5 of the time series data source size.
Example
If your time series data source contains 1,000 observations, Smart Predict can provide up to 200 predictive forecasts with confidence intervals. If you ask for more than 200 predictive forecasts, the accuracy of the forecasts starting from the 201st cannot be evaluated.
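The rule above can be sketched as a small helper (the function name is hypothetical):

```python
def forecasts_with_confidence_intervals(source_size):
    """Number of predictive forecasts delivered with confidence intervals,
    based on the time series data source size (in periods)."""
    if source_size <= 12:       # small data source case
        return 1
    return source_size // 5     # 1/5 of the data source size

print(forecasts_with_confidence_intervals(12))    # 1
print(forecasts_with_confidence_intervals(1000))  # 200
```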
Related Information
Partition Strategy
Defining the Settings of a Time Series Predictive Model Using a Dataset as Data Source
How Can Adding Influencers to Your Dataset Increase the Accuracy of Your Predictive Model?
Once you've trained your predictive model, the performance indicators can be too low to immediately consider the predictive model accurate (see Predictive Power for a classification predictive model, Root Mean Square Error for a regression predictive model, or Horizon-Wide MAPE for a time series predictive model).
One way to increase your predictive model's accuracy is to add influencers to your dataset. These influencers can then be used by Smart Predict to improve its understanding of the relationships within your data.
Note
Influencers are only available if your data source is a dataset.
Example
Your company has noticed that the maintenance costs of its stores are getting too high. You need to analyze them to see where to cut costs, but also to predict future maintenance costs better to avoid going over budget. You create your first predictive scenario with a Time Series predictive model to assess the maintenance costs per store. You choose the overall expenses as signal, the date of these expenses as date variable, and the store ID as entity.
You train your first predictive model excluding the twenty-three possible influencers.
The Horizon-Wide MAPE of your first predictive model in the debrief is 26.71%.
Note
You want the percentage of your predictive model's Horizon-Wide MAPE to be as low as possible, as it indicates the percentage of error you can expect in your predictive forecasts.
You notice that some of the variables excluded as influencers, such as the number of Saturdays and Sundays, have a direct relation to the date dimension you used in your predictive model. You realize they impact the insights and could improve the accuracy of your predictive forecasts if they were included as influencers.
You create a second predictive model by duplicating your first predictive model. However, this time you include all influencers and train your second model.
The Horizon-Wide MAPE of your second predictive model in the debrief drops to 20.77%.
Your predictive model gained 22% in accuracy simply by including variables as influencers. You may need to try a few influencer combinations to reach the level of accuracy you want.
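The 22% accuracy gain quoted above is the relative drop in Horizon-Wide MAPE between the two models:

```python
baseline_mape = 26.71   # first model, influencers excluded
improved_mape = 20.77   # second model, influencers included

# Relative improvement: how much of the baseline error was removed.
gain = (baseline_mape - improved_mape) / baseline_mape
print(f"{gain:.0%}")  # 22%
```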
Related Information
Influencer Count
Root Mean Squared Error (RMSE)
Horizon-Wide MAPE
Predictive Power
1. You save your predictive model to add it to the predictive model list of your predictive scenario. The predictive model is saved with the status Not Trained.
Note
The "Save" button is in the toolbar.
2. You train your predictive model. During training, Smart Predict explores and learns from relationships in the data source to find the best combination or patterns of behavior and generates the predictive model. Its status is updated immediately in the predictive model list as Trained.
Note
It can happen that the training fails. In this case, check the Status panel by clicking the icon at the top of the Settings panel.
Related Information
Training a Predictive Model
Keeping Informed With The Status Panel
Status of a Predictive Model
At the top of the Settings panel, click the icon to access information on the predictive model and any errors that occurred.
Note
The Status panel stays empty until your predictive model is trained.
Model Status: The area where you can access the error messages. For example, if the training failed, you can get some information on what went wrong.

Detailed Logs: Logs that display the details of each step of the process. In case of a problem, they allow you to provide SAP support professionals with information.
When you train or apply your predictive model, you can get the following types of information on the training or apply process:
Note
For time series predictive models that are split into entities, errors and warnings are displayed directly in the debriefing
reports, and the exact error or warning is displayed per entity.
You open the Status panel by clicking the status icon at the top of the Settings panel.
You've trained your first predictive model using your training data source and it's now time to evaluate whether you can use it
with confidence to generate your predictions.
You can evaluate your predictive model's performance using a range of performance indicators. You can also add new predictive
models with different settings and compare their performances if you want to refine your results.
Context
You want to open an existing predictive model.
Procedure
1. Open the predictive scenario that contains your predictive model.
2. In the Predictive Models list, click the predictive model you want to open.
Context
You want to experiment with new settings starting from a predictive model version that already exists. You can duplicate the
predictive model's settings to create a new version of the current predictive model to:
Set or update the number of influencers for classification and regression predictive models.
Set new training and forecast settings for time series predictive models.
Procedure
1. Open the predictive scenario that contains the predictive model you want to duplicate.
2. Click the icon at the level of the predictive model you want to duplicate, and select Duplicate.
You create an exact copy of the original version of your predictive model.
Note
The duplicated version created is always untrained.
Context
You want to delete a predictive model.
Procedure
1. Open the predictive scenario that contains the predictive model you want to delete.
2. Select Delete.
Context
Your predictive scenario is created and you've already created at least a first predictive model. Now you want to create a new
predictive model to test new training settings and compare the results.
Procedure
1. Click the Create Predictive Model button.
The new predictive model is added to the predictive scenario and appears in the predictive model list. You can now easily
compare the existing predictive models to find the one that best fits your needs.
You can check the performance and robustness of your predictive model using several performance indicators. The indicators
available depend on the type of predictive model you have set up:
Influencer Count
Record Count
Gini Index
Influencer Count
Record Count
Error Mean
Maximum Error
Influencer Count
Record Count
Predictive Power
Quality indicator of a classification predictive model.
The predictive power measures the ability of your predictive model to predict the values of the target variable using the
influencers present in the training data source.
The predictive power indicator takes a value between 0% and 100%. This value should be as close as possible to 100%, without
being equal to 100%.
A predictive power of 100% would be a hypothetically perfect predictive model, where the influencers are capable of accounting
for 100% of the information in the target variable. In practice, however, this is usually an indication that an influencer that is
100% correlated with the target variable was not excluded from the data source analyzed. A good practice is to exclude this
influencer when you define the settings of your predictive model.
Tip
To improve the predictive power of a predictive model, try adding new influencers to the training data source.
Example
A predictive model with a predictive power of 79% is capable of accounting for 79% of the variation in the target variable
using the influencers in the data source analyzed.
No exact threshold exists to separate a "good" predictive model from a "bad" one in terms of predictive power, as this
depends on your business case. The predictive model of a customer-based scenario can be considered "good" with a
predictive power of 40%, while the predictive model of a finance-based scenario usually requires a predictive power above 70%
to be considered "good".
The prediction confidence indicates the capacity of the predictive model to achieve the same performance when it is applied to
a new data source that has the same characteristics as the training data source. If the distribution of data differs between
the two data sources, the predictive model is no longer useful.
The following graph displays the predictive power and the prediction confidence calculation:
Horizon-Wide MAPE
Example
A Horizon-Wide MAPE of 12% indicates that the error made when using a forecasted value will be around 12%.
The absolute value of the differences is taken into account to evaluate the average error.
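As a sketch of how such an error percentage is built from absolute differences, here is a plain MAPE calculation. The exact Horizon-Wide aggregation across the forecast horizon is internal to Smart Predict, and the values below are hypothetical:

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error: average of |actual - forecast| / |actual|, in %."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical monthly actuals vs. predictive forecasts
actuals = [100.0, 120.0, 90.0, 110.0]
forecasts = [112.0, 110.0, 99.0, 100.0]
print(round(mape(actuals, forecasts), 2))
```

A lower result means the forecasts deviate less, on average, from the actual values.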
Related Information
How Many Forecasts can be Requested?
Influencer Count
The Influencer Count indicates the number of influencers used in the predictive model.
Tip
To improve the predictive power of your predictive model, you may want to add influencers to the training data source.
Record Count
Number of rows processed.
Tip
To improve the prediction confidence of a predictive model, you may want to add new observation rows to the training data
source. In the case of a classification predictive model, keep in mind that the number of rows of the less represented class
should ideally be higher than 1000.
For classification predictive models, it corresponds to the ratio of correctly classified rows to the total number of rows.
Example
A classification rate of 0.82 means that 82% of the rows in the training dataset are correctly classified by the predictive
model.
Note
The classification rate is not well adapted to unbalanced cases, where the target category is not very frequent. For
example, if only 1% of rows belong to the target category, it's very easy to get a very high classification rate. In such a case,
check the Predictive Power or the Area Under the ROC Curve (AUC).
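The classification-rate calculation described above is a simple ratio. A minimal sketch with made-up labels (Smart Predict computes this internally on the validation data):

```python
def classification_rate(actual, predicted):
    """Ratio of correctly classified rows to the total number of rows."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

# Hypothetical binary target: 8 of 10 rows are classified correctly
actual    = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 0, 0, 1, 0, 1, 1, 0]
print(classification_rate(actual, predicted))  # 0.8
```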
The Area Under the ROC Curve (AUC) is another way to measure predictive model performance. It calculates the area under
the Receiver Operating Characteristic (ROC) curve. The AUC is linked to Predictive Power (PP) by the following formula:
PP = 2 * AUC - 1. For a simple scoring predictive model with a binary target, the AUC represents the probability that a
randomly chosen signal observation will have a higher score than a randomly chosen negative observation (non-signal).
One advantage of the Area Under the ROC Curve is its independence from the target distribution. For example, even if you
duplicate each positive observation twice by duplicating rows in the dataset, the AUC of the predictive model stays the same.
Tip
AUC is a good measure for evaluating a binary classification system. It is useful in cases where the target category is not
very frequent, which is not true of the Classification Rate.
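The probabilistic reading of the AUC above suggests a direct sketch: count the score pairs where a positive observation outranks a negative one, then derive PP = 2 * AUC - 1. The scores are hypothetical:

```python
def auc_by_rank_pairs(scores_pos, scores_neg):
    """AUC as P(random positive scores higher than random negative); ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical model scores for positive and negative observations
pos = [0.9, 0.8, 0.6]
neg = [0.7, 0.4, 0.3, 0.2]
auc = auc_by_rank_pairs(pos, neg)
pp = 2 * auc - 1  # Predictive Power derived from the AUC
print(auc, pp)
```

Note that duplicating every positive observation leaves the result unchanged, which illustrates the independence from the target distribution mentioned above: `auc_by_rank_pairs(pos + pos, neg)` returns the same value as `auc_by_rank_pairs(pos, neg)`.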
Example
Below is an example of a ROC curve:
Sensitivity, which appears on the Y axis, is the proportion of CORRECTLY identified signals (true positives) found (out of all
true positives in the validation data source).
[1 – Specificity], which appears on the X axis, is the proportion of INCORRECT assignments to the signal class (false
positives) incurred (out of all false positives in the validation data source). (Specificity, as opposed to [1 – specificity], is the
proportion of CORRECT assignments to the class of NON-SIGNALS, that is, true negatives.)
Error Mean
Mean of the differences between predictions and actual values.
The Error Mean, or Standard Error of the Mean, quantifies the precision of the predictive model's estimations. It's used to
determine how precisely the mean of the predictive model's predicted values estimates the population mean.
A negative mean value indicates that the predictive model tends to underestimate the target values, often generating values
below the actual values.
A high mean value indicates that the predictive model tends to overestimate the target values, often generating values above
the actual values.
To improve the accuracy of your predictive model, you can bring additional influencers that make the target clearer to the
training data source.
The Error Standard Deviation, or Standard Deviation, is a measure of variability that quantifies how much the errors vary from
one another.
A high standard deviation indicates that the data points are spread out over a wider range of values.
To improve the accuracy of your predictive model, you can bring additional influencers that make the target clearer to the
training data source.
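A small sketch of these two indicators on hypothetical prediction errors. Here each error is taken as predicted minus actual, so a negative mean signals systematic underestimation:

```python
import math

def error_mean_and_std(actuals, predictions):
    """Mean of the errors (predicted - actual) and the standard deviation of those errors."""
    errors = [p - a for a, p in zip(actuals, predictions)]
    mean = sum(errors) / len(errors)
    variance = sum((e - mean) ** 2 for e in errors) / len(errors)
    return mean, math.sqrt(variance)

# Hypothetical values: this model underestimates on average
actuals = [10.0, 12.0, 14.0, 16.0]
predictions = [9.0, 12.5, 13.0, 15.5]
mean, std = error_mean_and_std(actuals, predictions)
print(mean, std)  # negative mean, moderate spread
```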
Gini Index
Measure of predictive power based on the Lorenz curve.
The Gini index is a measure of predictive power based on the Lorenz curve. It is proportional to the area between the
random line and the predictive model curve: it is defined as the area between the random ('trade-off') line and the obtained
curve, multiplied by 2.
A value of 0 corresponds to a random predictive model; a value of 1 corresponds to an ideal predictive model.
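As a sketch of this definition (twice the area between the model's curve and the random diagonal), using trapezoidal integration on hypothetical curve points:

```python
def gini_index(xs, ys):
    """Twice the area between the model curve (xs, ys) and the random diagonal y = x.

    Points must run from (0, 0) to (1, 1); the area is computed by the trapezoid rule.
    """
    area_curve = sum(
        (xs[i + 1] - xs[i]) * (ys[i + 1] + ys[i]) / 2.0
        for i in range(len(xs) - 1)
    )
    return 2.0 * (area_curve - 0.5)  # the diagonal's area is 0.5

# Hypothetical cumulative curve: the model finds most positives early
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [0.0, 0.60, 0.85, 0.95, 1.0]
print(round(gini_index(xs, ys), 3))
# A random model follows the diagonal, giving a Gini index of 0:
print(gini_index(xs, xs))  # 0.0
```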
Maximum Error
The Maximum Error is the highest value resulting from the calculation of the absolute differences between the predicted and
the actual values for each row of the data source.
The Root Mean Squared Error has the advantage of representing the amount of error in the same unit as the predicted column,
making it easy to interpret. If you are trying to predict an amount in dollars, then the Root Mean Squared Error can be
interpreted as the amount of error in dollars.
What is the formula used to calculate the Root Mean Squared Error?
The Root Mean Squared Error is calculated using the following formula:
RMSE = √( (1/N) × Σ (predicted − actual)² )
where:
N = Number of observations
Other Interpretation
The Root Mean Squared Error can be interpreted as the standard deviation of the error (it's the square root of the error
variance).
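The formula translates directly into code. The dollar amounts below are hypothetical:

```python
def rmse(actuals, predictions):
    """Root Mean Squared Error: square root of the mean of squared differences."""
    n = len(actuals)
    squared_errors = [(p - a) ** 2 for a, p in zip(actuals, predictions)]
    return (sum(squared_errors) / n) ** 0.5

# Hypothetical dollar amounts: predictions are off by about 5 dollars on average
actuals = [100.0, 150.0, 200.0]
predictions = [104.0, 144.0, 205.0]
print(round(rmse(actuals, predictions), 2))
```

Because the result is in the same unit as the predicted column, it reads directly as "about how many dollars off" the model typically is.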
Once you've trained your classification predictive model, you can analyze its performance to make sure it's as accurate as
possible.
Use the dropdown list to access and analyze the reports on influencers and predictive model performance.
The debrief helps you answer the following questions:

What do the values of the two main performance indicators mean?
Quickly check whether your predictive model is accurate and robust using the global performance indicators:
Predictive Power is your model's main measure of predictive model accuracy. It takes a value between 0% and 100%. This value should be as close as possible to 100%, without being equal to 100% (100% would be a hypothetically perfect predictive model; 0% would be a random predictive model with no predictive power). To improve your Predictive Power, you can add more influencers, for example. For more information, refer to Predictive Power.
Prediction Confidence indicates the capacity of your predictive model to achieve the same degree of accuracy when you apply it to a new data source, which has the same characteristics as the training data source. It takes a value between 0% and 100%. This value should be as close as possible to 100%. To improve your Prediction Confidence, you can add new rows to your data source, for example. For more information, refer to Prediction Confidence.
Note
Depending on your business issue, you can look at the other provided performance indicators for the predictive model, and also review the profile of the detected curve. For more information, refer to Assessing Your Predictive Model With the Performance Indicators and The Detected Target Curve.

Does the target value appear in sufficient quantity in the different data sources?
Check the frequency, in each data source, of each target class (positive or negative) that belongs to the target variable. It's usually recommended that you have at least 1000 records of each class in your data source. Under this threshold, the validity of the prediction confidence is no longer guaranteed. For more information, refer to Target Statistics.

Which influencers have the highest impact on the target?
Get an overview of the main influencers' impact on the target. Only the top five contributing influencers are displayed by default. For more information, refer to Influencer Contributions.

Which group of categories has the most influence on the target?
In the Influencer Contributions report, analyze the influence of the different categories of an influencer on the target. The influence of a category can be positive or negative: if the influence value is positive, we are more likely to get the "minority class"; if the influence value is negative, we are less likely to get the "minority class". For more information, refer to Category Influence, Grouped Category Influence, and Grouped Category Statistics.

Is my model producing accurate predictions? Can I evaluate the costs/savings using my predictive model?
Use the Confusion Matrix tab to assess the predictive model in detail, using standard metrics such as specificity.
Use the Profit Simulation tab to estimate the expected profit, based on costs and profits associated with the predicted positive and actual positive targets.
For more information, refer to Confusion Matrix and The Profit Simulation.

Can I see any model errors?
Use the panel of performance curves in the Performance Curves tab to compare your predictive model to a random predictive model and a hypothetically perfect predictive model:
Check how much better your predictive model is than the random predictive model with The Lift Curve.
Check how well your model discriminates, in terms of the compromise between sensitivity and specificity, with The Sensitivity Curve (ROC).
Check the values for [1-Sensitivity] or for Specificity against the population with The Lorenz Curves.
Understand how positive and negative targets are distributed in your model with The Density Curves.
Determine the percentage of the population to contact to reach a specific percentage of the actual positive target with The Detected Target Curve.

What's next? You have two possibilities:
You are satisfied with your predictive model's performance after checking the performance indicators. Then you can use it: see Generating Your Predictions.
You would like to see if you can improve your predictive model's performance:
Duplicate your current predictive model and experiment with updated settings. You can then compare the two versions and find the best one. See Duplicating a Predictive Model.
Update the settings of your predictive model and retrain it. See Define Settings and Train a Classification or Regression Predictive Model.
Caution
You will erase the previous version!
Delete your predictive model. See Deleting a Predictive Model.
Target Statistics
Identify the categories in your target and their frequency in each data source.
Target Statistics are expressed as percentages and show how often each target class appears in the data source.
For each data source, the Target Statistics report lists the categories of the target variable. For each category, the table
indicates how often it appears compared to the other category. The target category (positive or negative target) is by default
the less frequent one.
Influencer Contributions
The Influencer Contributions show the relative importance of each influencer used in the predictive model.
The Influencer Contributions view allows you to examine the influence on the target of each influencer used in the predictive
model.
The most contributive influencers are those that best explain the target.
Only the contributive influencers are displayed in the reports; influencers with no contribution are hidden. The sum of the
displayed contributions equals 100%.
Note
The number of influencers displayed depends on the predictive model settings you defined at creation. For example, if you
chose to Limit Number Of Influencers to 3, then you get information on the 3 most important influencers at most.
Related Information
Variables in Smart Predict
Category Influence
Category Influence
Analyze the influence of different categories of an influencer on the target.
Category influence is an analysis of the influence of different categories of an influencer on the target, computed from basic
information:
The difference between the percentage of positive cases in this category and the percentage of positive cases in the
whole population.
Categories with positive values are categories where observations are more likely to be in the positive category of the
target: the percentage of positive targets within this category is above the percentage of positive targets in the whole
data source.
Categories with negative values are categories where observations are more likely to be in the negative category of
the target: the percentage of positive targets within this category is below the percentage of positive targets in the whole
data source.
The influence is computed for each category and provided by the engine.
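The engine's exact formula is not reproduced in this extract, but the basic information described above (the positive rate inside a category minus the positive rate in the whole population) can be sketched as follows, with hypothetical data:

```python
def category_influence(rows, category):
    """Difference between the positive rate inside a category and the overall positive rate.

    rows: list of (category, is_positive) pairs -- a hypothetical data layout.
    """
    overall_rate = sum(pos for _, pos in rows) / len(rows)
    in_cat = [pos for cat, pos in rows if cat == category]
    return sum(in_cat) / len(in_cat) - overall_rate

# Hypothetical influencer "region" against a binary target
rows = [
    ("north", 1), ("north", 1), ("north", 0),
    ("south", 0), ("south", 0), ("south", 1),
    ("east", 0), ("east", 0),
]
print(round(category_influence(rows, "north"), 3))  # positive: more likely positive
print(round(category_influence(rows, "south"), 3))  # slightly negative
```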
Related Information
Grouped Category Influence
Grouped Category Influence
Grouped Category Influence shows groupings of categories of an influencer, where all the categories in a group share the same
influence on the target variable. You can quickly see which category group has the most influence.
The length and direction of a bar show whether the category has more or fewer observations that belong to the target
category:
A positive bar (influence on target greater than 0) indicates that the category contains more observations belonging to
the target category than the mean (calculated on the entire validation data source).
A value of 0 means that the category has no specific influence on the target.
A negative bar (influence on target less than 0) indicates that the category contains fewer positive cases (%) than the
percentage of positive cases in the overall validation data source.
Grouped Category Statistics
Grouped Category Statistics show the details of how the grouped categories influence the target variable over the selected
data source.
For a nominal target, the target mean is the frequency of positive cases for the target variable contained in the
data source.
For a continuous target, the target mean is the average of the target variable for the category in the data source.
The Y axis displays the frequency of the grouped category in the selected data source.
You can use the Validation and Prediction dropdown list to compare the results obtained by the predictive model (training
subset) with those obtained on validation (validation subset).
Confusion Matrix
The Confusion Matrix, also known as an error matrix, is a table that shows a classification predictive model's performance
by comparing the predicted value of the target variable with its actual value.
Each column of the Confusion Matrix represents the observations in a predicted category, while each row represents the
observations in an actual class:
Actual 1 (= Actual Positive Targets): Predicted 1 = number of correctly predicted positive targets (True Positive = TP);
Predicted 0 = number of actual positive targets that have been predicted negative (False Negative = FN).
Actual 0 (= Actual Negative Targets): Predicted 1 = number of actual negative targets that have been predicted positive
(False Positive = FP); Predicted 0 = number of correctly predicted negative targets (True Negative = TN).
The observations are classed into positive and negative target categories:
Positive target (Predicted 1 and Actual 1): An observation that belongs to the population you want to target.
Negative target (Predicted 0 and Actual 0): An observation that does not belong to this target population.
The Confusion Matrix reports the number of false positive, false negative, true positive, and true negative targets. It is a good
estimator of the error that would occur when applying the predictive model to new data with similar characteristics.
By default, the Total Population is the number of records in the validation data source. This is a part of your training data source
that Smart Predict keeps separate from the training data and uses to test the predictive model's performance.
The classification model allows you to sort the Total Population from the lowest to the highest probability. To get the predicted
category, which is what you are interested in, you need to choose the threshold that determines which observations get into
that category and which don't. Sliding the threshold bar allows you to experiment with this number and see the resulting
Confusion Matrix for the population on which you want to apply your predictive model.
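The thresholding step can be sketched as follows, with hypothetical probability scores (Smart Predict can also set the threshold automatically):

```python
def predict_category(probabilities, threshold):
    """Assign the positive category (1) to scores at or above the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]

def confusion_counts(actual, predicted):
    """Return (TP, FN, FP, TN) for binary targets."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return tp, fn, fp, tn

# Hypothetical scores: slide the threshold and watch the matrix change
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
actual = [1, 1, 0, 1, 0, 0]
print(confusion_counts(actual, predict_category(scores, 0.5)))   # (2, 1, 1, 2)
print(confusion_counts(actual, predict_category(scores, 0.35)))  # (3, 0, 1, 2)
```

Lowering the threshold converts false negatives into true positives, at the risk of adding false positives, which is exactly the trade-off the threshold bar lets you explore.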
Detected Target: You select the percentage of positive targets you want to detect.
Note
Refer to the section How is a Decision Made For a Classification Result? for information on how Smart Predict automatically
sets the threshold.
Get a detailed assessment of your predictive model's quality: the Confusion Matrix takes into account a selected threshold
that transforms a range of probability scores into a predicted category. You can also use standard metrics such as
specificity. For more information, see the related link.
Estimate the expected profit, based on costs and profits associated with the predicted positive and actual positive
targets. For more information, see the related link.
In some cases, assessing the predictive model quality based on the error matrix is more relevant than using metrics like the
classification rate.
Example
In a business scenario where you want to detect fraudulent credit card transactions, the False Negative (FN) class can be a
better metric than the classification rate. If your predictive model for detecting fraudulent transactions always predicts
"non-fraudulent", the classification rate can still be 99.9%.
The classification rate is excellent, but it isn't a reliable metric to evaluate the real performance of your predictive model
because it gives misleading results. These results are usually due to an unbalanced data source, where there is a lot of
variation in the number of samples in different classes.
This performance issue shows up in the error matrix as a high False Negative (FN) count (the number of actual fraudulent
transactions detected as non-fraudulent by the predictive model).
Related Information
How is a Decision Made For a Classification Result?
The Metrics
Example: Interpreting The Confusion Matrix
The automatically determined threshold is the point where you have the same % of positive observations for the applied data
source population as you do for the training data source population.
The Metrics
You can use the Confusion Matrix to compute metrics associated with different needs.
Fall-out: Proportion of negative targets that have been incorrectly detected as positive. Formula: FP/(FP+TN), or
(100% - Specificity).
Definition:
N = Number of observations
FN (False Negative) = Number of actual positive targets that have been predicted negative.
FP (False Positive) = Number of actual negative targets that have been predicted positive.
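These metrics all follow directly from the four Confusion Matrix counts. A sketch with the standard formulas; the counts below are made up, not the campaign figures from this document:

```python
def matrix_metrics(tp, fn, fp, tn):
    """Standard Confusion Matrix metrics, each as a ratio in [0, 1]."""
    return {
        "classification_rate": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),  # positives correctly detected
        "specificity": tn / (tn + fp),  # negatives correctly detected
        "precision": tp / (tp + fp),    # predicted positives that are correct
        "fall_out": fp / (fp + tn),     # = 1 - specificity
    }

# Hypothetical counts
m = matrix_metrics(tp=40, fn=10, fp=20, tn=130)
print(m)
```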
Example
A company wants to run a marketing campaign. They would like to target the campaign at the customers who will answer
positively to the campaign and to avoid unnecessary costs. They have built a model to classify the customers into two
categories:
Positive Targets (Predicted 1 and Actual 1): The customers who will respond positively to the campaign and need to be
contacted.
Negative Targets (Predicted 0 and Actual 0): The customers who will respond negatively to the campaign and don't need
to be contacted.
By default, the application proposes to contact 24.1% of the population (see 1 on the graphic below).
Note
The population is sorted in decreasing score order.
Note
Beyond this threshold, customers will not be targeted for marketing actions.
24.1% of the population (see 3) is considered positive cases and is selected for the marketing campaign.
The percentage of "True Positives" is 16.31% (see 4), whereas the percentage of actual positives is 23.86% (see 5).
The classification rate is 84.6%. This means that almost 85% of the customers will be correctly classified into the two
categories (answer positively/answer negatively to the campaign) when you apply the predictive model to the validation
data source.
The sensitivity is 68.35%. This means that almost 70% of the customers who will answer positively to the campaign are
correctly predicted as positive targets. These customers will be selected for the campaign.
The specificity is 89.77%. This means that almost 90% of the customers who will answer negatively to the campaign are
correctly predicted as negative targets. These customers will not be contacted for the campaign.
The precision is 67.67%. This means that almost 70% of the customers predicted to answer positively to the campaign
are correctly classified. These customers will be part of the campaign.
The fall-out is 10.23%. This means that almost 10% of the customers who will answer negatively to the campaign are
classified as positive targets and will be selected for the marketing campaign.
You can visualize your profit based on the selected threshold, or automatically select the threshold based on your profit
parameters.
Set the threshold that determines which values are considered positive (see the related link) and provide the
following:
A Cost Per Predicted Positive: you define a cost per observation classified as positive by the Confusion Matrix. This
covers the costs for both True Positive targets (actual positive targets that have been predicted positive) and False
Positive targets (actual negative targets that have been predicted positive).
A Profit Per Actual Positive: you define a profit per True Positive target (targets correctly predicted as positive) identified
by the Confusion Matrix.
The Total Profit table is updated accordingly to calculate your profit/cost. You obtain an estimation of the gap between the gain
of the action based on a random selection (without any predictive model) and the gain based on the selection.
To see the threshold that will give you the maximum profit for the profit parameters you have set, click Maximize Profit.
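The profit arithmetic behind this tab can be sketched as follows. The unit cost and profit echo the campaign example in this section (2€ per contacted customer, 20€ per actual positive), but the counts are hypothetical:

```python
def total_profit(tp, fp, cost_per_predicted_positive, profit_per_actual_positive):
    """Profit of contacting all predicted positives: profit on TPs minus cost on TP + FP."""
    contacted = tp + fp  # every predicted positive is contacted, and costs money
    return profit_per_actual_positive * tp - cost_per_predicted_positive * contacted

# Hypothetical counts at a given threshold
tp, fp = 2000, 950
print(total_profit(tp, fp, cost_per_predicted_positive=2.0,
                   profit_per_actual_positive=20.0))  # 34100.0
```

Maximize Profit effectively searches over thresholds for the (TP, FP) pair that makes this quantity largest.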
Associate a cost/profit
Example
As an example to understand how the profit simulation works, we will consider the same example as for the error matrix.
In our Confusion Matrix example (see the related link for more information), we have decided on the following
threshold:
The marketing department has estimated that the cost per contacted customer is 2€ and that the profit per customer
who really answers positively is 20€ (see 3).
The total profit matrix is updated accordingly and displays the following results:
You obtain an estimation of the gap between the profit of the action based on a random selection (without any predictive
model), 8,314€ (see 4), and the profit based on this selection, 34,634€ (see 5).
Example
If, with those unit cost/profit values (see 1 on the graphic below), you select the option Maximize Profit, the matrix is updated
as follows:
To maximize your profit, the application recommends targeting 50.5% of the population, not 24.1% (see 2). This would
represent 95.3% of the detected target (see 3). Using this proposed threshold, the profit will be 44,114€ (see 4).
Use the Performance Curves tab to compare the performance of your predictive model to a random and a hypothetically
perfect predictive model.
Related Information
The Detected Target Curve
The Lift Curve
The Sensitivity Curve (ROC)
The Lorenz Curves
The Density Curves
The Detected Target Curve compares your predictive model to the ideal and random predictive models. It lets you determine
the percentage of the population to contact to reach a specific percentage of the actual positive target.
Example
A company wants to run a mailing campaign. They have built a predictive model to determine which customers to send the
campaign to. The predictive model classifies the customers into two categories:
Positive Targets: The customers who will respond to the campaign.
The predictive model debrief displays the following Detected Target curve:
With a random predictive model, you would reach 30% of the positive population (= the population that will respond to
the mailing).
With a perfect predictive model, you would reach 100% of the positive population (= the population that will respond to
the mailing).
With the Smart Predict predictive model (the validation curve), you would reach 78% of the positive population (= the
population that will respond to the mailing).
Related Information
Influencer Contributions
Debriefing Classification Predictive Model Results
Target Statistics
The lift is a measure of effectiveness calculated as the ratio between the results obtained with and without a predictive
model. The lift curve evaluates predictive model performance on a portion of the population.
The Y axis shows how much better your model is than the random predictive model.
Example
A company wants to run a mailing campaign. They have built a predictive model to determine which customers to send the
campaign to.
The predictive model classifies the customers into two categories:
You would reach 3.09 times more positive cases with your predictive model than with a random predictive model.
A perfect predictive model would reach 4.19 times more positive cases than the random predictive model.
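The lift ratio itself is simple arithmetic: the positive rate achieved in the selected portion of the population divided by the overall positive rate. The counts below are hypothetical:

```python
def lift(positives_in_selection, selection_size, total_positives, population_size):
    """Ratio of the positive rate in the selected portion to the overall positive rate."""
    selection_rate = positives_in_selection / selection_size
    overall_rate = total_positives / population_size
    return selection_rate / overall_rate

# Hypothetical: the top-scored 10% of the population holds 309 of the 1000 positives
print(round(lift(309, 1000, 1000, 10000), 2))  # 3.09
```

A lift of 1.0 means the model does no better than random selection on that portion.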
This curve shows the True Positive rate against the False Positive rate as the detection threshold is varied:
The X axis shows [1-Specificity]. It represents the proportion of actual negative targets that have been predicted
positive (False Positive targets).
The Y axis shows the Sensitivity. It represents the proportion of actual positive targets that have been correctly
predicted (True Positive targets).
Each point on the curve represents a Sensitivity/[1 - Specificity] pair corresponding to a particular threshold, so the closer the curve is to the upper left corner, the higher the overall accuracy of the predictive model.
Example
Take the following Sensitivity Curve:
At 40% of False Positive targets (actual negative observations incorrectly predicted as positive) we see the following:
A random predictive model (that is, no predictive model) would classify 40% of the positive targets correctly as True
Positive.
A perfect predictive model would classify 100% of the positive targets as True Positive.
The predictive model created by Smart Predict (the validation curve) would classify 96% of the positive targets as
True Positive.
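Each point of such a curve can be computed from scored observations at a given threshold. The scores and labels below are invented toy data, not Smart Predict output.

```python
# Illustrative sketch: one Sensitivity / [1 - Specificity] pair per
# detection threshold, as plotted on the curve above.
def roc_point(scores, labels, threshold):
    tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
    fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives  # (Sensitivity, 1 - Specificity)

scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(roc_point(scores, labels, 0.5))  # (0.5, 0.5)
```

Sweeping the threshold from 1 down to 0 traces the whole curve from (0, 0) to (1, 1).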
Using the selector, you can display the cumulative percentage for:
[1 - Sensitivity], where Sensitivity is the proportion of the actual positive targets that have been correctly predicted.
Specificity, which is the proportion of actual negative targets that have been correctly predicted.
The X Axis shows the percentage of the population ordered from the lowest to the highest probability. The Y Axis shows [1 - Sensitivity], that is, [1 - the proportion of positive targets classified as True Positive]. This is equivalent to the proportion of missed positive targets.
The results are ordered from the lowest probability (on the left) to the highest probability (on the right).
The X Axis shows the percentage of the population ordered from the lowest to the highest probability, whereas the Y Axis shows the Specificity.
The positive targets would represent the population with a high risk: this population should not be granted credit.
The negative targets would represent the population with a low risk: this population could be granted credit.
For the following examples, we consider a threshold set at 80% of the population with the lowest probability that the customers cannot repay the credit.
Example
We got the following [1 - Sensitivity] Lorenz Curve:
A random predictive model (that is, no predictive model) would fail to identify 80% of the high-risk population (the population that should not be granted credit).
A perfect predictive model would fail to identify only 17% of the high-risk population.
The predictive model created by Smart Predict (the validation curve) would fail to identify 40% of the high-risk population.
A random predictive model would classify 80% of the low-risk population as True Negative (the population that could be granted credit).
A perfect predictive model would classify 100% of the low-risk population as True Negative.
The predictive model created by Smart Predict (the validation curve) would classify 93% of the low-risk population as True Negative.
Understand how the positive targets and the negative targets are distributed in your predictive model.
The Density curves display the density function of the score (probability that an observation belongs to each class) for positive
and negative targets.
The length of an interval is its upper bound minus its lower bound.
The X axis shows the score and the Y axis shows the density.
As a default view, a line chart is displayed with the following density curves:
The blue curve, Positives: This curve displays the distribution of population with positive target value per score value.
The yellow curve, Negatives: This curve displays the distribution of population with negative target value per score value.
As an example, check the density curves below. The first example is a good model because there is a small overlapping zone with low density, meaning the predictive model is good at separating the positive and negative cases. In the second example, by contrast, you see a large zone with high density for both positive and negative cases.
The fewer observations there are and the smaller the score interval for the overlap zone, the better it is.
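The overlap check behind these curves can be approximated by bucketing scores per class and looking for bins populated by both classes. The scores below are invented toy data; this is an illustration of the idea, not Smart Predict's density estimation.

```python
from collections import Counter

# Rough sketch (invented scores): one histogram per class approximates
# the two density curves; shared bins form the overlap zone.
def score_histograms(scores, labels, n_bins=5):
    hist = {0: Counter(), 1: Counter()}
    for score, label in zip(scores, labels):
        bin_index = min(int(score * n_bins), n_bins - 1)
        hist[label][bin_index] += 1
    return hist

scores = [0.05, 0.15, 0.2, 0.8, 0.9, 0.95]
labels = [0, 0, 0, 1, 1, 1]
hist = score_histograms(scores, labels)
overlap = set(hist[0]) & set(hist[1])  # bins populated by both classes
print(overlap)  # empty set: the classes separate cleanly
```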
Example
A good predictive model:
Example
A bad predictive model:
Use the dropdown list to access and analyze the reports on influencers and predictive model performance.
Click each area of the debrief for more information.

What do the values of the two main performance indicators mean?
Quickly check if your predictive model is accurate and robust by checking the two main performance indicators. Root Mean Squared Error (RMSE) measures the average difference between the values predicted by your predictive model and the actual values. The smaller the RMSE value, the more accurate the predictive model is.

How does the target appear in the different data sources?
Get some information on the target value in each data source.

Which influencers have the highest impact on the target?
Check how the top five influencers impact the target. Only the top five contributing influencers appear in the report.

Which group of categories has the most influence on the target?
In the Influencer Contributions report, analyze the influence of grouped categories on the target. For more information, refer to Category Influence, Grouped Category Influence, and Grouped Category Statistics.

Can I see any errors in my predictive model? Is my predictive model producing accurate predictions?
Compare the prediction accuracy of your predictive model.

What's next? You have two possibilities:
You are satisfied with your predictive model's performance. Then you can use it: see Generating Your Predictions.
You would like to improve your predictive model's performance: update the settings of your predictive model and retrain it (see Define Settings and Train a Classification or Regression Predictive Model; caution: you will erase the previous version!), or delete your predictive model (see Deleting a Predictive Model).
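The RMSE described above has a direct formula. A minimal sketch with made-up values:

```python
# Minimal RMSE sketch (toy values): the average magnitude of the difference
# between predicted and actual target values. Smaller = more accurate.
def rmse(actual, predicted):
    n = len(actual)
    return (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n) ** 0.5

print(rmse([3.0, -0.5, 2.0], [2.5, 0.0, 2.0]))
```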
This chart shows the accuracy of your predictive model. It displays the actual target value as a function of the prediction.
To build the graph, Smart Predict groups these predictions into 20 segments (or bins). Each segment represents roughly 5% of the population.
For each segment, the graph plots the mean of the predictions (Segment Mean) and the mean of the actual target values (Target Mean).
The Validation - Actual curve shows the actual target values as a function of the predictions.
The hypothetical Perfect Model curve shows that all the predictions are equal to the actual values.
The Validation - Error Min and Validation - Error Max curves show the range for the actual target values.
For each curve, a dot on the graph corresponds to the segment mean on the X-axis, and the target mean on the Y-axis.
The area between the Error Max and Error Min represents the possible deviation of your current predictive model: it's the confidence interval around the predictions.
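The segment construction described above can be sketched as follows. The data is invented for illustration; the exact binning Smart Predict uses is not reproduced here.

```python
# Hedged sketch (toy data): group observations into ~5% segments by
# predicted value and compute, per segment, the segment mean and the
# target mean plus the min/max band, as plotted on the chart.
def predicted_vs_actual_segments(predictions, actuals, n_segments=20):
    paired = sorted(zip(predictions, actuals))
    size = max(1, len(paired) // n_segments)
    points = []
    for start in range(0, len(paired), size):
        chunk = paired[start:start + size]
        preds = [p for p, _ in chunk]
        acts = [a for _, a in chunk]
        points.append((sum(preds) / len(preds),   # segment mean (X axis)
                       sum(acts) / len(acts),     # target mean (Y axis)
                       min(acts), max(acts)))     # error min / error max
    return points

points = predicted_vs_actual_segments(list(range(40)), [x + 1 for x in range(40)])
print(len(points), points[0])
```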
What can the chart tell you about your predictive model's accuracy?
You can draw three main conclusions from your Predicted vs. Actual chart, depending on the relative positions of the curves on the graph:
The Validation and Perfect Model curves diverge significantly: your predictive model isn't accurate. Confirm this conclusion by checking the prediction confidence indicators. If the indicators confirm your predictive model isn't reliable, you can improve its accuracy by adding more observations or influencers to your training data source.
The Validation and Perfect Model curves match closely: your predictive model is accurate. Confirm this conclusion by checking the prediction confidence indicators. If the indicators confirm its reliability, you can trust your predictive model and use its predictions.
The Validation and Perfect Model curves match closely but diverge significantly on a segment: your predictive model is accurate, but its performance is hindered in the diverging segment. Confirm this conclusion by checking the prediction confidence indicators. If the indicators confirm its overall reliability, you can improve that segment's predictions by adding more observations or influencers to it in your training data source.
If your Predicted vs. Actual chart falls between any of these three cases, the prediction confidence indicators remain the best way to assess your predictive model's accuracy.
Example
You are working for an insurance company. You want to adapt clients' premium rates according to their age while accounting for their risk of sudden death. You want to make sure the age tiering is accurate.
The predictive model debrief displays the following Predicted vs. Actual graph:
In our example, when the prediction (in blue) is 45 years old, the actual value (the "validation value" taken from our historical data) is 44.75 years old. The error min and error max calculated by our predictive model are 33.17 and 56.34 years old, respectively.
As you can see, the blue curve (our predictive model) and the green curve (the hypothetical perfect model) are very similar, which means that you can rely on the predictions.
Target Statistics
For a continuous target, the Target Statistics give descriptive statistics for the target variable in each data source:
Name: Meaning
Minimum: Minimum value found in the data source for the target variable.
Maximum: Maximum value found in the data source for the target variable.
Standard Deviation: Measure of the extent to which the target values are spread around their average.
Analyze the reports to get information on your predictive model composition and evaluate your predictive model performance.
Click each area of the debrief for more information.

Is the main performance indicator high enough to consider my predictive model robust and accurate?
Check the quality of your predictive model performance with the Horizon-Wide MAPE. The Horizon-Wide MAPE is the evaluation of the "error" that would be made if the forecast was calculated in the past, where the actual values are known. A Horizon-Wide MAPE of zero indicates a perfect predictive model. The lower the Horizon-Wide MAPE, the better your predictive model performance. For more information, refer to Horizon-Wide MAPE.

What are the predicted values provided by the predictive model?
Analyze the predicted values for the predictive model over a set of known data from the training data source.

How accurate is my predictive model?
Use the Forecast vs. Actual graph to visualize the predicted values (predictive forecast) and actual values (signal) for the training data source. You can then quickly see how accurate your predictive model is, what the outliers are, and the zone of possible errors. Check if there are outliers in the forecasts and detect anomalies on the signal. For more information, refer to The Forecast vs. Actual Graph, The Predictive Forecasts, The Signal Outliers, and The Signal Anomalies.

What are the past data that most influence the signal?
Identify whether the signal is influenced by the recent past or far past in the case of an autoregressive component. The lags are numbered with negative integers representing their distance in the past from the predictive forecast. Lag -1 is the point in the past just before the forecast. Lag -5 is five points in the past. The higher the absolute value, the further the point is in the past. For more information, refer to Past Signal Value Contributions.

What's next? You have two possibilities:
You are satisfied with your predictive model's performance. Then you can use it: see Saving Predictive Forecasts Generated by a Time Series Predictive Model into a Dataset or Saving Predictive Forecasts Back into Your Planning Model.
You would like to see if you can improve your predictive model's performance:
Duplicate your current predictive model and experiment with updated settings. You can then compare the two versions and find the best one. See Duplicating a Predictive Model.
Update the settings of your predictive model and retrain it. See Define Settings and Train a Time Series Predictive Model. Caution: you will erase the previous version!
Delete your predictive model. See Deleting a Predictive Model.
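The Horizon-Wide MAPE mentioned above builds on the standard Mean Absolute Percentage Error formula. A minimal sketch with invented numbers (the exact horizon-wide aggregation Smart Predict applies is not reproduced here):

```python
# MAPE sketch (toy data): average of |actual - forecast| / |actual| over
# points where the actual values are known. 0 means a perfect model.
def mape(actuals, forecasts):
    return sum(abs(a - f) / abs(a)
               for a, f in zip(actuals, forecasts)) / len(actuals)

print(mape([100.0, 200.0], [110.0, 180.0]))  # 0.1, i.e. a 10% average error
```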
Viewing Entities
If you choose to get predictive forecasts per entity, the reports for each entity are available in the Forecast and Explanation tabs. If there are fewer than 20 entities, these reports are available automatically following training. You select the column values that appear together to form an entity (for example, Product X, Store Y) from the top left dropdown list in both tabs to view its report.
If a predictive model contains more than 20 entities, the reports for each entity are not available automatically following training but are created on demand. You just have to select the entity, and after a slight delay, the reports are created and made available. This ensures that time isn't lost creating reports for predictive models with a high number of entities when not all of those reports may be required at once. Once a report is available, you can access it immediately at any time afterwards.
The Signal Statistics are described in detail in the Forecast tab of your report. They describe the signal (target): the minimum, maximum, and average (mean) values, as well as the standard deviation.
Note
If you choose to get predictive forecasts per entity, you have this information for each entity.
The Forecast vs. Actual graph appears in the Forecast tab of your report. It shows curves for the predicted values (forecast) and actual values (signal) for the time series data source, so you can quickly see how accurate your predictive model is. The predictions are displayed at the end of the graph.
For each forecasted value, the predictive model shows an estimation of the minimum and maximum error. The area between this upper and lower limit of the possible errors in the predictive forecasts produced by your predictive model is called the confidence interval. It's only displayed for the predictive forecasts.
Outliers are values marked with a red circle on the graph (see The Signal Outliers for more information). The forecasting error
indicator is the absolute difference between the actual and predicted values. This is also called the residue. The residue
abnormal threshold is set to 3 times the standard deviation of the residue values on an estimation (or validation) data source.
The forecasted value and error limit values are listed in the table for each predictive forecast.
Note
If you choose to get predictive forecasts per entity, you have this information for each entity.
Note
You can display the data as a table. See Customizing the Visualization of Your Debrief.
Related Information
The Signal Outliers
The Predicted Forecasts are described in the Forecast tab of your report.
Error min and max: Minimum and maximum deviation measures of the values around the predictive forecasts.
Note
If you choose to get predictive forecasts per entity, you have this information for each entity.
Anomalies are signal values that are outside the zone of possible error for the predictive forecast, which is defined by the upper and lower limits.
Example
Your facilities department wants to monitor the electrical consumption of your building. The signal is very regular with
consumption peaks in the day time, low consumption in the night, and some seasonalities related to vacations, for example.
A predictive model based on this signal will forecast a very low consumption at 11:00 PM.
At 11:15 PM, the predictive model is re-forecasted and the actual consumption for 11:00 PM is known. It is very far from what the predictive model expected: an anomaly is detected.
Note
If you choose to get predictive forecasts per entity, you have this information for each entity.
The default view (table) displays details about the outliers on the signal and the predictive forecasts. An actual signal value is qualified as an outlier once its corresponding forecasting error is considered abnormal relative to the forecasting error mean observed on the estimation data source. The forecasting error indicator is the absolute difference between the actual and predicted values. This is also called the residue. The residue abnormal threshold is set to 3 times the standard deviation of the residue values on an estimation (or validation) data source.
Note
If you choose to get predictive forecasts per entity, you have this information for each entity.
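The 3-standard-deviation residue threshold described above can be sketched as follows. The series is invented for illustration; Smart Predict's exact estimation procedure is not reproduced here.

```python
# Hedged sketch (invented series): flag signal values whose residue
# (absolute forecasting error) exceeds 3 standard deviations of the
# residues, mirroring the abnormal-residue threshold described above.
def residual_outliers(actuals, forecasts):
    residues = [abs(a - f) for a, f in zip(actuals, forecasts)]
    mean = sum(residues) / len(residues)
    std = (sum((r - mean) ** 2 for r in residues) / len(residues)) ** 0.5
    return [i for i, r in enumerate(residues) if r > 3 * std]

actuals = [10.0] * 20 + [100.0]   # one sudden spike in an otherwise flat signal
forecasts = [10.0] * 21
print(residual_outliers(actuals, forecasts))  # [20]: only the spike is flagged
```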
Note
We describe the modeling techniques used by Smart Predict in the two tables below. In the first table, we describe the breakdown modeling technique that may be applied when you're working with a time series with limited disruptions. In the second table, we describe the smoothing technique that may be applied when you're working with a disrupted time series that doesn't follow a regular trend or cycle.
1. When a time series breakdown technique is used to create your predictive model,
your report can contain the following information:
Information Description
Textual explanation The textual explanation describes the modeling technique that is used to calculate the forecast. This is the textual explanation you see in the report:
The predictive model was built by breaking down the time series
into different components.
Trend The Trend is the general orientation of the time series. The report
can show linear or quadratic trends.
For example, the predictive model can detect that the previous 37 values have an impact on the actual values. For more information, refer to the chapter called Past Signal Value Contributions.
Residuals Residuals refer to what is left when the trend, cycles, and fluctuations have been extracted from the initial time series. Residuals are neither systematic nor predictable. They reflect the part of the signal that Smart Predict can't explain or model. The smaller the residuals, the better the predictive model. A good predictive model produces residual data that contains no pattern.
2. When a time series smoothing technique is used to create your predictive model,
your report can contain the following information:
Information Description
Textual explanation The textual explanation describes the modeling technique that is used to calculate the forecast. This is the textual explanation you see in the report:
The predictive model was built incrementally by smoothing the
time series, with more weight given to recent observations.
Cycles A time series predictive scenario can detect seasonal cycles, with or without amplitude variations. These cycles are calculated using an algorithm that applies an exponential smoothing technique to the past data over time. The recurrence of seasonal cycles is based on a calendar time unit such as day, week, or month. The report shows the recurrence of the cyclic pattern. The following seasonal cycles can be detected:
Residuals Residuals refer to what is left when the trend and cycles have been
extracted from the initial time series. Residuals are neither
systematic nor predictable. The smaller the residuals, the better
the predictive model. A good predictive model produces residual
data that contains no pattern.
Note
If you choose to get predictive forecasts per entity, you have this information for each entity.
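The smoothing idea in the second table can be illustrated with the simplest member of that family, simple exponential smoothing. Smart Predict's actual algorithm is more elaborate; this is only a sketch of the weighting principle, with an invented series.

```python
# Simple exponential smoothing sketch (toy series): each smoothed value
# mixes the newest observation with the previous smoothed value, so more
# weight is given to recent observations.
def exponential_smoothing(series, alpha=0.5):
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

print(exponential_smoothing([0.0, 10.0, 10.0]))  # [0.0, 5.0, 7.5]
```

A larger `alpha` makes the model react faster to recent changes; a smaller one makes it smoother.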
Related Information
Past Signal Value Contributions
The Past Signal Value Contributions are described in the Fluctuations section of the Explanation tab of your report.
When identifying the model components, Smart Predict found that previous values of the time series have an impact on the actual values.
The Past Signal Value Contributions graph shows how the signal is influenced by the recent past, or the distant past in the case of an autoregressive component.
The lags are numbered with negative integers representing their distance in the past from the predictive forecast. Lag -1 is the
point in the past just before the forecast. Lag -5 is ve points in the past.
Example
Let's take the following example: we have created a predictive model to forecast the ozone rate for the next 12 months.
We have obtained the following Past Signal Value Contributions graph:
This graph lets you identify whether the ozone rate is influenced by observed values in the recent or distant past. It also shows the most important dates. The lags are numbered with negative integers that represent how far back in the past they are from the predictive forecasts. Smart Predict found that the 10 previous values have an impact on the subsequent values, which is why the graph stops at lag -10. Using these lags, you can analyze how the previous values influenced the subsequent ones. Here you see that lags -7 and -6 are very influential.
Note
If you choose to get predictive forecasts per entity, you have this information for each entity.
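A rough way to see how a lag can influence subsequent values is to correlate the series with a shifted copy of itself. This illustrates the idea only; it is not Smart Predict's actual lag-selection algorithm, and the seasonal series below is invented.

```python
# Hedged sketch (toy data): Pearson correlation between a series and
# itself shifted by `lag` points, a crude proxy for a lag's influence.
def lag_correlation(series, lag):
    x, y = series[:-lag], series[lag:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

season = [1.0, 2.0, 3.0, 4.0] * 6   # a perfectly seasonal signal, period 4
print(lag_correlation(season, 4))   # ~1.0: lag -4 fully explains the value
```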
Context
To customize the generated debrief, use the Visualizations Settings dialog box. Depending on the metrics you want to display,
different options are available.
Remember
The customized settings are stored for each type of predictive model and user. For example, if you customize a classification debrief for a given predictive model, the debrief of all the other classification predictive models is updated with the same change. However, the debrief settings for other users are not affected; their customizations are unchanged.
Note
To return to the default visualization settings, click Reset.
Procedure
1. Click the icon next to the area you want to customize.
2. In the Data section, change the sorting/ranking options if required. These depend on the active metric.
3. In the Display section, select another chart from the Type list. The visualization types available depend on what kind of
metrics you have.
Note
For example, a scatter plot visualization is only available if the metrics include at least two measures.
4. In the Analysis section, select the information for the type of chart that you want to display.
According to the type of visualization you have chosen at step 3, you can assign data fields to the feeding area of your visualization.
For example, if you select a column chart to display the metrics, you define which data field to display on the X and Y axes. Also, you can map the data series of the chart to different metrics using the One color per option.
Note
When you change the type of visualization, the displayed information fields change accordingly. For example, if you choose a bar chart, you can select the category influencers to display on the X axis, the influencer or measure for the Y axis, and the colors for either the influencer or measure values. You can also choose to display a single chart, or separate charts for each influencer or measure. These display options would change if you chose a table, for which you would only have columns to select.
5. In the Interactivity section, select an element that limits the chart display to data that interacts with only that element. Depending on the type of chart, this could be a single influencer or measure.
Note
Some elements are mandatory because they are data fields that qualify the metrics. You must assign the mandatory elements to the boxes that specify the chart elements. Otherwise, the missing data field is automatically entered in the Selectors box.
Related Information
Choosing the Right Chart for Your Debrief
Debriefing Classification Predictive Model Results
Debriefing Regression Predictive Model Results
Debriefing Time Series Predictive Model Results
The type of chart available depends on whether it is appropriate for visualizing the type of influencer analysis in your debrief. Depending on how you are visualizing your model data, you can choose from the following chart types:
Pie
Use to: Compare categorical data as percentages. If you have more than 10 contributing influencers, a bar or column chart would show a clearer view of the data spread.
Example: Show the predicted percentage breakdown of contributing voting regions to the results of a national election.

Bar
Use to: Compare categorical data along the vertical axis by the category count or percentage on the horizontal axis, displayed as bars.

Column
Use to: Show the same information as a bar chart, with the axes interchanged: influencers along the horizontal axis by the group count, or percentage values on the vertical axis, displayed as columns.

Example (Bar and Column): You have an employee turnover predictive model that predicts the potential churn level for staff. Your target variable is Employee Churn Estimate. The influencers Marital-status, Age, Qualification Level, Salary, Recent Promotion, and Training Participation are plotted along one axis, and the percentage contribution of each category to Churn Estimate is plotted as a bar or column on the other.

Table
Use to: Represent the same type of information as column and bar charts, but in a table format where the categorical influencers are represented as columns, with the count or percentage values in row cells.
Example: The columns would be the category influencers Marital-status, Age, Qualification Level, Salary, Recent Promotion, and Training Participation, and the percentage contribution of each category to Employee Churn Estimate appears in each cell row.

Radar
Use to: Display data for multiple influencers in two dimensions, with multiple categories represented on radial axes.
Example: You have a predictive model to predict sales of candy. Your target variable is Chocolate Sales, and you plot different chocolate flavors around the radial axes of a radar chart. Your categorical influencer measured over the axes is three brands of chocolate. The spread of sales figures around the axes would give a good idea of which brands would do better than others for the same flavor.

Tag Cloud
Use to: Represent category influencer names as text juxtaposed graphically on a canvas, where the font size of each text label indicates the influence on the target variable. Tag clouds are useful when the influencer names have semantic significance, for example keywords in a Twitter feed, country names, companies' stock market values, or different television shows' audience ratings for a night's viewing.
Example: A retail chain selling multimedia and cultural products wants to venture into publishing to produce a compilation of "retro" styled detective stories. The target audience is younger readers not familiar with traditional detective characters. They develop a predictive model including influencers such as education level, age, and buying history for DVDs, books, games, MP3s, and streaming video, to predict a possible taste for different detective profiles. The results could easily be represented as a tag cloud with the names most likely (or not) to appeal, for example Sherlock Holmes, Father Brown, Miss Marple, Hercule Poirot, Auguste Dupin, Philip Marlowe, and others.

Line
Use to: Show a model performance curve.
Example: You want to see the performance curve of your training predictive model compared to the validation and random plots, for a predictive model that predicts what percentage of a population is identified positively as having a disease after being tested with a new screening test.

Bubble
Use to: See the correlation between two influencers, one dependent on the other. The correlation is represented by a third influencer at the plot position, and the area of the plot shows the magnitude of the relationship.
Example: You have a predictive model to predict fatal car accidents. Using a bubble chart, you could evaluate the dependency between influencers such as "Car Accident Frequency" and "Speed", with a categorical influencer of Yes or No for Fatality.
You've assessed the performance of your predictive model and you're confident using it to generate predictions.
The process for generating the predictions can differ depending on the type of predictive model and the type of data source used.
Generating and Saving the Predictions for a Classification or Regression Predictive Model
Context
You want to generate and save the predictions for a classification or regression predictive model.
Procedure
1. Open the relevant predictive model.
3. In the Apply To Population section, select the application dataset you want to apply your predictive model to. Don't forget that this dataset must be prepared beforehand; it cannot be created at this step.
4. In the Generated Dataset section, select the additional columns you want to have in your generated dataset:
Replicated Column: select which columns from the training data source should be replicated in the generated dataset.
Restriction
If your application dataset contains more columns than your training dataset, the additional columns will be ignored by the application process.
Statistics & Predictions: This is information about your predictive model that you want to have in the generated
dataset.
Apply Date: The start date of the predictive model application. The type of the column is TIMESTAMP.
Train Date: The start date of the predictive model training. The type of the column is TIMESTAMP.
Statistics: select the statistics regarding the influencers you want to save in your dataset:
Statistic Description
Assigned Bin When selected, individuals in the application population are assigned to reference quantiles defined on the validation population.
Assigned bins explained: the validation population during training is spread out in quantiles (bins), each defined by a range of scores, to serve as references (assigned bins) for an application population. When a predictive model is applied, each individual in the application population is allocated to an assigned bin based on its predicted score. As each assigned bin represents 10% of the training population, if the population structure is unchanged, this percentage should remain stable on the application population. If this is not the case, it doesn't mean that the predictive model is no longer accurate, rather that the structure of the population has changed. For example, there are more or fewer potential churners now than in the past. The accuracy of the predictions should be monitored to back up the decisions.
Note
The number of bins is set to 10 and isn't customizable.
See the section How does Smart Predict Create Assigned Bins? for information on using assigned bins.
Outlier Indicator For each row in the application dataset, the Outlier Indicator is 1 if the row is an outlier
with respect to the target, otherwise 0.
Predictions: select the predictions you want to save in your dataset:
Predicted Category (classification predictive models, nominal target with 2 values only): For each row in the application dataset, the Predicted Category is the target category determined by the predictive model. The percentage of predicted target categories found in the application dataset corresponds to the Contacted Population percentage that is set by default when entering the Confusion Matrix. Any change done by the user in the Confusion Matrix does not affect the Predicted Category in the generated dataset.
Prediction Probability (classification predictive models, nominal target with 2 values only): For each row in the application dataset, the Prediction Probability is the probability that the Predicted Category is the target value.
Predicted Value (regression predictive models, continuous target): For each row in the application dataset, the Predicted Value is the value predicted for the target.
Note
If you do not select any statistics or predictions, only the target and the key influencer(s) are included.
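For classification models, the relationship described above between the Predicted Category and the Contacted Population percentage amounts to ranking rows by probability and giving the positive category to the top share. The following Python sketch illustrates that idea only; it is not Smart Predict's actual implementation, and the function name, category labels, and probabilities are made up.

```python
def predicted_categories(probabilities, contacted_share=0.25,
                         positive="yes", negative="no"):
    """Assign the positive category to the rows with the highest
    probabilities, up to the contacted-population share; all other
    rows receive the negative category."""
    n_positive = round(len(probabilities) * contacted_share)
    # Row indices ranked from the most to the least probable.
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    labels = [negative] * len(probabilities)
    for i in ranked[:n_positive]:
        labels[i] = positive
    return labels

probs = [0.9, 0.1, 0.6, 0.4]
labels = predicted_categories(probs, contacted_share=0.25)
# Top 25% of 4 rows = 1 row: only the 0.9 row gets the positive category.
```

Changing the contacted share moves the cut-off, which is why edits to the Confusion Matrix afterwards do not alter the Predicted Category already written to the generated dataset.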
5. Click Apply.
The status of your predictive model is updated to <Applied> and you can find your generated dataset with the predictions under Main Menu Browse Files . You can then access your results directly by opening the generated dataset or, depending on your business needs, consume the output dataset in a BI story.
When applying a classification or regression predictive model to an application dataset, you can request the statistics information on Assigned bins. But what are Assigned bins and how should they be leveraged?
During the training step, Smart Predict uses past observations compiled in a training dataset to create a predictive model.
For a classification predictive model: Smart Predict associates with each observation (customer, product, etc.) a probability that an event (target) occurs. Then, it uses this probability to group the observations, ranked in decreasing order from the most probable to the least probable, into 10 bins (or groups). Each bin represents 10% of those observations, and within each bin the observations have the same level of probability.
For a regression predictive model: Smart Predict associates with each observation a predicted value. Based on this value, it groups the observations, ranked from the highest to the lowest predicted value, into 10 bins (or groups). Each bin represents 10% of those observations, and within each bin the observations have the same value or range of values.
During the application step, Smart Predict refers to the bins defined in the training step to assign the current observations from the application dataset to the relevant bin. It compares each value obtained by the predictive model with the limits of each assigned bin defined in the training step, then assigns each observation to the relevant bin.
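The two steps above (fixing decile boundaries on the training scores, then slotting application rows into those fixed bins) can be sketched in Python. This is an illustrative reimplementation under stated assumptions, not Smart Predict's code; the random scores stand in for predicted probabilities.

```python
import numpy as np

def define_bins(validation_scores, n_bins=10):
    """Training step: compute decile boundaries on the validation scores."""
    boundaries = np.quantile(validation_scores, np.linspace(0, 1, n_bins + 1))
    # Widen the outer edges so any application score falls into some bin.
    boundaries[0], boundaries[-1] = -np.inf, np.inf
    return boundaries

def assign_bins(application_scores, boundaries):
    """Application step: place each score into the bin whose score range
    contains it. Bin 1 holds the highest scores, bin 10 the lowest."""
    n_bins = len(boundaries) - 1
    idx = np.searchsorted(boundaries, application_scores, side="right") - 1
    idx = np.clip(idx, 0, n_bins - 1)
    return n_bins - idx  # invert so bin 1 = most probable

rng = np.random.default_rng(0)
validation_scores = rng.random(1000)  # 1,000 training-time observations
boundaries = define_bins(validation_scores)
application_scores = rng.random(700)  # 700 new observations
bins = assign_bins(application_scores, boundaries)
```

Because the boundaries are frozen at training time, the share of application rows per bin is free to drift away from 10%, which is exactly the signal the monitoring discussion below relies on.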
Example
Let's take the following example: you want to know whether customers will buy your new product "P". You train your predictive model using a training dataset containing past observations for 1,000 customers. As a result, Smart Predict has binned your observations as follows:
Bin number Number of customers in the bin Average probability to buy "P"
Then, you use your predictive model to get predictions on a new set of customers. Let's say your application dataset
contains observations on 700 customers.
Smart Predict will give you the following result in the generated dataset:
Bin number Number of customers in the bin Estimation of the probability to buy "P"
6 45 customers (~ 6%) 8%
7 50 customers (~ 7%) 7%
8 35 customers (~ 5%) 4%
9 32 customers (~ 5%) 3%
10 88 customers (~13%) 1%
Note
It can happen that the distribution of the observations is not even (10% of observations in each bin). The fact that the structure of the population has changed does not mean that the predictive model is no longer relevant (see next point).
Monitoring the population structure: Dividing the dataset into bins means that each bin should contain +/-10% of the observations. However, if this changes, it indicates that your population is changing. For example, advertising on social media sites might attract more young customers rather than other age groups. This doesn't mean that the predictive model is no longer efficient, but it may be an alert to check its performance with more data from the recent past (than the data used to train the model).
Example
Having a look back at the example above, you can see that the distribution per bin in the generated dataset is not the same as in the training dataset. For example, for bin 1, we have 200 customers, which corresponds to 28% of the dataset. It could simply be because you have more young customers, but with the same buying behaviour as the young customers in the training population.
Monitoring the predictive model performance: Once the predictive model has been applied, it is easier to analyze the classification performance by bins rather than by interpreting the performance curve. Use the classification rate (see The Metrics) calculated at the training step for each bin, and detect any variation of this rate when applying your predictive model.
Example
In the following example, you want to predict the deal values for the next quarter. Your training dataset contains
observations on 3,000 customers.
1 300 customers (= 10% of the dataset) Predicted values between 90,001 and 100,000 $
2 300 customers (= 10% of the dataset) Predicted values between 80,001 and 90,000 $
3 300 customers (= 10% of the dataset) Predicted values between 70,001 and 80,000 $
4 300 customers (= 10% of the dataset) Predicted values between 60,001 and 70,000 $
5 300 customers (= 10% of the dataset) Predicted values between 50,001 and 60,000 $
6 300 customers (= 10% of the dataset) Predicted values between 40,001 and 50,000 $
7 300 customers (= 10% of the dataset) Predicted values between 30,001 and 40,000 $
8 300 customers (= 10% of the dataset) Predicted values between 20,001 and 30,000 $
9 300 customers (= 10% of the dataset) Predicted values between 10,001 and 20,000 $
10 300 customers (= 10% of the dataset) Predicted values between 0 and 10,000 $
Then, you use your predictive model to get predictions on a new set of customers. Let's say your application dataset
contains observations on 800 customers.
Smart Predict will give you the following result in the generated dataset:
1 110 customers (~ 14% of the dataset) Predicted values between 90,001 and 100,000 $
2 100 customers (~ 13% of the dataset) Predicted values between 80,001 and 90,000 $
You can use Assigned Bins to monitor the population structure: as each bin should contain +/-10% of the observations, if these figures increase or decrease for one or several bins, it indicates that your population is changing and you might need to retrain your predictive model with more recent data. For example, having a look back at the example above, you can see that the distribution per bin in the generated dataset is quite similar to that in the training dataset. However, we could have different results: for example, for bin 1, we could have 300 customers, which corresponds to 37.5% of the dataset.
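The monitoring rule above (each bin should stay near 10% of the application population) is easy to automate. The sketch below is illustrative only: the 5-percentage-point tolerance is an arbitrary choice, not a Smart Predict default, and the sample bin assignments are toy data loosely mimicking the classification example earlier.

```python
from collections import Counter

def bin_distribution(assigned_bins, n_bins=10):
    """Share of the application population falling into each assigned bin."""
    counts = Counter(assigned_bins)
    total = len(assigned_bins)
    return {b: counts.get(b, 0) / total for b in range(1, n_bins + 1)}

def drifted_bins(assigned_bins, expected=0.10, tolerance=0.05):
    """Bins whose share deviates from the expected ~10% by more than
    the tolerance -- a hint that the population structure is changing."""
    dist = bin_distribution(assigned_bins)
    return {b: share for b, share in dist.items()
            if abs(share - expected) > tolerance}

# Toy application population of 700 rows: bin 1 is over-represented
# (196 rows = 28%), mimicking the "more young customers" situation.
sample = ([1] * 196 + [2] * 70 + [3] * 70 + [4] * 70 + [5] * 70 +
          [6] * 45 + [7] * 50 + [8] * 35 + [9] * 42 + [10] * 52)
flagged = drifted_bins(sample)
# Only bin 1 (28%) falls outside the 10% +/- 5% band and is flagged.
```

As the text stresses, a flagged bin is an alert to investigate, not proof that the model is wrong: the next step would be to check the classification rate per bin against recent data.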
You've assessed the performance of your predictive model and you're confident about saving the predictive forecasts into a dataset.
Context
To save the predictive forecast results into a dataset:
Procedure
1. Open the relevant predictive model.
4. Click Save.
In the predictive model list, the status of your predictive model is updated to <Applied>. You can find your generated dataset with the forecasts under Main Menu Browse Files .
Here are the columns that are added to your generated dataset:
Forecast: This is the column where you find the forecast values for the signal, based on the number of requested forecasts specified in the predictive model settings.
Error Min: For each requested forecast at a given horizon H, the predictive model calculates a confidence interval. The Error Min value is the lower bound of this confidence interval. It is equal to the forecasted value – sigma(RMSE)*1.96, where sigma(RMSE) represents the standard deviation of the RMSE between the actual and forecasted signal value at horizon H. The weighting value of 1.96 corresponds to a confidence level of 95%.
Error Max: For each requested forecast at a given horizon H, the predictive model calculates a confidence interval. The Error Max value is the upper bound of this confidence interval. It is equal to the forecasted value + sigma(RMSE)*1.96, where sigma(RMSE) represents the standard deviation of the RMSE between the actual and forecasted signal value at horizon H. The weighting value of 1.96 corresponds to a confidence level of 95%.
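The Error Min / Error Max formulas above amount to a standard 95% confidence band around each forecast. Here is a minimal sketch, assuming you already have the forecast values and a per-horizon sigma(RMSE) estimate; the numbers are made up for illustration.

```python
def confidence_band(forecasts, sigma_rmse, z=1.96):
    """Lower/upper bounds per horizon: forecast -/+ sigma(RMSE) * z,
    where z = 1.96 corresponds to a 95% confidence level."""
    bands = []
    for f, s in zip(forecasts, sigma_rmse):
        bands.append((f - z * s, f + z * s))
    return bands

# Toy 3-step forecast; uncertainty typically grows with the horizon H.
forecasts = [120.0, 125.0, 131.0]
sigma_rmse = [4.0, 5.5, 7.0]
bands = confidence_band(forecasts, sigma_rmse)
# e.g. horizon 1: (120.0 - 1.96*4.0, 120.0 + 1.96*4.0) = (112.16, 127.84)
```

Error Min and Error Max in the generated dataset correspond to the first and second element of each pair.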
2. Specify the Private Version to which you want to save the generated forecasts.
Tip
Private versions are initially visible only to the person who created them. You can make them visible to other users by sharing them: shared versions are private versions that you allow other people to see.
4. To also save forecasts for past periods, select Advanced Settings Save Forecasts for Past Period , and change the
default setting from Off to On.
Note
When Save Forecasts for Past Period is set to Off (the default), only forecasts for future periods are saved to the private version of your input planning model, in the measure that was selected as the Signal. When it is set to On, forecasts for past periods are also saved to the private version of your planning model. This means you can assess the performance of your predictive forecast using all the visual and modeling capabilities right there in your story, comparing your predictive forecast with the actuals, plans, or budget.
Related Information
About Version Management
Once the predictions are generated, a new dataset is created. You can augment your SAP Analytics Cloud stories or models using the data available in this generated dataset. The process might differ depending on the type of dataset (acquired or live) you used to generate this dataset and the insights you want to reuse in SAP Analytics Cloud.
Related Information
Using Your Acquired Dataset Generated by a Classi cation or Regression Predictive Scenario
Using Your Acquired Dataset Generated by a Time Series Predictive Scenario
Using Your Acquired Generated Dataset in a Story
Using Your Live Generated Dataset in an SAP Analytics Cloud Model
You can use your acquired generated dataset either directly in a story, or in a story via an SAP Analytics Cloud model.
Depending on the type of predictive model, you need to keep different elements in your generated dataset to consume it.
Note
If you want to consume your generated dataset in a story or via an SAP Analytics Cloud model in SAP Analytics Cloud, only the first 100 columns will be taken into account.
Note
You need to keep specific information in your generated dataset in order to consume it in a story. The type of information you need depends on the predictive scenario type. For more information, refer to the related links.
Directly in a story:
Note
When you upload the generated dataset in a story, it implicitly becomes an "embedded" model. For more information,
see Models in Stories.
If you update the generated dataset later on, the story will be updated as well.
In a story via an SAP Analytics Cloud model:
Note
The SAP Analytics Cloud model can be shared with other users.
If you update the generated dataset later on, the SAP Analytics Cloud model will not be updated.
Related Information
Using Your Acquired Dataset Generated by a Classi cation or Regression Predictive Scenario
Using Your Acquired Dataset Generated by a Time Series Predictive Scenario
Depending on how you want to use your classification or regression predictive scenario, you need to keep different information in your generated dataset containing predictions.
The application dataset influencers, if these influencers are not available from another source.
The predictions
Note
If you have specified the key influencer(s) during the training of your predictive model, they are automatically added.
Go to Files Create Story , then Access & Explore Data. Select Data acquired from an existing dataset or model and browse to your generated dataset. Whenever a change is made in the generated dataset, the story is automatically updated.
Note
When you upload the generated dataset in a story, it implicitly becomes an "embedded" model.
Combining existing data with the generated predictions and consuming them in a
story
To combine existing data with your predictions, you need to keep the following information in your generated dataset:
The predictions
Note
If you have specified the key influencer(s) during the training of your predictive model, they are automatically added.
Thanks to the key influencer(s), you can blend the predictions with other data sources in the context of a story.
Note
You will then be able to easily share this model with other users.
For more information, refer to the related links Creating a New Model and Setting Up Model Preferences.
To do so:
3. From the proposed list, select Dataset and browse to your generated dataset.
Related Information
Creating a New Story
Creating a New Model
Setting Up Model Preferences
Blending Data
Depending on how you want to use your Time Series predictive scenario, you need to keep different information in your generated dataset containing the predictive forecasts.
Note
The date and signal are selected automatically.
Go to Files Create Story , then Access & Explore Data. Select Data acquired from an existing dataset or model and browse to your generated dataset. Whenever a change is made in the generated dataset, the story is automatically updated.
Note
When you upload the generated dataset in a story, it implicitly becomes an "embedded" model.
For more information, refer to the related link Creating a New Story.
Combining existing data with the generated predictions and consuming them in a
story
To combine existing data with your predictions, you need to keep the forecasts in your generated dataset.
Note
The date and signal are selected automatically.
Thanks to the date variable, you can blend the predictions with other data sources, in the context of a story.
Importing the generated dataset containing the forecast in SAP Analytics Cloud
models
You can first use the generated dataset to create a model and then consume it in a story.
To do so:
3. From the proposed list, select Dataset and browse to your generated dataset.
Note
You will then be able to easily share this model with other users.
For more information, refer to the related links Creating a New Model and Setting Up Model Preferences.
Related Information
Creating a New Story
Creating a New Model
Setting Up Model Preferences
Blending Data
Context
Live datasets cannot be consumed in SAP Analytics Cloud as they are. To be able to use your predictions, you need the help of an IT administrator and must go through some additional steps.
Here is an overview of the full process you have to follow while working with live datasets (see related links for more
information):
Caution
As of Google Chrome version 80, you need to configure your SAP on-premise data source to issue cookies with the SameSite=None; Secure attributes. If the SameSite attribute is not set, cookies issued by your SAP data source system will no longer work with SAP Analytics Cloud. Refer to SameSite Cookie Configuration for Live Data Connections for more information.
Restriction
Your IT administrator has to create calculation views before you can consume the generated tables containing your predictions. For more information, see Creating Calculation Views to Consume Live Output Datasets.
Procedure
1. In the main menu of SAP Analytics Cloud, select Create Model .
2. Select Get data from a data source.
4. In the Create Model From Live Data Connection window, enter the following information:
Connection: Select the previously created data connection (to this SAP HANA system).
Data Source: Enter the first 3 letters of the calculation view created by your SAP HANA technical user.
Related Information
Creating Calculation Views to Consume Live Output Datasets
Setting up Live SAP HANA Data Access for Smart Predict
Context
You can publish a predictive model to a Predictive Analytics Integrator (PAi) application as a new predictive scenario, or as a
predictive model within a scenario.
Note
You can find more information about Predictive Analytics Integrator at https://help.sap.com/pai.
Restriction
You can only publish classification and regression predictive models trained with an acquired dataset to a PAi application. It is not possible to publish predictive models trained with a live dataset.
Before you can publish to PAi, check the following prerequisites with your SAP Analytics Cloud administrator:
You have the application role of Predictive Admin assigned to your user profile.
Procedure
1. From the main menu, select Browse Predictive Scenarios .
The debrief page for the predictive model appears showing its statistics and KPIs.
The trained predictive models for the predictive scenario are listed.
Note
Only a trained predictive model can be published to PAi. It is published to the PAi application as a model specification, either as a new predictive scenario or as a model within a predictive scenario.
4. Click the predictive model version that you want to publish to your PAi application.
5. Click the Publish icon in the toolbar menu for the page.
6. Depending on whether you want to publish the predictive model as a new predictive scenario or as a model in a predictive scenario, enter information for the following fields:
Field Actions
PAi Connection: Select a PAi connection. If your connection doesn't appear in the list, ask the SAP Analytics Cloud administrator to make it available. Creating a PAi connection in SAP Analytics Cloud is described here: SAP Analytics Cloud Connection to PAi for SAP S/4HANA Cloud
PAi Predictive Scenario: Browse to the relevant catalog, and either select a predictive scenario to receive the model, or select New Predictive Scenario to create a new one. Add a predictive scenario name. The name must comply with Namespace rules in S/4HANA: the name of a Predictive Scenario must be all uppercase and have a maximum of 20 characters.
7. Click Publish.
The publishing progress is indicated by messages in the Status column for the predictive model. The status messages
are described here: Predictive Model Publishing Status Messages.
When you publish a predictive model to a PAi application, the following statuses can be displayed to indicate the publishing progress:
Status Description
Publish Pending Publishing hasn't started, but it is in a queue, and will start as soon as possible.
Published The predictive model has been published successfully without warnings or errors.
Publish Failed Publishing couldn't be completed. Click the status icon for more information.
Published with Warning The predictive model has been published, but with one or more issues. Click the status icon for more information.
A short tour guide around Smart Predict with the Help topics accessible from the application interface.
These topics are accessed directly from the Smart Predict application. We've grouped them together as a sort of short tour guide around the application interface. You'll be able to create a predictive scenario, as well as add and train a new predictive model, without needing a background in the predictive analytics field. However, to get the most out of your new predictive scenarios, we suggest that you take a bit of time to browse the more in-depth topics available in this guide as well.