MachineLearningPropertyModeling_UserGuide
Release Notes
Machine Learning for Petrel 2023.3
Version 2023
Machine Learning Property Modeling
User Guide
Version 2023.3.1.0
Copyright Notice
Copyright © 2023 SLB. All rights reserved.
This work contains the confidential and proprietary trade secrets of SLB and may not
be copied or stored in an information retrieval system, transferred, used, distributed,
translated or retransmitted in any form or by any means, electronic or mechanical, in
whole or in part, without the express written permission of the copyright owner.
Security Notice
The software described herein is configured to operate with at least the
minimum specifications set out by SLB. You are advised that such minimum
specifications are merely recommendations and not intended to be limiting to
configurations that may be used to operate the software. Similarly, you are
advised that the software should be operated in a secure environment
whether such software is operated across a network, on a single system
and/or on a plurality of systems. It is up to you to configure and maintain your
networks and/or system(s) in a secure manner. If you have further questions
as to recommendations regarding recommended specifications or security,
please feel free to contact your local SLB representative.
The new stochastic algorithm enables you to provide any available data,
including seismic-derived attributes, facies, trend, regional, and geometrical
models, as input training features. It is locally adaptable and, therefore, there are
no requirements of stationarity, data analysis, data domain transformations, or
any other rigorous pre-conditioning of input target data. In addition to
deterministic petrophysical model estimations and stochastic model
simulations, you can request an uncertainty model of the estimate as an output,
as well as probability and quantile volumes. You can estimate sweet
spots to obtain a realistic model of uncertainty and better-constrained realizations.
The result is a robust data-driven prediction, with proper spatial conformance
and distribution.
EMBER trains a machine learning model using ensemble techniques,
fundamentally based on regressive decision trees, combining the input training
features with embedded Geostatistical prediction models (Kriging) to account
for stratigraphic information.
To access the EMBER dialog box, on the Property Modeling tab in Petrel, in the
Machine learning group, select EMBER.
Note: The Property modeling license must be selected before Petrel is opened. If
you select the license afterwards through File/License module, the feature is not
activated.
Overview
To open the EMBER dialog box, you must have an active 3D Grid. The dialog
box has an upper section for training the model and a lower section for
generating simulations and setting additional controls.
The model itself and all the associated results are stored in a subfolder under
the Properties folder. To preserve the integrity and consistency between a
model and its results, you cannot move individual properties into or outside of
these EMBER folders. Use the Property calculator to duplicate an EMBER
property outside of its folder.
Each EMBER model is represented by a folder containing all the estimated and
simulated results.
To remove properties, see the Advanced tab.
There is no risk of overtraining, because the input data is ranked and sorted by
the algorithm.
Note: Any missing cells in the training data result in missing cells in the model
and predictions.
You must switch on Train ML engine to execute the training.
4 Optional: To request a (stochastic) simulation, in the Simulation tab,
switch on Generate simulations, and enter the details to refine your
simulation.
To find out more about the options on the Simulation tab, see Create simulated
models (page 6).
5 Optional: On the Trained distribution outputs tab, enter additional
estimated outputs.
To find out more about the additional estimated outputs you can select, see
Trained distribution outputs tab (page 8).
6 On the Advanced tab, you can enter advanced controls.
To find out more about the advanced controls, see Advanced tab (page 10).
7 Select Run to train the model.
Each new model creates a folder in the Models pane that contains the
(estimated) Mean, Median, and Spread properties. These and other child
properties are consistent with the training model. If the model is re-trained (that
is, if you are editing an existing model), all child properties are overwritten to
ensure consistency.
Estimation mean
At each cell, the arithmetic average of all the values predicted for that cell from
the decision trees.
Estimation median
At each cell, the median value of all the values predicted for that cell from the
decision trees.
Estimation spread
At each cell, the difference between P90 and P10 percentiles from all the
values predicted for that cell from the decision trees. In other words, it is the
uncertainty of the estimation model, measured as the width of the estimated
conditional probability distribution at each cell.
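The three estimated properties can be illustrated for a single cell with a short Python sketch. This is not EMBER code: the function names, the sample values, and the linear-interpolation percentile convention are assumptions made for illustration, since the guide does not state how EMBER computes percentiles.

```python
import statistics


def percentile(values, p):
    """Linearly interpolated percentile for p in [0, 100] (assumed convention)."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    rank = (p / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def summarize_cell(tree_predictions):
    """Mean, median, and P90-P10 spread of the per-tree predictions at one cell."""
    return {
        "mean": statistics.fmean(tree_predictions),
        "median": statistics.median(tree_predictions),
        "spread": percentile(tree_predictions, 90) - percentile(tree_predictions, 10),
    }


# Hypothetical porosity predictions from 11 decision trees for one cell.
cell_predictions = [0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 0.28, 0.30, 0.32, 0.35]
summary = summarize_cell(cell_predictions)
print(summary)
```

A narrow spread indicates that the trees agree closely at that cell, while a wide spread indicates a broad conditional distribution and therefore a less certain estimate.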
Create simulated models
When the Training target and Training features have been defined to train the
ML engine, in the Simulation tab, you can enter the details to generate a
(stochastic) simulation. The trained model creates a robust pool of conditional
distributions at each cell, from which stochastic simulations can be generated.
The simulations add the necessary elements of variability and heterogeneity to
the model required for properly constraining flow simulations, with a
characteristic applied globally or locally across the grid.
To open the EMBER dialog box, you must have an active 3D Grid.
The results are numbered with an incremental suffix and added to the EMBER
folder.
Note: When you switch on Generate simulations, you can also run with the
default entries.
2 In the Realizations box, enter the number of simulations to generate.
When you run multiple realizations, the seed value is different for each
realization, and controls the random path during the simulation.
9 In the Continuity list, select the fine-scale characteristic of the interwell
texture.
This describes the local behavior of variability.
10 Select Run to apply.
The simulation seed number is preserved for each simulation to enable
reproducibility. You can fix the seed in the Advanced tab.
You can select check boxes for each requested result. You can also select
along the slider bar, when it is available, to add a new slider control. You can
add more than one slider for each bar. Right-click a slider to remove it.
• Distribution mean & median: Select this check box to include the
arithmetic average and median of all the decision tree
predictions for each cell.
• Estimation mean (Arithmetic average): Select this check box to
include the arithmetic average of all the decision tree
predictions.
• Estimation spread (P90-P10): Select this check box to include
the uncertainty of the estimation model, measured as the P90
percentile minus the P10 percentile of the estimated
conditional probability distribution at each cell.
• Probability of estimate above, Probability of estimate
below, Probability of estimate between: Select these check
boxes to include the probability of the estimated value at each cell
satisfying this cutoff, sampled from the conditional distribution
generated by the training model. The numbers in the slider bar
refer to the units of the property to be modeled.
• Additional quantiles: Select this check box to include the quantile
result for each cell drawn from the distribution generated by the
training model. Specify each quantile as a decimal fraction from 0 to 1.0.
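The probability and quantile outputs in the list above can be sketched for one cell as follows. This is an illustration only: the function names and sample values are hypothetical, and the treatment of values that fall exactly on a cutoff (strict for above and below, inclusive for between) is an assumption that the guide does not specify.

```python
def probability_above(samples, cutoff):
    """Fraction of the conditional distribution strictly above the cutoff."""
    return sum(1 for v in samples if v > cutoff) / len(samples)


def probability_below(samples, cutoff):
    """Fraction of the conditional distribution strictly below the cutoff."""
    return sum(1 for v in samples if v < cutoff) / len(samples)


def probability_between(samples, low, high):
    """Fraction of the conditional distribution within [low, high]."""
    return sum(1 for v in samples if low <= v <= high) / len(samples)


def quantile(samples, q):
    """Linearly interpolated quantile for q given as a decimal fraction in [0, 1]."""
    ordered = sorted(samples)
    rank = q * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])


# Hypothetical conditional distribution of porosity at one cell.
cell = [0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 0.30]

p_above = probability_above(cell, 0.15)
p_below = probability_below(cell, 0.10)
p_between = probability_between(cell, 0.10, 0.20)
q50 = quantile(cell, 0.5)
print(p_above, p_below, p_between, q50)
```

The cutoffs are expressed in the units of the modeled property, as on the slider bars, whereas the quantile argument is a unitless fraction.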
Advanced tab
In the Advanced tab, you can enter advanced controls, such as additional
training data, perform blind tests, and preview, inspect, or set the seed
numbers that control the pseudorandomness of the training and simulation
processes.
The Advanced tab has the controls for both the training and simulation
processes.
Training seed and Simulation seed
Select the Training seed and/or Simulation seed to enforce a specific seed
number for a run, and enter the seed number in the respective box.
The numbers observed in the seed boxes are the numbers used for each run.
• When Edit existing is selected in the EMBER dialog box, you can
inspect the training seed number for an existing training model in
the seed box.
• The starting seed number for an existing simulated result is
preserved in the Settings dialog box, in the Info tab, in
the Comments sub-tab of each simulated property.
Note: If multiple simulations are requested in one simulation run, all the multiple
realizations preserve the same starting seed number.
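The role of a fixed seed can be illustrated with a generic pseudorandom generator. The sketch below uses Python's standard `random` module, not EMBER's generator. The `simulate` function is hypothetical, and deriving each realization's seed as the starting seed plus the realization index is an assumption made only to show how one starting seed can yield several distinct but reproducible realizations.

```python
import random


def simulate(starting_seed, n_realizations, n_cells=5):
    """Return n_realizations lists of pseudorandom values, one list per realization."""
    realizations = []
    for index in range(n_realizations):
        # Assumption: each realization's seed is derived from the starting seed.
        rng = random.Random(starting_seed + index)
        realizations.append([rng.random() for _ in range(n_cells)])
    return realizations


run_a = simulate(starting_seed=1234, n_realizations=3)
run_b = simulate(starting_seed=1234, n_realizations=3)  # same seed: identical results
run_c = simulate(starting_seed=4321, n_realizations=3)  # different seed: different results

print(run_a == run_b)  # True
print(run_a == run_c)  # False
```

Recording the starting seed is therefore enough to regenerate the whole set of realizations from one run, which is why a single starting seed is stored for all of them.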
Blind wells
This feature duplicates the EMBER run without one or more wells to create a
blinded model which can be used to validate the quality of the prediction.
Select the Together check box to specify whether multiple wells are blinded
simultaneously (that is, one blinded model) or individually (that is, multiple
blinded models). The icon to the left of the Together check box creates a Well
section window that is populated with the model outputs at the blinded
location.
Select the Blind wells check box to insert wells, well folders, or wells saved
searches, to define the blinded well locations.
Note: This feature increases the execution time because it duplicates the
EMBER runs for each blind model.
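The difference between blinding wells together and individually can be sketched as a list of training sets, one per blinded run. The function and well names below are hypothetical and only illustrate the bookkeeping. They also show why the execution time grows with the number of blinded models.

```python
def blind_runs(all_wells, blind_wells, together):
    """List the blinded runs: each entry holds the blinded wells and the training wells."""
    blind_set = set(blind_wells)
    blinded = [w for w in all_wells if w in blind_set]
    if not blinded:
        return []
    # Together: one run without all blinded wells. Individually: one run per blinded well.
    groups = [blinded] if together else [[w] for w in blinded]
    runs = []
    for group in groups:
        training = [w for w in all_wells if w not in group]
        runs.append({"blinded": group, "training": training})
    return runs


wells = ["W1", "W2", "W3", "W4"]
together_runs = blind_runs(wells, ["W2", "W4"], together=True)
individual_runs = blind_runs(wells, ["W2", "W4"], together=False)

print(len(together_runs), together_runs[0]["training"])
print(len(individual_runs), [run["training"] for run in individual_runs])
```

In each run, the prediction at the blinded well locations can then be compared with the withheld well data to judge the quality of the model.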
Select the Exclude geometric features check box and select the properties to
exclude from use in training.
Note: To examine the influence of each training feature, view the relative
variable importance on the Info tab, in the Settings dialog box, of a trained
EMBER folder.
Simulation Intervention
Intervention enables the simulation results to be biased towards an existing
property and/or global numeric value. This provides a degree of manual control
of the results while still producing realizations that honor the distribution and
intervariable relationships.
The Intervention options are only available after an initial simulation has been
generated. It is good practice to first inspect the default data-driven simulation
results before choosing to intervene. You can specify a strength to govern the
degree and direction of the bias to the target property.
Figure 1. On the left (figure 1a), a 1d transect in the x direction for a single layer. Along the y axis,
the yellow curves show the conditional distribution of porosity at the x locations. The extremities of
shows the mean of the distribution. On the right (figure 1b), the dashed line is a simulation.
You can create an intervention using an additional variable, S(x), known at each
location, using a simple idea. Instead of using a standard Gaussian simulation to
sample within the envelope, you can use a co-simulation using S(x) instead.
Figure 2 sketches how this works. The upper figure is an example of a variable
S(x) that is required to influence the simulation. Sources of S(x) might be
variables such as a seismic attribute (to reinforce or emphasize the role of the
attribute), a trend property (for example, indicating that there should be a trend
in the north-south direction), or a user-drawn attribute (for example, when a
geologist wishes to test the implications of a possible channel sand complex at
some location). You can provide a 3D trend property S(x) and a weight between
-1 and +1 to indicate the degree of influence required. The algorithm then uses
Gaussian co-simulation to sample from the envelope. With a large positive
weight, the samples will draw higher values in the envelope when the S(x) is
large, so positively reinforcing the attribute on the simulation whereas with a
negative weight, the influence of S(x) will be reversed. For example, the
relationship between acoustic impedance and porosity is usually negative, so a
negative coefficient would be appropriate in this case. In the figure, the
simulation, indicated by the green dashed line, samples the upper end of the
distribution when S(x) is at its highest.
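The effect of the weight can be illustrated with a simplified sampling sketch. This is not the EMBER algorithm: it assumes that S(x) has already been transformed to standard normal scores, it uses spatially independent noise instead of a spatially correlated Gaussian simulation, and all names are hypothetical. A Gaussian score is built from S(x) and the noise, converted to a probability, and used to read a quantile from the local conditional distribution, so the sampled value cannot leave the envelope.

```python
import math
import random
from statistics import NormalDist


def quantile(ordered, p):
    """Linearly interpolated quantile of a sorted list for p in [0, 1]."""
    rank = p * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])


def intervene(local_distributions, s_scores, weight, seed=0):
    """Sample one value per location, biased by S(x) through the weight."""
    if not -1.0 <= weight <= 1.0:
        raise ValueError("weight must be between -1 and +1")
    rng = random.Random(seed)
    standard_normal = NormalDist()
    simulated = []
    for samples, s in zip(local_distributions, s_scores):
        noise = rng.gauss(0.0, 1.0)  # simplification: no spatial correlation
        score = weight * s + math.sqrt(1.0 - weight * weight) * noise
        probability = standard_normal.cdf(score)
        simulated.append(quantile(sorted(samples), probability))
    return simulated


# Hypothetical setup: the same conditional distribution at five locations.
envelope = [[0.05, 0.10, 0.15, 0.20, 0.25] for _ in range(5)]
s_scores = [-2.0, -1.0, 0.0, 1.0, 2.0]

positive = intervene(envelope, s_scores, weight=1.0)
negative = intervene(envelope, s_scores, weight=-1.0)
moderate = intervene(envelope, s_scores, weight=0.5, seed=42)
print(positive)
print(negative)
print(moderate)
```

With a weight of +1, the sampled values rise with S(x). With a weight of -1, they fall. With intermediate weights, the noise term contributes and the link to S(x) is looser.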
Figure 2. The upper figure is a schematic of a variable to be used in an intervention. With a positive
intervention coefficient, the envelope will be sampled at higher values where S(x) is high and lower
values where S(x) is low.
Note: The simulation will remain within the envelope, even for high coefficients,
which ensures reasonable simulated values (this contrasts with a standard
lead to
unreasonable simulated values particularly when there are correlations with the
secondary variable approach +/- 1). The method used is a simple co-simulation
using S(x). Provided S(x) is independent of the properties used for training and
is of reasonably short range, the sampling will continue to reproduce
crossplots and histograms on average. Since such an intervention will not
change the histogram, it will not change the mean value either (on average). So,
a further type of intervention is provided to enable you to choose your target
mean value if you wish to, which is called an intervention on the mean.
Procedure
The Intervention toggle is only available after simulations have been run.
When you turn it on, there are options you can use to specify an existing
property and/or a target mean. Existing simulations are required to calibrate the
parameters associated with these controls.
Figure 3. (Left) Intervene toggle is off because no simulations have yet been run for this (new)
EMBER model. (Right) Intervene toggle is turned on and shows the options available for the next
simulation run.
Property-based intervention
When using Property-based intervention, the simulation can be biased towards
or against a property defined in the Constraint/Bias box. The property can be
one already used for training of the model (Use training feature), to increase
its influence. It can be another property outside of the training set (Use
additional features). The latter case is useful, for example, when you want to
work with a variable that is itself a simulation (such as simulating permeability
from a realization of porosity without the considerable effort of re-training), or
to use a designed variable, such as a conceptual facies fairway model that you
designed as a particular scenario.
You can use this option with a user-defined Intervention strength. The
strength establishes the degree of influence of the intervening property. A high
positive value leads EMBER to sample higher values from its simulation
envelope and will tend to make simulation results look like the intervening
property. A negative value will have the opposite effect, and is appropriate
where the correlation is negative, such as porosity vs. acoustic impedance. A
value of 0 ignores the intervention.
If the intervention variable is independent of the variables used for training, then
the output statistics, such as cross-variable relationships, means, and quantiles
are unchanged on average over the set of realizations. In practice, it will often
be the case that the variable used for intervention is related to, or selected from,
the training variables. In this case, the intervention will modify the output
statistics somewhat.
Target mean
The Target mean option enables you to request a mean value for the simulation
outputs. The numeric input is constrained to the envelope of the data and
is in the same unit as the domain of the data (for example, porosity).
The target mean can be set to a wide variety of values. However, there are limits,
and requested target means far from the EMBER mean might not be achievable
in certain cases. The simplest case is a target mean that is not physically
possible, for example, a target value that is larger than the observed data
values (see also point 4 below).
There are several cases that you must be aware of.
1 When the intervention variable is not independent of the variables used for
training, some distortion to the output statistics is expected. For example,
to emphasize a particular seismic attribute you can select the variable and
an intervention strength. The correlation between the target and
intervention variables will typically increase beyond the empirical value
found at the wells as the intervention parameter increases. This is to be
expected as an intervention creates a scenario by forcing a variable to
have more influence than the model fitting stage finds in the data. Smaller
values of intervention strength produce smaller modifications yet still
increase the role of the intervention variable. Extreme values of
intervention strength (close to +/- 1) will distort the cross-variable
relationship and so do not fully respect empirical information but might be
of interest in some cases to robustly test certain geological scenarios. For
example, to produce a realization to investigate the effect on flow of a
channel in a particular part of the fairway. You must review the result and
decide whether the level of change is acceptable for the current scenario.
2
the size of the field, then the changes to the sampling imposed by S(x)
distribution is modified (statisticians would say that the sampling is no
longer ergodic).
3 The user intervention variable can be either discrete or continuous, but
they are treated in slightly different ways. In both cases, they need to be
transformed to provide values suitable for sampling. For continuous
variables, the transformation takes place directly so that higher values of
the variable will have a larger influence. In the discrete case, the average
target value at well locations is calculated per discrete code and used as
the representative value for that code before transforming.
Note: This suggests that a user-defined intervention that is unrelated
to well data, or is intended to partly override well data, should be
input as a continuous variable to avoid a spurious calibration to wells.
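The calibration of a discrete intervention variable described in point 3 can be sketched as a per-code average of the target at the wells. The names and values below are hypothetical, and the subsequent transformation step is not shown because the guide does not describe it.

```python
from collections import defaultdict
from statistics import fmean


def representative_values(well_codes, well_targets):
    """Average target value at well locations for each discrete code."""
    grouped = defaultdict(list)
    for code, target in zip(well_codes, well_targets):
        grouped[code].append(target)
    return {code: fmean(values) for code, values in grouped.items()}


# Hypothetical well samples: discrete code and porosity at each sample.
codes_at_wells = [0, 0, 1, 1, 1, 2]
porosity_at_wells = [0.05, 0.07, 0.15, 0.17, 0.19, 0.25]

representative = representative_values(codes_at_wells, porosity_at_wells)

# Replace each code in the grid by its representative value before transforming.
grid_codes = [2, 0, 1, 1]
grid_values = [representative[code] for code in grid_codes]
print(representative)
print(grid_values)
```

Because the representative values come from the wells, a discrete variable is always tied to the well data, which is the spurious calibration that the note warns about for interventions meant to be independent of the wells.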
4
mean value. For example, if the target variable has an envelope between 0
and 2 across 90% of the field and between 20 and 30 in the other 10% of
the field, it would not make sense for you to set a target mean at 25 for the
full field. In other words, the set of permissible values of the target mean
depends on the envelope, but unfortunately it also depends on the
correlation length of the sampling random function, so it is difficult to
determine bounds for it in advance.
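The arithmetic behind the example in point 4 can be checked with a short sketch. The function name is hypothetical. The bounds computed here are only the outermost limits implied by the envelope. As noted above, the achievable range is narrower in practice because it also depends on the correlation length of the sampling random function.

```python
def field_mean_bounds(regions):
    """Outer bounds on the field mean.

    regions is a list of (fraction_of_field, envelope_min, envelope_max) tuples.
    """
    total_fraction = sum(fraction for fraction, _, _ in regions)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("region fractions must sum to 1")
    lowest = sum(fraction * low for fraction, low, _ in regions)
    highest = sum(fraction * high for fraction, _, high in regions)
    return lowest, highest


# Envelope of 0 to 2 across 90% of the field and 20 to 30 across the other 10%.
regions = [(0.9, 0.0, 2.0), (0.1, 20.0, 30.0)]
lowest, highest = field_mean_bounds(regions)
target_mean = 25.0
target_is_feasible = lowest <= target_mean <= highest
print(lowest, highest, target_is_feasible)
```

Even if every cell were sampled at the top of its envelope, the field mean would be about 4.8 (0.9 × 2 + 0.1 × 30), so a target mean of 25 cannot be honored.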
*Mark of SLB.