A Deep Learning Approach To Anomaly Detection in Geological - 2019 - Journal of
Journal of Hydrology
journal homepage: www.elsevier.com/locate/jhydrol
Bureau of Economic Geology, Jackson School of Geosciences, The University of Texas at Austin, Austin, TX 78758, USA
This manuscript was handled by P. Kitanidis, Editor-in-Chief, with the assistance of Jonghyun Lee, Associate Editor
Keywords: Anomaly detection; Convolutional long short-term memory neural network; Deep learning; Pressure tests

Carbon capture and storage (CCS) has been extensively investigated as a potential engineering measure to reduce anthropogenic carbon emission to the atmosphere. Real-time monitoring of the safety and integrity of carbon storage reservoirs is a critical aspect of any commercial-scale CCS deployment. Pressure-based sensing is cost effective, suitable for real-time monitoring, and scalable to large monitoring networks. However, questions remain on how to best harness intelligent information from the high-frequency pressure monitoring sensors to support real-time decisions. This work presents a deep-learning-based framework for analyzing and detecting anomalies in pressure data streams by using a convolutional long short-term memory (ConvLSTM) neural network model, which allows for the fusion of both static and dynamic reservoir data. In ConvLSTM, the convolutional neural network (CNN) is used for spatial pattern mining and the LSTM is used for temporal pattern recognition. The performance of the ConvLSTM model for real-time anomaly detection is demonstrated using a set of pressure monitoring data collected from Cranfield, Mississippi, an active enhanced-oil-recovery field. The anomaly detection model is trained using bottom-hole pressure data acquired from the base experiment (without leak event) and then tested on pressure data collected during a series of controlled CO2 release experiments (with artificially created leak events). Results show that the ConvLSTM neural network model successfully detected anomalies in the pressure time series obtained from the controlled release experiments. Inclusion of static information into the model further improves the robustness of ConvLSTM.
1. Introduction

Carbon capture and storage (CCS), which refers to separating CO2 from large-scale atmospheric releases of flue gases from industrial sources and then storing underground, has been recognized as a technically feasible engineering measure for reducing global greenhouse emission (Metz et al., 2005). It is estimated that approximately 9.5 gigatons (Gt) carbon/year or 35 Gt CO2/year needs to be permanently stored in order to maintain future global warming below 2 °C above the pre-industrial levels (Quéré et al., 2015). From the aspect of commercial-scale adoption and deployment, geological carbon storage (GCS) offers a more economically feasible greenhouse gas emission reduction option than many other alternatives (Bachu, 2000).
Monitoring represents one of the most critical components in geological carbon sequestration (GCS) projects in order to minimize the risk of unintended migration of the injected CO2 from storage formations (Haszeldine, 2009; Jenkins et al., 2015; Chen et al., 2018; Zhong and Carr, 2019). Regulations in many countries now mandate that comprehensive monitoring programs be put in place for evaluation and detection of CO2 leakage, before the onset of any commercial scale CO2 storage project (Bielicki et al., 2014; Atamanchuk et al., 2015; Sun et al., 2018). Monitoring and verification of CO2 leakage is challenging because (a) leakage pathways may manifest in different forms, such as fault leakage (Krevor et al., 2010; Lee et al., 2013; Vilarrasa and Carrera, 2015), wellbore leakage (Watson and Bachu, 2009; Krevor et al., 2010; Jung et al., 2013), near surface leakage (Lewicki et al., 2005; Feitz et al., 2014), and groundwater aquifer leakage (Yang et al., 2014); and (b) leakage signals vary with monitoring methods and across spatial and temporal scales (Sun and Nicot, 2012; Sun et al., 2013; Dixon and Romanak, 2015). Nordbotten et al. (2004) introduced an analytical model for predicting the magnitude of leakage flux through abandoned wells. Viswanathan et al. (2008) developed a hybrid model for detecting wellbore leakage. Zeidouni (2012) and Wang et al. (2016) proposed analytical models to evaluate the potential of leakage along preexisting faults. Sun et al. (2015) proposed a pressure-based, pulse testing methodology for probing leakage pathways in storage formations and later demonstrated at both the bench (Sun et al., 2017) and field scales (Sun et al., 2016). Yang et al. (2018) present an
⁎ Corresponding author.
E-mail address: [email protected] (A.Y. Sun).
https://doi.org/10.1016/j.jhydrol.2019.04.015
Received 26 October 2018; Received in revised form 14 March 2019; Accepted 3 April 2019
Available online 05 April 2019
0022-1694/ © 2019 Elsevier B.V. All rights reserved.
Z. Zhong, et al. Journal of Hydrology 573 (2019) 885–894
adaptive methodology for risk-based leakage monitoring design, which uses a risk event tree to predict the likelihood of leakage occurrence, with detection probabilities of risk events estimated for multiple monitoring plans. Most methods mentioned herein involve certain process-based mechanistic models, which are appropriate for process-level understanding, but can be adversely affected by conceptualization and parameterization uncertainties when used for prediction. On the other hand, data-driven models are suitable for real-time anomaly prediction, especially when the underlying physical process is less fully understood and characterized, which is often the case in subsurface modeling.
In this work, we combine two machine learning algorithms, the convolutional neural network (CNN) and long short-term memory (LSTM) neural network, for detecting anomalies in monitoring well pressure data streams. LSTM is a neural network model introduced by Hochreiter and Schmidhuber (1997) and has shown successful performance in natural language processing and speech recognition. LSTM is mainly suitable for time series analysis and does not provide mechanisms for incorporating spatial information. CNN, originally introduced by LeCun et al. (1995), became popular in the last decade as a result of advances in deep learning training algorithms and computing hardware. A number of winners of recent image classification competitions and human-machine matches are based on CNN (Silver et al., 2017; Silver et al., 2016). Because of its origin in image analyses, CNN has the natural advantage of being able to extract embedded spatial patterns or features, but may give rather limited performance on time series regression problems.
The combined CNN and LSTM model, called ConvLSTM, leverages the strengths of CNN and LSTM for obtaining “deep representations” of complex spatiotemporal processes, which can then serve as a deep-learning-based proxy model of the underlying physics. Shi et al. (2015) successfully applied the ConvLSTM model to rainfall nowcasting. Donahue et al. (2015) proposed to use ConvLSTM for visual recognition and description. Ji et al. (2013) used ConvLSTM to detect human motion, and Zhu et al. (2017) applied the ConvLSTM to detect body gesture. So far, few studies have applied ConvLSTM to subsurface anomaly detection problems.
The contribution of this research is in the development of a deep-learning-based approach to fuse both static and dynamic data for pressure anomaly detection, thus alleviating the “black box” nature of traditional machine learning methods while maintaining the real-time detection capability. For the current application, the short-term CO2 plume migration or trapping is mostly influenced by reservoir porosity and permeability under given operation conditions (Oldenburg, 2007). Thus, reservoir characterization data provide important additional constraints to improve the accuracy of anomaly detection algorithms. In particular, here we treat the porosity and permeability as static variables, and operation data (e.g., injection rate and well bottom-hole pressure) as dynamic variables. Our ConvLSTM regression model is trained using both the static and dynamic data to predict bottom-hole pressure corresponding to the nominal scenario (i.e., no known leaks). It can then be deployed as a pressure-based anomaly detection algorithm to detect significant deviations from the predicted nominal bottom-hole pressure.
In the following, we first briefly describe the background of a series of field experiments conducted at the Cranfield site that generated the pressure data set used in this study, and then we describe the architecture of the ConvLSTM model. Using the Cranfield pressure data as a benchmark problem, we compare the anomalies detected by LSTM, CNN, and ConvLSTM models, and conclude that ConvLSTM can be used for real-time CO2 leakage detection due to its strong ability to fuse both spatial and temporal information.

2. Materials and methods

2.1. Data description

The Cranfield site, located in Southwest Mississippi, U.S., has been used as a field observatory for testing concepts related to enhanced oil recovery (EOR) and carbon storage by the Southeast Regional Carbon Sequestration Partnership (SECARB, http://www.secarbon.org) since 2008. Chevron first discovered the oil reservoir at Cranfield in 1943. More than 37 MMbbl (million barrels) of oil and 672 BScf (billion standard cubic feet) of gas had been produced by 1965. Most of the wells were plugged and abandoned after 1965. In 2008, Denbury Onshore LLC, in collaboration with SECARB, started to perform CO2-EOR at the Cranfield site. The research objective of SECARB was to demonstrate the concept of phased use of subsurface volumes, combining early use of CO2 for EOR with later injection into the underlying or adjacent brine formations for storage (Hovorka et al., 2013). By 2013, more than 3 million tonnes of CO2 had been injected and stored into the fluvial sandstone aquifer of the Cretaceous Lower Tuscaloosa Formation at depths of about 3300 m (Hovorka et al., 2013; Choi et al., 2013; Hosseini et al., 2013).
Extensive research activities were carried out in a small area east and outside of the main Cranfield reservoir in the so-called detailed area of study (DAS). The DAS site at Cranfield consists of one injection well (CFU31-F1) and two nearby observation wells (CFU31-F2 and CFU31-F3) (Fig. 1). These three wells will be referred to as F1, F2, and F3 in the rest of this paper. The surface distances are 69.8 m between F1 and F2, and 29.9 m between F2 and F3, respectively. The original CO2 injection experiment at F1 began on December 1, 2009 at a rate ranging from 175 kg/min to 330 kg/min, and was completed in late 2010. A number of modeling studies have been conducted to history-match the original Cranfield CO2 injection experiment (Hovorka et al., 2013; Hosseini et al., 2013; Delshad et al., 2013; Soltanian et al., 2016; Min et al., 2018).
In January 2015, a University of Texas team conducted a new series of pulse testing experiments at the DAS site to prove the concept of a pressure-based, pulse testing leakage detection technique. Prior to the 2015 experiment, high-resolution permanent downhole gauges were installed in the target zone (~3220 m) in wells F2 and F3 (Sun et al., 2016). The site had no known leakage at the time of the experiments. The pulse testing experiment consisted of two phases (see Supporting information S1) to demonstrate the feasibility of the pulse-testing-based leakage detection process. In the first phase, bottom-hole pressure was monitored in observation wells F2 and F3 to establish the nominal case (i.e., reservoir responses to pulsing in the absence of known leaks). Specifically, two pulse experiments, with cycle durations of 90 min and 150 min (Fig. S2), respectively, were conducted on January 19 and January 20, 2015. Each pulsing cycle consisted of a 50% shut-in period and a 50% injection period. To simulate the effect of leakage events, in the second phase of the experiment CO2 was artificially produced from well F3 with a constant production rate of 60 kg/min while other experimental conditions were kept the same as those used in the base experiments. Two separate controlled release experiments were conducted on January 30 and January 31, 2015 under the same cycle durations of 90 min and 150 min (Fig. S3). The main underlying principle for time-lapse detection is that leakage would cause different pressure responses under the same pulse testing conditions. In both of the controlled release experiments, F2 was used as an observation well to record bottom-hole pressure with a sampling rate of 30 readings per min. For this study, the pressure data were downsampled to the minute level by averaging. More details of the experimental setup can be found in Sun et al. (2016). The Cranfield pressure data set is of high temporal resolution and is labeled (i.e., the times of leak events are known), thus providing a unique data set for validating the anomaly detection algorithms.
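The downsampling step described above (30 readings per minute averaged to one value per minute) can be illustrated with a minimal NumPy sketch. The function name `downsample_by_mean` and the synthetic input are assumptions made for illustration; they are not taken from the study, and incomplete trailing minutes are simply dropped here.

```python
import numpy as np

def downsample_by_mean(readings, readings_per_min=30):
    """Average consecutive blocks of `readings_per_min` samples into one value per minute."""
    readings = np.asarray(readings, dtype=float)
    n_minutes = readings.size // readings_per_min      # drop an incomplete trailing minute
    trimmed = readings[: n_minutes * readings_per_min]
    return trimmed.reshape(n_minutes, readings_per_min).mean(axis=1)

# Synthetic example (not field data): three minutes sampled at 30 readings/min
raw = np.concatenate([np.full(30, 10.0), np.full(30, 20.0), np.full(30, 30.0)])
minute_level = downsample_by_mean(raw)
```

Block averaging of this kind also suppresses some high-frequency gauge noise, which may be one reason to prefer it over simple decimation.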
Fig. 1. (A) Map of the Cranfield site, where the DAS area (red colored rectangular box) is part of a larger domain called the High Volume Injection Test area (HiVIT) (black trapezoidal box), with wells F1–F3 located down-dip of the oil field; (B) Areal view of the DAS site, where the locations of wells F1, F2, and F3, as well as the distances between F1 and F2 and between F3 and F2, are labeled. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
2.2.1. Long short-term memory neural network (LSTM)
The LSTM neural network is a variant of recurrent neural networks (RNN). It was originally introduced to solve the vanishing/exploding gradient problem, which tends to cause training divergence in RNN (Hochreiter and Schmidhuber, 1997). Like RNN, LSTM has a strong ability to capture the dynamic features via cycles in the graph (Ma and Hovy, 2016). Moreover, LSTM also shares the same parameters (i.e., network weights) across all time steps, which reduces the number of unknowns significantly (Che et al., 2018).
The structure of LSTM is shown in Fig. 2. For each hidden unit, LSTM uses an input gate ($i_t$), a forget gate ($f_t$), and a candidate cell state ($\bar{c}_t$), together with the previous cell state ($c_{t-1}$), to control the current cell state ($c_t$), and uses an output gate ($o_t$) and the current cell state ($c_t$) to control the hidden state ($h_t$) at time $t$. The update equations are

$$
\begin{aligned}
i_t &= \sigma(W_{hi} h_{t-1} + W_{ci} c_{t-1} + W_{xi} x_t + b_i) \\
f_t &= \sigma(W_{hf} h_{t-1} + W_{cf} c_{t-1} + W_{xf} x_t + b_f) \\
\bar{c}_t &= \tanh(W_{hc} h_{t-1} + W_{xc} x_t + b_c) \\
c_t &= f_t \circ c_{t-1} + i_t \circ \bar{c}_t \\
o_t &= \sigma(W_{ho} h_{t-1} + W_{xo} x_t + b_o) \\
h_t &= o_t \circ \tanh(c_t)
\end{aligned}
\tag{1}
$$

where $\sigma$ is the element-wise sigmoid function, $\circ$ denotes the element-wise product operator, $x_t$ is the input vector at time $t$, and $h_{t-1}$ is the hidden state vector storing all the useful information prior to time $t$. $W_{xi}$, $W_{xf}$, $W_{xc}$, $W_{xo}$ denote the weight matrices of the different gates for input $x_t$; $W_{hi}$, $W_{hf}$, $W_{hc}$, $W_{ho}$ are the weight matrices for hidden state $h_{t-1}$; $W_{ci}$, $W_{cf}$ denote the weight matrices of cell state $c_{t-1}$; and $b_i$, $b_f$, $b_c$, $b_o$ denote the bias vectors.
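As an illustration of Eq. (1), the following NumPy sketch performs one LSTM update per call. It is a minimal sketch, not the implementation used in the study: the helper names (`lstm_step`, `init_params`), the random initialization, and the toy dimensions are all assumptions, and the cell-state weights are treated as ordinary matrices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM update following Eq. (1); `p` is a dict of weights and biases."""
    i_t = sigmoid(p["Whi"] @ h_prev + p["Wci"] @ c_prev + p["Wxi"] @ x_t + p["bi"])
    f_t = sigmoid(p["Whf"] @ h_prev + p["Wcf"] @ c_prev + p["Wxf"] @ x_t + p["bf"])
    c_bar = np.tanh(p["Whc"] @ h_prev + p["Wxc"] @ x_t + p["bc"])  # candidate cell state
    c_t = f_t * c_prev + i_t * c_bar                               # element-wise products
    o_t = sigmoid(p["Who"] @ h_prev + p["Wxo"] @ x_t + p["bo"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

def init_params(n_in, n_hid, seed=0):
    """Hypothetical small random initialization, for demonstration only."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in "ifco":
        p["Wh" + g] = rng.normal(0.0, 0.1, (n_hid, n_hid))
        p["Wx" + g] = rng.normal(0.0, 0.1, (n_hid, n_in))
        p["b" + g] = np.zeros(n_hid)
    for g in "if":  # only the input and forget gates see the previous cell state
        p["Wc" + g] = rng.normal(0.0, 0.1, (n_hid, n_hid))
    return p

# Run five time steps on a toy input sequence; the same weights are reused at every step
params = init_params(n_in=2, n_hid=4)
h, c = np.zeros(4), np.zeros(4)
for x_t in np.ones((5, 2)):
    h, c = lstm_step(x_t, h, c, params)
```

Reusing `params` across the loop reflects the parameter sharing across time steps noted above.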
feature is useful at one location it is also useful for other locations. Downsampling or pooling is an operation used to enforce translation invariance (i.e., the same feature stays active even when the image undergoes slight translations). Because CNN has a strong ability to extract high-level spatial features, it is used to extract the spatial information at each gate and cell state, as shown in Fig. 4.

2.2.3. Convolutional LSTM (ConvLSTM)
As mentioned previously, a major drawback of LSTM is that it lacks a means for incorporating spatial information. To deal with this problem, ConvLSTM combines CNN and LSTM. In ConvLSTM, the inputs $(X_1, \ldots, X_t)$, cell outputs $(C_1, \ldots, C_t)$, hidden states $(H_1, \ldots, H_t)$, and gates $(i_t, f_t, o_t)$ are all treated as tensors. The tensors are high-dimensional arrays that can be used to represent static and dynamic reservoir variables. The resulting update functions are similar to the LSTM model given in Eq. (1), except that the data for each gate is now a 3D tensor.

$$
\begin{aligned}
i_t &= \sigma(W_{hi} * H_{t-1} + W_{ci} * C_{t-1} + W_{xi} * X_t + b_i) \\
f_t &= \sigma(W_{hf} * H_{t-1} + W_{cf} * C_{t-1} + W_{xf} * X_t + b_f) \\
\bar{C}_t &= \tanh(W_{hc} * H_{t-1} + W_{xc} * X_t + b_c) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \bar{C}_t \\
o_t &= \sigma(W_{ho} * H_{t-1} + W_{xo} * X_t + b_o) \\
H_t &= o_t \circ \tanh(C_t)
\end{aligned}
\tag{2}
$$

As Eq. (2) shows, $W_{ho} * H_{t-1}$ and $W_{xo} * X_t$ are obtained from convolutional calculations in CNN (denoted by $*$). Moreover, cell outputs ($C_t$), hidden states ($H_t$), and gates ($i_t$, $f_t$, $o_t$) are all calculated by convolutional calculations.

2.3. Anomaly detection

Here we describe a workflow for using ConvLSTM to detect anomalies in the pressure data. In particular, the input data preparation, model architecture, and the algorithm performance on training, validation, and testing data are described below.

2.3.1. Reservoir model description
The DAS site at Cranfield has been characterized by well logging and geophysical surveys. The top and bottom of the 3D static model were determined by a 3D seismic survey and are modeled as impermeable layers (Soltanian et al., 2016). Here we use a single-layer reservoir model to incorporate static information, under the assumption that the static variables represent effective properties. The reservoir model has a high resolution with grid block sizes of 1 m × 1 m in the lateral directions and one thick layer in the vertical direction. Therefore, the reservoir formation is represented as a 120 × 120 × 1 image (Fig. 3). F1, F2, and F3 represent the locations of wells in Fig. 3.

2.3.2. Input variable preprocessing
Input variables chosen to train a machine learning model should not only be based on their physical significance, but also on the basis of statistical relevance to the response variable. We start with the diffusivity equation for reservoir pressure (in cylindrical coordinates)

$$
\frac{\partial^2 P}{\partial r^2} + \frac{1}{r}\frac{\partial P}{\partial r} = \frac{\phi \mu c_p}{k}\frac{\partial P}{\partial t},
\tag{3}
$$

where $P$ is the reservoir pressure, $r$ is the radial coordinate, $t$ is time, $c_p$ is total compressibility, $\mu$ is the dynamic viscosity of the fluid, $\phi$ is porosity, and $k$ is the intrinsic permeability of the porous medium. The reservoir pressure is a function of many factors, including geological information such as porosity and permeability, and operation conditions such as injection rate and operation schedule. Geological information is considered static for the purpose of this study. Therefore, the geological data is referred to as the static data set. In this research, we only consider the porosity and permeability information to simplify the structure of the ConvLSTM model. As Fig. 3 illustrates, two static variable images ($\phi$ and $k$) are included in the input data set.
In addition to the static variables, the injection rate ($q_t$) represents a typical dynamic variable and is also the main driving force for the movement of the CO2 plume. To accommodate the time series measurements, we introduce a dynamic 2D image, in which the pixel value at the injection well location represents the injection rate, the pixel values at pressure monitoring well locations are the corresponding bottom-hole pressure values, and the pixel values in all other grid blocks are zero. The dynamic image variable is thus a sparse tensor. Due to the different injection schedules, the dynamic variable image varies with time at the injection location.
Stacking the static variable images and dynamic variable images leads to a new 3D input variable, which is fed to the ConvLSTM model. The formats of the input data set are defined as follows

$$
\begin{aligned}
D^t_{static} &= 2 \times 120 \times 120 \\
D^t_{dynamic} &= 1 \times 120 \times 120 \\
D^t_{total} &= 3 \times 120 \times 120
\end{aligned}
\tag{4}
$$

where $D^t_{static}$ is the static variable image with image sizes of 120 × 120 at time $t$ and consists of two static property (porosity and permeability) maps; $D^t_{dynamic}$ is the dynamic variable image with image sizes of 120 × 120 at time $t$ and consists of injection rate and bottom-hole pressure; and $D^t_{total}$ is the merged final input variable with image sizes of 120 × 120 × 3 at time $t$.
The output variable of the ConvLSTM model is the predicted bottom-hole pressure at time $t$ in the monitoring well (i.e., well F2). The data structure used for the training, validation, and testing of ConvLSTM is illustrated in Fig. 3.
Data normalization is an important technique used in network training to improve prediction accuracy and training speed (Zhong and Carr, 2016; Zhong et al., 2018). All input variables (including static variables and dynamic variables) and output variables (bottom-hole pressure) are normalized to the range [0, 1]. The following scaling formula is chosen:
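A minimal sketch of the input preparation is given below. It assumes the scaling takes the usual min-max form, x' = (x - x_min) / (x_max - x_min), which maps values onto [0, 1]; this form is an assumption rather than a quotation of the study's formula. The pixel positions in `WELLS`, the bounds, and all numerical values are hypothetical, and homogeneous static maps are used for simplicity.

```python
import numpy as np

GRID = 120
# Hypothetical pixel (row, col) positions of the wells; not taken from the paper.
WELLS = {"F1": (60, 10), "F2": (60, 80), "F3": (60, 110)}

def min_max(a, lo, hi):
    """Scale values to [0, 1] given known bounds lo < hi (assumed min-max form)."""
    return (np.asarray(a, dtype=float) - lo) / (hi - lo)

def build_input(porosity, permeability, q_t, p_f2, p_f3):
    """Stack two static channels and one sparse dynamic channel into a (3, 120, 120) array.

    All arguments are assumed to be already normalized to [0, 1].
    """
    d_static = np.stack([
        np.full((GRID, GRID), porosity),       # homogeneous porosity image
        np.full((GRID, GRID), permeability),   # homogeneous permeability image
    ])
    d_dynamic = np.zeros((1, GRID, GRID))      # zero everywhere except at the wells
    d_dynamic[0][WELLS["F1"]] = q_t            # injection rate at the injector
    d_dynamic[0][WELLS["F2"]] = p_f2           # bottom-hole pressure at F2
    d_dynamic[0][WELLS["F3"]] = p_f3           # bottom-hole pressure at F3
    return np.concatenate([d_static, d_dynamic], axis=0)

# Illustrative values only
q = min_max(250.0, lo=0.0, hi=330.0)
d_total = build_input(0.5, 0.5, q, p_f2=0.4, p_f3=0.6)
```

A sequence of such arrays, one per time step, would then form the input to a ConvLSTM layer; heterogeneous maps can be substituted by passing full 120 × 120 arrays in place of the `np.full` calls.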
Fig. 3. Structure of input data set. Dynamic data includes the injection rate for injection well F1 and bottom-hole pressure data from wells F2 and F3. Static data
includes the effective reservoir porosity and permeability. After concatenation, static and dynamic data form a 3D tensor.
3. Results
Fig. 5. Performance of training and validation for each regression model: (A) ConvLSTM model, (B) CNN model, and (C) LSTM model. The blue line indicates the true
observed bottom-hole pressure, the cyan line indicates the training data, and the red line indicates the validation data. (For interpretation of the references to colour
in this figure legend, the reader is referred to the web version of this article.)
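The goodness-of-fit statistics used to compare the regression models (R2 and RMSE, Table 1) can be computed as in the sketch below. It assumes the usual coefficient-of-determination definition of R2 and uses made-up numbers; the exact definitions and data scaling used in the study are not restated here.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y_true, y_pred):
    """Root-mean-square error between observations and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Made-up observations and predictions, for illustration only
y_obs = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
r2_value = r_squared(y_obs, y_hat)
rmse_value = rmse(y_obs, y_hat)
```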
Table 1
Statistical parameters of different regression models (boldface indicates best performance).

                     ConvLSTM   CNN      LSTM
Training     R2      0.9931     0.9831   0.9894
             RMSE    0.0891     0.1091   0.0942
Validation   R2      0.9903     0.9774   0.9812
             RMSE    0.0912     0.1121   0.1031

if the ConvLSTM regression model is well trained, it can be deployed as an anomaly detector to predict pressure anomalies. When a new bottom-hole pressure measurement falls within the confidence interval, it is labeled normal; otherwise, it is classified as an anomaly.
As Fig. 7A shows, most detected anomalies happened when the injector F1 was shut in, and the residual values (i.e., $\Delta P$) between predicted and ground truth bottom-hole pressure (Fig. 7B) during those time intervals are significantly greater than $3\sigma$. During injection cycles, the predicted pressure values are mostly located within the confidence intervals.

4. Discussion

4.1. Impact of permeability

To study the impact of permeability on anomaly detection, we keep the porosity map constant and change the permeability map. For each case, we apply the trained ConvLSTM to detect the anomalies. The result is shown in Fig. 8. The permeability is varied from 0.1 mD to 1000 mD. When k = 0.1, 1.5 and 10 mD, the anomalies always happened when well F1 was shut in, and the number of detected anomalies decreased from 68 to 57. During the injection period, CO2 was continuously injected into the reservoir, so the reservoir pressure was always maintained during artificial CO2 release. Therefore, the pressure difference between the CO2 leakage case and the base case (as predicted by this ConvLSTM model) is not significant. Thus, it is not easy to detect the anomalies. During the shut-in period, reservoir pressure could not be maintained because the injection was stopped, and the pressure dropped more quickly and sharply in response to the controlled CO2 releases. Therefore, the pressure differences between the CO2 leakage case and the base case (as predicted by the ConvLSTM model) become more significant.
However, the opposite results were obtained when the permeability increased from 10 mD to 1000 mD. The number of detected anomalies still decreased compared to the base case (N = 57), but anomalies happened during the injection time. This phenomenon is probably caused by faster pressure dispersion in a higher permeability reservoir. Results shown here suggest that the trained ConvLSTM is relatively robust to uncertain effective permeability values. We emphasize that the approach presented here is general and can be applied to spatially heterogeneous permeability and/or porosity maps when such information is available (examples are shown below in Section 4.3).

4.2. Impact of detection threshold

The number of anomalies detected by the trained ConvLSTM model is different if a different threshold is applied. As Fig. 7B shows, the anomaly count is 57 when the threshold is $3\sigma$. It changes to 78 if the threshold is $2\sigma$ and changes to 113 if the threshold is decreased to $1\sigma$. Different $\sigma$ values indicate different prediction intervals, which also means different credibility of the trained ConvLSTM model. Higher $\sigma$ values indicate wider prediction intervals, but also less reliability of the trained ConvLSTM model. Smaller $\sigma$ values indicate narrower prediction intervals, and also more reliability of the ConvLSTM model. Thus the selection of the anomaly threshold is important, and depends on the reliability of the regression model.

4.3. Impact of heterogeneity

To investigate the impact of permeability heterogeneity on anomaly detection, we generate the permeability maps using geostatistical methods. We established the permeability semi-variogram in the lateral direction based on measured permeability values (Hosseini et al., 2013), and then applied the semi-variogram to generate permeability maps for the study region, as shown in Fig. 9. Fig. 9A shows the homogeneous permeability map, and Fig. 9B-D show the heterogeneous permeability maps. As Fig. 9 shows, the number of detected anomalies
Fig. 6. Reservoir bottom-hole pressure as a function of permeability for the whole reservoir. Shaded areas represent confidence (or prediction) intervals estimated
from the trained ConvLSTM model with 1000 simulations, solid green line corresponds to ground truth bottom-hole pressure measured during the CO2 controlled
release experiment, and cyan line (with filled dots) represents the injection rate.
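One simple way to turn an ensemble of simulated pressure series into a shaded band like the one in Fig. 6 is sketched below. It assumes the simulations are available as an array of shape (n_sims, n_times) and summarizes them by the ensemble mean plus or minus k standard deviations; how the 1000 simulations were actually generated and summarized in the study is not specified here, so the synthetic `sims` array is purely a stand-in.

```python
import numpy as np

def prediction_interval(sims, k=3.0):
    """Return (mean, lower, upper) of the band mean +/- k * std across simulations."""
    sims = np.asarray(sims, dtype=float)
    mean = sims.mean(axis=0)
    std = sims.std(axis=0, ddof=1)   # sample standard deviation over the ensemble
    return mean, mean - k * std, mean + k * std

# Synthetic stand-in for 1000 simulated pressure series of 240 time steps each
rng = np.random.default_rng(0)
sims = 30.0 + rng.normal(0.0, 0.05, size=(1000, 240))
mean, lower, upper = prediction_interval(sims, k=3.0)
```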
Fig. 7. Anomaly detection based on the ConvLSTM regression model. (A) Red dots indicate anomalies, the zone colored in cyan is the confidence interval, within which the bottom-hole pressure is considered normal (i.e., no leakage happened), and points outside the confidence interval are labeled as anomalies. (B) Residual distribution between predicted and ground truth bottom-hole pressure. Colored zones are confidence intervals with different detection thresholds, and N on the right hand side of the figure shows the number of anomalies corresponding to different detection thresholds. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
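The thresholding rule illustrated in Fig. 7 can be sketched as follows: a sample is flagged when the absolute residual between observed and predicted pressure exceeds k times sigma. The function name, the toy residuals, and the value of sigma are assumptions for illustration; in practice sigma would have to be estimated, for example from residuals on nominal (no-leak) data.

```python
import numpy as np

def detect_anomalies(p_obs, p_pred, sigma, k=3.0):
    """Flag samples whose residual |p_obs - p_pred| exceeds k * sigma."""
    residual = np.asarray(p_obs, dtype=float) - np.asarray(p_pred, dtype=float)
    return np.abs(residual) > k * sigma

# Toy residuals (not field data) with an assumed sigma of 0.1
p_pred = np.zeros(6)
p_obs = np.array([0.05, -0.15, 0.25, -0.35, 0.02, 0.31])
sigma = 0.1
counts = {k: int(detect_anomalies(p_obs, p_pred, sigma, k).sum()) for k in (1, 2, 3)}
```

Lowering k flags more samples, which matches the trend of anomaly counts across the detection thresholds shown in the figure.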
Fig. 8. Impact of permeability on CO2 leakage anomaly detection: k is permeability, which is changed from 0.1 mD to 1000 mD. N is the number of anomalies
detected based on the trained ConvLSTM model.
and the corresponding time point vary for different reservoir properties. In the absence of a detailed subsurface data set, an accurate reservoir characterization is impossible, which undermines the value of history matching for reservoir forecasting. The recently introduced data space inversion (DSI) paradigm, which focuses more on propagating an ensemble of uncalibrated prior models as opposed to history matching, provides a more attractive alternative in the context of anomaly detection under uncertainty (Jeong et al., 2018; Satija and Caers, 2015). For example, an ensemble of permeability maps can be formed and then fed to the ConvLSTM model to generate the prediction interval.

4.4. Limitations

The ConvLSTM model we proposed in this research is mainly demonstrated for pressure anomaly detection in CO2 sequestration applications. The pressure prediction depends on the antecedent time series information. If the amount of data is not enough (e.g., discontinuous time series measurements), or if the data set is noisy, our proposed model may yield poor predictions and offer limited improvement.
Moreover, this work relies on a predictive reservoir model to calculate pressure differences ($\Delta P$), and we assumed that the whole
Fig. 9. (A) Homogeneous permeability (default case); (B) Permeability map based on spherical variogram; (C) Permeability map based on power variogram; (D)
Permeability based on Gaussian variogram. The map for each subfigure corresponds to the permeability map calculated based on the corresponding semi-variogram
function.
reservoir can be represented by a single layer. In other cases, a 3D re- Bachu, S., 2000. Sequestration of CO2 in geological media: criteria and approach for site
servoir model might be more reasonable, but it may also increase the selection in response to climate change. Energy Convers. Manage. 41 (9), 953–970.
Bielicki, J.M., Pollak, M.F., Fitts, J.P., Peters, C.A., Wilson, E.J., 2014. Causes and fi-
memory and CPU demand for training the model. As an alternative to nancial consequences of geologic CO2 storage reservoir leakage and interference with
the approach taken in this study, Sun et al. (2019) applied a supervised other subsurface resources. Int. J. Greenhouse Gas Control 20, 272–284.
machine learning algorithm, IsolationForest, to partition pressure data Che, Z., Purushotham, S., Cho, K., Sontag, D., Liu, Y., 2018. Recurrent neural networks for
multivariate time series with missing values. Scientific Rep. 8 (1), 6085.
recursively using an ensemble of random trees, in order to identify Chen, B., Harp, D.R., Lin, Y., Keating, E.H., Pawar, R.J., 2018. Geologic CO2 sequestration
pressure anomalies. Their method does not require the use of a pre- monitoring design: a machine learning and uncertainty quantification based ap-
dictive reservoir model. Nevertheless, a physics-based approach is still proach. Appl. Energy 225, 332–345.
Choi, Y.-S., Young, D., Nešić, S., Gray, L.G., 2013. Wellbore integrity and corrosion of
required when it is necessary to incorporate knowledge on the (un- carbon steel in CO2 geologic storage environments: a literature review. Int. J.
certain) reservoir properties. Greenhouse Gas Control 16, S70–S77.
The structure of data set that is fed to ConvLSTM is redundant be- Delshad, M., Kong, X., Tavakoli, R., Hosseini, S.A., Wheeler, M.F., 2013. Modeling and
simulation of carbon sequestration at Cranfield incorporating new physical models.
cause we strive for generality (i.e., our model structure can be applied
Int. J. Greenhouse Gas Control 18, 463–473.
to heterogeneous static variable maps) at the expense of specificity in Dixon, T., Romanak, K.D., 2015. Improving monitoring protocols for CO2 geological
this case. In the base case, static variable images are constant but still storage with technical advances in CO2 attribution monitoring. Int. J. Greenhouse
participate in the convolution computation in each time step, and most Gas Control 41, 29–40.
Donahue, J., Anne Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S.,
pixel values in dynamic variable images, except for the pixel value at Saenko, K., Darrell, T., 2015. Long-term recurrent convolutional networks for visual
the injection well location, are zero, which increases the computational recognition and description. In: Proceedings of the IEEE Conference on Computer
cost. Vision and Pattern Recognition, pp. 2625–2634.
5. Conclusions

In this study, we propose a spatiotemporal convolutional long short-term memory (ConvLSTM) neural network model for CO2 leakage detection at a CO2-EOR site. We apply, evaluate, and discuss the model's capability for predicting reservoir pressure before and after CO2 leakage using real data collected from the site. Our results indicate that:

1. The machine learning approach can fuse both spatial information and temporal information, which increases the detectability of anomalies.
2. Spatial features from the input images can be learned by convolutional neural nets.
3. LSTM models can learn temporal features.
4. A thorough reservoir characterization is the key for reliable leak detection when a predictive reservoir model is needed. In this study, anomalies can be more easily detected after injection in a low-permeability reservoir.
5. Availability of operation data (e.g., injection rate) is also critical for predicting nominal pressure patterns.
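
The points above rest on comparing observed pressure against a predicted nominal pattern. The following is a minimal sketch of one simple way to do that: a residual band fitted on a leak-free period. It is not the detection criterion used in this study. The function names, the multiplier `k`, and all pressure values are hypothetical, and the predicted series stands in for the output of any trained model.

```python
from statistics import mean, stdev


def fit_threshold(observed, predicted, k=3.0):
    """Derive a residual band from a leak-free (base) period.

    k is a hypothetical multiplier on the residual standard deviation.
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    return mean(residuals), k * stdev(residuals)


def flag_anomalies(observed, predicted, center, half_width):
    """Return indices where the residual leaves the band center +/- half_width."""
    return [
        i
        for i, (o, p) in enumerate(zip(observed, predicted))
        if abs((o - p) - center) > half_width
    ]


# Hypothetical pressures in arbitrary units, not field data.
base_obs = [100.0, 100.2, 99.9, 100.1, 99.8, 100.0]
base_pred = [100.0, 100.0, 100.0, 100.0, 100.0, 100.0]
center, half_width = fit_threshold(base_obs, base_pred)

test_obs = [100.1, 99.9, 98.5, 98.2, 100.0]
test_pred = [100.0, 100.0, 100.0, 100.0, 100.0]
print(flag_anomalies(test_obs, test_pred, center, half_width))  # [2, 3]
```

A band fitted on the base period only flags deviations that are large relative to the model's usual error, so the quality of the nominal prediction directly limits what can be detected.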

Although our results are mainly demonstrated using 2D reservoir models, the approach can be readily extended to 3D models if it is necessary to consider vertical heterogeneity in the reservoir.

Conflict of interest

None.

Acknowledgements

This work was supported by the U.S. Department of Energy, National Energy Technology Laboratory (NETL) under Grant No. DE-FE0026515. The authors are grateful to the handling Associate Editor and two anonymous reviewers for their constructive comments.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at https://doi.org/10.1016/j.jhydrol.2019.04.015.
Z. Zhong, et al. Journal of Hydrology 573 (2019) 885–894
Soltanian, M.R., Amooie, M.A., Cole, D.R., Graham, D.E., Hosseini, S.A., Hovorka, S., Keating, G.N., Kavetski, D., Guthrie, G.D., 2008. Development of a hybrid process and
Pfiffner, S.M., Phelps, T.J., Moortgat, J., 2016. Simulating the Cranfield geological system model for the assessment of wellbore leakage at a geologic CO2 sequestration
carbon sequestration project with high-resolution static models and an accurate site. Environ. Sci. Technol. 42 (19), 7280–7286.
equation of state. Int. J. Greenhouse Gas Control 54, 282–296. Wang, L., Bai, B., Li, X., Liu, M., Wu, H., Hu, S., 2016. An analytical model for assessing
Sun, A.Y., Nicot, J.-P., 2012. Inversion of pressure anomaly data for detecting leakage at stability of pre-existing faults in caprock caused by fluid injection and extraction in a
geologic carbon sequestration sites. Adv. Water Resour. 44, 20–29. reservoir. Rock Mech. Rock Eng. 49 (7), 2845–2863.
Sun, A.Y., Zeidouni, M., Nicot, J.-P., Lu, Z., Zhang, D., 2013. Assessing leakage detect- Watson, T.L., Bachu, S., et al., 2009. Evaluation of the potential for gas and CO2 leakage
ability at geologic CO2 sequestration sites using the probabilistic collocation method. along wellbores. SPE Drilling Completion 24 (01), 115–126.
Adv. Water Resour. 56, 49–60. Yang, C., Hovorka, S.D., Young, M.H., Trevino, R., 2014. Geochemical sensitivity to CO2
Sun, A.Y., Lu, J., Hovorka, S., 2015. A harmonic pulse testing method for leakage de- leakage: detection in potable aquifers at carbon sequestration sites. Greenhouse
tection in deep subsurface storage formations. Water Resour. Res. 51 (6), 4263–4281. Gases: Sci. Technol. 4 (3), 384–399.
Sun, A.Y., Lu, J., Freifeld, B.M., Hovorka, S.D., Islam, A., 2016. Using pulse testing for Yang, Y.-M., Dilmore, R.M., Bromhal, G.S., Small, M.J., 2018. Toward an adaptive
leakage detection in carbon storage reservoirs: a field demonstration. Int. J. monitoring design for leakage risk–closing the loop of monitoring and modeling. Int.
Greenhouse Gas Control 46, 215–227. J. Greenhouse Gas Control 76, 125–141.
Sun, A.Y., Lu, J., Islam, A., 2017. A laboratory validation study of the time-lapse oscil- Zeidouni, M., 2012. Analytical model of leakage through fault to overlying formations.
latory pumping test for leakage detection in geological repositories. J. Hydrol. 548, Water Resour. Res. 48 (12).
598–604. Zhong, Z., Carr, T.R., 2016. Application of mixed kernels function (mkf) based support
Sun, A.Y., Jeong, H., González-Nicolás, A., Templeton, T.C., 2018. Metamodeling-based vector regression model (svr) for CO2–reservoir oil minimum miscibility pressure
approach for risk assessment and cost estimation: Application to geological carbon prediction. Fuel 184, 590–603.
sequestration planning. Comput. Geosci. 113, 70–80. Zhong, Z., Carr, T.R., 2019. Geostatistical 3D geological model construction to estimate
Sun, A.Y., Zhong, Z., Jeong, H., Yang, Q., 2019. Building complex event processing the capacity of commercial scale injection and storage of CO2 in Jacksonburg-
capability for intelligent environmental monitoring. Environ. Modell. Software Stringtown oil field, West Virginia, USA. Int. J. Greenhouse Gas Control 80, 61–75.
116, 1–6. Zhong, Z., Liu, S., Kazemi, M., Carr, T.R., 2018. Dew point pressure prediction based on
Vilarrasa, V., Carrera, J., 2015. Geologic carbon storage is unlikely to trigger large mixed-kernels-function support vector machine in gas-condensate reservoir. Fuel
earthquakes and reactivate faults through which CO2 could leak. Proc. Natl. Acad. Sci 232, 600–609.
201413284. Zhu, G., Zhang, L., Shen, P., Song, J., 2017. Multimodal gesture recognition using 3-d
Viswanathan, H.S., Pawar, R.J., Stauffer, P.H., Kaszuba, J.P., Carey, J.W., Olsen, S.C., convolution and convolutional lstm. IEEE Access 5, 4517–4524.
894