Machine Learning-Based Data Processing
Graduation Defense
1/31/20 10:41:40 AM 1
Introduction
• Project overview
- Big data from oil/gas production
- Missing values and errors to fix
- Data prediction after fixing
Introduction
• Background
- Throughout the life span of oil/gas
production, data from the surface
and underground is created and stored.
- These huge datasets must be
processed and analyzed so that the
enterprise can make the right decisions
Introduction
• Big data
- Huge datasets are created every day:
e.g. in production data, a record with 20 attributes is created every 3
minutes (175,200 records/year) for a single producing well
- Main attributes in the data: fluid rates, downhole and surface pressure and
temperature, and other parameters such as control device size, status, etc.
TIME         OIL_PRODUCTION  >7000  MPFM_STATUS  GAS_PRODUCTION  WATER_PRODUCTION  MPFM_PRESSURE  MPFM_TEMP   THP         THT
7/8/11 0:00  0               0      3            0               0                 4258.31257     74.0502012  4263.49939  51.030014
7/8/11 0:03  0               0      3            0               0                 4258.49291     73.9946195  4263.67881  50.9100151
7/8/11 0:06  0               0      3            0               0                 4258.69183     73.9390926  4263.87671  50.7900124
7/8/11 0:09  0               0      3            0               0                 4258.85497     73.8835649  4264.03901  50.6700134
7/8/11 0:12  0               0      3            0               0                 4259.05251     73.8279839  4264.23555  50.5500107
7/8/11 0:15  0               0      3            0               0                 4259.68645     73.7724563  4264.86625  50.4300118
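As a quick check of the record count quoted above, the arithmetic can be written out. This is only a sketch: the variable names are illustrative and a 365-day year is assumed.

```python
# Sampling interval taken from the slide: one record every 3 minutes.
MINUTES_PER_RECORD = 3

records_per_hour = 60 // MINUTES_PER_RECORD   # 20 records per hour
records_per_day = records_per_hour * 24       # 480 records per day
records_per_year = records_per_day * 365      # assumes a 365-day year

print(records_per_year)  # 175200
```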
Introduction
• Big data
- All kinds of big data,
such as seismic, geology,
well logging, drilling…
- Recording devices can be
damaged or fail in harsh
environments
List of attributes:
['OIL_PRODUCTION', 'MPFM_STATUS', 'GAS_PRODUCTION',
'WATER_PRODUCTION', 'MPFM_PRESSURE', 'MPFM_TEMP', 'THP', 'THT',
'FLP', 'FLT', 'CHP', 'CHOKE_SIZE', 'CHT', 'DHPT1', 'DHTT1', 'DHPT2', 'DHTT2']
Introduction
• Goals
- Pressure prediction
- Temperature prediction
Solution
• Data processing
- recognize values which are errors by PE physical rules
- put missing values and errors to be NaN
- fill all NaN with correct, reasonable values
• Data predicting
- base on the fixed data 'history' to predict
- use models to match the data
- seek a 'data science' solution as an alternative
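The data-processing steps on this slide (flag physically impossible values, then set them and any missing values to NaN) could look roughly like the sketch below. The rule set is hypothetical: which attributes may never read 0 is an assumption made for illustration, not the project's actual rule list.

```python
import math

# Hypothetical rule set (an assumption): attributes for which a reading
# of 0 is treated as physically impossible.
NONZERO_ATTRIBUTES = {"THP", "THT", "DHPT1", "DHTT1"}


def mark_errors(record):
    """Return a copy of `record` with missing or rule-violating values set to NaN."""
    cleaned = {}
    for name, value in record.items():
        if value is None:
            cleaned[name] = math.nan      # missing value
        elif name in NONZERO_ATTRIBUTES and value == 0:
            cleaned[name] = math.nan      # violates the physical rule
        else:
            cleaned[name] = value         # kept as recorded
    return cleaned


# Made-up record for illustration only.
raw = {"OIL_PRODUCTION": 0, "THP": 0, "THT": 51.03, "DHPT1": None}
cleaned = mark_errors(raw)
print(cleaned)
```

A production rate of 0 is left untouched here, since a well can legitimately produce nothing; only attributes listed in the rule set are flagged.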
Implementation
• Modelling
Ø Linear regression is used
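A minimal sketch of filling a missing value with linear regression follows. The slides do not say which attributes were used as predictor and target, so pairing two pressure columns from the sample records (read here as MPFM_PRESSURE and THP) is an assumption, and the last THP value is treated as missing purely for illustration. The fit is written out by hand rather than with any particular library.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# (MPFM_PRESSURE, THP) pairs from the sample records; the final THP is
# replaced by NaN here to stand in for a missing reading (an assumption).
pairs = [
    (4258.31257, 4263.49939),
    (4258.49291, 4263.67881),
    (4258.69183, 4263.87671),
    (4258.85497, 4264.03901),
    (4259.05251, 4264.23555),
    (4259.68645, math.nan),
]

# Fit only on rows where the target is known.
known = [(x, y) for x, y in pairs if not math.isnan(y)]
slope, intercept = fit_line([x for x, _ in known], [y for _, y in known])

# Replace NaN targets with the model's prediction; keep recorded values.
filled = [(x, slope * x + intercept if math.isnan(y) else y) for x, y in pairs]
print(filled[-1][1])
```

On these sample rows the filled value lands very close to the recorded 4264.86625, which is what makes a linear model plausible for this pair of columns.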
Implementation
• Predicting
Ø Prophet, a Python forecasting library for time series prediction, is used.
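A hedged sketch of how Prophet is typically called is given below. Prophet expects a table with a date column `ds` and a value column `y`; the helper names `to_prophet_rows` and `forecast`, the default model settings, and the sample values are assumptions for illustration, not the project's configuration.

```python
from datetime import datetime, timedelta


def to_prophet_rows(start, daily_values):
    """Build the rows Prophet expects: a date column `ds` and a value column `y`."""
    return [
        {"ds": start + timedelta(days=i), "y": value}
        for i, value in enumerate(daily_values)
    ]


def forecast(rows, periods):
    """Fit Prophet on `rows` and predict `periods` further days.

    Imports are local because pandas and prophet are third-party packages
    that must be installed separately.
    """
    import pandas as pd
    from prophet import Prophet

    history = pd.DataFrame(rows)
    model = Prophet()                     # default settings (an assumption)
    model.fit(history)
    future = model.make_future_dataframe(periods=periods)
    return model.predict(future)[["ds", "yhat", "yhat_lower", "yhat_upper"]]


# Made-up daily values, for illustration only (not project data).
rows = to_prophet_rows(datetime(2011, 7, 8), [100.0, 99.5, 99.1, 98.4])
print(rows[0])
```

Calling `forecast(rows, periods=30)` would then return the predicted values (`yhat`) with their uncertainty bounds; a real run needs far more history than the four rows shown here.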
Results
• Data Processing
Ø Pressure
o Recognize the '0' values as errors
o Correct the corresponding values using
machine learning models
o Many more errors occur in wellhead
pressure data, due to device
failures
Results
• Data Processing
Ø Downhole temperature
correction and filling
o Recognize that downhole
temperature values cannot
be zero
o Get the right values from
machine learning models
Results
• Data Processing
Ø Other pressure and
temperature
o Remove all
outliers
o Fill in the calculated
values
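The slides do not state how outliers were detected or how the replacement values were calculated. As one possible sketch, the block below flags outliers with a median-based (modified z-score) rule and fills the gaps by linear interpolation; both techniques, the threshold, and the injected spike are assumptions chosen for illustration.

```python
import math
import statistics


def mask_outliers(values, threshold=3.5):
    """Replace values far from the median with NaN (modified z-score rule).

    Assumes `values` contains no NaN on input.
    """
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return list(values)               # no spread, nothing to flag
    return [
        math.nan if 0.6745 * abs(v - median) / mad > threshold else v
        for v in values
    ]


def interpolate(values):
    """Fill interior NaN runs linearly between the nearest valid neighbours."""
    filled = list(values)
    valid = [i for i, v in enumerate(filled) if not math.isnan(v)]
    for left, right in zip(valid, valid[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Temperature-like series with one made-up spike (500.0) injected.
series = [51.03, 50.91, 500.0, 50.67, 50.55, 50.43]
masked = mask_outliers(series)
repaired = interpolate(masked)
print(repaired)
```

A median-based rule is used here because a single large spike barely moves the median, whereas it would distort a mean and standard deviation computed over the same window.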
Results
• Data Prediction
Ø Production of oil/gas
o Convert the processed data to daily
data
o Use the machine learning module
Prophet to predict
o Oil/gas production always declines
over the whole life span of the wells
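The conversion from 3-minute records to daily data can be sketched as a simple group-by-day. Averaging is an assumption here; the slides do not say whether daily values were averaged, summed, or sampled, and the records below are made up.

```python
from collections import defaultdict
from datetime import date, datetime
from statistics import mean


def to_daily(records):
    """Average (timestamp, value) records into one value per calendar day."""
    by_day = defaultdict(list)
    for timestamp, value in records:
        by_day[timestamp.date()].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}


# Made-up records for illustration only.
records = [
    (datetime(2011, 7, 8, 0, 0), 10.0),
    (datetime(2011, 7, 8, 0, 3), 12.0),
    (datetime(2011, 7, 9, 0, 0), 8.0),
]
daily = to_daily(records)
print(daily[date(2011, 7, 8)])  # 11.0
```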
Results
• Data Prediction
Ø Production of water
o An oil reservoir driven by bottom
water produces more and
more water over time
o The prediction reproduces this
characteristic
[Figure panels: gas, oil, water]
Results
• Data Prediction
Ø Pressure
o Pressure represents the energy of the
reservoir, which drives the fluids out
o Both downhole and surface pressure
drop as production continues
[Figure panels: surface pressure, downhole pressure]
Results
• Data Prediction
Ø Temperature
o Temperature is controlled by
geothermal energy
o It is treated as a constant
o The predictions show it is stable all
the time
o Surface temperature fluctuates
somewhat due to changes in
ambient temperature
[Figure panels: surface temperature, downhole temperature]
Discussion
---- Jeff Leek, 2013.
Thank you!