a data-driven smart proxy model for a comprehensive

A Data-Driven Smart Proxy Model for AComprehensive Reservoir Simulation

Faisal AleneziDepartment of Petroleum and Natural Gas Engineering

West Virginia University

Email: [email protected]

Shahab MohagheghDepartment of Petroleum and Natural Gas Engineering

West Virginia University

Email: [email protected]

Abstract— One of the most important tools for studying fluidflow behavior in oil and gas reservoirs is reservoir simulation. Itis constructed based on a comprehensive geological information.A comprehensive numerical reservoir model has tens of millionsof grid blocks. Therefore, it becomes computationally expensiveand time consuming to run the model for different reservoirsimulation scenarios. There are many efforts have been madeto reduce the computational size using the proxy models. Proxymodels are the substitute to the complex numerical simulationby producing a meaningful representation of the complex systemin a very short time. The conventional proxy models are eitherstatistical or mathematical approaches. These conventionalapproaches are still limited to the complexity of the reservoirand the number of the numerical simulation runs needed tobuild the proxy model. In this study, a smart proxy model thatis based on artificial intelligence and data mining is presented.A grid based smart proxy model is developed to reproduce thedynamic reservoir properties of a full- field numerical simulationin few seconds. A comprehensive spatio-temporal database isbuilt using the conducted numerical simulation run. The datafrom the database is trained, calibrated, and verified throughoutthe development of the smart proxy model. Smart proxy modelis able to produce pressure and saturation at each reservoirgrid block accurately and with a significantly less computationaltime compared to the numerical reservoir simulation model.

Keywords—Artificial Intelligence, Data Mining, Proxy Model-ing, Reservoir Simulation.

I. INTRODUCTION

Petroleum industry strives to find oil and gas reserves,

developing these resources, meet the world energy demand,

and maximize profits. One of the most important tools in oil

and gas reservoirs development and management is reservoir

simulation. It is a necessary tool for reservoir engineering

strategy plans. The key goal of reservoir simulation is to

predict future performance of the reservoir and find ways and

means of optimizing the recovery of some of the hydrocarbons

under different operating conditions. Accurate reservoir simu-

lation involves a comprehensive description of the reservoir

properties. To date, the computational science, addressing

numerical solution to complex multi-physic, non-linear, and

partial differential equations, are at the lead of engineering

problem solving and optimization [1].

Due to the complexity of a reservoir, sometimes it is com-

putationally extravagant to develop and run numerical simu-

lation models. Therefore, the petroleum industry investment

in reservoir simulation tools is expensive. The rate of return

on these investments should be calculated to maximize the

benefits from the reservoir simulation. Reservoir simulation

proxy models are one way to increase the return on investment

in reservoir simulation. Proxy-modeling (also known as surro-

gate modeling) is a computationally inexpensive alternative

to full numerical simulation in assisted history matching,

production optimization, and forecasting. A proxy model is

defined as a mathematically, statistically, or data driven model

defined function that replicates the simulation model output

for selected input parameters [2]. The proxy model’s results

are not to mimic the numerical simulation results with 100%

accuracy, but the outputs generated with the amount of time to

run these models, give a reasonable range of error. Reducing

the computational time to few seconds, make these models sig-

nificantly competent and attractive to the reservoir engineers

[3].

There are several approaches for generating the proxy models.

Response surface methodology (RSM), reduced order models

(ROD), reduced physics models (RPM) are the first techniques

introduced in this field. The most widely used approach is the

response surface methodology. Response surface methodology

(RSM) consists of a group of mathematical and statistical

techniques used in the development of a sufficient functional

relationship between a response of interest and a number of

associated input variables [4].

In recent years, a newly developed technique for generating

proxy modeling has introduced to the reservoir simulation. It

is neither statistical nor mathematical; it is a smart approach

that is based on data mining and artificial intelligence.

II. DATA MINING AND ARTIFICIAL INTTLEGENCE

TECHNIQUE

The amount of data in the world is increasing dramatically.

Data mining is about solving problems by analyzing and

discovering the patterns already present in databases [5].

Artificial Intelligence is a powerful technique that teaches the

machines how to process data. Data mining and Artificial

Intelligent have been applied in petroleum engineering field.

In his series of articles in Society of petroleum engineers

journal, Shahab D. Mohaghegh presented three types of the

virtual intelligence (neural networks, genetic algorithm, and

fuzzy logic) and their applications in the oil and gas industry

978-1-4673-8956-3/16/$31.00 ©2016 IEEE

[6][7][8]. The conclusions from these articles show the ability

and the potential of the Artificial Intelligence to solve complex

problems in reservoir engineering.

In reservoir simulation, Artificial Intelligence is used before

to generate a proxy model that is able to reproduce the

numerical simulation outputs. In 2006, Mohaghegh developed

the first SRM (surrogate reservoir model) based on data mining

and artificial intelligence (which is later called smart proxy

model). The proxy model built was able to solve the time

consuming challenge to run uncertainty analysis in one of the

giant oil fields [9]. Since then, the smart reservoir model has

applied to several studies to replicate the numerical reservoir

simulation outputs. Alireza Shahkarami verified the ability

of the smart proxy model in performing reservoir simulation

history matching [10]. S. Amini developed a smart proxy

model that is capable of mimicking a numerical reservoir

simulation results in a complex reservoir [11].

The main part of developing the smart proxy model is to

build a spatio-temporal database from a number of numerical

simulation runs in order to train, validate, and test the data for

successful predication. In this paper, the author investigates

the ability of the smart proxy model to mimic the numerical

simulation results using one numerical simulation run of a

complex reservoir. The main advantages of using the smart

proxy model against other proxy models in reservoir simula-

tion are; 1. There is no limitation in reservoir complexity; 2.

There is no simplification in the reservoir physics; 3. The time

to run the smart proxy model is very short.

III. FIELD OF STUDY AND GEOL-CELLULAR MODEL

The SACROC unit (Scurry Area Canyon Reef) is part of

the Kelly-Snyder Field located northeastern of the Permian

Basin in West Texas. The field discovered in 1948 with

approximately 2.73 billion barrels oil in place. In 1954, a

pressure maintenance program has established in the reser-

voir by water injection. A geo-cellular model is developed

for the SACROC unit with 221 layers and 149X287 cells

spatially. The high number of grid blocks (9,450,632) with

more than 2000 wells, make conducting study in the field time

consuming. Therefore, to serve the objective of this study, the

geo-cellular is upscale to 100X142 spatially and to 16 layers.

Also, a northern area of up-scaled geo-cellular is picked with

dimensions of 39X51 spatially. The final total number of grid

blocks for the study is 31,200. It is important to mention that

the porosity-permeability heterogeneity is preserved with the

up-scaled geo-cellular model. Figure 1 and Figure 2 show

the porosity-permeability histograms before and after the up-

scaling process.

Figure 3 shows the porosity 3D model of the study area. Using

the scale on the right side of the figure, this figure provides

strong evidence of the field heterogeneity.

There are 27 production wells and 12 water injection wells in

this part of the field.

Fig. 1. Porosity-Permeability Histogram of the High Resolution Geo-cellularModel

Fig. 2. Porosity-Permeability Histogram of the Up-scaled Geo-cellular Model

Fig. 3. Porosity Geo-cellular Model of the Study Area

IV. SMART PROXY DEVELOPMENT

As aforementioned, the smart proxy model is a data mining

and artificial intelligence approach. The development proce-

dure fall in four main stages; Design full field reservoir simu-

lation, extract the output data from the reservoir simulation to

generate the database, develop the smart proxy model by train

and validate data for a targeted output, and verify the smart

proxy model performance using blind sets of data (Figure

4). The following sections are discussing these steps in more

details.

A. Numerical Simulation Design

The purpose of designing and running the numerical sim-

ulation is using its outputs as inputs to the neural network

for training. The key of designing the simulation model is

to take into consideration the objective of the smart proxy

model. The aim of this study is to replicate and predict the

reservoir dynamic properties at grid block level under different

water injection flow rates. With this in mind, the simulation

run is designed to calculate the well production data at each

formation layer (grid block). Certainly, having the production

data at every grid block of the wells is generating the required

Fig. 4. Smart Proxy Model Work Flow

data heterogeneity for the neural network training (Figure 5).

In sum, it is only one simulation run designed to develop the

smart proxy model.

Fig. 5. Production Data Histogram at Grid Block

B. Database Generation

The majority of time in developing the smart proxy model

is consumed in database generation. The smart proxy model

is constructed to represent the principles pf the reservoir

physics. Therefore, reservoir engineer knowledge and inputs

are essential to achieve the goal of developing the smart proxy

model. The data input in the database are coming from two

sources; the geo-cellular model (static data) and from the

fluid flow model (dynamic data) at each grid block. Also, the

static and dynamic data from the neighboring grid blocks are

collected to monitor the pressure and saturation movement.

The static data include reservoir structure and grid blocks

information. The dynamic data are the properties that change

with time, such as well production/injection values and the grid

Data TypeStatic Data Dynamic DataGrid Location Grid Injection RateGrid Top Grid Injection CumulativeGrid Thickness Grid Production RateGrid Porosity Grid Production CumulativeGrid Permeability Grid PressureDistance to Injection and Bound-aries

Grid Saturation

TABLE IDATA SELECTED TO DEVELOP SMART PROXY MODEL

block pressure/saturation values. Data selected to construct the

smart proxy are shown in table 1.

C. Data Sampling

Mining data from 31200 grid blocks is generating a huge

data set of 1.5 million records. The tools used in this study

are not able to handle such an amount of data. Therefore, data

sampling is utilized for developing the smart proxy model.

Two sampling methods were verified. First, random sampling

is used with acceptable training and prediction results. Then, a

smart sampling technique designed. In this sampling approach,

the histogram of the targeted output is plotted. Then the

output distribution is divided based on the values of the

targeted output. In this data sampling process, the output

value represented by a high number of data will take a lower

percentage of data sampling. On the other hand, the values

with less number of data will take a higher proportion of the

data sampled. This sampling process will give the proxy model

the required data heterogeneity for a better training. Figure 6

explains the smart sampling procedure.

Fig. 6. Smart Sampling Technique for Pressure Data

D. Neural Networks Development and Training

A neural network is designed for each of the three reservoir

properties investigated in this paper (pressure, oil saturation,

and water saturation). The algorithm used to construct the

neural networks is Back Propagation. In this algorithm, the

error for each output is back-propagate to the input in order to

adjust the weights in each layer of the neural network [12]. The

network typically consists of three layers; input layer, hidden

layer, and the output layer. Each neural networks developed

in this work has 64 inputs (table 1), 90 hidden layers, and

1 output. The selection of the inputs is followed by data

partitioning. The objective of data partitioning is to divide the

input data during the training process into training data-set,

validation data-set, and testing data-set. Random partitioning

is used in this study (80% of the input data is assigned to

training set, 10% is assigned to validation set, and 10% is

assigned to testing set).

Once the network is constructed, the training process starts

and the network performance can be monitored using several

visualization plots in the software used [13].

V. RESULTS

A. Training Results

The static and dynamic input data (table 1) have been

gathered for 10 years (from 1953 to 1963). The gathered inputs

are used to train the neural networks for the targeted reservoir

properties (pressure, oil saturation, and water saturation). For

each property, neural network training is performed. For the

pressure property, the coefficient of determination (R-squared)

for the three data-sets, training, validation, and testing, is 0.99

(figure 7). The same R-squared values are achieved for the

training models for oil saturation and water saturation from

neural networks (figures 8 and 9).

Fig. 7. Training cross plots for the Pressure

B. Validation and Prediction Results

In addition to the objective of replicating the numerical

simulation results, the smart proxy model is verified by testing

its ability to predict the dynamic reservoir properties for the

forthcoming time steps under the same operational constraint.

The smart proxy model is verified by comparing the smart

proxy model prediction results to the numerical reservoir

model. The error is then measured between the two models.

In this study, the smart proxy model (trained neural networks

model) is applied to predict pressure, oil saturation, and water

saturation for the years from 1964 to 1968 (which are not

included in the training process). The results show that the

average error of the three reservoir properties is less than 4%.

Fig. 8. Training cross plots for the Oil Saturation

Fig. 9. Training cross plots for the Water Saturation

.

Figure 10 shows the distribution maps of pressure values of

layer 2 in 1964. The top part of the figure shows the pressure

map of the numerical simulator. The middle section of the

figure shows the pressure map of the smart proxy model. Next

to these maps is the pressure scale. At the bottom of the

figure is the error map, which measured between the upper

and middle pressure maps with the error scale next to it.

The same explanation can be written for the figures from 11

to 14 with different reservoir properties, different layers, and

different years.

VI. CONCLUSION

This paper presented an alternative tool for reservoir simu-

lation. The smart proxy model uses data mining and artificial

intelligence techniques, and the development procedure of this

model is discussed. It is shown that the smart proxy model is

able to replicate the complex numerical simulation results at

grid block level in a very short time (seconds).

REFERENCES

[1] Shahab Dean Mohaghegh. Reservoir simulation and modeling based onartificial intelligence and data mining (ai&dm). Journal of Natural GasScience and Engineering, 3(6):697–705, 2011.

[2] Denis Igorevich Zubarev et al. Pros and cons of applying proxy-modelsas a substitute for full reservoir simulations. In SPE Annual TechnicalConference and Exhibition. Society of Petroleum Engineers, 2009.

[3] S Amini, SD Mohaghegh, R Gaskari, GS Bromhal, et al. Patternrecognition and data-driven analytics for fast and accurate replicationof complex numerical reservoir models at the grid block level. InSPE Intelligent Energy Conference & Exhibition. Society of PetroleumEngineers, 2014.

[4] Andre I Khuri and Siuli Mukhopadhyay. Response surface methodology.Wiley Interdisciplinary Reviews: Computational Statistics, 2(2):128–149,2010.

[5] Terry Ngo. Data mining: practical machine learning tools and technique,by ian h. witten, eibe frank, mark a. hell. ACM Sigsoft SoftwareEngineering Notes, 36(5):51–52, 2011.

[6] Shahab Mohaghegh et al. Virtual-intelligence applications in petroleumengineering: Part 1artificial neural networks. Journal of PetroleumTechnology, 52(09):64–73, 2000.

[7] Shahab Mohaghegh et al. Virtual-intelligence applications in petroleumengineering: Part 2evolutionary computing. Journal of Petroleum Tech-nology, 52(10):40–46, 2000.

[8] Shahab Mohaghegh et al. Virtual-intelligence applications in petroleumengineering: Part 3fuzzy logic. Journal of petroleum technology,52(11):82–87, 2000.

[9] Shahab D Mohaghegh, Hafez H Hafez, Razi Gaskari, Masoud Haa-jizadeh, Maher Kenawy, et al. Uncertainty analysis of a giant oilfield in the middle east using surrogate reservoir model. In AbuDhabi International Petroleum Exhibition and Conference. Society ofPetroleum Engineers, 2006.

[10] Alireza Shahkarami, Shahab Mohaghegh, Vida Gholami, AlirezaHaghighat, and Daniel Moreno. Modeling pressure and saturationdistribution in a co2 storage project using a surrogate reservoir model(srm). Greenhouse Gases: Science and Technology, 4(3):289–315, 2014.

[11] Shohreh Amini. Developing a Grid-Based Surrogate Reservoir ModelUsing Artificial Intelligence. WEST VIRGINIA UNIVERSITY, 2015.

[12] RC Chakraborty. Back propagation network, 2010.[13] The Mathworks, Inc., Natick, Massachusetts. MATLAB version

8.5.0.197613 (R2015a), 2015.

Fig. 10. Distribution map of Layer 2 pressure in 1964 (Numerical SimulatorOutput, Smart proxy Output, and Error)



Fig. 13. Distribution map of Layer 3 Oil Saturation in 1964 (NumericalSimulator Output, Smart proxy Output, and Error)



a data-driven smart proxy model for a comprehensive

Documents