Scripting Modflow Model Development Using Python and Flopy: Methods Note
Scripting Modflow Model Development Using Python and Flopy: Methods Note
Scripting Modflow Model Development Using Python and Flopy: Methods Note
Abstract
Graphical user interfaces (GUIs) are commonly used to construct and postprocess numerical groundwater flow
and transport models. Scripting model development with the programming language Python is presented here as
an alternative approach. One advantage of Python is that there are many packages available to facilitate the model
development process, including packages for plotting, array manipulation, optimization, and data analysis. For
MODFLOW-based models, the FloPy package was developed by the authors to construct model input files, run
the model, and read and plot simulation results. Use of Python with the available scientific packages and FloPy
facilitates data exploration, alternative model evaluations, and model analyses that can be difficult to perform with
GUIs. Furthermore, Python scripts are a complete, transparent, and repeatable record of the modeling process.
The approach is introduced with a simple FloPy example to create and postprocess a MODFLOW model. A more
complicated capture-fraction analysis with a real-world model is presented to demonstrate the types of analyses
that can be performed using Python and FloPy.
NGWA.org Groundwater 1
2006), making publication-quality graphics (Matplotlib; documentation (e.g., BAS, LPF, WEL). These should not
Hunter 2007), optimization and statistics (Scipy; Jones be confused with Python packages. A FloPy script for
et al. 2001), working with geospatial information (Fiona; a MODFLOW model commonly consists of at least the
Gillies 2014, Shapely; Gillies 2013), and performing following eight steps:
data analysis (Pandas; McKinney 2012). These packages,
together with the interactive IPython environment (Pérez 1. Import the FloPy package.
and Granger 2007), form the core of what is called the 2. Create a MODFLOW model object.
Scipy Stack and are at the heart of exploratory computing 3. Define the model setup, including the discretization,
with Python. Python itself, the Scipy Stack, and a long active model cells, starting heads, hydraulic properties,
list of other packages are open-source software, and can and layer types.
be downloaded and used for free. 4. Add packages to simulate features of the flow system
The approach presented in this paper is implemented (e.g., wells, recharge, or rivers).
in FloPy, a Python package written by the authors 5. Define the solver that MODFLOW uses to obtain a
for developing, running, and postprocessing models head solution.
that are part of the MODFLOW family of codes (i.e., 6. Define what output MODFLOW needs to save.
MODFLOW; Harbaugh 2005, MODPATH; Pollock 7. Generate MODFLOW input files and call MODFLOW
2012, MT3DMS; Zheng and Wang 1999, and SEAWAT; to obtain a solution.
Langevin et al. 2008). FloPy provides functionality for 8. Read the (binary) output files for display and further
creating new models as well as working with existing analysis.
models. The concept is that the user writes a script to
As a first simple example, consider steady, one-
construct, run, and postprocess the groundwater model.
dimensional, unconfined flow between two long canals
The script sets Python variables to define the grid,
with fixed water levels equal to 20 m; the centers of the
hydrogeological parameters, initial conditions, boundary
canals are 2000 m apart. The bottom of the aquifer is at
conditions, solver parameters, and other information
elevation 0 m, and the top of the aquifer at elevation 50 m.
required by the model. Python packages for reading
The hydraulic conductivity is 10 m/d and the groundwater
and processing geospatial information (e.g., GIS-based
recharge is 1 mm/d. Two long ditches run parallel to the
shapefiles and rasters) can also be employed to facilitate
canals, one at 500 m from the left canal and one at 500 m
incorporation of property and boundary data from a vari-
from the right canal. Both ditches have an extraction rate
ety of sources. Once the model information is defined,
of 1 m3 /m/d.
FloPy can write input files for the specified models
(MODFLOW, MT3DMS, etc.). The script can then be The eight steps, as listed above, for developing,
used to run the model, read the model results, including running, and postprocessing this groundwater model are
binary model output, and make any type of customized as follows1 :
plot. Once model results are read, many other analyses 1. Import the MODFLOW and utilities subpackages of
using Python functionality and packages are available. FloPy and give them the aliases fpm and fpu,
The approach of using scripts to develop groundwater respectively
models is clearly different from using a GUI. Some might import numpy as np
argue that it is antiquated because it requires the user to import flopy.modflow as fpm
write syntactically correct computer code. There are no import flopy.utils as fpu
well-defined menu bars for performing common tasks, 2. Create a MODFLOW model object. Here, the MOD-
nor is there a main graphic panel showing the model FLOW model object is stored in a Python variable
grid, boundary conditions, and model results. In short, the called model, but this can be an arbitrary name.
approach is reminiscent of decades-old Unix and DOS This object name is important as it will be used as
a reference to the model in the remainder of the
batch scripting that so many practicing hydrogeologists FloPy script. In addition, a modelname is specified
have tried to forget. This paper shows that Python’s when the MODFLOW model object is created. This
easy-to-read syntax and powerful features justify the modelname is used for all the files that are created
rejuvenation of the script-based approach. First, FloPy by FloPy for this model.
is introduced with a basic example of one-dimensional model = fpm.Modflow(modelname = ’gwexample’)
flow, followed by a discussion of the power, flexibility,
and benefits behind the scripting approach, which is 3. The discretization of the model is specified with the
demonstrated by a more complicated example. discretization file (DIS) of MODFLOW. The aquifer
is divided into 201 cells of length 10 m and width 1 m.
The first input of the discretization package is the name
A Basic FloPy Example of the model object. All other input arguments are self
The use of FloPy requires a basic familiarity with explanatory.
MODFLOW and its packages, see, for example, Harbaugh
(2005), as well as a basic understanding of object-oriented
code design. MODFLOW itself consists of packages that 1
Computer-specific setup, such as specification of the
are labeled with three letter acronyms in the MODFLOW directory that contains the MODFLOW executable, is omitted here.
Active cells and the like are defined with the Basic
package (BAS), which is required for every MOD-
FLOW model. It contains the ibound array, which
is used to specify which cells are active (value is
positive), inactive (value is 0), or fixed head (value
is negative). The numpy package (aliased as np) can
be used to quickly initialize the ibound array with
values of 1, and then set the ibound value for the
first and last columns to −1. The numpy package
(and Python, in general) uses zero-based indexing and
supports negative indexing so that row 1 and column
1, and row 1 and column 201, can be referenced as [0, Figure 1. Steady, one-dimensional, unconfined flow between
0], and [0, −1], respectively. Although this simulation two canals (at x = 0 and x = 2000) with areal recharge
is for steady flow, starting heads still need to be and two ditches (at x = 500 and x = 1500) where water is
specified. They are used as the head for fixed-head extracted.
cells (where ibound is negative), and as a starting
point to compute the saturated thickness for cases of
unconfined flow. and heads are saved (the default), so no arguments are
needed.
ibound = np.ones((1, 201))
ibound[0, 0] = ibound[0, -1] = -1 fpm.ModflowOc(model)
fpm.ModflowBas(model, ibound=ibound, strt=20)
7. Finally the MODFLOW input files are written (eight
The hydraulic properties of the aquifer are specified files for this model) and the model is run. This
with the layer properties flow (LPF) package (alterna- requires, of course, that MODFLOW is installed on
tively, the block centered flow (BCF) package may be your computer and FloPy can find the executable in
used). Only the hydraulic conductivity of the aquifer your path.
and the layer type (laytyp) need to be specified. The
model.write_input()
latter is set to 1, which means that MODFLOW will
calculate the saturated thickness differently depending model.run_model()
on whether or not the head is above the top of the 8. After MODFLOW has responded with the positive
aquifer. Normal termination of simulation, the cal-
culated heads can be read from the binary output file.
fpm.ModflowLpf(model, hk=10, laytyp=1)
First, a file object is created. As the modelname used
for all MODFLOW files was specified as gwexample
4. Aquifer recharge is simulated with the Recharge in step 1, the file with the heads is called gwex-
package (RCH) and the extraction of water at the two ample.hds. FloPy includes functions to read data
ditches is simulated with the Well package (WEL); the from the file object, including heads for specified lay-
length of each ditch normal to the plane of flow is equal ers or time steps, or head time series at individual cells.
to 1 m (delc = 1). The latter requires specification For this simple mode, all computed heads are read.
of the layer, row, column, and injection rate of the
well for each stress period. The layers, rows, columns, hfile = fpu.HeadFile(’gwexample.hds’)
and the stress period are numbered (consistent with h = hfile.get_data(totim=1.0)
Python’s zero-based numbering convention) starting at
0. The required data are stored in a Python dictionary The heads are now stored in the Python variable h.
(lrcQ in the code below), which is used in FloPy to FloPy includes powerful plotting functions to plot the
store data that can vary by stress period. The lrcQ grid, boundary conditions, head, etc. This functionality
dictionary specifies that two wells (one in cell 1, 1, is demonstrated later. For this simple one-dimensional
51 and one in cell 1, 1, 151), each with a rate of
−1 m3 /d, will be active for the first stress period. example, a plot is created with the matplotlib package,
Because this is a steady-state model, there is only resulting in the plot shown in Figure 1.
one stress period and therefore only one entry in the
dictionary. MODFLOW input files contain much more informa-
fpm.ModflowRch(model, rech=0.001) tion than is shown in the script discussed above. The
lrcQ = { 0: [[0, 0, 50, -1], [0, 0, 150, -1]]} reason more information is not required is that FloPy
fpm.ModflowWel(model, stress_period_data=lrcQ) has default settings for nearly every MODFLOW input
parameter. The default value is used when a value is not
5. The preconditioned conjugate-gradient (PCG) solver, specified. For example, upon creation of the discretization
using the default settings, is specified to solve the
model. file, one of the input variables is whether flow is simulated
as steady state or transient. In this case, flow is steady,
fpm.ModflowPcg(model) which is the default, so it does not need to be specified. For
6. The frequency and type of output that MODFLOW transient flow, the additional argument steady = False
writes to an output file is specified with the output needs to be specified (and the storage coefficient needs to
control (OC) package. In this case, the budget is printed be specified as one of the hydraulic properties). FloPy also
has a sophisticated approach for handling arrays that result that require running the model repeatedly using slightly
in MODFLOW input files that are as small as possible. different input such as sensitivity/uncertainty anal-
This flexibility allows entire arrays to be specified with a ysis, parameter estimation, drawdown analysis, and
single value, as was used above when the hydraulic con- capture analysis. Python packages exist for a variety of
ductivity of all cells was specified with a single number parameter estimation and uncertainty analysis methods
(hk = 10). including Levenberg-Marquardt, Markov Chain Monte
Carlo, Simulated Annealing, and Differential Evolution.
For a drawdown analysis or capture-fraction analysis,
Advantages of Scripting Groundwater Model groundwater pumping needs to be added to each cell in a
Development model, one at a time, after which the drawdown or water
As was shown in the previous example, use of FloPy budget is evaluated and stored.
to create MODFLOW models requires a certain familiarity Besides the unparalleled flexibility of using Python
with Python, as well as MODFLOW and its packages. For scripts for groundwater model development, the most
the example presented above, input was relatively simple important advantage is that a script forms a record of
and could be entered by hand, but for more complicated the entire model construction process (e.g., Bakker 2014),
models, groundwater-related data are likely stored in a which makes it transparent and reproducible. Models can
variety of file formats. Python packages exist to read easily be shared, web-based repositories can be used for
and modify data from virtually any source, whether they version control, and multiple variants of the model can be
need to be scraped from websites, read from shapefiles, created that share the same base data.
or retrieved directly from a spreadsheet or database. All
that is required is familiarity with the Python package for
the specific task, and some lines of Python code to import An Example of the Use of Python Scripts
and convert the data into a format that FloPy can handle. and FloPy for Analysis
Once loaded into Python, model results can be written to a Leake et al. (2010) presented the capture-fraction
large number of formats, including image files, shapefiles, method to compute which fraction of pumpage comes
georeferenced images, or NetCDF files. from which source. This method requires the addition of
Python scripts can be written to automatically groundwater pumping and the calculation of water budget
rediscretize a model both in time and in space, in order to changes relative to a base case simulation for each cell
analyze how a prediction of interest is sensitive to model in a model layer and is straightforward to automate using
resolution. Determination of optimal model resolution FloPy, as shown in the following example.
is not common practice in groundwater modeling, Leake et al. (2010) presented a capture-fraction
because it is inefficient to do with existing GUI-based analysis for the Upper San Pedro Basin in southeastern
software. Python scripts are ideally suited for analyses Arizona, USA, and northern Sonora, Mexico, based on
Conclusion
Use of Python and FloPy allows programmatic cre-
ation and postprocessing of MODFLOW-based models. A
programmatic modeling approach facilitates analyses that
can be difficult or impossible to complete using currently
available GUIs, especially when many model runs with
slightly different input are required, such as for param-
eter estimation, uncertainty analysis, drawdown analysis,
and capture-fraction analysis. (Any of these analyses can
Figure 5. FloPy generated map showing the computed cap- be built into a GUI by the GUI developers, of course.)
ture fraction of water from head-dependent boundaries as a Use of a high-level programming language such as Python
function of well location in the Upper San Pedro Basin model allows relatively complicated tasks to be achieved with a
layer corresponding to the lower basin fill after 10 years of few lines of readable code.
pumpage. The maximum areal extent of the active model The first FloPy example presented in this paper shows
domain and the location of stream boundary conditions in
all model layers are also shown. how to create and postprocesses a simple MODFLOW
model. The capture-fraction analysis using the Upper San
Pedro Basin model presented in this paper demonstrates
a dictionary specifying the location and discharge of the the utility of advanced model development using Python
well for stress period 1 (similar to the previous example), scripts. More than 1500 model runs were automatically
adds the new well package to the model, and writes the executed and postprocessed using the Python script
new well package file (lines 2–5). Next, the modified developed for the analysis. The final script serves as
model is run and file objects are created for the head a record of the steps performed and can be distributed
file and the cell budget file (lines 6–8). The model name along with the original data and reproduced by other
is DG, and the precision is specified as double, because hydrogeologists.
use of a double precision version of MODFLOW-2005 FloPy is a community based open source project
improved model convergence. A list of time steps and hosted on https://github.com/modflowpy/flopy. The scripts
stress periods is read from the head file object, the heads discussed in this paper and the supporting files are
are read for all saved time steps at the specified cell, and available on the FloPy website. Installation instructions,
an empty array is created for the capture fraction at all documentation, as well as examples demonstrating how
saved time steps (lines 9–11). Finally, for each saved to use FloPy to create and postprocess MODFLOW-based
time step, the net flux to all head dependent boundaries models can also be found on the FloPy website.
are summed and the capture fraction is computed and
returned by the function (lines 12–24). The function
returns not-a-number (np.nan) for time steps where the Acknowledgments
layer, row, and column location passed to the function is The authors welcome additions, suggestions, and
dry (lines 13–14). assistance from the groundwater community, and thank all
With the capture-fraction calculation available as a past contributors for their work. Any use of trade, product,
function call, it is straightforward to embed the analysis or firm names is for descriptive purposes only and does
in a loop that iterates over every cell (or a subset of cells) not imply endorsement by the U.S. Government.
in a model layer to develop an array of transient capture-
fraction values for each time step in stress period 2.
The computed capture fraction for the lower basin fill References
(model layer 4) after 10 years of pumping is shown in Bakker, M. 2014. Python scripting: The return to programming.
Figure 5. Ground Water 52, no. 6: 821–822.
Because of the large number of grid cells in the Bakker, M., and V.A. Kelson. 2009. Writing analytic element
programs in python. Ground Water 47, no. 6: 828–834.
model, pumping locations were only considered in active Fienen, M.N., and R.J. Hunt. 2015. High-throughput computing
cells and in every fourth row and every fourth column versus high-performance computing for groundwater appli-
in model layer 4. This subset of model cells required a cations. Ground Water 53, no. 2: 180–184.