Tutorial Spatial
There are many other facilities within the package, which are given as alternative options on
the menus. To start the tutorial, choose PG2000 from your Start menu. See Tutorial One for
notes on starting a PG2000 run and specifying the ghost file.
As you can see from the above, I have elected to read in a set of sample data by clicking on
the relevant option and selecting from the menu which appears.
PG2000 will remember the last five data files accessed and include these in your options.
Three input file types can be read in. I will read in a standard Geostokos data file.
I will select WOLFCAMP.DAT for my input data file. This is a set of 85 samples of
hydrogeological data taken from the Wolfcamp aquifer in Northwestern Texas. The coordinates are in miles and the other measured variable is the Potentiometric pressure (or head)
within boreholes intersecting the aquifer. The units of the variable are in feet above sea level.
The routine which reads in the data shows the first 10 lines of your data file so that you can
check it is going in OK. The routine also checks whether we actually had the correct number
of samples on the file and informs you if there is any discrepancy.
Even if you select a file from the list of previously analysed data files, PG2000 will ask you
to confirm your choice. This is actually a quick way of getting back to your working
directory, since you can change your choice at this point. Be warned, though, that if you
change which file you want to read, it must be the same type of file; that is, if you are
reading a standard Geostokos data file, you cannot change your mind at this point and read in
a CSV type file.
For this example, we will stick with WOLFCAMP. As your data is read in, it is stored on a
working binary file. A progress bar will indicate how far the process has gone. When data
input is complete, your Window should look like the table above.
The routine which has been used shows the first 10 lines of your data file so that you can
check it is going in OK.
Displaying the data
When the data has been read in, you will see that the greyed out options on the main menu
bar will be activated. We use the menu bar to select an option, say:
This time we have chosen to display and summarise the data set in a spatial sense. A post plot
is a map showing the locations of the samples. Each sample will be coloured and shaded
according to the value of a selected variable. Since we are analysing the Wolfcamp data,
choice of variables should prove fairly simple!
The screen will prompt you to choose the three variables for the analysis. You will see two
dialog boxes: the one in the top left hand corner lists the variables available for analysis in
your data file; the bottom right box shows the variables already chosen (at this point, none!).
The routine needs to have information on the position of the samples and on the value at
each sample location. This particular data file only contains three variables. However,
PG2000 does not know (as yet) which of these variables is which.
There is a lot of information on the screen. At the bottom of the Window, you see the status
bar which shows the name of the current data file and the title read from that file. The
already chosen dialog box shows you that you are expected to select variables to be the X
(east/west) co-ordinate, Y (north/south) co-ordinate and Measurement to be analysed
for your semi-variogram. The upper left dialog box lists the variable names as they appeared
in the data file and is prompting you to choose the variable which will be the X co-ordinate
on the graph. For this example, let us choose Easting for the X co-ordinate:
Finally, we must choose the variable to be analysed and state any relevant transformations to
be made. For this data we require no transformation of the variable Potentiometric Level,
so click on
The dialog now shows the complete set of chosen variables and has
moved to the upper left corner. You have the option to change your mind here by clicking on
the relevant button.
Clicking in the polygonal radio button will cause the software to ask you for a file
containing the boundary information. You may have up to 500 vertices stored on a file. They
may be stored either clockwise or anti-clockwise and the polygon does not have to be
closed. The default name for the boundary line file is that of the original data file plus the
extension BLN -- Boundary LiNe.
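To see why neither the direction of the vertices nor an explicit closing point matters, here is a minimal sketch of a standard even-odd (ray casting) test for whether a location lies inside a polygon. This is an illustration only and not the code used inside PG2000; the function name point_in_polygon and the square boundary are made up for the example.

```python
def point_in_polygon(x, y, vertices):
    """Even-odd ray casting test (illustrative, not PG2000's own routine).

    vertices: list of (x, y) tuples, clockwise or anti-clockwise.
    The last vertex is joined back to the first automatically, so the
    polygon does not have to be closed explicitly.
    """
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wraps round to close the polygon
        # Does a horizontal ray from (x, y) towards +X cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical square boundary, given anti-clockwise and left unclosed
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
print(point_in_polygon(5.0, 5.0, square))                  # a point inside
print(point_in_polygon(15.0, 5.0, square))                 # a point outside
print(point_in_polygon(5.0, 5.0, list(reversed(square))))  # clockwise order
```

Each crossing of an edge flips the inside/outside state, so the answer depends only on the number of crossings and not on the order in which the edges are visited.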
You may respecify the minimum and maximum X and Y values at this point. The defaults
given in the dialog are the full extent of the polygonal boundary. However, there are times
when you might want to have the estimated grid points on some regular grid starting at a
standardised value. For example, changing Minimum Y value to 0 would mean that the
bottom left hand corner of the grid used would be at X = 135, Y = 0. Click on
accept this boundary definition.
to
If you only wanted to see a subsection of the data (equivalent to zooming in) you could
specify a boundary for a smaller rectangle or polygon. For example:
will read in a boundary from the file county.bln which is a small area within the
Wolfcamp study area. The Wolfcamp data set covers the Texas panhandle and a little of New
Mexico. The county boundary is a fictitious area of interest in the Deaf Smith area west of
Hereford, Texas. The mapping parameters dialog will then look like:
Clicking on the button will allow PG2000 to plot the sample data (see next page).
The routine needs two co-ordinates for the sample locations. If you have not chosen any
variables before this, you will have to select X co-ordinate and Y co-ordinate as
described previously. Since we already have these variables selected, we will be offered the
choice to keep them:
Click on the relevant button to continue with these variables. You will be prompted for the
threshold distance defining when samples are too close together.
For this illustration we have chosen the value of 0.1 mile as our criterion for samples being
too close together. If you have also elected to store the results on a file, the default extension
for nearest neighbour files will be .NND. Progress bars will indicate how the calculations are
going. Don't get too impatient. This is one of the most time-consuming exercises available in
PG2000.
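The sketch below shows one straightforward way to find nearest neighbours and flag samples which fall within the threshold distance. It is a brute-force search, written purely for illustration; I am not claiming this is how PG2000 does it, and the function names and the four sample locations are invented.

```python
import math


def nearest_neighbours(points):
    """For each point, return (index of nearest other point, distance)."""
    result = []
    for i, (xi, yi) in enumerate(points):
        best_j, best_d = None, math.inf
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue  # a sample is not its own neighbour
            d = math.hypot(xi - xj, yi - yj)
            if d < best_d:
                best_j, best_d = j, d
        result.append((best_j, best_d))
    return result


def too_close(points, threshold):
    """Indices of samples whose nearest neighbour is closer than threshold."""
    return [i for i, (_, d) in enumerate(nearest_neighbours(points))
            if d < threshold]


# Made-up locations (in miles); the first two are only 0.05 miles apart
points = [(0.0, 0.0), (0.05, 0.0), (10.0, 0.0), (10.0, 12.0)]
nn = nearest_neighbours(points)
print(too_close(points, 0.1))
print(sum(d for _, d in nn) / len(nn))  # average nearest neighbour distance
```

A search of this kind compares every sample with every other sample, so 85 samples need 85 x 84 = 7140 distance calculations, and the work grows with the square of the number of samples.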
When all the nearest neighbours have been identified (and stored on file if requested), various options will be offered.
If you feel you have set the threshold distance for duplicate locations too low (or too
high!), you have an option to
The Window also gives various summary statistics. For example, the average distance to the
nearest samples is over 9 miles. However, there are several pairs of samples closer than 1 mile
and some where the nearest well is over 20 miles away.
As an illustration, I repeated the analysis with a threshold distance of 1 mile(!). Three
samples were found to have nearest neighbours closer than one mile. These are coloured red
if you have selected minimal or full graphics.
Because there are now sample pairs within the defined 'too close together' distance, you are
offered the option:
information is given:
The grid in the dialog shows which samples (by number) were giving problems and the co-ordinates of the first of these samples. The actual distance between the two is also given, so
that you can check whether they are really duplicates or just close together. In this case it is
clear that the samples are not (in any sense) duplicates.
Should you really have cause for concern, you may create a new data file with these samples
eliminated from the data set. It is, obviously, preferable for you to review your data and find
out just why you have duplicated samples. If this is a normal part of your type of data, you
will have to do something about this before you try any geostatistical estimation such as
kriging. One option available within PG2000 is to decluster the data by averaging into
small rectangular cells. This option is available on the same menu as the other data
manipulation routines.
If you request an output file with the duplicates eliminated, the software will prompt for a
name for the new file whose default extension will be .DAT. Because of this, no default
name is provided. This is in order not to overwrite your original data file by mistake. The
problem samples will still be on the new data file, but all the measurements will have been
replaced by missing values. This is a rather unsubtle way of making sure that you do not
disrupt your kriging system with duplicate samples.
Interpolating a grid of points will produce a sketch map of the sample values. This map
reflects the actual values measured at the actual sample locations and uses a weighted average
estimator for grid points which have not been sampled. Weights are chosen as follows:
PG2000 will remember everything which has been defined during this run. So far, we have
defined: which variables we have been analysing and a boundary for the area being
estimated:
The routine also needs to know whether you want the results stored on a grid file:
The default name for a grid file is the original data file name with the extension .GID.
PG2000 will suggest contour levels based on the variability of the sample values.
You can change these if you so desire. Alternatively you can run with the default contours
and draw prettier maps by reading the grid files back in. Please note that grid files are not
in the same format as data files. If you want to read them back in, you must use the option:
This option may be greyed out in teaching or demo versions of the software.
The software offers several alternative inverse distance weighting functions:
Click on the relevant button to make your choice. I chose simple inverse distance for this
illustration.
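For readers who like to see the arithmetic, the following is a minimal sketch of an inverse distance weighted average. It assumes weights of the form 1 / distance raised to a power, with power 1 standing in for simple inverse distance; the exact weighting functions used by PG2000 are not reproduced here, and the name idw_estimate and the two samples are invented.

```python
import math


def idw_estimate(x, y, samples, power=1.0):
    """Inverse distance weighted average at (x, y).

    samples: list of (xs, ys, value). A sample lying exactly on the
    target point is returned unchanged, so sampled locations are honoured.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xs, ys, value in samples:
        d = math.hypot(x - xs, y - ys)
        if d == 0.0:
            return value
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


# Two made-up samples, 1 and 3 distance units from the point being estimated
samples = [(0.0, 0.0, 10.0), (4.0, 0.0, 20.0)]
print(idw_estimate(1.0, 0.0, samples, power=1.0))
print(idw_estimate(1.0, 0.0, samples, power=2.0))
```

Raising the power gives more weight to the closest samples, so the second estimate sits nearer to the value of the closer sample than the first.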
If you choose an option which needs a power, a further prompt will appear
so that you can specify the power you require:
Once you have selected your weighting function, you need to define search parameters and
the area which is to be studied. The neighbouring samples will be used to produce an estimate
at each unsampled grid point. Before we can go any further, we need to define the
neighbourhood. That is, how far do we want the software to search for samples to be
included in the estimation process.
PG2000 cannot guess what an appropriate search radius would be. As a simple default, we
choose an area which will contain (on average) 20 samples. This is found by simply dividing
the rectangular area around the samples by the number of samples and then multiplying this
area by 20. Finding the radius of the circle with this area produces a likely search radius.
When the value at a specified grid point is being estimated, all samples within this circle of
the point will be used in the estimation process. If there are too many samples within this circle,
those closest to the unsampled location will be selected.
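The default search radius described above can be written out in a few lines. This is a sketch of the stated recipe only: area of the rectangle around the samples, divided by the number of samples, multiplied by 20, then the radius of the circle with that area. The function names are hypothetical, and so is max_samples, since no figure for 'too many' is given here.

```python
import math


def default_search_radius(xs, ys, target_count=20):
    """Radius of a circle expected to hold about target_count samples."""
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    area_per_sample = width * height / len(xs)
    return math.sqrt(target_count * area_per_sample / math.pi)


def select_neighbours(x, y, samples, radius, max_samples):
    """Samples within radius of (x, y), nearest first, at most max_samples.

    samples: list of (xs, ys, value); max_samples is an assumed cap.
    Returns a list of (distance, xs, ys, value).
    """
    near = []
    for xs, ys, value in samples:
        d = math.hypot(x - xs, y - ys)
        if d <= radius:
            near.append((d, xs, ys, value))
    near.sort(key=lambda item: item[0])
    return near[:max_samples]


# Made-up example: four samples at the corners of a 100 x 50 rectangle
xs = [0.0, 100.0, 100.0, 0.0]
ys = [0.0, 0.0, 50.0, 50.0]
radius = default_search_radius(xs, ys)
print(radius)

samples = [(x, y, 1.0) for x, y in zip(xs, ys)]
print(select_neighbours(10.0, 10.0, samples, radius, max_samples=2))
```

In the made-up example the rectangle has area 5000, so each of the four samples accounts for 1250 and the target circle has area 25000.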
In this run, we already defined a boundary of interest to us. If you wish to change this
boundary and, say, look at the whole Wolfcamp area, simply click on
Once you have chosen the area to be studied, you must define the grid spacing to be used.
Points will be calculated at each grid node and represented on the screen as a shaded
rectangle of the appropriate size.
Since we have not previously specified a grid spacing or number of grid points, the software
defaults to 25 points in the X direction and the same grid spacing in the Y direction. The grid
does not have to be square, but the map may look a little strange if it isn't! We can alter the
grid spacing by changing the number in the relevant box:
If you make a change and want to check how many grid points you have before proceeding,
click on the relevant button and the rest of the parameters will be updated. You may also change
minimum and maximum X and Y values at this stage. Once you click on the relevant button, the map
parameters will be defined.
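The relationship between the number of grid points, the grid spacing and the map limits can be sketched as follows. I have assumed that grid nodes sit on both the minimum and maximum X values, so that 25 points give 24 intervals; PG2000 may count them differently. The function names and the map limits are made up for the example.

```python
import math


def grid_definition(xmin, xmax, ymin, ymax, nx=25):
    """Spacing from nx points across X, reused unchanged in Y (square cells)."""
    spacing = (xmax - xmin) / (nx - 1)
    # Small tolerance guards against rounding just below a whole number
    ny = int(math.floor((ymax - ymin) / spacing + 1e-9)) + 1
    return spacing, nx, ny


def grid_nodes(xmin, ymin, spacing, nx, ny):
    """Node co-ordinates, row by row from the bottom left hand corner."""
    return [(xmin + i * spacing, ymin + j * spacing)
            for j in range(ny) for i in range(nx)]


# Made-up map limits: 120 units wide and 60 units high
spacing, nx, ny = grid_definition(0.0, 120.0, 0.0, 60.0)
nodes = grid_nodes(0.0, 0.0, spacing, nx, ny)
print(spacing, nx, ny, len(nodes))
```

Changing the spacing or the limits changes the number of grid points, which is why it is worth checking the count before starting a long estimation run.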
Interpolating a grid of points produces a sketch map on the screen. The shading information
for the contour levels will appear in the left hand box and the map itself in the right. A shaded
square will be displayed on the map to show you which point is being estimated in addition to
the information in the prompt box. You may copy the screen to your printer at any stage
during the estimation process.
The map will not show the sampled locations within the map area, since these are honoured
by the interpolation process. If you want to see where the data lies, click on the appropriate
selection from the options bar at the top of the page:
You can see that in this example only 4 samples lie inside the selected area. The samples
outside the area are used in the estimation of all points within the search radius, so that the
boundary may be considered soft.
I opted to plot all the sample locations for this display.
If you do not want the sample locations shown on the map, you can remove them by clicking
the appropriate button on the options bar:
This option bar also allows you to return to the main menu options when you are ready.
will end your run with the software. You will see the
The above Tutorial session should serve only to illustrate a possible use of the various
routines from PG2000. Try running the program again, choosing your own responses. Try
looking at reef width instead of grade. This variable has a standard two parameter lognormal
distribution. Try reading in one of the other data files which are provided, say,
samples.dat.
General Notes
There are a few points which you may have noted in following the Tutorial session above.
Most of the routines communicate between themselves, without you having to worry about
getting the right information from one to the other. For example, after you read in the
complete contents of the data file, the routines ask which of the variables you actually want to
analyse. This information is then stored internally and may be accessed by any of the other
routines. This is a feature of most of PG2000, in that it will recall what you chose previously
and ask whether this is to change or not. You should bear this in mind if you are analysing
more than one data file in a single run. In particular, the boundary used in mapping will be
remembered. If you change data file, or even which variables you analyse, this will not
automatically update.
PG2000 does not distinguish between upper and lower case letters, so you may type in
whatever you find most pleasing. When the program requires a numerical answer, your input
will be checked to make sure that it is actually a number. If you type in any illegal characters
and press ENTER, the checking routine will filter out the unacceptable characters which you
type. It should be noted that, if the routine is expecting a whole number then a decimal point
is unacceptable. Much of the numerical input is checked for valid values.
A copy of this run should have been made on a file called GHOST.LIS unless you changed
the name at the beginning of the run. Send this file to your printer if you want a record of the
analysis or look at it with Wordpad or Notepad.
PG2000, like any computer software, is not completely error-free. Neither is it foolproof. You can always get out of the software by right clicking on the Taskbar. This will
invoke the 'End Task' facility to close the Window without damaging the rest of your system.
If you cannot figure out what went wrong, note down as much information as you can about
the program you were running, the data you were using and exactly where it broke down.
Contact your supplier locally or Geostokos direct for assistance. Send
us the ghost.lis file and (if you can) the data you were analysing at the time.