Module 3 Basic Data Processing
Overview
In this module we will cover the basics of data processing, nothing fancy. With this
information you can go from a raw data file to something you can print or plot.
• Run Data Conversion and convert your raw data to scientific units.
• Run Bin Average to reduce your data set, producing data at even depth values.
Data processing involves operating on your raw data (which came from the sensors
attached to the instrument) to convert it from the sensor outputs to scientific units,
calculate any additional oceanographic parameters of interest, and reduce the data set to a
tractable size. It is always best practice to archive your raw data, because there is no
going backwards once you have processed it. Should you discover calibration errors, omit
necessary parameters, or make processing errors, it is imperative to return to your raw
data set.
This is the data processing user interface. For the basics, we will concern ourselves with
Data Conversion and Bin Averaging.
Data Conversion takes your raw data (.dat or .hex) file and, with the information
contained in the .con file, converts it to scientific units. The form the converted data takes
is set up in the SBE Data Processing-Win32 Data Conversion dialog. In addition to
quantities like temperature and conductivity, parameters that are calculated from these,
such as salinity and density, are also available. If you plan to do more advanced
processing of your data, you should wait until these steps are complete before calculating
parameters derived from the basics.
File Specification
The program setup (.psu) file contains information regarding how the program was set up
last time you used it. This relieves you of the task of choosing the variables and other
processing options. In addition, if you have more than one preferred processing protocol,
you can store them as separate .psu files. As always, you must select the appropriate .con
file for your instrument. You can select multiple files for processing; a separate .cnv file
will be created for each file that is processed.
Data Setup
You can process all or part of the data file, skipping the beginning (the time you spent
checking your instrument before the cast). You can output data in ASCII (text) or binary format
(a binary file is smaller). You can convert the upcast, downcast, or both.
If you collected water samples, you can generate a file containing some CTD scans that were
collected while the water bottle was being closed. There are several means of determining where
these scans should come from:
• For the 9plus with a G.O. 1015 water sampler, the data stream is marked with a status bit
when the bottle closure occurs.
• For internally recording instruments with an AFM, the file uploaded from the AFM
(an .afm file) can be used.
• A .bsr file can be made from a .mrk file (collected while running Seasave) with the
Mark Scan module.
• If using an SBE 32 Carousel, Seasave writes a .bl file that contains the information required.
Once you have settled on the source of information for indicating bottle closure times, you must
select the amount of data to include in the file for each closure. Scan range offset is how far
back from the closure, in seconds, Data Conversion starts in your data stream. Scan range
duration is how long a span it takes from that point, again in seconds. For example, an offset of
5 seconds and a duration of 10 seconds gives you 5 seconds of data before the bottle closure and
5 seconds after it.
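The offset and duration arithmetic can be sketched in a few lines of Python. This is only an illustration of the example above, not Sea-Bird code; the function name and the closure time of 100 seconds are hypothetical.

```python
def bottle_scan_window(closure_time_s, offset_s, duration_s):
    """Return (start, end), in seconds, of the data kept for one bottle closure.

    Assumes the offset is given as a positive number of seconds to look
    back from the closure, as described in the text.
    """
    start = closure_time_s - offset_s   # look back offset_s seconds before the closure
    end = start + duration_s            # then keep duration_s seconds of data
    return start, end


# Hypothetical bottle closure 100 s into the cast, offset 5 s, duration 10 s
start, end = bottle_scan_window(100.0, 5.0, 10.0)
print(f"keep scans from {start} s to {end} s")
```

With these inputs the window is centered on the closure, matching the 5 seconds before and 5 seconds after in the example.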
Output Variables
The output variables dialog presents a table of items that will be calculated and the order
they will be calculated in.
The header view can reassure you that you are working with the instrumentation you
think you are.
What’s a Header?
• A header is placed at the beginning of each
data file
– Header for internally recorded data is written when the
data is uploaded from the instrument
– Header for real-time data is written before the data
is recorded
– These lines are preceded by a single asterisk (*)
• Operator comments are added to the header; these
lines are preceded by two asterisks (**)
• Headers are updated each time data in the file
is manipulated
– Update contains information relevant to how the data
was manipulated
– These lines are preceded by a pound sign (#)
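If you read these files with your own scripts, the three prefixes can be told apart as in the sketch below. The function name and the sample lines are hypothetical, made up for illustration; only the prefix rules come from the list above.

```python
def classify_header_line(line):
    """Classify one line of a data file by its header prefix."""
    if line.startswith("**"):        # check two asterisks before one
        return "operator comment"
    if line.startswith("*"):
        return "instrument header"
    if line.startswith("#"):
        return "processing record"
    return "data"


# Hypothetical lines, not copied from a real file
samples = [
    "* example instrument header line",
    "** example operator comment",
    "# example processing record",
    "  1.234  15.678  34.567",
]
for text in samples:
    print(classify_header_line(text), "|", text)
```

The two-asterisk test comes first because an operator comment line also starts with a single asterisk.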
The data conversion process also writes a file with data collected while water bottles
were being closed. You choose the time window of data you want written to the file. For
example, you might choose to have all scans collected starting 5 seconds before the bottle
closes and ending 5 seconds after the bottle closure written to the file. We will discuss
the use of this data in correction of CTD conductivity data in the advanced data
processing topics.
Here is a Seasave plot of some data collected on Hawaii Ocean Time-series cruise 101.
Activity
To make the data conversion from raw sensor output to scientific units:
Click Start -> Programs -> Sea-Bird -> SBEDataProcessing-Win32
Click Run -> Data Conversion
Select instrument .con file: C:\Data\Module3\Hawaii.con
Select the data file: C:\Data\Module3\Hawaii.dat
Click the Data Setup tab
Choose to process to the end of the file
Skip 0 scans
Select output format ASCII
Convert data from upcast and downcast
Create .cnv file only
Click the Select Output Variables button
Click on the first Variable Name
Select Pressure -> Digiquartz -> db
Click Add or Change
Click on the second Variable Name
Select Temperature -> ITS90 -> deg C
Click Add or Change
Click on the third Variable Name
Select Salinity
Click Add or Change
Click Start Process
We thank the Hawaii Ocean Time Series program for sharing this data with us.
Bin Averaging
• Reduces size of a data set by statistically
estimating data values at even intervals
(e.g., every meter or 10 meters)
• May work in depth (meters),
pressure (decibars), time, or by scan
• The surface bin is treated separately
Bin averaging is a means of reducing your data set to a more tractable, and perhaps
a more meaningful, size. The Bin Average module makes a statistical estimate of
data values at a user-prescribed interval based on the surrounding data. You can
bin data at every meter or every 10 meters, or use a bin size that
represents the resolution of your instrument. For time series measurements, you can
bin on a time interval.
[Figure: bin estimate made from the previous and current bins; surface = 0 db]
An estimate of each variable is made using the average value of that variable and
pressure in the previous bin, and the average values of the variable and pressure in the
current bin.
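One way to picture this estimate is as a straight-line interpolation between the two bin averages, evaluated at the center pressure of the current bin. The sketch below assumes linear interpolation; that choice, the function name, and the numbers are illustrative assumptions, not a statement of what the Bin Average module does internally.

```python
def interpolated_bin_value(p_prev, x_prev, p_curr, x_curr, p_target):
    """Estimate a variable at p_target from two bin averages.

    p_prev, x_prev: average pressure and variable value in the previous bin
    p_curr, x_curr: average pressure and variable value in the current bin
    p_target:       center pressure of the current bin
    Assumes a straight line between the two averages.
    """
    if p_curr == p_prev:
        return x_curr                      # no pressure change, nothing to interpolate
    slope = (x_curr - x_prev) / (p_curr - p_prev)
    return x_prev + slope * (p_target - p_prev)


# Hypothetical averages: 20.0 deg C at 9.0 db and 18.0 deg C at 11.0 db
estimate = interpolated_bin_value(9.0, 20.0, 11.0, 18.0, 10.0)
print(estimate)
```

The result is reported at the even target pressure rather than at the uneven average pressure of the scans in the bin.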
This protocol averages all the data within the bin, producing uneven bin pressures or
depths. For example, if you are binning on 10-meter intervals, the first bin start is
5 meters and the end is 15 meters. All data within this window is averaged, producing a
bin depth of approximately 10 meters (e.g., 10.123 meters).
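A minimal sketch of this plain averaging follows, assuming bins centered on multiples of the bin size as in the 10-meter example. The function name and the sample data are hypothetical, and scans shallower than the first bin are simply skipped here.

```python
from collections import defaultdict


def bin_average(depths, values, bin_size):
    """Average depth and value within bins centered on multiples of bin_size.

    Bin k covers [k*bin_size - bin_size/2, k*bin_size + bin_size/2).
    Returns a list of (mean depth, mean value, scan count), shallowest first.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])   # depth sum, value sum, count
    for depth, value in zip(depths, values):
        index = int((depth + bin_size / 2) // bin_size)
        if index < 1:
            continue                            # above the first bin
        entry = sums[index]
        entry[0] += depth
        entry[1] += value
        entry[2] += 1
    return [(d / n, v / n, n) for _, (d, v, n) in sorted(sums.items())]


# Hypothetical scans: depth in meters, temperature in deg C
depths = [6.0, 10.0, 14.5, 16.0, 24.0]
temps = [20.0, 19.0, 18.0, 17.0, 15.0]
for mean_depth, mean_temp, count in bin_average(depths, temps, 10.0):
    print(round(mean_depth, 3), round(mean_temp, 3), count)
```

Because the reported depth is the mean of the scans that fell in the bin, it lands near, but not exactly on, the even bin center.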
[Figure: the first bin and its center (target) pressure, for a bin size of 10 db]
The surface bin is handled differently because the previous bin would be up in the air.
The surface bin is assigned a beginning pressure or depth, an ending pressure or depth,
and a target pressure or depth.
The Data Setup tab lets you choose pressure, depth, time, or scan bins. You can
include the number of scans per bin in the output file, which is useful for instruments
with a low sample rate; a bin containing only one sample is not statistically reliable.
You can skip the data that you acquired while you were checking out your instrument
before the cast started. As in Data Conversion, you can process the upcast,
downcast, or both.
In the advanced data processing portion of the course we will discuss techniques of
removing suspect data. These data are marked in the data set as bad scans. When you
are setting up Bin Averaging you may exclude scans marked bad by previous processing
steps.
As mentioned a moment ago, the surface bin is handled separately. Note that in our
example the surface bin is not included, because we are binning on a 1-meter interval. If
you bin on a small interval, it is very difficult to calculate a surface bin. For example, we
are using 1-meter bins, which begin at 0.5 meters and end at 1.5 meters. The surface bin
would begin 0.5 meters in the air. The surface bin is useful for a coarser bin size. For
example, with 10-meter bins, the first bin starts at 5 meters and runs to 15 meters. You
can calculate a surface bin that runs from 0 meters to 5 meters, with a target value of
2.5 meters.
Activity
Seaplot
• Seaplot is an application that will plot
converted data (.cnv)
• You may plot 1 to 4 parameters against a
single Y axis
• You may plot 1 to 4 parameters against a
single X axis (useful for time series data)
• You may select axis values interactively
• Seaplot is part of the SBEDataProcessing
suite of applications
Displaying Converted
Data: Seaplot
The file setup dialog includes an entry for the program setup (.psu) file. This
allows you to store your plot setup for reuse. Output directory settings are included
for batch processing.
Plot Setup allows you to choose the plot type (single Y, multiple X or single X,
multiple Y), fonts, titles, and color schemes. Symbols are also specified here; if you are
plotting to a black and white printer, symbols are very useful. The auto range padding
entry sets the amount beyond the range of the data that the minimum and maximum axis
values extend.
The Y-axis setup tab allows selection of variable, line type, and color. Plot range and
scale direction are also specified here.
Activity
Run SBEDataProcessing-Win32
Select Seaplot
Click Select for input files and select C:\Data\Module3\Hawaii.cnv
Click the Plot Setup tab
Choose a Plot type of Single Y - Multiple X
Click the Y Axis tab
Choose Pressure, Digiquartz [db]
Check Auto-range
Click on the X Axis 1 tab
Choose Temperature
Uncheck Auto range
Enter a minimum of 0, a maximum of 30
Click on the X Axis 2 tab
Check the Include axis box
Choose Salinity
Uncheck the Auto range
Enter a minimum of 34, a maximum of 36
Click the Start Process button
Batch Processing
• Batch processing frees you from processing each
cast individually
• Batch processing is done from a command line
prompt
– Win2000/XP run “command” from the
Start -> Run dialog
– Win95/98 use an MSDOS window
– Run SBEBatch directly from the Start -> Run dialog
• You need to set the path in your autoexec.bat to include
the folder where SBEBatch resides
For processing large sets of cast data, batch mode processing automates the job. You can
use the Windows Scripting Host or a program provided in Seasoft, SBEBatch.exe. Your
batch file can take advantage of command line parameters and wildcard characters.
Batch Processing
• Batch processing uses an application that runs other
applications (i.e., the data processing applications)
• You may use the Windows Scripting Host or an
application Sea-Bird provides, SBEBatch
• The applications that the batch processor runs are
listed in a text file that you make with a text editor
like Notepad
– A list of applications is shown in your notes
• SBEBatch reads each line of the text file and runs
each application in turn
Batch Processing
• Each line of your batch file contains
– The name of the application
– The name of the files to operate on
– Any additional parameters needed to do the job
• Parameters are denoted by the ‘/’ character and an
identifier; a table of parameters is shown in your notes
• For example, a batch processing file that runs Data
Conversion on one data file looks like:
DatCnv /iMyData.dat /cMyCTD.con
• The input file is “MyData.dat”, designated by the “/i”
• The configuration file is “MyCTD.con”, designated by the “/c”
• This will cause Data Conversion to use the last .psu file
and create “MyData.cnv”
Parameter         Description
/cString          String is the instrument configuration (.con) file. String must
                  include full path and file name.
                  Note: If using /cString, must also specify input file name
                  (using /iString).
/iString          Use String as input file name. String must include full path and
                  file name.
                  This parameter supports standard wildcard expansion:
                  * matches any set of characters starting at specified position
                  within file name or extension and continuing until the end of
                  file name or extension or another specified character
/oString          Use String as output directory (not including file name).
/fString          Use String as output file name (not including directory).
/aString          Append String to output file name (before extension).
/pString          Use String as Program Setup (.psu) file. String must include
                  full path and file name.
/xModule:String   Use String to define an additional parameter to pass to Module.
                  Not all modules have x parameters; see module descriptions for
                  details. If specifying multiple x parameters, enclose in double
                  quotes and separate with a space.
                  Example: Run DATA CONVERSION, telling it to skip first 1000
                  scans: /xdatcnv:skip1000
#m                Minimize the SBE Data Processing window while processing the
                  data, allowing you to do other work on the computer.
If specifying multiple parameters, insert a space between each parameter in the list.
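If you generate batch files from a script, a line can be assembled from the parameters in the table. The Python sketch below is a hypothetical helper, not part of Seasoft. It covers only the /i, /c, /p, and /o parameters and does no quoting, so it assumes file names without spaces.

```python
def batch_line(module, input_file, con_file=None, psu_file=None, output_dir=None):
    """Build one batch-file line: module name, then space-separated parameters."""
    parts = [module, "/i" + input_file]     # /i: input file name
    if con_file:
        parts.append("/c" + con_file)       # /c: instrument configuration file
    if psu_file:
        parts.append("/p" + psu_file)       # /p: program setup file
    if output_dir:
        parts.append("/o" + output_dir)     # /o: output directory
    return " ".join(parts)


# Reproduces the Data Conversion example line from the slide
print(batch_line("DatCnv", "MyData.dat", con_file="MyCTD.con"))
```

Writing one such line per processing step to a text file gives a batch file in the form described above.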
Running SBEBatch
• SBEBatch is run from the command line
• The first parameter is the name of the batch
file that SBEBatch will open and execute
• For example: c:\>SBEBatch MyBatch.txt
• This causes SBEBatch to open MyBatch.txt and
run the applications one line at a time
Activity
• Write a batch processing script to process the
files in C:\Data\Module3\Batch
• Run Data Conversion
– Calculate depth, temperature and salinity
– Down cast only
• Run Bin Average
– Calculate 1 meter bins