
Module 3 Basic Data Processing

This document provides an overview of basic data processing steps for converting raw sensor data to scientific units, including: 1) Data Conversion to convert raw data files to calibrated scientific units using configuration files, producing .cnv files. 2) Bin Averaging to reduce large data sets into more manageable sizes by averaging data into bins based on factors like pressure, depth, or time. 3) Plotting data with Seaplot to visualize it in graphs. The document reviews the user interface and covers the Data Conversion process in more detail, explaining how it uses configuration files to make unit conversions and calculate derived parameters from raw sensor outputs.

Module 3

Basic Data
Processing
2 Module 3: Basic Data Processing

Overview

Basic Data Processing


• Conversion from raw sensor outputs to
scientific units
• Bin averaging to a manageable size
• Plotting with Seaplot
• Batch processing large numbers of files

In this module we will cover the data processing basics, no fancy stuff. With this
information you can go from raw data file to something you can print or plot.

When we finish this module you should be able to:

• Run Data Conversion and convert your raw data to scientific units.

• Run Bin Average to reduce your data set, producing data at even depth values.

• Run Seaplot to display your work.

• Process large numbers of data files by making script files.



Data Processing: Flow Chart

OK, I’ve Got My Data, Now What Do I Do?

Data Conversion
Convert raw data, creating .cnv file that can be used by Sea-Bird’s other data processing modules. Converted data includes:
– Pressure, temperature, and conductivity data
– Auxiliary sensor data (light transmission, pH, etc.)

Bin Average
Average data into desired bins, based on pressure, depth, scan number, or time.

SEAPLOT
Plot data using linear or log (for fluorescence and PAR) scales, in black-and-white or multi-color overlaid plots.

Data processing involves operating on your raw data (which came from the sensors
attached to the instrument) to convert it from the sensor outputs to scientific units,
calculate any additional oceanographic parameters of interest, and reduce the data set to a
tractable size. Always archive your raw data, because processing cannot be undone.
If you discover calibration errors, find that you omitted necessary parameters, or make
processing errors, you must return to your raw data set.

Data Processing: User Interface

Data Processing Interface

This is the data processing user interface. For the basics, we will concern ourselves with
Data Conversion and Bin Averaging.

Data Processing: Conversion to Scientific Units

Data Conversion: Raw to MKS


• This step takes the raw data that you collected
and converts it from raw sensor output to
MKS units
• Primary quantities (e.g., temperature,
conductivity, pressure) are calculated here
• Parametric quantities (e.g., salinity, density,
depth in meters) may also be calculated at
this time

Data Conversion takes your raw data (.dat or .hex) file and, with the information
contained in the .con file, converts it to scientific units. The form the converted data takes
is set up in the SBE Data Processing-Win32 Data Conversion dialog. In addition to
quantities like temperature and conductivity, parameters that are calculated from these,
such as salinity and density, are also available. If you plan to do more advanced
processing of your data, you should wait until these steps are complete before calculating
parameters derived from the basics.

Data Conversion: File Specification

File Specification

The program setup (.psu) file contains information regarding how the program was set up
last time you used it. This relieves you of the task of choosing the variables and other
processing options. In addition, if you have more than one preferred processing protocol,
you can store them as separate .psu files. As always, you must select the appropriate .con
file for your instrument. You can select multiple files for processing; a separate .cnv file
will be created for each file that is processed.

Data Conversion: Data Setup

Data Setup

The Data Setup tab gives you some choices:

You can process all or part of the data file, skipping the beginning (the time you spent
checking your instrument before the cast). You can output data in ASCII (text) or binary; the
binary file is smaller. You can convert the upcast, downcast, or both.

If you collected water samples, you can generate a file containing some CTD scans that were
collected while the water bottle was being closed. There are several means of determining where
these scans should come from:
• For the 9plus with a G.O. 1015 water sampler, the data stream is marked with a status bit
when the bottle closure occurs.
• For internally recording instruments with an AFM, the file uploaded from the AFM
(an .afm file) can be used.
• A .bsr file can be made from a .mrk file (collected while running Seasave) with the
Mark Scan module.
• If using an SBE 32 Carousel, Seasave writes a .bl file that contains the information required.

Once you have settled on the source of information for indicating bottle closure times, you have
to select the amount of data to be included in the file for each closure. Scan range offset is how
far back from the closure, in seconds, Data Conversion starts taking data from your data stream.
Scan range duration is how many seconds of data it takes from that starting point. For example,
an offset of 5 seconds and a duration of 10 seconds gives you 5 seconds of data before the
bottle closure and 5 seconds after it.
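The offset and duration arithmetic can be sketched in a few lines of Python. This is only an illustration of the description above, not Data Conversion's own code: the function name is hypothetical, and it assumes the offset is entered as a positive number of seconds before the closure, as in the example. Check the sign convention your software version expects.

```python
def bottle_scan_window(closure_time_s, offset_s, duration_s):
    """Return (start, end), in seconds, of the data window kept for one bottle closure.

    Assumption: offset_s is a positive number of seconds *before* the closure.
    """
    start = closure_time_s - offset_s
    end = start + duration_s
    return start, end


# A bottle closed 600 s into the cast, with offset 5 s and duration 10 s.
start, end = bottle_scan_window(closure_time_s=600.0, offset_s=5.0, duration_s=10.0)
print(start, end)
```

With these inputs the window runs from 5 seconds before the closure to 5 seconds after it.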

Data Conversion: Output Variable Choices

Output Variables

The output variables dialog presents a table of items that will be calculated and the order
they will be calculated in.

Data Conversion: Cast Headers

Cast Header Viewer

The header view can reassure you that you are working with the instrumentation you
think you are.

Data Conversion: File Headers Revealed

What’s a Header?
• A header is placed at the beginning of each
data file
– Header for internally recorded data is written when the
data is uploaded from the instrument
– Header for real-time data is written before the data
is recorded
– These lines are preceded with a *
• Operator comments are added to the header; these
lines are preceded with a **
• Headers are updated each time data in the file
is manipulated
– Update contains information relevant to how the data
was manipulated
– These lines are preceded with a #
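The three line markers make it easy to sort header lines by origin. The Python sketch below is an illustration only: the function name and category labels are hypothetical, and the sample lines are made up in the style of the example output that follows.

```python
def classify_header_line(line):
    """Classify one line of a data file by its leading marker."""
    if line.startswith("*END*"):
        return "end_of_header"
    if line.startswith("**"):          # must be tested before the single "*"
        return "operator_comment"
    if line.startswith("*"):
        return "instrument_header"
    if line.startswith("#"):
        return "processing_record"
    return "data"


# Made-up sample lines, one of each kind.
sample = [
    "* Temperature SN = 1123",
    "** operator comment text",
    "# nquan = 5",
    "*END*",
    "1.636 16.9770 4.654863 36.4699 0.000e+00",
]
kinds = [classify_header_line(s) for s in sample]
print(kinds)
```

The `*END*` and `**` checks come first because both also begin with a single `*`.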

Data Conversion: Output File Format

Data Conversion Output, .cnv


# datcnv_date = Jun 25 2001 10:21:54, 5.2
# datcnv_in = HOT101\0001a001.dat HOT101\S1C1\0001a001.con
# datcnv_skipover = 12000
# file_type = ascii
*END*
10.115 24.7800 35.2446 0.000e+00
10.178 24.7799 35.2442 0.000e+00
10.170 24.7798 35.2443 0.000e+00
10.115 24.7799 35.2444 0.000e+00

* Sea-Bird SBE 9 Raw Data File:
* FileName = D:34-8.DAT
* Software Version 4.0.b.g
* Temperature SN = 1123
* Conductivity SN = 915
* Pressure SN = 43440
* Number of Bytes Per Scan = 15
* System UpLoad Time = January 24, 1992 3:39:09 pm
* Ship: R/V Poseidon
* Cruise: 189-1a
* Station: 34
* Cast: 8
* Latitude: 35 22.00' N
* Longitude:12 43.36' W
* Second set of CTD trials, station 34 cast 8, SBE Mk3 Mk5a Mk5b probably no FSI trial.
# nquan = 5
# nvalues = 77205
# units = metric
# name 0 = depS: depth, salt water [m]
# name 1 = t068: temperature, IPTS-68 [deg C]
# name 2 = c0S/m: conductivity [S/m]
# name 3 = sal00: salinity, PSS-78 [PSU]
# name 4 = flag: 0.000e+00
# span 0 = 1.519, 3555.084
# span 1 = 2.4822, 17.1124
# span 2 = 3.257845, 4.669567
# span 3 = 34.9097, 36.4804
# span 4 = 0.0000e+00, 0.0000e+00
# interval = seconds: 0.0416667
# start_time = Jan 24 1992 15:39:09
# bad_flag = -9.990e-29
# sensor 0 = Frequency 0 temperature, 1123
# sensor 1 = Frequency 1 conductivity, 915, cpcor = -9.5700e-08
# sensor 2 = Frequency 2 pressure, 43440
# sensor 3 = Extrnl Volt 0 oxygen, current, 130259
# sensor 4 = Extrnl Volt 1 oxygen, temperature, 130259
# datcnv_date = Jun 22 2001 16:19:48, 4.249
# datcnv_in = 34-8.DAT 34-8.CON
# datcnv_skipover = 0
# file_type = ascii
*END*
1.636 16.9770 4.654863 36.4699 0.000e+00
1.691 16.9784 4.655062 36.4703 0.000e+00
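Because the converted file is plain ASCII, the header and data are easy to read back in other tools. The following Python sketch is a minimal, unofficial reader, assuming an ASCII .cnv laid out like the example above: `# name N = ...` lines describe the columns and data rows follow `*END*`. The function name is hypothetical and the sample text is trimmed from the listing above.

```python
def parse_cnv(text):
    """Split ASCII .cnv text into short column names and rows of floats."""
    names = {}
    rows = []
    in_data = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if in_data:
            rows.append([float(v) for v in line.split()])
        elif line == "*END*":
            in_data = True
        elif line.startswith("# name"):
            # e.g. "# name 1 = t068: temperature, IPTS-68 [deg C]"
            key, value = line[len("# name"):].split("=", 1)
            names[int(key)] = value.split(":", 1)[0].strip()
    columns = [names[i] for i in sorted(names)]
    return columns, rows


sample = """\
# nquan = 5
# name 0 = depS: depth, salt water [m]
# name 1 = t068: temperature, IPTS-68 [deg C]
# name 2 = c0S/m: conductivity [S/m]
# name 3 = sal00: salinity, PSS-78 [PSU]
# name 4 = flag: 0.000e+00
# file_type = ascii
*END*
1.636 16.9770 4.654863 36.4699 0.000e+00
1.691 16.9784 4.655062 36.4703 0.000e+00
"""
columns, rows = parse_cnv(sample)
print(columns)
print(rows[0])
```

A fuller reader would also check `# nquan` against the number of columns and honor `# bad_flag`; both are omitted here to keep the sketch short.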

Data Conversion: Water Sampler Output File Format

Data Conversion Output, .ros

1011.823 4.0297 34.5000 43234 1 0.000e+00
1011.696 4.0297 34.5000 43235 1 0.000e+00
1011.886 4.0297 34.5000 43236 1 0.000e+00
1011.759 4.0297 34.5000 43237 1 0.000e+00
1011.839 4.0303 34.4997 43238 1 0.000e+00
1011.759 4.0308 34.4998 43239 1 0.000e+00

The data conversion process also writes a file with data collected while water bottles
were being closed. You choose the time window of data you want written to the file. For
example, you might select to have all scans collected starting 5 seconds before the bottle
closes and ending 5 seconds after the bottle closure written to the file. We will discuss
the use of this data in correction of CTD conductivity data in the advanced data
processing topics.

Data Conversion: Example Data

Example Run Through - Raw Data

Here is a Seasave plot of some data collected on Hawaii Ocean Time-series cruise 101.

Activity

Activity: Convert the Example Data Set
• First view the data with Seasave
– Run Seasave, set up the plot
• Pressure 0 to 1200 decibars
• Temperature 0 to 30 degrees C
• Salinity 34 to 36 PSU
– Click Archived Data -> Start
• Select Data File: C:\Data\Module3\Hawaii.dat
• Select .con File: C:\Data\Module3\Hawaii.con
– Click Start Display
• Data Conversion
– Convert pressure, temperature, and salinity

To make the data conversion from raw sensor output to scientific units:
Click Start -> Programs -> Sea-Bird -> SBEDataProcessing-Win32
Click Run -> Data Conversion
Select instrument .con file: C:\Data\Module3\Hawaii.con
Select the data file: C:\Data\Module3\Hawaii.dat
Click the Data Setup tab
Choose to process to the end of the file
Skip 0 scans
Select output format ASCII
Convert data from upcast and downcast
Create .cnv file only
Click the Select Output Variables button
Click on the first Variable Name
Select Pressure -> Digiquartz -> db
Click Add or Change
Click on the second Variable Name
Select Temperature -> ITS90 -> deg C
Click Add or Change
Click on the third Variable Name
Select Salinity
Click Add or Change
Click Start Process

We thank the Hawaii Ocean Time Series program for sharing this data with us.

Bin Averaging

Bin Averaging
• Reduces size of a data set by statistically
estimating data values at even intervals
(e.g., every meter or 10 meters)
• May work in depth (meters),
pressure (decibars), time, or by scan
• The surface bin is treated separately

Bin averaging is a means of reducing your data set to a more tractable, and perhaps
a more meaningful, size. The Bin Average module makes a statistical estimate of
data values at a user-prescribed interval based on the surrounding data. You can
bin data on the even meter or 10 meters. You can bin data with a bin size that
represents the resolution of your instrument. For time series measurements, you can
bin on time interval.

Bin Averaging: Processing Protocol

Bin Averaging Protocol: Pressure Interpolated
• A linear estimate of variable Xi at bin pressure Pi

$$X_i = \frac{(X_c - X_p)\,(P_i - P_p)}{P_c - P_p} + X_p$$

Pp = average pressure of previous bin
Xp = average value of variable in previous bin
Pc = average pressure of current bin
Xc = average value of variable in current bin
Pi = center value for pressure in current bin

First bin, with bin size = 10 db (surface = 0 db):
• Minimum first bin = bin size - (bin size/2) = 5 db
• Center (target) first bin = bin size = 10 db
• Maximum first bin = bin size + (bin size/2) = 15 db
• Sum and average all data within bin, then interpolate to calculate value of variable at center of bin

An estimate of each variable is made using the average value of that variable and
pressure in the previous bin, and the average values of the variable and pressure in the
current bin.
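The linear estimate translates directly into code. The Python sketch below only restates the formula from the slide; the function name and the numbers are made up for illustration and are not taken from the Bin Average module.

```python
def interpolate_to_bin_center(x_prev, p_prev, x_curr, p_curr, p_center):
    """Linear estimate Xi of a variable at the bin center pressure Pi.

    x_prev, p_prev: average value and average pressure of the previous bin (Xp, Pp)
    x_curr, p_curr: average value and average pressure of the current bin (Xc, Pc)
    p_center:       center (target) pressure of the current bin (Pi)
    """
    return (x_curr - x_prev) * (p_center - p_prev) / (p_curr - p_prev) + x_prev


# Illustrative numbers: the previous bin averaged 20.0 deg C at 4 db, the
# current bin averaged 18.0 deg C at 12 db, and the bin center is 10 db.
value = interpolate_to_bin_center(
    x_prev=20.0, p_prev=4.0, x_curr=18.0, p_curr=12.0, p_center=10.0
)
print(value)
```

Here the bin center lies three quarters of the way from the previous bin's average pressure to the current one, so the estimate moves three quarters of the way from 20.0 toward 18.0.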

Bin Averaging: Processing Protocol

Bin Average Protocol: Pressure, Not Interpolated
• Data within a bin is averaged by summing
and dividing by number of points within bin

First bin, with bin size = 10 db (surface = 0 db):
• Minimum first bin = bin size - (bin size/2) = 5 db
• Center (target) first bin = bin size = 10 db
• Maximum first bin = bin size + (bin size/2) = 15 db
• Sum and average all data within bin

This protocol averages all the data within the bin, producing uneven bin pressures or
depths. For example, if you are binning on 10-meter intervals, the first bin start is
5 meters and the end is 15 meters. All data within this window is averaged, producing a
bin depth of approximately 10 meters (e.g., 10.123 meters).
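The non-interpolated protocol is a plain sum-and-divide per bin. Here is a minimal Python sketch under stated assumptions: bins are centered on multiples of the bin size, a sample exactly on a bin's upper limit falls into the next bin, and anything shallower than the first bin is skipped (the surface bin is a separate case). The function name and data are made up; this is not the Bin Average module's own code.

```python
from collections import defaultdict


def bin_average(pressures, values, bin_size):
    """Average pressure and value per bin, without interpolation.

    Returns {bin center: (mean pressure, mean value, number of scans)}.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for p, v in zip(pressures, values):
        # Bin k covers [k*size - size/2, k*size + size/2).
        index = int((p + bin_size / 2) // bin_size)
        if index < 1:
            continue  # shallower than the first bin
        acc = sums[index]
        acc[0] += p
        acc[1] += v
        acc[2] += 1
    return {
        index * bin_size: (acc[0] / acc[2], acc[1] / acc[2], acc[2])
        for index, acc in sorted(sums.items())
    }


pressures = [6.0, 9.0, 12.0, 14.5, 16.0, 24.0]
values = [24.0, 23.0, 22.0, 21.0, 20.0, 18.0]
result = bin_average(pressures, values, bin_size=10.0)
for center, (mean_p, mean_v, count) in result.items():
    print(center, mean_p, mean_v, count)
```

The mean pressure of the first bin comes out near, but not exactly at, the 10 db center, which is the "uneven bin pressures" effect described above.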

Bin Averaging: Surface Bin

The Surface Bin
• Surface bin constrained by user data entries:
minimum, maximum, and assigned pressure
or depth

Surface bin (bin size = 3 db):
• minimum surface bin = 0 db
• target surface bin = 0 db
• maximum surface bin = 3 db

First bin (bin size = 10 db):
• Minimum first bin = bin size - (bin size/2) = 5 db
• Center (target) first bin = bin size = 10 db
• Maximum first bin = bin size + (bin size/2) = 15 db

The surface bin is handled differently because the previous bin would be up in the air.
The surface bin is assigned a beginning pressure or depth, an ending pressure or depth,
and a target pressure or depth.

Bin Averaging: File Selection

Bin Average: File Selection



Bin Averaging: Data Setup

Bin Average: Data Setup

The Data Setup tab lets you choose pressure, depth, time, or scan bins. You can
include the number of scans per bin in the output file, which is useful for instruments
with a low sample rate: a bin that contains only 1 sample is not a meaningful average.

You can skip the data that you acquired while you were checking out your instrument
before the cast started. Similar to previous processing, you can process the upcast,
downcast, or both.

In the advanced data processing portion of the course we will discuss techniques of
removing suspect data. These data are marked in the data set as bad scans. When you
are setting up Bin Averaging you may exclude scans marked bad by previous processing
steps.

As mentioned a moment ago, the surface bin is handled separately. Note that in our
example the surface bin is not included, because we are binning on a 1-meter interval. If
you bin on a small interval, it is very difficult to calculate a surface bin. For example, we
are using 1-meter bins, which begin at 0.5 meters and end at 1.5 meters. The surface bin
would begin 0.5 meters in the air. The surface bin is useful for a coarser bin size. For
example, with 10-meter bins, the first bin starts at 5 meters and runs to 15 meters. You
can succeed in calculating a surface bin that runs from 0 meters to 5 meters with value
2.5 meters.
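The bin-limit arithmetic behind this reasoning can be checked with a short Python sketch. The function name is hypothetical; it only restates the "bin size - (bin size/2)" and "bin size + (bin size/2)" rules from the slides.

```python
def first_bin_limits(bin_size):
    """Return (minimum, center, maximum) of the first regular bin."""
    return bin_size - bin_size / 2, bin_size, bin_size + bin_size / 2


for size in (1.0, 10.0):
    minimum, center, maximum = first_bin_limits(size)
    # Water between the surface (0) and the first bin's minimum is all that
    # is left over for a surface bin.
    print(f"bin size {size}: first bin {minimum} to {maximum}, "
          f"room above it {minimum}")
```

With 1-meter bins only half a meter of water is left above the first bin, while 10-meter bins leave 5 meters for the surface bin.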

Bin Averaging: Output Data

Bin Average: Output Data


# binavg_bintype = meters
# binavg_binsize = 1
# binavg_excl_bad_scans = yes
# binavg_skipover = 0
# binavg_surface_bin = no, min = 0.000, max = 5.000, value = 2.500
# file_type = ascii
*END*
1.000 24.9124 35.2455 0.0000e+00 100.00
2.000 24.9582 35.2463 0.0000e+00 90.00
3.000 25.0029 35.2477 0.0000e+00 36.00

Activity

Activity: Bin Average the Example Data
• Use the data you converted:
C:\Data\Module3\Hawaii.cnv
• Set output directory to
C:\Data\Module3
• 5 decibar pressure bins
• Include surface bin
– Minimum value 0
– Maximum value 5
– Surface bin value 2.5
• Name append “B”

Click Start -> Programs -> Sea-Bird -> SBEDataProcessing-Win32


Click Run -> Bin Average
Click Select beside Input files to choose C:\Data\Module3\Hawaii.cnv
Click Select beside Output directory to choose C:\Data\Module3
Enter Name append of “B”
Output file should be HawaiiB.cnv
Click the Data Setup tab
Choose Bin Type of Pressure
Enter Bin size of 5
Check Include number of scans per bin
Check Exclude scans marked bad
Skip over 0 scans
Process the downcast
Click Start Process

Data Display: Seaplot

Seaplot
• Seaplot is an application that will plot
converted data ( .cnv )
• You may plot 1 – 4 parameters against a
single Y axis
• You may plot 1 – 4 parameters against a
single X axis ( useful for time series data )
• You may select axis values interactively
• Seaplot is part of the SBEDataProcessing
suite of applications

Seaplot: Displaying Converted Data

Displaying Converted
Data: Seaplot

The file setup dialog includes an entry for the program setup file or .psu. This
allows you to store your plot setup for reuse. Output directory settings are included
for batch processing.

Seaplot: Plot Setup

Displaying Converted
Data: Seaplot

Plot Setup allows you to choose the plot type (single Y, multiple X or single X,
multiple Y), fonts, titles, and color schemes. Symbols are also specified here; if you are
plotting to a black and white printer, symbols are very useful. The auto range padding
entry sets the amount beyond the range of the data that the minimum and maximum axis
values extend.

Seaplot: Y-Axis Specification

Displaying Converted
Data: Seaplot

The Y-axis setup tab allows selection of variable, line type, and color. Plot range and
scale direction are also specified here.

Seaplot: X-Axis Specification

Displaying Converted
Data: Seaplot

Seaplot: The Plot

Displaying Converted
Data: Seaplot

Activity

Activity: Plot your Data


• Use Seaplot to plot your converted files
– C:\Data\Module3\Hawaii.cnv
– C:\Data\Module3\HawaiiB.cnv

Run SBEDataProcessing-Win32
Select Seaplot
Click Select for input files and select C:\Data\Module3\Hawaii.cnv
Click the Plot Setup tab
Choose a Plot type of Single Y - Multiple X
Click the Y Axis tab
Choose Pressure, Digiquartz [db]
Check Auto-range
Click on the X Axis 1 tab
Choose Temperature
Uncheck Auto range
Enter a minimum of 0, a maximum of 30
Click on the X Axis 2 tab
Check the Include axis box
Choose Salinity
Uncheck the Auto range
Enter a minimum of 34, a maximum of 36
Click the Start Process button

Data Processing of Large Numbers of Files

Batch Processing
• Batch processing frees you from processing each
cast individually
• Batch processing is done from a command line
prompt
– Win2000/XP run “command” from the
Start -> Run dialog
– Win95/98 use an MSDOS window
– Run SBEBatch directly from the Start -> Run dialog
• You need to set your autoexec.bat path to include
the folder where SBEbatch resides

For processing large sets of cast data, batch mode processing automates the job. You can
use the Windows Scripting Host or a program provided in Seasoft, SBEBatch.exe. Your
batch file can take advantage of command line parameters and wildcard characters.

Data Processing of Large Numbers of Files (continued)

Batch Processing
• Batch processing uses an application that runs other
applications (i.e., the data processing applications)
• You may use the Windows Scripting Host or an
application Sea-Bird provides, SBEBatch
• The applications that the batch processor runs are
listed in a text file that you make with a text editor
like Notepad
– A list of applications is shown in your notes
• SBEBatch reads each line of the text file and runs
each application in turn

Module               Process Name
ALIGN CTD            AlignCTD
ASCII IN             ASCIIIn
ASCII OUT            ASCIIOut
BIN AVERAGE          BinAvg
BUOYANCY             Buoyancy
CELL THERMAL MASS    CellTM
DATA CONVERSION      DatCnv
DERIVE               Derive
FILTER               Filter
LOOP EDIT            LoopEdit
MARK SCAN            MarkScan
ROSETTE SUMMARY      RosSum
SECTION              Section
SPLIT                Split
STRIP                Strip
TRANSLATE            Trans
WILD EDIT            WildEdit
WINDOW FILTER        WFilter

Data Processing of Large Numbers of Files (continued)

Batch Processing
• Each line of your batch file contains
– The name of the application
– The name of the files to operate on
– Any additional parameters needed to do the job
• Parameters are denoted by the ‘/’ character and an
identifier; a table of parameters is shown in your notes
• For example, a batch processing file that runs Data
Conversion on one data file looks like:
DatCnv /iMyData.dat /cMyCTD.con
• The input file is “MyData.dat”, designated by the “/i”
• The configuration file is “MyCTD.con”, designated by the “/c”
• This will cause Data Conversion to use the last .psu file
and create “MyData.cnv”

Parameter          Description
/cString           String is the instrument configuration (.con) file. String must include
                   full path and file name.
                   Note: If using /cString, must also specify input file name (using
                   /iString).
/iString           Use String as input file name. String must include full path and
                   file name.
                   This parameter supports standard wildcard expansion:
                   * matches any set of characters starting at specified position within
                   file name or extension and continuing until the end of file name or
                   extension or another specified character
/oString           Use String as output directory (not including file name).
/fString           Use String as output file name (not including directory).
/aString           Append String to output file name (before extension).
/pString           Use String as Program Setup (.psu) file. String must include full path
                   and file name.
/xModule:String    Use String to define an additional parameter to pass to Module. Not
                   all modules have x parameters; see module descriptions for details. If
                   specifying multiple x parameters, enclose in double quotes and
                   separate with a space.
                   Example: Run DATA CONVERSION, telling it to skip first 1000
                   scans: /xdatcnv:skip1000
#m                 Minimize the SBE Data Processing window while processing the
                   data, allowing you to do other work on the computer.

If specifying multiple parameters, insert a space between each parameter in the list.
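If you generate batch files from another program, the parameter table maps onto a small helper. The Python sketch below assembles one batch-file line from the /i, /c, /o, /a, and /p parameters described above. The helper name `batch_line` is hypothetical and not part of Seasoft, and the sketch does not handle paths that contain spaces.

```python
def batch_line(process, input_file, con_file=None, output_dir=None,
               append=None, psu_file=None):
    """Build one batch-file line, with a space between each parameter."""
    parts = [process, "/i" + input_file]
    if con_file:
        parts.append("/c" + con_file)
    if output_dir:
        parts.append("/o" + output_dir)
    if append:
        parts.append("/a" + append)
    if psu_file:
        parts.append("/p" + psu_file)
    return " ".join(parts)


# The Data Conversion example from the slide.
convert = batch_line("DatCnv", "MyData.dat", con_file="MyCTD.con")

# Bin Average on the converted file, appending "B" to the output name.
average = batch_line("BinAvg", r"C:\Data\Module3\Hawaii.cnv",
                     output_dir=r"C:\Data\Module3", append="B")
print(convert)
print(average)
```

Writing the returned lines to a text file, one per line, gives a file in the form SBEBatch expects.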

Data Processing of Large Numbers of Files (continued)

Batch Processing Script


• To process all the files in a folder, use a
wildcard: the ‘*’ character
• For example, a batch processing file that runs
Data Conversion on all the data files in a
folder looks like:
datcnv /iC:\Data\*.dat /cMyCTD.con
• The input files are all .dat files in C:\Data\
• The configuration file is “MyCTD.con”
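To preview which files a pattern such as `*.dat` would select, you can try it against a list of names. The Python sketch below uses the standard library's `fnmatch` module as a stand-in. That is an assumption for illustration: SBEBatch performs its own wildcard expansion, and its exact matching rules may differ. The file names are made up.

```python
from fnmatch import fnmatchcase

# Made-up folder contents.
names = ["cast001.dat", "cast002.dat", "cast001.con", "notes.txt"]

# Keep only the names that match the wildcard pattern.
matched = [name for name in names if fnmatchcase(name, "*.dat")]
print(matched)
```

Only the two .dat files are selected; the .con and .txt files are left alone.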

Data Processing of Large Numbers of Files (continued)

Running SBEBatch
• SBEBatch is run from the command line
• The first parameter is the name of the batch
file that SBEBatch will open and execute
• For example: c:\>SBEBatch MyBatch.txt
• Causes SBEBatch to open MyBatch.txt and
run the applications a line at a time

Data Processing of Large Numbers of Files (continued)

Batch Processing Script


• You can operate on files in different folders with
the same batch file by using command line
parameters
• These are entered after the batch file name and are
denoted by the ‘%’ character and a number
– The first command line parameter is %1, the second is
%2 etc.
• Your batch file must have entries that use the ‘%’
parameters

Data Processing of Large Numbers of Files (continued)

Batch Processing Script


• For example a batch file that has this line
DatCnv /i%1\*.dat /c%1\MyCTD.con
• Executed with this command line
c:\>SBEBatch MyBatch.txt C:\Data
• Will cause Data Conversion to be run like this:
DatCnv /iC:\Data\*.dat /cC:\Data\MyCTD.con
• C:\Data is the %1 parameter
• All the .dat files in C:\Data will be converted
• The command line c:\>SBEBatch MyBatch.txt C:\NewData
• Causes all the .dat files in C:\NewData to be converted
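The substitution described above can be mimicked in Python to check a batch line before running it. This sketch only imitates the %1, %2 replacement for illustration; the function name is hypothetical and SBEBatch's actual parsing may differ in details.

```python
import re


def expand_batch_line(line, args):
    """Replace %1, %2, ... in a batch-file line with command line parameters."""
    def replace(match):
        index = int(match.group(1))
        return args[index - 1]          # %1 is the first parameter
    return re.sub(r"%(\d)", replace, line)


line = r"DatCnv /i%1\*.dat /c%1\MyCTD.con"
expanded = expand_batch_line(line, [r"C:\Data"])
print(expanded)
```

Passing `C:\NewData` instead yields the line that converts the files in that folder, with no change to the batch file itself.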

Activity

Activity
• Write a batch processing script to process the
files in C:\Data\Module3\Batch
• Run Data Conversion
– Calculate depth, temperature and salinity
– Down cast only
• Run Bin Average
– Calculate 1 meter bins
