Module 3 Script
Slide 5: Demonstration
For more information on this, you can see the steps in bats.pdf of workshop_material. These steps are
beyond the scope of this workshop; however, it is worth noting that the last step in bats.pdf turns all of the
downloaded data into CTD objects so that they are something the oce package can work with.
Now in RStudio, we're going to open bats.R. In line 2 we're loading our required packages, and in line 3
we're loading the data set of all of the CTD data from the BATS data set. This is given to the user as
CTD.rda in workshop_material. If the user now types length(CTD) in the console, they'll see we're now
dealing with almost 6000 measurements. The main difference here compared to what we did earlier
was that before we were only dealing with one CTD measurement. This now means that when we want to
use the [[ function, we first have to specify which profile we're interested in. For example, if we
were interested in temperature, we can't just do CTD[['temperature']]. We would instead have to do
CTD[[1]][['temperature']], which is the temperature of the first profile.
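A minimal sketch of this two-level lookup, using plain R lists to stand in for the oce CTD objects (the real CTD objects support the same [[ syntax; the temperature values here are made up):

```r
# Two stand-in "profiles", each a list with a temperature vector
CTD <- list(list(temperature = c(18.2, 17.9, 16.5)),
            list(temperature = c(19.1, 18.4, 17.0)))

CTD[['temperature']]       # NULL: the outer list has no 'temperature' entry
CTD[[1]][['temperature']]  # the first profile's temperatures
```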
Slide 3: Demonstration
Back in RStudio, using the bats.R script, we've already loaded in our data, and now we're going to extract
the time from every single profile. Instead of doing CTD[[1]][['time']], followed by CTD[[2]][['time']],
and so on over 6000 times, we're instead going to use what's called lapply(), or "list apply". lapply() looks at a list,
in our case the list of CTD profiles, and performs an action on every item in the list. If we look at the output
from line 10, we see that it is a list of the times extracted from all of the CTD profiles. In line 11, we're simply
unlisting all of the times and putting them into one object.
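The extraction in lines 10 and 11 can be sketched as follows, with two plain-list profiles standing in for the ~6000 oce CTD objects (the real objects support the same [[ syntax; the dates are made up):

```r
# Stand-in profiles, each carrying one sampling time
CTD <- list(list(time = as.POSIXct('1990-01-05', tz = 'UTC')),
            list(time = as.POSIXct('1990-02-03', tz = 'UTC')))

times <- lapply(CTD, function(profile) profile[['time']])  # list: one time per profile
time <- unlist(times)                                      # collapse into a single vector
# Note: unlist() drops the POSIXct class, leaving seconds since 1970-01-01; it can be
# restored with as.POSIXct(time, origin = '1970-01-01', tz = 'UTC') if needed.
```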
Remembering that our goal is to look at the change in temperature over time, in line 14 we're setting the
parameter to be "temperature"; however, if the user was interested in oxygen, salinity, fluorescence, etc.,
they could set the parameter to whatever they were interested in. Now we're using a for loop. A for
loop is similar to lapply() in that it performs an action on each item of a list. If, for example, we wanted to look at the
mean temperature of the upper 200 m for the first profile, we would do the following:
keep <- which(CTD[[1]][['depth']] < 200)   # Determine which depths are less than 200 m
param <- CTD[[1]][[parameter]][keep]       # Extract the temperatures at depths less than 200 m
meanUpperParam <- mean(param, na.rm=TRUE)  # Determine the mean temperature while removing any NA values
upper[1] <- meanUpperParam                 # Insert the mean into the storage that was previously created
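Wrapped in a for loop over every profile, the steps above look roughly like this. This is a sketch with two small synthetic stand-in profiles rather than the real BATS data; the names depth, parameter, and upper follow the script:

```r
# Synthetic stand-in for the list of CTD profiles
CTD <- list(list(depth = c(10, 150, 300), temperature = c(20, 18, NA)),
            list(depth = c(20, 250),      temperature = c(NA, 4)))
parameter <- 'temperature'
upper <- rep(NA_real_, length(CTD))          # storage: one mean per profile
for (i in seq_along(CTD)) {
  keep <- which(CTD[[i]][['depth']] < 200)   # depths shallower than 200 m
  param <- CTD[[i]][[parameter]][keep]       # parameter values at those depths
  upper[i] <- mean(param, na.rm = TRUE)      # NaN if only NA values remain
}
upper   # 19 for profile 1; NaN for profile 2 (its only shallow value is NA)
```

The second profile illustrates the NaN case discussed below: its only measurement above 200 m is NA, so the mean of the remaining (empty) set is NaN.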
The for loop allows us to write that operation once instead of more than 6000 times. This means
that "upper" holds the mean temperature of the upper 200 m for all of the 6000 profiles. The user
may notice that "upper" sometimes contains NaN values. When the average is taken of a set of values
that are all NA, the result is NaN. This means that even though we removed the NA values when we took
the average, in some cases there were only NA values, so an NaN value was created. Starting at line 23,
we're simply removing all of the NaN values associated with time (line 25) and upper (line 26). In line 27,
we are then determining the order of time to make sure it is chronological. In lines 28 and 29, we
are then putting our mean temperature values and our times in chronological order for good
practice.
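The clean-up in lines 23 to 29 can be sketched like this, using small made-up vectors in place of the real time and upper data:

```r
time <- c(300, 100, 200)    # stand-in times (seconds), deliberately out of order
upper <- c(NaN, 7.0, 6.5)   # one NaN mean to be dropped

bad <- is.nan(upper)        # flag the NaN entries
time <- time[!bad]
upper <- upper[!bad]

o <- order(time)            # indices that put time in chronological order
time <- time[o]
upper <- upper[o]           # reorder upper the same way so the pairs stay matched
```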
M3 Lesson 3: Analysis
Slide 1: Title Slide
Slide 2: Demonstration
Now that we've extracted our mean temperature for the upper 200 meters of each profile, removed
any bad data, and ordered our data chronologically, we're now at the point where we're going
to perform a linear regression to determine the trend over time. In line 31, we're plotting the mean upper
temperature as a function of time, while changing our symbol shape and size, giving
names to the x and y labels, and changing the colour in a sophisticated way to make the points slightly
transparent.
In this plot, we see the mean temperature of the upper 200 meters and how it changes over time.
We're now going to find the rolling average of this change in temperature. The rolling average, otherwise
known as the boxcar average, is a way to remove the noise from our analysis. In our case, we specified
n=5, which means the 5 points to the right and 5 points to the left of each point are averaged to remove
noise.
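A sketch of the plot and the rolling average, using synthetic data and base R's stats::filter() for an 11-point boxcar (5 points on each side plus the centre); the script itself may use a different helper function for the smoothing:

```r
# Synthetic monthly-ish time series with noise (made up for illustration)
time <- as.POSIXct('1990-01-01', tz = 'UTC') + (0:99) * 86400 * 30
upper <- sin(seq(0, 6, length.out = 100)) + rnorm(100, sd = 0.2)

plot(time, upper, pch = 20, cex = 0.5, xlab = 'Time',
     ylab = 'Temperature [degC]', col = rgb(0, 0, 0, 0.3))    # transparent points
smoothed <- stats::filter(upper, rep(1, 11) / 11, sides = 2)  # 11-point boxcar
lines(time, smoothed, lwd = 2)
```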
Slide 4: Demonstration
Now we're going to look at the linear regression of mean upper 200 m temperature over time in line 35
and then draw it onto our graph in line 36. If we then type "trend" in our console, we see that we're given
two numbers: the intercept and the slope. Initially, this slope, i.e. the change in temperature over time,
might seem very low. If we look at class(time[1]), however, we see that the class of the time on our x-
axis is POSIXct.
Slide 6: Demonstration
This means the slope is the change in temperature per second. In order to get the change in
temperature per decade, we need to do the calculation in line 40. In line 40, we extracted the slope using
coef(trend)[[2]] (if we wanted the intercept, we would use coef(trend)[[1]]), then we multiplied by 86400
because there are 86400 seconds in a day, then by 365 because there are 365 days in a year, and
finally by 10 because there are 10 years in a decade. After this calculation, we determine that the change
in the upper 200 m temperature is 0.39 degrees Celsius per decade.
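The regression and the unit conversion can be sketched as follows, using synthetic data built with a known warming rate of 0.39 degrees per decade so the conversion can be checked:

```r
secondsPerDecade <- 86400 * 365 * 10
time <- as.POSIXct('1990-01-01', tz = 'UTC') + (0:299) * 86400 * 30
upper <- 18 + 0.39 * as.numeric(time) / secondsPerDecade  # exact linear trend

plot(time, upper, pch = 20)   # stand-in for the plot made in line 31
trend <- lm(upper ~ time)     # slope is in degrees per *second* (POSIXct x-axis)
abline(trend)                 # line 36: draw the fit onto the graph
perDecade <- coef(trend)[[2]] * 86400 * 365 * 10  # degrees per decade
```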
Slide 3: Limitations
Now, there are some limitations to our study. The first one is temporal coverage. For example, if
one year the samples were all taken in the summer, and the next year they were all taken
during the winter, that could bias our analysis. One way around this is that, for months with irregular
sampling, we could interpolate to the middle of every month.
Secondly, another limitation is pressure. For example, we took the mean temperature of the upper 200
meters, but if there was a float that only went down to, say, 50 meters and we took the average of that,
it would make that particular float look warmer than the others. One way around this is
that in a given month we could use only the deepest CTD.
Thirdly, there may be differences between CTD measurements. To address this, we could calibrate all
CTD's to the same scale.
And then lastly, there's also a concern about quality control and we therefore may need to work with
experts to flag poor quality data.
§ Sea level rise: Sea level rise is caused by melting of polar ice sheets and glaciers, decrease in
salinity, ocean thermal expansion, ocean circulation, solid earth deformation, exchanges with
surface and ground water, and melting of mountain glaciers. This can lead to problems
including coastal erosion, marine flooding, and saltwater intrusion in coastal aquifers
(Cazenave and Cozannet, 2014).
§ Ocean heat waves: Global warming and ocean temperature increase can lead to ocean heat
waves. These heat waves can be particularly harmful to our oceans as they can lead to fish kills
and migration of tropical fish species and megafauna (Pearce and Feng, 2013).
§ Coral bleaching: Elevated temperature, paired with solar irradiance and sometimes disease,
can lead to coral bleaching. This can cause many side effects in our oceans, including declines
in topographic complexity, with consequences such as death or extinction for organisms that
depend on live corals for food, shelter, or settlement (Pratchett et al., 2008).
This information can be relevant to inform various decision making and policy related to: