Citi 2020
1 Introduction
Oceanography is an important Earth science that studies the physical and
biological aspects of the ocean, and it requires large amounts of data for
modelling, investigating, predicting, and explaining different natural
phenomena. Such data are usually provided by scientific instruments such as
satellites, oceanographic ships, buoys, and others. Because available data are
growing in quantity and in complexity, the need is emerging for a machine
learning approach to integrate, if not substitute, the more classical
statistical take in oceanographic research.
Among the many applications of oceanography to marine resources management,
controlling and preventing illegal fishing stands out as a very important one.
Illegal, unregulated, and unreported fishing is becoming more sophisticated,
and, as it turns out, oceanographic conditions are predominant predictors of
the seasonal variations in fishing effort [2]. Being able to automatically
identify favourable
2 F. Author et al.
fishing zones is one possible strategy for supporting the monitoring and
prevention of illegal fishing activity, and, to this end, automatically
identifying favourable oceanic conditions is a promising approach.
Chlorophyll-a is a specific form of chlorophyll used in oxygenic photosynthesis
which has been linked to nutrient presence in several different areas [7, 19, 6]. It
is known that certain kinds of satellite data can be used to predict the presence
of Chlorophyll-a in oceanic areas [6, 9]. In this work, we consider open access
data taken from the Copernicus space program, currently used in the European
Union for Earth observation and monitoring, in an attempt to build a reliable
spatial-temporal prediction model for Chlorophyll-a presence around the Galapagos
Islands, and, in particular, in the Galapagos Marine Reserve (GMR), with the
purpose of creating the basis for an implementable, cost-effective, and reliable
model for potential fishing area prediction to be used in illegal fishing control
activities. Chlorophyll-a presence at a certain geographical point, in
combination with relevant physical, chemical, and biological variables at the
same point, can be thought of as a multivariate spatial-temporal series, in
which Chlorophyll-a plays the role of the dependent variable. As such,
multivariate spatial-temporal regression can be used to estimate not only the
functional model, but also the temporal component for each predictor. In
extracting a regression model from spatial-temporal data, one can benefit from
a suitable transformation of the data that highlights the roles played by the
spatial and by the temporal components. Such a transformation is known as a
convolution vector [8]. In this work we want to assess the expected
improvement that can be obtained by applying a convolution vector to the
original data, effectively paving the way toward a more systematic exploration
and optimization of the possible data transformations that take into account
both the spatial and the temporal components.
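As an illustration of the kind of transformation discussed above, the following
minimal sketch applies a fixed vector of weights across lagged values of a
single predictor. This is only one plausible reading of a convolution vector,
assumed here for illustration; the precise definition is the one given in [8].
The function name `apply_lag_weights` and the toy values are hypothetical.

```python
from typing import List, Sequence


def apply_lag_weights(series: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Replace each value x[t] by sum_k weights[k] * x[t - k].

    weights[0] multiplies the current value, weights[1] the value one step
    back, and so on. The first len(weights) - 1 time steps are dropped
    because they lack a full history.
    """
    if not weights or len(weights) > len(series):
        raise ValueError("weights must be non-empty and no longer than the series")
    max_lag = len(weights) - 1
    transformed = []
    for t in range(max_lag, len(series)):
        transformed.append(sum(w * series[t - k] for k, w in enumerate(weights)))
    return transformed


# Toy predictor values (hypothetical, not taken from the Copernicus data).
predictor = [1.0, 2.0, 3.0, 4.0, 5.0]

# All weight on lag 2: the transformed series is the predictor delayed by two steps.
delayed = apply_lag_weights(predictor, [0.0, 0.0, 1.0])

# Equal weight on lags 0 and 1: a two-step moving average.
smoothed = apply_lag_weights(predictor, [0.5, 0.5])
```

Under this reading, choosing the weights amounts to choosing how much each past
value of a predictor contributes to the regression, which is the temporal
component that the regression is meant to estimate.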
This paper is organized as follows. In Section 2 we give a short account of
the current literature that concerns oceanographic data and learning. In
Section 3 we give some practical motivations for this work and the problem we
want to solve. Then, in Section ?? we describe the data that we have used and
the mathematical model that we have applied. Finally, in Section ?? we describe
our practical approach and its results, before concluding.
2 Related Work
As the quantity, the complexity, and the availability of oceanographic data
grow, machine learning-based approaches to their analysis are becoming ever
more common [1]. Typical applications range from climate prediction, habitat
modeling, and climate change analysis, to species distribution, species
identification, resource management, and environmental protection (see [18]
for a recent review). Examples of concrete applications include species
identification [10], automatic detection and classification of ocean
pollution, oil spills, algal blooms, and plastic pollution [5], as well as
several fishing control-related applications. Fishing control and connected
activities, in particular, are of special interest in this work.
Temporal Aspects of Chlorophyll-a Prediction 3
Fig. 1. Fishing ships around the Galapagos exclusive economic zone. Snapshot taken in
August 2019. Source: http://www.globalfishingwatch.org.
in [9]. The main differences between [9] and our approach are that the former
uses coastal data extracted from sensors, instead of high-sea data from
satellites, and, moreover, it does not include a spatio-temporal study of the
cause-effect relationships.
3 Motivation
The Galapagos Islands are located more than 1000 km west of Ecuador’s
continental coast. In 1998, the Government of Ecuador created the Galapagos
Marine Reserve (in short, GMR) to preserve the resources of the islands. In 2001,
Galapagos was declared a World Heritage Site by UNESCO. Due to their location,
the islands receive, from the east, the influence of two currents, the so-called
Humboldt’s cold current and Panama’s warm current, while, from the west,
that of the so-called cold and deep Cromwell current. These currents carry
nutrient-rich waters from the bottom of the sea to the surface. As a
result of this combination, Galapagos has extremely high productivity areas with
diverse marine organisms [11]. Because of its rich seas, this ecosystem attracts
various species towards the exclusive economic zone of the Galapagos and its
surroundings, and, for this reason, it receives pressure from industrial fishing,
principally from Asia, as can be seen in Fig. 1. Often, the activities
around the maritime limits result in illegal fishing, which is very difficult to
control due to the immensity of the Galapagos maritime territory. The effective
control of maritime spaces can only be carried out through satellite monitoring;
however, this is an extremely expensive solution. Other less expensive solutions
require the use of VMS and S-AIS systems, which are installed onboard the fishing
ships, to monitor their position and fishing activities; nevertheless, these
systems can be disconnected by the ships when performing illegal fishing activities.
An alternative solution to support illegal fishing control while reducing the cost
of surveillance is being able to predict the area where fishing activity may take
4 Chlorophyll Prediction
4.1 Data
The space program that is currently used in the European Union for Earth obser-
vation and monitoring is called Copernicus, and it encompasses three complete
constellations, each one with two satellites plus an additional single satellite.
This system provides 150 TB of open access data every day, including funda-
mental measurements or estimates of several physical, chemical, and biological
oceanic variables. Ocean color information from the Sentinel satellites is
employed for monitoring water quality through chlorophyll-a and phytoplankton
analysis; other oceanic variables such as wave height, tide, sea current,
salinity, temperature, nutrients, and oxygen are used to develop hydrodynamic
models to forecast the evolution of ocean variables relevant for aquaculture.
Moreover, temperature, salinity, mixed layer thickness, wind, sea currents,
wave heights, chlorophyll, phytoplankton, zooplankton, and nutrients are used
to develop models related to oceanic conditions and the spatial distribution of
fish habitat [15].
Satellite data. Years. Granularity. Available columns. Basic statistical values
for the columns (mean, variance, skewness, kurtosis, normality, etc.). Missing
data. Preprocessing (only classical preprocessing, not the preprocessing specific
to this problem).
The study area of this research lies around the Galapagos Islands, between
6° N and 10° S and between 85° W and 116° W. The study has been conducted using
E.U. Copernicus Marine Service Information products [3, 4], namely Global
Physical Analysis and Coupled System Forecasting (Global Analysis
Forecast-PHY-CPL-001-012) and Global Biogeochemical Analysis and Forecast
(Global Analysis Forecast-BIO-001-028). The first product provides physical
variables and the second provides biological and chemical variables. A summary
of the variables considered is given in Table 1 and Table 2. The dataset was
constructed with daily mean values of these variables for January 2018, with
a granularity of 1/4°, obtaining a total of 8124 samples per day for each
variable, without missing data. Basic statistical values of the physical and
bio-chemical variables are given in Table 4 and Table 3. An advantage of the
Copernicus system is the availability of data at various depths. A depth of
40 meters has been selected because the depth scales differ between the two
products but coincide at this depth.
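The depth selection described above can be sketched as a search for the levels
shared by two vertical grids. The snippet below is a minimal illustration under
stated assumptions: the depth values are hypothetical placeholders, not the
actual vertical levels of the two Copernicus products, and the helper name
`common_depths` is ours.

```python
from typing import List, Sequence


def common_depths(levels_a: Sequence[float],
                  levels_b: Sequence[float],
                  tolerance: float = 0.01) -> List[float]:
    """Return the depths in levels_a that also occur in levels_b.

    Two depths are considered equal when they differ by at most `tolerance`
    meters, which absorbs small rounding differences between products.
    """
    return [a for a in levels_a
            if any(abs(a - b) <= tolerance for b in levels_b)]


# Hypothetical depth axes in meters (NOT the real product grids).
physical_depths = [5.0, 15.0, 40.0, 110.0]
biogeochemical_depths = [0.5, 25.0, 40.0, 90.0]

# Depths at which variables from both products can be paired directly.
shared = common_depths(physical_depths, biogeochemical_depths)
```

Restricting the dataset to a shared level avoids interpolating one product onto
the vertical grid of the other.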
4.3 Experiment
Explain experimental parameters. Describe training data set and test data set.
Table with test results. Results for 1-unit prediction, 2-units prediction, etc.
The relationships between chlorophyll levels and the other physical and
biochemical variables were identified through simple correlation matrices
(Table 5 and Table 6). Taking chlorophyll as the reference, there is a high
positive correlation with nutrients (NO3, PO4, Si, Fe), phytoplankton, and
primary production, and a negative correlation with oxygen, pH, temperature,
and mixed layer depth. Other parameters, such as surface CO2, salinity, sea
surface height, and sea current water velocity, have a low correlation.
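A correlation matrix of this kind can be computed as in the following minimal
sketch, which assumes the Pearson coefficient (the text does not state which
coefficient was used). The column names and values are toy data invented for
illustration and do not reproduce the figures in the tables.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("samples must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Pairwise Pearson correlations between all named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Toy samples (hypothetical values): chlorophyll, nitrate, and temperature.
toy = {
    "chl": [0.1, 0.2, 0.3, 0.4],
    "no3": [1.0, 2.1, 2.9, 4.2],
    "temp": [25.0, 24.0, 22.0, 21.0],
}
matrix = correlation_matrix(toy)
```

Reading the row for chlorophyll then gives, for each variable, the sign and
strength of its linear association with the dependent variable.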
4.4 Results
Discuss the results.
5 Conclusions
References
1. Ahmad, H.: Machine learning applications in oceanography. Aquatic Research 2(3),
161–169 (2019)
2. Cimino, M.A., Anderson, M., Schramek, T., Merrifield, S., Terrill, E.J.: Towards a
fishing pressure prediction system for a Western Pacific EEZ. Scientific Reports 9(1),
1–10 (2019)
3. COPERNICUS: Product User Manual for Global Biogeochemical Analysis and
Forecasting Product. Marine Environment Monitoring Service (2019),
https://resources.marine.copernicus.eu/
4. COPERNICUS: Product User Manual for Global Physical Analysis and Coupled
System Forecasting Product. Marine Environment Monitoring Service (2020),
https://resources.marine.copernicus.eu/
5. Del Frate, F., Petrocchi, A., Lichtenegger, J., Calabresi, G.: Neural networks for
oil spill detection using ERS-SAR data. IEEE Transactions on Geoscience and Remote
Sensing 38(5), 2282–2287 (2000)
6. Desortová, B.: Relationship between chlorophyll-α concentration and phytoplank-
ton biomass in several reservoirs in Czechoslovakia. Internationale Revue der
gesamten Hydrobiologie und Hydrographie 66(2), 153–169 (1981)
7. Dutta, S., Chanda, A., Akhand, A., Hazra, S.: Correlation of phytoplankton
biomass (chlorophyll-a) and nutrients with the catch per unit effort in the PFZ
forecast areas of northern Bay of Bengal during simultaneous validation of winter
fishing season. Turkish Journal of Fisheries and Aquatic Sciences 16, 767–777
(2016)
8. Jiménez, F., Kamińska, J., E.L.S.J.P., Sciavicco, G.: Multi-objective evolutionary
optimization for time series lag regression. In: Proceedings of the 6th International
Conference on Time Series and Forecasting (ITISE). pp. 373–384 (2019)
9. Franklin, J.B., Sathish, T., Vinithkumar, N.V., Kirubagaran, R.: A novel approach
to predict chlorophyll-a in coastal-marine ecosystems using multiple linear regres-
sion and principal component scores. Marine Pollution Bulletin 152, 110902 (2020)
10. Guisande, C., Manjarrés-Hernández, A., Pelayo-Villamil, P., Granado-Lorencio,
C., Riveiro, I., Acuña, A., Prieto-Piraquive, E., Janeiro, E., Matías, J., Patti, C.,
et al.: Ipez: an expert system for the taxonomic identification of fishes based on
machine learning techniques. Fisheries Research 102(3), 240–247 (2010)
11. Jones, P.J.: A governance analysis of the Galápagos Marine Reserve. Marine Policy
41, 65–71 (2013)
12. Kwon, Y.S., Baek, S.H., Lim, Y.K., Pyo, J., Ligaray, M., Park, Y., Cho, K.H.:
Monitoring coastal chlorophyll-a concentrations in coastal areas using machine
learning models. Water 10(8), 1020 (2018)
13. Marzuki, M.I., Gaspar, P., Garello, R., Kerbaol, V., Fablet, R.: Fishing gear identi-
fication from vessel-monitoring-system-based fishing vessel trajectories. IEEE Jour-
nal of Oceanic Engineering 43(3), 689–699 (2017)
14. Monolisha, S., George, G., Platt, T.: Fisheries oceanography-established links in
the eastern Arabian Sea (2017)
15. Ricardo, S.: Chapter 4 - Living Ocean, pp. 2830–2841. Mercator Ocean Interna-
tional (2019)
16. Santos, A.M.P.: Fisheries oceanography using satellite and airborne remote sensing
methods: a review. Fisheries Research 49(1), 1–20 (2000)
17. de Souza, E.N., Boerder, K., Matwin, S., Worm, B.: Improving fishing pattern
detection from satellite AIS using data mining and machine learning. PLoS ONE
11(7) (2016)
18. Thessen, A.: Adoption of machine learning techniques in ecology and earth science.
One Ecosystem 1, e8621 (2016)
19. Zainuddin, M.: Skipjack tuna in relation to sea surface temperature and
chlorophyll-a concentration of Bone Bay using remotely sensed satellite data. Ju-
rnal Ilmu dan Teknologi Kelautan Tropis 3(1) (2011)