Cjes 2015 0015

Canadian Journal of Earth Sciences
Comparing global and local calibration schemes from a differential split-sample test perspective
Journal: Canadian Journal of Earth Sciences
Manuscript ID cjes-2015-0015.R1
Manuscript Type: Article
Date Submitted by the Author: 29-Jun-2015
Complete List of Authors: Gaborit, Étienne; Environment Canada, E-NPR
Ricard, Simon; CEHQ
Lachance-Cloutier, Simon; CEHQ
Anctil, Francois; Université Laval, Civil Engineering
Turcotte, Richard; CEHQ
Keyword: large-scale hydrologic modeling, local and global calibration, differential split-sample test, model robustness, climate-change context
Comparing global and local calibration schemes from a differential split-sample test perspective
É. Gaborit1, S. Ricard2, S. Lachance-Cloutier2, F. Anctil1, and R. Turcotte2.
1. Chaire de recherche EDS en prévisions et actions hydrologiques, Département de génie civil
et de génie des eaux, Université Laval, 1065 Avenue de la Médecine, Québec, Qc, G1V 0A6,
Canada.
2. Centre d’Expertise Hydrique du Québec (CEHQ), 675 boulevard René-Lévesque Est, Québec,
Québec, Canada, G1R 5V7.

corresponding author: Etienne Gaborit, 6380 av. Boniface, Brossard J4Z3L7, 514-421-5305, 418-572-2109, [email protected]
Abstract
This work explores the performance of the hydrologic model Hydrotel applied to 36
catchments located in the Province of Québec, Canada. A local calibration scheme (each catchment
taken individually) and a global calibration scheme (a single parameter set sought for all
catchments) are compared from a differential split-sample test perspective. Such a methodology is
useful to gain insights into a model's skill under different climatic conditions, in view of its use
for climate-change (CC) impact studies. The model was calibrated using both schemes on five
non-continuous dry and cold years and then evaluated on five dissimilar humid and warm years.
Results indicate that, as expected, local calibration leads to better performance than the global
one. However, global calibration achieves satisfactory simulations while producing better
temporal robustness (i.e. model transposability to periods with different climatic conditions).
Global calibration, in contrast to local calibration, also imposes spatial consistency on the
calibrated parameter values, while locally adjusted parameter sets can vary significantly from
one catchment to another due to equifinality. It is hence stated that a global calibration scheme
represents a good trade-off between local performance, temporal robustness, and the spatial
consistency of parameter values, which is of interest, for example, for the simulation of ungauged
catchments, climate-change impact studies, or even simply large-scale modeling.
key words: large-scale hydrologic modeling, local and global calibration, differential split-sample
test, model robustness, climate-change context.
1-Introduction
For climate change (CC) impact studies, hydrologic models have to adequately represent
physical processes under future conditions. In other words, they have to achieve satisfactory
performances for periods with climatic conditions different from those used to train the model.
However, it is common in hydrology to assess model robustness (i.e. the model
capacity to produce satisfactory simulations in validation) using the conventional split-sample
test, which consists in using calibration and test periods of similar (climatic) properties
(Klemeš 1986). When it comes to CC applications, Xu (1999) and Seiller et al. (2012) recommend
using a differential split-sample test (DSST, Klemeš 1986), which aims at evaluating the skill of a
hydrologic model on a period with (climatic) conditions dissimilar from those of the calibration data.
Another important issue (among many others, see Xu et al. 2005) of CC impact studies
pertains to the difficulty of implementing hydrologic models over very large areas. There are
several possibilities to represent the detailed hydrologic behavior of a region. One could, for
example, apply numerous versions of the same model and locally calibrate them at each
gauging site (local calibration). But that leaves open the question of ungauged basins within that
same region, and remains labor-intensive in the case where the large area is covered with a high
number of streamgauges.
Distributed hydrologic models can partly overcome this problem, simulating flows at any
point within the defined hydrographic network; this was explored for example by Pietroniro et
al. (2007) over the Great Lakes. However, limitations associated with the lack of observed
streamflows may remain an issue because the internal hydrologic simulation (i.e., the simulation
at a point located upstream of the outlet) of a distributed hydrologic model is not always
reliable (Andersen et al. 2001). Transferring parameters to ungauged catchments (which could
be a solution to both aforementioned approaches' limitations) also remains hazardous (Xu et al.
2005, Gotzinger and Bárdossy 2007), mostly because of equifinality issues (Beven and Freer
2001, Bárdossy 2007). To overcome such limitations, a macroscale hydrologic model (Xu et al.
2005) may be locally calibrated using as many gauges as possible.
Another possibility would be to use a single physically-based, distributed and calibration-free
hydrologic model over the whole area (see for example Mauser and Bach 2009). At large
scales, however, the construction of such a model is labor-intensive and limited by data
availability and quality (Pietroniro and Soulis 2001, Xu et al. 2005).

There is no easy solution to achieve a detailed representation of the hydrologic
behaviors of a wide region. The Centre d'Expertise Hydrique du Québec opted for a lumped
conceptual regionalization of the parameters of their operational model (Ricard et al., 2013). It
follows the work of Fortier Filion (2011) and consists in identifying a unique parameter set for
the semi-distributed hydrologic model Hydrotel (Fortin et al. 2001 a, b). This way, the hydrologic
simulation still benefits from the many spatial heterogeneities of the southern portion of the
Province of Québec, an area of 388 000 km2. These spatial heterogeneities include, for example,
topography, soil types and land uses. Choosing a unique parameter set assumes that
catchments with similar characteristics will exhibit a similar hydrological behavior (Bárdossy
2007), and preserves spatial consistency (the way parameters vary in space according
to watershed properties) over the whole region. It is referred to as global calibration in Ricard et
al. (2013) because the unique parameter set is obtained through a calibration process
simultaneously taking into account the performance at all gauged stations inside the domain.
The aim of this work is to gain knowledge on the behavior, in terms of Hydrotel
performances and robustness, of the local and global calibration schemes, when submitted to a
DSST. This work is actually trying to answer two main questions: "does the Hydrologic model
Hydrotel present enough robustness in the context of a DSST so that it can be applied with
some confidence in CC impact studies?", and "does global calibration present some interest in
terms of model performance and robustness so that it could reasonably be used to efficiently
implement Hydrotel over very large watersheds?" The database available to the experiment
consists of 36 gauged catchments, ranging from 200 to 15 000 km2 with a mean area of 2 735
km2 and a total area of 98 460 km2, free of any major infrastructure that would modify their
natural hydrologic regime (Figure 1). As previously stated, it is common practice to identify an
optimal model parameter set for each of them (local calibration). This, however, leaves
uncalibrated the rest (75 %) of the 388 000 km² territory targeted for a climate change impact
study (Province of Québec, Canada – Figure 1). To circumvent this problem, it is proposed,
following Ricard et al. (2013), to identify a global (unique) parameter set suitable to all 36
catchments (global calibration), which could be applied afterward to the other rivers of the area
flowing into the St. Lawrence River. Because of the vastness of the territory, a semi-global
calibration is also explored at the end, considering separately the western, eastern and northern
sections of the global watershed (Figure 1) to independently perform a global calibration on
each of these three main catchments.
The manuscript is organized as follows: in section 2, the model, data base and study area
are described. In section 3, the DSST framework used in this study is carefully explained, while
sections 4 and 5 respectively present the results and conclusions emanating from this study.
2-Model and data
The territory in Figure 1 is drained by three large river systems: the Ottawa (west), the
Saguenay (north), and the Saint Lawrence (east) rivers. The first two systems cover 144 000
and 87 000 km2, respectively, and lie on the low-altitude orogenic system of the Canadian
Shield (highest point reaching 1000 m). The portion of the Saint Lawrence basin studied here
(157 000 km2), encompassing the land between the outlets of the Ottawa River and of the
Saguenay River, is constrained to the south by the Appalachian Mountains, and is mainly
composed of sedimentary material (limestone, sandstone). Urban and agricultural
developments are mainly restricted to the Saint Lawrence valley.
Most of the 36 watersheds are exposed to the humid continental climate of the boreal
forest, with variations driven by latitude and topography. The average total annual precipitation
amounts to about 1000 mm with a substantial portion (about 25%) falling in solid form, leading
to a nivo-pluvial hydrologic regime dominated by a spring freshet (Dolores Bejarano et al. 2010).
Yet the territory beyond the 50th parallel is subjected to lower sub-arctic temperatures and
receives lower precipitation totals.
Precipitation and temperature products provided by the Centre d'Expertise Hydrique du
Québec (CEHQ) are available at a daily time step on a regular 0.1˚ grid generated by the kriging
of 971 meteorological time series extending from January 1, 1969 to December 31, 2010. Daily
streamflows are available for the same period, but observations from December 1 to March 1
are not included in the calibration, because of ice effects that greatly diminish their reliability
(CEHQ, personal communication).
Hydrotel (Fortin et al. 2001a, b; Turcotte et al. 2003) is a physically based, semi-distributed
hydrologic model. It simulates snow-related processes, evapotranspiration, surface and
sub-surface runoff, vertical water budget, and streamflow routing. Since it does not take into
account energy balance, Hydrotel's complexity can be considered intermediate. It was
implemented in this study following Ricard et al. (2013), leaving 13 free parameters to calibrate
for the simulation of snowmelt, potential evapotranspiration, vertical water budget, and routing.
3-Differential Split-Sample Test methodology
The general framework of the DSST methodology adopted here follows Seiller et al.
(2012). Annual total precipitation (P, mm) and mean temperature (T, ˚C.) are exploited for the
allocation of each hydrologic year into one of the four following climatic categories: Humid Cold
(HC), Dry Cold (DC), Dry Warm (DW), and Humid Warm (HW). Precipitation and temperature are
the most important inputs of a hydrologic model. Even if a classification based on annual values
remains very crude, since the succession of short-term dry and wet spells is aggregated into a
single metric, one has to keep in mind that the objective of a DSST analysis is to provide insights
into the robustness of the model to contrasted climates. Statistical properties other than (or in
addition to) the annual total or average values could have been used to classify the different
years' climates (such as their extremes, the number of days above or under a given threshold,
etc.), but this was not done here because such statistics are more appropriate for
studies focusing on specific flow aspects such as low or high values (Seiller et al. 2012), whereas
we preferred placing this work under a more general framework.
A given year can then be located on a graph of the annual average temperature versus
the total precipitation amount. One can next select the x most extreme years in each of the four
quadrants defined by Figure 2 below.
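As an illustration only, the allocation of years to the four climatic categories can be sketched in a few lines of Python. The function name, the dictionary inputs and the numerical values are hypothetical; the use of the medians of P and T as quadrant boundaries is an assumption based on the median intersection mentioned in section 3.2.

```python
from statistics import median

def classify_years(annual_p, annual_t):
    """Assign each hydrologic year to a climatic quadrant (HC, DC, DW or HW).

    annual_p: dict mapping year -> total annual precipitation (mm)
    annual_t: dict mapping year -> mean annual temperature (deg C)
    Quadrant boundaries are taken as the medians of P and T (assumption).
    """
    p_med = median(annual_p.values())
    t_med = median(annual_t.values())
    classes = {}
    for year in annual_p:
        humidity = "H" if annual_p[year] > p_med else "D"   # Humid or Dry
        warmth = "W" if annual_t[year] > t_med else "C"     # Warm or Cold
        classes[year] = humidity + warmth
    return classes

# Made-up values, for illustration only
p = {2001: 900.0, 2002: 1100.0, 2003: 950.0, 2004: 1050.0}
t = {2001: 2.0, 2002: 4.5, 2003: 4.0, 2004: 2.5}
print(classify_years(p, t))
```

With these made-up values, each of the four years falls in a different quadrant.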
The DSST consists of two steps. The model is first calibrated on a set of selected
hydrological years taken from the same climatic quadrant. Four calibrations under contrasted
climate are hence available to the project, for each of the 36 watersheds (local) and for all the
sites simultaneously (global). In a second step, the parameter sets are tested on contrasted
climate: the model may be calibrated on HC (DC) and tested on DW (HW), which is here referred
to as the diagonal 1 (diagonal 2) in Figure 2. Each of the two climatic diagonals can be used with
the aforementioned direction (from cold to warm), or in the reverse direction (warm to cold).
Note that non-continuous years are selected for the DSST, but simulation is conducted using a
continuous period covering all selected calibration (or testing) years to avoid interruptions.
Other authors such as Vaze et al. (2010), Merz et al. (2011), or Coron et al. (2012), used sliding
windows of several continuous years or decades to perform the calibrations and tests, instead
of the strategy chosen here. However, using a small number of non-continuous years leads to
higher climatic contrasts for the DSSTs than long continuous periods (Seiller et al. 2012).
3.1 The climatic contrast issue
The aim of a DSST is to evaluate the robustness of a model under contrasted climate, an
issue that is raised by CC applications for which one has to assume that calibrations performed
under the current climate hold for future conditions. Even if it would be appropriate to recreate a
DSST contrast of similar magnitude to the one expected under CC (i.e. try to "match"
the projected contrast during the selection process), it would still not be a direct evaluation of the
model robustness under CC, since it would strictly be based on contrasted portions of an
observed (actual) time series. It is preferable to interpret the DSST as an indicator of the
robustness. This is also why we explore all diagonals in Figure 2 and not just the expected
direction of the climate projections.
Furthermore, large uncertainties persist in climate projections. Hence "matching" these
projections does not make sense unless one also matches their associated uncertainties. It
means that it is not possible to assess the impact of CC without taking the associated
uncertainties into account. In other words, it is not possible to argue that a model will be robust
in the context of a CC study because it is robust under a DSST involving a climatic contrast of the
same magnitude as the average projected CC one. Also, as stated before, P and T are
rough representations of a year's climate. Consequently, matching the anticipated evolutions of
these variables does not guarantee that the CC will be matched, because the term "climate"
encompasses variables, statistics, and interactions far more numerous and complex than the
two climatic variables considered in this study. Finally, CC impacts on land use and
vegetation characteristics are unaccounted for in a DSST (Refsgaard and Knudsen 1996).
3.2 DSST protocol
A global selection (same years' selection for all catchments) was conducted. To do so,
the total P and average T of each hydrologic year were computed for a given catchment using all
meteorological grid points located inside it. Hydrologic years span from the 15th of November
of the previous year to the 14th of November of the current year, in order to include the snow
accumulation and melting processes of the winter season. P and T values were finally averaged
over all 36 catchments.
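The hydrologic-year convention described above can be sketched as follows. This is a minimal illustration: the function names and the (date, precipitation, temperature) input layout are hypothetical and do not reflect the actual CEHQ data format.

```python
from datetime import date

def hydrologic_year(d):
    """Hydrologic year of a date: 15 Nov of year Y-1 to 14 Nov of year Y."""
    return d.year + 1 if (d.month, d.day) >= (11, 15) else d.year

def annual_p_and_t(daily_records):
    """Aggregate daily values into total P and mean T per hydrologic year.

    daily_records: iterable of (date, precipitation_mm, temperature_c)
    Returns a dict mapping hydrologic year -> (total P, mean T).
    """
    acc = {}
    for d, p, t in daily_records:
        year = hydrologic_year(d)
        total_p, sum_t, n = acc.get(year, (0.0, 0.0, 0))
        acc[year] = (total_p + p, sum_t + t, n + 1)
    return {year: (total_p, sum_t / n) for year, (total_p, sum_t, n) in acc.items()}

print(hydrologic_year(date(1999, 11, 15)))  # first day of hydrologic year 2000
print(hydrologic_year(date(2000, 11, 14)))  # last day of hydrologic year 2000
```

In the study, these catchment values are then averaged over the 36 catchments before the years are classified.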
The DSST exploited the 30-year period from 1980 to 2009. Selection of the 5 most
extreme years per climatic quadrant relied on the distance from the median intersection in
Figure 2. From experience, it is acceptable to calibrate the selected model on 5-year time series.
The selected hydrologic years, for each climatic quadrant, are identified in Table 1 and drawn in
Figure 3. Note that a test was made by selecting these years independently for each of the three
main hydrologic regions of Figure 1, but this led to the exact same final set of selected years.
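A possible sketch of this selection step is given below. The text does not specify how the distance to the median intersection combines precipitation (mm) and temperature (˚C); scaling each axis by its standard deviation is therefore an assumption made here for illustration, and the function name and example values are hypothetical.

```python
from math import hypot
from statistics import median, pstdev

def select_extreme_years(annual_p, annual_t, n_years=5):
    """Pick the n_years farthest from the median intersection in each quadrant.

    annual_p / annual_t: dicts mapping year -> total P (mm) / mean T (deg C).
    Distances are computed on axes scaled by their standard deviation
    (assumption; the scaling is not specified in the text).
    """
    p_med, t_med = median(annual_p.values()), median(annual_t.values())
    p_sd, t_sd = pstdev(annual_p.values()), pstdev(annual_t.values())
    quadrants = {"HC": [], "DC": [], "DW": [], "HW": []}
    for year in annual_p:
        dp = (annual_p[year] - p_med) / p_sd
        dt = (annual_t[year] - t_med) / t_sd
        label = ("H" if dp > 0 else "D") + ("W" if dt > 0 else "C")
        quadrants[label].append((hypot(dp, dt), year))
    # Keep the years with the largest distances, returned in chronological order
    return {q: sorted(y for _, y in sorted(v, reverse=True)[:n_years])
            for q, v in quadrants.items()}

# Made-up values, for illustration only
p = {1: 700, 2: 900, 3: 1100, 4: 1300, 5: 700, 6: 900, 7: 1100, 8: 1300}
t = {1: 0, 2: 2, 3: 2, 4: 0, 5: 6, 6: 4, 7: 4, 8: 6}
print(select_extreme_years(p, t, n_years=1))
```

Because the selected years are not contiguous, the simulation itself still has to run over a continuous period covering them, as noted in section 3.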
The DSST climatic contrast of the study is evaluated by the following two indicators:
$C_P = \frac{P_1 - P_2}{P_2} \times 100$ (1)
$C_T = T_{moy,1} - T_{moy,2}$ (2)
where $P_1$ and $P_2$ are the mean P values, and $T_{moy,1}$ and $T_{moy,2}$ are the mean T values, of
opposite quadrants. Subscripts 1 and 2 respectively refer to the destination and original
quadrants of a DSST, where by "original" we designate the quadrant used in calibration.
$C_P$ stands for "Contrast in precipitation" and $C_T$ for "Contrast in temperature". Results are given in
Table 2 for Diagonal 2 in its normal direction (from Dry Cold to Humid Warm, see Figure 6, and
as expected by CC projections, see below), for which the percentiles represent the variability
among the 36 catchments.
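The two contrast indicators can be illustrated with a short function. It assumes that the precipitation contrast is expressed in percent relative to the original (calibration) quadrant and that the temperature contrast is a simple difference; the function name and the numbers are hypothetical.

```python
def climatic_contrast(p_orig, t_orig, p_dest, t_dest):
    """Contrast between the calibration (original) and test (destination) quadrants.

    p_*: mean annual precipitation of the quadrant (mm)
    t_*: mean annual temperature of the quadrant (deg C)
    Returns (precipitation contrast in %, temperature contrast in deg C).
    The relative form of the precipitation contrast is an assumption.
    """
    contrast_p = (p_dest - p_orig) / p_orig * 100.0
    contrast_t = t_dest - t_orig
    return contrast_p, contrast_t

# Made-up values for a Dry Cold -> Humid Warm test
cp, ct = climatic_contrast(p_orig=900.0, t_orig=2.0, p_dest=1080.0, t_dest=3.5)
print(round(cp, 1), round(ct, 1))
```

Applied to each catchment separately, such a function would yield the distribution of contrasts whose percentiles are reported in Table 2.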
For comparison, contrasts reported on the same geographical region by Logan et al.
(2011) for 2050 range between +2 and +4˚C. (10th and 90th percentiles) for temperature and
between 0 and +25% (10th and 90th percentiles) for precipitation. Hence, the contrasts
constructed by the DSST built here are lower than the anticipated ones.
3.3 Calibration details
Thirteen free parameters of the hydrologic model Hydrotel were calibrated with the SCE-UA
algorithm (Shuffled Complex Evolution algorithm from the University of Arizona, see Duan et al.,
1994). This algorithm has been extensively used for the calibration of complex
hydrologic models. It is able to handle a large number of free parameters and avoids being
trapped in local optima (bumps or pits) of the objective function surface, among other abilities.
Table 3 below presents the description and ranges of these parameters.
The local objective function is the mean squared error
$MSE = \frac{1}{n} \sum_{i=1}^{n} (Q_{o,i} - Q_{s,i})^2$ (3)
between the simulated ($Q_s$) and observed ($Q_o$) flows, where n is the number of observations, while
the global objective function $OF_{global}$ applied to the 36 catchments is
$OF_{global} = \sum_{i=1}^{36} \left( 1 - \frac{NS_i^{G}}{NS_i^{L}} \right)$ (4)
where $NS_i^{G}$ is the Nash-Sutcliffe coefficient (Nash and Sutcliffe 1970) of catchment i and
$NS_i^{L}$ is the same coefficient but from the local calibration procedure:
$NS = 1 - \frac{MSE}{\frac{1}{n} \sum_{i=1}^{n} (Q_{o,i} - \overline{Q_o})^2}$ (5)
During the global calibration procedure, the aim is thus to simultaneously match, for the 36
catchments, the performances obtained from the local calibration procedure.
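The three quantities above can be written compactly in Python. This is a sketch under stated assumptions, not the Hydrotel or SCE-UA implementation: the function names are hypothetical, ice-affected observations are assumed to have been removed beforehand, and the locally calibrated NS values are assumed to be strictly positive.

```python
def mse(obs, sim):
    """Mean squared error between observed and simulated flows (local objective)."""
    return sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs)

def nash_sutcliffe(obs, sim):
    """NS = 1 - MSE / variance of the observations."""
    mean_obs = sum(obs) / len(obs)
    variance = sum((o - mean_obs) ** 2 for o in obs) / len(obs)
    return 1.0 - mse(obs, sim) / variance

def global_objective(ns_global, ns_local):
    """Sum over catchments of (1 - NS_global / NS_local), to be minimized.

    ns_global: NS of each catchment with the single (global) parameter set
    ns_local:  NS of the same catchments from their own local calibration
    """
    return sum(1.0 - g / l for g, l in zip(ns_global, ns_local))

obs = [1.0, 2.0, 3.0, 4.0]
print(nash_sutcliffe(obs, obs))            # perfect simulation
print(nash_sutcliffe(obs, [2.5] * 4))      # simulation equal to the observed mean
print(global_objective([0.8, 0.6], [0.8, 0.6]))
```

The global objective reaches zero when the global parameter set reproduces the locally calibrated performance of every catchment.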
3.4 Evaluation
The whole concept behind the NS is that a reference model is used to standardize the
MSE to facilitate comparison between watersheds of different sizes and climates. It is
however well known that the NS standardization is not perfect in that respect (Krause 2005). It
is thus inadvisable to calculate the difference between two NS values, unless they originate
from the same time series and watershed, for example when comparing different versions of a model.
In a DSST, one encourages contrast between the calibration and test series, precluding a direct
comparison of the NS values computed from these dissimilar climates. A way to circumvent this
problem is to compare the calibration performance (say on DC) to the test performance (also on
DC) exploiting the parameter set obtained for the opposite series (HW), since both NS would
then be computed on the same time series (DC). This alternative evaluation was used in Figure 4.
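The comparison described above can be sketched as follows for one catchment. The dictionary layout, keyed by (calibration quadrant, evaluation quadrant), and the NS values are hypothetical.

```python
# Diagonally opposite quadrants: HC <-> DW (diagonal 1), DC <-> HW (diagonal 2)
OPPOSITE = {"HC": "DW", "DW": "HC", "DC": "HW", "HW": "DC"}

def robustness_gap(ns, period):
    """NS in test minus NS in calibration, both evaluated on the same period.

    ns: dict mapping (calibration quadrant, evaluation quadrant) -> NS value
    period: quadrant on which both scores are evaluated
    A negative value means the parameters transferred from the opposite
    quadrant perform worse than those calibrated on the period itself.
    """
    ns_calibration = ns[(period, period)]
    ns_test = ns[(OPPOSITE[period], period)]
    return ns_test - ns_calibration

# Made-up scores for one catchment
scores = {("DC", "DC"): 0.80, ("HW", "DC"): 0.70}
print(round(robustness_gap(scores, "DC"), 2))
```

A gap closer to zero indicates a more robust calibration scheme in the sense used in this study.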
Such analysis is complemented with the relative errors
$RE = \frac{I_s - I_o}{I_o} \times 100$ (6)
where $I_s$ and $I_o$ are indicator values taken from the simulated and observed flow time series,
respectively, applied to the maximum daily flow, the minimum seven-day flow (i.e., the flow
averaged over a 7-day sliding window), and the mean flow over the period considered. The
values of the maximum and minimum seven-day flows were obtained by taking the mean of the
5 annual values of each 5-year calibration or validation period, as a proxy for the value
corresponding to a two-year return period. Although the Nash-Sutcliffe criterion is generally the
main score used to calibrate and assess the quality of a hydrologic model (Moriasi et al. 2007),
it is strongly recommended by the hydrologic community (see previous reference) to use other
metrics that focus on various aspects of the simulated streamflows, in order to have a broad
and detailed view of a simulation's performance.
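The relative errors on the three indicators can be sketched as below. The function names and the per-year dictionary layout are hypothetical, and the sketch assumes complete daily series of at least seven values per year with ice-affected days already handled.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)

def min_seven_day_flow(daily_flows):
    """Lowest flow averaged over a 7-day sliding window."""
    return min(mean(daily_flows[i:i + 7]) for i in range(len(daily_flows) - 6))

def relative_error(sim_value, obs_value):
    """Relative error (%) of a simulated indicator with respect to the observed one."""
    return (sim_value - obs_value) / obs_value * 100.0

def indicator_errors(obs_by_year, sim_by_year):
    """Relative errors on maximum daily, minimum 7-day and mean flows.

    obs_by_year / sim_by_year: dicts mapping year -> list of daily flows.
    Maximum and minimum indicators are the mean of the annual values.
    """
    years = sorted(obs_by_year)
    errors = {}
    for name, indicator in (("max_daily", max), ("min_7day", min_seven_day_flow)):
        obs_value = mean(indicator(obs_by_year[y]) for y in years)
        sim_value = mean(indicator(sim_by_year[y]) for y in years)
        errors[name] = relative_error(sim_value, obs_value)
    # Mean flow over the whole period considered
    obs_all = [q for y in years for q in obs_by_year[y]]
    sim_all = [q for y in years for q in sim_by_year[y]]
    errors["mean"] = relative_error(mean(sim_all), mean(obs_all))
    return errors

# Made-up flows: the simulation is half of the observation everywhere
obs = {1: [2.0] * 10, 2: [4.0] * 10}
sim = {1: [1.0] * 10, 2: [2.0] * 10}
print(indicator_errors(obs, sim))
```

In this made-up case, every indicator is underestimated by half, so all three relative errors are negative and equal.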

Finally, the Nash-Sutcliffe efficiency coefficient and the relative error for the mean flow
were calculated for each month of the calibration and validation (or test) periods, in order to
investigate a scheme's performances as a function of the period of the year considered.
4-Results
The local calibration strategy logically leads to better performance than the global one,
both for calibration and test. This can be seen when looking at any of the criteria used in this
study (Figures 4, 5 and 6), in line with the findings of Ricard et al. (2013) under a conventional
split-sample test (SST). There is thus a performance cost to the advantages of a global calibration
strategy, which preserves some spatial consistency in the parameter values and allows simulating
ungauged catchments. For an example of the spatial consistency of global calibration compared to
local calibration ranges, see Figure 10. The term "spatial consistency" is used here in opposition
to the results of local calibration, which tend to produce parameter values with a very high degree
of variability within the same region.
To investigate the temporal transferability of the parameters for both calibration schemes, NS
values obtained over a test period were compared to those obtained over the same period when it
was used to calibrate the model, i.e. when performing the DSST along the same
climatic diagonal but in the reverse direction (see Figure 6). In other words, NS values obtained
in calibration over a climatic quadrant of Figure 6 are taken as reference, and NS values
obtained for the same quadrant in test (or validation) mode, with parameters obtained
from the diagonally opposite quadrant, are compared to them. The comparison hence
examines NS in test minus NS in calibration (see section 3.4), both computed over the same period.
A negative result reflects the logical superiority of the calibration over the test. This procedure is
repeated for each calibration scheme. Ultimately, the global calibration presented in Figure 4
is more robust than the local one (i.e., it shows a smaller performance difference between
calibration and test periods), probably because the latter focuses on finding specialized parameters.
Over-specialized calibration parameter sets may compensate for errors in the model structure and/or
observed data (Merz et al. 2011). However, the magnitude or influence of these errors may not be the
same over the test period, and a parameter set which was (too) well suited to the calibration period
is often less appropriate for testing. The global calibration is, by nature, less subject to
over-specialization since it trades off parameters over numerous catchments. The observed data
errors are therefore less compensated for, probably because they differ from one catchment to
the other. This confers an important advantage on global calibration.
Other tests were performed to investigate the potential benefits of conducting global
calibration schemes on smaller areas than the one of 388 000 km2 used in this study. More
precisely, a global calibration was independently performed for each of the three main
catchments of the studied area (see Figure 1). This strategy is here referred to as a semi-global
calibration. This investigation is based on the assumption that using a unique parameter set for
such a wide geographical zone may not be optimal. This strategy was only tested in the
conventional direction of the CC, i.e. diagonal 2 in Figure 2. A priori, it seems that a semi-global
calibration is beneficial to the model robustness, while limiting its loss in performance when
compared to local calibration. In Figure 7, NS values in calibration and validation are not directly
comparable because they were obtained over periods with different climatic properties (see
above).
No climatic period (among the four climatic quadrants of this study, see Figure 2)
simultaneously led to better performance in calibration and temporal robustness for all the
criteria considered here, when compared to the other climatic quadrants.
In agreement with Ricard et al. (2013), it was noticed that high flows are generally
underestimated (Figure 5) and low flows overestimated (Figure 6). This difficulty encountered
by Hydrotel to simulate extreme flows (even with the local calibrations) may be due to
inadequate model structure, the low resolution of the implemented models, but also partly to
the difficulty of measuring the high spatial rainfall variability which can occur during summer
periods (the rain gauge density inside a catchment is sometimes low).
Investigating the relative errors of the mean flows on a monthly basis, one can see that mean
flows are generally overestimated during summer (June - October) and underestimated during winter
(November, March - May) (Figure 8). This situation is consistent whatever the climatic diagonal,
direction, or period.
Performance obtained over winter is generally better than for summer (not shown
here). This may be due to the fact that winter processes are easier to simulate since they are
mostly governed by temperature during snowmelt and by the soil water content depletion at
the beginning of the winter season. Summer periods are characterized by more complex events,
involving a higher rainfall spatial variability caused by convective events, which are less common
during winter. However, high winter flows are strongly underestimated, as are high flows during
summer periods.
Finally, a discussion follows concerning a tendency relative to the simulated flows'
behavior when moving from calibration to tests. This tendency is linked to the climatic contrast
of the DSST. Moreover, it seems to be predominantly governed by temperature. Indeed, the
mean simulated flows tend to increase when moving from warm to cold periods (i.e. when
calibrating on warm and testing on cold periods), and vice-versa. This can be seen for example
on Figure 9, but also on Figure 8. Of course, when moving from a cold to a warm period,
evaporation is expected to increase, leading to a decrease in runoff. But what is observed in this
work, and displayed on Figures 8 and 9, is that it is the relative error between observed and
simulated flows which presents a tendency associated to the climatic change involved with the
DSST. To give an example by putting Figure 9 into words, if observed flows are generally
underestimated by the model in calibration over a cold (humid) period, they will be even more so
during validation over a warm (dry) period, as illustrated by local calibration in Figure 9. Hence
this is independent from the expected change related to the climatic properties of the periods.
Such a tendency was also observed by Vaze et al. (2010) and Seiller et al. (2012), but it is
assumed here that it can sometimes be more governed by precipitation than by temperature,
maybe depending on each climatic variable's involved contrast. Other studies identified an
opposite behavior, namely that the simulated flows can conserve the dynamic with which they
were trained (see for example Klemeš 1986, Xu 1999, Merz et al. 2011, Coron et al. 2012). In
other words, in such cases the observed flows can be underestimated in validation when
training on dry (and/or warm) periods (and hence testing on humid and/or cold periods), and
overestimated in validation when calibrating in runoff-favored situations like humid and/or cold
periods.
It is hard to explain the reasons for such phenomena because it requires linking the
calibrated parameter values (and their interactions) to model structure and behavior under
different climatic conditions (Vaze et al. 2010). No explanation could be found in the literature.
Moreover, when looking at the aforementioned studies, no strong correlation could be
established between the flows' tendency and either the catchment size, location, climate,
model type (semi-distributed or global lumped conceptual), or objective function involved.
In situations where the model overestimates flows during test, when trained on a humid
and/or cold period (and vice-versa), it seems that it tries to generate more runoff in order to
simulate (i.e. mimic) the high flows of the calibration time series. In other cases like ours, for
which the mean simulated flows decrease (possibly leading to underestimation) when moving
from humid (and/or cold) to dry (and/or warm) periods, it seems that the model tries to take
away water or at least to decrease runoff (i.e., to "fight" the climatic conditions) during
calibration.
This behavior may be due to the global water balance of the model: if it does not
produce enough runoff compared to what should be produced with its given inputs, it will try to
increase it (by modifying the way precipitation is partitioned between evaporation and runoff,
for example), which will result in overestimating low flows when exploited over dry periods.
Otherwise (like for our findings), if it produces too much runoff when compared to what should
be expected, it will compensate the water balance excess. This happens here with global
calibration, for which the thickness of the third soil layer and the PET compensation factor tend
to increase (which both decrease runoff) when the calibration period is humid and/or cold, and
vice-versa (not shown here), as was envisioned by Vaze et al. (2010). This behavior
is depicted in Figure 10, which illustrates the thickness of the third soil layer. The
over-specialization of the local parameter sets hides this behavior and, combined with parameter
interdependency, leads to equifinality and hence to a high variability of the parameter values.
No hypothesis was found as to why third layer thickness values (Figure 10) are generally higher
with global than with local calibration, as no in-depth analysis was performed in this study about
the calibrated parameter values as a function of the climatic period used for calibration.
Another argument in favor of the explanation given here is the more
pronounced tendency along climatic diagonal 1, which involves a higher climatic contrast
than the second one. Indeed, climatic diagonal 1 involves combinations of P and T with both
variables acting either in favor of or against the generation of runoff.
To confirm the hypotheses formulated here regarding the behavior of a model's general
water balance when submitted to a DSST, the following test is proposed. One should calibrate
and evaluate the model over a relatively long period (say, 10 years in calibration and 10 in
validation) using a standard SST procedure and local calibration, and assess whether the model
tends to under- or overestimate the overall water balance (using a bias criterion). If
such a tendency is highlighted for the model using the conventional SST, then the DSST could be
performed to see whether the general tendency of the model can be linked to one of the two types of
behavior discussed above. Of course, several models and sub-catchments should
be used to strengthen the resulting conclusions. This was not done here, as it would imply re-
calibrating the model using a conventional SST, whereas this paper focuses on DSST.
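The SST-based check proposed above can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the function names, the 5 % tolerance, and the toy flow values are assumptions introduced here, and the relative volume bias is only one possible bias criterion.

```python
def relative_bias(simulated, observed):
    """Relative volume bias; positive values mean the model overestimates total flow."""
    total_obs = sum(observed)
    return (sum(simulated) - total_obs) / total_obs


def water_balance_tendency(bias_calibration, bias_validation, tolerance=0.05):
    """Classify the overall tendency from the biases of both SST periods.

    The tolerance (5 % here) is a hypothetical threshold, not a value from the paper.
    """
    if bias_calibration > tolerance and bias_validation > tolerance:
        return "overestimation"
    if bias_calibration < -tolerance and bias_validation < -tolerance:
        return "underestimation"
    return "no clear tendency"


# Toy daily flows (m3/s), invented for illustration only.
obs_cal, sim_cal = [10.0, 20.0, 30.0], [12.0, 22.0, 32.0]
obs_val, sim_val = [15.0, 25.0, 35.0], [17.0, 28.0, 39.0]

tendency = water_balance_tendency(
    relative_bias(sim_cal, obs_cal), relative_bias(sim_val, obs_val)
)
print(tendency)
```

A tendency is reported only when both the calibration and the validation periods show a bias of the same sign, which is one simple way of separating a systematic water balance error from period-specific noise.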
5-Conclusion
As was logically expected, local calibration leads, for a given gauged site, to better
performance than global calibration, for all studied criteria and whatever the
climatic diagonal or direction considered. However, local calibration is not devoid of drawbacks:
parameter sets can be over-specialized and lack spatial (Bárdossy 2007) or temporal
transferability. Global calibration, in contrast, preserves the spatial consistency of the
parameter set and avoids over-specialization, which can result in a better temporal
robustness than local calibration. The semi-global strategy, which consists in performing the
global calibration over a limited area, may even perform better than global calibration,
depending on the degree of spatial heterogeneity of the total area considered.
A global calibration of the semi-distributed hydrologic model
Hydrotel over the large area of 388 000 km² studied here (the southern part of the Province of Québec) seems a
promising direction for issuing hydrologic projections. Indeed, the lower performance
associated with using global instead of local calibration is tempered by the fact that hydrologic
projections (i.e., projections of climate change impacts on hydrologic regimes) mainly focus on
evolution tendencies rather than on the absolute projected flow values themselves. In other
words, even if global calibration simulates flow values with a larger error than local
calibration, it can predict correct flow tendencies, which, in conjunction with its other advantages, seems
to make global calibration an appropriate tool for climate change impact studies over very large
areas.
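The argument that tendencies can remain correct despite larger absolute errors can be illustrated with a deliberately idealized case. The sketch below assumes that the simulation error is a constant multiplicative factor (bias_factor, a hypothetical value), which real simulations need not satisfy; the flow values are invented for illustration.

```python
def relative_change(reference_flow, future_flow):
    """Relative change of a flow statistic between two periods."""
    return (future_flow - reference_flow) / reference_flow


# Hypothetical "true" mean flows (m3/s) for a reference and a future period.
true_reference, true_future = 100.0, 80.0

# Assumed constant multiplicative error of the model (an idealization).
bias_factor = 1.2
sim_reference = bias_factor * true_reference
sim_future = bias_factor * true_future

absolute_error_future = sim_future - true_future
true_tendency = relative_change(true_reference, true_future)
simulated_tendency = relative_change(sim_reference, sim_future)

print(absolute_error_future, true_tendency, simulated_tendency)
```

Under this assumption the simulated future flow is off by about 16 m3/s, yet the simulated and true relative changes coincide (both about -20 %), because the constant factor cancels in the ratio. Errors that vary with climate would not cancel in this way, which is the kind of behavior a DSST is meant to probe.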
The Hydrotel model may, however, need improvements, especially regarding the
simulation of extreme flow values. Using an objective function other than simply the mean
squared error during calibration, or increasing the model's spatial resolution and the observation
network density, could also help reduce these drawbacks.
6-Acknowledgments
We hereby acknowledge the Centre d'Expertise Hydrique du Québec (CEHQ) for the
data, hydrologic model, Matlab functions, and financial support.
7-References
Andersen, J., Refsgaard, J., and Jensen, K. 2001. Distributed hydrological modelling of the
Senegal River Basin--model construction and validation. J. Hydrol., 247: 200-214.
Bárdossy, A. 2007. Calibration of hydrological model parameters for ungauged catchments.
Hydrol. Earth Syst. Sci., 11: 703-710.
Bejarano, M.D., Marchamalo, M., de Jalon, D.G., and del Tanago, M.G. 2010. Flow regime patterns
and their controlling factors in the Ebro basin (Spain). J. Hydrol., 385: 323-335.
Beven, K., and Freer, J. 2001. Equifinality, data assimilation, and uncertainty estimation in
mechanistic modelling of complex environmental systems using the GLUE methodology.
J. Hydrol., 249: 11-29.
Centre d’expertise hydrique du Québec (CEHQ). 2012a. Mise en place d’une plateforme de
modélisation hydrologique à l’échelle du Québec méridional. Québec, Québec. 25 pp. and
appendices.
Centre d’expertise hydrique du Québec (CEHQ). 2012b. Production d’un atlas préliminaire des
changements anticipés du régime hydrique du Québec méridional à l’horizon 2050. Québec,
Québec. 55 pp. and appendices.
Coron, L., Andréassian, V., Perrin, C., Lerat, J., Vaze, J., Bourqui, M., and Hendrickx, F. 2012.
Crash testing hydrological models in contrasted climate conditions: an experiment on 216
Australian catchments. Water Resour. Res., 48: W05552. doi:10.1029/2011WR011721.
Duan, Q., Sorooshian, S., and Gupta, V. 1994. Optimal use of the SCE-UA global optimization
method for calibrating watershed models. J. Hydrol., 158: 265-284.
Fortier Filion, T.C. 2011. Développement d’une procédure de mise en place d’un modèle
hydrologique global sur des bassins jaugés et non jaugés : application du modèle MOHYSE au
Québec. MSc. thesis, INRS-ETE, Québec, Canada.
Fortin, J., Turcotte, R., Massicotte, S., Moussa, R., Fitzback, J., and Villeneuve, J. 2001a.
Distributed watershed model compatible with remote sensing and GIS data. I: Description of
model. J. Hydrol. Eng., 6(2): 91-99.
Fortin, J., Turcotte, R., Massicotte, S., Moussa, R., Fitzback, J., and Villeneuve, J. 2001b.
Distributed watershed model compatible with remote sensing and GIS data. II: Application to
Chaudière watershed. J. Hydrol. Eng., 6(2): 100-108.
Götzinger, J., and Bárdossy, A. 2007. Comparison of four regionalisation methods for a
distributed hydrological model. J. Hydrol., 333: 374-384.
Klemeš, V. 1986. Operational testing of hydrological simulation models. Hydrolog. Sci. J., 31(1):
13-24.
Krause, P., Boyle, D.P., and Bäse, F. 2005. Comparison of different efficiency criteria for
hydrological model assessment. Advances in Geosciences, 5: 89-97.
Logan, T., Charron, I., Chaumont, D., and Houle, D. 2011. Atlas de scénarios climatiques pour la
forêt québécoise. Ouranos et MRNF. 55 pp. and appendices.
Mauser, W., and Bach, H. 2009. PROMET-Large scale distributed hydrological modelling to study
the impact of climate change on the water flows of mountain watersheds. J. Hydrol., 376: 362-
377.
Merz, R., Parajka, J., and Blöschl, G. 2011. Time stability of catchment model parameters:
Implications for climate impact analyses. Water Resour. Res., 47(1–17).
Nash, J.E., and Sutcliffe, J.V. 1970. River flow forecasting through conceptual models part I — A
discussion of principles. J. Hydrol., 10(3): 282-290.
Pietroniro, A., and Soulis, E. 2001. Comparison of global land-cover databases in the Mackenzie
basin, Canada. Remote Sensing and Hydrology 2000 (Proceedings of a symposium held at Santa
Fe, New Mexico, USA, April 2000). IAHS Publ. no. 267: 552-557.
Pietroniro, A., Fortin, V., Kouwen, N., Neal, C., Turcotte, R., Davison, B., Verseghy, D., Soulis, E.,
Caldwell, R., Evora, N., and others. 2007. Development of the MESH modelling system for
hydrological ensemble forecasting of the Laurentian Great Lakes at the regional scale.
Hydrol. Earth Syst. Sci., 11: 1279-129.
Plummer, D.A., Caya, D., Frigon, A., Côté, H., Giguère, M., Paquin, D., Biner, S., Harvey, R., and
De Elia, R. 2005. Climate and climate change over North America as simulated by the Canadian
RCM. J. Climate, 19: 3112-3132.
Ricard, S., Bourdillon, R., Roussel, D., and Turcotte, R. 2013. Global calibration of distributed
hydrological models for large scale applications. J. Hydrol. Eng., 18: 719-721.
Seiller, G., Anctil, F., and Perrin, C. 2012. Multimodel evaluation of twenty lumped hydrological
models under contrasted climate conditions. Hydrol. Earth Syst. Sci., 16: 1171–1189.
Turcotte, R., Rousseau, A.N., Fortin, J.-P., and Villeneuve, J.-P. 2003. Development of a process-
oriented, multiple-objective, hydrological calibration strategy accounting for model structure.
Calibration of Watershed Models, Duan, Q., S. Sorooshian, H. Gupta, A.N. Rousseau, and R.
Turcotte, eds., Water Science and Application 6, American Geophysical Union, Washington, DC.
153-163.
Vaze, J., Post, D. A., Chiew, F. H. S., Perraud, J.-M., Viney, N. R., and Teng, J. 2010. Climate non-
stationarity – Validity of calibrated rainfall-runoff models for use in climate change studies. J.
Hydrol., 394: 447–457.
Xu, C.-Y. 1999. Operational testing of a water balance model for predicting climate change
impacts. Agr. Forest Meteorol., 98–99: 295–304.
Xu, C.-Y., Widén, E., and Halldin, S. 2005. Modelling hydrological consequences of climate
change – progress and challenges. Adv. Atmos. Sci., 22: 789–797.
List of Tables:
Table 1: Selected hydrological years for each climatic quadrant
Table 2: DSST climatic contrast along diagonal 2 (Figure 6) for the 36 catchments
Table 3: Description and ranges of the free parameters used to calibrate the model; PET:
Potential Evapo-Transpiration. See CEHQ (2012a) for the work originally proposing these ranges.
Table 1: Selected hydrological years for each climatic quadrant
DW    HC    HW    DC
1998  1990  2006  1992
1999  1996  2008  1982
2002  1986  1983  1989
2007  1993  2005  1985
1987  1994  2000  2003
Table 2: DSST climatic contrast along diagonal 2 (Figure 6) for the 36 catchments
Percentiles (%)  T (˚C)  P (%)
25               1.40    10
50               1.55    14
75               1.70    20
Table 3: Description and ranges of the free parameters used to calibrate the model; PET: Potential
Evapo-Transpiration. See CEHQ (2012a) for the work originally proposing these ranges.
Parameter description                                                  Unit  Initial  Lower bound  Upper bound
PET multiplicative coefficient                                         []    0.7      0.7          1.5
Ratio between evergreen and deciduous trees snowmelt factors           []    1        0.5          1
Deciduous trees snowmelt factor                                        []    6.4      1            20
Ratio between bare ground and deciduous trees snowmelt factors         []    1        1            2
Difference between evergreen and deciduous trees snowmelt thresholds   ˚C    2.2      0            5
Deciduous tree snowmelt threshold                                      ˚C    1.6      -3           3
Difference between bare ground and deciduous trees snowmelt threshold  ˚C    -2.2     -5           0
Depth of first soil layer                                              m     0.1327   0.01         0.2
Second soil layer thickness                                            m     0.4984   0.1          0.5
Third soil layer thickness                                             m     0.5      0.5          2
Recession coefficient logarithm                                        []    -10      -10          -1
Rain-snow temperature boundary                                         ˚C    0        -5           5
Hydro-geomorphologic hydrograph reference depth                        m     0.006    0.0015       0.025
List of Figures:
Figure 1: The local (36 natural catchments), semi-global, and global scales of the experiment. A
natural catchment is unregulated.
Figure 2: Definition of the four climatic quadrants, the two climatic diagonals, and the default
directions used in the DSSTs. The dashed lines correspond to the median precipitation and
temperature values for the period considered.
Figure 3: Selected hydrological years for each contrasted climate. Black squares represent the
median of each climatic quadrant.
Figure 4: Nash values in test over a climatic period (i.e., simulations performed with parameters
obtained from the diagonally opposite climatic quadrant), minus the Nash values obtained in
calibration over the same period (see section 3.4). See Figure 2 for the explanation of the four
climatic periods. Box-plots present the 5, 25, median, 75 and 90 percentiles over the 36
catchments. The blue line depicts the limit above which performances are generally judged as
"satisfactory" in hydrology. (a) and (b) panels: global calibration along diagonals 1 and 2,
respectively; (c) and (d) panels: local calibration along diagonals 1 and 2, respectively.
Figure 5: Relative errors of the maximum daily flows in calibration, for the four climatic periods.
The ideal "no-error" value of 0 is displayed by a dashed grey line. Box-plots present the 5, 25,
median, 75 and 90 percentiles over the 36 catchments.
Figure 6: Relative errors of the minimum seven-day flows in calibration, for the four climatic
periods. The ideal "no-error" value of 0 is displayed by a dashed grey line. Box-plots present the
5, 25, median, 75 and 90 percentiles over the 36 catchments.
Figure 7: Nash-Sutcliffe values (ideal value is 1) along climatic diagonal 2. Box-plots present the
5, 25, median, 75 and 90 percentiles over the 36 catchments. The blue line depicts the limit
above which performances are generally judged as "satisfactory" in hydrology.
Figure 8: Mean relative errors (over the 36 catchments used) of the mean monthly flows, using
climatic diagonal 1.
Figure 9: Interannual hydrographs (i.e., simulated flows averaged over a set of 5 years)
computed using climatic diagonal 1 (calibration over Humid Cold and validation with Dry Warm
climate) and station 041902 (Dumoine river, outlet of lake Robinson), located at the outlet of a
3760 km² catchment in the Ottawa River basin (second westernmost catchment of Figure 1).
Figure 10: Calibrated third soil layer thickness; box-plots depict the variability over the 36
catchments in local calibration and the black dots represent the global calibration values.