Evaluating Hydrological Model Performance Using Information Theory-Based Metrics


Yakov A. Pachepsky (1), Gonzalo Martinez (2,*), Feng Pan (3,4), Thorsten Wagener (5), Thomas Nicholson (6)

(1) USDA-ARS Environmental Microbial and Food Safety Laboratory, Beltsville, MD 20705, USA
(2) Department of Agronomy, University of Cordoba, 14071, Cordoba, Spain
(3) Department of Civil & Environmental Engineering, the University of Utah, Salt Lake City, UT 84112, USA
(4) Energy & Geoscience Institute, the University of Utah, Salt Lake City, UT 84108, USA
(5) Department of Civil Engineering, University of Bristol, Bristol, UK
(6) Office of Regulatory Research, US Nuclear Regulatory Commission, Rockville, MD 20852, USA
(*) Correspondence to: G. Martinez ([email protected])

Abstract. Accuracy-based model performance metrics do not necessarily reflect the qualitative correspondence between simulated and measured streamflow time series. The objective of this work was to determine whether information theory-based metrics can serve as a complementary tool for hydrologic model evaluation and selection. We simulated 10-year streamflow time series in five watersheds located in Texas, North Carolina, Mississippi, and West Virginia. Eight models of different complexity were applied. The information theory-based metrics were obtained after representing the time series as strings of symbols, where different symbols corresponded to different quantiles of the probability distribution of streamflow. A binary symbol alphabet was used. Three metrics were computed for those strings: mean information gain, which measures the randomness of the signal; effective measure complexity, which characterizes predictability; and fluctuation complexity, which characterizes the presence of a pattern in the signal. The observed streamflow time series had smaller information content and larger complexity than the precipitation time series. Watersheds thus acted as information filters in the hydrologic conversion of precipitation to streamflow: streamflow time series were less random and more complex than precipitation time series. The Nash-Sutcliffe efficiency increased as the complexity of the models increased, but in many cases the efficiency values of several models were not significantly different from each other. In such cases, ranking models by the closeness of the information theory-based metrics of simulated and measured streamflow time series can provide an additional criterion for evaluating hydrologic model performance.

1 Introduction

Hydrologic modeling plays a critical role in predicting hydrologic response for applications such as water resources management, flood control, and water quality evaluation (Singh and Woolhiser, 2002; Pechlivanidis et al., 2011; Wagener et al., 2010). Over the last few decades, lumped and physics-based distributed hydrologic models have been developed and widely applied to simulate hydrologic processes and to understand watershed behavior. Lumped models include, for example, the Stanford Watershed Model (SWM) (Crawford and Linsley, 1966), the Tank Model (Sugawara et al., 1976), and the Xinanjiang Model (Zhao et al., 1980). With the rapid growth of computational power, applications of distributed models have become feasible. This family of models includes the Systeme Hydrologique Europeen (SHE) (Abbott et al., 1986a, b), the Physically Based Runoff Production Model (TOPMODEL) (Beven and Kirkby, 1979), the Soil and Water Assessment Tool (SWAT) (Arnold et al., 1998), the Hydrologic Model System (Yu et al., 1999), and the Variable Infiltration Capacity (VIC) model (Liang et al., 1994). Evaluation of model performance is indispensable for examining both the accuracy and the reliability of models.

Common model evaluation metrics in hydrology include the Nash-Sutcliffe efficiency NSE (Nash and Sutcliffe, 1970; Krause et al., 2005; Bai et al., 2009), the root-mean-squared error, the coefficient of determination, the Akaike information criterion AIC (Akaike, 1973), the Bayesian information criterion BIC (Schwarz, 1978), and the Kashyap information criterion KIC (Kashyap, 1982). Recently, new approaches have been proposed to evaluate the performance of hydrologic models, such as maximum likelihood Bayesian model averaging MLBMA (Ye et al., 2004), a wavelet-based multiscale performance metric (Rathinasamy et al., 2014), a data-reduction method based on self-organizing maps (Reusser et al., 2009), an interval-deviation approach (Chen et al., 2014), and a top-down methodology (Bai et al., 2009), among others. Although these metrics and approaches can evaluate the correspondence between simulation results and observed data, they cannot capture all the features reproduced by hydrologic models, such as the information content of data and model complexity under uncertainty (Gupta et al., 1998; Reusser et al., 2009; Pachepsky et al., 2006; Weijs et al., 2010).

Information theory has recently been applied to develop additional metrics that characterize the patterns of observed and simulated data sets and provide insight and complementary knowledge for the evaluation of model performance (Pachepsky et al., 2006; Pan et al., 2011, 2012; Li et al., 2012; Gong et al., 2013; Pechlivanidis et al., 2014; Beven and Smith, 2015). Gong et al. (2013) evaluated the predictive performance of hydrologic models by using information-based indices to fully exploit the information available in the data set. Li et al. (2012) proposed an entropy-based criterion named maximum information minimum redundancy (MIMR) to evaluate and optimize the design of hydrometric networks. Information theory has also been applied in the calibration of hydrologic models to improve model performance (Pechlivanidis et al., 2014; Beven and Smith, 2015). Complexity and information content metrics were employed by Pachepsky et al. (2006) to discriminate between soil water flow models that gave the same accuracy of soil water flux estimates, and by Pan et al. (2011) to evaluate the ability of a model to reproduce the temporal trends of soil moisture content in variably saturated soil.

The objectives of this study are (1) to characterize the patterns of observed precipitation and streamflow time series in arid and humid watersheds, and (2) to evaluate the performance of eight hydrologic models in five watersheds using complexity and information content metrics and to compare the results with those of an evaluation based on the Nash-Sutcliffe efficiency. The eight hydrologic model structures were developed by Bai et al. (2009) and include two evapotranspiration modules, four soil moisture accounting modules, and three routing modules. Details of the model structures are given in Bai et al. (2009). The five watersheds selected for this study include two dry watersheds, the Guadalupe River and San Marcos River catchments in Texas, and three wet watersheds, the Tygart Valley River in West Virginia, the French Broad River in North Carolina, and the Leaf River in Mississippi.

2. MATERIALS AND METHODS

2.1 Study Sites

The five watersheds were selected in Texas, North Carolina, Mississippi, and West Virginia to represent a range of hydro-climatic conditions. Eleven years (1960-1970) of daily precipitation (P), streamflow (Q), and potential evapotranspiration (PE) data from the five watersheds were used in this study. The characteristics of the five watersheds are listed in Table 1.

The Guadalupe River and San Marcos River catchments in Texas are two dry watersheds with a mean annual precipitation of around 800 mm and a mean annual PE of 1500 mm. The Tygart Valley River in West Virginia, the French Broad River in North Carolina, and the Leaf River in Mississippi are three wet watersheds with a mean annual precipitation of about 1300 mm and a mean annual PE of around 800-1000 mm. More detailed information on the watersheds can be found in Bai et al. (2009).

2.2 Hydrologic Models

Eight hydrologic model structures were selected to represent differences in hydrologic model complexity for model evaluation with different metrics. The eight models, which are briefly described in Table 2, were derived from different combinations of three modules: soil moisture accounting, actual evapotranspiration, and routing (Bai et al., 2009). Models S1 and M1 estimated streamflow as surface runoff resulting from saturation excess, models S2 and M2 added subsurface flow to the streams that appears after the soil reaches field capacity, models S3 and M3 added subsurface flow from the saturated zone, and models S4 and M4 added deep storage recharge. The S models and M models differed in their treatment of soil moisture accounting: the S models used single-layer formulations (Atkinson et al., 2002; Farmer et al., 2003), and the M models used the multi-layer formulation (Son and Sivapalan, 2007). The ET module included two options: estimation from the moisture storage treated as one zone, and estimation from the unsaturated zone and the shallow saturated zone (Bai et al., 2009). The routing modules simulated flow release from storages (e.g., saturated zone, deep storage). The eight models were formed by combining the three modules with increasing complexity (Bai et al., 2009). The streamflow in the five watersheds was simulated with each of the eight models for ten years. The Nash-Sutcliffe efficiency index (NSE; Nash and Sutcliffe, 1970) was used as the model performance metric.
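For reference, the NSE compares the squared simulation errors with the squared deviations of the observations from their mean. The sketch below assumes the standard Nash and Sutcliffe (1970) definition; the function name `nse` and the sample values are illustrative and are not taken from this study.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations of observations from their mean."""
    if len(observed) != len(simulated):
        raise ValueError("series must have equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / spread


obs = [1.0, 2.0, 3.0, 4.0]            # hypothetical observed streamflow
perfect = nse(obs, obs)               # simulation identical to the observations
mean_only = nse(obs, [2.5] * 4)       # simulation equal to the observed mean
reversed_sim = nse(obs, obs[::-1])    # simulation worse than the observed mean
```

Under this definition a value of 1 indicates a perfect match, a value of 0 indicates a model that is no better than the mean of the observations, and negative values indicate a model that is worse than that mean.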

2.3 Information Content and Complexity Metrics

The general idea of the information theory-based metrics in this work is to

(a) replace the time series by a string of symbols from some (small) alphabet, where each letter denotes a particular range within the overall range of data variation;

(b) define the number of points in the data window; for each data window, the replacement of numerical data with letters creates a word;

(c) examine the probabilities of changes in words as the data window moves over the time series;

(d) derive metrics of information content and complexity based on those probabilities.

We represented the time series of hydrologic state variables (e.g., observed and modeled precipitation and streamflow in this study) as symbolic strings following the methodologies of Lange (1999) and Wolf (1999). To do so, we chose a binary encoding using the median value of each state variable as a threshold: all observations above the threshold were coded as one, and all observations at or below the median were coded as zero. The alphabet, therefore, had two letters, '0' and '1'. Both measured and simulated time series were encoded. Within the encoded strings we could analyze words of length L (L ∈ ℕ) composed of L consecutive symbols. Assuming that each word characterizes the state of the studied system, we have 2^L different words or states; the base 2 in this expression corresponds to the number of letters in the alphabet. For the binary encoding with L = 2, we have four (2^2) different words: 11, 10, 01, and 00. The first word (11) shows the state in which the variable exceeds the median value at both times in the data window, the second word (10) shows the state in which the second observation falls to or below the median value, etc. For any particular string, we can compute various empirical probabilities of the occurrence and transition of states for words of length L, such as:
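The encoding and word-counting steps described above can be sketched in a few lines. This is an illustration only, not the SYMDYN implementation: the function names and the sample values are hypothetical, and the data window is assumed to move one symbol at a time.

```python
from collections import Counter
from statistics import median


def encode_binary(series):
    """Code values above the series median as '1' and all other values as '0'."""
    threshold = median(series)
    return "".join("1" if x > threshold else "0" for x in series)


def word_probabilities(symbols, L=2):
    """Empirical probability of each word of L consecutive symbols.

    Assumption: the window slides over the string one symbol at a time.
    """
    words = [symbols[k:k + L] for k in range(len(symbols) - L + 1)]
    total = len(words)
    return {word: count / total for word, count in Counter(words).items()}


flow = [0.2, 0.1, 3.5, 2.0, 0.4, 0.3, 5.1, 1.8]   # hypothetical daily values
symbols = encode_binary(flow)                      # median of this series is 1.1
probs = word_probabilities(symbols, L=2)
```

For this short series the encoded string is `00110011`, which contains seven overlapping two-letter words; a real application would use the full multi-year daily record.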

$p_{L,i}$: probability for the word “i” to appear in the symbolic string

$p_{L,ij}$: probability for the sequence of words “i” and “j” to appear

$p_{L,i \to j}$: conditional probability of the occurrence of the jth word after the ith word

After defining this set of probabilities we can compute two information-based metrics, namely the metric entropy and the mean information gain. The metric entropy (ME) is a normalized version of Shannon's entropy (H; Shannon, 1948):

$$ME = \frac{H(L)}{L} \qquad (1)$$

where

$$H(L) = -\sum_{i=1}^{2^L} p_{L,i} \log_2 p_{L,i} \qquad (2)$$

Shannon's entropy is a measure, in bits, of the average information content per code, or of the unpredictability of the information contained in the time series. Its normalized version, ME, gives a measure independent of the word length. It has a value of zero for constant strings and increases with the randomness of the string up to a value of 1 for uniformly random sequences.
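A minimal sketch of Eqs. (1) and (2) for a binary symbol string is given below. The function names are hypothetical, and the words are assumed to be read with a window that moves one symbol at a time.

```python
import math
from collections import Counter


def shannon_entropy(symbols, L=2):
    """H(L) in bits, estimated from the empirical word probabilities."""
    words = [symbols[k:k + L] for k in range(len(symbols) - L + 1)]
    total = len(words)
    return -sum((c / total) * math.log2(c / total) for c in Counter(words).values())


def metric_entropy(symbols, L=2):
    """ME = H(L) / L, which lies between 0 and 1 for a binary alphabet."""
    return shannon_entropy(symbols, L) / L


me_constant = metric_entropy("0" * 20)   # a single word, so no uncertainty
me_uniform = metric_entropy("00110")     # words 00, 01, 11, 10 occur once each
```

The two strings illustrate the limiting cases named in the text: the constant string gives ME = 0, and the string in which all four two-letter words are equally frequent gives ME = 1.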

The mean information gain (MIG) measures the average amount of new information obtained by knowing the next symbol. Because the MIG includes the transition probability and the probability of the sequence of words, knowing the symbol that follows a word increases the local information. Therefore, the larger the MIG, the less predictable and more random the time series:

$$MIG(L) = -\sum_{i,j=1}^{2^L} p_{L,ij} \log_2 p_{L,i \to j} \qquad (3)$$
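The mean information gain can be sketched as follows. Two assumptions are made here that the text does not state explicitly: consecutive words overlap and are shifted by one symbol, and the sum carries a leading minus sign so that the MIG is non-negative. The function name is hypothetical.

```python
import math
from collections import Counter


def mean_information_gain(symbols, L=2):
    """MIG(L) = -sum over word pairs of p(i, j) * log2 p(j | i)."""
    words = [symbols[k:k + L] for k in range(len(symbols) - L + 1)]
    pairs = list(zip(words[:-1], words[1:]))       # word i followed by word j
    n = len(pairs)
    pair_counts = Counter(pairs)
    first_counts = Counter(i for i, _ in pairs)
    mig = 0.0
    for (i, _j), count in pair_counts.items():
        p_pair = count / n                          # probability of the sequence i, j
        p_cond = count / first_counts[i]            # conditional probability of j after i
        mig -= p_pair * math.log2(p_cond)
    return mig


mig_periodic = mean_information_gain("01" * 20)          # next word is always known
mig_irregular = mean_information_gain("0100110111000101")
```

In the strictly periodic string every word has exactly one possible successor, so no new information is gained and the MIG is zero. In the irregular string some words are followed by different successors, which gives a positive MIG.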

The complexity of the time series under study was assessed with the fluctuation complexity (FC, Eq. 4) and the effective measure complexity (EMC, Eq. 5). These two metrics allowed us to quantify the internal structure and the presence of patterns in the encoded symbolic strings.

$$FC = \sum_{i,j=1}^{2^L} p_{L,ij} \left( \log_2 \frac{p_{L,i}}{p_{L,j}} \right)^2 \qquad (4)$$

$$EMC = \sum_{i,j=1}^{2^L} p_{L,ij} \log_2 \frac{p_{L,i \to j}}{p_{L,j}} \qquad (5)$$

The fluctuation complexity broadly considers the ordering of, and relationships between, words in a sequence. It is obtained as the mean square deviation of the differences between the information gained and the information lost in the transition from state “i” to state “j”. Strings that show a high degree of fluctuation in their symbols give larger fluctuation complexity values (Bates and Shepard, 1993). Grassberger (1986) defined the effective measure complexity (EMC) as “the minimal information that would have to be stored for optimal predictions if it could be used with 100% efficiency”. Time series of random data or periodic sequences are simple and show low values of FC and EMC. In contrast, time series that have more structure and less randomness require a larger number of parameters to describe their behavior and show high values of FC and EMC (Pachepsky et al., 2006; Wolf, 1999).
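The two complexity measures of Eqs. (4) and (5) can be sketched in the same way. This is an illustrative estimator rather than the SYMDYN code: consecutive words are assumed to overlap with a shift of one symbol, and the word probabilities are taken as the marginals of the observed word pairs (first word for i, second word for j), an approximation whose edge effects become negligible for long series. All function names are hypothetical.

```python
import math
from collections import Counter


def pair_statistics(symbols, L=2):
    """Return pair probabilities and the first-word and second-word marginals."""
    words = [symbols[k:k + L] for k in range(len(symbols) - L + 1)]
    pairs = list(zip(words[:-1], words[1:]))
    n = len(pairs)
    p_pair = {ij: c / n for ij, c in Counter(pairs).items()}
    p_first = {w: c / n for w, c in Counter(i for i, _ in pairs).items()}
    p_second = {w: c / n for w, c in Counter(j for _, j in pairs).items()}
    return p_pair, p_first, p_second


def fluctuation_complexity(symbols, L=2):
    """FC = sum of p(i, j) * (log2(p(i) / p(j)))**2 over observed word pairs."""
    p_pair, p_first, p_second = pair_statistics(symbols, L)
    return sum(p * math.log2(p_first[i] / p_second[j]) ** 2
               for (i, j), p in p_pair.items())


def effective_measure_complexity(symbols, L=2):
    """EMC = sum of p(i, j) * log2(p(j | i) / p(j)) over observed word pairs."""
    p_pair, p_first, p_second = pair_statistics(symbols, L)
    return sum(p * math.log2((p / p_first[i]) / p_second[j])
               for (i, j), p in p_pair.items())


fc_constant = fluctuation_complexity("0" * 30)
emc_constant = effective_measure_complexity("0" * 30)
fc_periodic = fluctuation_complexity("01" * 20)
emc_periodic = effective_measure_complexity("01" * 20)
fc_bursty = fluctuation_complexity("0001" * 10)
```

A constant string gives zero for both measures. The period-two string has equally probable words, so its FC is zero, while its EMC equals one bit, the information needed to know the phase of the cycle. The string with unequal word frequencies gives a positive FC.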

One way of thinking about information theory-based metrics is to consider them as characterizing the presence of patterns in time series. Comparing these metrics for two time series indicates how similar the shapes are in the graphs representing those time series.

We computed the ME, MIG, FC, and EMC with the SYMDYN software (Wolf, 1999). The word length L was set to the maximal word length that guarantees the precision of the information content and complexity metrics in the worst case of a random series. For the same word length, the fluctuation complexity metric usually requires the largest number of data points in the time series (Pachepsky et al., 2006). The word length was set to two in this work, as in the work of Pachepsky et al. (2006).

To evaluate model performance by both information content and complexity, distances between simulated and observed streamflow time series were calculated in the two-dimensional spaces of information metric coordinates:

$$d_{MIG,EMC} = (MIG_{mod} - MIG_{obs})^2 + \frac{(EMC_{mod} - EMC_{obs})^2}{4} \qquad (6)$$

$$d_{MIG,FC} = (MIG_{mod} - MIG_{obs})^2 + \frac{(FC_{mod} - FC_{obs})^2}{4} \qquad (7)$$

Here the subscripts “mod” and “obs” denote information metrics computed from simulated and observed streamflow, respectively. The differences in the complexity metric values (EMC and FC) are normalized by division by two, which gives the factor of 1/4 on the squared terms.
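The distances of Eqs. (6) and (7) share one form, so a single helper is enough to rank models. The sketch below follows the equations as printed (a sum of squared differences, with the complexity term divided by four); the model names and metric values are hypothetical and are not results of this study.

```python
def info_distance(mig_mod, mig_obs, cx_mod, cx_obs):
    """Squared distance in the (MIG, complexity) plane; cx is either EMC or FC."""
    return (mig_mod - mig_obs) ** 2 + (cx_mod - cx_obs) ** 2 / 4


# Hypothetical metric values for observed streamflow and two candidate models
observed = {"MIG": 0.40, "EMC": 1.00}
models = {
    "A": {"MIG": 0.50, "EMC": 1.20},
    "B": {"MIG": 0.42, "EMC": 1.60},
}

distances = {
    name: info_distance(m["MIG"], observed["MIG"], m["EMC"], observed["EMC"])
    for name, m in models.items()
}
ranking = sorted(distances, key=distances.get)   # closest model first
```

In this made-up case model B reproduces the MIG more closely, but model A ranks first because its EMC is much nearer to the observed value.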

The significance of differences between Nash-Sutcliffe efficiency (NSE; Nash and Sutcliffe, 1970) values was estimated based on the approximate NSE distributions developed by McCuen et al. (2006).

3. RESULTS AND DISCUSSION

3.1 Watershed Data Overview

Figure 1 plots the observed daily time series of precipitation and streamflow from Oct. 2, 1961 to Oct. 1, 1971. The studied watersheds vary in average elevation from 98 m to 594 m, in average annual precipitation from 765 mm to 1383 mm, in average annual streamflow from 116 mm to 800 mm, and in average annual potential evaporation from 711 mm to 1528 mm (Table 1). Because the watersheds range from dry to wet and represent quite different hydro-climatic conditions, the patterns of streamflow vary significantly among them. The daily precipitation and streamflow in the three wet watersheds (Tygart Valley River, French Broad River, and Leaf River) are larger than those in the two dry watersheds (Guadalupe and San Marcos). Prolonged and frequent periods with streamflow below the detection limit occur in the dry watersheds as a consequence of prolonged dry periods.

3.2 Information Content and Complexity Metrics of Precipitation and Streamflow

Information content and complexity metrics for the five watersheds studied are presented in Fig. 2 and in Table Supp1 in the Supplementary material. There is no definite recommendation on the word length, which has been chosen ad hoc in previous publications (e.g., Lange, 1999; Pachepsky et al., 2006; Engelhardt et al., 2009; Pan et al., 2011, 2012). The effect of the word length on the efficiency of information theory-based metrics therefore needs separate research and presents an interesting avenue to explore.

The mean information gain and metric entropy of daily precipitation data are larger than 0.78 for all five watersheds (Table S1), indicating high randomness of the daily precipitation time series and a relatively uniform distribution of the system states. Similar metric entropy values were found among the wetter watersheds (0.91-0.96; Tygart Valley, French Broad, and Leaf River) and among the drier watersheds (0.83; Guadalupe and San Marcos), showing the ability of the information theory-based metrics to differentiate and group precipitation time series in terms of the frequency and depth of rainfall.

Streamflow MIG values are about 0.5 less than precipitation MIG values, and the difference is approximately the same for wet and dry watersheds. High values of MIG in precipitation reflect high randomness in the time series. The randomness of precipitation is slightly lower in dry watersheds than in wet ones. The much lower values of streamflow MIG reflect the fact that watersheds work as information filters that remove substantial random noise from the precipitation signal while converting it into the streamflow signal. Streamflow time series are not only less noisy but also more complex. In particular, streamflow EMC values are substantially higher than precipitation EMC values (Fig. 2). This indicates that, as water is delivered to streams, not only is noise removed but additional structure is also introduced into the signal, which improves the chances of prediction (higher EMC) and makes fluctuations less random (higher FC). The physical processes of canopy interception, evapotranspiration, infiltration, soil water flow, etc. control the information filtering, and these controls impose structure and dampen randomness in streamflow generation (Pan et al., 2012; Roberts, 2015). Similar behavior has been described for soil water flow, with the soil acting as an information filter between rainfall and the resulting soil water content (Pachepsky et al., 2006; Pan et al., 2011; Mishra et al., 2015).

Complexity metrics of precipitation appear to be inversely related to its information content (Fig. 2a, 2b). The larger the information content and apparent randomness of precipitation, the smaller the complexity of the time series and the less structure is found in it. Wet watersheds receive rainfall with visibly higher randomness (Fig. 1), and this is reflected in the higher MIG values. Values of the precipitation MIG are somewhat lower in dry watersheds than in wet ones. Apparently, dry watersheds receive precipitation that exhibits higher complexity than wet ones. This indicates the presence of structure and better-expressed patterns in the precipitation received in dry watersheds.

Measured streamflow time series also demonstrate dependencies between information content and complexity measures (Fig. 2c, 2d). The character of these dependencies differs between the two complexity measures, which reflect different aspects of streamflow patterns. The EMC values reflect the presence of patterns in the time series that allow predictability. Streamflow EMC values for wet watersheds are also lower than for dry ones. It is not clear whether this happens because precipitation EMC is lower in wet watersheds or because the watershed has fewer mechanisms to impose structure on the precipitation signal. The latter suggestion may be supported by the results on the dependence of FC on streamflow.

3.3 Model Performance Evaluation Using Nash-Sutcliffe Efficiency and Information Theory-based Metrics

Values of the Nash-Sutcliffe efficiency for the eight models applied at the five watersheds are presented in Table 3. Models S1 and M1 perform unsatisfactorily. Their NSE values are close to zero in dry watersheds and negative in wet watersheds. The latter means that the model predictions are worse than a prediction that simply uses the average of the observations. These results indicate that one cannot assume that the role of subsurface flows is insignificant and that knowing runoff is sufficient to predict streamflow dynamics.

According to the classification of Moriasi et al. (2007), model performance is very good, good, satisfactory, or unsatisfactory if the NSE statistic is larger than 0.75, between 0.65 and 0.75, between 0.5 and 0.65, or less than 0.5, respectively. Based on this classification, the performance of all models appears to be unsatisfactory for the Guadalupe watershed. Only S4 and M4 perform satisfactorily in the San Marcos watershed, and only S3, S4, M3, and M4 perform satisfactorily in the Tygart Valley watershed. The French Broad and Leaf watersheds have good or very good performance of S3, S4, M3, and M4. Overall, the performance of the models is better in wet watersheds. A significant improvement occurred for the French Broad, Guadalupe, and San Marcos watersheds after recharge was added as a mechanism affecting streamflow, i.e., when models S3 and M3 were changed to S4 and M4, respectively (Table 3).
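The performance classes quoted above translate directly into a small lookup. In the sketch below the function name is hypothetical, and values that fall exactly on a class boundary are assumed to belong to the lower class, a detail the text leaves open.

```python
def classify_nse(nse_value):
    """Map an NSE value to the performance classes quoted from Moriasi et al. (2007).

    Assumption: a value exactly on a boundary is assigned to the lower class.
    """
    if nse_value > 0.75:
        return "very good"
    if nse_value > 0.65:
        return "good"
    if nse_value > 0.5:
        return "satisfactory"
    return "unsatisfactory"


# Hypothetical NSE values, not taken from Table 3
labels = {value: classify_nse(value) for value in (0.82, 0.70, 0.55, 0.31, -0.20)}
```

Negative NSE values, such as those reported for models S1 and M1 in the wet watersheds, fall in the unsatisfactory class.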

NSE values increase as the conceptual complexity of the models increases (see Table 2). The NSE values of the S2 models are very close to those of the M2 models, the NSE values of the S3 models are close to those of the M3 models, and the NSE values of the S4 models are very close to those of the M4 models for all watersheds except the San Marcos watershed, where the M2, M3, and M4 models have larger NSE than the S2, S3, and S4 models, respectively.

Inspection of the significance of differences between the NSE values of different models (Table 3) shows that no significant differences are found between the average NSE values of S4 and M4 and among those of S3, S2, M3, and M2 for the French Broad; among S3, S4, M3, and M4 for the Tygart Valley and Leaf River; and between S4 and M4 and between S3 and M3 for the Guadalupe. The absence of significant differences indicates an opportunity to use other indicators of model performance for model selection.

The information content and complexity of simulated streamflow are compared with those of measured streamflow in Fig. 3 and 4. The corresponding distances between measured and simulated streamflows in the coordinates of information-based metrics are shown in Table Supp2 in the Supplemental materials. Inspection of the graphs in Fig. 3 and 4 shows that, although there is some similarity between ranking models by NSE and by information-based metrics, the latter can provide additional insight into model performance. In particular, the information content and complexity of the French Broad watershed are best simulated by models S2, M2, and M3 (Fig. 3 and 4), although the NSE of those models is lower than that of M4 and S4. The M4 and S4 models seem to generate simulated streamflows that are more complex than the measured ones. Rankings of models by the two complexity metrics, EMC and FC, can be quite different, since these metrics reflect different aspects of the complexity in time series. The French Broad watershed provides a good example of this with regard to model M1: it is almost perfect based on the fluctuation complexity but gives a very poor result based on the effective measure complexity (Fig. 3 and 4).

In the Tygart Valley watershed there is no disagreement between the NSE-based and the information theory-based top-ranked model; both methods point to model M4. We note that whereas the NSE-based ranking does not discriminate between S4 and M4, the information theory-based metrics clearly indicate that the multi-layer soil model (M4) reflects the information content and complexity of this watershed's streamflow better than the single-layer soil model S4 does. A similar situation is observed for the Leaf River watershed, where the NSE values for S4 and M4 are indistinguishable, and yet M4 provides much more similarity in information content and complexity between simulated and measured streamflows than S4 does. Models S3 and S4 generate streamflows with substantially smaller information content than M3 and M4. This may indicate that what looks like noise is actually the result of soil layering.

The Guadalupe watershed gives an example of models not actually working well. Models S4 and M4 give performance that borders on satisfactory. The information-based metrics indicate that M4 is much preferable, since the single-layer models S2, S3, and S4 do not create enough variation to get the information content right. More complexity is needed, and this is provided by the multi-layer soil models M2, M3, and M4. The example of the Guadalupe River also shows that using two complexity metrics, EMC and FC, can be more efficient than using only one. Model M2, for example, provides values of FC that are very similar to the measured ones, i.e., it generates a hidden structure in the streamflow time series that is close to that in the measured ones. However, this model fails to generate a correct EMC, which reflects the predictability of changes in the time series. The same is also true for the San Marcos watershed. The situation here is somewhat similar to the case of the French Broad watershed: the NSE values point to the preferability of the S4 and M4 models, and the information content and complexity metrics show that S4 and M4 indeed perform reasonably well, but the best performance is shown by the M3 model, which has the third rank in NSE at this watershed. This indicates that although NSE values are helpful in model discrimination, they are far from capable of integrating qualitative aspects of the correspondence between measured and simulated time series (Schaefli and Gupta, 2007).

The simple notion of squared error (Eqs. 6 and 7) is a first attempt to define the distance between time series in the coordinates of complexity and information content metrics. Weights may be needed to account for the different roles that information content metrics and complexity metrics may play in the evaluation of models. It is possible that these weights can be found from a comparative evaluation of the predictive capability of the models. We note that other recently suggested information theory-based methods, such as the so-called Hodrick-Prescott filter (Arias-Hidalgo, 2012), the Jensen-Shannon divergence, and the phase space reconstruction called the complexity-entropy causality plane (Serinaldi et al., 2013), can be used to find patterns in series and identify recurrent changes in hydrographs. Also, the methods of this work may be applied with different word lengths depending on the length of the available time series (Wolf, 1999). Further search for information theory-based metrics to complement accuracy-based metrics presents an interesting research avenue to explore.

4. CONCLUSIONS

Information theory-based metrics were applied in this study to characterize the patterns of observed precipitation and streamflow time series in arid and humid watersheds, and to evaluate the performance of eight hydrologic model structures in five watersheds using both the traditional Nash-Sutcliffe efficiency (NSE) statistic and information theory-based metrics as a complement to NSE for comparing and selecting models.

We found that:

• patterns of precipitation and streamflow in humid watersheds were more random and less complex than those in arid watersheds;

• watersheds served as information filters, and the streamflow time series were much less random and much more complex than the precipitation time series;

• information content and complexity were substantially different in watersheds with wet and dry climates;

• in pairs of models that differed only by the use of the single-layer or multi-layer soil model, the multi-layer model simulated information content and complexity better than the single-layer model in the majority of cases;

• values of NSE were not significantly different for two or more models in each watershed; in all these cases the information theory-based metrics provided a clear distinction between models, and the best models could be selected.

ACKNOWLEDGEMENTS

The Interagency Agreement IAA-NRC-05-005 of USDA-ARS with the US Nuclear Regulatory Commission supported YP and FP; GM was supported by the Spanish Ministry of Economy and Competitiveness through the grant FPDI-2013-16742.

REFERENCES

Abbott, M.B., Bathurst, J.C., Cunge, J.A., O’Connell, P.E., Rasmussen, J., 1986a. An introduction to European Hydrologic System-Systeme Hydrologique Europeen, SHE, 1: History and philosophy of a physically-based, distributed modeling system. J. Hydrol. 87, 45-59.

Abbott, M.B., Bathurst, J.C., Cunge, J.A., O’Connell, P.E., Rasmussen, J., 1986b. An introduction to European Hydrologic System-Systeme Hydrologique Europeen, SHE, 2: Structure of a physically-based, distributed modeling system. J. Hydrol. 87, 61-77.

Akaike, H., 1973. Information theory as an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (Eds.), 2nd International Symposium on Information Theory. Akademiai Kiado, Budapest, Hungary, pp. 267-281.

Arias Hidalgo, M. E. 2012. A Decision Framework for Integrated Wetland-River Basin Management In A Tropical And

Data Scarce Environment. UNESCO-IHE, Institute for Water Education.

Arnold, J. G., Srinivasan, R., Muttiah, R. S. and Williams, J. R. 1998. Large area hydrologic modeling and assessment part I:

Model development, JAWRA J. Am. Water Resour. Assoc., 34(1), 73–89.

Atkinson, S., Woods, R.A., Sivapalan, M., 2002. Climate and landscape controls on water balance model complexity over changing time scales. Water Resour. Res. 38(12), 1314, doi:10.1029/2002WR001487.

Bai, Y., Wagener, T., Reed, P., 2009. A top-down framework for watershed model evaluation and selection uncertainty.

Environ. Modell. Softw. 24, 901-916.

Bates, J.E., Shepard, H.K., 1993. Measuring complexity using information fluctuation. Phys. Lett. A 172(6), 416-425.

Beven, K.J., Kirkby, M.J., 1979. A physically-based variable contributing area model of basin hydrology. Hydrol. Sci. Bull.,

24(1), 43-69.

Beven, K.J., Smith, P., 2015. Concepts of information content and likelihood in parameter calibration for hydrological simulation models. J. Hydrol. Eng. 20(1), A4014010.

Chen, L., Shen, Z., Yang, X., Liao, Q., Yu, S.L., 2014. An interval-deviation approach for hydrology and water quality model evaluation within an uncertainty framework. J. Hydrol. 509, 207-214.

Crawford, N.H., Linsley, R.K., 1966. Digital simulation in hydrology: Stanford Watershed Model IV. Technical Report

No. 39, Stanford University, Palo Alto, California.

Engelhardt, S., Matyssek, R., Huwe, B., 2009. Complexity and information propagation in hydrological time series of mountain

forest catchments. Eur. J. For. Res. 128(6), 621–631, doi:10.1007/s10342-009-0306-2.

Farmer, D., Sivapalan, M., Jothiyangkoon, C., 2003. Climate, soil, and vegetation controls upon the variability of water

balance in temperate and semiarid landscapes: Downward approach to water balance analysis. Water Resour. Res.

39(2), 1035, doi:10.1029/2001WR000328.

Gong, W., Gupta, H.V., Yang, D., Sricharan, K., Hero III, A.O., 2013. Estimating epistemic and aleatory uncertainties

during hydrologic modeling: An information theoretic approach. Water Resour. Res. 49, 2253-2273, doi:

10.1002/wrcr.20161.

Grassberger, P., 1986. Toward a quantitative theory of self-generated complexity. Int. J. Theor. Phys. 25, 907-938.

Gupta, H.V., Sorooshian, S., Yapo, P.O., 1998. Toward improved calibration of hydrologic models: Multiple and

noncommensurable measures of information. Water Resour. Res., 34(4), 751-763.

Kashyap, R.L., 1982. Optimal choice of AR and MA parts in autoregressive moving average models. IEEE T. Pattern Anal.

4(2), 99-104.

Krause, P., Boyle, D.P., Bäse, F., 2005. Comparison of different efficiency criteria for hydrological model assessment. Adv.

Geosci. 5, 89-97.

Lange, H., 1999. Time series analysis of ecosystem variables with complexity measures. InterJournal for

Complex Systems Manuscript #250. New England Complex Systems Institute, Cambridge, MA.

Li, C., Singh, V.P., Mishra, A.K., 2012. Entropy theory-based criterion for hydrometric network evaluation and design:

Maximum information minimum redundancy. Water Resour. Res. 48, W05521, doi: 10.1029/2011WR011251.

Liang, X., Lettenmaier, D.P., Wood, E.F., Burges, S.J., 1994. A simple hydrologically based model of land surface water and

energy fluxes for general circulation models. J. Geophys. Res. 99(D7), 14415-14428.

McCuen, R. H., Knight, Z., & Cutter, A. G. 2006. Evaluation of the Nash–Sutcliffe efficiency index. Journal of Hydrologic

Engineering. 11(6): 597-602.

Mishra, V., Ellenburg, W., Al-Hamdan, O., Bruce, J., Cruise, J., 2015. Modeling Soil Moisture Profiles in Irrigated Fields by

the Principle of Maximum Entropy. Entropy 17, 4454–4484. doi:10.3390/e17064454

Moriasi, D. N., Arnold, J. G., Van Liew, M. W., Bingner, R. L., Harmel, R. D., & Veith, T. L. 2007. Model evaluation

guidelines for systematic quantification of accuracy in watershed simulations. Trans. ASABE, 50(3), 885-900.

Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models, Part I – A discussion of principles. J.

Hydrol. 10, 282-290.

Pachepsky, Y., Guber, A., Jacques, D., Simunek, J., Van Genuchten, M.T., Nicholson, T., Cady, R., 2006. Information

content and complexity of simulated soil water fluxes. Geoderma 134, 253–266. doi:10.1016/j.geoderma.2006.03.003

Pan, F., Pachepsky, Y.A., Guber, A.K., Hill, R.L., 2011. Information and complexity measures applied to observed and

simulated soil moisture time series. Hydrol. Sci. J. 56, 1027–1039. doi:10.1080/02626667.2011.595374

Pan, F., Pachepsky, Y.A., Guber, A.K., McPherson, B.J., Hill, R.L., 2012. Scale effects on information theory-based

measures applied to streamflow patterns in two rural watersheds. J. Hydrol. 414-415, 99–107.

doi:10.1016/j.jhydrol.2011.10.018

Pechlivanidis, I.G., Jackson, B., McMillan, H., Gupta, H., 2014. Use of an entropy-based metric in multiobjective calibration

to improve model performance. Water Resour. Res. 50, 8066-8083, doi: 10.1002/2013WR014537.

Pechlivanidis, I.G., Jackson, B.M., McIntyre, N.R., Wheater, H.S., 2011. Catchment scale hydrological modeling: A review

of model types, calibration approaches and uncertainty analysis methods in the context of recent developments in

technology and applications. Global Nest J. 13(3), 193-214.

Rathinasamy, M., Khosa, R., Adamowski, J., Ch, S., Partheepan, G., Anand, J., Narsimlu, B., 2014. Wavelet-based

multiscale performance analysis: An approach to assess and improve hydrological models. Water Resour. Res. 50,

9721-9737, doi: 10.1002/2013WR014650.

Reusser, D.E., Blume, T., Schaefli, B., Zehe, E., 2009. Analysing the temporal dynamics of model performance for

hydrological models. Hydrol. Earth Syst. Sc. 13, 999-1018.

Roberts, A.D., 2015. The effects of current landscape configuration on streamflow within selected small watersheds of the

Atlanta metropolitan region. J. Hydrol. Reg. Stud. doi:10.1016/j.ejrh.2015.11.002

Schwarz, G., 1978. Estimating the dimension of a model. Ann. Stat. 6(2), 461-464.

Serinaldi, F., Zunino, L., Rosso, O.A., 2013. Complexity–entropy analysis of daily stream flow time series in the continental

United States. Stoch. Environ. Res. Risk Assess. 28, 1685–1708. doi:10.1007/s00477-013-0825-8

Shannon, C.E., 1948. A mathematical theory of communication. AT&T Tech. J. 27, 379-423, 623-656.

Singh, V.P., Woolhiser, D.A., 2002. Mathematical modeling of watershed hydrology. J. Hydrol. Eng. 7(4), 270-292.

Son, K., Sivapalan, M., 2007. Improving model structure and reducing parameter uncertainty in conceptual water balance

models through the use of auxiliary data. Water Resour. Res. 43, W01415, doi: 10.1029/2006WR005032.

Sugawara, M., Ozaki, E., Watanabe, I., & Katsuyama, Y. (1976). Tank Model and its Application to Bird Creek, Wollombi

Brook, Bihin River, Sanaga River, and Nam Mune. National Center for Disaster Prevention, Tokyo, Research Note,

11, Kyoto, Japan, pp. 1-64.

Wagener, T., Sivapalan, M., Troch, P.A., McGlynn, B.L., Harman, C.J., Gupta, H.V., Kumar, P., Rao, P.S.C., Basu, N.B.,

Wilson, J.S., 2010. The future of hydrology: An evolving science for a changing world. Water Resour. Res. 46,

W05301, doi: 10.1029/2009WR008906.

Weijs, S.V., Schoups, G., van de Giesen, N., 2010. Why hydrological predictions should be evaluated using information

theory. Hydrol. Earth Syst. Sc. 14, 2545-2558.

Wolf, F., 1999. Berechnung von Information und Komplexität von Zeitreihen – Analyse des Wasserhaushaltes von

bewaldeten Einzugsgebieten. Bayreuth. Forum Okol. 65, 164 S.

Ye, M., Neuman, S.P., Meyer, P.D., 2004. Maximum likelihood Bayesian averaging of spatial variability models in

unsaturated fractured tuff. Water Resour. Res. 40, W05113, doi:10.1029/2003WR002557.

Yu, Z., Lakhtakia, M.N., Yarnal, B., White, R.A., Miller, D.A., Frakes, B., Barron, E.J., Duffy, C., Schwartz, F.W., 1999.

Simulating the river-basin response to atmospheric forcing by linking a mesoscale meteorological model and

hydrologic model system. J. Hydrol. 218, 72-91.

Zhao, R., Zhuang, Y., Fang, L., Liu, X., Zhang, Q., 1980. The Xinanjiang model. Proceedings of Oxford Symposium on

Hydrological Forecasting, IAHS Publication No. 129, International Association of Hydrological Sciences,

Wallingford, U.K., 351-356.

Table 1. Selected properties of watersheds in this study.

Basin name and sampling location          Area    Mean elevation  Mean annual P  Mean annual Q  Mean annual PE
                                          [km²]   [m]             [mm]           [mm]           [mm]
French Broad River near Asheville, NC     2448    594             1383           800            819
Tygart Valley River near Pipestem, WV     2372    390             1166           736            711
Leaf River near Collins, MS               1950    111             1346           415            1052
Guadalupe River near Spring Branch, TX    3406    289             765            116            1528
San Marcos River near Luling, TX          2170    98              827            179            1449

Table 2. General description of the models used (after Bai et al., 2009).

ID  General description
S1  Single-layer model with single store. Runoff generation controlled by maximum soil water storage.
S2  Single-layer model with single store. Runoff generation by saturation excess and subsurface flow controlled by threshold storage.
S3  Single-layer model with two stores (unsaturated and saturated zones). Evaporation and transpiration from both stores. Runoff generation by saturation excess and subsurface flow from the saturated zone.
S4  Single-layer model with three stores (unsaturated and saturated zones and deep store). Evaporation and transpiration from unsaturated and saturated zones. Base flow losses from deep store. Runoff generation by saturation excess and subsurface flow from the saturated zone.
M1  Multi-layer model (10 layers to represent a soil moisture profile that fits the Xinanjiang model distribution) with single store. Runoff generation controlled by maximum soil water storage.
M2  Multi-layer model with single store. Runoff generation by saturation excess and subsurface flow controlled by threshold storage.
M3  Multi-layer model with two stores (unsaturated and saturated zones). Evaporation and transpiration from both stores. Runoff generation by saturation excess and subsurface flow from the saturated zone.
M4  Multi-layer model with three stores (unsaturated and saturated zones and deep store). Evaporation and transpiration from unsaturated and saturated zones. Recharge of the deep store. Runoff generation by saturation excess and subsurface flow from the saturated zone.

Table 3. The Nash-Sutcliffe efficiency values for eight models in five watersheds.

Model  French Broad  Tygart Valley  Leaf River  Guadalupe  San Marcos
S1     -1.499        -0.231         -0.227      0.205      0.076
S2     0.590b        0.477b         0.643b      0.407c     0.378e
S3     0.608b        0.541a         0.682a      0.450b     0.389e
S4     0.764a        0.567a         0.700a      0.508a     0.548b
M1     -1.236        -0.198         -0.130      0.211      0.114
M2     0.589b        0.476b         0.640b      0.418c     0.448d
M3     0.609b        0.545a         0.704a      0.460ab    0.497c
M4     0.754a        0.559a         0.699a      0.478a     0.584a
NSE values followed by the same letter are not significantly different at the 0.05 significance level.
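
For readers who want to reproduce values of the kind reported in Table 3, the standard Nash-Sutcliffe efficiency (Nash and Sutcliffe, 1970) can be sketched as follows. The function name and the short example series are illustrative assumptions and are not data from this study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated series must have equal length")
    mean_obs = sum(observed) / len(observed)
    # Sum of squared errors between observed and simulated flow.
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean.
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / ss_obs


# Hypothetical daily flows, for illustration only.
q_obs = [1.0, 2.0, 3.0, 4.0]
nse_perfect = nash_sutcliffe(q_obs, q_obs)          # simulation equals observation
nse_mean = nash_sutcliffe(q_obs, [2.5, 2.5, 2.5, 2.5])  # simulation equals the observed mean
```

A simulation that matches the observations gives NSE = 1, a simulation no better than the observed mean gives NSE = 0, and negative values such as those of S1 and M1 indicate a simulation worse than the observed mean.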

List of figures.
Figure 1. Daily observed precipitation and streamflow time series from Oct. 2, 1961 to Oct. 1, 1971 in five watersheds
across the US.
Figure 2. Relationships between the mean information gain (MIG) and the complexity metrics, effective measure
complexity (EMC) and fluctuation complexity (FC), in precipitation time series of the watersheds in this study: l - French Broad
River, n - Tygart Valley River, u - Leaf River, r - Guadalupe River, s - San Marcos River.
Figure 3. Relationships between mean information gain (MIG) and effective measure of complexity (EMC) in measured
(Q) and simulated (numbers) streamflow time series. Blue symbols 1, 2, 3, 4 correspond to single-layer soil models S1, S2,
S3, and S4; red symbols 1, 2, 3, 4 correspond to multi-layer soil models M1, M2, M3, and M4.

Figure 4. Relationships between mean information gain (MIG) and fluctuation complexity (FC) in measured (Q) and
simulated (numbers) streamflow time series. Blue symbols 1, 2, 3, 4 correspond to single-layer soil models S1, S2, S3, and S4;
red symbols 1, 2, 3, 4 correspond to multi-layer soil models M1, M2, M3, and M4.

Supplementary material

Table Supp1. Information content and complexity measures of daily precipitation (P), observed daily streamflow (Q), and
daily streamflow simulated with eight models (S1-M4) in five watersheds. ME: Metric Entropy; MIG: Mean Information Gain;
EMC: Effective Measure of Complexity; FC: Fluctuation Complexity.

Series  French Broad                 Tygart                       Leaf River
        ME     MIG    EMC    FC      ME     MIG    EMC    FC      ME     MIG    EMC    FC
P       0.905  0.872  0.198  0.613   0.960  0.943  0.103  0.277   0.915  0.890  0.150  0.552
Q       0.498  0.379  0.717  1.551   0.431  0.301  0.778  1.658   0.420  0.286  0.804  1.520
S1      0.081  0.065  0.093  0.600   0.394  0.337  0.342  1.103   0.014  0.013  0.002  0.214
S2      0.513  0.404  0.650  1.727   0.470  0.348  0.737  1.574   0.332  0.195  0.824  1.248
S3      0.553  0.452  0.603  1.759   0.384  0.254  0.781  1.575   0.243  0.091  0.912  0.941
S4      0.426  0.306  0.724  1.695   0.352  0.217  0.814  1.444   0.247  0.094  0.914  0.950
M1      0.389  0.368  0.125  1.578   0.520  0.471  0.289  1.299   0.280  0.251  0.173  1.361
M2      0.506  0.395  0.667  1.708   0.518  0.405  0.679  1.626   0.326  0.191  0.810  1.257
M3      0.499  0.391  0.647  1.758   0.442  0.324  0.705  1.717   0.302  0.159  0.858  1.321
M4      0.414  0.286  0.771  1.606   0.407  0.282  0.755  1.620   0.368  0.233  0.806  1.515

Series  Guadalupe                    San Marcos
        ME     MIG    EMC    FC      ME     MIG    EMC    FC
P       0.829  0.785  0.268  1.005   0.830  0.785  0.266  1.000
Q       0.371  0.233  0.827  1.445   0.375  0.225  0.901  1.257
S1      0.008  0.007  0.007  0.129   0.004  0.004  0.004  0.077
S2      0.102  0.081  0.131  0.732   0.236  0.152  0.502  1.070
S3      0.141  0.068  0.439  0.696   0.186  0.063  0.742  0.707
S4      0.217  0.059  0.948  0.668   0.233  0.075  0.947  0.759
M1      0.008  0.007  0.003  0.130   0.017  0.011  0.036  0.176
M2      0.314  0.280  0.204  1.490   0.384  0.290  0.566  1.559
M3      0.423  0.326  0.583  1.685   0.298  0.156  0.854  1.286
M4      0.357  0.210  0.884  1.283   0.259  0.103  0.935  0.890
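
To illustrate how a quantity such as MIG can be obtained from a time series, the following is a minimal sketch, not the study's implementation. It assumes quantile-based symbolization, Shannon block entropies H(L) over words of length L, MIG taken as H(L+1) - H(L), and normalization by log2 of the alphabet size. The function names, the alphabet size, and the word length are hypothetical placeholders rather than the settings used for Table Supp1.

```python
import math
from collections import Counter


def symbolize(series, n_symbols):
    """Map each value to a symbol 0..n_symbols-1 according to its empirical quantile class."""
    ranked = sorted(series)
    n = len(ranked)
    # Lower bounds of quantile classes 1..n_symbols-1 (assumed equal-probability classes).
    cuts = [ranked[(k * n) // n_symbols] for k in range(1, n_symbols)]
    return [sum(value >= c for c in cuts) for value in series]


def block_entropy(symbols, word_length):
    """Shannon entropy (bits) of the distribution of overlapping words of the given length."""
    words = [tuple(symbols[i:i + word_length])
             for i in range(len(symbols) - word_length + 1)]
    counts = Counter(words)
    total = len(words)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mean_information_gain(symbols, word_length, n_symbols):
    """H(L+1) - H(L), normalized by log2 of the alphabet size (normalization is an assumption)."""
    gain = block_entropy(symbols, word_length + 1) - block_entropy(symbols, word_length)
    return gain / math.log2(n_symbols)


# Illustration: a strictly periodic two-symbol string is fully predictable.
periodic = [0, 1] * 50
mig_periodic = mean_information_gain(periodic, word_length=1, n_symbols=2)
```

For the periodic string the gain is close to zero because each symbol is determined by its predecessor, which is consistent with reading MIG as a measure of randomness.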

Table Supp2. Distances between observed and simulated streamflow in the coordinates of information content and
complexity measures.

Model  French Broad          Tygart Valley         Leaf                  Guadalupe             San Marcos
       d_MIG,EMC  d_MIG,FC   d_MIG,EMC  d_MIG,FC   d_MIG,EMC  d_MIG,FC   d_MIG,EMC  d_MIG,FC   d_MIG,EMC  d_MIG,FC
S1     0.699      0.570      0.437      0.280      0.847      0.708      0.851      0.696      0.924      0.630
S2     0.072      0.091      0.062      0.063      0.093      0.164      0.712      0.388      0.406      0.119
S3     0.135      0.127      0.047      0.063      0.223      0.349      0.422      0.409      0.227      0.319
S4     0.073      0.103      0.091      0.136      0.221      0.344      0.212      0.426      0.157      0.291
M1     0.592      0.017      0.518      0.247      0.632      0.087      0.854      0.695      0.891      0.581
M2     0.052      0.080      0.144      0.105      0.095      0.162      0.625      0.052      0.341      0.164
M3     0.071      0.104      0.077      0.037      0.138      0.161      0.261      0.152      0.083      0.071
M4     0.108      0.097      0.030      0.027      0.053      0.053      0.061      0.084      0.127      0.220
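
The distances in Table Supp2 can be related to the values in Table Supp1 with a short calculation. The sketch below assumes a plain Euclidean distance in the (MIG, EMC) plane. The distance type is an assumption, although the tabulated MIG-EMC values are consistent with it. The inputs are the French Broad values for observed streamflow (Q) and model S2 from Table Supp1; the function and variable names are illustrative.

```python
import math

# Observed (Q) and simulated (S2) metrics for French Broad, copied from Table Supp1.
observed = {"MIG": 0.379, "EMC": 0.717}
simulated_s2 = {"MIG": 0.404, "EMC": 0.650}


def distance(obs, sim, axes=("MIG", "EMC")):
    """Euclidean distance between two series in the chosen metric coordinates (assumed form)."""
    return math.sqrt(sum((obs[a] - sim[a]) ** 2 for a in axes))


d_s2 = distance(observed, simulated_s2)
```

For these inputs the distance rounds to 0.072, the value listed for S2 in the French Broad watershed. The MIG-FC distances are not reproduced by this unscaled form, which suggests that FC was rescaled before the distance was computed; that scaling is not stated in this section.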
