Chapter 14 Model Validat 2008 PEM Fuel Cell Modeling and Simulation Using
Model Validation
14.1 Introduction
Model validation is the most important step in the model-building process.
However, it is often neglected. Validating a mathematical model usually
consists of quoting the R² statistic from the fit. Unfortunately, a high R²
value does not guarantee that the model actually fits the data well. If the
model does not fit the data well, this negates the purpose of building the
model in the first place.
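As a small illustration of why a high R² alone is not sufficient, the following Python sketch fits a straight line to made-up data that actually follow a quadratic curve. The data and the helper name `fit_line` are assumptions chosen for illustration; they are not taken from the fuel cell examples in this chapter.

```python
# Hypothetical data: the true relationship is quadratic, but a line is fitted.
x = list(range(11))
y = [xi ** 2 for xi in x]

def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

intercept, slope = fit_line(x, y)
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = sum(e ** 2 for e in residuals)
y_mean = sum(y) / len(y)
ss_tot = sum((yi - y_mean) ** 2 for yi in y)
r_squared = 1 - ss_res / ss_tot

print(f"R^2 = {r_squared:.3f}")
print("residual signs:", "".join("+" if e > 0 else "-" for e in residuals))
```

In this sketch R² exceeds 0.9, yet the residuals are positive at both ends of the range and negative in the middle. That systematic pattern is exactly what a residual plot reveals and a single summary statistic hides.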
There are many statistical tools that can be used for model validation.
This chapter covers the basic concepts for model validation, which include
residuals, normal distribution of random errors, missing terms in the
functional part of the model, and unnecessary terms in the model. The most
useful tool is graphical residual analysis. There are many types of residual
plots that allow the model accuracy to be evaluated. There are also several
numerical methods that help confirm the conclusions drawn from graphical
techniques. To help interpret a borderline residual plot, a lack-of-fit test
for assessing the correctness of the functional part of the model can be
used. The number of plots that can be used for model validation is limited
when the number of parameters being estimated is relatively close to the
size of the data set. This often occurs in designed experiments. In this
case, residual plots are often difficult to interpret because of the number
of unknown parameters.
14.2 Residuals
The residuals from a fitted model are the differences between the observed
responses at each combination of variables and the corresponding responses
predicted by the regression function. The residual for the ith observation
in the data set can be written as:
$$e_{ij} = y_{ij} - \hat{y}_{ij} \qquad (14\text{-}1)$$
with $y_{ij}$ denoting the ith response in the data set and $\hat{y}_{ij}$
the response predicted by the regression function at $x_{ij}$, the list
of explanatory variables, each set at the corresponding values found in the
ith observation in the data set. If a model is adequate, the residuals should
394 PEM Fuel Cell Modeling and Simulation Using MATLAB |
14.2.2 Independent Random Errors
A lag plot of residuals helps to assess whether the random errors are
independent from one to the next. If the errors are not independent, the
estimate of the error standard deviation will be biased, which leads to
improper inferences about the process. The lag plot works by plotting each
residual against the residual that follows it. Because of the way the
residuals are paired, the lag plot has one fewer point than most other types
of residual plots.
If the errors are independent, the lag plot will show no pattern or
structure, and the points will appear randomly scattered across the plot.
If there is significant dependence between errors, some sort of
deterministic pattern will be evident.
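The pairing behind a lag plot can be sketched in a few lines of Python. The residuals below are computed as experimental minus predicted overvoltage from Table 14-1, in run order; the lag-1 autocorrelation is added here as one possible numerical companion to the plot and is an assumption of this sketch, not a statistic quoted in the chapter.

```python
# Residuals (experimental minus predicted) from Table 14-1, in run order,
# stored in units of 1e-4 V and converted to volts.
residuals = [r * 1e-4 for r in [
    -70, -3, -22, -13, -58, 19, 20, -32, 8, 28, 52, 21, 18, 64,
    2, 34, 8, -34, 8, -13, 26, -49, 34, 52, -74, -2, -46, 21]]

# Lag plot coordinates: each residual paired with the one that follows it.
lag_pairs = list(zip(residuals[:-1], residuals[1:]))

def lag1_autocorrelation(e):
    """Sample lag-1 autocorrelation; values near 0 are consistent with independence."""
    n = len(e)
    mean = sum(e) / n
    num = sum((e[i] - mean) * (e[i + 1] - mean) for i in range(n - 1))
    den = sum((ei - mean) ** 2 for ei in e)
    return num / den

r1 = lag1_autocorrelation(residuals)
print(len(residuals), "residuals give", len(lag_pairs), "lag-plot points")
print(f"lag-1 autocorrelation = {r1:.3f}")
```

Plotting the first element of each pair against the second gives the lag plot. With 28 residuals there are 27 points, which is the "one fewer point" noted in the text.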
Example 14-1 shows the types of scatterplots used for determining
consistent standard deviation.
TABLE 14-1
Experimental and Calculated Results for Example 14-1

Run    Temperature  Current  Experimental Activation  Predicted Activation
Order  (K)          (A)      Overvoltage (V)          Overvoltage (V)
1      358          2.72     -0.2717                  -0.2647
2      328          6.66     -0.4017                  -0.4014
3      343          6.66     -0.3522                  -0.35
4      358          6.66     -0.3038                  -0.3025
5      343          6.66     -0.3341                  -0.3283
6      343          6.66     -0.3756                  -0.3775
7      328          6.66     -0.3727                  -0.3747
8      343          2.72     -0.322                   -0.3188
9      343          6.66     -0.3492                  -0.35
10     343          6.66     -0.3472                  -0.35
11     328          2.72     -0.3141                  -0.3193
12     328          6.66     -0.352                   -0.3541
13     343          6.66     -0.3482                  -0.35
14     358          6.66     -0.3473                  -0.3537
15     358          6.66     -0.325                   -0.3252
16     358          6.66     -0.3218                  -0.3252
17     343          16.33    -0.4075                  -0.4083
18     343          16.33    -0.3902                  -0.3868
19     343          6.66     -0.3492                  -0.35
20     343          6.66     -0.3788                  -0.3775
21     358          16.33    -0.3834                  -0.386
22     343          2.72     -0.2969                  -0.292
23     343          6.66     -0.3249                  -0.3283
24     343          2.72     -0.2868                  -0.292
25     343          16.33    -0.4453                  -0.4379
26     343          6.66     -0.3502                  -0.35
27     328          6.66     -0.3793                  -0.3747
28     343          16.33    -0.4062                  -0.4083
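The residuals that the plots in this chapter are built from can be obtained directly from Table 14-1. The Python sketch below transcribes the table and takes each residual as experimental minus predicted overvoltage; the variable names are illustrative assumptions.

```python
# (run, temperature in K, current in A, experimental V, predicted V), from Table 14-1
table_14_1 = [
    (1, 358, 2.72, -0.2717, -0.2647), (2, 328, 6.66, -0.4017, -0.4014),
    (3, 343, 6.66, -0.3522, -0.35), (4, 358, 6.66, -0.3038, -0.3025),
    (5, 343, 6.66, -0.3341, -0.3283), (6, 343, 6.66, -0.3756, -0.3775),
    (7, 328, 6.66, -0.3727, -0.3747), (8, 343, 2.72, -0.322, -0.3188),
    (9, 343, 6.66, -0.3492, -0.35), (10, 343, 6.66, -0.3472, -0.35),
    (11, 328, 2.72, -0.3141, -0.3193), (12, 328, 6.66, -0.352, -0.3541),
    (13, 343, 6.66, -0.3482, -0.35), (14, 358, 6.66, -0.3473, -0.3537),
    (15, 358, 6.66, -0.325, -0.3252), (16, 358, 6.66, -0.3218, -0.3252),
    (17, 343, 16.33, -0.4075, -0.4083), (18, 343, 16.33, -0.3902, -0.3868),
    (19, 343, 6.66, -0.3492, -0.35), (20, 343, 6.66, -0.3788, -0.3775),
    (21, 358, 16.33, -0.3834, -0.386), (22, 343, 2.72, -0.2969, -0.292),
    (23, 343, 6.66, -0.3249, -0.3283), (24, 343, 2.72, -0.2868, -0.292),
    (25, 343, 16.33, -0.4453, -0.4379), (26, 343, 6.66, -0.3502, -0.35),
    (27, 328, 6.66, -0.3793, -0.3747), (28, 343, 16.33, -0.4062, -0.4083),
]

# Residual = observed response minus predicted response.
residuals = [exp - pred for _, _, _, exp, pred in table_14_1]
mean_residual = sum(residuals) / len(residuals)
max_abs_residual = max(abs(e) for e in residuals)

for (run, temp, cur, _, _), e in zip(table_14_1, residuals):
    print(f"run {run:2d}  T={temp} K  I={cur:5.2f} A  residual={e:+.4f} V")
print(f"mean residual = {mean_residual:.4e} V")
print(f"largest |residual| = {max_abs_residual:.4f} V")
```

All residuals fall within about ±0.008 V, which matches the ×10⁻³ scale of the residual plots, and their mean is very close to zero.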
[Figure: scatter plot of residuals; residual axis in units of 10⁻³, from -8 to 8]
Figures 14-1 through 14-4 show the residuals versus the experimental
factors, the run order plot, and the lag plot for Example 14-2.
For Figures 14-1 and 14-2, the range of the residuals looks essentially
constant across the levels of the predictor variables, which are temperature
and current. The points are randomly scattered above and below the y = 0
line. This suggests that the standard deviation of the random error is the
same for the responses observed at each temperature and current. These
scatter plots indicate that the parametric model is most likely a good fit
to the experimental data.
[Figure: residuals (×10⁻³) versus current (A); current axis from 2 to 18, residual axis from -8 to 8]
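The visual check for constant spread described above can be complemented by a simple numerical summary. The Python sketch below groups the Table 14-1 residuals (experimental minus predicted) by temperature and by current and reports the count, range, and sample standard deviation at each level. The helper name `spread_by_level` is hypothetical, and with only five or six points at some levels the standard deviations are rough indicators rather than a formal test.

```python
import statistics
from collections import defaultdict

# Factor settings and residuals from Table 14-1, in run order.
temps = [358, 328, 343, 358, 343, 343, 328, 343, 343, 343, 328, 328, 343, 358,
         358, 358, 343, 343, 343, 343, 358, 343, 343, 343, 343, 343, 328, 343]
currents = [2.72, 6.66, 6.66, 6.66, 6.66, 6.66, 6.66, 2.72, 6.66, 6.66,
            2.72, 6.66, 6.66, 6.66, 6.66, 6.66, 16.33, 16.33, 6.66, 6.66,
            16.33, 2.72, 6.66, 2.72, 16.33, 6.66, 6.66, 16.33]
residuals = [r * 1e-4 for r in [
    -70, -3, -22, -13, -58, 19, 20, -32, 8, 28, 52, 21, 18, 64,
    2, 34, 8, -34, 8, -13, 26, -49, 34, 52, -74, -2, -46, 21]]

def spread_by_level(levels, values):
    """Map each factor level to (count, min, max, sample standard deviation)."""
    groups = defaultdict(list)
    for level, value in zip(levels, values):
        groups[level].append(value)
    return {level: (len(v), min(v), max(v), statistics.stdev(v))
            for level, v in sorted(groups.items())}

by_temp = spread_by_level(temps, residuals)
by_current = spread_by_level(currents, residuals)

for name, summary in (("T (K)", by_temp), ("I (A)", by_current)):
    for level, (n, low, high, sd) in summary.items():
        print(f"{name} = {level}: n={n}, min={low:+.4f}, max={high:+.4f}, sd={sd:.4f}")
```

Standard deviations of similar size at every level, with residuals on both sides of zero, are consistent with the constant-spread reading of the scatter plots.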
14.3 Normal Distribution of Normal Random Errors
[Figure: residuals (×10⁻³) versus run order; run order axis from 0 to 30]
[Figure: lag plot of residuals; horizontal axis: residuals (×10⁻³), from -8 to 8]
Figures 14-5 through 14-7 show the normal probability plot, histogram,
and box plot for Example 14-2.
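The coordinates of a normal probability plot can be sketched as follows in Python. The residuals are again experimental minus predicted values from Table 14-1. The plotting positions (i - 0.5)/n are one common convention and are an assumption here; the chapter does not state which convention its figure uses. The correlation coefficient is added as a rough numerical indicator of straightness.

```python
from statistics import NormalDist

# Residuals (experimental minus predicted) from Table 14-1, in volts.
residuals = [r * 1e-4 for r in [
    -70, -3, -22, -13, -58, 19, 20, -32, 8, 28, 52, 21, 18, 64,
    2, 34, 8, -34, 8, -13, 26, -49, 34, 52, -74, -2, -46, 21]]

n = len(residuals)
ordered = sorted(residuals)

# Assumed plotting positions (i - 0.5)/n mapped to standard normal quantiles.
probabilities = [(i - 0.5) / n for i in range(1, n + 1)]
quantiles = [NormalDist().inv_cdf(p) for p in probabilities]
pairs = list(zip(quantiles, ordered))

def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

r = correlation(quantiles, ordered)
print(f"normal probability plot correlation = {r:.3f}")
```

If the random errors are approximately normal, the ordered residuals plotted against the theoretical quantiles fall close to a straight line, and the correlation is close to 1.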
[Figure: normal probability plot of residuals; horizontal axis: Quantiles from Standard Normal Distribution (×10⁻³), from -6 to 6; vertical axis: probability, from 0.25 to 0.95]
nebulous, it may be helpful to use statistical tests of the hypothesis that
the model fits the data. One may wonder whether it would be more useful to
jump directly to the statistical tests, since they are more quantitative;
however, residual plots provide the best overall feedback on the model fit.
These quantitative tests are termed "lack-of-fit" tests, and many are
illustrated in any statistics textbook.
The most commonly used strategy is to compare the amount of variation in
the residuals with an estimate of the random variation in the process
obtained from an additional, independent data set. If the two estimates of
the random variation are similar, then it can be assumed that no terms are
missing from the model. If the random variation from the model is larger
than the random variation from the independent data set, then terms may be
missing or misspecified in the functional part of the model.
Comparing the variation between experimental and model data sets is very
useful; however, there are many instances where a replicate measurement
is not available. If this is the case, the lack-of-fit statistics can be
[Figure: distribution of residuals; horizontal axis: Residuals, from -0.01 to 0.01; vertical axis from 0.1 to 0.9]
$$\sigma_m = \sqrt{\frac{1}{n_u - p} \sum_{i=1}^{n_u} n_i \left(\bar{y}_i - \hat{y}_i\right)^2} \qquad (14\text{-}2)$$
[Figure: box plot of residuals (×10⁻³); horizontal axis: Column Number]
at the ith combination, then $\sigma_m$ should be close in value to $\sigma_r$
and should also be a good estimate of $\sigma$. If the model is missing any
important terms, or if any of the terms are incorrectly specified, then the
function will provide a poor estimate of the mean response for some
combinations of predictors, and $\sigma_m$ will probably be greater than
$\sigma_r$.
The model-independent estimator can be calculated using [7]:
$$\sigma_r = \sqrt{\frac{1}{n - n_u} \sum_{i=1}^{n_u} \sum_{j=1}^{n_i} \left(y_{ij} - \bar{y}_i\right)^2} \qquad (14\text{-}3)$$
Since $\sigma_r$ depends only on the data and not on the functional part of
the model, $\sigma_r$ will be a good estimator of $\sigma$ regardless of
whether the model is a complete description of the process. Typically, if
$\sigma_m > \sigma_r$, then one or more parts of the model may be missing or improperly
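$$L = \frac{\sigma_m^2}{\sigma_r^2} \qquad (14\text{-}4)$$
$$T = \frac{\hat{\beta}_i}{\sigma_{\hat{\beta}_i}} \qquad (14\text{-}5)$$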
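A minimal Python sketch of the lack-of-fit comparison is given below. It uses hypothetical replicated data (two runs at each of four settings of a single predictor, not data from this chapter) and a straight-line model with p = 2 parameters. It computes the model-dependent estimate of Equation 14-2, the model-independent estimate of Equation 14-3, and their squared ratio, which is the usual form of the lack-of-fit statistic and is assumed here to be what L denotes.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical replicated data (not from the chapter): two runs at each x.
x = [1, 1, 2, 2, 3, 3, 4, 4]
y = [1.1, 0.9, 4.2, 3.8, 9.1, 8.9, 16.2, 15.8]
p = 2  # number of parameters in the straight-line model y = a + b*x

# Least-squares fit of the straight line.
n = len(x)
x_mean, y_mean = sum(x) / n, sum(y) / n
slope = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
         / sum((xi - x_mean) ** 2 for xi in x))
intercept = y_mean - slope * x_mean

# Collect the replicates at each unique predictor setting.
groups = defaultdict(list)
for xi, yi in zip(x, y):
    groups[xi].append(yi)
n_u = len(groups)  # number of unique settings

# Model-dependent estimate: level means versus model predictions (Eq. 14-2).
ss_lack = sum(len(ys) * (sum(ys) / len(ys) - (intercept + slope * xi)) ** 2
              for xi, ys in groups.items())
sigma_m = sqrt(ss_lack / (n_u - p))

# Model-independent estimate: replicates versus their own level mean (Eq. 14-3).
ss_pure = sum((yi - sum(ys) / len(ys)) ** 2
              for ys in groups.values() for yi in ys)
sigma_r = sqrt(ss_pure / (n - n_u))

lack_of_fit = sigma_m ** 2 / sigma_r ** 2
print(f"sigma_m = {sigma_m:.4f}, sigma_r = {sigma_r:.4f}, L = {lack_of_fit:.1f}")
```

Because the made-up responses curve upward while the model is a straight line, the model-dependent estimate is far larger than the replicate-based one, and the ratio is much greater than 1. In practice the ratio would be compared with a critical value from the F distribution with n_u - p and n - n_u degrees of freedom.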
FIGURE 14-8. ANOVA for temperature and current for Example 14-3.
Figure 14-8 illustrates the ANOVA table with the temperature and
current as factors.
The confidence limits printed in the MATLAB workspace from
Example 14-3 are as follows:
Sample mean = -3.5714e-006
Confidence interval for sample mean at 95% confidence level -
-0.0014334 <= Sample mean <= 0.0014262
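The printed limits appear to be consistent with a t-based confidence interval for the mean of the Table 14-1 residuals. The Python sketch below is one way to check this; it assumes the residuals are experimental minus predicted values and uses a tabulated two-sided 95% Student-t critical value for 27 degrees of freedom. It does not reproduce the MATLAB code of Example 14-3.

```python
import statistics
from math import sqrt

# Residuals (experimental minus predicted) from Table 14-1, in volts.
residuals = [r * 1e-4 for r in [
    -70, -3, -22, -13, -58, 19, 20, -32, 8, 28, 52, 21, 18, 64,
    2, 34, 8, -34, 8, -13, 26, -49, 34, 52, -74, -2, -46, 21]]

n = len(residuals)
mean = statistics.mean(residuals)
std_error = statistics.stdev(residuals) / sqrt(n)

# Assumed critical value: two-sided 95% Student-t, n - 1 = 27 degrees of
# freedom, taken from standard tables.
t_crit = 2.0518

lower = mean - t_crit * std_error
upper = mean + t_crit * std_error

print(f"Sample mean = {mean:.4e}")
print(f"{lower:.7f} <= Sample mean <= {upper:.7f}")
```

The interval contains zero, which is what is expected of the residual mean when the model has no systematic bias.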
In Figure 14-8, the model F-values of 50.93 and 11.45 indicate that the
model terms are significant. There is only a 0.0% and a 0.04% chance,
respectively, that F-values this large could be due to noise. When the
values of Prob > F are less than 0.05, this typically indicates that the
model terms are significant. If the values are greater than 0.100, the
model terms are not significant. If there are many insignificant model
terms, model reduction may improve the model.
Chapter Summary
Fuel cell model validation is the most important step in the model-building
process. However, little attention is usually given to this step. A fast
method for analyzing the validity of a model is to look at plots of residuals
versus the experimental factors, run order plots, and lag plots. These plots
give a good feel for how accurately a model fits the experimental data and
how dependable it is. Selecting various statistical techniques, or using a
combination of them, will tell the user whether there are any unnecessary
portions of the model and will help determine the amount of noise. Some of
the techniques that are useful in comparing experimental and calculated
Problems
• Perform a regression analysis for the data in Example 14-1.
• Determine the T and F distribution tests for the data in Example 14-1.
• How well do you think that the calculated data in Example 14-1 fit the
experimental data?
• Perform a lack-of-fit test for the data in Example 14-1.
Endnotes
[1] NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/.
Date created 06/01/2003. Last updated 07/18/2006.
[2] Montgomery, D.C. Design and Analysis of Experiments. 5th ed. 2001. New
York: John Wiley & Sons.
[3] Amphlett, J.C., R.M. Baumert, R.F. Mann, B.A. Peppley, P.R. Roberge, and T.J.
Harris. Performance modeling of the Ballard Mark IV solid polymer electrolyte
fuel cell. J. Electrochem. Soc. Vol. 142, No. 1, January 1995.
[4] NIST/SEMATECH e-Handbook of Statistical Methods.
[5] Montgomery, Design and Analysis of Experiments.
[6] NIST/SEMATECH e-Handbook of Statistical Methods.
[7] Ibid.
[8] Ibid.