Verification of Ensemble Forecasts - A Survey: Laurence J. Wilson Meteorological Service of Canada Montreal, Quebec

This document surveys methods for verifying ensemble forecasts, including verifying the ensemble distribution, individual members, and probability forecasts derived from the ensemble. It discusses scores like the ranked probability score (RPS) and continuous ranked probability score (CRPS) to verify the ensemble distribution, as well as rank histograms to assess calibration. Methods for verifying individual members and the ensemble mean are also outlined. The document concludes by noting reliability diagrams and receiver operating characteristic (ROC) curves can verify probability forecasts from the ensemble.


Verification of Ensemble Forecasts - A

Survey

Laurence J. Wilson
Meteorological Service of Canada
Montreal, Quebec
Outline
• The ensemble verification problem
– Attributes applied to the ensemble distribution
• Verification of the ensemble distribution
– Wilson 1999
– RPS and CRPS
– Rank Histogram
• Verification of individual ensemble members
• Verification of probability forecasts from the
ensemble
– Reliability tables
– The ROC
Verification of the ensemble
• Problem:
– how to compare a distribution with an observation
• The concept of “consistency”:
– For each possible probability distribution f, the a posteriori
verifying observations are distributed according to f in
those circumstances when the system predicts the
distribution f. (Talagrand)
– similar to reliability
• The concept of “non-triviality”
– the EPS must predict different distributions at different
times
Strategy for ensemble verification
Ensemble verification - distribution
Ensemble verification - 500 mb
Ensemble verification - 500 mb
Comments on “Wilson” score
• Sensitive both to “nearness” of the ensemble mean
and to ensemble spread
• Verifies the distribution only in the vicinity of the
observation; variations outside the window have
no impact
• Believed to be strictly proper - shown empirically
• Related to the Brier Score for a single forecast:
Sc = 1 − BS
• Can account for forecast “difficulty” by choosing
window based on climatological variance
Verification of approximations to the EPS
distribution
• The Ranked Probability Score (RPS)

RPS = \frac{1}{K-1} \sum_{i=1}^{K} \left( \sum_{n=1}^{i} P_n - \sum_{n=1}^{i} O_n \right)^2

– discrete form; choose categories, then sample the
distribution according to those categories
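As a sketch, the discrete RPS can be computed from cumulative sums. The helper name `rps` is illustrative, assuming `p` holds the forecast probabilities per ordered category and `o` is 1 for the observed category and 0 elsewhere:

```python
import numpy as np

def rps(p, o):
    """Ranked Probability Score: squared differences of the
    cumulative forecast and observed distributions, summed over
    the K categories and normalized by K - 1."""
    p = np.asarray(p, dtype=float)
    o = np.asarray(o, dtype=float)
    K = len(p)
    return np.sum((np.cumsum(p) - np.cumsum(o)) ** 2) / (K - 1)

# Three ordered categories; the observation falls in the middle one.
print(rps([0.2, 0.5, 0.3], [0, 1, 0]))  # ≈ 0.065
print(rps([0.0, 1.0, 0.0], [0, 1, 0]))  # 0.0 (perfect forecast)
```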
• Continuous RPS (CRPS)

CRPS(P, x_a) = \int_{-\infty}^{\infty} \left[ P(x) - P_a(x) \right]^2 \, dx

where P_a(x) is the step function jumping from 0 to 1 at the
observed value x_a
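For an ensemble treated as an empirical step CDF, the integral above has a closed form, CRPS = E|X − x_a| − ½ E|X − X′| over the members. A minimal sketch (the function name `crps_ensemble` is an assumption, not from the slides):

```python
import numpy as np

def crps_ensemble(members, obs):
    """CRPS of an equally weighted ensemble against a scalar
    observation, via the identity
        CRPS = mean|X - obs| - 0.5 * mean|X - X'|,
    which equals the integral of [P(x) - P_a(x)]^2 dx when P is
    the empirical step CDF of the members."""
    x = np.asarray(members, dtype=float)
    term1 = np.mean(np.abs(x - obs))
    term2 = 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))
    return term1 - term2

print(crps_ensemble([0.0, 1.0], 0.5))  # ≈ 0.25
```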
CRPS example

[Figure: CDFs of the forecast (ensemble) and the observed value
(step function); y-axis: probability, 0 to 0.9; x-axis: X. The
CRPS integrates the squared difference between the two curves.]
Rank Histogram (Talagrand Diagram)
• Preparation
– order the members of the ensemble from lowest to
highest - identifies n+1 ranges including the two
extremes
– identify the location of the observation, tally over a
large number of cases
• Interpretation
– Flat indicates ensemble spread about right to represent
uncertainty
– U-shaped - ensemble spread too small
– dome-shaped - ensemble spread too large
– asymmetric - over- or under-forecasting bias
– This is NOT a true verification measure
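The preparation steps above can be sketched as a tally of observation ranks. The name `rank_histogram` is illustrative; for simplicity ties count the member as lying above the observation (a real implementation would randomize ties):

```python
import numpy as np

def rank_histogram(ens, obs):
    """Tally, over many cases, which of the n+1 ranges defined by
    the n ordered ensemble members contains the observation."""
    ens = np.asarray(ens, dtype=float)   # shape (n_cases, n_members)
    obs = np.asarray(obs, dtype=float)   # shape (n_cases,)
    n = ens.shape[1]
    # Rank = number of members strictly below the observation (0..n),
    # so sorting the members explicitly is not needed.
    ranks = np.sum(ens < obs[:, None], axis=1)
    return np.bincount(ranks, minlength=n + 1)

# Two cases, 3 members each -> 4 bins; flatness over many cases
# suggests the spread is about right.
print(rank_histogram([[1, 2, 3], [1, 2, 3]], [0.5, 2.5]))  # [1 0 1 0]
```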
Rank Histogram example
Rank Histogram
Verification of individual members
• Preferred over verification of the ensemble mean
for comparison with the operational model
• Unperturbed control
– compare with the full-resolution model
• Best and worst member
– an a posteriori verification - less useful to forecasters
– select over a forecast range or individually at each range
• Methods
– all that apply to continuous fields: RMSE, MAE, bias,
anomaly correlation, etc.
– preferable to verify against observations rather than an analysis
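The standard continuous scores listed above can be sketched in a few lines; the helper name `member_scores` is an assumption:

```python
import numpy as np

def member_scores(fcst, obs):
    """Bias, MAE and RMSE of one member (e.g. the unperturbed
    control) against matched observations."""
    err = np.asarray(fcst, dtype=float) - np.asarray(obs, dtype=float)
    return {"bias": float(err.mean()),
            "mae": float(np.abs(err).mean()),
            "rmse": float(np.sqrt((err ** 2).mean()))}

print(member_scores([2.0, 2.0], [1.0, 3.0]))
# {'bias': 0.0, 'mae': 1.0, 'rmse': 1.0}
```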
The Ensemble mean
• Popular, because it scores well under quadratic scoring rules
• Should NOT be compared to individual outcomes:
– different sampling distribution
– not a trajectory of the model
Verification of probability forecasts from the
Ensemble
• Same as verification of any probability forecasts
• Reliability Table (with unconditional
distribution of forecasts) + ROC (with
likelihood diagram) sufficient for complete
diagnostic verification
• Reliability table: distribution conditioned on the forecast
• ROC: distribution conditioned on the observation
• Attributes:
• reliability
• sharpness
• resolution
• discrimination
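Each point of the empirical ROC comes from turning the probability forecast into a yes/no decision at one threshold; sweeping thresholds traces the curve. A sketch with an assumed helper name `roc_points`:

```python
import numpy as np

def roc_points(prob, event):
    """(false alarm rate, hit rate) pairs, one per distinct
    probability threshold, for an event observed (1) or not (0)."""
    prob = np.asarray(prob, dtype=float)
    event = np.asarray(event, dtype=int)
    n_yes = max(int(np.sum(event == 1)), 1)
    n_no = max(int(np.sum(event == 0)), 1)
    pts = []
    for t in np.unique(prob):
        warn = prob >= t  # forecast "yes" at this threshold
        hr = float(np.sum(warn & (event == 1))) / n_yes
        far = float(np.sum(warn & (event == 0))) / n_no
        pts.append((far, hr))
    return pts

# Perfectly discriminating two-case example.
print(roc_points([0.1, 0.9], [0, 1]))  # [(1.0, 1.0), (0.0, 1.0)]
```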
ROC - ECMWF Ensemble Forecasts
Temperature 850 mb anomaly < -4 C (vs. analysis)

[Figure: ROC curves (hit rate vs. false alarm rate) for the 96 h,
144 h and 240 h forecasts, Europe analysis 2000, against the
no-skill diagonal, with likelihood diagrams (number of cases vs.
forecast probability, stratified by observed yes/no) at each range.]

Az and DA by forecast range:
96 h: Az = 0.900, DA = 1.812
144 h: Az = 0.831, DA = 1.357
240 h: Az = 0.725, DA = 0.844
ROC Issues
• Empirical vs. fitted
• No. points needed to define the ROC
• ROC and value (“potential value”)
ROC - threshold variation
(Wilson, 2000)

[Figure: ROC - Summer 97, Europe. Hit rate vs. false alarm rate
for day 3 precipitation thresholds of 1, 2, 5 and 10 mm.]

[Figure: ROC - Summer 97, Europe. Day 3 curves for the 1 mm and
10 mm thresholds with the no-skill line and the single (HR, FAR)
points for the 1 mm and 10 mm deterministic forecasts.
Az: d3 1 mm = 0.866, d3 10 mm = 0.851;
s: d3 1 mm = 1.221, d3 10 mm = 1.096.]
Summary
• Verification of the ensemble distribution - depends
on how it is to be used by the forecaster
• Two aspects: verification of distribution vs.
verification of probabilities from the distribution
• Several measures shown, characteristics identified
• Sufficiency of Reliability table and ROC graph for
diagnostic verification of probability forecasts
