
Journal of Wind Engineering & Industrial Aerodynamics 234 (2023) 105330

Contents lists available at ScienceDirect

Journal of Wind Engineering & Industrial Aerodynamics


journal homepage: www.elsevier.com/locate/jweia

Automated classification of gust events in the contiguous USA


Nicholas J. Cook
Wind Engineer, Highcliffe-on-Sea, Dorset, UK

ARTICLE INFO

Keywords: Thunderstorms; Frontal downbursts; Mesoscale gusts; Synoptic-scale gusts; ASOS; Kernel density estimation; K-means; Neural network; Shapelet transform; Fuzzy membership

ABSTRACT

Extreme gust observations near the ground, which are critical in assessing the impact of wind on structures, are a disjoint mixture of events created by different meteorological mechanisms which require separation before assessment. Classification by visual inspection of the meteorological data becomes impractical for very large datasets. In this study recent automated approaches that use statistics, pattern recognition and/or neural networks were calibrated against 9014 visually classified gust events of ≥40kn from 25 locations across the USA over 22 years. The visual classification distinguished between five classes of valid gust event: synoptic scale storms, deep convection, the forward flank and the rear flank of gust fronts, and downbursts from isolated thunderstorms; and between two classes of artefact. The ensemble-averaged timeseries of each class formed a distinctive hierarchy. The misclassification rate against the visual classification varied from 24% for the poorest method to 10% for the best method, with most differences between adjacent classes. Extending the most promising method to include contemporaneous temperature and pressure observations reduced the misclassification rate to 2.3%. When applied to >10⁷ gust events ≥20kn from 450 locations across the USA, the class hierarchy remained stable. Implementation is by open-source R scripts.

1. Introduction

Wind observations in the atmospheric boundary layer (ABL) near the ground surface form the critical first link in the Davenport Chain (IAWE, 2011) for assessing the impact of wind on the built environment. A concept now generally accepted is that all wind climates are mixed in the sense that physically different wind mechanisms govern surface winds at different times in an exclusive, or disjoint, manner. Gomes and Vickery (1978) were first to separate observations from the various mechanisms for analysis before assessing their joint contribution to extreme winds, especially from rare (e.g., hurricanes and tropical cyclones) or localised (e.g., thunderstorm and tornado) events. Now that analysis and prediction models for hurricanes and tropical cyclones are well established (Vickery et al., 2009), the current focus of attention has moved to thunderstorm and other convective downbursts. The transient, non-stationary, meso-scale nature of these convective gusts, mixed with the ABL gusts in synoptic-scale windstorms, makes their detection and study a particular challenge. Downbursts from isolated thunderstorms produce distinctive wind fields that affect structures differently from normal engineering assumptions, in ways that are only recently becoming understood.

The importance of thunderstorms to Wind Engineering is discussed by Lombardo et al. (2014) who report they are the cause of most of the damage to structures across the contiguous United States (CONUS). In addition to the peak gust speed in a thunderstorm downburst, sudden off-axis changes in wind direction are a particular issue for horizontal-axis wind turbines (Kwon et al., 2012) (Ahmed et al., 2022) (Lu et al., 2019). Zhang et al. (2015) report thunderstorms as the cause of 20% of all turbine accidents. Downbursts are also a major hazard to aircraft on take-off and landing.

An impressive number of full-scale field studies have used a variety of approaches: single and groups of anemometers (Canepa et al., 2020) (Xhelaj et al., 2020) (Zhang et al., 2019a), vertical arrays on single towers (Zhang et al., 2019b) (Choi, 2004) and on groups of towers (Orwig and Schroeder, 2007), by Doppler RADAR, SODAR or LiDAR soundings (Canepa et al., 2020) (Xhelaj et al., 2020) (Gunter and Schroeder, 2015) (Shu et al., 2017), and in various combinations. Most employ the classical directional decomposition method for synoptic observations, with adjustments for the transient nature of downbursts, as addressed by (Zhang et al., 2019a). Key consensus findings are that the direction of the peak gust speed is invariant with height (Canepa et al., 2020) (Xhelaj et al., 2020) (Choi, 2004), and that the depth of the downburst gust front increases with time (Gunter and Schroeder, 2015). Downbursts are reported to be “arranged either randomly, in squall-lines, or in mesoscale convective systems” (Xhelaj et al., 2020).

Identification and classification of gust events in a timeseries of observations are generally implemented as two sequential processes in


https://doi.org/10.1016/j.jweia.2023.105330
Received 27 November 2022; Received in revised form 19 January 2023; Accepted 24 January 2023
Available online 9 February 2023
0167-6105/© 2023 Elsevier Ltd. All rights reserved.

Acronyms

ABL     Atmospheric boundary layer
ASOS    Automated Surface Observing System of the US National Weather Service
BHKD    Bivariate Highest Kernel Density method
CM      Confusion matrix
CONUS   Contiguous United States
CVA     Canonical Variate Analysis
ETNN    Emergence-Trend Neural Network method
EVA     Extreme-value analysis
GCP     Great Colorado Plateau
HKD     Highest Kernel Density method
KDE     Kernel Density Estimation
METAR   Meteorological Aerodrome Report of current weather (FM-15)
NCEI    US National Centers for Environmental Information
NN      Artificial Neural Network
NNT     Nearest Normalised Trace method
NNDT    Neural Network Direct Traces method
NOAA    US National Oceanic and Atmospheric Administration
POT     Peaks-over-threshold
QC      Quality control
SPC     NOAA Storm Prediction Center
SPECI   Special current weather report of a significant change between METAR reports (FM-16)
TD6405  NCEI data set of ASOS wind speed and direction observations at 1-min intervals
TD6406  NCEI data set of ASOS temperature, dewpoint, pressure and precipitation observations at 1-min intervals
TNNDT   Trivariate Neural Network Direct Traces method
T/NT    Binary thunderstorm/non-thunderstorm or thermal/non-thermal classification
T/NT/A  Trinary thermal/non-thermal/artefact classification
UTC     Coordinated Universal Time (Zulu)
WMO     World Meteorological Organisation
XTNNDT  Trivariate Neural Network Direct Traces method trained for extremes

either order: a) identify all gust events, then classify each event; or b) pre-define classes, then search for events of that class. In chronological order of publication:

• 1992, Twisdale and Vickery (1992) classified thunderstorm gusts as the maximum gust observed on “thunderdays” – days where thunder is seen or heard.
• 1999, Choi (1999) identified thunderstorm gusts by visual search through 13 years of anemograph charts and other records.
• 2002, Choi and Hidayat (2002) identified thunderstorms in the 20 highest gusts per year as coinciding with “thunder and rain” to calibrate the difference in gust factor between thunderstorm and monsoon winds in Singapore.
• 2002, Kasperski (2002), for Germany, defined the three classes: depression, gust front, and thunderstorm, for gusts separated by 24h, identifying by the peak, mean and gust factor, but found it difficult to separate depressions from fronts.
• 2009, Lombardo et al. (2009) used the thunderstorm flag in METAR to identify hourly maximum gusts from thunderstorms.
• 2014, De Gaetano et al. (2014) adopted a statistical approach, evaluating 12 parameters: peak 1s gust; 1-min mean speed; 10-min mean, and 1-h mean: speed, direction, turbulence intensity, skewness, and kurtosis; of 2Hz–10Hz sonic anemometer data. They used various combinations of gust factor in a logic tree to identify the same three classes as Kasperski (2002) but could not definitively separate fronts and thunderstorms – “classifying an event not attributable to a depression (D) as a thunderstorm (T) or a gust front (F) is the ratio G10/G60: when it is less than 0.90, the event is usually a thunderstorm (T); when it is greater than 0.90, the event is usually a gust front (F).”
• 2019, Huang et al. (2019) used daily outliers in gust, mean temp and mean humidity: Gust >15 m/s, plus outliers of temperature and humidity within 20 min of the peak gust, to identify “thermally-developed” wind.
• 2019, Guerova et al. (2019) used instability indices and integrated water vapour to predict thunderstorm activity some minutes before observed lightning flashes. This is just one example of similar predictive methods using satellite data.
• 2019, Vallis et al. (2019) identified thunderstorm downbursts from the gust factor and from anomalies in atmospheric pressure, temperature and dewpoint. Through the passage of cold fronts, they differentiate between the expected synoptic changes of temperature/pressure and non-synoptic events. They also classified high gust factor events that lacked the expected temperature/pressure anomalies as artefacts.
• 2020, Samanta et al. (2020) detected thunderstorm days for pre-monsoon winds at Kolkata from cloud-base height and potential temperature at the 850 hPa level.
• 2020, Chen and Lombardo (2020) built a convolutional Neural Network (NN) that differentiated between thunderstorm and synoptic gusts in 1-min interval wind observations from the US Automated Surface Observing System (ASOS). This was trained using 76,480 records of 91-min duration, centred on a maximum gust ≥40kn, and mutually separated by ≥45 min. The thunderstorms in the training set were automatically identified from the thunderstorm flag in the corresponding METAR, as in (Lombardo et al., 2009). Training the NN took 161 CPU hours. They noted an issue of false positive spikes in the ASOS data that were classified as thunderstorm gusts.
• 2022, Arul et al. (2022) used 240 non-stationary 1h-periods of wind speed to train a Stationary Shapelet Transform (SST) – a method originally developed (Ye and Keogh, 2011) to find patterns in electro-cardiograms. Here a “shapelet” is a short timeseries that is characteristic of an event class. From 168 records of 1-h duration at 10Hz from sonic anemometers, and without presuming any class structure, an automatic process generated 35,998 candidate shapelets which were winnowed down to 32 “mother shapelets” that best represented recurring shapes in the timeseries. On visually classifying these shapelets, 21 indicated thunderstorm gusts and 11 indicated synoptic gusts. Transforming the whole timeseries with each mother shapelet gave a set of coefficients containing peaks, each corresponding to a section of the timeseries matching a mother shapelet, so indicating a classified gust event. Although the classification is only binary, thunderstorm/non-thunderstorm (T/NT), multiple mother shapelets were required because of the high variability of thunderstorm shapes.

2. Motivation for this study

In each of the previous studies, summarised above, the methodology used one of three basic approaches:

a) Conventional statistical moments of the wind speed timeseries (Kasperski, 2002) (Solari et al., 2020).
b) Pattern recognition applied to the wind speed timeseries (Arul et al., 2022) (Chen and Lombardo, 2020).


c) Non-anemometric meteorological parameters (Twisdale and Vickery, 1992) (Choi and Hidayat, 2002) (Li et al., 2022) (Samanta et al., 2020) (Guerova et al., 2019).

Each approach has its strengths and weaknesses. Methods using satellite data, e.g., Guerova et al. (2019), are principally for short-term forecasting and nowcasting, and cannot be applied to long-term historical records. Dependence on the METAR thunder flag is appropriate in selecting gust events for calibration or training, but as a general classification criterion it risks producing numerous false positives and false negatives, because not all thunderstorms are flagged, and many do not produce downbursts at the anemometer. A potential weakness of Solari et al. (2020) is that it used high-frequency observations to evaluate the statistical moments and it is not immediately clear whether the method will operate successfully with sparser observations, e.g., the 1-min interval in (Vallis et al., 2019). A strength of SST in (Arul et al., 2022) is as a “glass box” method where all workings are transparent and monitorable by the user, whereas the convolutional NN in Chen and Lombardo (2020) is a “black box” with all its workings hidden.

Although some of the above studies share the same source observations, each one stands on its own in the literature. These methods deserve a considered and fair intercomparison of their effectiveness. There is also the prospect of releasing some hidden synergy between the methods, to capitalise on their respective strengths. The binary T/NT classification in all the methods distributes downbursts in gust fronts between the two classes: as reported, Kasperski (2002) and Vallis et al. (2019) are biased to NT, while Arul et al. (2022) and De Gaetano et al. (2014) are biased to T. Thus, each class remains a mixture, rather than the intended exclusive class. De Gaetano et al. (2014) called for a method that definitively identifies gust front downbursts and deep-convection downdrafts. This curiosity-driven study examines whether that aim is possible using only anemometric data as in De Gaetano et al. (2014) and Vallis et al. (2019), or whether additional meteorological parameters are required.

3. ASOS 1-min interval data

This study uses the 1-min interval weather observations from some of the 860 Automated Surface Observing System (ASOS) stations across CONUS. The progress of the ASOS implementation and upgrade programme is documented in a series of NOAA reports currently available at https://weather.gov/asos/ASOSImplementation. The ASOS 1-min data are a most valuable resource for Wind Engineering because they permit study of mesoscale events over much longer observational periods than is possible with targeted measurement campaigns. The data are available as text files by year and station in the TD6405 and TD6406 databases from NCEI.¹

¹ The TD6405 and TD6406 files from 2000 to June 2022 may be downloaded by FTP or HTTP. NCEI has transitioned to HTTP only, with observations from January 2022 onwards available here, updated monthly.

3.1. ASOS stations

A Development set of twenty-five ASOS stations was chosen for this study. Indicated by the crosses in Fig. 1, they are distributed across CONUS, except for two stations serving Dallas, TX, three serving Chicago, and three serving Washington, DC, intended to expose any non-geographical disparities. Five stations were selected for their unreliability, e.g., KPRB Paso Robles, CA, to stress-test the classification methods. Also shown by grey circles in Fig. 1 are the locations of the Analysis set of 450 ASOS stations with WMO Class 1 or 2 exposures (Cook, 2021), selected for fuller geographic cover. Relevant metadata for the 25 development stations are given in Appendix A.

3.2. TD6405

Each TD6405 file includes the 2-min mean and 1-min maximum 3s gust wind speed and direction at a resolution of 1kn and 1°. Initially, from 2000, Belfort cup and vane anemometers with a 5s gust response were used, most at 10m above ground, but some at 26 feet (7.9m). Between 2005 and 2010 these were replaced by Vaisala sonic anemometers averaged to give the WMO standard 3s gust. Although presented (NCDC, 2006a) as a homogeneous set of data in fixed format text files, TD6405 is neither of these things owing to incremental changes in instrumentation, acquisition, and quality control (QC) procedures, as well as errors in transmission and archiving which require each line of data to be parsed. In addition to the typical artefacts found in long-term wind records (Cook, 2014a), most of which can be detected automatically (Cook, 2014b), there are two recurring artefacts particular to TD6405.

1. Occasionally, data between 00:00 and 12:00 UTC on one day are archived as having occurred on the previous day in addition to the valid data for that day. This duplication is correctable by exploiting serial continuity to determine which of each pair of observations belongs to which day.
2. Having no moving parts and heated in the winter to prevent icing, the sonic anemometers provide ideal perches for birds which block the acoustic path and generate spurious large “spikes” of gust speed on landing and take-off. These spikes are difficult to distinguish from instrumentation and transmission glitches, and localised thermal events (e.g., dust devils) lasting less than 1 min. Although not correctable, they can be detected and removed.

At the end of 2013, NOAA applied a new QC test – Test 10 – to the ASOS data to detect and remove spurious bird-generated gust values (NOAA, 2013) before reporting/archiving. A curation of the ASOS data (Cook, 2022) revealed that Test 10 produced four times as many false positives as true positives, and each false positive unnecessarily culled the following 5 min of valid data. Although the proportion of lost data is small, less than 0.03% (NOAA, 2013) and insignificant for the overall dataset, these false positives are strongly biased towards thunderstorm downbursts and gusts in otherwise calm periods (Cook, 2022). The deleterious effects of this test became apparent in Cook (2022) and affected the present study.

The automated procedures used here to identify and correct artefacts in the TD6405 data are fully described in (Cook, 2014b). They are principally intended to produce corrected synoptic-scale homogeneous datasets. When applied to this study, the automatic thresholds for speed and direction were found to cull some valid thunderstorms, especially in light winds where direction changes are very large and can reverse. The wind speed threshold was increased, and the direction artefact detection was disabled by setting an unreachably high threshold. This inevitably led to the retention of more artefacts in the data.

3.3. TD6406

Relevant data in each TD6406 file are atmospheric pressure from three independent sensors, dry bulb temperature, dewpoint and precipitation. The high precision of the pressure sensors, ±0.0001 in Hg, and their triplication are required for accurate setting of altimeters in landing aircraft. The temperatures are recorded and processed to 0.1 °F precision but are reported in integer °F increments. Only the precipitation type codes: NP = no precipitation, R = rain and S = snow, preceded with – or + for light/heavy, are documented (NCDC, 2006b) so the meanings of the other codes are unknown. Precipitation amount is measured by a tipping-bucket gauge in 0.01-inch increments which has no practical value at 1-min intervals, only when integrated over longer periods.

TD6406 contains the same kind of errors as TD6405, including the


Fig. 1. Locations of ASOS stations used. Crosses: Development set of 25 stations. Circles: Analysis set of 450 stations.

duplication problem but excluding bird-perching, and the same automated procedures were used to identify and correct. Curiously, the periods of duplication and of gaps in the data are often not contemporaneous with TD6405 and this leaves some gust events without corroborating TD6406 data.

4. Independent gust events

For this study, an “independent gust event” is defined as a 1-h period centred on the maximum gust and separated from other gust events by at least 30 min. The consensus from the earlier studies is that thunderstorm downbursts generally last less than 10 min, so this dead-time between events complies with the common Wind Engineering rule-of-thumb of 3 × timescale for effective statistical independence. Thirty minutes are not sufficient for independence of non-convective gusts in synoptic-scale windstorms, which require a longer separation.

Following the approach of Lombardo and Zickar (2019) and earlier studies, any observations that might have come from hurricanes were removed over the three-day period centred on the arrival of the hurricane eye into the relevant State. Owing to their rarity, hurricanes are assessed differently from the frequent synoptic and convective wind mechanisms (Vickery et al., 2009). This was relevant only to the coastal US States along the Gulf and Eastern Seaboard.

Following Chen and Lombardo (2020), all gust events >40 kn and separated by at least 30 min at each Development station were extracted to give a Development set of 9014 gust events. Initially, a recursive search was made of each record to find the next highest gust event, which was simple to implement and validate but very slow to execute. Execution time was shortened by a factor of ~60 by first extracting the much smaller set of local maxima and their times of occurrence and searching this. By excluding the period of each gust event from future searches, the recursion cycle became progressively quicker. This allowed an Analysis set of all gust events ≥20kn to be extracted from each of the 450 stations in Fig. 1, comprising >10⁷ events.

5. Datum classification by visual inspection

5.1. Gust event classes

All the earlier studies require a datum set of classified events for calibrating, tuning or training the automated classification method. There is consensus in these studies, e.g., (Canepa et al., 2020) (Choi, 1999), that downbursts from thunderstorms and gust fronts are identifiable by visual inspection of the meteorological record. The ASOS 1-min interval 3s gust data permit resolution of gust events persisting for ~3 min, or longer. This includes downbursts from isolated thunderstorms, downbursts embedded in the forward and rear flanks of fronts, and downbursts able to penetrate through the ABL to the surface from deep convection in moderately strong and steady winds. The aim here is to identify classes corresponding to the meteorological characteristics of these physical mechanisms.

The gust events were visually sorted into five valid Classes (1–5), plus one (6) to collect the characteristic “spike” artefacts, and another (0) to collect events that are unclassifiable due to multiple artefacts or instrumentation malfunctions. These are designated here as:

0. Unclassifiable – comprising events with large data gaps or with non-meteorological artefacts, e.g., instrumentation malfunctions.
1. Synoptic – comprising non-convective gusts generated by the ABL in synoptic-scale weather systems and near-neutral atmospheric stability. Steady, strong, locally stationary wind with low gust factor; little variation in direction and temperature; linear or no trend in atmospheric pressure; precipitation absent or continuous.
2. Storm-burst – comprising short-duration non-stationary events in otherwise strong steady winds, sometimes with discernible variation in temperature or pressure. May include downdrafts from deep convection which penetrate through the gust structure of the ABL.
3. Front-down – comprising convective downdrafts in the rear flank of active fronts, where the mean wind speed is decreasing from a higher steady value. Mean gust speeds after the peak are lower than before; direction veers or backs; temperature drops rapidly through the event; pressure increases temporarily when the downburst is directly over the station, otherwise the variation is that expected for the passage of a front.
4. Front-up – comprising convective downdrafts in the forward flank of active fronts, where the mean wind speed is increasing to a higher steady value. Mean gust speeds after the peak are higher than before; usually, but not always, associated with a change in mean wind direction; otherwise, like Front-down.
All the earlier studies require a datum set of classified events for


5. Thunderstorm – comprising downbursts from isolated thunderstorms, often in relatively light winds, where the initial wind speed and direction are restored after the event. Temporary sharp rise in gust speed over several minutes; temporary change in gust direction; sudden large drop in temperature; temporary increase of pressure when the downburst is directly over the station, otherwise slight rise due to temperature drop; often a sudden burst of heavy rain.
6. Spike – comprising single isolated instrumentation/transmission spikes, possibly including very short-duration surface-generated thermal events like dust-devils. Bird-generated gusts register as single spikes with no loss of data if the acoustic path of the sonics is blocked for less than 30s (Cook, 2022). Associated changes in indicated direction are not reliable as they come from the same instrument. There should be little variation in temperature or pressure. Multiple Spike artefacts within the event period are included in Class 0 to preserve the ensemble-averaged character of Class 6.

It is acknowledged that events caused by other meteorological mechanisms will occur, and that these need to be assigned to the defined Class with the best matching characteristics.

5.2. Visual classification

The visual classification process was semi-automated to cope with the large number of gust events. Events were displayed in sequence in the form of Fig. 2 which presents the speed, direction, pressure, temperature and precipitation traces together on the same chart. Classification was made by mouse click on the scale at the bottom, automatically advancing to the next event. The text above this figure is additional QC and de-bugging information.

Fig. 2 shows a very short-duration Thunderstorm downburst that, if assessed from the speed trace alone, might have been assigned to Spike.

• The gust speed, shown by the linked (black) circles, relative to the left-hand scale, rises suddenly to the peak value then falls quickly, recovering within around 7 min.
• There is a corresponding temporary variation of direction, shown by the un-linked (green) circles, relative to the right-hand scale.
• There is a sudden drop in the temperature anomaly (red line) coincident with the peak gust that is maintained over the following 30 min (scale ±10 °F).
• The pressure anomaly (blue line) rises through the peak gust and partially recovers after around 7 min (scale ±0.05in Hg).
• There are bursts of heavy rain (purple bars) for 11 min following the peak gust.

The suggested physics for this event is that the downburst was initiated just upwind of the station, so that the sudden increase in gust speed and drop in temperature correspond to the leading gust front and the later increase in pressure corresponds to the center of the downburst as it is advected past the station. This example illustrates the most critical classification between Class 5 Thunderstorm and Class 6 Spike for extreme-value analysis (EVA) where, ideally, all artefacts are found and removed, and no valid thunderstorms are lost in doing this.

5.3. Timeseries of classified gust events

All the 9014 gust events of the Development set were visually sorted into the Class with the nearest matching characteristics. Typical example classifiable events are shown in the left-hand column of Fig. 3 to illustrate the principal distinguishing characteristics of each Class. The middle column presents the ensemble average of each Class in the Development set and shows a consistent trend in gust speed of increasing sharpness with Class index. The frontal Class 3 & 4 speeds are distinctly skewed in opposing sense, as expected. Temperature and pressure also show a consistent trend, but in the order 6, 1, 2, etc. The right-hand column presents the ensemble average of each Class of all gusts ≥40kn in the Analysis set, as identified by the later Trivariate Neural Network Direct Traces (TNNDT) method.

Comparing the two sets of ensemble-averaged traces confirms that the distinctive ensemble-averaged shapes of each Class persist when the population increases from 9014 to >10⁷. Note that the peak gust of Synoptic must always emerge above the steady incident wind speed because the ensemble-average of the other values is always less. The ensemble-averaged values play no part in the visual classification of the individual events but are included here to illustrate that the differences between the resulting Classes are consistent.
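The event extraction described in Section 4 can be sketched in a few lines. This is a minimal illustration in Python, not the author's R implementation. The function names, the list-based input, the use of `None` for missing samples, and the reading of "separated by at least 30 min" as a peak-to-peak spacing are all assumptions made for the example. It follows the faster variant described in the text: collect the local maxima first, then accept them from highest to lowest while excluding anything too close to an already accepted event.

```python
def extract_gust_events(gust, threshold=40.0, min_sep=30):
    """Return indices of independent peak gusts, highest first.

    gust      : list of 1-min maximum gust speeds (kn); None marks a missing sample
    threshold : lowest peak gust accepted as an event (assumed inclusive)
    min_sep   : assumed minimum peak-to-peak spacing, in samples (minutes)
    """
    n = len(gust)
    candidates = []
    for i, g in enumerate(gust):
        if g is None or g < threshold:
            continue
        left = gust[i - 1] if i > 0 else None
        right = gust[i + 1] if i < n - 1 else None
        # Local maximum; a flat-topped peak yields a single candidate (its last sample)
        if (left is None or g >= left) and (right is None or g > right):
            candidates.append(i)

    # Visit candidates from the highest gust downwards
    candidates.sort(key=lambda i: gust[i], reverse=True)
    events = []
    for i in candidates:
        # Accept only if far enough from every event already accepted
        if all(abs(i - j) >= min_sep for j in events):
            events.append(i)
    return events


def event_window(gust, peak, half_width=30):
    """Return the 1-h trace centred on a peak, padded with None at the record ends."""
    return [gust[k] if 0 <= k < len(gust) else None
            for k in range(peak - half_width, peak + half_width + 1)]
```

Searching only the local maxima keeps the candidate list short, which is the reason the text gives for the large reduction in execution time compared with repeatedly scanning the whole record.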

6. Automated classification using only anemometric observations from TD6405

Most of the previously published classification methods used only anemometric data to visually classify the gust events used to calibrate/tune/train the method. The present study differs by using additional temperature/pressure/precipitation data and by including the frontal and storm-burst classes.
The shapes of the visually classified accumulated gust traces are
plotted together in Fig. 4, normalised to unity peak value, and show a
clear hierarchy of increasing sharpness. The challenge for statistical
methods, e.g., De Gaetano et al. (2014), is to define metrics that reliably
differentiate between these shapes.
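The peak-normalised class shapes described above can be reproduced with a simple ensemble average. The following Python sketch is illustrative only and is not taken from the paper's R scripts. It assumes each event is a list of equal length centred on the peak gust, that missing values are stored as `None`, and that `labels` holds the visually assigned Class index of each event. The function name is hypothetical.

```python
def class_ensemble_traces(traces, labels):
    """Average the traces of each class sample by sample, then scale to unity peak."""
    groups = {}
    for trace, c in zip(traces, labels):
        groups.setdefault(c, []).append(trace)

    shapes = {}
    for c, members in groups.items():
        mean_trace = []
        # zip(*members) walks through the events minute by minute
        for column in zip(*members):
            valid = [v for v in column if v is not None]
            mean_trace.append(sum(valid) / len(valid) if valid else float("nan"))
        # Normalise by the largest non-NaN value of the averaged trace
        peak = max(v for v in mean_trace if v == v)
        shapes[c] = [v / peak for v in mean_trace]
    return shapes
```

Normalising after averaging, rather than before, is a design choice of this sketch. The text does not state which order was used.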

6.1. Statistical metrics of the visually classified gust events

With only 9014 events in the Development set, unevenly divided between the Classes, it was impractical to evaluate probability distributions by conventional binning methods, so Kernel Density Estimation (KDE) was used.
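As an illustration of why KDE suits small, unevenly populated classes, the sketch below builds a Gaussian kernel density estimate for each Class from a list of metric values. It is a minimal Python example under stated assumptions: the Gaussian kernel and the normal-reference bandwidth rule are choices made here for illustration, since the text does not specify the kernel or bandwidth used, and the function names are hypothetical.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function x -> estimated probability density of the samples."""
    data = [float(s) for s in samples]
    n = len(data)
    if n < 2:
        raise ValueError("need at least two samples")
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not taken from the paper)
        bandwidth = 1.06 * statistics.stdev(data) * n ** (-0.2)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of one Gaussian bump per observation
        return norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)

    return density


def kde_by_class(values, labels, bandwidth=None):
    """Build one density estimate per class label."""
    groups = {}
    for v, c in zip(values, labels):
        groups.setdefault(c, []).append(v)
    return {c: gaussian_kde(g, bandwidth) for c, g in groups.items()}
```

Each observation contributes a smooth bump, so a class with only a few dozen events still yields a continuous curve that can be compared with the others, which a histogram with sparsely filled bins would not.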

Fig. 2. The visual classification chart for a Class 5 Thunderstorm at KDAL Dallas Love Field.

6.1.1. Metrics for gust speed

Fig. 5 assesses the KDEs of some of the conventional metrics for gust speed for their ability to discriminate between classes.


Fig. 3. Typical (left) and ensemble-averaged (center) visually classified gust events >40kn at the 25 Development stations, and (right) of all gust events >40kn at the
450 Analysis stations classified by TNNDT method.

• (a) and (b) present “gust intensity”, the standard deviation of the 3s mean. This is the reason Kasperski (2002) and De Gaetano et al.
gust divided by its 10-min and 1h datum mean, respectively. (2014) can classify T/NT, but not identify the frontal classes.
Although there is a left-right trend in the mode with Class, there is • (e) presents the speed trend, the change from the mean for the 30 min
considerable overlap between distributions. before the peak to the 30 min after the peak, divided by the mean for
• (c) and (d) present the conventional gust factor, peak divided by the hour centred on the peak. This shows a good separation between
mean, for the same datum means. There is considerable overlap in (c) Class 3: Front-down and Class 4: Front-up, with the other classes
using the 10-min mean, but better separation in (d) using the 1h clustered together in between.

6
N.J. Cook Journal of Wind Engineering & Industrial Aerodynamics 234 (2023) 105330

Fig. 3. (continued).

• (f) presents the gust “emergence”, defined as the peak gust divided by the mean of the 10 next-highest local peaks in the event. Like gust factor, it is a measure of sharpness of the central peak. This is a novel metric designed to overcome an issue with gust factor when applied to the frontal Classes 3 & 4.

Although gust factor appears to provide better separation than emergence, the datum mean for Front-down and Front-up lies somewhere between the high/low value before the central peak and the low/high value afterwards, so is representative of neither. Gust factor therefore tends to exaggerate frontal events, biasing the result. The proposed new emergence metric indicates how far the peak emerges above the envelope of its peers. Whether these occur before, after or either side of the peak is irrelevant, so it treats all the event classes fairly and without bias.

6.1.2. Metrics for gust direction

There are often clear variations in wind direction through a gust event but, as noted earlier, direction is not a reliable discriminator for Spike artefacts as it is produced by the same sonic anemometer as the speed. Fig. 6 presents the KDEs for veer: the difference between the peak and the incident mean, and trend: the change in 30-min mean from before to after the event. These KDEs are clustered around the origin and provide no useful discrimination, although Classes 3, 4 & 5 show an increasing asymmetry in trend. Gomes and Vickery (1978) commented that the average peak thunderstorm gust direction remains consistent with the approach mean direction, so this negative result was expected. Both Class 4: Front-up and Class 5: Thunderstorm direction trends show a slight bias which is too small to be useful.
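
The speed trend and emergence metrics of §6.1.1 can be illustrated with a minimal Python sketch. It is not the author's R code. It assumes each event is a list of 61 one-minute gust values with the peak at the centre sample, and it assumes a "local peak" is a sample strictly above its left neighbour and not below its right neighbour; the function names are hypothetical.

```python
from statistics import mean

def speed_trend(trace):
    """Change in mean speed from before to after the peak, over the event mean."""
    centre = len(trace) // 2               # assumed: peak gust is the centre sample
    before = mean(trace[:centre])          # the 30 min before the peak
    after = mean(trace[centre + 1:])       # the 30 min after the peak
    return (after - before) / mean(trace)  # normalised by the hour centred on the peak

def emergence(trace, n_peers=10):
    """Peak gust divided by the mean of the next-highest local peaks."""
    centre = len(trace) // 2
    # Assumed local-peak rule: strictly above the left neighbour, not below the right.
    peers = [trace[i] for i in range(1, len(trace) - 1)
             if i != centre and trace[i - 1] < trace[i] >= trace[i + 1]]
    peers = sorted(peers, reverse=True)[:n_peers]
    # With no peer peaks at all, the central peak is treated as infinitely emergent.
    return trace[centre] / mean(peers) if peers else float("inf")
```

Because emergence uses only the ranked peer peaks, it is unaffected by whether they fall before or after the central peak, which is the property the text relies on for the frontal classes.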


6.2. Comparison of automated classification methods using only anemographic data

This section compares the performance of the more successful current methods that use only anemographic data, then develops and assesses improvements. Here, each method is compared with the visual classification by a confusion matrix (CM) (Arul et al., 2022; Chen and Lombardo, 2020) in which correctly classified events lie on the diagonal and misclassified events lie off-diagonal. The overall error is indicated by the proportion of events that are misclassified.
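
As a small illustration of the overall error defined above, the following Python sketch computes the off-diagonal proportion of a square confusion matrix. The function name is hypothetical; the counts are those reported for the Highest Kernel Density method in Table 3 (rows are the visual Class, columns the assigned Class).

```python
def overall_error(cm):
    """Proportion of events lying off the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return (total - correct) / total

# Counts from Table 3 (Highest Kernel Density), Classes 0 to 6.
table3 = [
    [0, 9, 20, 18, 9, 6, 1],
    [0, 5517, 750, 79, 60, 0, 0],
    [0, 39, 640, 2, 0, 18, 1],
    [0, 14, 0, 238, 0, 0, 0],
    [0, 0, 2, 0, 549, 1, 0],
    [0, 43, 222, 76, 36, 160, 31],
    [0, 0, 20, 12, 9, 70, 362],
]
print(f"{100 * overall_error(table3):.1f}%")
```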

6.2.1. De Gaetano et al. method

The De Gaetano et al. (2014) method is a statistical method in which the principal metrics are the gust factors normalised by the 1-min, 10-min and the 60-min mean. These are compared against datum threshold values sequentially in a logic tree to achieve a binary T/NT choice. The method was applied to the Development set exactly as specified in De Gaetano et al. (2014) to produce the CMs of Table 1. Table 1(a) represents the binary T/NT thunderstorm/non-thunderstorm classification re-defined as thermal/non-thermal to accommodate the frontal classes, with Classes 3–5 combined into T and Classes 1 and 2 into NT as described in De Gaetano et al. (2014). This gives an overall classification error of 15.4%. As expected, the method fails to discriminate the intermediate classes, but the principal issue is the large number of Class 6 Spike misclassified as T, thunderstorm. Table 1(b) shows the distribution of the errors across the Classes.

Fig. 4. Normalised averaged gust events of the Development set for each visually identified class.

Fig. 5. Kernel Density Estimates of statistical metrics for gust speed.


Fig. 6. Kernel Density Estimates of statistical metrics for gust direction.

Table 1
Confusion matrices for the De Gaetano method.

(a) de Gaetano
Class NT T Error
0 8 55 100.0%
1 & 2 6928 178 2.5%
3, 4 & 5 672 700 49.0%
6 5 468 100.0%
Error 15.4%

(b) de Gaetano
Class NT T Error
0 8 55 100.0%
1 6334 72 1.1%
2 594 106 15.1%
3 151 101 59.9%
4 267 285 48.4%
5 254 314 44.7%
6 5 468 100.0%

6.2.2. Bivariate k-means method

The classical k-means method is a way of clustering data into k similar sets and assigning membership of any observed value to the set with the closest mean value. After first discounting Unclassifiable, for which the statistical metrics are meaningless, there remain k = 6 Classes requiring assignment of membership. In Fig. 7(a) the ellipses represent the one standard deviation boundary around the mean of each Class across the two-dimensional emergence-trend field. The ellipses are reasonably well separated, except that Class 2 lies inside Class 5. Assignment of each event to the closest mean of each Class produces the CM in Table 2 and a classification error of 18.2%. The principal concerns are the high rate of Spike misclassified as Thunderstorm and the need to separately identify Unclassifiable events.

Fig. 7. Bivariate k-means clusters and kernel density contours of gust “trend” and “emergence” metrics.

Table 2
Confusion matrix for Bivariate k-means method.

Bivariate k-means
Class 0 1 2 3 4 5 6
0 0 13 11 15 6 17 1
1 0 5606 675 83 15 27 0
2 0 44 581 2 2 71 0
3 0 13 0 222 0 17 0
4 0 0 78 0 454 20 0
5 0 36 184 82 38 224 4
6 0 1 11 30 18 122 291
Error 18.2%

6.2.3. Highest Kernel Density method

The two-dimensional KDEs of the gust emergence and trend metrics, evaluated² for each Class, are shown in Fig. 7(b) as contours on the emergence-trend plane. For each event the Highest Kernel Density (HKD) method selects the Class with the highest KDE. The selection was tuned by optimising the kernel bandwidths to give the lowest overall error, 17.2%, in the CM, Table 3. Implementation of the method was simplified by evaluating a look-up table in small increments (0.01) of emergence and trend, shown as a chart in Fig. 8. Events in the tail of a KDE will be misclassified into a neighbouring Class when its value falls below that of the neighbour. The HKD method can be viewed as making a sharp cut along the boundary curves of equal KDE value in Fig. 8.

As an incidental by-product of the method, the membership probability, F_i, of Class i for each event, e, may be obtained from the KDE, p, by:

$$F_i\{e\} = \frac{p_i\{e\}}{\sum_{j=1}^{6} p_j\{e\}} \qquad \text{Equation (1)}$$

Membership probability may be used as a weighting factor in any statistic that implements fuzzy logic, e.g., weighted moments, allowing events that fall outside the sharp boundaries to contribute partially to their parent Class.

² By kde2d() in the “MASS” package for R, using axis-aligned bivariate Normal kernels. Based on Venables WN and Ripley BD (2002) Modern Applied Statistics with S, Springer, pp510.

Table 3
Confusion matrix for Highest Kernel Density method.

Highest Kernel Density
Class 0 1 2 3 4 5 6
0 0 9 20 18 9 6 1
1 0 5517 750 79 60 0 0
2 0 39 640 2 0 18 1
3 0 14 0 238 0 0 0
4 0 0 2 0 549 1 0
5 0 43 222 76 36 160 31
6 0 0 20 12 9 70 362
Error 17.2%

Fig. 8. Look-up table for Highest Kernel Density method.

6.2.4. Emergence-Trend Neural Network method

In the Emergence-Trend Neural Network (ETNN) method, the neuralnet package for R was trained to predict the Class from the gust emergence and trend metrics, using disjoint training/testing data. The Development set was split in half in two ways: north/south halves, and east/west halves. Training used one half, then testing used the disjoint other half. The metric for optimising the network configuration was the average prediction accuracy over all disjoint combinations of the training/testing sets.

It can be assumed that each Class contains characteristic components plus a random “noise” component. Optimisation of the ETNN configuration started by increasing the number of neurons in a single-layer network, analogous to increasing the number of free parameters in a parametric fit which eventually leads to overfitting by including the random component in the fit. Misclassification error of the testing set decreases with increasing numbers of neurons up to the point where overfitting begins, after which it starts to rise again. With a single layer, the overall error of the testing set was minimised with 50 neurons. Extending to two- and three-layer networks produced only a marginal reduction in error, but with significantly shorter training times. The optimal configuration was found to be three layers of 4 neurons which, when applied to all visually classified events, produced the CM of Table 4 and an error of 19.7%.

6.2.5. Nearest normalised trace method

The Nearest Normalised Trace (NNT) method is essentially the shapelet method of Arul et al. (2022) using the normalised averaged gust events in Fig. 4 as the “mother” shapelets. Training was much faster than in Arul et al. (2022) as the shapelet search over the full records was not required. The timeseries of each event, normalised to unity at the peak, was compared with each of the timeseries in Fig. 4 and the Class with the minimum Euclidean distance was selected. The CM is in Table 5 and the error is 23.7%.

6.2.6. Neural Network Direct Traces method

The Neural Network Direct Traces (NNDT) method applied all 61 gust speed values of a gust event as inputs into neuralnet. This is essentially the method of Chen and Lombardo (2020) but using a less sophisticated NN, with much shorter training times. All NNs require the input data to be normalised to a common scale, so that all variables have approximately equal ranges, and produce a normalised output. This is simple for the Classes and the gust speed traces, which are individually normalised to the range (0,1), as in Fig. 4. Note that the outputs are real values that require de-normalising to the nearest integer Class. Training with the disjoint training/testing sets indicated that increasing the number of nodes to 28 in three hidden layers of 16, 8 and 4, respectively, was the first configuration to include estimates for Class 0 before the testing set error started to rise due to overfitting, as indicated in Fig. 9. When applied to all the visually classified events, NNDT produced the CM in Table 6 and a misclassification error of 10.0%.
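
Returning to the HKD method of §6.2.3, the selection rule and the membership probability of Equation (1) can be sketched in a few lines of Python. This is an illustrative stand-in, not the kde2d() implementation used in the paper: it assumes a plain product of Normal kernels with user-supplied bandwidths, and the function names, bandwidth values and toy training points are all hypothetical.

```python
import math

def kde2d_at(points, x, y, hx, hy):
    """Axis-aligned bivariate Normal kernel density estimate at (x, y)."""
    norm = 1.0 / (2.0 * math.pi * hx * hy * len(points))
    return norm * sum(
        math.exp(-0.5 * (((x - px) / hx) ** 2 + ((y - py) / hy) ** 2))
        for px, py in points
    )

def hkd_classify(train, event, hx=0.1, hy=0.1):
    """Return the Class with the highest density and the Equation (1) memberships.

    train maps each Class to its (emergence, trend) training points.
    """
    dens = {c: kde2d_at(pts, event[0], event[1], hx, hy) for c, pts in train.items()}
    total = sum(dens.values())
    # Equation (1): each density divided by the sum over all Classes.
    membership = {c: (d / total if total > 0 else 0.0) for c, d in dens.items()}
    return max(dens, key=dens.get), membership

# Toy (emergence, trend) points for two Classes, for illustration only.
toy = {
    1: [(1.20, 0.00), (1.30, 0.05), (1.25, -0.05)],
    5: [(2.50, 0.00), (2.70, 0.10), (2.60, -0.10)],
}
best, probs = hkd_classify(toy, (1.25, 0.0))
```

The memberships sum to one by construction, so they can serve directly as fuzzy weights, whereas the sharp HKD choice keeps only the largest.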


Table 4
Confusion matrix for Emergence-Trend Neural Network method.

Emergence-Trend Neural Network
Class 0 1 2 3 4 5 6
0 0 9 16 21 7 9 1
1 0 5653 648 65 40 0 0
2 0 230 424 26 2 18 0
3 0 10 23 198 1 20 0
4 0 0 7 23 496 26 0
5 0 46 197 99 39 184 3
6 0 0 15 17 4 156 281
Error 19.7%

6.2.7. Comparison of results

The five methods tested in this study are compared in Table 7 for effectiveness based on minimum overall classification error. Without the benefit of the additional temperature/pressure/precipitation data used in the visual classification, they are all relatively poor at resolving Thunderstorm that looks like Spike and vice-versa. The worst performer is the NNT method, which is the shapelet transform method of Arul et al. (2022) operating with only one mother shapelet for each class. As each individual event differs in shape from the ensemble average, Arul et al. (2022) required 32 mother shapelets to make the binary T/NT choice, so this result comes as no surprise.
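
The NNT selection rule of §6.2.5 is simple enough to state as code. The Python sketch below assumes each Class has a single template trace already normalised to unity at its peak and of the same length as the event; the names and the three-sample toy traces are hypothetical.

```python
import math

def normalise(trace):
    """Scale a gust trace to unity at its peak."""
    peak = max(trace)
    return [v / peak for v in trace]

def nearest_trace(event, templates):
    """Return the Class whose template is closest in Euclidean distance.

    templates maps each Class to one peak-normalised trace of the event's length.
    """
    e = normalise(event)
    return min(templates, key=lambda c: math.dist(e, templates[c]))

# Toy three-sample templates, for illustration only.
toy_templates = {1: [0.8, 1.0, 0.8], 6: [0.1, 1.0, 0.1]}
```

With only one template per Class, any event whose shape departs from the ensemble average can sit closer to a neighbouring template, which is consistent with the high error reported for this method.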


Table 5
Confusion matrix for Nearest Normalised Trace method.

Nearest Normalised Trace
Class 0 1 2 3 4 5 6
0 0 2 17 6 5 21 12
1 0 5365 975 29 23 14 0
2 0 214 387 10 17 63 9
3 0 29 34 117 1 54 17
4 0 1 77 0 373 72 29
5 0 21 92 54 48 250 103
6 0 0 15 16 3 54 385
Error 23.7%

Fig. 9. Training and testing errors for Neural Network Direct Traces method: three hidden layer network with n, n/2 and n/4 nodes.

Table 6
Confusion matrix for Neural Network Direct Traces method.

Neural Network Direct Traces
Class 0 1 2 3 4 5 6
0 11 23 3 3 0 1 0
1 1 6715 35 20 2 1 0
2 0 202 107 26 30 5 0
3 0 15 2 181 24 11 0
4 0 7 3 45 386 110 0
5 0 36 3 25 165 334 11
6 0 1 0 1 3 83 383
Error 10.0%

Table 7
Comparison of method overall error.

Input data / Method / Error
Emergence and trend / Bivariate k-means method / 18.2%
Emergence and trend / Highest Kernel Density / 17.2%
Emergence and trend / Emergence-Trend Neural Network / 19.7%
Directly from traces / Nearest Normalised Traces / 23.7%
Directly from traces / Neural Network Direct Traces / 10.0%

NNDT gives the least misclassification error. All the other methods are unable to recognise Class 0 because there are no valid statistical metrics or characteristic ensemble-averaged traces for this heterogeneous class of artefacts, although only a tenth are classified correctly. In effect, NNDT designates Class 0 as not fitting any of the other Classes. Misclassification of Class 5 Thunderstorm into all the other Classes is a major defect, as is the poor discrimination between Thunderstorm and Spike which is critical for EVA.

Of the other methods using only TD6405 observations, the HKD method of §6.2.3 is marginally the best. It is relatively simple and has the additional benefit of providing the class membership probabilities to implement fuzzy logic. Its application is further simplified by means of the look-up table, Fig. 8.

It must be concluded that none of the five assessed methods that used only the TD6405 gust speeds is sufficiently reliable to classify ASOS gust events for EVA by individual Class.

7. Automated classification using TD6405 and TD6406 observations

The ensemble-averaged traces of temperature and pressure in Fig. 3 also show a clear hierarchy. Gradual synoptic-scale changes were removed by subtracting the mean values over the 60-min duration of each event to give the corresponding event anomaly. The temperature at the surface is expected to drop suddenly in a convective downburst, as exploited in Vallis et al. (2019), but this will also occur with the passage of a cold front. Transient pressures in downbursts are significant only when the downburst is directly over the station, otherwise they are dwarfed by the changes associated with fronts. The apparent high resolution in Fig. 3 is achieved statistically by averaging the large population of events but only the coarse 1 °F resolution of temperature in TD6406 is available in any single event.

7.1. Statistical metrics of the temperature and pressure anomalies

In interpreting the metrics of the temperature and pressure anomalies and assuming the occurrences of Spike to be random in time or from bird-perching events at low wind speeds, the expectation is for Spike to be like Synoptic because the rate of Synoptic events is so high compared with convective events.

7.1.1. Metrics for dry-bulb temperature anomaly

In the KDEs of dry-bulb temperature shown in Fig. 10:

(a) Presents the same trend parameter as for the gust speed. This indicates narrow distributions centred on the origin for Synoptic and Spike, broad distributions for a drop in temperature for Thunderstorm and Front-up, and transitional distributions for Storm-burst and Front-down.
(b) Presents the range of first differences: the maximum positive change minus the maximum negative change between successive observations (hence only positive values). The narrow Synoptic peak to the left indicates the small range which is characteristic of adiabatic conditions. Thunderstorm and Front-up indicate rapid changes, while Front-down is transitional.

It is apparent in the typical examples of Fig. 3 that the drop in temperature tends to precede the peak gust for the frontal classes, 3 & 4, but coincides with the peak for Class 5 Thunderstorm, and this was seen to occur quite frequently during the visual classification. As the shape of the temperature drop approximates to a linear ramp down from a steady high to a steady low value, this characteristic “escarpment” shape was fitted, by optimisation, to each event trace to obtain: (c) the span (the range) and (d) the slope (the rate of change).

The two principal findings are: 1) that Thunderstorm and Spike, which overlapped in the gust speed KDEs, are now very well separated; and 2) that Thunderstorm and Front-up are almost identical, which does not help resolve their overlap in gust trend.

7.1.2. Metrics for atmospheric pressure anomaly

In the KDEs of the pressure anomaly shown in Fig. 11: (a) presents the trend parameter; and (b) the first difference range, complementary to Fig. 10(a) and (b). Both these charts show strong overlaps between Classes, so pressure is clearly less useful than temperature, but (b) does give good isolation of Spike.
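
The event anomaly of §7 and the first-difference range of §7.1.1 can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the author's code: the range is taken as the largest successive change minus the smallest (most negative) successive change, and the function names and the sample temperatures are hypothetical.

```python
from statistics import mean

def anomaly(trace):
    """Subtract the event mean to remove gradual synoptic-scale change."""
    m = mean(trace)
    return [v - m for v in trace]

def first_difference_range(trace):
    """Largest successive change minus the most negative one (never negative)."""
    diffs = [b - a for a, b in zip(trace, trace[1:])]
    return max(diffs) - min(diffs)

# Hypothetical dry-bulb temperatures (deg F) showing a sudden drop.
temps = [70, 70, 69, 65, 64, 64]
print(first_difference_range(temps))
```

Subtracting a constant does not alter successive differences, so the range is the same whether it is computed on the raw trace or on its anomaly.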


Fig. 10. Kernel Density Estimates of statistical metrics for temperature anomaly.

Fig. 11. Kernel Density Estimates of statistical metrics for pressure anomaly.

7.2. Development of classification methods

7.2.1. Bivariate Highest Kernel Density method

The obvious first extension of HKD is to add a temperature metric to evaluate³ a 3-dimensional KDE and this was calibrated for both trend and span metrics, in Table 8. Although the span metric quantifies the temperature drop wherever it may occur in the event trace, it gives the larger error. The trend metric quantifies the temperature change on either side of the gust peak and gives a small reduction in overall error from the HKD method. This improvement is mostly in the Thunderstorm-Spike resolution where now only 3 out of 775 (0.4%) are misclassified.

³ Evaluated by the kde3d function in the R package “misc3d” by Dai Feng and Luke Tierney.

Table 8
Confusion matrices for Bivariate Highest Kernel Density method.

(a) Bivariate KDE: Temperature trend with gust speed trend and emergence
Class 0 1 2 3 4 5 6
0 26 14 0 0 0 0 0
1 105 6469 45 39 26 46 0
2 48 247 56 4 2 7 0
3 57 39 32 70 0 28 0
4 64 20 89 1 301 66 0
5 26 4 33 29 17 459 1
6 65 19 47 6 0 2 313
Error 13.8%

(b) Bivariate KDE: Temperature span with gust speed trend and emergence
Class 0 1 2 3 4 5 6
0 34 6 0 0 0 0 0
1 434 6093 1 6 31 166 0
2 63 243 8 0 7 43 0
3 57 50 7 43 0 69 0
4 130 25 38 0 322 25 0
5 43 49 23 16 46 391 1
6 54 24 50 9 0 9 313
Error 19.3%

7.2.2. Trivariate Neural Network Direct Traces method

The Trivariate Neural Network Direct Traces (TNNDT) method appends the temperature and pressure anomaly values to the gust speed to give 183 inputs into neuralnet. Adding the temperature and pressure anomaly traces required normalisations that preserved the relative scale of each individual trace yet approximated to the (0,1) range, on average, for all traces of the Development set.

The Classes hierarchy is not a continuous sequence because the speed traces for Classes 3 and 4 in Fig. 3 are an antisymmetric pair. It is better denoted by the two separate continuous sequences: 1 → 2 → 3 → 5 and 1 → 2 → 4 → 5, by virtue of their opposite trends. In the Class normalisation of the earlier NNDT method of §6.2.6, any event that is half-way transitional between Classes 2 and 4 was misclassified as 3, and any transitional between 2 and 5 was misclassified as 3 or 4. This is the primary reason for the off-diagonal spread around Classes 3 and 4 in Table 6.

Fig. 12 defines a two-parameter normalisation which resolves this issue, where the two parameters are analogous to Emergence and Trend for the metrics in §6.2.2–3. The arrow indicates that events which de-normalise into this bin are assigned to Class 5, i.e., are thermal with no significant trend. Training the NN on the Development set, using the same 32, 16 and 8 neuron configurations, produced the CM in Table 9 and a misclassification error of 2.6%. After merging the Classes into thermal, non-thermal or artefact (T/NT/A) the error reduces to 0.7%. Compared with NNDT, the error is smaller by a factor of four, with most of the residual misclassifications between Classes 1 and 2, i.e., nuances of NT.

Table 9
Confusion matrix for Trivariate Neural Network Direct Traces method.

Trivariate Neural Network Direct Traces
Class 0 1 2 3 4 5 6
0 27 13 0 0 0 0 1
1 2 6732 29 5 5 1 0
2 0 132 230 0 2 5 1
3 0 3 1 228 0 1 0
4 0 2 1 1 544 3 0
5 0 0 1 2 3 554 14
6 0 0 0 0 1 7 463
Error 2.6%

7.3. Impact of misclassifications on EVA by XIMIS

Fig. 13 presents the number of visually classified Unclassifiable or Spike artefacts ≥40kn for each Development station. The five unreliable “stress-test” stations account for more than half the total number of artefacts. The TNNDT method error rate for these is 2.9% while the error rate for the remaining 20 more reliable stations is 2.6%. This indicates that TNNDT works almost as well on stations with many artefacts as it does on typical reliable stations.

Fig. 12. Normalisation avoiding misclassification of transitional events.

Fig. 13. Distribution of artefacts by Development station.

Misclassification of artefacts raises two principal concerns in EVA.

1. Class 0 or 6 artefacts misclassified as valid events will contaminate analyses with invalid data.
2. Valid events misclassified as artefacts result in missing data in addition to those valid events culled by the real-time QC “Test 10” (Cook, 2014b; Cook, 2022).

The consequence of each misclassified event depends on its value rank in the events for its station. In classical EVA of annual maxima, the annual probability of non-exceedance, P, represented by the reduced variate, y = −ln(−ln(P)), is estimated from the value ranks of the annual maxima (Gumbel, 1958; Castillo, 1988). Here the lowest value of interest is taken to be the smallest annual maximum expected in the R ≈ 20 years of this study which, using the unbiased Gringorten (1963) estimator for P, evaluates to y = −1.28. The gust event speeds are not conventional epoch maxima but are peak-over-threshold (POT) values which, if statistically independent, should follow the Poisson recurrence model for which the Harris (2009) XIMIS method of EVA is the most appropriate. In XIMIS the estimates of y, the mean plotting position for each event, are given by the recursive formula:

$$y_{m+1} = y_m - \frac{1}{m} \quad \text{with} \quad y_{m=1} = \gamma + \ln R \qquad \text{Equation (2)}$$

where m is the value rank in descending order, γ = 0.5772… (Euler’s constant) and R is the length of the observational record in years. The variances of y are given by the recursive formula:

$$\sigma^2_{m+1} = \sigma^2_m - \frac{1}{m^2} \quad \text{with} \quad \sigma^2_{m=1} = \frac{\pi^2}{6} \qquad \text{Equation (3)}$$

the reciprocals of which are the fitting weights for an unbiased least-mean-squares fit.

For an R = 20-year record, Equation (2) predicts that the smallest value of interest corresponds to the 72nd highest and the annual mode (y = 0) corresponds to the 20th highest gust event at each station. It is convenient here to refer to values higher than the mode as lying in the “upper tail” and those lower as in the “lower tail”.

Of the 21 artefacts misclassified as valid events, 10 rank within the range of interest but none lie in the upper tail. Of the 17 valid events misclassified as artefacts, 14 are Thunderstorm, 2 are Synoptic and 1 is Storm-burst. Of these, 5 are in the range of interest but only 2 lie in the upper tail. Both were visually classified as Thunderstorm and their traces are presented in Fig. 14. In (a) the transient temperature drop is relatively modest and there are no direction, pressure or precipitation transients. (Note that the later XTNNDT classification is correct.) In (b) there is a modest transient temperature rise, indicating a “warm” event as discussed later in §8.2.3, a large pressure transient and a 360° rotation of direction, all coincident with onset of heavy rain, clearly indicating a valid event.⁴

Fig. 14. Thunderstorm events in the upper tail misclassified by TNNDT as Spike.
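
The recursions of Equations (2) and (3) are easy to check numerically. The Python sketch below (function name hypothetical) tabulates the mean plotting position and its variance by rank and confirms, for R = 20 years, the two ranks quoted in the text.

```python
import math

EULER_GAMMA = 0.5772156649015329

def ximis_positions(n_events, record_years):
    """Return (rank, y, variance) for ranks 1..n_events via Equations (2) and (3)."""
    y = EULER_GAMMA + math.log(record_years)   # y for the highest event, m = 1
    var = math.pi ** 2 / 6                     # variance for m = 1
    rows = []
    for m in range(1, n_events + 1):
        rows.append((m, y, var))
        y -= 1.0 / m            # Equation (2)
        var -= 1.0 / m ** 2     # Equation (3)
    return rows

rows = ximis_positions(80, 20)
y20 = rows[19][1]   # 20th highest event, close to the annual mode y = 0
y72 = rows[71][1]   # 72nd highest event, close to y = -1.28
weights = [1.0 / var for _, _, var in rows]   # fitting weights, 1 / variance
```

The variance shrinks with rank, so the weights grow: the lower-ranked events are better determined and dominate the fit, while the few highest events carry the least weight.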
A preliminary, simplified, XIMIS (Harris, 2009) analysis for the distribution of annual maximum gust speed was made to assess the impact on the 50-year return gust speed, V50. The simplifications included taking no account of serial correlation, nor of asymptotic convergence, and merging the five valid Classes into the binary T/NT classes. The analyses were performed for each Development station assuming the classical Fisher-Tippett type 1 model (Gumbel, 1958; Castillo, 1988) was applicable to both T/NT classes:

$$y = (V - U)/b \qquad \text{Equation (4)}$$

where y = −ln(−ln(P)), V is gust speed, U is the mode and b the dispersion. In each case an unbiased fit was made by the weighted method of least squares, using y from Equation (2) with 1/σ² from Equation (3) as the fitting weights. Then the separate T and NT distributions were combined into a joint mixture model following the original approach of Gomes and Vickery (1978):

$$P_{T.NT} = P_T \times P_{NT} \qquad \text{Equation (5)}$$

$$y_{T.NT} = -\ln\left(-\ln\left(P_T \times P_{NT}\right)\right) = -\ln\left(e^{-y_T} + e^{-y_{NT}}\right) \qquad \text{Equation (6)}$$

The three analysis steps are illustrated for KORD, Chicago O’Hare, IL.

1. The NT classes, Fig. 15, show that the highest 63 values in the Visual and TNNDT sets are identical, leading to identical estimates of V50 to 0.1kn tolerance. The simplified confidence limits correspond to ±σ from Equation (3) and, in this case, all values lie within these limits.
2. The T classes analysis, Fig. 16, shows the 3rd-highest Visual value is missing from the TNNDT values, resulting in a 1.8kn underestimate of V50.
3. The resulting T/NT mixture model, Fig. 17, resolves as a curve which is asymptotic to NT in the lower tail and T in the upper tail. The predicted values of V50 are slightly higher than the T values due to the small additional contribution from NT.

Fig. 15. XIMIS analyses of NT classes (Classes 1 & 2) for KORD, Chicago O’Hare, IL.

Fig. 16. XIMIS analyses of T classes (Classes 3, 4 & 5) for KORD, Chicago O’Hare, IL.
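
The fit of Equation (4) and the mixture of Equations (5) and (6) can be sketched in Python as below. This is an illustration only: it assumes the speeds are regressed on y with the 1/σ² weights, it solves the mixture for speed by bisection, and the function names, the target y value and the mode/dispersion pairs are hypothetical rather than values from the paper.

```python
import math

def weighted_line_fit(y, v, w):
    """Weighted least-squares fit of V = U + b*y; returns (U, b)."""
    sw = sum(w)
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    vbar = sum(wi * vi for wi, vi in zip(w, v)) / sw
    b = (sum(wi * (yi - ybar) * (vi - vbar) for wi, yi, vi in zip(w, y, v))
         / sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, y)))
    return vbar - b * ybar, b

def gumbel_y(v, mode, dispersion):
    """Equation (4): reduced variate for gust speed v."""
    return (v - mode) / dispersion

def mixture_y(v, t, nt):
    """Equation (6): reduced variate of the joint T/NT mixture at speed v."""
    return -math.log(math.exp(-gumbel_y(v, *t)) + math.exp(-gumbel_y(v, *nt)))

def speed_at(y_target, t, nt, lo=0.0, hi=300.0):
    """Bisection for the speed at which the mixture reaches y_target."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if mixture_y(mid, t, nt) < y_target:
            lo = mid   # mixture_y increases with speed, so move up
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical (mode, dispersion) pairs in knots for the T and NT classes.
t_params, nt_params = (45.0, 6.0), (42.0, 4.0)
v_mix = speed_at(3.9, t_params, nt_params)
```

At any given y the mixture speed exceeds both single-class speeds, which matches step 3 above: the mixture estimate sits slightly above the T estimate because of the small NT contribution.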
⁴ SPC reports concurrent wind damage in nearby Irving. Some hours later, tornados seen in nearby Burleson county.

Table 10 lists the estimates of the 50-year return gust speed, V50, from each step in the simplified XIMIS analysis for all Development stations. In general, missing valid values in the upper tail, y > 0, produce underestimates and surviving artefacts produce overestimates. The principal misclassification anomaly for the 6 largest discrepancies

between Visual and TNNDT are annotated on the right, the overall standard errors are given below the table. The standard error of the T/NT mixture is reduced to less than the ±0.5kn resolution of the observations by correcting just the 6 largest anomalies. This suggests that subjecting all gust events y > 1 to a visual check would be sufficient mitigation, i.e., checking all gusts greater in value than the 10th-highest TNNDT gust value at each station. This represents a smaller sub-set than the datum visual classification of §5, corresponding to around 0.05% of the 450-station Analysis set.

Fig. 17. XIMIS T/NT mixture model for KORD, Chicago O’Hare, IL.

8. Classified gust events at the analysis stations

8.1. TNNDT classification and validation

R scripts were coded to automate the download of the TD6405 and TD6406 observations from any ASOS station, to detect and correct/remove artefacts, extract all gust events above a specified threshold speed, and classify by the TNNDT method. These scripts follow four key principles.

1. A central database keeps track of the ASOS stations to be processed and their status.
2. Processing runs fully automatically, in stages, under the control of master scripts.
3. Processing may be interrupted and re-started at any time without loss of completed stages.
4. Multiple instances of R can be run on the same PC, or on multiple PCs with network access to the central database.

Principle 3 will allow incremental updating as future observations are added to the TD6405 and TD6406 databases.

The scripts were validated by extracting and classifying the events ≥40kn for each of the 25 Development stations and comparing the results with the earlier analysis. The scripts were then run to extract and classify gust events ≥20kn for the 450 Analysis stations indicated in Fig. 1.

Events in the upper tail at all 450 Analysis stations were validated by repeating the visual classification process of §5, starting with the highest ranked gust and working downwards in gust speed until 20 valid events had been found and checked. This was double the 10 highest events suggested in §7.3.4 to be sufficient but provided a similar number of events (9646) to be visually classified as for the Development set (9014). Particularly unreliable stations required many checks to find the highest 20 valid events: 267 at KSBA Santa Barbara, CA, 196 at KSMX Santa Maria, CA, 118 at KACV Arcata, CA and 109 at KCAK Akron, OH.

As noted in §5.3, the ensemble-mean traces of all Analysis set events ≥40kn for the TNNDT Classes in the right-hand column of Fig. 3 preserve the same characteristic shapes as the Development set. This suggests that the 25 Development stations, distributed across CONUS, were sufficient to train TNNDT. However, the CM for TNNDT for the visually classified upper tail of the Analysis set, Table 11, reveals a much larger overall error (21.3%) than expected. Much of this is due to multiple Spike events visually classified as Class 0 being assigned as Class 6, i.e., TNNDT recognises the central Spike and discounts the others. Merging Classes 0 and 6 reduces the error to 13.8%. Merging the Classes into thermal, non-thermal or artefact (T/NT/A) further reduces the error to 8.9%. This reinforces the desirability for visual classification of events in

Table 10
Estimates of 50-year return gust speed by XIMIS for all Development stations.
V50 (kn) T = Classes 3, 4 & 5 NT = Classes 1 & 2 T/NT mixture

Kcode Visual TNNDT Error Visual TNNDT Error Visual TNNDT Error
KABQ 71.7 71.7 0 68.1 68.2 0.1 73.2 73.3 0.1
KBUF 70.4 69.1 − 1.3 67.8 67.8 0 72.3 71.5 − 0.8 Missing T rank 6
KBWI 59.5 59.6 0.1 57.9 57.9 0 61.1 61.2 0.1
KDAL 64.3 64.0 − 0.3 55.3 55.3 0 64.6 64.4 − 0.2
KDCA 64.8 63.5 − 1.3 59.4 59.4 0 65.5 64.5 − 1
KDEN 70.4 70.3 − 0.1 64.9 64.9 0 71.2 71.1 − 0.1
KDFW 75.9 74.6 − 1.3 56.7 56.7 0 75.9 74.6 − 1.3 Missing T rank 3
KGFK 71.5 72.0 0.5 62.3 62.8 0.5 71.8 72.4 0.6
KGTF 69.5 68.7 − 0.8 68.2 68.1 − 0.1 71.8 71.3 − 0.5
KIAD 60.8 65.1 4.3 62.1 62.4 0.3 64.2 66.9 2.7 Outlier T rank 1
KICT 78.4 77.6 − 0.8 63.4 63.8 0.4 78.4 77.7 − 0.7 Missing T rank 5
KILM 67.2 67.8 0.6 54.3 54.3 0 67.3 67.9 0.6
KJAN 62.8 62.2 − 0.6 61.0 61.0 0 65.5 65.4 − 0.1
KLAS 62.8 64.4 1.6 61.5 61.5 0 65.1 66.1 1
KLAX NA NA NA 46.6 46.6 0 46.6 46.6 0
KMDW 77.8 78.2 0.4 62.6 62.6 0 77.9 78.2 0.3
KORD 70.5 68.6 − 1.9 59.9 59.9 0 70.7 68.9 − 1.8 Missing T rank 3
KPNS 64.6 64.6 0 71.0 71.0 0 72.7 72.7 0
KPRB NA NA NA 52.9 52.9 0 52.9 52.9 0
KPWM 52.3 51.6 − 0.7 63.4 64.3 0.9 63.5 64.4 0.9 Outlier NT rank 1
KRFD 70.7 70.7 0 57.7 57.6 − 0.1 70.8 70.8 0
KSEA 50.8 50.8 0 60.0 59.9 − 0.1 60.2 60.1 − 0.1
KSUX 70.6 70.6 0 65.6 65.6 0 71.7 71.7 0
KTPA 50.4 50.4 0 63.2 63.2 0 63.3 63.3 0
KWMC 62.9 63.2 0.3 60.7 60.6 − 0.1 64.7 64.8 0.1
Uncorrected std error 1.22 0.23 0.85
Corrected ranks 1 to 6 std error 0.53 0.11 0.36


the upper tail to correct all these misclassifications. It also suggests that re-training TNNDT weighted to favour the upper tail would be a beneficial precursor to EVA because misclassifications in the lower tail are not addressed.

Table 11
Confusion matrix for TNNDT method applied to upper tail (y > 0) of Analysis set.

         TNNDT Class
Class      0     1     2     3     4     5     6
0         25    66    17    24    25    60   722
1         12  2884    36     8    13    21     8
2         16    12   193    31    43    49    12
3          2     4     6   194     0    31     2
4          2    22    27     1   772    91     4
5         21    51    70   103   178  1397   193
6          0     3     6     1     3    18  1971
Error: 21.3%

8.2. XTNNDT re-training and classification

TNNDT was re-trained using the visually classified events in the upper tail of all Analysis stations and designated as XTNNDT to distinguish it from the original. Then the visual classification was repeated to confirm/correct all remaining mismatches. Multiple Spike events, previously assigned to Class 0 to avoid contaminating the ensemble means in Fig. 3, were re-assigned to Class 6 when indicated by XTNNDT. Suspicious events were cross-checked by reference to the NOAA Storm Prediction Center (SPC) Storm Report archive at: https://www.spc.noaa.gov/exper/archive/ and to the interactive data display provided by “Weather Spark” at: https://weatherspark.com/. Some of these events are highlighted as “curiosities” in §8.3, below. Otherwise, most corrections were to events transitional half-way between Classes. The resulting CM, Table 12, shows the error reduced to 2.4% for the individual Class and 1.7% for T/NT/A, comparable with the error rates for the Development stations, previously.

The events >40kn of the Development set contain ten times more NT than T, whereas the upper tails of the Analysis set contain slightly more T than NT. So, the principal difference between TNNDT and XTNNDT is due to the different weighting of each class in the training process. As the larger number of stations introduced many more Unclassifiable events, XTNNDT was better able to distinguish these after training. Consequently, XTNNDT is weighted to give more accuracy to the extremes and artefacts, while TNNDT is weighted more into the body of the distributions. The CM for Class and XClass estimates shows that 8.7% differ, but only 3.4% differ by more than one class, indicating that more than half of the differences are due to reassignment of transitional events between adjacent classes. Note that the real-valued class predictions by TNNDT are analogous to the fuzzy probabilities of Equation (1) in that a value of, say, 4.6 (which rounds to Class 5) could be interpreted as F4 = 0.4 and F5 = 0.6.

Table 12
Confusion matrix for XTNNDT method applied to upper tail (y > 0) of Analysis set.

         XTNNDT Class
Class      0     1     2     3     4     5     6
0        871    49     0     0     0     1     1
1         10  3026     5     0     1     2     0
2          0    17   311     0     4     5     0
3          0     0     3   199     1    17     2
4          0     0     7     0   891     7     1
5          1     0     3     1    14  1982    70
6          0     0     0     0     0     7  2025
Error: 2.4%

8.3. Some curiosities revealed in the analysis set

The figures illustrating this section show the time-series traces of 3s gust speed, 2 min direction, dry-bulb temperature and pressure. The speed and direction scales are shown on the left and right axes, respectively. Temperature and pressure are shown as anomalies (difference from mean) over fixed ranges of ±10 °F and ±0.05 in Hg, respectively. The TNNDT and XTNNDT class estimates are given to one decimal place to indicate when events were transitional between classes. The “Rank” values in the heading correspond to all the events, including artefacts.

8.3.1. The Thunderstorm-Spike dilemma

This classification choice is critical because Thunderstorm contributes the highest gust speeds while Spike is discounted from the data. Fig. 18(a) shows a Thunderstorm downburst with typical characteristics which TNNDT and XTNNDT classify as Spike due to its short duration, while (b) shows a Spike which TNNDT and XTNNDT classify as Thunderstorm. This dilemma, noted in Chen and Lombardo (2020), can only be resolved by reference to the SPC archive or by visually checking events in the upper tail.

Fig. 18. Examples of the Thunderstorm-Spike dilemma.

8.3.2. Calibration sequence

A calibration sequence 0→51→76→0, Fig. 19(a), or in reversed order (b), occasionally appears in the early cup-anemometer data at some stations. This seems to be manually controlled because the duration of each step varies and, presumably, the calibration is included when the operator fails to suppress the data feed. These events always classify as Spike.

Fig. 19. Examples of the embedded calibration sequence.

8.3.3. “Warm” events

In the §5.1 definition of Spike it was expected that short-duration surface-generated thermal events would be included in Class 6. They would be expected to show a transient warming, but the events in Fig. 20(a) and (b) also show the transient pressure rise associated with downbursts and are reported by a SPECI as squalls. A squall report is not an independent indicator of a valid event because it is defined from the observed speed itself.⁵ Event (b) corresponds to a severe tornado day across Illinois and Missouri, and SPC reports “considerable damage to … buildings on west side of airport”. Whatever the mechanism, these warm events are clearly very significant for Wind Engineering so those in the upper tail have been reassigned from Class 6 to Class 5. Also see Fig. 14(b) where there is a 360° clockwise rotation of direction. On the other hand, the “warm” events in Fig. 20(c) and (d) are likely to have been caused by the passage of active warm fronts, which are correctly identified as Class 5 by TNNDT and XTNNDT. “Warm” events represent 7.3% of the Thunderstorm events in the upper tails.

8.3.4. Stable temperature events

Unfortunately, not all Thunderstorm events misclassified as Spike show significant anomalies of temperature or pressure (Vallis et al., 2019). The event in Fig. 21(a), which was preceded by a transient drop in pressure, is confirmed by SPC as associated with tornadic activity, while (b), which shows no change of temperature or pressure, is reported by SPECI as a squall.⁵ Note the 360° clockwise rotation of direction in (b). Events of this kind were also assigned to Class 5.

⁵ “A strong wind characterized by a sudden onset in which the wind speed increases at least 16 knots and is sustained at 22 knots or more for at least 1 min” – NWS Glossary.

8.3.5. Gust events culled by QC Test 10

Fig. 22 shows two examples of gust events culled by QC “Test 10” in which the peak gust values that should coincide with the rapid drop in temperature have been culled. The event in (a) is classed as a 63kn Thunderstorm with an apparently lowly rank close to the annual mode; however, most of the higher ranks are artefacts. There is only one higher valid event of 65 kn, so that the missing peak is likely to have been the maximum in the record. The event in (b) is classed as a 76kn Front-up and is the highest valid value in the record which, interpolating the shape of the surviving trace, was quite possibly not exceeded in the cull.

8.3.6. Hail or freezing rain

Fig. 23 shows two examples of apparent (a) relatively steady and (b) transient high gust speed events occurring in periods of hail or freezing rain, respectively. TNNDT classifies both (a) and (b) as Synoptic. XTNNDT classifies both (a) and (b) as transitional between Unclassifiable and Synoptic. On checking SPC both are confirmed as artefacts. Despite searches, no information has been found on the performance of sonic anemometers in hail and freezing rain. This again reinforces the need to check events in the upper tail against independent sources.

8.3.7. Spurious sustained high gust speeds

Occasionally, a sonic anemometer may malfunction and transmit spurious high gust speeds sustained over an hour, so that the spurious gust event classifies as Synoptic, as in Fig. 24(a). However, the traces of gust and mean speed, viewed in (b) over a longer period, suggest the event is not of meteorological origin and this is confirmed by SPC. Valid synoptic-scale events may often be confirmed by a neighbouring station, as in Fig. 24(c) and (d). In cases where there is no SPC report and the wind speeds have been culled in the METAR record, events in the upper tail have been visually classified as Unclassifiable.
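The calibration sequence of §8.3.2 has a distinctive signature that can also be screened for directly. The following is a minimal sketch, not the method used in this study (where such events are left to classify as Spike). It assumes the steps are reported exactly as 0, 51 and 76 kn with no missing values between them, and the function names are hypothetical. Consecutive repeats are collapsed first because the duration of each step varies.

```python
from itertools import groupby

# Step values in knots, taken from the sequence 0→51→76→0 described in the text.
CALIBRATION_STEPS = [0, 51, 76, 0]


def collapse_runs(values):
    """Collapse consecutive repeats so each step appears once, whatever its duration."""
    return [value for value, _ in groupby(values)]


def contains_sequence(steps, pattern):
    """Return True if `pattern` occurs as a contiguous run inside `steps`."""
    n = len(pattern)
    return any(steps[i:i + n] == pattern for i in range(len(steps) - n + 1))


def has_calibration_sequence(gust_kn):
    """Flag a 1-min gust trace containing 0→51→76→0 or the reversed order."""
    steps = collapse_runs(gust_kn)
    return (contains_sequence(steps, CALIBRATION_STEPS)
            or contains_sequence(steps, CALIBRATION_STEPS[::-1]))


# Illustrative traces (invented values, not observations from the dataset).
print(has_calibration_sequence([12, 11, 0, 0, 51, 51, 51, 76, 0, 0, 13]))
print(has_calibration_sequence([12, 18, 51, 76, 40, 20]))
```

Exact matching is deliberately strict: a genuine gust that happens to pass through 51 kn and 76 kn is not flagged unless it is bracketed by zeros.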


Fig. 20. Examples of “warm” events.

Fig. 21. Examples of stable temperature events.

Fig. 22. Examples of events culled by QC “Test 10”.

8.4. Geographic distribution of event frequency

Fig. 25 presents maps of the geographical distribution of the annual rate of gust events in each XTNNDT Class (XClass) for all gust events ≥20kn to the end of 2021.

• For reference, (a) presents the elevation of each Station which highlights the Great Colorado Plateau (GCP) west of the Rocky Mountains and east of the Sierra Nevada and Cascades, as well as the Appalachians in the northeast. The elevation contours from (a) are superimposed on the XClass annual rates in the other maps.
• The rates of artefacts, 0 Unclassifiable and 6 Spike in (b) and (h), correlate strongly with station altitude, which suggests that instrumentation reliability is the primary cause, in contrast with the NOAA preoccupation with perching birds (NOAA, 2013).
• Synoptic events in (c) are most frequent in a north-south band through the High Plains – the very high annual rates being due to serial correlation, so that each synoptic-scale storm contributes multiple events.

The remaining XClasses, 2 to 5, representing non-stationary events, are also strongly correlated with Station altitude, with some subtle differences.

• Storm-burst events in (d), although defined as non-stationary interruptions in synoptic-scale storms, will include events transitional between Synoptic and the two frontal classes, so are expected to inherit much of their character. The higher rate in the northeast is not shared with the frontal classes.
• Front-down, the least frequent XClass, is more frequent in Florida and less in the northeast.
• Front-up is, after GCP, more frequent in the southern States, especially Texas and Florida.
• Thunderstorm is, after GCP, more frequent east of the Mississippi, particularly along the Appalachians and in Florida.

These observations are of frequency of occurrence and are not directly related to value, i.e., gust strength, which requires additional processing to eliminate serial correlation before applying EVA. This is the subject of a separate study.

8.5. Evolution through time

Analysis by year reveals any effects on the frequencies of occurrence of the changes in instrumentation, QC and reporting procedures, as well as possible effects of climate change, during the 22-year observation period. The observational period in Fig. 26 and Fig. 27 is divided into three sub-periods: from 2000 to 2006 for the cup anemometers⁶; 2007–2013 for the sonic anemometers before QC Test 10; and 2014–2021 for sonic with QC Test 10.

⁶ After a slow start, the cup-sonic change was phased in quickly at all ASOS stations. 14 February 2017 represents the mean installation date. Not all stations contribute to earlier years and the results are weighted accordingly.
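The division into sub-periods can be expressed as a simple lookup by calendar year. The sketch below is illustrative only: the labels and function names are hypothetical, and it ignores the station weighting mentioned in the footnote, so it assumes every year contributes equally.

```python
from collections import Counter

# Sub-period boundaries as stated in the text (inclusive calendar years).
SUB_PERIODS = [
    ("cup", 2000, 2006),
    ("sonic", 2007, 2013),
    ("sonic+QC10", 2014, 2021),
]


def sub_period(year):
    """Return the sub-period label for a calendar year."""
    for name, first, last in SUB_PERIODS:
        if first <= year <= last:
            return name
    raise ValueError(f"year {year} is outside the 2000-2021 observation period")


def mean_annual_rate(event_years):
    """Mean number of events per year in each sub-period (unweighted)."""
    counts = Counter(sub_period(year) for year in event_years)
    return {name: counts[name] / (last - first + 1)
            for name, first, last in SUB_PERIODS}


# Invented event years, for illustration only.
print(mean_annual_rate([2001, 2003, 2009, 2015, 2015, 2020]))
```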


Fig. 23. Spurious gust speeds in (a) hail and (b) freezing rain.

Fig. 24. Spurious (a, b) and valid (c, d) sustained high gust speeds.

8.5.1. Calms

The frequency of calms is relevant to some Wind Engineering applications, e.g., pollution dispersal. Fig. 26 shows how the reported rate of 2-min mean and 3-s gust calms, both reported at 1-min intervals, decreases through the three sub-periods. The initial drop at the change from cup to sonic corrects the hysteresis lag of cup anemometers due to friction in the bearings. The second drop at the implementation of QC Test 10 reduces the rate of mean calms by a factor of 4 and gust calms by a factor of 10. Values in years 2000 and 2004 appear to be outliers.

8.5.2. Bird perching events

Fig. 27(a) shows the total count of bird gusts in each year found by the new algorithm in Cook (2022) which registers a bird-generated gust only when a spike in value immediately precedes or succeeds a loss of data. This also includes spikes immediately preceding an instrumentation failure, mimicking the characteristics of bird gusts, and accounts for all counts in the initial cup period. The counts drop before the average installation date of the sonics, suggesting that the most unreliable stations were prioritised. All bird gusts after the start of 2014 should have been eliminated by Test 10, but the count doubles. These are all false positives of QC Test 10 which the new algorithm (Cook, 2022) interprets as additional bird-perching events when preceded/succeeded by a spike, particularly after introducing spurious gaps by culling the peak gusts of valid Thunderstorm and Front-up events, e.g. Fig. 22.

8.5.3. Classified events

The ensemble-averaged annual rates for Unclassifiable and the valid events, Classes 1 to 5 in Fig. 27(b)–(g), all show very little variation between the three sub-periods, and between years, except that 2000 and 2004 again appear to be outliers. This includes Front-up and Thunderstorm which might be expected to show some effect of QC Test 10, except that the examples in Fig. 22 show that the events are usually correctly classified, and it is the peak value of the gust that is culled.

In the cup period the Spike anomalies can only be due to acquisition and transmission glitches. The increase on the change from cup to sonic, Fig. 27(h), is presumably due to birds perching for less than 1 min as the new algorithm (Cook, 2022) would have removed them if there had been an associated gap in data. Implementation of QC Test 10 restores the rate to around the cup-period value, and this is its only perceived virtue.

8.6. Correction for cup anemometer response

The Belfort cup anemometers were quoted as having a 5s response time, but cup anemometers have a constant run-of-wind response length (Kristensen and Frost Hansen, 2002), so that the actual response time reduces with increasing mean wind speed. The superior response of the Vaisala sonic anemometers was restricted by a running 3s-mean to give the standard WMO response, irrespective of wind speed. The sonic/cup gust ratio represents the factor that could be applied to correct the cup gust speeds to the 3s standard. This was estimated for the valid Classes of the Analysis set by ensemble-averaging all peak gust speeds over a threshold for the cup and the sonic periods and assuming that the wind climate remained constant.

Fig. 28(a)–(c) show the distribution of the 3s/5s ratio for three thresholds. Fig. 28(d) shows that the mean ratio falls in a consistent manner with threshold as expected. Lombardo et al. (2009) report using the factor 1.02 for extremes at the design risk.

8.7. Degrees of separation and the potential to merge classes for EVA

Having separated the gust events into the six defined Classes, how well are they separated and are all six necessary for application of EVA? There is clearly no issue in merging Classes 0 and 6 into a single Artefact class as these are discounted from later analysis. In §7.3 the valid classes were merged into T and NT classes to assess the impact of misclassifications. But there is a physical imperative to distinguish Thunderstorm since the action of thunderstorm downbursts on surface structures differs from that for typical ABL gusts and is currently an active field of study, e.g., Solari et al. (2020) and Li et al. (2022). Following the call by De Gaetano et al. (2014), distinguishing the frontal classes from T/NT was a principal motivation for the present study.

The potential to merge Classes depends on their frequency and on their relative characteristics. Table 13 gives the annual rate of each Class averaged for all Analysis stations, in which:

• Class 1 stands out as the most frequent because 30-min separation is not sufficient to ensure independence at synoptic scales. Following Simiu and Heckert (1996), applying a 2-day minimum separation reduces the mean annual rate to 64.5.
• Class 3 is the least frequent and would be dominated by the four-times more frequent Class 4 if the frontal classes were merged.

Canonical Variate Analysis (CVA) has developed considerably from its inception (Hotelling, 1933). It is widely used in taxonomy to distinguish between species, and in morphology for facial recognition. The method uses correlation to define canonical variates which are linear combinations of the original data that maximally separate classes. Applied⁷ to the checked TNNDT classes and timeseries, it produced the CM of Table 14 with an error rate comparable with the discarded methods in Table 7. Although it uses the same data, its performance falls short of TNNDT, but its usefulness here is in estimating the degree of separation between Classes, in terms of the Mahalanobis distance (Mahalanobis, 1936), a generalised measure of separation, on the taxonomic dendrogram, Fig. 29.

In Fig. 29 the best separation is between the cold front events, 4: Front-up, and the remaining T/NT/A classes – achieving the principal objective of this study. The second best is between T/NT and Artefact and the third between T and NT. All these separations are in the upper quartile of the range of Mahalanobis distance. The final separations form three pairs: the best between Classes 3 and 5, and the worst between Classes 1 and 2. It follows that the only reasonable candidates for merging are (0,6) as Artefacts to be discarded and (1,2) as ABL gusts, which is how they are presently treated in design practice.
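The effect of merging classes on the classification error can be checked directly from a confusion matrix. The sketch below uses the counts of Table 12 and reproduces the 2.4% and 1.7% error rates quoted in §8.2. The grouping A = {0, 6}, NT = {1, 2}, T = {3, 4, 5} is an assumption made for illustration, since the T/NT membership is defined in §7.3 and not restated here. The function name is hypothetical and the rows are taken to be the visual classification.

```python
# Rows: checked (visual) Class 0-6; columns: XTNNDT Class 0-6 (counts from Table 12).
TABLE_12 = [
    [871,   49,   0,   0,   0,    1,    1],
    [ 10, 3026,   5,   0,   1,    2,    0],
    [  0,   17, 311,   0,   4,    5,    0],
    [  0,    0,   3, 199,   1,   17,    2],
    [  0,    0,   7,   0, 891,    7,    1],
    [  1,    0,   3,   1,  14, 1982,   70],
    [  0,    0,   0,   0,   0,    7, 2025],
]

# Assumed grouping for the T/NT/A comparison (an illustration, see lead-in).
T_NT_A = {0: "A", 1: "NT", 2: "NT", 3: "T", 4: "T", 5: "T", 6: "A"}

# Only the merges identified as reasonable in this section: (0,6) and (1,2).
MERGE_0_6_AND_1_2 = {0: "A", 1: "ABL", 2: "ABL", 3: "3", 4: "4", 5: "5", 6: "A"}


def error_rate(matrix, group_of=None):
    """Percentage of events whose row and column labels (or groups) disagree."""
    total = wrong = 0
    for row_class, row in enumerate(matrix):
        for col_class, count in enumerate(row):
            total += count
            if group_of is None:
                mismatch = row_class != col_class
            else:
                mismatch = group_of[row_class] != group_of[col_class]
            if mismatch:
                wrong += count
    return 100.0 * wrong / total


print(round(error_rate(TABLE_12), 1))                     # individual Classes
print(round(error_rate(TABLE_12, T_NT_A), 1))             # T/NT/A
print(round(error_rate(TABLE_12, MERGE_0_6_AND_1_2), 1))  # (0,6) and (1,2) merged
```

Merging can only lower the error rate, because a misclassification between two merged classes no longer counts as an error; the size of the reduction indicates how much confusion lay between the merged classes.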

⁷ Using CVA in the R package “Morpho”.


Fig. 25. Annual rate of occurrence of gust events ≥20kn in each Class.


Fig. 26. Rates of calms through the observation period.

Fig. 27. Effect of instrumentation and quality control changes through the observation period.


Fig. 28. Ratio of the mean sonic and cup observations for peak gusts over thresholds, ensemble averaged for all analysis stations.

Table 13
Average annual rate of XTNNDT Classes at Analysis stations.

Class                    0     1     2     3     4     5     6
Average annual rate   22.4  1119  27.0   3.4  11.6  27.4  10.1

Table 14
Confusion matrix for CVA applied to upper tail (y > 0) of Analysis set.

         CVA Class
Class      0     1     2     3     4     5     6
0        542   174    30    31    20    11   510
1         15  4567    39    17    28    15     9
2         17   301    88    22    23    25     9
3          4    43     4   158     0    63     2
4         12    30    27     4   967   173     3
5        127   210    64    85   160  2092    47
6        214   128     7     4     3    24  2850
Error: 19.5%

Fig. 29. CVA dendrogram of TNNDT Class separation for upper tail (y > 0) of Analysis set.

9. Conclusions

• The ASOS 1-min interval data are compromised by several types of artefacts specific to this dataset, additional to those commonly found in meteorological observations.
• A calibration sequence is occasionally included in the cup-anemometer data at some stations.
• The sonic anemometers produce spurious high gust values in hail and freezing rain.
• Visual classification using timeseries of gust speed, direction, dry bulb temperature, atmospheric pressure and precipitation reliably distinguishes five Classes of valid gust events and two Classes of artefacts.
• Most isolated and multiple Spike artefacts are due to instrumentation, acquisition, or transmission glitches, with bird-generated gusts adding only a minor contribution.
• Methods that rely only on anemometric data do not classify ASOS gusts accurately enough for later EVA.
• Methods relying on statistical metrics of the meteorological parameters, e.g., gust factor, perform less well than machine-learning methods operating directly on the meteorological timeseries.
• A sensitivity EVA analysis shows that removing residual artefacts in the far upper tail (y > 1) reduces errors in predicted 50-year return speeds, V50, to less than ±0.5kn. Missing valid values in this range produce underestimates of V50, while introduced artefacts produce overestimates.
• The TNNDT method operating directly on the timeseries of three parameters: gust speed, dry bulb temperature and atmospheric pressure, achieves a classification error of 2.4% for gust speeds greater than the annual mode.
• The TNNDT method identifies artefacts of very unreliable stations effectively, rendering them as useful as the reliable stations.
• The TNNDT method training can be weighted to different ranges of value: here TNNDT to all gusts >40kn and XTNNDT to gusts greater than the annual mode; the Class difference between the two weightings is 8.7%, with only 3.4% differing by more than one Class.
• It is proposed that gust speeds greater than the annual mode (y > 0) should be checked for validity visually and against independent sources, e.g., SPC Storm Reports.
• There is strong correlation between rates of thermal events and of artefacts with station altitude, resulting in a geographical bias in event frequency towards the Great Colorado Plateau. The Synoptic Class is most frequent in the High Plains, longitudes −105° to −95°.
• Quality control Test 10, introduced in 2014, suppresses the reporting of nearly all valid calms. It has little effect on the frequencies of valid events classified by TNNDT, but it culls the peak gust values. The consequences of this require further investigation.
• Estimates of the factor required to correct the peak gusts measured by the cup anemometer to the WMO standard 3s gust decrease almost linearly over the observed range, from 1.026 at 20kn to 1.006 at 50kn.
• The principal aim of distinguishing frontal events from T and NT events has been achieved.
• This present study and the earlier curation of the TD6405 database (Cook, 2022) complete the data validation and classification steps required to render the ASOS 1-min interval data suitable for a comprehensive study of extreme gust speeds across CONUS.

Credit author statement

Nicholas J Cook: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing.

Funding statement

This is self-funded curiosity-driven research which did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Declaration of competing interest

The author declares that there are no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Mendeley Data link provided. Further data on application.

Acknowledgements

The helpful and constructive comments and suggestions made by the two reviewers of a previous draft paper, which was restricted in scope to anemometric data only, and of the first draft of this paper are gratefully acknowledged. The author also thanks Dr Brian Hempel, CIWRO, University of Oklahoma (https://ciwro.ou.edu/), for drawing attention to the occasional calibration sequence illustrated in §8.3.2.

Supplementary information

The R scripts and instructions to extract gust events from any ASOS station and classify by the TNNDT method are available from Mendeley at URL: https://doi.org/10.17632/88jp3swkn6.1. Rdata files of gust events ≥20kn at each of the 450 ASOS stations of the Analysis set, classified by TNNDT and XTNNDT, are also provided (118 Mb).

Appendix A. Parameters of the ASOS Development stations of this study

ICAO Code  UTC (hours)  Elevation (m)  Sonic, date installed  Latitude (degrees)  Longitude (degrees)  Station name
KABQ       −7           1618.5         22/05/2007             35.04191            −106.61545           AlbuquerqueNM
KBUF       −5           218.2          04/06/2009             42.94001            −78.73608            BuffaloNiagaraNY
KBWI       −5           47.5           20/09/2006             39.17332            −76.68414            BaltimoreWashingtonMD
KDAL       −6           134.1          28/05/2009             32.83838            −96.83584            DallasLoveFieldTX
KDCA       −5           3              26/09/2006             38.84720            −77.03454            WashingtonReaganDC
KDEN       −7           1650.2         12/09/2005             39.84661            −104.65624           DenverIntCO
KDFW       −6           170.7          27/05/2009             32.89744            −97.02196            DallasFtWorthTX
KGFK       −6           256.6          17/10/2002             47.94272            −97.18293            GrandForksND
KGTF       −7           1116.8         26/03/2007             47.47330            −111.3828            GreatFallsMT
KIAD       −5           88.4           03/10/2006             38.93487            −77.44728            WashingtonDullesVA
KICT       −6           402.6          06/10/2005             37.64754            −97.43000            WichitaEisenhowerKS
KILM       −5           10.1           13/04/2007             34.26680            −77.89989            WilmingtonNC
KJAN       −6           100.6          22/05/2007             32.31986            −90.07778            JacksonIntlMS
KLAS       −8           664.5          25/04/2007             36.0719             −115.16344           LasVegasMcCarranNV
KLAX       −8           29.6           27/10/2006             33.9382             −118.3866            LosAngelesIntlCA
KMDW       −6           186.5          13/06/2007             41.78409            −87.75515            ChicagoMidwayIL
KORD       −6           201.8          27/06/2007             41.96015            −87.93165            ChicagoOHareIL
KPNS       −6           34.1           27/03/2007             30.47800            −87.18686            PensacolaFL
KPRB       −8           246.9          27/02/2006             35.66938            −120.62918           PasoRoblesCA
KPWM       −5           13.7           06/10/2006             43.64246            −70.30446            PortlandJetportME
KRFD       −6           222.5          22/05/2007             42.19325            −89.09335            RockfordIL
KSEA       −8           112.8          17/05/2007             47.44468            −122.31441           SeattleTacomaWA
KSUX       −6           333.8          30/04/2009             42.39171            −96.37949            SiouxCityIA
KTPA       −5           5.8            27/01/2009             27.96334            −82.54001            TampaFL
KWMC       −8           1309.4         17/11/2005             40.90179            −117.80811           WinnemuccaNV
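The installation dates above are given day-first (dd/mm/yyyy), which matters when deciding whether a given observation was made by the cup or the sonic anemometer. The following is a minimal sketch using three rows copied from the table. The function name is hypothetical, and treating the installation day itself as the first sonic day is an assumption.

```python
from datetime import date, datetime

# Sonic installation dates copied from Appendix A (day/month/year).
SONIC_INSTALLED = {
    "KDEN": "12/09/2005",
    "KGFK": "17/10/2002",
    "KBUF": "04/06/2009",
}


def instrument_on(station, day):
    """Return 'cup' or 'sonic' for an observation date at a station."""
    installed = datetime.strptime(SONIC_INSTALLED[station], "%d/%m/%Y").date()
    # Assumption: the installation day counts as the first day of sonic data.
    return "sonic" if day >= installed else "cup"


print(instrument_on("KDEN", date(2005, 9, 11)))
print(instrument_on("KBUF", date(2009, 6, 4)))
```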


References

Ahmed, M.R., El Damatty, A.A., Dai, K., Ibrahim, A., Lu, W., 2022. Parametric study of the quasi-static response of wind turbines in downburst conditions using a numerical model. Eng. Struct. 250, 113440 https://doi.org/10.1016/j.engstruct.2021.113440.
Arul, M., Kareem, A., Burlando, M., Solari, G., 2022. Machine learning based automated identification of thunderstorms from anemometric records using shapelet transform. J. Wind Eng. Ind. Aerod. 220, 104856 https://doi.org/10.1016/j.jweia.2021.104856.
Canepa, F., Burlando, M., Solari, G., 2020. Vertical profile characteristics of thunderstorm outflows. J. Wind Eng. Ind. Aerod. 206, 104332 https://doi.org/10.1016/j.jweia.2020.104332.
Castillo, E., 1988. Extreme Value Theory in Engineering. Academic Press, ISBN 0-12-163475-2, p. 389.
Chen, G., Lombardo, F.T., 2020. An automated classification method of thunderstorm and non-thunderstorm wind data based on a convolutional neural network. J. Wind Eng. Ind. Aerod. 207, 104407 https://doi.org/10.1016/j.jweia.2020.104407.
Choi, E.C.C., 1999. Extreme wind characteristics over Singapore – an area in the equatorial belt. J. Wind Eng. Ind. Aerod. 83, 61–69. https://doi.org/10.1016/S0167-6105(99)00061-6.
Choi, E.C.C., 2004. Field measurement and experimental study of wind speed profile during thunderstorms. J. Wind Eng. Ind. Aerod. 92, 275–290. https://doi.org/10.1016/j.jweia.2003.12.001.
Choi, E.C.C., Hidayat, F.A., 2002. Gust factors for thunderstorm and non-thunderstorm winds. J. Wind Eng. Ind. Aerod. 90, 1683–1696. https://doi.org/10.1016/S0167-6105(02)00279-9.
Cook, N.J., 2014a. Review of errors in archived wind data. Weather 69, 72–81. https://doi.org/10.1002/wea.2148, 3.
Cook, N.J., 2014b. Detecting artefacts in analyses of extreme wind speeds. Wind Struct. 19, 271–294. https://doi.org/10.12989/was.2014.19.3.271.
Cook, N.J., 2021. Locating the anemometers of the US ASOS network and classifying their local shelter. Weather wea 4131. https://doi.org/10.1002/wea.4131.
Cook, N.J., 2022. Curating the TD6405 database of 1-minute interval wind observations across the USA for use in Wind Engineering studies. J. Wind Eng. Ind. Aerod. 224, 104961 https://doi.org/10.1016/j.jweia.2022.104961.
De Gaetano, P., Repetto, M.P., Repetto, T., Solari, G., 2014. Separation and classification of extreme wind events from anemometric records. J. Wind Eng. Ind. Aerod. 126, 132–143. https://doi.org/10.1016/j.jweia.2014.01.006.
Gomes, L., Vickery, B.J., 1978. Extreme wind speeds in mixed wind climates. J. Wind Eng. Ind. Aerod. 2, 331–344. https://doi.org/10.1016/0167-6105(78)90018-1.
Gringorten, I.I., 1963. A plotting rule for extreme probability paper. J. Geophys. Res. 68, 813–814.
Guerova, G., Dimitrova, T., Georgiev, S., 2019. Thunderstorm classification functions based on instability indices and GNSS IWV for the Sofia Plain. Rem. Sens. 11, 2988. https://doi.org/10.3390/rs11242988.
Gumbel, E.J., 1958. Statistics of Extremes. Columbia University Press, New York, ISBN 0-231-02190-9, p. 371.
Gunter, W.S., Schroeder, J.L., 2015. High-resolution full-scale measurements of thunderstorm outflow winds. J. Wind Eng. Ind. Aerod. 138, 13–26. https://doi.org/10.1016/j.jweia.2014.12.005.
Harris, R.I., 2009. XIMIS, a penultimate extreme value method suitable for all types of wind climate. J. Wind Eng. Ind. Aerod. 97, 271–286. https://doi.org/10.1016/j.jweia.2009.06.011.
Hotelling, H., 1933. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 24, 417–441. https://doi.org/10.1037/h0071325.
Huang, G., Jiang, Y., Peng, L., Solari, L., Liao, H., Li, M., 2019. Characteristics of intense winds in mountain area based on field measurement: focusing on thunderstorm winds. J. Wind Eng. Ind. Aerod. 190, 166–182. https://doi.org/10.1016/j.jweia.2019.04.020.
IAWE, 2011. Announcement of the Alan G. Davenport wind loading chain. J. Wind Eng. Ind. Aerod. 99, 998–999. https://doi.org/10.1016/j.jweia.2011.09.005.
thunderstorm-like wind. J. Wind Eng. Ind. Aerod. 229, 105161 https://doi.org/10.1016/j.jweia.2022.105161.
Lombardo, F.T., Zickar, A.S., 2019. Characteristics of measured extreme thunderstorm near-surface wind gusts in the United States. J. Wind Eng. Ind. Aerod. 193, 103961 https://doi.org/10.1016/j.jweia.2019.103961.
Lombardo, F.T., Main, J.A., Simiu, E., 2009. Automated extraction and classification of thunderstorm and non-thunderstorm wind data for extreme-value analysis. J. Wind Eng. Ind. Aerod. 97, 120–131. https://doi.org/10.1016/j.jweia.2009.03.001.
Lombardo, F.T., Smith, D.A., Schroeder, D.L., Mehta, K.C., 2014. Thunderstorm characteristics of importance to wind engineering. J. Wind Eng. Ind. Aerod. 125, 121–132. https://doi.org/10.1016/j.jweia.2013.12.004.
Lu, N.Y., Hawbecker, P., Basu, S., Manuel, L., 2019. On wind turbine loads during thunderstorm downbursts in contrasting atmospheric stability regimes. Energies 12, 2773. https://doi.org/10.3390/en12142773.
Mahalanobis, P.C., 1936. On the generalised distance in statistics. Proc. Natl. Inst. Sci. India 2, 49–55. http://library.isical.ac.in:8080/xmlui/bitstream/handle/10263/6765/Vol02_1936_1_Art05-pcm.pdf.
NCDC, 2006a. Data Documentation for Data Set 6405: ASOS Surface 1-minute, Page 1 Data. July. National Climatic Data Center, Asheville NC, USA, p. 5. Available at: https://www.ncei.noaa.gov/pub/data/asos-onemin/td6405.txt.
NCDC, 2006b. Data Documentation for Data Set 6406: ASOS Surface 1-minute, Page 2 Data. July. National Climatic Data Center, Asheville NC, USA, p. 5. Available at: https://www.ncei.noaa.gov/pub/data/asos-onemin/td6406.txt.
NOAA, 2013. Primer for the ASOS Software Version 3.10 Ice Free Wind Sensor Quality Control Algorithm. July 24. NOAA. Last accessed 13 November 2022: https://www.weather.gov/media/asos/ASOS%20Implementation/IFWS%20QC%20Algorithm_primer.pdf.
Orwig, K.D., Schroeder, J.D., 2007. Near-surface wind characteristics of extreme thunderstorm outflows. J. Wind Eng. Ind. Aerod. 95, 565–584. https://doi.org/10.1016/j.jweia.2006.12.002.
Samanta, S., Tyagi, B., Vissa, N.K., Sahu, R.K., 2020. A new thermodynamic index for thunderstorm detection based on cloud base height and equivalent potential temperature. J. Atmos. Sol. Terr. Phys. 207, 105367 https://doi.org/10.1016/j.jastp.2020.105367.
Shu, Z.R., Li, Q.S., He, Y.C., Chan, P.W., 2017. Vertical wind profiles for typhoon, monsoon and thunderstorm winds. J. Wind Eng. Ind. Aerod. 168, 190–199. https://doi.org/10.1016/j.jweia.2017.06.004.
Simiu, E., Heckert, N.A., 1996. Extreme wind distribution tails: a “Peaks over Threshold” approach. J. Struct. Eng. ASCE 122, 539–547. https://doi.org/10.1061/(ASCE)0733-9445(1996)122:5(539).
Solari, G., Burlando, M., Repetto, M.P., 2020. Detection, simulation, modelling and loading of thunderstorm outflows to design wind-safer and cost-efficient structures. J. Wind Eng. Ind. Aerod. 200, 104142 https://doi.org/10.1016/j.jweia.2020.104142.
Twisdale, L.A., Vickery, P.J., 1992. Research on thunderstorm wind design parameters. J. Wind Eng. Ind. Aerod. 41, 545–556. https://doi.org/10.1016/0167-6105(92)90461-I.
Vallis, M.B., Loredo-Souza, A.M., Ferriera, V., de Lima Nascimento, E., 2019. Classification and identification of synoptic and non-synoptic extreme wind events from surface observations in South America. J. Wind Eng. Ind. Aerod. 193, 103963 https://doi.org/10.1016/j.jweia.2019.103963.
Vickery, P.J., Masters, F.J., Powell, M.D., Wadhera, D., 2009. Hurricane hazard modeling: the past, present, and future. J. Wind Eng. Ind. Aerod. 97, 392–405. https://doi.org/10.1016/j.jweia.2009.05.005.
Xhelaj, A., Burlando, M., Solari, G., 2020. A general-purpose analytical model for reconstructing the thunderstorm outflows of travelling downbursts immersed in ABL flows. J. Wind Eng. Ind. Aerod. 207, 104373 https://doi.org/10.1016/j.jweia.2020.104373.
Ye, L., Keogh, E., 2011. Time series shapelets: a novel technique that allows accurate, interpretable and fast classification. Data Min. Knowl. Discov. 22, 149–182. https://doi.org/10.1007/s10618-010-0179-5.
Zhang, Y., Sarkar, P.P., Hu, H., 2015. An experimental investigation on the
Kasperski, M., 2002. A new wind zone map of Germany. J. Wind Eng. Ind. Aerod. 90,
characteristics of fluid–structure interactions of a wind turbine model sited in
1271–1287. https://doi.org/10.1016/S0167-6105(02)00257-X.
microburst-like winds. J. Fluid Struct. 57, 206–218. https://doi.org/10.1016/j.
Kristensen, L., Frost Hansen, O., 2002. Distance constant of the Risø cup anemometer.
jfluidstructs.2015.06.016.
Forskingcenter Risoe. Riso-R No. 1320 (EN), 23. https://orbit.dtu.dk/en/publication
Zhang, S., Solari, G., Burlando, M., Yang, Q., 2019a. Directional decomposition and
s/distance-constant-of-the-ris%C3%B8-cup-anemometer.
properties of thunderstorm outflows. J. Wind Eng. Ind. Aerod. 189, 71–90. https://
Kwon, D.K., Kareem, A., Butler, K., 2012. Gust-front loading effects on wind turbine
doi.org/10.1016/j.jweia.2019.03.014.
tower systems. J. Wind Eng. Ind. Aerod. 104–106, 109–115. https://doi.org/
Zhang, S., Yang, Q., Solari, G., Li, B., Huang, G., 2019b. Characteristics of thunderstorm
10.1016/j.jweia.2012.03.030.
outflows in Beijing urban area. J. Wind Eng. Ind. Aerod. 195, 104011 https://doi.
Li, X., Li, S., Su, Y., Peng, L., Cao, S., Liu, M., 2022. Study on the time-varying extreme
org/10.1016/j.jweia.2019.104011.
value characteristic of the transient loads on a 5:1 rectangular cylinder subjected to a
