4 Wind Data Analysis - Boopathi
Data Analysis
• The term data analysis has long been synonymous with the term statistics.
• Massive amounts of data are now available in business and many other fields.
• Two separate fields deal with such data: statistics and management science.
• In a nutshell, statistics is the study of data analysis, whereas management science is the study of model building, optimization, and decision-making.
Decisions-data-analysis
What is Data analysis
➢ The process of evaluating data using analytical and logical reasoning to examine each component of the data provided.
➢ This form of analysis is just one of the many steps that must be completed when conducting a research experiment.
➢ Data from various sources is gathered, reviewed, and then analyzed to form some sort of finding or conclusion.
➢ There are a variety of specific data analysis methods, some of which include data mining, text analytics, business intelligence, and data visualization.
WIND DATA
➢ Wind data may be measured, modeled, or compiled from both measured and modeled data, and they have spatial and temporal components.
➢ Wind resource data can have a high degree of spatial variability, which emphasizes the need for high spatial resolution data.
➢ The data can represent wind speeds and wind directions at different heights above the ground, including the hub height.
➢ Hub height refers to the height of the hub of a wind turbine. Typical hub heights are 30 m, 60 m, 80 m, 100 m, and increasingly 120 m and 150 m.
Instruments
T = L/(c+v)
DATA TRANSFER
Manual data transfer
✓ Remove and replace the current storage device or data card
✓ Transfer the data directly to a laptop
Advantage - allows a visual on-site inspection.
Disadvantage - requires frequent site visits.
Remote data transfer
✓ By direct wire cabling
✓ Phone modem
✓ Cellular modem
✓ Satellite communication
✓ Internet connection
Advantage - data can be retrieved and inspected more frequently, so problems can be identified and resolved promptly.
Disadvantage - cost.
Raw Data (Binary)
3032 2011 04 01 009.rwd
10/1/2004 17:40 14.1 1.5 17.2 9.9 204 16 213 280
10/1/2004 17:50 14.5 1.6 17.6 8.8 200 12 202 280
10/1/2004 18:00 13.2 1.7 16.8 8.4 202 13 178 280
10/1/2004 18:10 13.9 1.9 18.3 8 214 9 217 280
10/1/2004 18:20 13.3 2 18.3 8 214 8 219 280
10/1/2004 18:30 14.2 2.1 18.7 9.5 208 11 205 280
10/1/2004 18:40 15.1 1.9 19.9 11.1 213 7 200 280
10/1/2004 18:50 14.4 1.7 18.7 9.1 215 7 211 280
10/1/2004 19:00 13.5 1.5 16.8 8.8 209 14 213 280
10/1/2004 19:10 14.3 1.7 17.9 9.5 213 8 215 280
10/1/2004 19:20 13 1.9 17.9 8 211 8 212 280
10/1/2004 19:30 13.4 2.4 18.7 8 212 9 205 280
10/1/2004 19:40 14.6 1.9 18.7 9.9 213 8 213 280
10/1/2004 19:50 14.8 2.2 19.5 8.4 210 10 214 280
Wind data analysis Introduction
• Evaluation of the power potential of a particular type of wind turbine at a specific site is necessary for economic decisions.
• Therefore, the information on the wind turbine and on the site has to be measured or predicted and then combined with the power curve of the wind turbine.
• Great volumes of data are collected in the wind energy industry, but they are often not fully utilized. This could be for a variety of reasons:
• The data volume is too large to handle effectively (processing, storing and managing),
• It’s not clear how to use the data to answer questions of interest,
• The full potential of the value which could be extracted from the data is not known.
Specific Objectives
➢ To analyze the wind energy resource potential,
➢ To model the wind data with different statistical methods and software,
➢ To generate a site wind resource map,
➢ To forecast site wind power and energy density by using one year of wind data,
➢ To select the IEC wind turbine class as per the site wind resource map,
➢ To conduct preliminary wind farm turbine micro-siting and estimate the farm annual energy production (AEP), and
➢ To perform preliminary estimates of the investment cost and cost of energy.
Data Analysis Process
The data analysis process consists of gathering information with a suitable application or tool that lets you explore the data and find patterns in it. Based on those patterns, you can make decisions or draw conclusions.
Data analysis consists of the following phases:
➢ Data Requirement Gathering
➢ Data Collection
➢ Data Cleaning
➢ Data Analysis
➢ Data Interpretation
➢ Data Visualization
QC system
• Quality Control (QC) is defined as those operational procedures that are routinely followed during the normal operation of the monitoring system to ensure that the measurement process is working properly.
• These procedures include periodic calibration of the instruments, site inspections, data screening, data validation, and preventive maintenance.
• The QC procedures should produce quantitative documentation to support claims of accuracy.
• Quality Assurance (QA) is defined as those procedures that are performed on a more occasional basis to provide assurance that the measurement process is producing data that meet the data quality objectives (DQO).
• These procedures include routine evaluation of how the QC procedures are implemented (system audits) and assessments of instrument performance (performance audits).
• The QC system should include procedures for returning to the source of the data to verify them and to prevent recurrence of the errors.
Data Quality and Reliability
Completeness: missing data vs. no data.
Accuracy: calibrated sensors, resolution, sensor location.
Reliability: icing, data spikes, failed sensor.
Expertise: consultants, contractors, experience.
Data Analysis
• The data analysis starts by downloading the measured wind data from the data logger.
• Data validation (wind data summary) has to be done through the following processes:
Data screening, which has to be done by:
➢ Arranging the data by category
➢ Checking for missing data
➢ Range test
➢ Relational test
Data verification, which has to be done by:
➢ Trend test
• Finally, a validated data file is created for the next processes.
DATA VALIDATION METHODS
Data Screening:
• The first part screens all the data for suspect (questionable and erroneous) values.
• The result of this part is a data validation report (a printout) that lists the suspect values and which validation routine each value failed.
Data Verification:
• The second part requires a case-by-case decision on what to do with the suspect values: retain them as valid, reject them as invalid, or replace them with redundant, valid values (if available).
• This part is where judgment by a qualified person familiar with the monitoring equipment and local meteorology is needed.
Validation Routines
Validation routines are designed to screen each measured parameter for suspect values before they are incorporated into the archived database and used for site analysis.
They can be grouped into two main categories:
• general system checks: data records and time sequence
• measured parameter checks: range tests, relational tests, and trend tests
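The general system checks can be sketched in a few lines. This is a minimal illustration, assuming 10-minute records laid out like the sample rows shown earlier (date, time, and eight values, with the date read as month/day/year); the helper names `find_time_gaps` and `has_expected_fields` are hypothetical.

```python
from datetime import datetime, timedelta

def has_expected_fields(line, expected=10):
    """Data record check: each record should carry the expected number of fields."""
    return len(line.split()) == expected

def find_time_gaps(timestamps, step=timedelta(minutes=10)):
    """Time sequence check: return (previous, current) pairs that are not one step apart."""
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev != step:
            gaps.append((prev, cur))
    return gaps

# Assumed timestamp layout: month/day/year hour:minute
FMT = "%m/%d/%Y %H:%M"
stamps = [datetime.strptime(s, FMT) for s in
          ["10/1/2004 17:40", "10/1/2004 17:50", "10/1/2004 18:10", "10/1/2004 18:20"]]
gaps = find_time_gaps(stamps)   # the 18:00 record is missing in this made-up sequence
record_ok = has_expected_fields("10/1/2004 17:40 14.1 1.5 17.2 9.9 204 16 213 280")
```

Each reported gap points to records that need to be recovered or flagged as missing before the measured parameter checks are run.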
Measured Parameter Checks
• Three measured parameter checks are commonly performed: range tests, relational tests, and trend tests. These tests are applied in sequence, and data must pass all three to be deemed valid.
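The three checks can be sketched as simple predicates applied in sequence. The thresholds in `LIMITS` and the field names `speed_upper` and `speed_lower` are illustrative assumptions, not values from these slides; real limits depend on the sensors and the site.

```python
def range_test(value, low, high):
    """Range test: the value must lie within plausible physical limits."""
    return low <= value <= high

def relational_test(speed_a, speed_b, max_difference):
    """Relational test: two related sensors should not disagree by too much."""
    return abs(speed_a - speed_b) <= max_difference

def trend_test(previous, current, max_change):
    """Trend test: the change between consecutive records should be bounded."""
    return abs(current - previous) <= max_change

def failed_checks(current, previous, limits):
    """Apply the three tests in sequence and list the ones that fail."""
    failed = []
    if not range_test(current["speed_upper"], *limits["speed_range"]):
        failed.append("range")
    if not relational_test(current["speed_upper"], current["speed_lower"],
                           limits["max_difference"]):
        failed.append("relational")
    if not trend_test(previous["speed_upper"], current["speed_upper"],
                      limits["max_change"]):
        failed.append("trend")
    return failed

# Hypothetical limits (m/s) and two made-up consecutive records
LIMITS = {"speed_range": (0.0, 30.0), "max_difference": 2.0, "max_change": 5.0}
previous = {"speed_upper": 7.5, "speed_lower": 7.0}
current = {"speed_upper": 14.1, "speed_lower": 13.2}
failed = failed_checks(current, previous, LIMITS)
```

A record with an empty `failed` list passes screening; anything else goes into the validation report for case-by-case verification.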
Treatment of Suspect and Missing Data
Data Recovery
Data Validation–Importance
• Data validation is critical because serious errors in data analysis and modeling results can be caused by erroneous individual data values.
• Timely data validation is required to minimize the generation of additional data that may be invalid or suspect and to maximize the recoverable data.
Data Validation–QC Levels (1 of 3)
Level 0 Data Validation: Routine checks are made during the initial data processing and generation of data, including proper data file identification, review of unusual events, review of field data sheets and result reports, and instrument performance checks.
• Verify computer file entries against data sheets.
• Flag samples when significant deviations from measurement assumptions have occurred.
• Eliminate values for measurements that are known to be invalid because of instrument malfunctions.
• Replace data from a backup data acquisition system in the event of failure of the primary system.
• Adjust measurement values for quantifiable calibration or interference bias.
• Document the changes made to the data.
Data Validation–QC Levels (2 of 3)
Level 0.5 Data Validation: Automatic (objective) checks are applied to the data to identify outliers.
Types of checks
– Range
– Rate of change
– Pattern recognition (Weber-Wuertz)
Level I Data Validation: Manual review of data for internal consistency to identify values that appear atypical when compared to values for the entire data set and to the reviewer’s knowledge of expected meteorological conditions.
• Compare data collected from nearby sites at similar heights and times.
• Compare data to surface meteorology.
Data Validation–QC Levels (3 of 3)
meteorologist to verify consistency over time. This level is often part of the
data interpretation or analysis process.
rawinsondes) or upper-air maps.
Level III Data Validation: Occurs when the data are used during modeling and
analysis efforts, for example, when inconsistencies in analysis and
modeling results are found to be caused by measurement errors.
Data Analysis
The Weibull statistical model, together with the software Excel and MATLAB, has to be applied to the wind mast site data to determine and plot the following parameters:
Flagging rules were applied to the data to detect and flag suspected anemometer and direction vane icing.
Several factors often need to be considered to accurately estimate the true free-stream speed. There are three types of adjustment: tower effects, turbulence, and inclined flow. Some adjustments apply only to certain types of anemometers.
Figure: Plots of speed ratios as a function of wind direction for pairs of anemometers at the same height. The red lines are the result of a computational fluid dynamics model of the tower effects. The prominent dips and spikes in both charts represent the effect of tower shadow. Note that the implied boom direction is 180° opposite the wind direction. (a) A normal degree of scatter and secondary tower influences outside the shadow directions. (b) Relatively large secondary influences, which may be due to equipment or obstructions on the tower.
Source: AWS Truepower.
Measure of Center
❖ Measure of Center
the value at the center or middle of a data set
❖ The three common measures of center are the mean, the median, and the mode.
❖ Mean
what most people call an average, also called the arithmetic mean:
$\bar{x} = \frac{\sum x}{n}$
Mean
❖ Disadvantage
Is sensitive to every data value; one extreme value can affect it dramatically. It is not a resistant measure of center.
Example:
21,25,32,48,53,62,62,64 → $\bar{x}$ = 45.9
21,25,32,48,53,62,62,300 → $\bar{x}$ = 75.4
Median
❖ Median
the measure of center which is the middle value when the original data values are arranged in order of increasing (or decreasing) magnitude
❖ Disadvantage
Does not use the magnitude of every data value; it depends only on the values in the middle of the ordered data
Finding the Median
First sort the values (arrange them in order), then follow one of these:
1. If the number of data values is odd, the median is the number located in the exact middle of the list. Its position in the list is the $\frac{n+1}{2}$-th.
2. If the number of data values is even, the median is found by computing the mean of the two middle numbers, which are those that lie on either side of the position $\frac{n+1}{2}$.
Example of Median
• 6 data values:
5.40 1.10 0.42 0.73 0.48 1.10
• Sorted data:
0.42 0.48 0.73 1.10 1.10 5.40
$\text{median} = \frac{0.73 + 1.10}{2} = 0.915$
Median
❖ Median is not affected by an extreme value; it is a resistant measure of the center.
Example:
21,25,32,48,53,62,62,64
21,25,32,48,53,62,62,300
Median is 50.5 for both data sets.
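The contrast between the mean and the median can be checked with Python's standard `statistics` module. This is only a small sketch using the two data sets above; the variable names are illustrative.

```python
from statistics import mean, median

base = [21, 25, 32, 48, 53, 62, 62, 64]
with_outlier = [21, 25, 32, 48, 53, 62, 62, 300]

# The mean shifts strongly when one extreme value is introduced
mean_base = mean(base)
mean_outlier = mean(with_outlier)

# The median only looks at the middle of the sorted data, so it does not move
median_base = median(base)
median_outlier = median(with_outlier)
```

The means come out at about 45.9 and 75.4, while both medians stay at 50.5, matching the examples on the slides.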
❖ Mode
the value that occurs with the greatest frequency
❖ A data set can have one, more than one, or no mode
Bimodal: two data values occur with the same greatest frequency
Multimodal: more than two data values occur with the same greatest frequency
No Mode: no data value is repeated
Examples:
a) 5.40 1.10 0.42 0.73 0.48 1.10 → Mode is 1.10
b) 27 27 27 55 55 55 88 88 99 → Bimodal - 27 & 55
c) 1 2 3 6 7 8 9 10 → No Mode
Population Variance
The population variance is the mean of the squared deviations in the population.
$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Population Standard Deviation
The population standard deviation is the square root of the population variance.
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
Outliers
An outlier is an extreme data value.
A common rule flags a value as an outlier when it lies more than three standard deviations from the mean.
An outlier may be due to variability in the measurement or it may indicate experimental error; the latter are sometimes excluded from the data sets.
Outliers and z Scores
❖ Data values are not unusual (extreme) if $-2 \le z \le 2$
❖ Data values are outliers if $z < -3$ or $z > 3$
For the green regions:
3. 34% (half of 68%) of the data lies between z=-1 and z=0. Therefore, 16% (=50%-34%) of the data is to the left of z=-1.
4. 47.5% (half of 95%) of the data lies between z=-2 and z=0. Therefore, 2.5% (=50%-47.5%) of the data is to the left of z=-2. Subtracting areas gives that 13.5%=16%-2.5% of the data lies between z=-2 and z=-1.
5. Using symmetry, 13.5% of the data also lies between z=1 and z=2.
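The z-score rule and the 13.5% area can be cross-checked numerically. The sketch below assumes the population standard deviation is used for the z scores and reuses the earlier example data with the extreme value 300; `z_scores` is a hypothetical helper name.

```python
from statistics import NormalDist, mean, pstdev

def z_scores(values):
    """z = (x - mean) / standard deviation, using the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]

data = [21, 25, 32, 48, 53, 62, 62, 300]
zs = z_scores(data)

# Values with |z| > 2 are treated as unusual
unusual = [x for x, z in zip(data, zs) if abs(z) > 2]

# Exact normal-curve area between z = -2 and z = -1 (the slide's rounded figure is 13.5%)
std_normal = NormalDist()
area_between = std_normal.cdf(-1) - std_normal.cdf(-2)
```

The exact area is close to 0.136, so the 13.5% obtained from the rounded 68% and 95% figures is a good approximation.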
Errors in the WMS Data
• Sensor Failure (Constant Value repetition)
• Sensor Theft
• Orientation of the sensor error
• Environmental change (cyclone formation)
• Data Missing
❖ Turbulence Intensity
❖ Air Density
If we have a set of measured wind speeds (ui), the mean of the wind speed is defined as
$\bar{u} = \frac{1}{n}\sum_{i=1}^{n} u_i$
n = the sample size or the number of measured values
ui = set of measured wind speeds
Month-wise Daily Mean Wind Speed Graph
STANDARD DEVIATION
ui = set of measured wind speeds
$\bar{u}$ = mean wind speed
Example
Calculate the mean and standard deviation for the given wind speed values of 2, 4, 7, 8, and 9 m/s.
$\bar{u}$ = (2+4+7+8+9)/5 = 6.00 m/s
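The example can be worked through with the standard library. The standard deviation formula is not legible in this copy of the slides, so the sketch shows both common conventions (dividing by n and by n - 1) rather than assuming one; which of them the slide used is not known.

```python
from statistics import mean, pstdev, stdev

speeds = [2, 4, 7, 8, 9]          # wind speeds in m/s from the example

u_mean = mean(speeds)             # (2+4+7+8+9)/5 = 6 m/s

# Sum of squared deviations: 16 + 4 + 1 + 4 + 9 = 34
sd_population = pstdev(speeds)    # sqrt(34/5), divides by n
sd_sample = stdev(speeds)         # sqrt(34/4), divides by n - 1
```

With n = 5 the two conventions give about 2.61 m/s and 2.92 m/s; for a month of 10-minute records the difference becomes negligible.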
TURBULENCE INTENSITY (TI)
❖ Wind turbulence is the rapid disturbances or irregularities in the wind speed, direction, and vertical component.
❖ It is an important site characteristic, because high turbulence levels may decrease power output and cause extreme loading on wind turbine components.
❖ The most common indicator of turbulence for siting purposes is the standard deviation of the wind speed; dividing it by the mean wind speed gives the turbulence intensity.
Figure: Turbulence Intensity (vertical axis from 0.00 to 0.40)
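As a minimal sketch, turbulence intensity can be computed per averaging interval as the standard deviation of the wind speed divided by the mean wind speed over the same interval. The input values below are made up for illustration, and `turbulence_intensity` is a hypothetical helper name.

```python
def turbulence_intensity(speed_sd, mean_speed):
    """TI = standard deviation of wind speed / mean wind speed (dimensionless)."""
    if mean_speed <= 0:
        raise ValueError("mean wind speed must be positive")
    return speed_sd / mean_speed

# Hypothetical 10-minute interval: mean 14.1 m/s, standard deviation 1.5 m/s
ti = turbulence_intensity(1.5, 14.1)
```

Because TI divides by the mean speed, it is usually evaluated only above some minimum wind speed, where the ratio is numerically stable.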
IEC WT CLASS
• IEC Wind Turbine Classes, Ratings, and Characteristics of Turbulence Intensity
Tower shadow
• The tower can influence the wind speed that is measured by the anemometers. This effect is known as tower shading. The effect can most easily be seen mathematically or graphically by comparing the wind speed ratios of the redundant anemometers.
ENERGY PATTERN FACTOR
❖ Energy Pattern Factor (EPF) is a useful parameter for calculating the available energy in the wind from the values of annual or monthly mean wind speed.
❖ It is also useful when choosing locations with limited wind data, because long-term data from neighboring stations can be correlated with on-site short-term measurements.
❖ EPF values generally range between 1.50 and 2.50.
Example
Calculate EPF for a given mean wind speed 6 m/s for a
month of January in a 10 minute interval and
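The EPF formula itself is not legible in this copy of the slides. A commonly used definition, assumed here, is the ratio of the mean of the cubed wind speeds to the cube of the mean wind speed. The sketch below applies it to the five-value sample from the standard deviation example, which also has a mean of 6 m/s; a real calculation would use every 10-minute record of the month.

```python
def energy_pattern_factor(speeds):
    """EPF = mean(u^3) / (mean(u))^3, an assumed standard definition."""
    n = len(speeds)
    mean_speed = sum(speeds) / n
    mean_cube = sum(u ** 3 for u in speeds) / n
    return mean_cube / mean_speed ** 3

# Illustrative sample (m/s) with a mean of 6 m/s
epf = energy_pattern_factor([2, 4, 7, 8, 9])   # 331.2 / 216
```

The result, about 1.53, sits at the lower end of the typical 1.50 to 2.50 range; a perfectly steady wind would give exactly 1.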
AIR DENSITY
❖ Air density varies with pressure and temperature and can also vary 10% to 15% seasonally.
❖ If the site pressure is known, the hourly air density values with respect to air temperature can be calculated from the following equation:
$\rho = \frac{P}{RT}$ (kg/m3)
❖ If the site pressure is not available, air density can be estimated as a function of site elevation (z) and temperature (T) as follows:
$\rho = \frac{P_0}{RT}\exp\left(-\frac{g z}{RT}\right)$ (kg/m3)
Where
P0 = the standard sea level atmospheric pressure (101,325 Pa), or the actual sea level adjusted pressure reading from a local airport.
g = the gravitational constant (9.8 m/s2)
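The elevation-based estimate can be sketched as below. The slide does not state the value of R or the unit of T; the code assumes the specific gas constant of dry air, R = 287 J/(kg K), and temperature in kelvin. The function names are hypothetical.

```python
import math

R_DRY_AIR = 287.0    # J/(kg K), assumed: specific gas constant of dry air
G = 9.8              # m/s2, as given on the slide
P0 = 101325.0        # Pa, standard sea level pressure, as given on the slide

def air_density_from_pressure(pressure_pa, temp_k):
    """rho = P / (R T), for a known site pressure."""
    return pressure_pa / (R_DRY_AIR * temp_k)

def air_density_from_elevation(elevation_m, temp_k, p0=P0):
    """rho = (P0 / (R T)) * exp(-g z / (R T)), when site pressure is not available."""
    rt = R_DRY_AIR * temp_k
    return (p0 / rt) * math.exp(-G * elevation_m / rt)

rho_sea_level = air_density_from_elevation(0.0, 288.15)    # 15 degrees C at sea level
rho_1000m = air_density_from_elevation(1000.0, 288.15)
```

At sea level and 15 degrees C this gives roughly 1.225 kg/m3, and the estimate drops by about 11% at 1000 m for the same temperature.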
WIND POWER DENSITY (WPD)
❖ Its value combines the effect of a site’s wind speed distribution and its dependence on air density and wind speed.
❖ WPD is defined as the wind power available per unit area swept by the turbine blades and is given by the following equation.
Example
Calculate WPD for a given air density of 1.180 kg/m3 for the month of January at a 10-minute interval.
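The WPD equation is not legible in this copy of the slides. A commonly used form, assumed here, averages the kinetic power flux over all records: WPD = (1/2n) Σ ρ u_i^3. The sketch uses the air density from the example and a made-up five-value speed sample in place of a full month of 10-minute data.

```python
def wind_power_density(speeds, air_density):
    """WPD in W/m2: 0.5 * rho * mean(u^3), an assumed standard form."""
    mean_cube = sum(u ** 3 for u in speeds) / len(speeds)
    return 0.5 * air_density * mean_cube

# Air density from the example; the speeds (m/s) are illustrative only
wpd = wind_power_density([2, 4, 7, 8, 9], 1.180)   # 0.5 * 1.18 * 331.2
```

Because the speeds are cubed before averaging, this result is larger than 0.5 ρ ū^3 computed from the mean speed alone; the ratio of the two is the energy pattern factor.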
Example
The wind speed at 40 m height is 3.82 m/s and the WPD is 47.55 W/m2; the wind speed at 50 m height is 4.07 m/s and the WPD is 60 W/m2. Find the power law index and extrapolate the wind speed and WPD to 70 m height.
Solution:
$\alpha = \frac{\log_{10} 4.07 - \log_{10} 3.82}{\log_{10} 50 - \log_{10} 40} = 0.28$
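The rest of the solution is not legible in this copy, so the following is a sketch of how the extrapolation could proceed. The wind speed is scaled with the power law using the computed index; scaling WPD with exponent 3α is an added assumption (constant air density and an unchanged speed distribution shape), not something the slide confirms.

```python
import math

def shear_exponent(u_low, z_low, u_high, z_high):
    """Power law index from two heights: ln(u2/u1) / ln(z2/z1)."""
    return math.log(u_high / u_low) / math.log(z_high / z_low)

def extrapolate_speed(u_ref, z_ref, z, alpha):
    """Power law: u(z) = u_ref * (z / z_ref) ** alpha."""
    return u_ref * (z / z_ref) ** alpha

alpha = shear_exponent(3.82, 40.0, 4.07, 50.0)      # about 0.28
u_70 = extrapolate_speed(4.07, 50.0, 70.0, alpha)    # wind speed at 70 m

# Assumption: WPD scales with the cube of the speed ratio
wpd_70 = 60.0 * (70.0 / 50.0) ** (3 * alpha)
```

Under these assumptions the 70 m wind speed is about 4.48 m/s and the WPD about 80 W/m2. Note that the two measured WPD values imply a slightly steeper growth with height than 3α, so the WPD figure is indicative only.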
➢ Wind resources are characterized by wind-power density classes, ranging from class 1 (the lowest) to class 7 (the highest).
➢ Good wind resources (class 3 and above) have an average annual wind speed of at least 6.5 m/s.
Class   Speed (m/s)   WPD (W/m2)
1       0-5.6         0-200
2       5.6-6.4       200-300
3       6.4-7.0       300-400
4       7.0-7.5       400-500
5       7.5-8.0       500-600
6       8.0-8.8       600-800
7       8.8-11.9      800-2000
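The class table lends itself to a simple lookup. In this sketch a WPD value that falls exactly on a boundary is assigned to the higher class, which is a choice made here rather than something the table specifies; `wind_power_class` is a hypothetical helper name.

```python
from bisect import bisect_right

# Upper WPD bound (W/m2) of classes 1 to 7, taken from the table
WPD_UPPER_BOUNDS = [200, 300, 400, 500, 600, 800, 2000]

def wind_power_class(wpd):
    """Return the wind power class (1-7) for a WPD value in W/m2."""
    if wpd < 0 or wpd > WPD_UPPER_BOUNDS[-1]:
        raise ValueError("WPD outside the range covered by the table")
    # Values on a boundary go to the higher class; cap at class 7
    return min(bisect_right(WPD_UPPER_BOUNDS, wpd) + 1, 7)

class_at_40m = wind_power_class(47.55)   # WPD from the earlier 40 m example
class_good = wind_power_class(350)
```

Both WPD values from the power law example (47.55 and 60 W/m2) fall in class 1.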
Analysis
❑ time distribution
❑ frequency distribution
Frequency distribution
• Apart from the distribution of the wind speeds over a day or a year, it is important to know the number of hours per month or per year during which the given wind speeds occurred, i.e. the frequency distribution of the wind speeds.
• To arrive at this frequency distribution, we must first divide the wind speed domain into a number of intervals, mostly of equal width of 1 m/s or 0.5 m/s.
• Then, starting at the first interval of, say, 0-1 m/s, the number of hours in the period concerned during which the wind speed was in this interval is counted.
• When the number of hours in each interval is plotted against the wind speed, the frequency distribution emerges as a histogram.
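The counting procedure above can be sketched as follows, assuming 10-minute records (six per hour) and 1 m/s intervals. The sample speeds are made up and `frequency_distribution` is a hypothetical helper name.

```python
import math
from collections import Counter

def frequency_distribution(speeds, bin_width=1.0, records_per_hour=6):
    """Hours spent in each wind speed interval, keyed by (lower, upper) bounds in m/s."""
    counts = Counter(math.floor(u / bin_width) for u in speeds)
    return {
        (i * bin_width, (i + 1) * bin_width): count / records_per_hour
        for i, count in sorted(counts.items())
    }

# Twelve made-up 10-minute mean speeds (two hours of data)
speeds = [0.4, 0.8, 1.2, 1.9, 1.5, 2.3, 0.1, 1.1, 2.8, 1.7, 0.9, 1.3]
hours = frequency_distribution(speeds)
total_hours = sum(hours.values())
```

Plotting the values of `hours` against the interval bounds gives the histogram; dividing each value by `total_hours` gives the percentage frequency distribution.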
PERCENTAGE FREQUENCY DISTRIBUTION OF WIND SPEED
The frequency distribution can be used to estimate the annual energy production of a wind turbine by multiplying the number of hours in each interval by the power output that the wind turbine generates at that wind speed interval.
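The multiplication described above can be sketched in a few lines. The hours per interval and the power curve values below are entirely hypothetical, chosen only to show the arithmetic; intervals without a power value are treated as producing nothing.

```python
def annual_energy_kwh(hours_per_bin, power_kw_per_bin):
    """Sum over wind speed intervals of (hours in interval) * (power output in kW)."""
    return sum(
        hours * power_kw_per_bin.get(bin_centre, 0.0)
        for bin_centre, hours in hours_per_bin.items()
    )

# Hypothetical hours per year by interval centre (m/s) and hypothetical power curve (kW)
hours_per_bin = {4: 1200, 6: 1500, 8: 900}
power_kw_per_bin = {4: 50.0, 6: 250.0, 8: 600.0}

aep_kwh = annual_energy_kwh(hours_per_bin, power_kw_per_bin)
# 1200*50 + 1500*250 + 900*600 = 975,000 kWh
```

A real estimate would cover every interval from cut-in to cut-out speed and use the manufacturer's power curve corrected for site air density.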
Horizontal and Inter-annual Variability
Wind Resource Assessment Techniques - K. Boopathi - ITP_Sept.13
Boxplot
❖ A boxplot (or box-and-whisker diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile, Q1; the median; and the third quartile, Q3.
Vertical Wind Variability
Wind rose
• Can be used to design layouts of wind farms: make most of the wind turbines face the prevailing direction (wake effect)
Figure: Prevailing wind direction
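Finding the prevailing direction amounts to binning the direction records into sectors and picking the most frequent one. The sketch below assumes 16 sectors of 22.5 degrees centred on the compass points; the direction values are made up and the function names are hypothetical.

```python
from collections import Counter

SECTORS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
           "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]

def sector_of(direction_deg):
    """Map a direction in degrees (0 = north, clockwise) to one of 16 sectors."""
    width = 360.0 / len(SECTORS)
    index = int(((direction_deg + width / 2) % 360.0) // width)
    return SECTORS[index]

def prevailing_sector(directions):
    """Return the sector that contains the most direction records."""
    counts = Counter(sector_of(d) for d in directions)
    return counts.most_common(1)[0][0]

# Made-up 10-minute mean wind directions in degrees
directions = [204, 200, 202, 214, 214, 208, 213]
prevailing = prevailing_sector(directions)
```

Plotting the sector counts (optionally split by wind speed class) on a polar axis gives the wind rose.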
CUMULATIVE DISTRIBUTION FUNCTION
What to do with the measurements?
• Remove erroneous records (strange values such as -999).
• Scale the measurements
- What if the measurements are made at 20 m and the hub height will be at 70 m?
Figure: Height (m) versus Wind speed (m/s)
components, such as the blades and gearbox
One of two mathematical relations is typically used to characterize the
measured wind shear:
• Power law profile,
vertical wind shear factor
• Logarithmic profile,
Scaling
• The wind shear depends on the roughness length, z0, of the terrain
• Examples
- Lawn, water: z0 = 0.01 m
- Bushland: z0 = 0.1 m
- Towns, forests: z0 = 1 m
Scaling – Power law
$\frac{U}{U_{ref}} = \left(\frac{z}{z_{ref}}\right)^{\alpha}$
• α: power law coefficient. Two ways to calculate it
- From the reference values
$\alpha = \frac{0.37 - 0.088\,\ln(U_{ref})}{1 - 0.088\,\ln(z_{ref}/10)}$
• Uref: wind speed at zref
• zref: altitude at which we know the wind speed
• z: altitude at which we want to calculate the wind speed
• U: wind speed at z (to be calculated)
Logarithmic profile:
$\frac{U}{U_{ref}} = \frac{\ln(z/z_0)}{\ln(z_{ref}/z_0)}$
Figure: Vertical wind profile for different roughness lengths z0, assumed “geostrophic wind” of 15 m/s
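The two scaling relations can be compared on the earlier question of moving from 20 m to a 70 m hub height. This is a sketch only: the reference speed of 5 m/s is made up, the roughness length is the bushland value from the examples, and the function names are hypothetical.

```python
import math

def alpha_from_reference(u_ref, z_ref):
    """Power law coefficient from the reference values, as given on the slide."""
    return (0.37 - 0.088 * math.log(u_ref)) / (1 - 0.088 * math.log(z_ref / 10))

def scale_power_law(u_ref, z_ref, z, alpha):
    """U = U_ref * (z / z_ref) ** alpha."""
    return u_ref * (z / z_ref) ** alpha

def scale_log_law(u_ref, z_ref, z, z0):
    """U = U_ref * ln(z / z0) / ln(z_ref / z0)."""
    return u_ref * math.log(z / z0) / math.log(z_ref / z0)

u_ref, z_ref, z_hub = 5.0, 20.0, 70.0      # hypothetical measurement and hub height
alpha = alpha_from_reference(u_ref, z_ref)
u_hub_power = scale_power_law(u_ref, z_ref, z_hub, alpha)
u_hub_log = scale_log_law(u_ref, z_ref, z_hub, z0=0.1)   # bushland roughness
```

For these inputs the power law gives about 6.8 m/s and the logarithmic profile about 6.2 m/s, which illustrates how sensitive a hub-height estimate is to the chosen shear model.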
Negative wind shear
Rayleigh distribution
Figure: Rayleigh distributions for four average wind speeds.
Weibull distribution
$f(U) = \frac{k}{c}\left(\frac{U}{c}\right)^{k-1}\exp\left[-\left(\frac{U}{c}\right)^{k}\right]$
k>0: shape parameter
c>0: scale parameter
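The Weibull density can be evaluated directly from the formula. The sketch below also uses two standard results that are not stated on the slide: the Weibull mean is c Γ(1 + 1/k), and the Rayleigh distribution is the special case k = 2 with c = 2Ū/√π. The mean speed of 6 m/s is an illustrative value.

```python
import math

def weibull_pdf(u, k, c):
    """f(U) = (k/c) * (U/c)**(k-1) * exp(-(U/c)**k) for U >= 0."""
    if u < 0:
        return 0.0
    return (k / c) * (u / c) ** (k - 1) * math.exp(-((u / c) ** k))

def weibull_mean(k, c):
    """Mean wind speed of a Weibull distribution: c * Gamma(1 + 1/k)."""
    return c * math.gamma(1 + 1 / k)

def rayleigh_scale(mean_speed):
    """Scale parameter of the Rayleigh case (k = 2) for a given mean speed."""
    return 2 * mean_speed / math.sqrt(math.pi)

c = rayleigh_scale(6.0)                 # illustrative mean speed of 6 m/s
mean_check = weibull_mean(2.0, c)       # should reproduce 6 m/s

# Midpoint-rule integration from 0 to 50 m/s: the density should integrate to about 1
step = 0.01
area = sum(weibull_pdf(step * i + step / 2, 2.0, c) * step for i in range(5000))
```

Multiplying `weibull_pdf` by 8760 hours gives the expected hours per year at each wind speed, which can then be combined with a power curve to estimate energy yield.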
Source: Robert Gasch and Jochen Twele, “Wind Power Plants. Fundamentals, Design, Construction and
Operation”. Chapter 4, 2012.
Energy yield - Bromma
THANK YOU