
Matlab Fundamental 13

This document discusses analyzing electricity consumption data through moving window operations and linear correlation. It introduces calculating statistics like the mean on moving subsets of data using functions like movmean. Noncentered windows and leading/trailing windows are also covered. Linear correlation between variables is investigated through plotting variables on the same axes using different scales, as well as through computational correlation coefficients.

Uploaded by

duc anh

13.1 Course Example: Analyzing Electricity Consumption

The electricity consumption data shown contains monthly electricity usage in the United States across different sectors. The values represent the total consumption for the given month, in units of megawatt hours.

How can you determine future electricity usage?

To forecast the demand in the future, you may want to understand how different sectors relate to each other and model the trend of the average consumption values.

In this chapter, you will perform common data analysis tasks such as:
• Computing trailing averages and other moving window operations
• Finding linear correlations between data sets
• Fitting and evaluating polynomial models

Introduction: Moving Window Operations

The electricity usage data shows both a long-term trend and a short-term seasonality. In such a situation it is common to calculate summary statistics, such as the mean, on a moving subset of the data.

1. Take the data points in a "window".
2. Calculate a given statistic on this subset of data.
3. Slide the window across one data point and repeat the calculation on the new subset.
4. Keep sliding the window across the data. A value is computed for each window.
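
The steps above can be written out as an explicit loop. The following is only a sketch on made-up data (the vector x and the window length k are assumptions for illustration); it mimics the default behavior of movmean, which shortens the window near the ends of the data.

```matlab
% Made-up data for illustration only.
x = [4 8 6 2 10 12 5];
k = 3;                            % window length (odd, so the window is centered)
half = (k - 1)/2;                 % points on each side of the current point
n = numel(x);
manual = zeros(1,n);
for i = 1:n
    lo = max(1, i - half);        % window is cut short at the left end
    hi = min(n, i + half);        % window is cut short at the right end
    manual(i) = mean(x(lo:hi));   % statistic on the current window
end
viaMovmean = movmean(x,k);        % same calculation with the built-in function
```

Both vectors should agree, which is a quick way to convince yourself what a moving window function computes.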

13.2 Moving Window Operations: (2/7) Moving Statistics Functions

MATLAB provides functions for performing several statistical operations, such as the mean, on moving windows. These functions all have the same syntax:

>> y = movmean(x,k)

Outputs
y    Result of the k-point moving mean applied to the data in x.

Inputs
x    Array of data.
k    Number of points in window.

As with other statistical functions in MATLAB, if x is a matrix, the function is applied to the columns of x independently. You can change this behavior by specifying an optional dimension argument. You can also provide an optional flag to specify how NaN values are handled.
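
As a small sketch of the optional arguments mentioned above, the example below uses a made-up 3-by-3 matrix A (an assumption for illustration) containing one NaN. It compares the default column-wise behavior with the dimension argument and the 'omitnan' flag.

```matlab
A = [1 2 3; 4 NaN 6; 7 8 9];        % made-up data with one missing value
colMean = movmean(A,3);             % default: window slides down each column
rowMean = movmean(A,3,2);           % dimension 2: window slides along each row
skipNaN = movmean(A,3,'omitnan');   % NaN values are ignored inside each window
```

With the default flag, any window that touches the NaN returns NaN; with 'omitnan', the mean is taken over the remaining values in the window.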

You can use the movmean function to calculate the centered k-point moving average of an array.

xavg = movmean(x,k)

This code creates and plots two vectors.
x = 0:0.2:6
y = sin(x)
plot(x,y,'o-')

TASK
Create a vector named ym9 that contains the centered 9-point moving average of y.

ym9 = movmean(y,9)

TASK
Plot ym9 as a function of x with point markers and a solid line. Add the plot to the same figure using hold on and hold off.

hold on
plot(x,ym9,'.-')
hold off

TASK
Create a vector named ym17 that contains the centered 17-point moving average of y.

Plot ym17 as a function of x with point markers and a solid line. Add the plot to the same figure using hold on and hold off.

ym17 = movmean(y,17)
hold on
plot(x,ym17,'.-')
hold off
Notice the change in the shape of the curve of the moving average near the ends of the data. A 17-point window needs 8 points on either side of the current point. For the first and last 8 data points, the movmean function by default shrinks the window as needed. You can change this behavior by specifying the optional 'Endpoints' parameter. Try filling the window with zeros or simply discarding any values too close to the endpoints to fit the entire 17-point window.

You may also want to try applying other statistical functions such as the minimum or maximum:

movmin       Moving minimum
movmax       Moving maximum
movsum       Moving sum
movmean      Moving mean
movmedian    Moving median
movstd       Moving standard deviation
movvar       Moving variance

When you are finished, you may move on to the next section.
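
One possible way to try the 'Endpoints' suggestions above is sketched here, reusing the x and y vectors from this section. The variable names yShrink, yZero, and yDiscard are made up for illustration.

```matlab
x = 0:0.2:6;
y = sin(x);
yShrink  = movmean(y,17);                        % default: window shrinks near the ends
yZero    = movmean(y,17,'Endpoints',0);          % missing window values are treated as 0
yDiscard = movmean(y,17,'Endpoints','discard');  % only full 17-point windows are kept
```

Note that yDiscard is shorter than y (8 points are dropped at each end), so it would need to be plotted against x(9:end-8) rather than x.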

13.2 Moving Window Operations: (4/7) Smoothing Electricity Data

The table edata contains monthly electricity usage, in MWh, for the U.S., separated by sector. The matrix usage contains the consumption for three sectors (residential, commercial, and industrial). The vector total contains the total consumption. The months are stored in the datetime vector dates.

TASK
Calculate the 1-year moving average of total. Add this to the existing plot using point markers and a solid line. Repeat for the 2-year moving average.

tot12 = movmean(total,12);
hold on
plot(dates,tot12,'.-')
tot24 = movmean(total,24);
plot(dates,tot24,'.-')
hold off

TASK
Calculate the 2-year moving average of all the sectors in usage. Create a new plot of the result (with solid lines and no markers).

us24 = movmean(usage,24)
plot(dates,us24,'-')

13.2 Moving Window Operations: (5/7) Noncentered Windows

When you provide just the number of points in the window, the mov* functions use a centered window. For example, a 5-point window uses the current point, the previous 2 points, and the next 2 points.

A 4-point window uses the current point, the previous 2 points, and the next point.

However, you can also explicitly specify the number of points in the window backward and forward from the current point:

>> y = movmean(x,[kb kf])

For example, 1 point backward and 3 points forward from the current point.
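
The window rules above can be checked on a short made-up vector (x below is an assumption for illustration). The sketch compares a 4-point window given as a single number with the equivalent explicit [kb kf] form, and then uses the 1-backward, 3-forward window from the example.

```matlab
x = [1 4 9 16 25 36 49];     % made-up data
even4 = movmean(x,4);        % 4-point window given as a single number
asym4 = movmean(x,[2 1]);    % 2 points backward, 1 point forward
lead  = movmean(x,[1 3]);    % 1 point backward, 3 points forward
```

Here even4 and asym4 should match, and lead(2) is the mean of the first five values of x.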

13.2 Moving Window Operations: (6/7) Leading and Trailing Windows

You can find a moving average using a noncentered window.

dN = movmean(d,[kb,kf])

kb is the number of trailing points to include, and kf is the number of leading points.

This code creates and plots two vectors.
x = 0:0.2:6
y = sin(x)
plot(x,y,'o-')

TASK
Create a vector named trail4max that contains the trailing 4-point maximum of y (i.e., the maximum of the current point and the three previous points).

trail4max = movmax(y,[3 0])

TASK
Plot trail4max as a function of x with point markers and a solid line.

hold on
plot(x,trail4max,'.-')
hold off

TASK
Create a vector named lead4max that contains the leading 4-point maximum of y (i.e., the maximum of the current point and the three next points).

lead4max = movmax(y,[0 3])

TASK
Plot lead4max as a function of x with point markers and a solid line.

plot(x,lead4max,'.-')
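
To see what trailing and leading windows do without a plot, here is a small sketch on a made-up five-element vector (v is an assumption for illustration, not course data).

```matlab
v = [1 3 2 5 4];               % made-up data
trailMax = movmax(v,[3 0]);    % current point and the three previous points
leadMax  = movmax(v,[0 3]);    % current point and the three next points
% trailMax is [1 3 3 5 5]
% leadMax  is [5 5 5 5 4]
```

The trailing result only "knows" about values already seen, while the leading result reacts to the peak at v(4) from the first element onward.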

Summary: Moving Window Operations

movmin       Moving minimum
movmax       Moving maximum
movsum       Moving sum
movmean      Moving mean
movmedian    Moving median
movstd       Moving standard deviation
movvar       Moving variance

z = movmean(y,k)          Mean calculated with a centered moving k-point window.
z = movmean(y,[kb kf])    Mean calculated with a moving window with kb points backward and kf points forward from the current point.

13.3 Linear Correlation: (1/9) Introduction

Introduction: Linear Correlation
From a plot of the electricity usage for the individual sectors, it is clear there are similarities between the different sectors. How strong is this relationship? Are variables similar enough that a model for one will work for another?

You can investigate relationships between variables graphically and computationally. In particular, it is common to look for linear correlations where a change in one variable corresponds to a directly proportional change in another variable.

13.3 Linear Correlation: (2/9) Plotting with Different Scales

You can often see relationships between variables by simply plotting them together on the same axes. However, when different variables have different units, the difference in scale can make it hard to see important features of both variables together.

You can use the yyaxis command to create plots with two independent scales on the vertical axis.

The command yyaxis left creates new axes with independent vertical axis scales. The axis on the left is currently active.

yyaxis left

Plots are created in the active axes.

plot(t,y1)

The command yyaxis right changes the currently active axis to the axis on the right. Plots are now created in this axis, which uses a different scale from the axis on the left.

yyaxis right
plot(t,y2)

Issuing the command yyaxis left a second time does not modify the axis on the left but makes it active again, allowing you to make modifications to the axes without replotting.
yyaxis left
ylabel('y_1')
ylim([0 20])

Similarly, yyaxis right makes the axis on the right active again.

yyaxis right
ylabel('y_2')
ylim([0 600])
xlabel('x')
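
Put together, the sequence above might look like the following sketch. The vectors t, y1, and y2 are made up for illustration, chosen so that they fit within the axis limits used in this section.

```matlab
t  = 0:0.1:10;
y1 = 10 + 8*sin(t);        % made-up series, stays between 2 and 18
y2 = 300 + 250*cos(t);     % made-up series, stays between 50 and 550

figure
yyaxis left                % left axis becomes active
plot(t,y1)
ylabel('y_1')
ylim([0 20])

yyaxis right               % right axis becomes active
plot(t,y2)
ylabel('y_2')
ylim([0 600])
xlabel('t')
```

Each series fills its own vertical scale, so the shapes can be compared even though the values differ by more than an order of magnitude.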

13.3 Linear Correlation: (3/9) Plotting Electricity Usage

The goal of this interaction is to create the following plot of electricity usage:

You can use the yyaxis command to create separate axes on the left and right.

yyaxis left
plot(x1,y1)
yyaxis right
plot(x2,y2)

TASK
Plot the first three columns of usagesmooth (residential, commercial, and industrial usage) on one vertical axis against dates on the horizontal axis. On the same figure, plot the last column (total usage) with a second vertical axis.

My attempt, which was marked wrong (note that the last line passes date rather than dates):

yyaxis left
plot(dates,usagesmooth(:,1))
hold on
plot(dates,usagesmooth(:,2))
plot(dates,usagesmooth(:,3))
hold off
yyaxis right
plot(date,usagesmooth(:,4))

The expected answer is the following. The three columns can be plotted entirely in matrix form:

yyaxis left
plot(dates,usagesmooth(:,1:3))
yyaxis right
plot(dates,usagesmooth(:,4))

13.3 Linear Correlation: (4/9) Scatter Plots

Given two vectors named residential and commercial containing the electricity usage data, you can visualize the relationship between them.

What if you want to visualize the relationship between three or more variables? You can create scatter plots for each pair of variables in the data using the function plotmatrix.

plotmatrix
Creates separate scatter plots for each pair of columns in the input matrix.

The input to the function plotmatrix is a matrix, with each variable in a separate column.

The result is a matrix of scatter plots.

The plot in the second column and the first row is the scatter plot of column 2 against column 1 of the input matrix.

Similarly, the plot in the fourth column and the second row is the scatter plot of column 4 against column 2 of the input matrix.
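
As a sketch of the input layout plotmatrix expects, the example below builds a made-up matrix with three variables in three columns (a, b, c, and data are assumptions for illustration, generated from random numbers rather than the electricity data).

```matlab
rng(0)                             % fixed seed so the figure is repeatable
a = randn(100,1);
b = 2*a + 0.5*randn(100,1);        % strongly related to a
c = randn(100,1);                  % unrelated to a and b
data = [a b c];                    % one variable per column
[S,AX] = plotmatrix(data);         % S: plot handles, AX: matrix of axes
```

With three columns, the figure contains a 3-by-3 grid of axes. The panels pairing a with b should show a clear line-like cloud, while the panels involving c should look like shapeless scatter.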

13.3 Linear Correlation: (5/9) Linear Correlation

Correlation Coefficient
In addition to visualizing the relationship between the variables, you can quantify the strength of the linear relationship numerically by calculating the correlation coefficient.
The MATLAB function corrcoef computes the linear correlation using the data from the input matrix. The correlation coefficient has a value between +1 and -1.
• A correlation of +1 or -1 indicates a perfect linear relationship between the variables.
    • +1 means that an increase in one variable is associated with an increase in the other.
    • -1 means that an increase in one variable is associated with a decrease in the other.
• A correlation of 0 indicates that the variables are not linearly related.

corrcoef
Compute the correlation coefficients from the input matrix.

Like plotmatrix, the input to the corrcoef function is a matrix, with each variable in a separate column.

The result is a symmetric matrix containing the correlation coefficients between the variables (columns) of the input matrix.

The value in the second column and the first row of the output matrix is the correlation coefficient between column 2 and column 1 of the input matrix data.

Similarly, the value in the fourth column and the second row of the output matrix is the correlation coefficient between column 4 and column 2 of data.

A variable is always perfectly correlated with itself. Hence, the diagonal elements of the output matrix are always +1.
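
The properties described above can be illustrated with made-up data in which the relationships are exactly linear (x, up, and down are assumptions for illustration).

```matlab
x    = (1:10)';
up   = 2*x + 1;                % increases exactly linearly with x
down = -3*x + 5;               % decreases exactly linearly with x
R = corrcoef([x up down]);     % one variable per column
```

In this constructed case R(1,2) is +1, R(1,3) is -1, the diagonal is all ones, and R equals its own transpose (up to rounding). Real data such as the electricity usage will give values strictly between these extremes.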

Quiz
Given three column vectors a, b, and c, which of the following commands can be used to find the coefficient of correlation between the three vectors?
correlation([a b c])
corrcoef([a b c])
corrcoef(a,b,c)

Answer: corrcoef([a b c])

13.3 Linear Correlation: (8/9) Correlations in Electricity Usage

This code imports, organizes, and plots the usage data.
edata = readtable('electricity.csv');
dates = edata.Date;
usage = edata{:,2:end};
sectors = edata.Properties.VariableNames(2:end);
plot(dates,usage)
legend(sectors,'Location','northwest')

TASK
Use plotmatrix to create a matrix of plots of all the sectors (columns) in usage against each other.

Use corrcoef to quantify these correlations by calculating the corresponding correlation coefficients. Store the result in a matrix called usagecorr.

plotmatrix(usage)
usagecorr = corrcoef(usage)

Summary: Linear Correlation
You can investigate relationships between variables visually and computationally:
• Plot multiple series together. Use yyaxis to add another vertical axis to allow for different scales.
• Plot variables against each other. Use plotmatrix to create an array of scatter plots.
• Calculate linear correlation coefficients. Use corrcoef to calculate pairwise correlations.

Plot multiple series together.
yyaxis left
plot(...)
yyaxis right
plot(...)

Plot variables against each other.
plotmatrix(data)

Calculate linear correlation coefficients.
corrcoef(data)

13.4 Polynomial Fitting: (1/7) Introduction

After the seasonal variation is removed, long-term trends in the electricity usage data become clear. Plotting the sectors together shows a strong correlation between residential and total usage. Is it possible to build a predictive model of the residential usage by fitting a model to the known data?

You can easily fit and evaluate polynomial models using the polyfit and polyval functions.

13.4 Polynomial Fitting: (2/7) Polynomial Fitting

Determine the coefficients
You can use the function polyfit to compute the coefficients of a least-squares polynomial fit to the data.

>> c = polyfit(x,y,n)

Suppose that you have two vectors x and y.
x = 0:5;
y = [2 1 4 4 3 2];
plot(x,y)

Fit a polynomial of degree 3 to the x-y data.

c = polyfit(x,y,3)

c =

   -0.1296    0.6865   -0.1759    1.6746

Coefficients in the output vector are ordered from the highest to the lowest degree. So, the polynomial which fits the x-y data can be expressed as

y = -0.1296 x^3 + 0.6865 x^2 - 0.1759 x + 1.6746
Evaluate the polynomial
Given the vector c containing the coefficients of the polynomial, you can evaluate the polynomial at any value of x using the polyval function.

>> yFit = polyval(c,xFit)

You can evaluate the polynomial at any arbitrary values of x. A common approach is to create a vector of uniformly spaced x values.

xFit = -1:0.01:6;

Evaluate and plot the fitted polynomial at values contained in the vector xFit.

yFit = polyval(c,xFit);
hold on
plot(xFit,yFit)
hold off
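
To make the coefficient ordering concrete, the sketch below reuses the x and y vectors from this section and evaluates the cubic twice: once by writing out the terms by hand and once with polyval. The names byHand and byPolyval are made up for illustration.

```matlab
x = 0:5;
y = [2 1 4 4 3 2];
c = polyfit(x,y,3);                               % cubic least-squares fit
byHand    = c(1)*x.^3 + c(2)*x.^2 + c(3)*x + c(4); % highest degree first
byPolyval = polyval(c,x);                         % same polynomial via polyval
```

The two results should agree to rounding error, confirming that c(1) multiplies the highest power and c(end) is the constant term.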

Given 1-by-50 vectors x and y, what is the result of the following command?

z = polyfit(x,y,3)

13.4 Polynomial Fitting: (4/7) Fit a Line

The polyfit function finds the coefficients of the best fit n-th degree polynomial of yData in terms of xData.

cf = polyfit(xData,yData,n)

TASK
Fit a first degree polynomial in terms of x to the vector y. Use the function polyfit and store the resulting coefficients in a vector named c.

c = polyfit(x,y,1)

The polyval function evaluates a polynomial (given by coefficients cf) at the points xEval.

yEval = polyval(cf,xEval)

TASK
Now, use the function polyval to find the value of the fitted polynomial at each of the x values. Store the result in yFit.

yFit = polyval(c,x)

TASK
Plot the polynomial values yFit against x as a red line on top of the existing graph.

hold on
plot(x,yFit,'r')
hold off

13.4 Polynomial Fitting: (5/7) Centering and Scaling

When performing polynomial fitting with large x values, numerical precision limitations can lead to inaccurate results. The polyfit function will give a warning in this case.

TASK
Use polyfit to fit a third degree polynomial to the vector penguins as a function of yr. Store the resulting coefficients in a vector named c.

c = polyfit(yr,penguins,3)

You can avoid the numerical precision limitations by centering and scaling the x data when using polyfit and polyval. To do this, ask for a third output from polyfit:

[c,~,sc] = polyfit(x,y,deg)

TASK
Use centering and scaling to fit a third degree polynomial to the vector penguins as a function of yr. Store the polynomial coefficients in a vector named c and the scaling information in a variable called sc.

[c,~,sc] = polyfit(yr,penguins,3)

When evaluating the polynomial, pass the vector of scaling coefficients sc to polyval as a fourth input:

yFit = polyval(c,xFit,[],sc)

TASK
Use polyval to evaluate the fitted polynomial at each of the yr values. Store the result in penguinfit.

penguinfit = polyval(c,yr,[],sc)
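
The following sketch shows what the third output of polyfit holds and why it must be passed on to polyval. The yr and counts vectors here are made-up stand-ins (not the penguin data), built from an exact cubic so the fit can be checked.

```matlab
yr = (1990:2015)';                                        % made-up years
counts = 0.02*(yr - 2000).^3 - 0.5*(yr - 2000).^2 + 40;   % made-up values
[c,~,sc] = polyfit(yr,counts,3);   % sc(1) is mean(yr), sc(2) is std(yr)
z = (yr - sc(1))/sc(2);            % centered and scaled x values used in the fit
fitScaled = polyval(c,yr,[],sc);   % polyval applies the same transformation
fitManual = polyval(c,z);          % equivalent evaluation on z directly
```

The coefficients in c describe a polynomial in z, not in yr, so calling polyval(c,yr) without sc would give the wrong values.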

13.4 Polynomial Fitting: (6/7) Fit a Polynomial to Electricity Usage

The datetime vector dates contains the months for which the electricity usage is recorded. To perform polynomial fitting, you must first convert the dates to elapsed times.

This code imports, organizes, and plots the usage data.
edata = readtable('electricity.csv');
dates = edata.Date;
residential = edata.Residential;
plot(dates,residential,'.-')

TASK
Create a vector t that contains, for each date, the number of days elapsed since the first data point.

t = days(dates - dates(1))

TASK
Fit a cubic polynomial to the residential usage data as a function of t. Use centering and scaling to ensure accuracy.

Evaluate the fitted polynomial at the same t values and add the result to the existing plot.

[c,~,sc] = polyfit(t,residential,3)
resFit = polyval(c,t,[],sc)
hold on
plot(dates,resFit)
hold off
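
The date conversion step can be tried on its own with a few made-up monthly dates (the dates vector below is an assumption for illustration, not the contents of electricity.csv).

```matlab
dates = datetime(2020,1,1) + calmonths(0:3)';   % made-up monthly dates, Jan to Apr 2020
elapsed = dates - dates(1);                     % duration array
t = days(elapsed);                              % plain numbers: days since the first date
% t is [0; 31; 60; 91]
```

Subtracting datetimes gives a duration, and days converts that duration to an ordinary numeric vector, which is what polyfit needs.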
13.5 Project - Data Analysis

TASK
The variable mpg contains NaN values. Find the rows in mpg with NaN values, and remove those rows from all three data vectors: mpg, hp, and wt.

nanIdx = ismissing(mpg)
mpg(nanIdx) = [];
hp(nanIdx) = [];
wt(nanIdx) = [];

Fuel economy in the U.S. is typically given in miles/gallon. In many countries, however, the standard units are liters/100km.

Given mpg, you can calculate economy in L/100km by dividing 235.214583 by mpg.
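
The constant 235.214583 can be reproduced from the unit definitions, as in this sketch. It assumes US liquid gallons and international miles; the mpg values are made up for illustration.

```matlab
litersPerGallon = 3.785411784;            % US liquid gallon, in liters
kmPerMile = 1.609344;                     % international mile, in kilometers
factor = 100*litersPerGallon/kmPerMile;   % (L/100km) = factor / (miles/gallon)
mpgExample = [20; 30; 40];                % made-up fuel economy values
econExample = factor./mpgExample;         % element-wise division
```

Because the conversion divides by mpg, a higher miles/gallon figure corresponds to a lower L/100km figure.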
TASK
Create a variable econ that contains the fuel economy in L/100km rather than miles/gallon.

Combine the data for weight, horsepower, and fuel economy in L/100km (in that order) into a 50-by-3 matrix called numdata.

econ = 235.214583./mpg
numdata = [wt hp econ]

TASK
Create a matrix of the scatter plots of the variables in numdata (weight, horsepower, and fuel economy) in a single figure.

Calculate the corresponding correlation coefficients and store them as a matrix called cc.

plotmatrix(numdata)
cc = corrcoef(numdata)

TASK
Determine the best-fit line (i.e., a first degree polynomial fit) for fuel economy (in L/100km) as a function of vehicle weight.

Evaluate the fitted model at the weights in the data. Store the fitted values in a vector called econFit.

Note that you do not need to use centering and scaling for the fit.

p = polyfit(wt,econ,1)
econFit = polyval(p,wt)

TASK
Create a scatter plot of fuel economy against weight, and add the best-fit line as a red line.

scatter(wt,econ)
hold on
plot(wt,econFit,'r')
hold off
