Statistics Toolbox
For Use with MATLAB

Computation
Visualization
Programming

User's Guide
Version 2
Phone                  508-647-7000
Fax                    508-647-7001
Web                    http://www.mathworks.com
Anonymous FTP server   ftp.mathworks.com
Newsgroup              comp.soft-sys.matlab

[email protected]     Technical support
[email protected]     Product enhancement suggestions
[email protected]        Bug reports
[email protected]         Documentation error reports
[email protected]   Subscribing user registration
[email protected]     Order status, license renewals, passcodes
[email protected]        Sales, pricing, and general information
First printing    Version 1
Second printing   Version 2
Third printing    For MATLAB 5
                  Revised for MATLAB 5.1 (online version)
                  Revised for MATLAB 5.2 (online version)
                  Revised for Version 2.1.2 (Release 11) (online only)
Contents
Preface

1 Tutorial
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
Primary Topic Areas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
Probability Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
Parameter Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Descriptive Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Cluster Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Linear Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Nonlinear Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Hypothesis Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Multivariate Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Statistical Plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Statistical Process Control (SPC) . . . . . . . . . . . . . . . . . . . . . 1-4
Design of Experiments (DOE) . . . . . . . . . . . . . . . . . . . . . . . 1-4
Descriptive Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-42
Measures of Central Tendency (Location) . . . . . . . . . . . . . . . . 1-42
Measures of Dispersion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-43
Functions for Data with Missing Values (NaNs) . . . . . . . . . . . 1-45
Percentiles and Graphical Descriptions . . . . . . . . . . . . . . . . . . 1-46
The Bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-47
Cluster Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-50
Terminology and Basic Procedure . . . . . . . . . . . . . . . . . . . . . . 1-50
Finding the Similarities Between Objects . . . . . . . . . . . . . . . . 1-51
Returning Distance Information . . . . . . . . . . . . . . . . . . . . . 1-53
Defining the Links Between Objects . . . . . . . . . . . . . . . . . . . . 1-53
Evaluating Cluster Formation . . . . . . . . . . . . . . . . . . . . . . . . . 1-56
Verifying the Cluster Tree . . . . . . . . . . . . . . . . . . . . . . . . . . 1-56
Getting More Information about Cluster Links . . . . . . . . . 1-57
Creating Clusters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-61
Finding the Natural Divisions in the Dataset . . . . . . . . . . . 1-61
Specifying Arbitrary Clusters . . . . . . . . . . . . . . . . . . . . . . . 1-62
Linear Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-65
One-Way Analysis of Variance (ANOVA) . . . . . . . . . . . . . . . . 1-65
Two-Way Analysis of Variance (ANOVA) . . . . . . . . . . . . . . . . 1-67
Multiple Linear Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-69
Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-72
Quadratic Response Surface Models . . . . . . . . . . . . . . . . . . . . 1-73
Exploring Graphs of Multidimensional Polynomials . . . . . 1-74
Stepwise Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-75
Stepwise Regression Interactive GUI . . . . . . . . . . . . . . . . . 1-75
Stepwise Regression Plot . . . . . . . . . . . . . . . . . . . . . . . . . . 1-76
Stepwise Regression Diagnostics Figure . . . . . . . . . . . . . . 1-76

Nonlinear Regression Models . . . . . . . . . . . . . . . . . . . . . . . . . 1-79
Hypothesis Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-85
Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-85
Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-86
Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-87

Multivariate Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-91

Statistical Plots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-103
Demos . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-125
The disttool Demo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-125
The polytool Demo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-126
The randtool Demo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-130
The rsmdemo Demo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-131
Part 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-132
Part 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-133
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-134
2 Reference
Preface
Before You Begin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
What Is the Statistics Toolbox? . . . . . . . . . . . . . . . . . . . . . . . . . vi
How to Use This Guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
Mathematical Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Typographical Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Preface
You can change the way any toolbox function works by copying and renaming
the M-file, then modifying your copy. You can also extend the toolbox by adding
your own M-files.
Second, the toolbox provides a number of interactive tools that let you access
many of the functions through a graphical user interface (GUI). Together, the
GUI-based tools provide an environment for polynomial fitting and prediction,
as well as probability function exploration.
All toolbox users should use Chapter 2, Reference, for information about
specific tools. For functions, reference descriptions include a synopsis of the
function's syntax, as well as a complete explanation of its options and
operation. Many reference descriptions also include examples, a description
of the function's algorithm, and references to additional reading material.
Use this guide in conjunction with the software to learn about the powerful
features that MATLAB provides. Each chapter provides numerous examples
that apply the toolbox to representative statistical tasks.
The random number generation functions for the various probability
distributions are based on the primitive functions rand and randn. There are many
examples that start by generating data using random numbers. To duplicate
the results in these examples, first execute the commands below.
seed = 931316785;
rand('seed',seed);
randn('seed',seed);
You might want to save these commands in an M-file script called init.m.
Then, instead of three separate commands, you need only type init.
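For example, a minimal init.m might look like this:

% init.m -- set the generator states so the guide's examples are reproducible
seed = 931316785;
rand('seed',seed);     % seed the uniform generator
randn('seed',seed);    % seed the normal generator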
Mathematical Notation
This manual and the Statistics Toolbox functions use the following
mathematical notation conventions.

E(x)         Expected value of x. E(x) = \int t f(t) dt
f(x|a,b)     Probability density function. x is the independent variable; a and b are fixed parameters.
F(x|a,b)     Cumulative distribution function.
I([a, b])    Indicator function. In this example the function takes the value 1 on the closed interval from a to b, and is 0 elsewhere.
p and q      p is the probability of some event. q is the probability of its complement, so q = 1 - p.
Typographical Conventions

To Indicate                              This Guide Uses             Example
Example code                             Monospace type              A = 5
Function names/syntax                    Monospace type
Keys                                     Boldface
Mathematical expressions                 Variables in italics        p = x^2 + 2x + 3
MATLAB output                            Monospace type              A =
                                                                         5
Menu names, menu items, and controls     Boldface
New terms                                Italics                     An array is an ordered
                                                                     collection of information.
In addition, some words in our syntax lines are shown within single quotation
marks (sometimes double). These marks are a MATLAB requirement and must
be typed. For example,
dir dirname
f = hex2num('s')
or
f ="pressure"
1 Tutorial
Introduction . . . . . . . . . . . . . . . . . . . . 1-2
Probability Distributions . . . . . . . . . . . . . . 1-5
Descriptive Statistics . . . . . . . . . . . . . . . . 1-42
Cluster Analysis . . . . . . . . . . . . . . . . . . 1-50
Linear Models . . . . . . . . . . . . . . . . . . . 1-65
Nonlinear Regression Models . . . . . . . . . . . . 1-79
Hypothesis Tests . . . . . . . . . . . . . . . . . . 1-85
Multivariate Statistics . . . . . . . . . . . . . . . 1-91
Statistical Plots . . . . . . . . . . . . . . . . . 1-103
Statistical Process Control (SPC) . . . . . . . . . 1-110
Introduction
The Statistics Toolbox, for use with MATLAB, supplies basic statistics
capability on the level of a first course in engineering or scientific statistics.
The statistics functions it provides are building blocks suitable for use inside
other analytical tools.
Probability Distributions
The Statistics Toolbox supports 20 probability distributions. For each
distribution there are five associated functions. They are:
Probability density function (pdf)
Cumulative distribution function (cdf)
Inverse of the cumulative distribution function
Random number generator
Mean and variance as a function of the parameters
For data driven distributions (beta, binomial, exponential, gamma, normal,
Poisson, uniform and Weibull), the Statistics Toolbox has functions for
computing parameter estimates and confidence intervals.
Descriptive Statistics
The Statistics Toolbox provides functions for describing the features of a data
sample. These descriptive statistics include measures of location and spread,
percentile estimates and functions for dealing with data having missing
values.
Cluster Analysis
The Statistics Toolbox provides functions that allow you to divide a set of
objects into subgroups, each having members that are as much alike as
possible. This process is called cluster analysis.
Linear Models
In the area of linear models the Statistics Toolbox supports one-way and
two-way analysis of variance (ANOVA), multiple linear regression, stepwise
regression, response surface prediction, and ridge regression.
Nonlinear Models
For nonlinear models there are functions for parameter estimation, interactive
prediction and visualization of multidimensional nonlinear fits, and confidence
intervals for parameters and predicted values.
Hypothesis Tests
There are also functions that do the most common tests of hypothesis:
t-tests and Z-tests.
Multivariate Statistics
The Statistics Toolbox supports methods in Multivariate Statistics, including
Principal Components Analysis and Linear Discriminant Analysis.
Statistical Plots
The Statistics Toolbox adds box plots, normal probability plots, Weibull
probability plots, control charts, and quantile-quantile plots to the arsenal of
graphs in MATLAB. There is also extended support for polynomial curve fitting
and prediction.
Probability Distributions
Probability distributions arise from experiments where the outcome is subject
to chance. The nature of the experiment dictates which probability
distributions may be appropriate for modeling the resulting random outcomes.
There are two types of probability distributions: continuous and discrete.
Continuous (data)    Continuous (statistics)    Discrete
Beta                 Chi-square                 Binomial
Exponential          Noncentral Chi-square      Discrete Uniform
Gamma                F                          Geometric
Lognormal            Noncentral F               Hypergeometric
Normal               Student's t                Negative Binomial
Rayleigh             Noncentral t               Poisson
Uniform
Weibull
Suppose you are studying a machine that produces videotape. One measure of
the quality of the tape is the number of visual defects per hundred feet of tape.
The result of this experiment is an integer, since you cannot observe 1.5
defects. To model this experiment you should use a discrete probability
distribution.
A measure affecting the cost and quality of videotape is its thickness. Thick
tape is more expensive to produce, while variation in the thickness of the tape
on the reel increases the likelihood of breakage. Suppose you measure the
thickness of the tape every 1000 feet. The resulting numbers can take a
continuum of possible values, which suggests using a continuous probability
distribution to model the results.
Using a probability model does not allow you to predict the result of any
individual experiment, but you can determine the probability that a given
outcome will fall inside a specific range of values.
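Probability Density Function (pdf)
The pdf function call has the same general format for every distribution in
the Statistics Toolbox. The commands below (reconstructed from the
description that follows) evaluate the standard normal density:

x = [-3:0.1:3];
f = normpdf(x,0,1);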
The variable f contains the density of the normal pdf with parameters 0 and 1
at the values in x. The first input argument of every pdf is the set of values for
which you want to evaluate the density. Other arguments contain as many
parameters as are necessary to define the distribution uniquely. The normal
distribution requires two parameters, a location parameter (the mean, µ) and
a scale parameter (the standard deviation, σ).
Cumulative Distribution Function (cdf)
The cdf of a value x, F(x), is the probability of observing any outcome less than
or equal to x:

F(x) = \int_{-\infty}^{x} f(t)\,dt
A cdf has two theoretical properties:
The cdf ranges from 0 to 1.
If y > x, then the cdf of y is greater than or equal to the cdf of x.
The cdf function call has the same general format for every distribution in the
Statistics Toolbox. The following commands illustrate how to call the cdf for the
normal distribution:
x = [-3:0.1:3];
p = normcdf(x,0,1);
The variable p contains the probabilities associated with the normal cdf with
parameters 0 and 1 at the values in x. The first input argument of every cdf is
the set of values for which you want to evaluate the probability. Other
arguments contain as many parameters as are necessary to define the
distribution uniquely.
Inverse Cumulative Distribution Function
The inverse cdf returns the value associated with a given cumulative
probability. To see the relationship between a continuous cdf and its inverse
function, try the following:
following:
x = [-3:0.1:3];
xnew = norminv(normcdf(x,0,1),0,1);
The result, xnew, is the same as the original x. Similarly, applying the inverse
function to a pair of extreme probabilities gives the corresponding critical
values:

p = [0.005 0.995];
x = norminv(p,0,1)
x =
   -2.5758    2.5758
The variable x contains the values associated with the normal inverse function
with parameters 0 and 1 at the probabilities in p. The difference p(2) - p(1) is
0.99. Thus, the values in x define an interval that contains 99% of the standard
normal probability.
The inverse function call has the same general format for every distribution in
the Statistics Toolbox. The first input argument of every inverse function is the
set of probabilities for which you want to evaluate the critical values. Other
arguments contain as many parameters as are necessary to define the
distribution uniquely.
Random Numbers
The methods for generating random numbers from any distribution all start
with uniform random numbers. Once you have a uniform random number
generator, you can produce random numbers from other distributions either
directly or by using inversion or rejection methods.
Direct. Direct methods flow from the definition of the distribution.
Inversion. The inversion method rests on the fact that if U is a uniform random
number on (0,1) and F is a continuous cdf, then F⁻¹(U) is a random number
whose distribution has cdf F.
So, you can generate a random number from a distribution by applying the
inverse function for that distribution to a uniform random number.
Unfortunately, this approach is usually not the most efficient.
Rejection. The functional form of some distributions makes it difficult or time
consuming to generate random numbers using direct or inversion methods.
Rejection methods can sometimes provide an elegant solution in these cases.
Suppose you want to generate random numbers from a distribution with pdf f.
To use rejection methods you must first find another density, g, and a constant,
c, so that the inequality below holds for all x:

f(x) ≤ c g(x)

You then generate the random numbers you want using the following steps:

1 Generate a random number x from the distribution G with density g.

2 Generate a uniform random number u.

3 If u ≤ f(x)/(c g(x)), accept x; otherwise reject it and repeat from step 1.
For efficiency you need a cheap method for generating random numbers from
G and the scalar, c, should be small. The expected number of iterations is c.
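As a concrete sketch (this example is illustrative and not from the original
text), the following code uses rejection to sample from the beta(2,2) density
f(x) = 6x(1-x), with a uniform proposal g(x) = 1 on [0,1] and c = 1.5:

% Rejection sampling sketch: draw from f(x) = 6x(1-x) on [0,1]
% using g(x) = 1 and c = 1.5, so that f(x) <= c*g(x) for all x.
n = 1000;
samples = zeros(n,1);
i = 1;
while i <= n
  x = rand;                    % step 1: generate x from G
  u = rand;                    % step 2: generate a uniform u
  if u <= 6*x*(1-x)/1.5        % step 3: accept with probability f(x)/(c*g(x))
    samples(i) = x;
    i = i + 1;
  end
end
hist(samples)                  % roughly bell-shaped on [0,1]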
Syntax for Random Number Functions. You can generate random numbers from
each distribution using its random number function (betarnd, normrnd, and
so on). Each function provides a single random number or a matrix of random
numbers, depending on the arguments you specify in the call.
For example, here is the way to generate random numbers from the beta
distribution. Four statements obtain random numbers: the first returns a
single number, the second returns a 2-by-2 matrix of random numbers, and the
third and fourth return 2-by-3 matrices of random numbers.
a = 1;
b = 2;
c = [.1 .5; 1 2];
d = [.25 .75; 5 10];
m = [2 3];
nrow = 2;
ncol = 3;
r1 = betarnd(a,b)
r1 =
0.4469
r2 = betarnd(c,d)
r2 =
    0.8931    0.4832
    0.1316    0.2403
r3 = betarnd(a,b,m)
r3 =
    0.4196    0.6078    0.1392
    0.0410    0.0723    0.0782
r4 = betarnd(a,b,nrow,ncol)
r4 =
    0.0520    0.3975    0.1284
    0.3891    0.1848    0.5186
Mean and Variance as a Function of Parameters. The example shows a contour plot
of the mean of the Weibull distribution as a function of the parameters.
x = (0.5:0.1:5);
y = (1:0.04:2);
[X,Y] = meshgrid(x,y);
Z = weibstat(X,Y);
[c,h] = contour(x,y,Z,[0.4 0.6 1.0 1.8]);
clabel(c);
Overview of the Distributions
This section gives a short introduction to each of the supported distributions:
beta, binomial, chi-square, noncentral chi-square, discrete uniform,
exponential, F, noncentral F, gamma, geometric, hypergeometric, lognormal,
negative binomial, normal, Poisson, Rayleigh, Student's t, noncentral t,
uniform, and Weibull.
Beta Distribution
Background. The beta distribution describes a family of curves that are unique
in that they are nonzero only on the interval [0 1]. A more general version of
the function assigns parameters to the end-points of the interval.
The beta cdf is the same as the incomplete beta function.
The beta distribution has a functional relationship with the t distribution. If Y
is an observation from Student's t distribution with ν degrees of freedom, then
the following transformation generates X, which is beta distributed:

X = \frac{1}{2} + \frac{1}{2}\,\frac{Y}{\sqrt{\nu + Y^2}}

if Y ~ t(ν), then X ~ β(ν/2, ν/2)

The Statistics Toolbox uses this relationship to compute values of the t cdf and
inverse function as well as generating t distributed random numbers.
Mathematical Definition. The beta pdf is:

y = f(x|a,b) = \frac{1}{B(a,b)}\, x^{a-1} (1-x)^{b-1}\, I_{(0,1)}(x)
Parameter Estimation. Suppose you are collecting data that has hard lower and
upper bounds of zero and one respectively. Parameter estimation is the process
of determining the parameters of the beta distribution that fit this data best in
some sense.
The function betafit returns the MLEs and confidence intervals for the
parameters of the beta distribution. Here is an example using random
numbers from the beta distribution with a = 5 and b = 0.2.

r = betarnd(5,0.2,100,1);
[phat, pci] = betafit(r)
phat =
    4.5330    0.2301
pci =
    2.8051    0.1771
    6.2610    0.2832
The MLE for the parameter a is 4.5330, compared to the true value of 5. The
95% confidence interval for a goes from 2.8051 to 6.2610, which includes the
true value.

Similarly the MLE for the parameter b is 0.2301, compared to the true value of
0.2. The 95% confidence interval for b goes from 0.1771 to 0.2832, which also
includes the true value.
Of course in this made-up example we know the true value. In
experimentation we do not.
Example and Plot. The shape of the beta distribution is quite variable depending
on the values of the parameters, as illustrated by this plot.
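The curves can be reproduced with commands along these lines (a
reconstruction; the exact code is not in the original text):

x = 0.01:0.01:0.99;
plot(x,betapdf(x,0.75,0.75),'-.',x,betapdf(x,1,1),'-',x,betapdf(x,4,4),'--')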
(The plot shows beta pdfs on [0,1] for a = b = 0.75, a = b = 1, and a = b = 4.)
The constant pdf (the flat line) shows that the standard uniform distribution is
a special case of the beta distribution.
Binomial Distribution
Background. The binomial distribution models the total number of successes in
n repeated trials when each trial results in success or failure, the probability
of success p is constant from trial to trial, and the trials are independent.

Mathematical Definition. The binomial pdf is:

y = f(x|n,p) = \binom{n}{x} p^{x} q^{n-x}\, I_{(0,1,\ldots,n)}(x)

where \binom{n}{x} = \frac{n!}{x!(n-x)!} and q = 1 - p.

The binomial distribution is discrete. The pdf is nonzero for zero and the
positive integers up to n.
Parameter Estimation. The function binofit returns the MLE and confidence
interval for the binomial parameter p. Here is an example using a random
number from the binomial distribution with n = 100 and p = 0.9.

r = binornd(100,0.9);
[phat, pci] = binofit(r,100)
phat =
    0.8800
pci =
    0.7998    0.9364

The MLE for the parameter p is 0.8800, compared to the true value of 0.9. The
95% confidence interval for p goes from 0.7998 to 0.9364, which includes the
true value.

Of course in this made-up example we know the true value of p.
Example and Plot. The following commands generate a plot of the binomial pdf
for n = 10 and p = 1/2.
x = 0:10;
y = binopdf(x,10,0.5);
plot(x,y,'+')
Chi-Square Distribution
Background. The χ² distribution is a special case of the gamma distribution
(with the gamma parameter b = 2), whose pdf is:

y = f(x|a,b) = \frac{1}{b^{a}\,\Gamma(a)}\, x^{a-1} e^{-x/b}

The χ² distribution is important in normal sampling theory: if s² is the sample
variance of n independent normal observations with variance σ², then

\frac{(n-1)s^2}{\sigma^2} \sim \chi^2(n-1)
Mathematical Definition. The χ² pdf is:

y = f(x|\nu) = \frac{x^{(\nu-2)/2}\, e^{-x/2}}{2^{\nu/2}\, \Gamma(\nu/2)}
Example and Plot. The χ² distribution is skewed to the right, especially for few
degrees of freedom (ν). The plot shows the χ² distribution with four degrees of
freedom.
x = 0:0.2:15;
y = chi2pdf(x,4);
plot(x,y)
Noncentral Chi-Square Distribution
Background. The χ² distribution is a special case of the noncentral chi-square
distribution, obtained when the noncentrality parameter δ is zero.

Mathematical Definition. The noncentral chi-square cdf is:

F(x|\nu,\delta) = \sum_{j=0}^{\infty} \left( \frac{(\delta/2)^{j}}{j!}\, e^{-\delta/2} \right) \Pr\!\left[ \chi^{2}_{\nu+2j} \le x \right]
Example and Plot. The following commands generate a plot of the noncentral
chi-square pdf.
x = (0:0.1:10)';
p1 = ncx2pdf(x,4,2);
p = chi2pdf(x,4);
plot(x,p,'--',x,p1,'-')
Discrete Uniform Distribution
Background. The discrete uniform distribution puts equal weight on the
integers from 1 to N.

Mathematical Definition. The discrete uniform pdf is:

y = f(x|N) = \frac{1}{N}\, I_{(1,\ldots,N)}(x)
Example and Plot. As for all discrete distributions, the cdf is a step function. The
plot shows the discrete uniform cdf for N = 10.
x = 0:10;
y = unidcdf(x,10);
stairs(x,y)
set(gca,'Xlim',[0 11])
Exponential Distribution
Background. Like the chi-square distribution, the exponential distribution is a
special case of the gamma distribution, obtained by setting the gamma
parameter a = 1 in:

y = f(x|a,b) = \frac{1}{b^{a}\,\Gamma(a)}\, x^{a-1} e^{-x/b}

The exponential distribution is commonly used to model waiting times and
lifetimes.

Mathematical Definition. The exponential pdf is:

y = f(x|\mu) = \frac{1}{\mu}\, e^{-x/\mu}
Parameter Estimation. Suppose you are stress testing light bulbs and collecting
data on their lifetimes. You assume that these lifetimes follow an exponential
distribution. You want to know how long you can expect the average light bulb
to last. Parameter estimation is the process of determining the parameters of
the exponential distribution that fit this data best in some sense.
The function expfit returns the MLEs and confidence intervals for the
parameters of the exponential distribution. Here is an example using random
numbers from the exponential distribution with µ = 700.
lifetimes = exprnd(700,100,1);
[muhat, muci] = expfit(lifetimes)
muhat =
672.8207
muci =
547.4338
810.9437
The MLE for the parameter µ is 672.8, compared to the true value of 700. The
95% confidence interval for µ goes from 547.4 to 810.9, which includes the true
value.

In our life tests we do not know the true value of µ, so it is nice to have a
confidence interval on the parameter to give a range of likely values.
Example and Plot. For exponentially distributed lifetimes, the probability that
an item will survive an extra unit of time is independent of the current age of
the item. The example shows a specific case of this special property.
l = 10:10:60;
lpd = l + 0.1;
deltap = (expcdf(lpd,50) - expcdf(l,50))./(1 - expcdf(l,50))
deltap =
    0.0020    0.0020    0.0020    0.0020    0.0020    0.0020
The plot shows the exponential pdf with its parameter (and mean), µ, set
to 2.
x = 0:0.1:10;
y = exppdf(x,2);
plot(x,y)
F Distribution
Background. The F distribution has a natural relationship with the chi-square
distribution. If χ₁² and χ₂² are independent chi-square random variables with
ν₁ and ν₂ degrees of freedom respectively, then the ratio (χ₁²/ν₁)/(χ₂²/ν₂)
has an F distribution with ν₁ and ν₂ degrees of freedom.

Mathematical Definition. The F pdf is:

y = f(x|\nu_1,\nu_2) = \frac{\Gamma\!\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\!\left(\frac{\nu_1}{2}\right)\Gamma\!\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} \frac{x^{(\nu_1-2)/2}}{\left(1 + \frac{\nu_1}{\nu_2}x\right)^{(\nu_1+\nu_2)/2}}
Example and Plot. The plot shows that the F distribution exists on the positive
real numbers and is skewed to the right.
x = 0:0.01:10;
y = fpdf(x,5,3);
plot(x,y)
Noncentral F Distribution
Background. As with the χ² distribution, the F distribution is a special case of
the noncentral F distribution, obtained when the noncentrality parameter δ is
zero.

Mathematical Definition. The noncentral F cdf is:

F(x|\nu_1,\nu_2,\delta) = \sum_{j=0}^{\infty} \left( \frac{(\delta/2)^{j}}{j!}\, e^{-\delta/2} \right) I\!\left( \frac{\nu_1 x}{\nu_2 + \nu_1 x} \,\middle|\, \frac{\nu_1}{2}+j,\, \frac{\nu_2}{2} \right)

where I(x|a,b) is the incomplete beta function with parameters a and b.
Example and Plot. The following commands generate a plot of the noncentral F
pdf.
x = (0.01:0.1:10.01)';
p1 = ncfpdf(x,5,20,10);
p = fpdf(x,5,20);
plot(x,p,'--',x,p1,'-')
Gamma Distribution
Background. The gamma distribution is a family of curves based on two
parameters. The chi-square and exponential distributions are one-parameter
special cases of the gamma distribution.

Mathematical Definition. The gamma pdf is:

y = f(x|a,b) = \frac{1}{b^{a}\,\Gamma(a)}\, x^{a-1} e^{-x/b}
Parameter Estimation. Suppose you are stress testing computer memory chips and
collecting data on their lifetimes. You assume that these lifetimes follow a
gamma distribution. You want to know how long you can expect the average
computer memory chip to last. Parameter estimation is the process of
determining the parameters of the gamma distribution that fit this data best
in some sense.
The function gamfit returns the MLEs and confidence intervals for the gamma
parameters. Here is an example using random numbers from the gamma
distribution with a = 10 and b = 5.

lifetimes = gamrnd(10,5,100,1);
[phat, pci] = gamfit(lifetimes)
Note phat(1) = â and phat(2) = b̂. The MLE for the parameter a is 10.98,
compared to the true value of 10. The 95% confidence interval for a goes from
7.4 to 14.6, which includes the true value.

Similarly the MLE for the parameter b is 4.7, compared to the true value of 5.
The 95% confidence interval for b goes from 3.2 to 6.3, which also includes the
true value.
In our life tests we do not know the true value of a and b so it is nice to have a
confidence interval on the parameters to give a range of likely values.
Example and Plot. In the example the gamma pdf is plotted with the solid line.
The normal pdf has a dashed line type.
x = gaminv((0.005:0.01:0.995),100,10);
y = gampdf(x,100,10);
y1 = normpdf(x,1000,100);
plot(x,y,'-',x,y1,'-.')
Geometric Distribution
Background. The geometric distribution is discrete, existing only on the
nonnegative integers. It models the number of failures before the first success
in a series of independent trials, each with success probability p.

Mathematical Definition. The geometric pdf is:

y = f(x|p) = p\, q^{x}\, I_{(0,1,\ldots)}(x)

where q = 1 - p.
Hypergeometric Distribution
Background. The hypergeometric distribution models the total number of items
of a particular type you draw when sampling without replacement from a
finite population containing two types of items.

Mathematical Definition. The hypergeometric pdf is:

y = f(x|M,K,n) = \frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}}
Example and Plot. The plot shows the cdf of an experiment taking 20 samples
from a group of 1000 where there are 50 items of the desired type.
x = 0:10;
y = hygecdf(x,1000,50,20);
stairs(x,y)
Lognormal Distribution
Background. The normal and lognormal distributions are closely related. If X is
distributed lognormal with parameters µ and σ², then ln(X) is distributed
normal with parameters µ and σ².

Mathematical Definition. The lognormal pdf is:

y = f(x|\mu,\sigma) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-(\ln x - \mu)^{2}/(2\sigma^{2})}
Example and Plot. Suppose the income of a family of four in the United States
follows a lognormal distribution with µ = log(20,000) and σ² = 1.0. Plot the
income density.
x = (10:1000:125010)';
y=lognpdf(x,log(20000),1.0);
plot(x,y)
set(gca,'Xtick',[0 30000 60000 90000 120000 ])
set(gca,'xticklabel',str2mat('0','$30,000','$60,000',...
'$90,000','$120,000'))
Negative Binomial Distribution
Background. The negative binomial distribution models the number of
successes before a specified number of failures is reached in an independent
series of repeated identical trials.

Mathematical Definition. The negative binomial pdf is:

y = f(x|r,p) = \binom{r+x-1}{x}\, p^{r} q^{x}\, I_{(0,1,\ldots)}(x)

where q = 1 - p.
Example and Plot. The following commands generate a plot of the negative
binomial pdf.
x = (0:10);
y = nbinpdf(x,3,0.5);
plot(x,y,'+')
set(gca,'XLim',[0.5,10.5])
Normal Distribution
Background. The normal distribution is a two-parameter family of curves. The
first parameter, µ, is the mean. The second, σ, is the standard deviation. The
standard normal distribution (written Φ(x)) sets µ to zero and σ to one.
Mathematical Definition. The normal pdf is:

y = f(x|\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^{2}/(2\sigma^{2})}

Parameter Estimation. The MVUE of the mean µ is the sample average. There are
two common textbook estimators of the variance σ²:

1)  s_1^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2

2)  s_2^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2

where \bar{x} = \sum_{i=1}^{n} \frac{x_i}{n}

The first is the maximum likelihood estimator; the second is the minimum
variance unbiased estimator (MVUE) of σ².
The function normfit returns the MVUEs and confidence intervals for and
2. Here is a playful example modeling the heights (inches) of a randomly
chosen 4th grade class.
height = normrnd(50,2,30,1); % Simulate heights.
[mu, s, muci, sci] = normfit(height)
mu =
50.2025
s =
1.7946
muci =
49.5210
50.8841
sci =
1.4292
2.4125
Example and Plot. The plot shows the bell curve of the standard normal pdf,
with µ = 0 and σ = 1.
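The figure can be reproduced with (a reconstruction, not the original code):

x = -3:0.1:3;
plot(x,normpdf(x,0,1))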
Poisson Distribution
Background. The Poisson distribution is appropriate for applications that
involve counting the number of times a random event occurs in a given amount
of time, distance, area, etc. Sample applications that involve Poisson
distributions include the number of Geiger counter clicks per second, the
number of people walking into a store in an hour, and the number of flaws per
1000 feet of video tape.
The Poisson distribution is a one-parameter discrete distribution that takes
nonnegative integer values. The parameter, λ, is both the mean and the
variance of the distribution.

Mathematical Definition. The Poisson pdf is:

y = f(x|\lambda) = \frac{\lambda^{x}}{x!}\, e^{-\lambda}\, I_{(0,1,\ldots)}(x)
Parameter Estimation. The MLE and the MVUE of the Poisson parameter, λ, is the
sample mean. The sum of independent Poisson random variables is also
Poisson, with parameter equal to the sum of the individual parameters. The
Statistics Toolbox makes use of this fact to calculate confidence intervals on λ.
As λ gets large the Poisson distribution can be approximated by a normal
distribution with µ = λ and σ² = λ. The Statistics Toolbox uses this
approximation for calculating confidence intervals for values of λ greater than
100.
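As an illustration of this approximation (a sketch, with λ = 150 chosen
arbitrarily), the Poisson probabilities track the normal density closely:

lambda = 150;
x = 100:200;
plot(x,poisspdf(x,lambda),'+',x,normpdf(x,lambda,sqrt(lambda)),'-')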
Example and Plot. The plot shows the probability for each nonnegative integer
when λ = 5.
x = 0:15;
y = poisspdf(x,5);
plot(x,y,'+')
Rayleigh Distribution
Background. The Rayleigh distribution is a special case of the Weibull
distribution, obtained by substituting a = 1/(2b²) for the Weibull scale
parameter and setting the Weibull shape parameter to 2.

If the velocity of a particle in the x and y directions are two independent
normal random variables with zero means and equal variances, then the
distance the particle travels per unit time is distributed Rayleigh.

Mathematical Definition. The Rayleigh pdf is:

y = f(x|b) = \frac{x}{b^{2}}\, e^{-x^{2}/(2b^{2})}
Example and Plot. The following commands generate a plot of the Rayleigh pdf.
x = [0:0.01:2];
p = raylpdf(x,0.5);
plot(x,p)
Parameter Estimation. The MLE of the Rayleigh parameter is:

b = \sqrt{ \frac{\sum_{i=1}^{n} x_i^2}{2n} }
Student's t Distribution
Background. The t distribution is a family of curves depending on a single
parameter ν (the degrees of freedom). As ν goes to infinity, the t distribution
converges to the standard normal distribution.

Mathematical Definition. Student's t pdf is:

y = f(x|\nu) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)} \frac{1}{\sqrt{\nu\pi}} \frac{1}{\left(1 + \frac{x^{2}}{\nu}\right)^{(\nu+1)/2}}
Example and Plot. The plot compares the t distribution with ν = 5 (solid line) to
the shorter tailed standard normal distribution (dashed line).
x = -5:0.1:5;
y = tpdf(x,5);
z = normpdf(x,0,1);
plot(x,y,'-',x,z,'-.')
Noncentral t Distribution
Background. The noncentral t distribution is a generalization of the familiar
Student's t distribution.
If x̄ and s are the mean and standard deviation of an independent random
sample of size n from a normal distribution with mean µ, then:

t(\nu) = \frac{\bar{x} - \mu}{s/\sqrt{n}}, \quad \nu = n - 1

Suppose that the mean of the normal distribution is not µ. Then the ratio
above has the noncentral t distribution. The noncentrality parameter is the
difference between the sample mean and µ.
Mathematical Definition. The noncentral t cdf is:

F(x|\nu,\delta) = \sum_{j=0}^{\infty} \frac{(\delta^{2}/2)^{j}}{j!}\, e^{-\delta^{2}/2}\, I\!\left( \frac{x^{2}}{\nu + x^{2}} \,\middle|\, \frac{1}{2}+j,\, \frac{\nu}{2} \right)

where I(x|a,b) is the incomplete beta function.

Example and Plot. The following commands compare the noncentral t cdf with
the (central) t cdf.
x = (-5:0.1:5)';
p1 = nctcdf(x,10,1);
p = tcdf(x,10);
plot(x,p,'--',x,p1,'-')
Uniform (Continuous) Distribution
Background. The uniform distribution has a constant pdf between its two
parameters a, the minimum, and b, the maximum. The standard uniform
distribution (a = 0 and b = 1) is a special case of the beta distribution, obtained
by setting both of its parameters to one.

Mathematical Definition. The uniform cdf is:

p = F(x|a,b) = \frac{x-a}{b-a}\, I_{[a,b]}(x)
Parameter Estimation. The sample minimum and maximum are the MLEs of a
and b respectively.
Example and Plot. The example illustrates the inversion method for generating
normal random numbers using rand and norminv. Note that the MATLAB
function, randn, does not use inversion since it is not efficient for this case.
u = rand(1000,1);
x = norminv(u,0,1);
hist(x)
Weibull Distribution
Background. Waloddi Weibull (1939) offered the distribution that bears his
name as an appropriate analytical tool for modeling the breaking strength of
materials. One way to characterize life distributions is through the hazard
rate, h(t) = f(t)/(1 - F(t)), the conditional probability of failure in the next
instant given survival to time t.
Substituting the pdf and cdf of the exponential distribution for f(t) and F(t)
above yields a constant. The example on the next page shows that the hazard
rate for the Weibull distribution can vary.
Mathematical Definition. The Weibull pdf is:

y = f(x|a,b) = a b x^{b-1} e^{-a x^{b}}\, I_{(0,\infty)}(x)
Parameter Estimation. The function weibfit returns the MLEs and confidence
intervals for the Weibull parameters. Here is an example using 100 random
numbers from a Weibull distribution with a = 0.5 and b = 2.

r = weibrnd(0.5,2,100,1);
[phat, ci] = weibfit(r)

The MLE of b comes out as 1.9582, and the confidence intervals are:

ci =
    0.3851    1.6598
    0.5641    2.2565

The default 95% confidence interval for each parameter contains the true
value.
Example and Plot. The exponential distribution has a constant hazard function,
which is not generally the case for the Weibull distribution.
The plot shows the hazard functions for exponential (dashed line) and Weibull
(solid line) distributions having the same mean life. The Weibull hazard rate
here increases with age (a reasonable assumption).
t = 0:0.1:3;
h1 = exppdf(t,0.6267)./(1 - expcdf(t,0.6267));
h2 = weibpdf(t,2,2)./(1 - weibcdf(t,2,2));
plot(t,h1,'--',t,h2,'-')
Descriptive Statistics
Data samples can have thousands (even millions) of values. Descriptive
statistics are a way to summarize this data into a few numbers that contain
most of the relevant information.
Measures of Central Tendency (Location)
The purpose of measures of central tendency is to locate the data values on the
number line. The table gives the function names and descriptions.

geomean     Geometric mean
harmmean    Harmonic mean
mean        Arithmetic average (sample mean)
median      50th percentile (sample median)
trimmean    Trimmed mean
The average is a simple and popular estimate of location. If the data sample
comes from a normal distribution, then the sample average is also optimal
(MVUE of ).
Unfortunately, outliers, data entry errors, or glitches exist in almost all real
data. The sample average is sensitive to these problems. One bad data value
can move the average away from the center of the rest of the data by an
arbitrarily large distance.
The median and trimmed mean are two measures that are resistant (robust) to
outliers. The median is the 50th percentile of the sample, which will only
change slightly if you add a large perturbation to any value. The idea behind
the trimmed mean is to ignore a small percentage of the highest and lowest
values of a sample for determining the center of the sample.
The geometric mean and harmonic mean, like the average, are not robust to
outliers. They are useful when the sample is distributed lognormal or heavily
skewed.
The example shows the behavior of the measures of location for a sample with
one outlier.
x = [ones(1,6) 100]
x =
     1     1     1     1     1     1   100

locate = [harmmean(x) mean(x) median(x) trimmean(x,25)]
locate =
    1.1647   15.1429    1.0000    1.0000
You can see that the mean is far from any data value because of the influence
of the outlier. The median and trimmed mean ignore the outlying value and
describe the location of the rest of the data values.
Measures of Dispersion
The purpose of measures of dispersion is to find out how spread out the data
values are on the number line. Another term for these statistics is measures of
spread.
The table gives the function names and descriptions.

iqr      Interquartile range
mad      Mean absolute deviation
range    Range
std      Standard deviation
var      Variance
The range (the difference between the maximum and minimum values) is the
simplest measure of spread. But if there is an outlier in the data, it will be the
minimum or maximum value. Thus, the range is not robust to outliers.
The standard deviation and the variance are popular measures of spread that
are optimal for normally distributed samples. The sample variance is the
MVUE of the normal parameter 2. The standard deviation is the square root
of the variance and has the desirable property of being in the same units as the
data. That is, if the data is in meters the standard deviation is in meters as
well. The variance is in meters2, which is more difficult to interpret.
Neither the standard deviation nor the variance is robust to outliers. A data
value that is separate from the body of the data can increase the value of the
statistics by an arbitrarily large amount.
The Mean Absolute Deviation (MAD) is also sensitive to outliers. But the MAD
does not move quite as much as the standard deviation or variance in response
to bad data.
The Interquartile Range (IQR) is the difference between the 75th and 25th
percentile of the data. Since only the middle 50% of the data affects this
measure, it is robust to outliers.
The example below shows the behavior of the measures of dispersion for a
sample with one outlier.
x = [ones(1,6) 100]
x =
     1     1     1     1     1     1   100

stats = [mad(x) range(x) std(x)]
stats =
   24.2449   99.0000   37.4185
Functions for Data with Missing Values (NaNs)
MATLAB represents missing or unavailable data values with the special value
NaN (Not-a-Number). Consider the following matrix with three missing values:

m = magic(3);
m([1 5 9]) = [NaN NaN NaN]
m =
   NaN     1     6
     3   NaN     7
     4     9   NaN
Simply removing any row with a NaN in it would leave us with nothing, but any
arithmetic operation involving NaN yields NaN as below.
sum(m)
ans =
   NaN   NaN   NaN
The NaN functions support the tabled arithmetic operations ignoring NaN.
nansum(m)
ans =
     7    10    13
NaN Functions

nanmax       Maximum ignoring NaNs
nanmean      Mean ignoring NaNs
nanmedian    Median ignoring NaNs
nanmin       Minimum ignoring NaNs
nanstd       Standard deviation ignoring NaNs
nansum       Sum ignoring NaNs
Percentiles and Graphical Descriptions
The prctile function computes sample percentiles. In this example the sample
is a mixture of two normal populations.

x = [normrnd(4,1,1,100) normrnd(6,0.5,1,200)];
p = 100*(0:0.25:1);
y = prctile(x,p);
z = [p; y]
z =
         0   25.0000   50.0000   75.0000  100.0000
    1.5172    4.6842    5.6706    6.1804    7.6035
The box plot is a graph for descriptive statistics. The graph below is a box plot
of the data above.
boxplot(x)
The long lower tail and plus signs show the lack of symmetry in the sample
values. For more information on box plots see page 1-103.
The histogram is a complementary graph.
hist(x)
The Bootstrap
In the last decade the statistical literature has examined the properties of
resampling as a means to acquire information about the uncertainty of
statistical estimators.
The bootstrap is a procedure that involves choosing random samples with
replacement from a data set and analyzing each sample the same way.
Sampling with replacement means that every sample is returned to the data set
after sampling. So a particular data point from the original data set could
1-47
To demonstrate the bootstrap, consider the lawdata dataset of LSAT scores
and law school GPAs. The following commands plot the data with a least
squares line:

load lawdata
plot(lsat,gpa,'+')
lsline
The least squares fit line indicates that higher LSAT scores go with higher law
school GPAs. But how sure are we of this conclusion? The plot gives us some
intuition but nothing quantitative.
We can calculate the correlation coefficient of the variables using the corrcoef
function.
rhohat = corrcoef(lsat,gpa)
rhohat =
    1.0000    0.7764
    0.7764    1.0000
Here is an example:
rhos1000 = bootstrp(1000,'corrcoef',lsat,gpa);
This command resamples the lsat and gpa vectors 1000 times and computes
the corrcoef function on each sample. Here is a histogram of the result.
hist(rhos1000(:,2),30)
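One way to turn these bootstrap replicates into an interval estimate, not
shown in the original example, is the percentile method:

ci = prctile(rhos1000(:,2),[2.5 97.5])   % central 95% of the bootstrap correlations

The resulting interval gives a range of plausible values for the correlation
coefficient.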
Cluster Analysis
Cluster analysis, also called segmentation analysis or taxonomy analysis, is a
way to partition a set of objects into groups, or clusters, in such a way that the
profiles of objects in the same cluster are very similar and the profiles of objects
in different clusters are quite distinct.
Cluster analysis can be performed on many different types of datasets. For
example, a dataset might contain a number of observations of subjects in a
study where each observation contains a set of variables.
Many different fields of study, such as engineering, zoology, medicine,
linguistics, anthropology, psychology, and marketing, have contributed to the
development of clustering techniques and the application of such techniques.
For example, cluster analysis can be used to find two similar groups for the
experiment and control groups in a study. In this way, if statistical differences
are found in the groups, they can be attributed to the experiment and not to
any initial difference between the groups.
Terminology and Basic Procedure
To perform cluster analysis on a dataset using the Statistics Toolbox
functions, follow this procedure:

1 Find the similarity or dissimilarity between every pair of objects in
the dataset. In this step, you calculate the distance between objects using
the pdist function. The pdist function supports many different ways to
compute this measurement. See the section Finding the Similarities
Between Objects for more information.
2 Group the objects into a binary, hierarchical cluster tree. In this step,
you link together pairs of objects that are in close proximity using the
linkage function. The linkage function uses the distance information
generated in step 1 to determine the proximity of objects to each other. As
objects are paired into binary clusters, the newly formed clusters are
grouped into larger clusters until a hierarchical tree is formed. See the
section Defining the Links Between Objects for more information.
3 Determine where to divide the hierarchical tree into clusters. In this
step, you divide the objects in the hierarchical tree into clusters using the
cluster function. The cluster function can create clusters by detecting
natural groupings in the hierarchical tree or by cutting off the hierarchical
tree at an arbitrary point. See the section Creating Clusters for more
information.
Note You can optionally normalize the values in the dataset before
calculating the distance information. In a real world dataset, variables can be
measured against different scales. For example, one variable can measure IQ
test scores, and another can measure head circumference. These
discrepancies can distort the proximity calculations. Using the zscore
function, you can convert all the values in the dataset to use the same
proportional scale. See the zscore function for more information.
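A minimal sketch of this normalization step (assuming the dataset matrix is
named X):

Xz = zscore(X);   % each column rescaled to mean 0 and standard deviation 1
Y = pdist(Xz);    % distances computed on the standardized values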
Finding the Similarities Between Objects
You use the pdist function to calculate the distance between every pair of
objects in a dataset. For example, consider a dataset, X, made up of five objects
where each object is a set of x,y coordinates.

Object    Value (x,y)
1         (1, 2)
2         (2.5, 4.5)
3         (2, 2)
4         (4, 1.5)
5         (4, 2.5)
You can define this dataset as a matrix, X = [1 2;2.5 4.5;2 2;4 1.5;4 2.5],
and pass it to pdist. The pdist function calculates the distance between object
1 and object 2, object 1 and object 3, and so on until the distances between all
the pairs have been calculated. The following figure plots these objects in a
graph. The distance between object 2 and object 3 is shown to illustrate one
interpretation of distance.
Returning Distance Information
The pdist function returns the distances in a vector, one element for each pair
of objects:

Y = pdist(X)
Y =
  Columns 1 through 7
    2.9155    1.0000    3.0414    3.0414    2.5495    3.3541    2.5000
  Columns 8 through 10
    2.0616    2.0616    1.0000

To make this information easier to read, the squareform function rearranges
it into a symmetric matrix whose (i,j) entry is the distance between objects i
and j:

squareform(Y)
ans =
         0    2.9155    1.0000    3.0414    3.0414
    2.9155         0    2.5495    3.3541    2.5000
    1.0000    2.5495         0    2.0616    2.0616
    3.0414    3.3541    2.0616         0    1.0000
    3.0414    2.5000    2.0616    1.0000         0
Defining the Links Between Objects
Once the proximity between objects in the dataset has been computed, you use
the linkage function to group the objects into a hierarchical cluster tree. For
example, given the distance vector Y generated by pdist from the sample
dataset of x and y coordinates, the linkage function generates a hierarchical
cluster tree, returning the linkage information in a matrix, Z.
Z = linkage(Y)
Z =
    1.0000    3.0000    1.0000
    4.0000    5.0000    1.0000
    6.0000    7.0000    2.0616
    8.0000    2.0000    2.5000
In this output, each row identifies a link. The first two columns identify the
objects that have been linked, that is, object 1, object 2, and so on. The third
column contains the distance between these objects. For the sample dataset of
x and y coordinates, the linkage function begins by grouping together objects
1 and 3, which have the closest proximity (distance value = 1.0000). The
linkage function continues by grouping objects 4 and 5, which also have a
distance value of 1.0000.
The third row indicates that the linkage function grouped together objects 6
and 7. If our original sample dataset contained only 5 objects, what are objects
6 and 7? Object 6 is the newly formed binary cluster created by the grouping of
objects 1 and 3. When the linkage function groups two objects together into a
new cluster, it must assign the cluster a unique index value, starting with the
value m+1, where m is the number of objects in the original dataset. (Values 1
through m are already used by the original dataset.) Object 7 is the index for
the cluster formed by objects 4 and 5.
As the final cluster, the linkage function grouped object 8, the newly formed
cluster made up of objects 6 and 7, with object 2 from the original dataset. The
following figure graphically illustrates the way linkage groups the objects into
a hierarchy of clusters.
The hierarchical, binary cluster tree created by the linkage function is most
easily understood when viewed graphically. The Statistics Toolbox includes the
dendrogram function that plots this hierarchical tree information as a graph,
as in the following example.
dendrogram(Z)
In the figure, the numbers along the horizontal axis represent the indices of the
objects in the original dataset. The links between objects are represented as
upside down U-shaped lines. The height of the U indicates the distance
between the objects. For example, the link representing the cluster containing
objects 1 and 3 has a height of 1. For more information about creating a
dendrogram diagram, see the dendrogram function reference page.
Evaluating Cluster Formation
Verifying the Cluster Tree. One way to verify that the distances in the cluster
tree reflect the original distances well is to compute the cophenetic correlation
coefficient with the cophenet function:

c = cophenet(Z,Y)

where Z is the matrix output by the linkage function and Y is the distance
vector output by the pdist function.
Execute pdist again on the same dataset, this time specifying the City Block
metric. After running the linkage function on this new pdist output, use the
cophenet function to evaluate the clustering using a different distance metric.
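The commands for this comparison would look like the following sketch
(assuming the same X as in the earlier example):

Y = pdist(X,'cityblock');
Z = linkage(Y);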
c = cophenet(Z,Y)
c =
0.9289
The cophenetic correlation coefficient shows a stronger correlation when the
City Block metric is used.
Getting More Information about Cluster Links
One way to determine the natural cluster divisions in a dataset is to compare
the length of each link in a cluster tree with the lengths of neighboring links.
The following example creates a dataset with three artificial groups and plots
its cluster tree; some links show inconsistency when compared to the links
below them.

rand('seed',3)
X = [rand(10,2)+1;rand(10,2)+2;rand(10,2)+3];
Y = pdist(X);
Z = linkage(Y);
dendrogram(Z);
By default, the inconsistent function compares each link in the cluster
hierarchy with adjacent links two levels below it in the cluster hierarchy. This
is called the depth of the comparison. Using the inconsistent function, you can
specify other depths. The objects at the bottom of the cluster tree, called leaf
nodes, that have no further objects below them, have an inconsistency
coefficient of zero.
For example, returning to the sample dataset of x and y coordinates, let's use
the inconsistent function to calculate the inconsistency values for the links
created by the linkage function, described in Defining the Links Between
Objects on page 1-53.
I = inconsistent(Z)
I =
    1.0000         0    1.0000         0
    1.0000         0    1.0000         0
    1.3539    0.8668    3.0000    0.8165
    2.2808    0.3100    2.0000    0.7071
The inconsistent function returns data about the links in an (m-1)-by-4
matrix, where each column provides data about the links:

Column 1    Mean of the lengths of the links included in the calculation
Column 2    Standard deviation of the links included in the calculation
Column 3    Number of links included in the calculation
Column 4    Inconsistency coefficient
In the sample output, the first row represents the link between objects 1 and 3.
(This cluster is assigned the index 6 by the linkage function.) Because this is a
leaf node, the inconsistency coefficient is zero. The second row represents the
link between objects 4 and 5, also a leaf node. (This cluster is assigned the
index 7 by the linkage function.)
The third row evaluates the link that connects these two leaf nodes, objects 6
and 7. (This cluster is called object 8 in the linkage output). Column three
indicates that three links are considered in the calculation: the link itself and
the two links directly below it in the hierarchy. Column one represents the
mean of the lengths of these links. The inconsistent function uses the length
information output by the linkage function to calculate the mean. Column two
represents the standard deviation between the links. The last column contains
the inconsistency value for these links, 0.8165.
The following figure illustrates the links and lengths included in this
calculation.
(The figure highlights Link 1, Link 2, and Link 3 and their lengths in the
dendrogram.)
Row four in the output matrix describes the link between object 8 and object 2.
Column three indicates that two links are included in this calculation: the link
itself and the link directly below it in the hierarchy. The inconsistency
coefficient for this link is 0.7071.
The following figure illustrates the links and lengths included in this
calculation.
(The figure highlights Link 1 and Link 2 and their lengths in the dendrogram.)
Creating Clusters
After you create the hierarchical tree of binary clusters, you can divide the
hierarchy into larger clusters using the cluster function. The cluster
function lets you create clusters in two ways:
By finding the natural divisions in the original data set
By specifying an arbitrary number of clusters
Finding the Natural Divisions in the Dataset
In the hierarchical cluster tree, the dataset may naturally divide into groups
wherever the inconsistency coefficient of a link exceeds a threshold. For
example, if you use the cluster function to group the sample dataset into
clusters, specifying an inconsistency coefficient threshold of 0.9 as the value of
the cutoff argument, the cluster function groups all the objects in the sample
dataset into one cluster. In this case, none of the links in the cluster hierarchy
had an inconsistency coefficient greater than 0.9.
T = cluster(Z,0.9)
T =
1
1
1
1
1
The cluster function outputs a vector, T, that is the same size as the original
dataset. Each element in this vector contains the number of the cluster into
which the corresponding object from the original data set was placed.
If you lower the inconsistency coefficient threshold to 0.8, the cluster function
divides the sample dataset into three separate clusters.
T = cluster(Z,0.8)
T =
1
3
1
2
2
This output indicates that objects 1 and 3 were placed in cluster 1, objects 4 and
5 were placed in cluster 2, and object 2 was placed in cluster 3.
Specifying Arbitrary Clusters
Instead of letting the cluster function create clusters determined by the
natural divisions in the dataset, you can specify the number of clusters you
want created. For example, you can specify that you want the cluster function
to divide the
sample dataset into two clusters. In this case, the cluster function creates one
cluster containing objects 1, 3, 4, and 5 and another cluster containing object 2.
T = cluster(Z,2)
T =
1
2
1
1
1
To help you visualize how the cluster function determines how to create these
clusters, the following figure shows the dendrogram of the hierarchical cluster
tree. When you specify a value of 2, the cluster function draws an imaginary
horizontal line across the dendrogram that bisects two vertical lines. All the
objects below the line belong to one of these two clusters.
If you specify a cutoff value of 3, the cluster function cuts off the hierarchy
at a lower point, bisecting three lines.
T = cluster(Z,3)
T =
1
3
1
2
2
This time, objects 1 and 3 are grouped in a cluster, objects 4 and 5 are grouped
in a cluster and object 2 is placed into a cluster, as seen in the following figure.
Linear Models
Linear models are problems that take the form

y = Xβ + ε

where:

y is an n-by-1 vector of observations.
X is the n-by-p design matrix for the model.
β is a p-by-1 vector of parameters.
ε is an n-by-1 vector of random disturbances.
One-way analysis of variance (ANOVA), two-way ANOVA, polynomial
regression, and multiple linear regression are specific cases of the linear model.
One-Way Analysis of Variance (ANOVA)
The purpose of one-way ANOVA is to find out whether data from several
groups have a common mean. The example uses data from Hogg and Ledolter
(1987) on bacteria counts in shipments of milk; the columns of the matrix hogg
are different shipments and the rows are counts from cartons chosen randomly
from each shipment. Do some shipments have higher counts than others?
load hogg
p = anova1(hogg)
p =
1.1971e04
hogg
hogg =
    24    14    11     7    19
    15     7     9     7    24
    21    12     7     4    19
    27    17    13     7    15
    33    14    12    12    10
    23    16    18    18    20
The standard ANOVA table has columns for the sums of squares, degrees of
freedom, mean squares (SS/df), and F statistic.
ANOVA Table

Source     SS       df    MS       F
Columns    803      4     200.7    9.008
Error      557.2    25    22.29
Total      1360     29
You can use the F statistic to do a hypothesis test to find out if the bacteria
counts are the same. anova1 returns the p-value from this hypothesis test.
In this case the p-value is about 0.0001, a very small value. This is a strong
indication that the bacteria counts from the different tankers are not the same.
An F statistic as extreme as the observed F would occur by chance only once in
10,000 times if the counts were truly equal.
The p-value returned by anova1 depends on assumptions about the random
disturbances in the model equation. For the p-value to be correct, these
disturbances need to be independent, normally distributed, and have constant
variance.
The anova1 function also displays a box plot of the bacteria counts in each
shipment.
Since the notches in the box plots do not all overlap, this is strong confirming
evidence that the column means are not equal.
Two-Way Analysis of Variance (ANOVA)
Two-way ANOVA is a special case of the linear model. The two-way ANOVA
form of the model is:

y_{ijk} = µ + α_{.j} + β_{i.} + γ_{ij} + ε_{ijk}

where:

y_{ijk} is a matrix of observations.
µ is a constant matrix of the overall mean.
α_{.j} is a matrix whose columns are the group means (the rows of α sum to 0).
β_{i.} is a matrix whose rows are the group means (the columns of β sum to 0).
γ_{ij} is a matrix of interactions (the rows and columns of γ sum to zero).
ε_{ijk} is a matrix of random disturbances.
The purpose of the example is to determine the effect of car model and factory
on the mileage rating of cars.
load mileage
mileage
mileage =
   33.3000   34.5000   37.4000
   33.4000   34.8000   36.8000
   32.9000   33.8000   37.6000
   32.6000   33.4000   36.6000
   32.5000   33.7000   37.0000
   33.0000   33.9000   36.7000
cars = 3;
p = anova2(mileage,cars)
p =
0.0000
0.0039
0.8411
There are three models of cars (columns) and two factories (rows). The reason
there are six rows instead of two is that each factory provides three cars of each
model for the study. The data from the first factory is in the first three rows,
and the data from the second factory is in the last three rows.
1-68
Linear Models
The standard ANOVA table has columns for the sums of squares, degrees of
freedom, mean squares (SS/df), and F statistics.
ANOVA Table

Source        SS      df    MS       F
Columns       53.35   2     26.68    234.2
Rows          1.445   1     1.445    12.69
Interaction   0.04    2     0.02     0.1756
Error         1.367   12    0.1139
Total         56.2    17
You can use the F statistics to do hypothesis tests to find out if the mileage is
the same across models, factories, and model-factory pairs (after adjusting for
the additive effects). anova2 returns the p-values from these tests.
The p-value for the model effect is zero to four decimal places. This is a strong
indication that the mileage varies from one model to another. An F statistic as
extreme as the observed F would occur by chance less than once in 10,000 times
if the gas mileage were truly equal from model to model.
The p-value for the factory effect is 0.0039, which is also highly significant.
This indicates that one factory is out-performing the other in the gas mileage
of the cars it produces. The observed p-value indicates that an F statistic as
extreme as the observed F would occur by chance about four out of 1000 times
if the gas mileage were truly equal from factory to factory.
There does not appear to be any interaction between factories and models. The
p-value, 0.8411, means that the observed result is quite likely (84 out of 100
times) given that there is no interaction.
The p-values returned by anova2 depend on assumptions about the random
disturbances in the model equation. For the p-values to be correct these
disturbances need to be independent, normally distributed and have constant
variance.
Multiple Linear Regression
The purpose of multiple linear regression is to establish a quantitative
relationship between a group of predictor variables (the columns of X) and a
response, y. The function regress computes the least squares solution:

b = β̂ = (X'X)⁻¹X'y

This equation is useful for developing later statistical formulas, but has poor
numeric properties. regress uses QR decomposition of X followed by the
backslash operator to compute b. The QR decomposition is not necessary for
computing b, but the matrix R is useful for computing confidence intervals.
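In outline, the QR-based computation looks like this (a sketch, not the toolbox
source):

[Q,R] = qr(X,0);    % economy-size QR decomposition of the design matrix
b = R\(Q'*y);       % solve the triangular system for the coefficients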
You can plug b back into the model formula to get the predicted y values at the
data points:

ŷ = Xb = Hy, where H = X(X'X)⁻¹X'

Statisticians use a hat (circumflex) over a letter to denote an estimate of a
parameter or a prediction from a model. The projection matrix H is called the
hat matrix, because it puts the "hat" on y.

The residuals are the difference between the observed and predicted y values:

r = y - ŷ = (I - H)y
The residuals are useful for detecting failures in the model assumptions, since
they correspond to the errors, , in the model equation. By assumption, these
errors each have independent normal distributions with mean zero and a
constant variance.
The residuals, however, are correlated and have variances that depend on the
locations of the data points. It is a common practice to scale (Studentize) the
residuals so they all have the same variance.
In the equation below, the scaled residual, t_i, has a Student's t distribution
with (n-p-1) degrees of freedom:

t_i = \frac{r_i}{\hat{\sigma}_{(i)} \sqrt{1 - h_i}}

where

\hat{\sigma}_{(i)}^2 = \frac{r'r}{n-p-1} - \frac{r_i^2}{(n-p-1)(1-h_i)}

Here t_i is called the studentized residual, \hat{\sigma}_{(i)}^2 is the
estimate of the error variance computed with the ith observation deleted, and
h_i is the ith diagonal element of the hat matrix H.
Confidence intervals that do not include zero are equivalent to rejecting the
hypothesis (at a significance probability of α) that the residual mean is zero.
Such confidence intervals are good evidence that the observation is an outlier
for the given model.
Example
The example comes from Chatterjee and Hadi (1986) in a paper on regression
diagnostics. The dataset (originally from Moore (1975)) has five predictor
variables and one response.
load moore
X = [ones(size(moore,1),1) moore(:,1:5)];
The matrix, X, has a column of ones, then one column of values for each of the
five predictor variables. The column of ones is necessary for estimating the
y-intercept of the linear model.
y = moore(:,6);
[b,bint,r,rint,stats] = regress(y,X);
The y-intercept is b(1), which corresponds to the column index of the column
of ones.
stats
stats =
    0.8107   11.9886    0.0001
The elements of the vector stats are the regression R2 statistic, the F statistic
(for the hypothesis test that all the regression coefficients are zero), and the
p-value associated with this F statistic.
R2 is 0.8107 indicating the model accounts for over 80% of the variability in the
observations. The F statistic of about 12 and its p-value of 0.0001 indicate that
it is highly unlikely that all of the regression coefficients are zero.
rcoplot(r,rint)
The plot shows the residuals plotted in case order (by row). The 95% confidence
intervals about these residuals are plotted as error bars. The first observation
is an outlier since its error bar does not cross the zero reference line.
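The plots described next come from the rstool interface; a sketch of a typical
invocation (the 'quadratic' argument here is an assumption matching the Full
Quadratic setting mentioned below):

load reaction
rstool(reactants,rate,'quadratic')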
You will see a vector of three plots. The dependent variable of all three plots
is the reaction rate. The first plot has hydrogen as the independent variable.
The second and third plots have n-pentane and isopentane respectively.
Each plot shows the fitted relationship of the reaction rate to the independent
variable at a fixed value of the other two independent variables. The fixed
value of each independent variable is in an editable text box below each axis.
You can change the fixed value of any independent variable by either typing a
new value in the box or by dragging any of the 3 vertical lines to a new position.
When you change the value of an independent variable, all the plots update to
show the current picture at the new point in the space of the independent
variables.
Note that while this example only uses three reactants, rstool can
accommodate an arbitrary number of independent variables. Interpretability
may be limited by the size of the monitor for large numbers of inputs.
The GUI also has two pop-up menus. The Export menu facilitates saving
various important variables in the GUI to the base workspace. Below the
Export menu there is another menu that allows you to change the order of the
polynomial model from within the GUI. If you used the commands above, this
menu will have the string Full Quadratic. Other choices are:
Linear has the constant and first order terms only.
Pure Quadratic includes constant, linear and squared terms.
Interactions includes constant, linear, and cross product terms.
Stepwise Regression
Stepwise regression is a technique for choosing the variables to include in a
multiple regression model. Forward stepwise regression starts with no model
terms. At each step it adds the most statistically significant term (the one with
the highest F statistic or lowest p-value) until there are none left. Backward
stepwise regression starts with all the terms in the model and removes the
least significant terms until all the remaining terms are statistically
significant. It is also possible to start with a subset of all the terms and then
add significant terms or remove insignificant terms.
An important assumption behind the method is that some input variables in a
multiple regression do not have an important explanatory effect on the
response. If this assumption is true, then it is a convenient simplification to
keep only the statistically significant terms in the model.
One common problem in multiple regression analysis is multicollinearity of the
input variables. The input variables may be as correlated with each other as
they are with the response. If this is the case, the presence of one input variable
in the model may mask the effect of another input. Stepwise regression used as
a canned procedure is a dangerous tool because the resulting model may
include different variables depending on the choice of starting model and
inclusion strategy.
The Statistics Toolbox uses an interactive graphical user interface (GUI) to
provide a more understandable comparison of competing models. You can
explore the GUI using the Hald (1960) data set. Here are the commands to get
started.
load hald
stepwise(ingredients,heat)
The Hald data come from a study of the heat of reaction of various cement
mixtures. There are 4 components in each mixture, and the amount of heat
produced depends on the amount of each ingredient in the mixture.
All three windows have hot regions. When your mouse is above one of these
regions, the pointer changes from an arrow to a circle. Clicking on this point
initiates some activity in the interface.
Confidence Intervals

Column #    Parameter    Lower       Upper
1           1.44          1.02        1.86
2           0.4161       -0.1602      0.9924
3          -0.41         -1.029       0.2086
4          -0.614        -0.7615     -0.4664

RMSE = 2.734    R-square = 0.9725    F = 176.6    P = 1.581e-08
Coefficients and Confidence Intervals. The table at the top of the figure shows the
regression coefficient and confidence interval for every term (in or out of the
model.) The green rows in the table (on your monitor) represent terms in the
model while red rows indicate terms not currently in the model.
Clicking on a row in this table toggles the state of the corresponding term. That
is, a term in the model (green row) gets removed (turns red), and terms out of
the model (red rows) enter the model (turn green).
The coefficient for a term out of the model is the coefficient resulting from
adding that term to the current model.
Additional Diagnostic Statistics. There are also several diagnostic statistics
below the table: the RMSE, the R-square value, and the F statistic and p-value
for the current model.
Stepwise History. This plot shows the RMSE and a confidence interval for every
model generated in the course of the interactive use of the other windows.
Recreating a Previous Model. Clicking on one of these lines re-creates the current
model at that point in the analysis using a new set of windows. You can thus
compare the two candidate models directly.
Mathematical Form
The Statistics Toolbox has functions for fitting nonlinear models of the form
$y = f(X,\beta) + \varepsilon$

where:

y is an n-by-1 vector of observations.
f is any function of X and β.
X is an n-by-p matrix of input variables.
β is a p-by-1 vector of unknown parameters to be estimated.
ε is an n-by-1 vector of random disturbances.
The file reaction.mat contains the data and model information, where:

rate is a 13-by-1 vector of observed reaction rates.
reactants is a 13-by-3 matrix of reactants.
beta is a 5-by-1 vector of initial parameter estimates.
'model' is a string containing the nonlinear function name.
'xn' is a string matrix of the names of the reactants.
'yn' is a string containing the name of the response.
The fitting function requires initial estimates of the unknown parameters. You
must also supply a function that takes the input data and the current parameter
estimate and returns the predicted responses. In MATLAB this is called a
function function. The example uses the Hougen-Watson model function, hougen:
function yhat = hougen(beta,x)
% HOUGEN  Hougen-Watson model for reaction kinetics.
b1 = beta(1);
b2 = beta(2);
b3 = beta(3);
b4 = beta(4);
b5 = beta(5);
x1 = x(:,1);
x2 = x(:,2);
x3 = x(:,3);
yhat = (b1*x2 - x3/b5)./(1 + b2*x1 + b3*x2 + b4*x3);
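A sketch of the fitting call (nlinfit also returns the residuals and the
Jacobian, which the next paragraph uses):

load reaction
[betahat,resid,J] = nlinfit(reactants,rate,'hougen',beta);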
These outputs are useful for obtaining confidence intervals on the parameter
estimates and predicted responses.
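The functions nlparci and nlpredci consume exactly these outputs; a minimal
sketch (assuming nlpredci's optional second output is the interval half-width,
and reusing betahat, resid, and J from the fit above):

ci = nlparci(betahat,resid,J);                                 % intervals on the parameters
[yhat,delta] = nlpredci('hougen',reactants,betahat,resid,J);   % predictions and half-widths
opd = [rate yhat delta];                                       % observed, predicted, half-width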
[opd: a 13-by-3 matrix listing, for each run, the observed rate, the predicted rate, and the 95% confidence half-width]
The matrix, opd, has the observed rates in column 1 and the predictions in
column 2. The 95% confidence interval is column 2 column 3. Note that the
confidence interval contains the observations in each case.
You will see a vector of three plots. The dependent variable of all three plots
is the reaction rate. The first plot has hydrogen as the independent variable.
The second and third plots have n-pentane and isopentane respectively.
Each plot shows the fitted relationship of the reaction rate to the independent
variable at a fixed value of the other two independent variables. The fixed
value of each independent variable is in an editable text box below each axis.
You can change the fixed value of any independent variable by either typing a
new value in the box or by dragging any of the 3 vertical lines to a new position.
When you change the value of an independent variable, all the plots update to
show the current picture at the new point in the space of the independent
variables.
Note that while this example only uses three reactants, nlintool can
accommodate an arbitrary number of independent variables. Interpretability
may be limited by the size of the monitor for large numbers of inputs.
Hypothesis Tests
A hypothesis test is a procedure for determining if an assertion about a
characteristic of a population is reasonable.
For example, suppose that someone says that the average price of a gallon of
regular unleaded gas in Massachusetts is $1.15. How would you decide
whether this statement is true? You could try to find out what every gas station
in the state was charging and how many gallons they were selling at that price.
That approach might be definitive, but it could end up costing more than the
information is worth.
A simpler approach is to find out the price of gas at a small number of randomly
chosen stations around the state and compare the average price to $1.15.
Of course, the average price you get will probably not be exactly $1.15 due to
variability in price from one station to the next. Suppose your average price
was $1.18. Is this three cent difference a result of chance variability, or is the
original assertion incorrect? A hypothesis test can provide an answer.
Terminology
To get started, there are some terms to define and assumptions to make.
The null hypothesis is the original assertion. In this case the null hypothesis
is that the average price of a gallon of gas is $1.15. The notation is
H0: µ = 1.15.
There are three possibilities for the alternative hypothesis. You might only be
interested in the result if gas prices were actually higher. In this case, the
alternative hypothesis is H1: µ > 1.15. The other possibilities are H1: µ < 1.15
and H1: µ ≠ 1.15.
The significance level is related to the degree of certainty you require in order
to reject the null hypothesis in favor of the alternative. By taking a small
sample you cannot be certain about your conclusion. So you decide in
advance to reject the null hypothesis if the probability of observing your
sampled result is less than the significance level. For a typical significance
level of 5% the notation is α = 0.05. For this significance level, the probability
of incorrectly rejecting the null hypothesis when it is actually true is 5%. If
you need more protection from this error, then choose a lower value of α.
The p-value is the probability of observing the given sample result under the
assumption that the null hypothesis is true. If the p-value is less than α, then
you reject the null hypothesis. For example, if α = 0.05 and the p-value is
0.03, then you reject the null hypothesis.
The converse is not true. If the p-value is greater than α, you do not accept
the null hypothesis. You just have insufficient evidence to reject the null
hypothesis (which is the same for practical purposes).
The outputs for the hypothesis test functions also include confidence
intervals. Loosely speaking, a confidence interval is a range of values that
have a chosen probability of containing the true hypothesized quantity.
Suppose, in our example, 1.15 is inside a 95% confidence interval for the
mean, . That is equivalent to being unable to reject the null hypothesis at a
significance level of 0.05. Conversely, if the 100(1 − α)% confidence interval
does not contain 1.15, then you reject the null hypothesis at the α level of
significance.
Assumptions
The difference between hypothesis test procedures often arises from
differences in the assumptions that the researcher is willing to make about the
data sample. The Z-test assumes that the data represents independent
samples from the same normal distribution and that you know the standard
deviation, σ. The t-test has the same assumptions except that you estimate the
standard deviation using the data instead of specifying it as a known quantity.
Both tests have an associated signal-to-noise ratio:

$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$   or   $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$

where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
The signal is the difference between the average and the hypothesized mean.
The noise is the standard deviation posited or estimated.
If the null hypothesis is true, then Z has a standard normal distribution,
N(0,1). T has a Student's t distribution with the degrees of freedom, ν, equal to
one less than the number of data values.
Given the observed result for Z or T, and knowing their distribution assuming
the null hypothesis is true, it is possible to compute the probability (p-value) of
observing this result. If the p-value is very small, then that casts doubt on the
truth of the null hypothesis. For example, suppose that the p-value was 0.001,
meaning that the probability of observing the given Z (or T) was one in a
thousand. That should make you skeptical enough about the null hypothesis
that you reject it rather than believe that your result was just a lucky 999 to 1
shot.
Example
This example uses the gasoline price data in gas.mat. There are two samples
of 20 observed gas prices for the months of January and February 1993.
load gas
prices = [price1 price2]
prices =
   119   118
   117   115
   115   115
   116   122
   112   118
   121   121
   115   120
   122   122
   116   120
   118   113
   109   120
   112   123
   119   121
   112   109
   117   117
   113   117
   114   120
   109   116
   109   118
   118   125
Suppose it is historically true that the standard deviation of gas prices at gas
stations around Massachusetts is four cents a gallon. The Z-test is a procedure
for testing the null hypothesis that the average price of a gallon of gas in
January (price1) is $1.15.
[h,pvalue,ci] = ztest(price1/100,1.15,0.04)
h =
0
pvalue =
0.8668
ci =
1.1340
1.1690
The result of the hypothesis test is the boolean variable, h. When h = 0, you do
not reject the null hypothesis.
The result suggests that $1.15 is reasonable. The 95% confidence interval
[1.1340 1.1690] neatly brackets $1.15.
What about February? Try a t-test with price2. Now you are not assuming
that you know the standard deviation in price.
[h,pvalue,ci] = ttest(price2/100,1.15)
h =
1
pvalue =
4.9517e-04
ci =
1.1675
1.2025
With the boolean result, h = 1, you can reject the null hypothesis at the default
significance level, 0.05.
It looks like $1.15 is not a reasonable estimate of the gasoline price in
February. The low end of the 95% confidence interval is greater than 1.15.
The function ttest2 allows you to compare the means of the two data samples.
[h,sig,ci] = ttest2(price1,price2)
h =
1
sig =
0.0083
ci =
-5.7845
-0.9155
The confidence interval (ci above) indicates that gasoline prices were between
one and six cents lower in January than February.
The box plot gives the same conclusion graphically. Note that the notches have
little, if any, overlap. Refer to Statistical Plots for more information about box
plots.
boxplot(prices,1)
set(gca,'XtickLabel',str2mat('January','February'))
xlabel('Month')
ylabel('Prices ($0.01)')
Multivariate Statistics
Multivariate statistics is an omnibus term for a number of different statistical
methods. The defining characteristic of these methods is that they all aim to
understand a data set by considering a group of variables together rather than
focusing on only one variable at a time.
Example
Let us look at a sample application that uses nine different indices of the
quality of life in 329 U.S. cities. These are climate, housing, health, crime,
transportation, education, arts, recreation, and economics. For each index,
higher is better; so, for example, a higher index for crime means a lower crime
rate.
We start by loading the data in cities.mat.
load cities
whos
Name          Size      Bytes    Class
categories    9x14        252    char array
names         329x43    28294    char array
ratings       329x9     23688    double array
The whos command generates a table of information about all the variables in
the workspace. The cities data set contains three variables:
categories, a string matrix containing the names of the indices.
names, a string matrix containing the 329 city names.
ratings, the data matrix with 329 rows and 9 columns.
Now, let's look at the first several rows of the names variable, too.
first5 = names(1:5,:)
first5 =
Abilene, TX
Akron, OH
Albany, GA
Albany-Troy, NY
Albuquerque, NM
A box plot of the raw ratings (below) shows that there is substantially more
variability in the ratings of the arts and housing than in the ratings of crime
and climate.
[Horizontal box plot of the nine rating categories]
Ordinarily you might also graph pairs of the original variables, but there are
36 two-variable plots. Maybe principal components analysis can reduce the
number of variables we need to consider.
Sometimes it makes sense to compute principal components for raw data. This
is appropriate when all the variables are in the same units. Standardizing the
data is reasonable when the variables are in different units or when the
variances of the different columns differ substantially (as in this case).
You can standardize the data by dividing each column by its standard
deviation.
stdr = std(ratings);
sr = ratings./stdr(ones(329,1),:);
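The principal components computation that produces the output below is a call
to princomp; a minimal sketch (the output names p3, newdata, variances, and t2
are the ones used in the rest of this section):

[p3,newdata,variances,t2] = princomp(sr);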
[p3: the 9-by-3 matrix of weights (coefficients) for the first three principal components]
The largest weights in the first column (first principal component) are the third
and seventh elements, corresponding to the variables health and arts. All the
elements of the first principal component are the same sign, making it a
weighted average of all the variables.
To show the orthogonality of the principal components note that
premultiplying them by their transpose yields the identity matrix.
I = p3'*p3
I =
    1.0000    0.0000    0.0000
    0.0000    1.0000    0.0000
    0.0000    0.0000    1.0000
A plot of the first two columns of newdata shows the ratings data projected onto
the first two principal components.
plot(newdata(:,1),newdata(:,2),'+')
xlabel('1st Principal Component');
ylabel('2nd Principal Component');
To label points on the plot with city names, call the gname function with the
names matrix: gname(names). Then move your cursor over the plot and click once
near each point at the top right. When you finish, press the Return key. Here is
the resulting plot.
[The same scatter plot with the points for Boston, Chicago, New York, San Francisco, and Washington labeled]
The labeled cities are the biggest population centers in the United States.
Perhaps we should consider them as a completely separate group. If we call
gname without arguments, it labels each point with its row number.
[The same plot with the chosen points labeled by row number: 43, 65, 179, 213, 234, 270, and 314]
We can create an index variable containing the row numbers of all the
metropolitan areas we chose.
metro = [43 65 179 213 234 270 314];
names(metro,:)
ans =
Boston, MA
Chicago, IL
Los Angeles, Long Beach, CA
New York, NY
Philadelphia, PA-NJ
San Francisco, CA
Washington, DC-MD-VA
To practice, repeat the analysis using the variable rsubset as the new data
matrix and nsubset as the string matrix of labels.
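A minimal sketch of one way to build these subsets (an assumption; any
equivalent row deletion works):

rsubset = ratings;
nsubset = names;
rsubset(metro,:) = [];     % delete the seven metropolitan rows
nsubset(metro,:) = [];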
You can easily calculate the percent of the total variability explained by each
principal component.
percent_explained = 100*variances/sum(variances)
percent_explained =
37.8699
13.4886
12.6831
10.2324
8.3698
7.0062
5.4783
3.5338
1.3378
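A Pareto chart makes this easy to see; a minimal sketch:

pareto(percent_explained)
xlabel('Principal Component')
ylabel('Variance Explained (%)')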
[Pareto chart of the percent of total variability explained by each principal component]
We can see that the first three principal components explain roughly two thirds
of the total variability in the standardized ratings.
It is not surprising that the ratings for New York are the furthest from the
average U.S. town.
Statistical Plots
The Statistics Toolbox adds specialized plots to the extensive graphics
capabilities of MATLAB.
Box plots are graphs for data sample description. They are also useful for
graphic comparisons of the means of many samples (see the discussion of
one-way ANOVA on page 1-65).
Normal probability plots are graphs for determining whether a data sample
has a normal distribution.
Quantile-quantile plots graphically compare the distributions of two
samples.
Weibull probability plots are graphs for assessing whether data comes from
a Weibull distribution.
Box Plots
The graph shows an example of a notched box plot.
[Notched box plot of a data sample]
If there are no outliers, the maximum of the sample is the top of the upper
whisker, and the minimum of the sample is the bottom of the lower whisker. By
default, an outlier is a value that is more than 1.5 times the interquartile
range away from the top or bottom of the box.
The plus sign at the top of the plot is an indication of an outlier in the data.
This point may be the result of a data entry error, a poor measurement or a
change in the system that generated the data.
The notches in the box are a graphic confidence interval about the median
of a sample. Box plots do not have notches by default.
A side-by-side comparison of two notched box plots is the graphical equivalent
of a t-test. See the section Hypothesis Tests on page 1-85.
Normal Probability Plots
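A sketch of how such a plot is produced (the sample parameters here are
assumptions):

x = normrnd(10,1,25,1);
normplot(x)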
[Normal probability plot of a data sample]
The plot has three graphic elements. The plus signs show the empirical
probability versus the data value for each point in the sample. The solid line
connects the 25th and 75th percentiles of the data and represents a robust
linear fit (i.e., insensitive to the extremes of the sample). The dashed line
extends the solid line to the ends of the sample.
The scale of the y-axis is not uniform. The y-axis values are probabilities and,
as such, go from zero to one. The distance between the tick marks on the y-axis
matches the distance between the quantiles of a normal distribution. The
quantiles are close together near the median (probability = 0.5) and stretch out
symmetrically moving away from the median. Compare the vertical distance
from the bottom of the plot to the probability 0.25 with the distance from 0.25
to 0.50. Similarly, compare the distance from the top of the plot to the
probability 0.75 with the distance from 0.75 to 0.50.
If all the data points fall near the line, the assumption of normality is
reasonable. But, if the data is nonnormal, the plus signs may follow a curve, as
in the example using exponential data below.
x = exprnd(10,100,1);
normplot(x)
[Normal probability plot of the exponential data: the points follow a curve, not a straight line]
This plot is clear evidence that the underlying distribution is not normal.
Quantile-Quantile Plots
A quantile-quantile plot is useful for determining whether two samples come
from the same distribution (whether normally distributed or not).
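A sketch of commands that produce such a plot (the distributions and sample
sizes here are assumptions):

x = poissrnd(10,50,1);
y = poissrnd(5,100,1);
qqplot(x,y);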
[Quantile-quantile plot of the two samples: the points fall near a straight line]
Even though the parameters and sample sizes are different, the straight line
relationship shows that the two samples come from the same distribution.
Like the normal probability plot, the quantile-quantile plot has three graphic
elements. The pluses are the quantiles of each sample. By default the number
of pluses is the number of data values in the smaller sample. The solid line joins
the 25th and 75th percentiles of the samples. The dashed line extends the solid
line to the extent of the sample.
The example below shows what happens when the underlying distributions are
not the same.
x = normrnd(5,1,100,1);
y = weibrnd(2,0.5,100,1);
qqplot(x,y);
[Quantile-quantile plot of the normal and Weibull samples: the points bend away from a straight line]
Weibull Probability Plots
If the data points (pluses) fall near the line, the assumption that the data come
from a Weibull distribution is reasonable.
This example shows a typical Weibull probability plot.
y = weibrnd(2,0.5,100,1);
weibplot(y)
[Weibull probability plot of the sample]
Control Charts
These graphs were popularized by Walter Shewhart in his work in the 1920s
at Western Electric. A control chart is a plot of measurements over time with
statistical limits applied. Actually, "control chart" is a slight misnomer; the
chart itself is a monitoring tool. The control activity may occur if the chart
indicates that the process is changing in an undesirable systematic direction.
The Statistics Toolbox supports three common control charts:
Xbar charts
S charts
Exponentially weighted moving average (EWMA) charts.
Xbar Charts
Xbar charts are a plot of the average of a sample of a process taken at regular
intervals. Suppose we are manufacturing pistons to a tolerance of ±0.5
thousandths of an inch, and we measure the runout of each piston.
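A sketch of the command that produces such a chart (assuming the piston
measurements and specification limits are stored in parts.mat as runout and
spec, the names the EWMA example below uses):

load parts
xbarplot(runout,0.99,spec)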
[Xbar chart of the piston runout data: samples 12, 21, 25, 26, and 30 fall outside the control limits]
The lines at the bottom and the top of the plot show the process specifications.
The central line is the average runout over all the pistons. The two lines
flanking the center line are the 99% statistical control limits. By chance only
one measurement in 100 should fall outside these lines. We can see that even
in this small run of 36 parts, there are several points outside the boundaries
(labeled by their observation numbers). This is an indication that the process
mean is not in statistical control. This might not be of much concern in practice,
since all the parts are well within specification.
S Charts
The S chart is a plot of the standard deviation of a process taken at regular
intervals. The standard deviation is a measure of the variability of a process.
So, the plot indicates whether there is any systematic change in the process
variability.
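A sketch of the corresponding command (the 0.99 confidence level is an
assumption matching the Xbar example):

schart(runout,0.99)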
[S chart of the piston data]
EWMA Charts
The EWMA chart is another chart for monitoring the process average. It
operates on slightly different assumptions than the Xbar chart. The
mathematical model behind the Xbar chart posits that the process mean is
actually constant over time and any variation in individual measurements is
due entirely to chance.
The EWMA model is a little looser. Here we assume that the mean may be
varying in time. Here is an EWMA chart of our runout example. Compare this
with the plot on page 1-111.
ewmaplot(runout,0.5,0.01,spec)
[Exponentially Weighted Moving Average (EWMA) chart of the runout data, with points 21, 25, and 26 flagged]
Capability Studies
Before going into full-scale production, many manufacturers run a pilot study
to determine whether their process can actually build parts to the
specifications demanded by the engineering drawing.
Using the data from these capability studies with a statistical model allows us
to get a preliminary estimate of the percentage of parts that will fall outside
the specifications.
[p, Cp, Cpk] = capable(mean(runout),spec)
p =
1.3940e-09
Cp =
2.3950
Cpk =
1.9812
Design of Experiments (DOE)

Full Factorial Designs
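A design like the one below can be generated with the fullfact function; a
minimal sketch (the level counts [4 3] are inferred from the matrix that
follows):

d = fullfact([4 3])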
d =
     1     1
     2     1
     3     1
     4     1
     1     2
     2     2
     3     2
     4     2
     1     3
     2     3
     3     3
     4     3
One special subclass of factorial designs is when all the variables take only two
values. Suppose you want to quickly determine the sensitivity of a process to
high and low values of three variables.
d2 = ff2n(3)
d2 =
     0     0     0
     0     0     1
     0     1     0
     0     1     1
     1     0     0
     1     0     1
     1     1     0
     1     1     1
A two-level full factorial in seven variables would require 2^7 = 128 runs. If
we assume that the variables do not act synergistically in the system, we can
assess the sensitivity with far fewer runs. The theoretical minimum number is
eight. To see the design (X) matrix we use the hadamard function.
X = hadamard(8)
X =
     1     1     1     1     1     1     1     1
     1    -1     1    -1     1    -1     1    -1
     1     1    -1    -1     1     1    -1    -1
     1    -1    -1     1     1    -1    -1     1
     1     1     1     1    -1    -1    -1    -1
     1    -1     1    -1    -1     1    -1     1
     1     1    -1    -1    -1    -1     1     1
     1    -1    -1     1    -1     1     1    -1
The last seven columns are the actual variable settings (-1 for low, 1 for high).
The first column (all ones) allows us to measure the mean effect in the linear
equation, $y = X\beta + \varepsilon$.
D-Optimal Designs
All the designs above were in use by the early 20th century. In the 1970s
statisticians started to use the computer in experimental design by recasting
DOE in terms of optimization. A D-optimal design is one that maximizes the
determinant of Fisher's information matrix, X'X. This matrix is proportional to
the inverse of the covariance matrix of the parameters. So maximizing det(X'X)
is equivalent to minimizing the determinant of the covariance of the
parameters.
A D-optimal design minimizes the volume of the confidence ellipsoid of the
regression estimates of the linear model parameters, .
There are several functions in the Statistics Toolbox that generate D-optimal
designs. These are cordexch, daugment, dcovary, and rowexch.
Suppose we want the D-optimal design for fitting a full quadratic model in two
factors ('q') with nine runs.
settings = cordexch(2,9,'q')
settings =
    -1     1
     1     1
     0     1
    -1    -1
     1    -1
     0    -1
     1     0
     0     0
    -1     0
We can plot the columns of settings against each other to get a better picture
of the design.
h = plot(settings(:,1),settings(:,2),'.');
set(gca,'Xtick',[-1 0 1])
set(gca,'Ytick',[-1 0 1])
set(h,'Markersize',20)
[Plot of the nine design points on a 3-by-3 grid]
Suppose we want the D-optimal design for fitting an interactions model in two
factors ('i') with four runs.
[settings, X] = rowexch(2,4,'i')
settings =
    -1     1
    -1    -1
     1    -1
     1     1

X =
     1    -1     1    -1
     1    -1    -1     1
     1     1    -1    -1
     1     1     1     1
The settings matrix shows how to vary the inputs from run to run. The X matrix
is the design matrix for fitting the above regression model. The first column of
X is for fitting the constant term. The last column is the element-wise product
of the second and third columns.
The associated plot is simple but elegant.
h = plot(settings(:,1),settings(:,2),'.');
set(gca,'Xtick',[-1 0 1])
set(gca,'Ytick',[-1 0 1])
set(h,'Markersize',20)
[Plot of the four design points at the corners of the square]
Suppose we have executed the eight-run design below for fitting a linear model
to four input variables.
settings = cordexch(4,8)
settings =
[an 8-by-4 two-level design: a matrix of +1/-1 entries, one row per run]
This design is adequate to fit the linear model for four inputs, but cannot fit the
six cross-product (interaction) terms. Suppose we are willing to do eight more
runs to fit these extra terms. Here's how.
[augmented, X] = daugment(settings,8,'i');
augmented
augmented =
[a 16-by-4 matrix of +1/-1 entries: the first eight rows repeat the original design, the last eight are the new augmenting runs]

info = X'*X
info =
    16     0     0     0     0     0     0     0     0     0     0
     0    16     0     0     0     0     0     0     0     0     0
     0     0    16     0     0     0     0     0     0     0     0
     0     0     0    16     0     0     0     0     0     0     0
     0     0     0     0    16     0     0     0     0     0     0
     0     0     0     0     0    16     0     0     0     0     0
     0     0     0     0     0     0    16     0     0     0     0
     0     0     0     0     0     0     0    16     0     0     0
     0     0     0     0     0     0     0     0    16     0     0
     0     0     0     0     0     0     0     0     0    16     0
     0     0     0     0     0     0     0     0     0     0    16
Demos
Demos
The Statistics Toolbox has demonstration programs that create an interactive
environment for exploring the probability distribution, random number
generation, curve fitting, and design of experiments functions.
Demo        Purpose
disttool    Interactive graphs of probability distribution functions
polytool    Interactive polynomial curve fitting and prediction
randtool    Interactive random number generation
rsmdemo     Reaction simulation and design of experiments
[disttool interface: pop-up menus select the distribution and the function type (cdf shown); the cdf plot has a draggable vertical reference line for the x value and a draggable horizontal reference line for the cdf value, plus controls for each parameter value within its upper and lower bounds]
You can use polytool to do curve fitting and prediction for any set of x-y data,
but, for the sake of demonstration, the Statistics Toolbox provides a dataset
(polydata.mat) to teach some basic concepts.
To start the demonstration you must first load the dataset.
load polydata
who
Your variables are:

x    x1    y    y1
The variables x and y are observations made with error from a cubic
polynomial. The variables x1 and y1 are data points from the true function
without error.
If you do not specify the degree of the polynomial, polytool does a linear fit to
the data.
polytool(x,y)
[polytool interface: the data points, the fitted line, and the upper and lower confidence bounds; a draggable vertical reference line marks the x-value, and the predicted value at that point is displayed]
The linear fit is not very good. The bulk of the data with x-values between zero
and two has a steeper slope than the fitted line. The two points to the right are
dragging down the estimate of the slope.
Go to the data entry box at the top and type 3 for a cubic model. Then, drag the
vertical reference line to the x-value of two (or type 2 in the x-axis data entry
box).
[Cubic fit to the data]
This graph shows a much better fit to the data. The confidence bounds are
closer together indicating that there is less uncertainty in prediction. The data
at both ends of the plot tracks the fitted curve.
The true function in this case is cubic.
[Plot comparing the fitted polynomial with the true function]
The true function is quite close to the fitted polynomial in the region of the
data. Between the two groups of data points the two functions separate, but
both fall inside the 95% confidence bounds.
If the cubic polynomial is a good fit, it is tempting to try a higher order
polynomial to see if even more precise predictions are possible.
Since the true function is cubic, this amounts to overfitting the data. Use the
data entry box for degree and type 5 for a quintic model.
[Quintic fit to the data: the confidence bounds bulge between the two groups of data points]
The resulting fit again does well predicting the function near the data points.
But, in the region between the data groups, the uncertainty of prediction rises
dramatically.
This bulge in the confidence bounds happens because the data really do not
contain enough information to estimate the higher order polynomial terms
precisely, so even interpolation using polynomials can be risky in some cases.
[randtool interface: a histogram of the current random sample, with controls for the sample size, the distribution parameters, and an output to variable option (default: ans)]
1 Compare data gathered through trial and error with data from a designed
experiment.
2 Compare response surface (polynomial) modeling with nonlinear modeling.
Part 1
Begin the demo by using the sliders in the Reaction Simulator to control the
partial pressures of three reactants: Hydrogen, n-Pentane, and Isopentane.
Each time you click the Run button, the levels for the reactants and results of
the run are entered in the Trial and Error Data window.
Based on the results of previous runs, you can change the levels of the
reactants to increase the reaction rate. (The results are determined using an
underlying model that takes into account the noise in the process, so even if you
keep all of the levels the same, the results will vary from run to run.) You are
allotted a budget of 13 runs. When you have completed the runs, you can use
the Plot menu on the Trial and Error Data window to plot the relationships
between the reactants and the reaction rate, or click the Analyze button. When
you click Analyze, rsmdemo calls the rstool function, which you can then use
to try to optimize the results.
Next, perform another set of 13 runs, this time from a designed experiment. In
the Experimental Design Data window, click the Do Experiment button.
rsmdemo calls the cordexch function to generate a D-optimal design, and then,
for each run, computes the reaction rate.
Now use the Plot menu on the Experimental Design Data window to plot the
relationships between the levels of the reactants and the reaction rate, or click
the Response Surface button to call rstool to find the optimal levels of the
reactants.
Compare the analysis results for the two sets of data. It is likely (though not
certain) that youll find some or all of these differences:
You can fit a full quadratic model with the data from the designed
experiment, but the trial and error data may be insufficient for fitting a
quadratic model or interactions model.
Using the data from the designed experiment, you are more likely to be able
to find levels for the reactants that result in the maximum reaction rate.
Even if you find the best settings using the trial and error data, the
confidence bounds are likely to be wider than those from the designed
experiment.
Part 2
Now analyze the experimental design data with a polynomial model and a
nonlinear model, and compare the results. The true model for the process,
which is used to generate the data, is actually a nonlinear model. However,
within the range of the data, a quadratic model approximates the true model
quite well.
To see the polynomial model, click the Response Surface button on the
Experimental Design Data window. rsmdemo calls rstool, which fits a full
quadratic model to the data. Drag the reference lines to change the levels of the
reactants, and find the optimal reaction rate. Observe the width of the
confidence intervals.
Now click the Nonlinear Model button on the Experimental Design Data
window. rsmdemo calls nlintool, which fits a Hougen-Watson model to the
data. As with the quadratic model, you can drag the reference lines to change
the reactant levels. Observe the reaction rate and the confidence intervals.
Compare the analysis results for the two models. Even though the true model
is nonlinear, you may find that the polynomial model provides a good fit.
Because polynomial models are much easier to fit and work with than
nonlinear models, a polynomial model is often preferable even when modeling
a nonlinear process. Keep in mind, however, that such models are unlikely to
be reliable for extrapolating outside the range of the data.
References
Atkinson, A. C., and A. N. Donev, Optimum Experimental Designs, Oxford
Science Publications 1992.
Bates, D. and D. Watts. Nonlinear Regression Analysis and Its Applications,
John Wiley and Sons. 1988. pp. 271-272.
Bernoulli, J., Ars Conjectandi, Basiliea: Thurnisius [11.19], 1713
Chatterjee, S. and A. S. Hadi. Influential Observations, High Leverage Points,
and Outliers in Linear Regression. Statistical Science, 1986. pp. 379-416.
Efron, B., and R. J. Tibshirani. An Introduction to the Bootstrap, Chapman and
Hall, New York. 1993.
Evans, M., N. Hastings, and B. Peacock. Statistical Distributions, Second
Edition. John Wiley and Sons, 1993.
Hald, A., Statistical Theory with Engineering Applications, John Wiley and
Sons, 1960. p. 647.
Hogg, R. V., and J. Ledolter. Engineering Statistics. MacMillan Publishing
Company, 1987.
Johnson, N., and S. Kotz. Distributions in Statistics: Continuous Univariate
Distributions. John Wiley and Sons, 1970.
Moore, J., Total Biochemical Oxygen Demand of Dairy Manures. Ph.D. thesis.
University of Minnesota, Department of Agricultural Engineering, 1975.
Poisson, S. D., Recherches sur la Probabilité des Jugements en Matière
Criminelle et en Matière Civile, Précédées des Règles Générales du Calcul des
Probabilités. Paris: Bachelier, Imprimeur-Libraire pour les Mathématiques,
1837.
Student, On the Probable Error of the Mean. Biometrika, 6:1908. pp. 1-25.
Weibull, W., A Statistical Theory of the Strength of Materials. Ingeniors
Vetenskaps Akademiens Handlingar, Royal Swedish Institute for Engineering
Research. Stockholm, Sweden, No. 153. 1939.
2
Reference
The Statistics Toolbox provides several categories of functions. These categories appear in the table below.
The Statistics Toolboxs Main Categories of Functions
Probability         Probability distributions.
Descriptive         Descriptive statistics.
Plots               Statistical plots.
SPC                 Statistical Process Control.
Cluster Analysis    Cluster analysis.
Linear              Linear models.
Nonlinear           Nonlinear models.
DOE                 Design of Experiments.
PCA                 Principal Components Analysis.
Hypotheses          Hypothesis tests.
File I/O            Data file input/output.
Demos               Demonstrations.
Data                Data sets.
The following pages contain tables of functions from each of these specific
areas. The first seven tables contain probability distribution functions. The
remaining tables describe the other categories of functions.
Parameter Estimation
betafit
betalike
binofit
expfit
gamfit
gamlike
mle
normlike
normfit
poissfit
unifit
Cumulative Distribution Functions (cdf)

betacdf
Beta cdf.
binocdf
Binomial cdf.
cdf
chi2cdf
Chi-square cdf.
expcdf
Exponential cdf.
fcdf
F cdf.
gamcdf
Gamma cdf.
geocdf
Geometric cdf.
hygecdf
Hypergeometric cdf.
logncdf
Lognormal cdf.
nbincdf
ncfcdf
Noncentral F cdf.
nctcdf
Noncentral t cdf.
ncx2cdf
normcdf
poisscdf
Poisson cdf.
raylcdf
Rayleigh cdf.
tcdf
Students t cdf.
unidcdf
unifcdf
weibcdf
Weibull cdf.
Probability Density Functions (pdf)
betapdf
Beta pdf.
binopdf
Binomial pdf.
chi2pdf
Chi-square pdf.
exppdf
Exponential pdf.
fpdf
F pdf.
gampdf
Gamma pdf.
geopdf
Geometric pdf.
hygepdf
Hypergeometric pdf.
lognpdf
Lognormal pdf.
nbinpdf
ncfpdf
Noncentral F pdf.
nctpdf
Noncentral t pdf.
ncx2pdf
poisspdf
Poisson pdf.
raylpdf
Rayleigh pdf.
tpdf
Students t pdf.
unidpdf
unifpdf
weibpdf
Weibull pdf.
Inverse Cumulative Distribution Functions

binoinv
chi2inv
expinv
finv
F critical values.
gaminv
geoinv
logninv
nbininv
ncfinv
nctinv
ncx2inv
icdf
norminv
poissinv
raylinv
tinv
unidinv
unifinv
weibinv
Random Number Generators
betarnd
binornd
chi2rnd
exprnd
frnd
F random numbers.
gamrnd
hygernd
lognrnd
nbinrnd
ncfrnd
nctrnd
ncx2rnd
normrnd
poissrnd
raylrnd
random
trnd
unidrnd
unifrnd
weibrnd
Moments of Distribution Functions
betastat
binostat
chi2stat
expstat
fstat
gamstat
geostat
hygestat
lognstat
nbinstat
ncfstat
nctstat
ncx2stat
normstat
poisstat
raylstat
tstat
unidstat
unifstat
weibstat
Descriptive Statistics
corrcoef
cov
geomean
Geometric mean.
harmmean
Harmonic mean.
iqr
Interquartile range.
kurtosis
Sample kurtosis.
mad
mean
median
moment
nanmax
nanmean
nanmedian
nanmin
nanstd
nansum
prctile
range
Sample range.
skewness
Sample skewness.
std
trimmean
Trimmed mean.
var
Variance.
Statistical Plotting
boxplot
Box plots.
errorbar
fsurfht
gline
gname
lsline
normplot
pareto
Pareto charts.
qqplot
Quantile-Quantile plots.
rcoplot
refcurve
Reference polynomial.
refline
Reference line.
surfht
weibplot
Weibull plotting.
Statistical Process Control (SPC)

capaplot
ewmaplot
histfit
normspec
schart
xbarplot
Cluster Analysis
cluster
clusterdata
cophenet
dendrogram
inconsistent
linkage
pdist
squareform
zscore
Linear Models
anova1
anova2
lscov
polyconf
polyfit
polyval
regress
ridge
Ridge regression.
rstool
stepwise
Nonlinear Regression
nlinfit
nlintool
nlparci
nlpredci
nnls
Design of Experiments
cordexch
daugment
dcovary
ff2n
fullfact
hadamard
rowexch
Principal Components Analysis (PCA)

barttest
Bartlett's test.
pcacov
pcares
princomp
Hypothesis Tests
ranksum
signrank
signtest
ttest
ttest2
ztest
Z-test.
File I/O
caseread
casewrite
tblread
tblwrite
Demonstrations
disttool
randtool
polytool
rsmdemo
statdemo
Data
census.mat      U.S. census data.
cities.mat      Quality of life ratings for U.S. metropolitan areas.
discrim.mat     Classification data.
gas.mat         Gasoline prices.
hald.mat        Hald data.
hogg.mat        Bacteria counts from milk shipments.
lawdata.mat     GPA versus LSAT for 15 law schools.
mileage.mat     Mileage data for three car models from two factories.
moore.mat       Five-predictor, one-response regression data.
parts.mat       Dimensional runout on 36 circular parts.
popcorn.mat     Popcorn yield by brand and popper type.
polydata.mat    Data for polytool demo.
reaction.mat    Reaction kinetics data.
sat.dat         ASCII data for tblread example.
anova1
Purpose

One-way Analysis of Variance (ANOVA).
Syntax
p = anova1(X)
p = anova1(x,group)
Description

p = anova1(X) performs a one-way ANOVA for comparing the means of the columns
of X and returns the p-value for the null hypothesis that the column means are
all equal. If the p-value is near zero, this casts doubt on the null hypothesis
and suggests that the means of the columns are, in fact, different.
anova1(x,group) performs a one-way ANOVA for comparing the means of two
or more samples of data in x indexed by the vector, group. The input, group,
identifies the group of the corresponding element of the vector x.
The values of group are integers with minimum equal to one and maximum
equal to the number of different groups to compare. There must be at least one
element in each group. This two-input form of anova1 does not require equal
numbers of elements in each group, so it is appropriate for unbalanced data.
The choice of a limit for the p-value to determine whether the result is
statistically significant is left to the researcher. It is common to declare a
result significant if the p-value is less than 0.05 or 0.01.
anova1 also displays two figures.
The first figure is the standard ANOVA table, which divides the variability of
the data in X into two parts:
The variability due to the differences among the column means.
The variability due to the differences between the data in each column and
the column mean.
The ANOVA table has five columns:
The first shows the source of the variability.
The second shows the Sum of Squares (SS) due to each source.
The third shows the degrees of freedom (df) associated with each source.
The fourth shows the Mean Squares (MS), which is the ratio SS/df.
The fifth shows the F statistic, which is the ratio of the MSs.
Examples
The five columns of x are the constants one through five plus a random normal
disturbance with mean zero and standard deviation one. The ANOVA
procedure detects the difference in the column means with great assurance.
The probability (p) of observing the sample x by chance given that there is no
difference in the column means is less than 6 in 100,000.
x = meshgrid(1:5)
x =
     1     2     3     4     5
     1     2     3     4     5
     1     2     3     4     5
     1     2     3     4     5
     1     2     3     4     5
x = x + normrnd(0,1,5,5)
x =
    2.1650    3.6961    1.5538    3.6400    4.9551
    1.6268    2.0591    2.2988    3.8644    4.2011
    1.0751    3.7971    4.2460    2.6507    4.2348
    1.3516    2.2641    2.3610    2.7296    5.8617
    0.3035    2.8717    3.5774    4.9846    4.9438
p = anova1(x)
p =
5.9952e-05
ANOVA Table

Source     SS       df    MS        F
Columns    32.93     4    8.232     11.26
Error      14.62    20    0.7312
Total      47.55    24
[Box plot of the five columns of x]
Though alloy is sorted in this example, you do not need to sort the grouping
variable.
p = anova1(strength,alloy)
p =
1.5264e-04
ANOVA Table

Source     SS       df    MS      F
Columns    184.8     2    92.4    15.4
Error      102      17    6
Total      286.8    19
[Box plot of the strength measurements by group]
The p-value indicates that the three alloys are significantly different. The box
plot confirms this graphically and shows that the steel beams deflect more than
the more expensive alloys.
anova2
Purpose

Two-way Analysis of Variance (ANOVA).
Syntax
p = anova2(X,reps)
Description
The matrix below shows the format for a set-up where the column factor has
two levels, the row factor has three levels, and there are two replications. The
subscripts indicate row, column and replicate, respectively.
x111  x121
x112  x122
x211  x221
x212  x222
x311  x321
x312  x322
anova2 returns the p-values for the null hypotheses that the means of the
columns and the means of the rows of X are equal. If any p-value is near zero,
this casts doubt on the null hypothesis and suggests that the means of the
source of variability associated with that p-value are, in fact, different.
The choice of a limit for the p-value to determine whether the result is
statistically significant is left to the researcher. It is common to declare a
result significant if the p-value is less than 0.05 or 0.01.
anova2 also displays a figure showing the standard ANOVA table, which
divides the variability of the data in X into three or four parts depending on the
value of reps:
The variability due to the differences among the column means.
The variability due to the differences among the row means.
The variability due to the interaction between rows and columns (if reps is
greater than its default value of one).
The remaining variability not explained by any systematic source.
The ANOVA table has five columns:
The first shows the source of the variability.
The second shows the Sum of Squares (SS) due to each source.
The third shows the degrees of freedom (df) associated with each source.
The fourth shows the Mean Squares (MS), which is the ratio SS/df.
The fifth shows the F statistics, which is the ratio of the mean squares.
The p-value is a function (fcdf) of F. As F increases the p-value decreases.
Examples
The data below comes from a study of popcorn brands and popper type (Hogg
1987). The columns of the matrix popcorn are brands (Gourmet, National, and
Generic). The rows are popper type (Oil and Air.) The study popped a batch of
each brand three times with each popper. The values are the yield in cups of
popped popcorn.
load popcorn
popcorn
popcorn =
    5.5000    4.5000    3.5000
    5.5000    4.5000    4.0000
    6.0000    4.0000    3.0000
    6.5000    5.0000    4.0000
    7.0000    5.5000    5.0000
    7.0000    5.0000    4.5000
p = anova2(popcorn,3)
p =
0.0000
0.0001
0.7462
ANOVA Table

Source        SS         df    MS        F
Columns       15.75       2    7.875     56.7
Rows          4.5         1    4.5       32.4
Interaction   0.08333     2    0.04167   0.3
Error         1.667      12    0.1389
Total         22         17
The vector, p, shows the p-values for the three brands of popcorn (0.0000), the
two popper types (0.0001), and the interaction between brand and popper type
(0.7462). These values indicate that both popcorn brand and popper type affect
the yield of popcorn, but there is no evidence of a synergistic (interaction)
effect of the two.
The conclusion is that you can get the greatest yield using the Gourmet brand
and an Air popper (the three values located in popcorn(4:6,1)).
Reference

Hogg, R. V., and J. Ledolter. Engineering Statistics. MacMillan Publishing
Company, 1987.
barttest

Purpose

Bartlett's test.
betacdf
Purpose
Syntax
P = betacdf(X,A,B)
Description
betacdf(X,A,B) computes the beta cdf with parameters A and B at the values
in X. The arguments X, A, and B must all be the same size except that scalar
arguments function as constant matrices of the common size of the other
arguments.
The parameters A and B must both be positive and x must lie on the interval
[0 1].
The beta cdf is:

$p = F(x \mid a,b) = \frac{1}{B(a,b)} \int_0^x t^{a-1}(1-t)^{b-1}\,dt$
Examples
x = 0.1:0.2:0.9;
a = 2;
b = 2;
p = betacdf(x,a,b)

p =
    0.0280    0.2160    0.5000    0.7840    0.9720

a = [1 2 3];
p = betacdf(0.5,a,a)

p =
    0.5000    0.5000    0.5000
betafit
Purpose
Syntax
Description

[phat,ci] = betafit(data,alpha) returns the maximum likelihood estimates,
phat, of the beta distribution parameters and the confidence intervals in ci, a
2-by-2 matrix. The first column of the matrix contains the lower and upper
confidence bounds for parameter A, and the second column contains the
confidence bounds for parameter B.
The optional input argument, alpha, controls the width of the confidence
interval. By default, alpha is 0.05 which corresponds to 95% confidence
intervals.
Example

r = betarnd(4,3,100,1);
[phat,ci] = betafit(r)

phat =
    3.9010    2.6193

ci =
    2.5244    1.7488
    5.2777    3.4899
See Also
betalike, mle
2-25
betainv
Purpose
Syntax
X = betainv(P,A,B)
Description
betainv(P,A,B) computes the inverse of the beta cdf with parameters A and B
for the probabilities in P. The arguments P, A, and B must all be the same size
except that scalar arguments function as constant matrices of the common size
of the other arguments.
The parameters A and B must both be positive and P must lie on the interval
[0 1].
The beta inverse function in terms of the beta cdf is:

$x = F^{-1}(p \mid a,b) = \{\,x : F(x \mid a,b) = p\,\}$

where

$p = F(x \mid a,b) = \frac{1}{B(a,b)} \int_0^x t^{a-1}(1-t)^{b-1}\,dt$
The result, x, is the solution of the integral equation of the beta cdf with
parameters a and b where you supply the desired probability p.
betalike
Purpose
Syntax
Description

logl = betalike(params,data) returns the negative of the log-likelihood for the
beta distribution, given the two parameters in params and the sample in data.
The likelihood assumes that all the elements in the data sample are mutually
independent. Since betalike returns the negative beta log-likelihood function,
minimizing betalike using fmins is the same as maximizing the likelihood.
Example
This continues the example for betafit where we calculated estimates of the
beta parameters for some randomly generated beta distributed data.
r = betarnd(4,3,100,1);
[logl,info] = betalike([3.9010 2.6193],r)

logl =
  -33.0514

info =
    0.2856    0.1528
    0.1528    0.1142
betapdf
Purpose
Syntax
Y = betapdf(X,A,B)
Description
betapdf(X,A,B) computes the beta pdf with parameters A and B at the values
in X. The arguments X, A, and B must all be the same size except that scalar
arguments function as constant matrices of the common size of the other
arguments.
The parameters A and B must both be positive and X must lie on the interval
[0 1].
The probability density function for the beta distribution is:

$y = f(x \mid a,b) = \frac{1}{B(a,b)}\, x^{a-1}(1-x)^{b-1}\, I_{(0,1)}(x)$
A likelihood function is the pdf viewed as a function of the parameters.
Maximum likelihood estimators (MLEs) are the values of the parameters that
maximize the likelihood function for a fixed value of x.
The uniform distribution on [0 1] is a degenerate case of the beta where
a = 1 and b = 1.
Examples
a = [0.5 1; 2 4]

a =
    0.5000    1.0000
    2.0000    4.0000

y = betapdf(0.5,a,a)

y =
    0.6366    1.0000
    1.5000    2.1875
betarnd
Purpose
Syntax
R = betarnd(A,B)
R = betarnd(A,B,m)
R = betarnd(A,B,m,n)
Description
Examples
a = [1 1; 2 2];
b = [1 2; 1 2];
r = betarnd(a,b)
r =
    0.6987    0.6139
    0.9102    0.8067

r = betarnd(10,10,[1 5])

r =
    0.5974    0.4777    0.5538    0.5465    0.6327

r = betarnd(4,2,2,3)

r =
    0.3943    0.6101    0.5768
    0.5990    0.2760    0.5474
betastat
Purpose
Syntax
[M,V] = betastat(A,B)
Description
Examples

If the parameters a and b are equal, the mean of the beta distribution is 0.5:

a = 1:6;
[m,v] = betastat(a,a)

m =
    0.5000    0.5000    0.5000    0.5000    0.5000    0.5000

v =
    0.0833    0.0500    0.0357    0.0278    0.0227    0.0192
binocdf
Purpose
Syntax
Y = binocdf(X,N,P)
Description
The parameter N must be a positive integer and P must lie on the interval [0 1].
The binomial cdf is:

$y = F(x \mid n,p) = \sum_{i=0}^{x} \binom{n}{i} p^{i} q^{(n-i)}\, I_{(0,1,\ldots,n)}(i)$
Examples
If a baseball team plays 162 games in a season and has a 50-50 chance of
winning any game, then the probability of that team winning more than 100
games in a season is:
1 - binocdf(100,162,0.5)

The result is 0.001 (i.e., 1 - 0.999). If a team wins 100 or more games in a
season, this result suggests that it is likely that the team's true probability
of winning any game is greater than 0.5.
2-31
binofit
Purpose
Syntax
Description
See Also

mle
binoinv
Purpose
2binoinv
Syntax
X = binoinv(Y,N,P)
Description
binoinv(Y,N,P) returns the smallest integer X such that the binomial cdf
evaluated at X is equal to or exceeds Y. You can think of Y as the probability of
observing X successes in N independent trials where P is the probability of
success in each trial.
The parameter n must be a positive integer and both P and Y must lie on the
interval [0 1]. Each X is a positive integer less than or equal to N.
Examples

binoinv([0.05 0.95],162,0.5)

ans =
    71    91
This result means that in 90% of baseball seasons, a .500 team should win
between 71 and 91 games.
binopdf
Purpose
Syntax
Y = binopdf(X,N,P)
Description
Examples
Suppose an inspector tests 200 circuit boards a day, and 2% of the boards
typically have defects. What is the most likely number of defective boards the
inspector will find?
y = binopdf([0:200],200,0.02);
[x,i] = max(y);
i
i =
5

The most likely number of defective boards is 4, since the first element of y
corresponds to zero defects, so index 5 corresponds to four.
binornd
Purpose
Syntax
Description
Algorithm
Examples
The binornd function uses the direct method, based on the definition of the
binomial distribution as a sum of Bernoulli random variables.
n = 10:10:60;
r1 = binornd(n,1./n)
r1 =
2
r2 = binornd(n,1./n,[1 6])
r2 =
0
r3 = binornd(n,1./n,1,6)
r3 =
0
binostat
Purpose
Syntax
[M,V] = binostat(N,P)
Description
Examples
n = logspace(1,5,5)
n =
          10         100        1000       10000      100000

[m,v] = binostat(n,1./n)

m =
     1     1     1     1     1

v =
    0.9000    0.9900    0.9990    0.9999    1.0000

[m,v] = binostat(n,1/2)

m =
           5          50         500        5000       50000

v =
  1.0e+04 *
    0.0003    0.0025    0.0250    0.2500    2.5000
bootstrp
Purpose
Syntax
Description
Example
Correlate the LSAT scores and law-school GPA for 15 students. These 15
data points are resampled to create 1000 different datasets, and the correlation
between the two variables is computed for each dataset.
load lawdata
[bootstat,bootsam] = bootstrp(1000,'corrcoef',lsat,gpa);
bootstat(1:5,:)
ans =
    1.0000    0.3021    0.3021    1.0000
    1.0000    0.6869    0.6869    1.0000
    1.0000    0.8346    0.8346    1.0000
    1.0000    0.8711    0.8711    1.0000
    1.0000    0.8043    0.8043    1.0000
bootsam(:,1:5)
ans =
     4     7     5    12     8
     1    11    10     8     4
    11     9    12     4     2
    11    14    15     5    15
    15    13     6     6     2
     6     8     4     3     8
     8     2    15     8     6
    13    10    11    14     5
     1     7    12    14    14
     1    11    10     1     8
     8    14     2    14     7
    11    12    10     8    15
     1     4    14     8     1
     6     1     5     5    12
     2    12     7    15    12
hist(bootstat(:,2))
[Histogram of the bootstrap correlation coefficients]
The histogram shows the variation of the correlation coefficient across all the
bootstrap samples. The sample minimum is positive indicating that the
relationship between LSAT and GPA is not accidental.
boxplot
Purpose
Syntax
Description
boxplot(X) produces a box and whisker plot for each column of X. The box has
lines at the lower quartile, median, and upper quartile values. The whiskers
are lines extending from each end of the box to show the extent of the rest of
the data. Outliers are data with values beyond the ends of the whiskers. If
there is no data outside the whisker, there is a dot at the bottom whisker. The
dot color is the same as the whisker color.
boxplot(X,notch) with notch = 1 produces a notched-box plot. Notches graph
a robust estimate of the uncertainty about the means for box to box
comparison. The default, notch = 0 produces a rectangular box plot.
boxplot(X,notch,'sym') where 'sym' is a plotting symbol allows control of
the symbol for outliers if any (default = '+').
boxplot(X,notch,'sym',vert) with vert = 0 makes the boxes horizontal (the
default, vert = 1, produces vertical boxes).
boxplot
Examples
x1 = normrnd(5,1,100,1);
x2 = normrnd(6,1,100,1);
x = [x1 x2];
boxplot(x,1)
[Notched box plots of the two samples]
The difference between the means of the two columns of x is 1. We can detect
this difference graphically since the notches do not overlap.
capable
Purpose
Syntax
Description
The assumptions are that the measured values in the vector, data, are
normally distributed with constant mean and variance and that the
measurements are statistically independent.
[p,Cp,Cpk] = capable(data,lower,upper) also returns the capability indices
Cp and Cpk.
Cp is the ratio of the range of the specifications to six times the estimate of the
process standard deviation:

$C_p = \frac{USL - LSL}{6\sigma}$
For a process that has its average value on target, a Cp of one translates to a
little more than one defect per thousand. Recently many industries have set a
quality goal of one part per million. This would correspond to a Cp = 1.6. The
higher the value of Cp the more capable the process.
Cpk is the ratio of the difference between the process mean and the closer
specification limit to three times the estimate of the process standard
deviation:

$C_{pk} = \min\left(\frac{USL - \mu}{3\sigma},\; \frac{\mu - LSL}{3\sigma}\right)$
where the process mean is . For processes that do not maintain their average
on target, Cpk is a more descriptive index of process capability.
See Also
capaplot, histfit
capaplot
Purpose
Syntax
Description

capaplot(data,specs) fits the observations in the vector, data, to a
normal distribution with unknown mean and variance and plots the
distribution of a new observation (T distribution). The part of the distribution
between the lower and upper bounds contained in the two element vector,
specs, is shaded in the plot.
[p,h] = capaplot(data,specs) returns the probability of the new observation
being within specification in p and handles to the plot elements in h.
Example
See Also
capable, histfit
caseread
Purpose
Syntax
names = caseread('filename')
names = caseread
Description
names = caseread('filename') reads the contents of the file and returns each line as a row of the string matrix, names. caseread treats each line of the file as a separate case.
names = caseread displays the File Open dialog box for interactive selection of the input file.
Example
Use the file months.dat created using the function casewrite on the next page.
type months.dat
January
February
March
April
May
names = caseread('months.dat')
names =
January
February
March
April
May
See Also
casewrite
casewrite
Purpose
Syntax
casewrite(strmat,'filename')
Description
casewrite(strmat,'filename') writes the contents of the string matrix, strmat, to the named file, one case name per line.
Example
strmat = str2mat('January','February','March','April','May')
strmat =
January
February
March
April
May
casewrite(strmat,'months.dat')
type months.dat
January
February
March
April
May
See Also
caseread
cdf
Purpose
Syntax
P = cdf('name',X,A1,A2,A3)
Description
cdf is a utility routine allowing you to access all the cdfs in the Statistics
Toolbox using the name of the distribution as a parameter.
P = cdf('name',X,A1,A2,A3) returns a matrix of probabilities. name is a string
containing the name of the distribution. X is a matrix of values, and A1, A2, and
A3 are matrices of distribution parameters. Depending on the distribution,
some of the parameters may not be necessary.
The arguments X, A1, A2, and A3 must all be the same size except that scalar
arguments function as constant matrices of the common size of the other
arguments.
Examples
p = cdf('Normal',-2:2,0,1)
p =
    0.0228    0.1587    0.5000    0.8413    0.9772
p = cdf('Poisson',0:5,1:6)
p =
    0.3679    0.4060    0.4232    0.4335    0.4405    0.4457
See Also
chi2cdf
Purpose
Syntax
P = chi2cdf(X,V)
Description
chi2cdf(X,V) computes the χ² cdf with degrees of freedom V at the values in X. The χ² cdf is:

$$p = F(x \mid \nu) = \int_0^x \frac{t^{(\nu-2)/2}\, e^{-t/2}}{2^{\nu/2}\,\Gamma(\nu/2)}\,dt$$
Examples
probability = chi2cdf(5,1:5)
probability =
    0.9747    0.9179    0.8282    0.7127    0.5841
probability = chi2cdf(1:5,1:5)
probability =
    0.6827    0.6321    0.6084    0.5940    0.5841
chi2inv
Purpose
Syntax
X = chi2inv(P,V)
Description
chi2inv(P,V) computes the inverse of the χ² cdf with parameter V for the probabilities in P. The arguments P and V must be the same size except that a scalar argument functions as a constant matrix of the size of the other argument.
The χ² inverse function is defined in terms of the χ² cdf:

$$x = F^{-1}(p \mid \nu) = \{\, x : F(x \mid \nu) = p \,\}$$

where

$$p = F(x \mid \nu) = \int_0^x \frac{t^{(\nu-2)/2}\, e^{-t/2}}{2^{\nu/2}\,\Gamma(\nu/2)}\,dt$$

The result, x, is the solution of the integral equation of the χ² cdf with parameter ν where you supply the desired probability p.
Examples
Find a value that exceeds 95% of the samples from a χ² distribution with 10
degrees of freedom.
x = chi2inv(0.95,10)
x =
18.3070
You would observe values greater than 18.3 only 5% of the time by chance.
chi2pdf
Purpose
Syntax
Y = chi2pdf(X,V)
Description
chi2pdf(X,V) computes the χ² pdf with degrees of freedom V at the values in X. The χ² pdf is:

$$y = f(x \mid \nu) = \frac{x^{(\nu-2)/2}\, e^{-x/2}}{2^{\nu/2}\,\Gamma(\nu/2)}$$

The χ² density function with n degrees of freedom is the same as the gamma density function with parameters n/2 and 2.
If x is standard normal, then x² is distributed χ² with one degree of freedom. If x1, x2, ..., xn are n independent standard normal observations, then the sum of the squares of the x's is distributed χ² with n degrees of freedom.
Examples
nu = 1:6;
x = nu;
y = chi2pdf(x,nu)
y =
    0.2420    0.1839    0.1542    0.1353    0.1220    0.1120
The mean of the χ² distribution is the value of the parameter, nu. The above example shows that the probability density of the mean falls as nu increases.
chi2rnd
Purpose
Syntax
R = chi2rnd(V)
R = chi2rnd(V,m)
R = chi2rnd(V,m,n)
Description
Examples
Note that the first and third commands are the same but are different from the
second command.
r = chi2rnd(1:6)
r =
    0.0037    3.0377    7.8142    0.9021    3.2019    9.0729
r = chi2rnd(6,[1 6])
r =
   12.2497    3.0388    6.3133    5.0388    0.8273    3.2506
r = chi2rnd(1:6,1,6)
r =
    1.5469   10.9197    6.5249    2.6226    0.7638    6.0955
chi2stat
Purpose
Syntax
[M,V] = chi2stat(NU)
Description
Example
nu = 1:10;
nu = nu'*nu;
[m,v] = chi2stat(nu)
m =
     1     2     3     4     5     6     7     8     9    10
     2     4     6     8    10    12    14    16    18    20
     3     6     9    12    15    18    21    24    27    30
     4     8    12    16    20    24    28    32    36    40
     5    10    15    20    25    30    35    40    45    50
     6    12    18    24    30    36    42    48    54    60
     7    14    21    28    35    42    49    56    63    70
     8    16    24    32    40    48    56    64    72    80
     9    18    27    36    45    54    63    72    81    90
    10    20    30    40    50    60    70    80    90   100
v =
     2     4     6     8    10    12    14    16    18    20
     4     8    12    16    20    24    28    32    36    40
     6    12    18    24    30    36    42    48    54    60
     8    16    24    32    40    48    56    64    72    80
    10    20    30    40    50    60    70    80    90   100
    12    24    36    48    60    72    84    96   108   120
    14    28    42    56    70    84    98   112   126   140
    16    32    48    64    80    96   112   128   144   160
    18    36    54    72    90   108   126   144   162   180
    20    40    60    80   100   120   140   160   180   200
classify
Purpose
Syntax
class = classify(sample,training,group)
Description
The vector group contains integers, from one to the number of groups, that
identify which group each row of the training set belongs to. group and training
must have the same number of rows.
The function returns class, a vector with the same number of rows as sample.
Each element of class identifies the group to which the corresponding element
of sample has been assigned. The classify function determines which group each row in sample belongs to by computing, for each group, the Mahalanobis distance between that row and the group's training data.
Example
load discrim
sample = ratings(idx,:);
training = ratings(1:200,:);
g = group(1:200);
class = classify(sample,training,g);
first5 = class(1:5)
first5 =
2
2
2
2
2
See Also
mahal
cluster
Purpose
Syntax
T = cluster(Z,cutoff)
T = cluster(Z,cutoff,depth)
Description
Example
The example uses the pdist function to calculate the distance between items
in a matrix of random numbers and then uses the linkage function to compute
the hierarchical cluster tree based on the matrix. The output of the linkage
function is passed to the cluster function. The cutoff value 3 indicates that
you want to group the items into three clusters. The example uses the find
function to list all the items grouped into cluster 3.
rand('seed', 0);
X = [rand(10,3); rand(10,3)+1; rand(10,3)+2];
Y = pdist(X);
Z = linkage(Y);
T = cluster(Z,3);
find(T == 3)
ans =
11
12
13
14
15
16
17
18
19
20
See Also
clusterdata
Purpose
Syntax
T = clusterdata(X,cutoff)
Description
Follow this sequence to use nondefault parameters for pdist and linkage.
Example
The example first creates a sample dataset of random numbers. The example
then uses the clusterdata function to compute the distances between items in
the dataset and create a hierarchical cluster tree from the dataset. Finally, the
clusterdata function groups the items in the dataset into three clusters. The
example uses the find function to list all the items in cluster 2.
rand('seed', 12);
X = [rand(10,3); rand(10,3)+1.2; rand(10,3)+2.5];
T = clusterdata(X,3);
find(T == 2)
ans =
21
22
23
24
25
26
27
28
29
30
See Also
combnk
Purpose
All combinations of the elements of a vector taken k at a time.
Syntax
C = combnk(v,k)
Description
C = combnk(v,k) produces a matrix, with k columns. Each row of C has k of the elements in the vector v. C has n!/(k!(n-k)!) rows, where n is length(v).
It is not feasible to use this function if v has more than about 10 elements.
Example
c = combnk(1:4,2)
c =
     3     4
     2     4
     2     3
     1     4
     1     3
     1     2
cophenet
Purpose
Syntax
c = cophenet(Z,Y)
Description
For example, given a group of objects {1, 2, ..., m} with distances Y, the function
linkage produces a hierarchical cluster tree. The cophenet function measures
the distortion of this classification, indicating how readily the data fits into the
structure suggested by the classification.
The output value, c, is the cophenetic correlation coefficient. The magnitude of
this value should be very close to 1 for a high-quality solution. This measure
can be used to compare alternative cluster solutions obtained using different
algorithms.
The cophenetic correlation between Z(:,3) and Y is defined as:

$$c = \frac{\sum_{i<j}(Y_{ij}-\bar{y})(Z_{ij}-\bar{z})}{\sqrt{\sum_{i<j}(Y_{ij}-\bar{y})^2\;\sum_{i<j}(Z_{ij}-\bar{z})^2}}$$

where:
Yij is the distance between objects i and j in Y.
Zij is the distance between objects i and j in Z(:,3).
ȳ and z̄ are the averages of Y and Z(:,3), respectively.
Example
rand('seed',12);
X = [rand(10,3);rand(10,3)+1;rand(10,3)+2];
Y = pdist(X);
Z = linkage(Y,'centroid');
c = cophenet(Z,Y)
c =
0.6985
See Also
cluster, dendrogram, inconsistent, linkage, pdist, squareform
cordexch
Purpose
Syntax
settings = cordexch(nfactors,nruns)
[settings,X] = cordexch(nfactors,nruns,'model')
Description
[settings,X] = cordexch(nfactors,nruns) uses a coordinate-exchange algorithm to generate the settings of a D-optimal design, and the associated design matrix, X.
[settings,X] = cordexch(nfactors,nruns,'model') produces a design for fitting a specified regression model. The input, 'model', can be one of these strings:

'linear'          includes constant and linear terms (the default)
'interaction'     includes constant, linear, and cross product terms
'quadratic'       includes interactions and squared terms
'purequadratic'   includes constant, linear, and squared terms
Example
The D-optimal design for two factors in nine runs using a quadratic model is the 3² factorial as shown below:
settings = cordexch(2,9,'quadratic')
settings =
    -1    -1
     1    -1
     0    -1
     1     1
    -1     1
     0     1
     1     0
     0     0
    -1     0
See Also
corrcoef
Purpose
Correlation coefficients.
Syntax
R = corrcoef(X)
Description
R = corrcoef(X) returns a matrix of correlation coefficients calculated from an input matrix whose rows are observations and whose columns are variables. R(i,j) is related to the covariance matrix C = cov(X) by:

$$R(i,j) = \frac{C(i,j)}{\sqrt{C(i,i)\,C(j,j)}}$$
See Also
cov
Purpose
Covariance matrix.
Syntax
C = cov(X)
C = cov(x,y)
Description
cov computes the covariance matrix. For a single vector, cov(x) returns a
scalar containing the variance. For matrices, where each row is an observation,
and each column a variable, cov(X) is the covariance matrix.
The variance function, var(X), is the same as diag(cov(X)). The standard deviation function, std(X), is equivalent to sqrt(diag(cov(X))).
cov(x,y), where x and y are column vectors of equal length, gives the same
result as cov([x y]).
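A quick numerical check of these identities; the data below is an arbitrary example:

x = randn(10,3);
v1 = var(x);                  % row vector of column variances
v2 = diag(cov(x))'            % matches var(x)
s  = sqrt(diag(cov(x)))'      % matches std(x)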
Algorithm
See Also
crosstab
Purpose
Syntax
table = crosstab(col1,col2)
[table,chi2,p] = crosstab(col1,col2)
Description
Example
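A minimal sketch of a call consistent with the output fragments below; the random draws are an assumption:

r1 = unidrnd(3,50,1);
r2 = unidrnd(3,50,1);
[table,chi2,p] = crosstab(r1,r2)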
     5     8    13
chi2 =
4.1723
p =
0.1242
The result, 0.1242, is not a surprise. A very small value of p would make us
suspect the randomness of the random number generator.
See Also
tabulate
daugment
Purpose
Syntax
settings = daugment(startdes,nruns)
[settings,X] = daugment(startdes,nruns,'model')
Description
Example
     1     1     1
     1     0     0
     1     0     1
See Also
dcovary
Purpose
Syntax
settings = dcovary(factors,covariates)
[settings,X] = dcovary(factors,covariates,'model')
Description
Example
Suppose we want to block an eight run experiment into 4 blocks of size 2 to fit
a linear model on two factors.
covariates = dummyvar([1 1 2 2 3 3 4 4]);
settings = dcovary(2,covariates(:,1:3),'linear')
settings =
     1     1     1     0     0
    -1    -1     1     0     0
    -1     1     0     1     0
     1    -1     0     1     0
     1     1     0     0     1
    -1    -1     0     0     1
    -1     1     0     0     0
     1    -1     0     0     0
The first two columns of the output matrix contain the settings for the two
factors. The last three columns are dummy variable codings for the four blocks.
See Also
daugment, cordexch
dendrogram
Purpose
Syntax
H = dendrogram(Z)
H = dendrogram(Z,p)
[H,T] = dendrogram(...)
Description
When there are fewer than p objects in the original data, all objects are
displayed in the dendrogram. In this case, T is the identity map, i.e.,
T = (1:m)', where each node contains only itself.
Example
rand('seed',12);
X= rand(100,2);
Y = pdist(X,'cityblock');
Z= linkage(Y,'average');
[H, T] = dendrogram(Z);
[Figure: dendrogram of the hierarchical cluster tree with 30 leaf nodes]
find(T==20)
ans =
20
49
62
65
73
96
This output indicates that leaf node 20 in the dendrogram contains the original
data points 20, 49, 62, 65, 73, and 96.
See Also
disttool
Purpose
Syntax
disttool
Description
The disttool command sets up a graphic user interface for exploring the
effects of changing parameters on the plot of a cdf or pdf. Clicking and dragging
a vertical line on the plot allows you to evaluate the function over its entire
domain interactively.
Evaluate the plotted function by typing a value in the x-axis edit box or
dragging the vertical reference line on the plot. For cdfs, you can evaluate the
inverse function by typing a value in the y-axis edit box or dragging the
horizontal reference line on the plot. The shape of the pointer changes from an
arrow to a crosshair when you are over the vertical or horizontal line to indicate
that the reference line is draggable.
To change the distribution function choose from the pop-up menu of functions
at the top left of the figure. To change from cdfs to pdfs, choose from the pop-up
menu at the top right of the figure.
To change the parameter settings move the sliders or type a value in the edit
box under the name of the parameter. To change the limits of a parameter, type
a value in the edit box at the top or bottom of the parameter slider.
When you are done, press the Close button.
See Also
randtool
dummyvar
Purpose
Syntax
D = dummyvar(group)
Description
Example
Suppose we are studying the effects of two machines and three operators on a
process. The first column of group would have the values one or two depending
on which machine was used. The second column of group would have the values
one, two, or three depending on which operator ran the machine.
group = [1 1;1 2;1 3;2 1;2 2;2 3];
D = dummyvar(group)
D =
     1     0     1     0     0
     1     0     0     1     0
     1     0     0     0     1
     0     1     1     0     0
     0     1     0     1     0
     0     1     0     0     1
See Also
pinv, regress
errorbar
Purpose
Syntax
errorbar(X,Y,L,U,symbol)
errorbar(X,Y,L)
errorbar(Y,L)
Description
errorbar(X,Y,L,U,symbol) plots X versus Y with error bars L and U. If X, Y, L, and U are matrices, each column produces a separate line. The error bars are each drawn a distance of U(i) above and L(i) below the points in (X,Y). symbol is a string that controls the line type, plotting symbol, and color of the error bars.
errorbar(X,Y,L) plots X versus Y with symmetric error bars about Y.
errorbar(Y,L) plots Y with error bars [Y-L Y+L].
Example
lambda = (0.1:0.2:0.5);
r = poissrnd(lambda(ones(50,1),:));
[p,pci] = poissfit(r,0.001);
L = p - pci(1,:)
U = pci(2,:) - p
errorbar(1:3,p,L,U,'+')
L =
    0.1200    0.1600    0.2600
U =
    0.2000    0.2200    0.3400
[Figure: error bars plotted at x = 1, 2, 3]
See Also
ewmaplot
Purpose
Syntax
ewmaplot(data)
ewmaplot(data,lambda,alpha,specs)
Description
Example
Consider a process with a slowly drifting mean over time. An EWMA chart is
preferable to an x-bar chart for monitoring this kind of process. This simulation
demonstrates an EWMA chart for a slow linear drift.
t = (1:30)';
r = normrnd(10+0.02*t(:,ones(4,1)),0.5);
ewmaplot(r,0.4,0.01,[9.3 10.7])
2-71
ewmaplot
[Figure: EWMA chart vs. Sample Number, with USL and LSL specification lines and the LCL control limit]
Reference
See Also
xbarplot, schart
expcdf
Purpose
Syntax
P = expcdf(X,MU)
Description
expcdf(X,MU) computes the exponential cdf with parameter MU at the values in X. The exponential cdf is:

$$p = F(x \mid \mu) = \int_0^x \frac{1}{\mu}\, e^{-t/\mu}\,dt = 1 - e^{-x/\mu}$$
Examples
The median of the exponential distribution is µ*log(2). Demonstrate this fact:
mu = 10:10:60;
p = expcdf(log(2)*mu,mu)
p =
    0.5000    0.5000    0.5000    0.5000    0.5000    0.5000
What is the probability that an exponential random variable will be less than or equal to the mean, µ?
mu = 1:6;
x = mu;
p = expcdf(x,mu)
p =
    0.6321    0.6321    0.6321    0.6321    0.6321    0.6321
expfit
Purpose
Syntax
muhat = expfit(x)
[muhat,muci] = expfit(x,alpha)
Description
muhat = expfit(x) returns the estimate of the parameter, µ, of the exponential distribution given the data, x. [muhat,muci] = expfit(x,alpha) also returns the 100(1-alpha) percent confidence interval for the parameter.
Example
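A minimal sketch of the usual calling pattern, matching the toolbox's other fitting routines; the simulated data are an assumption:

r = exprnd(3,100,1);          % simulated data with mu = 3
[muhat,muci] = expfit(r)      % MLE of mu and a 95% confidence interval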
See Also
expinv
Purpose
Syntax
X = expinv(P,MU)
Description
expinv(P,MU) computes the inverse of the exponential cdf with parameter MU for the probabilities in P. The arguments must be the same size except that a scalar argument functions as a constant matrix of the size of the other argument.
The parameter MU must be positive and P must lie on the interval [0 1].
The inverse of the exponential cdf is:

$$x = F^{-1}(p \mid \mu) = -\mu \ln(1-p)$$

The result, x, is the value such that the probability is p that an observation from an exponential distribution with parameter µ will fall in the range [0 x].
Examples
Let the lifetime of light bulbs be exponentially distributed with mu equal to 700
hours. What is the median lifetime of a bulb?
expinv(0.50,700)
ans =
485.2030
So, suppose you buy a box of 700-hour light bulbs. If 700 hours is the mean life of the bulbs, then half of them will burn out in less than 500 hours.
exppdf
Purpose
Syntax
Y = exppdf(X,MU)
Description
exppdf(X,MU) computes the exponential pdf with parameter MU at the values in X. The exponential pdf is:

$$y = f(x \mid \mu) = \frac{1}{\mu}\, e^{-x/\mu}$$

The exponential pdf is the gamma pdf with its first parameter (a) equal to 1.
The exponential distribution is appropriate for modeling waiting times when
you think the probability of waiting an additional period of time is independent
of how long you've already waited. For example, the probability that a light
bulb will burn out in its next minute of use is relatively independent of how
many minutes it has already burned.
Examples
y = exppdf(5,1:5)
y =
    0.0067    0.0410    0.0630    0.0716    0.0736
y = exppdf(1:5,1:5)
y =
    0.3679    0.1839    0.1226    0.0920    0.0736
exprnd
Purpose
Syntax
R = exprnd(MU)
R = exprnd(MU,m)
R = exprnd(MU,m,n)
Description
Examples
n1 = exprnd(5:10)
n1 =
    7.5943   18.3400    2.7113    3.0936    0.6078    9.5841
n2 = exprnd(5:10,[1 6])
n2 =
   23.5530   23.4303    5.7190    3.9876    3.2752    1.1110
n3 = exprnd(5,2,3)
n3 =
   24.3339   13.5271    1.8788
    4.7932    4.3675    2.6468
expstat
Purpose
Syntax
[M,V] = expstat(MU)
Description
Examples
[m,v] = expstat([10 100 1000])
m =
          10         100        1000
v =
         100       10000     1000000
fcdf
Purpose
Syntax
P = fcdf(X,V1,V2)
Description
fcdf(X,V1,V2) computes the F cdf with numerator degrees of freedom V1 and denominator degrees of freedom V2 at the values in X. The F cdf is:

$$p = F(x \mid \nu_1, \nu_2) = \int_0^x \frac{\Gamma\!\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\!\left(\frac{\nu_1}{2}\right)\Gamma\!\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} \frac{t^{(\nu_1-2)/2}}{\left(1 + \frac{\nu_1}{\nu_2}t\right)^{(\nu_1+\nu_2)/2}}\,dt$$
Examples
This example illustrates an important and useful mathematical identity for the
F distribution.
nu1 = 1:5;
nu2 = 6:10;
x = 2:6;
F1 = fcdf(x,nu1,nu2)
F1 =
    0.7930    0.8854    0.9481    0.9788    0.9919
F2 = 1 - fcdf(1./x,nu2,nu1)
F2 =
    0.7930    0.8854    0.9481    0.9788    0.9919
ff2n
Purpose
Syntax
X = ff2n(n)
Description
X = ff2n(n) creates a two-level full factorial design, X, where n is the number of columns of X. The number of rows of X is 2^n.
Example
X = ff2n(3)
X =
     0     0     0
     0     0     1
     0     1     0
     0     1     1
     1     0     0
     1     0     1
     1     1     0
     1     1     1
See Also
fullfact
finv
Purpose
Syntax
X = finv(P,V1,V2)
Description
X = finv(P,V1,V2) computes the inverse of the F cdf with numerator degrees of freedom V1 and denominator degrees of freedom V2 for the probabilities in P.
The parameters V1 and V2 must both be positive integers and P must lie on the
interval [0 1].
The F inverse function is defined in terms of the F cdf:

$$x = F^{-1}(p \mid \nu_1, \nu_2) = \{\, x : F(x \mid \nu_1, \nu_2) = p \,\}$$

where

$$p = F(x \mid \nu_1, \nu_2) = \int_0^x \frac{\Gamma\!\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\!\left(\frac{\nu_1}{2}\right)\Gamma\!\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} \frac{t^{(\nu_1-2)/2}}{\left(1 + \frac{\nu_1}{\nu_2}t\right)^{(\nu_1+\nu_2)/2}}\,dt$$
Examples
Find a value that should exceed 95% of the samples from an F distribution with
5 degrees of freedom in the numerator and 10 degrees of freedom in the
denominator.
x = finv(0.95,5,10)
x =
3.3258
You would observe values greater than 3.3258 only 5% of the time by chance.
fpdf
Purpose
Syntax
Y = fpdf(X,V1,V2)
Description
fpdf(X,V1,V2) computes the F pdf with numerator degrees of freedom V1 and denominator degrees of freedom V2 at the values in X. The F pdf is:

$$y = f(x \mid \nu_1, \nu_2) = \frac{\Gamma\!\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\!\left(\frac{\nu_1}{2}\right)\Gamma\!\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} \frac{x^{(\nu_1-2)/2}}{\left(1 + \frac{\nu_1}{\nu_2}x\right)^{(\nu_1+\nu_2)/2}}$$
Examples
y = fpdf(1:6,2,2)
y =
    0.2500    0.1111    0.0625    0.0400    0.0278    0.0204
z = fpdf(3,5:10,5:10)
z =
    0.0689    0.0659    0.0620    0.0577    0.0532    0.0487
frnd
Purpose
Syntax
Description
Examples
n1 = frnd(1:6,1:6)
n1 =
    0.0022    0.3121    3.0528    0.3189    0.2715    0.9539
n2 = frnd(2,2,[2 3])
n2 =
    0.3186    0.9727    3.0268
    0.2052  148.5816    0.2191
n3 = frnd(2,2,2,2)
n3 =
    0.2322   31.5458
    0.2121    4.4955
fstat
Purpose
Syntax
[M,V] = fstat(V1,V2)
Description
[M,V] = fstat(V1,V2) returns the mean and variance of the F distribution with parameters V1 and V2. The mean, for values of ν₂ greater than 2, is:

$$\frac{\nu_2}{\nu_2 - 2}$$

The variance, for values of ν₂ greater than 4, is:

$$\frac{2\nu_2^2(\nu_1 + \nu_2 - 2)}{\nu_1(\nu_2-2)^2(\nu_2-4)}$$

The mean of the F distribution is undefined if ν₂ is less than 3. The variance is undefined for ν₂ less than 5.
Examples
fstat returns NaN when the mean and variance are undefined.
[m,v] = fstat(1:5,1:5)
m =
       NaN       NaN    3.0000    2.0000    1.6667
v =
       NaN       NaN       NaN       NaN    8.8889
fsurfht
Purpose
Syntax
Description
fsurfht('fun',xlims,ylims) is an interactive contour plot of the function specified by the text variable fun. The x-axis limits are specified by xlims and the y-axis limits by ylims.
There are vertical and horizontal reference lines on the plot whose intersection
defines the current x-value and y-value. You can drag these dotted white
reference lines and watch the calculated z-values (at the top of the plot) update
simultaneously. Alternatively, you can get a specific z-value by typing the
x-value and y-value into editable text fields on the x-axis and y-axis
respectively.
Example
gauslike calls normpdf treating the data sample as fixed and the parameters
µ and σ as variables. Assume that the gas prices are normally distributed and
plot the likelihood surface of the sample.
fsurfht('gauslike',[112 118],[3 5],price1)
[Figure: likelihood contour plot of the sample over µ (112 to 118) and σ (3 to 5)]
The sample mean is the x-value at the maximum, but the sample standard
deviation is not the y-value at the maximum.
mumax = mean(price1)
mumax =
115.1500
sigmamax = std(price1)*sqrt(19/20)
sigmamax =
3.7719
fullfact
Purpose
Syntax
design = fullfact(levels)
Description
design = fullfact(levels) gives the factor settings for a full factorial design.
Each element in the vector levels specifies the number of unique values in the
corresponding column of design.
For example, if the first element of levels is 3, then the first column of design
contains only integers from 1 to 3.
Example
If levels = [2 4], fullfact generates an eight run design with two levels in
the first column and four in the second column.
d = fullfact([2 4])
d =
     1     1
     2     1
     1     2
     2     2
     1     3
     2     3
     1     4
     2     4
See Also
gamcdf
Purpose
Syntax
P = gamcdf(X,A,B)
Description
gamcdf(X,A,B) computes the gamma cdf with parameters A and B at the values
in X. The arguments X, A, and B must all be the same size except that scalar
arguments function as constant matrices of the common size of the other
arguments.
The gamma cdf is:

$$p = F(x \mid a, b) = \frac{1}{b^a\,\Gamma(a)} \int_0^x t^{a-1}\, e^{-t/b}\,dt$$
Examples
a = 1:6;
b = 5:10;
prob = gamcdf(a.*b,a,b)
prob =
    0.6321    0.5940    0.5768    0.5665    0.5595    0.5543
The mean of the gamma distribution is the product of the parameters, a*b. In
this example as the mean increases, it approaches the median (i.e., the
distribution gets more symmetric).
gamfit
Purpose
Syntax
phat = gamfit(x)
[phat,pci] = gamfit(x,alpha)
Description
Example
Note the 95% confidence intervals in the example bracket the true parameter
values, 2 and 4, respectively.
a = 2; b = 4;
r = gamrnd(a,b,100,1);
[p,ci] = gamfit(r)
p =
    2.1990    3.7426
ci =
    1.6840    2.8298
    2.7141    4.6554
Reference
See Also
gaminv
Purpose
Syntax
X = gaminv(P,A,B)
Description
gaminv(P,A,B) computes the inverse of the gamma cdf with parameters A and
B for the probabilities in P. The arguments P, A and B must all be the same size
except that scalar arguments function as constant matrices of the common size
of the other arguments.
The parameters A and B must both be positive and P must lie on the interval
[0 1].
The gamma inverse function in terms of the gamma cdf is:

$$x = F^{-1}(p \mid a, b) = \{\, x : F(x \mid a, b) = p \,\}$$

where

$$p = F(x \mid a, b) = \frac{1}{b^a\,\Gamma(a)} \int_0^x t^{a-1}\, e^{-t/b}\,dt$$
Algorithm
There is no known analytic solution to the integral equation above. gaminv uses an iterative approach (Newton's method) to converge to the solution.
Examples
This example shows the relationship between the gamma cdf and its inverse
function.
a = 1:5;
b = 6:10;
x = gaminv(gamcdf(1:5,a,b),a,b)
x =
    1.0000    2.0000    3.0000    4.0000    5.0000
gamlike
Purpose
Negative gamma log-likelihood function.
Syntax
logL = gamlike(params,data)
[logL,info] = gamlike(params,data)
Description
[logL,info] = gamlike(params,data) returns the negative of the gamma log-likelihood function for the parameters, params, given data. The second output, info, is a matrix whose diagonal elements are the asymptotic variances of the respective parameters.
gamlike is a utility function for maximum likelihood estimation of the gamma
distribution. Since gamlike returns the negative gamma log-likelihood
function, minimizing gamlike using fmins is the same as maximizing the
likelihood.
Example
info =
    0.0690    0.0790
    0.0790    0.1220
See Also
gampdf
Purpose
Syntax
Y = gampdf(X,A,B)
Description
gampdf(X,A,B) computes the gamma pdf with parameters A and B at the values
in X. The arguments X, A and B must all be the same size except that scalar
arguments function as constant matrices of the common size of the other
arguments.
The parameters A and B must both be positive and X must lie on the interval [0 ∞).
The gamma pdf is:

$$y = f(x \mid a, b) = \frac{1}{b^a\,\Gamma(a)}\, x^{a-1}\, e^{-x/b}$$
Examples
The exponential distribution is a special case of the gamma distribution: the gamma pdf with a = 1 matches the exponential pdf.
mu = 1:5;
y = gampdf(1,1,mu)
y =
    0.3679    0.3033    0.2388    0.1947    0.1637
y1 = exppdf(1,mu)
y1 =
    0.3679    0.3033    0.2388    0.1947    0.1637
gamrnd
Purpose
Syntax
Description
Examples
n1 = gamrnd(1:5,6:10)
n1 =
    9.1132   12.8431   24.8025   38.5960  106.4164
n2 = gamrnd(5,10,[1 5])
n2 =
   30.9486   33.5667   33.6837   55.2014   46.8265
n3 = gamrnd(2:6,3,1,5)
n3 =
   12.8715   11.3068    3.0982   15.6012   21.6739
gamstat
Purpose
Syntax
[M,V] = gamstat(A,B)
Description
[M,V] = gamstat(A,B) returns the mean and variance of the gamma distribution with parameters A and B. The mean is a*b and the variance is a*b².
Examples
[m,v] = gamstat(1:5,1:5)
m =
     1     4     9    16    25
v =
     1     8    27    64   125
[m,v] = gamstat(1:5,1./(1:5))
m =
     1     1     1     1     1
v =
    1.0000    0.5000    0.3333    0.2500    0.2000
geocdf
Purpose
Syntax
Y = geocdf(X,P)
Description
geocdf(X,P) computes the geometric cdf with probability P at the values in X. The geometric cdf is:

$$y = F(x \mid p) = \sum_{i=0}^{\lfloor x \rfloor} p\, q^i$$

where q = 1 - p.
The result, y, is the probability of observing up to x trials before a success when the probability of success in any given trial is p.
Examples
Suppose you toss a fair coin repeatedly. If the coin lands face up (heads), that
is a success. What is the probability of observing three or fewer tails before
getting a heads?
p = geocdf(3,0.5)
p =
0.9375
geoinv
Purpose
Syntax
X = geoinv(Y,P)
Description
geoinv(Y,P) returns the smallest integer X such that the geometric cdf
evaluated at X is equal to or exceeds Y. You can think of Y as the probability of observing up to X trials before a success, where P is the probability of success in each trial.
The arguments P and Y must lie on the interval [0 1]. Each X is a positive
integer.
Examples
The probability of correctly guessing the result of 10 coin tosses in a row is less
than 0.001 (unless the coin is not fair.)
psychic = geoinv(0.999,0.5)
psychic =
9
The example below shows the inverse method for generating random numbers
from the geometric distribution.
rndgeo = geoinv(rand(2,5),0.5)
rndgeo =
     0     1     3     1     0
     0     1     0     2     0
geomean
Purpose
Syntax
m = geomean(X)
Description
geomean(X) calculates the geometric mean of a sample. For vectors, geomean(x) is the geometric mean of the elements in x. For matrices, geomean(X) is a row vector containing the geometric means of each column. The geometric mean is:

$$m = \left[\prod_{i=1}^{n} x_i\right]^{1/n}$$
Examples
geometric = geomean(x)
geometric =
    0.6061    0.6038    0.2569    0.7539    0.3478    0.9741
average = mean(x)
average =
    0.5319    1.0088    0.8122    1.3509    1.1583
See Also
mean, median, harmmean
geopdf
Purpose
Syntax
Y = geopdf(X,P)
Description
geopdf(X,P) computes the geometric pdf with probability P at the values in X. The geometric pdf is:

$$y = f(x \mid p) = p\, q^x\, I_{(0,1,\ldots)}(x)$$

where q = 1 - p.
Examples
Suppose you toss a fair coin repeatedly. If the coin lands face up (heads), that
is a success. What is the probability of observing exactly three tails before
getting a heads?
p = geopdf(3,0.5)
p =
0.0625
geornd
Purpose
Syntax
Description
The geometric distribution is useful when you want to model the number of
failed trials in a row before a success where the probability of success in any
given trial is the constant P.
R = geornd(P) generates geometric random numbers with probability
parameter, P . The size of R is the size of P.
R = geornd(P,m) generates geometric random numbers with probability
parameter, P. m is a 1-by-2 vector that contains the row and column dimensions
of R.
R = geornd(P,m,n) generates geometric random numbers with probability
parameter, P. The scalars m and n are the row and column dimensions of R.
Examples
r1 = geornd(1 ./ 2.^(1:6))
r1 =
     2    10    60
r2 = geornd(0.01,[1 5])
r2 =
    65    18   334   291    63
r3 = geornd(0.5,1,6)
r3 =
     0
geostat
Purpose
Syntax
[M,V] = geostat(P)
Description
Examples
[m,v] = geostat(1./(1:6))
m =
         0    1.0000    2.0000    3.0000    4.0000    5.0000
v =
         0    2.0000    6.0000   12.0000   20.0000   30.0000
gline
Purpose
Syntax
gline(fig)
h = gline(fig)
gline
Description
gline(fig) draws a line segment by clicking the mouse at the two end-points
of the line segment in the figure, fig. A rubber band line tracks the mouse
movement.
h = gline(fig) returns the handle to the line in h.
gline with no input arguments draws in the current figure.
See Also
refline, gname
gname
Purpose
Syntax
gname('cases')
gname
h = gname('cases',line_handle)
Description
gname('cases') displays the graph window, puts up a cross-hair, and waits for
a mouse button or keyboard key to be pressed. Position the cross-hair with the
mouse and click once near each point that you want to label. When you are
done, press the Return or Enter key and the labels will appear at each point
that you clicked. 'cases' is a string matrix. Each row is the case name of a data
point.
gname with no arguments labels each case with its case number.
h = gname(cases,line_handle) returns a vector of handles to the text objects
on the plot. Use the scalar, line_handle, to identify the correct line if there is
more than one line object on the plot.
Example
Let's use the city ratings datasets to find out which cities are the best and worst for education and the arts.
load cities
load cities
education = ratings(:,6); arts = ratings(:,7);
plot(education,arts,'+')
gname(names)
[Figure: scatter plot of arts vs. education ratings; gname labels New York, NY (top) and Pascagoula, MS (bottom)]
See Also
gtext
grpstats
Purpose
Syntax
means = grpstats(x,group)
grpstats(x,group,alpha)
Description
means = grpstats(x,group) returns the means of each column of x by group, where group is a vector of integers identifying the group to which each row of x belongs.
grpstats(x,group,alpha) plots 100(1 - alpha)% confidence intervals around each mean.
Example
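A minimal sketch of a call consistent with the group means below; the data and grouping vector are assumptions:

group = [ones(10,1); 2*ones(10,1); 3*ones(10,1); 4*ones(10,1)];
x = group(:,ones(1,4)) + randn(40,4);   % four columns with group-dependent means
means = grpstats(x,group)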
means =
    2.0908    2.8969    3.6749    4.6555
    1.7600    3.0285    3.9484    4.8169
    2.0255    2.8793    4.0799    5.3740
    1.9264    2.8232    3.8815    4.9689
See Also
tabulate, crosstab
harmmean
Purpose
Syntax
m = harmmean(X)
Description
harmmean(X) calculates the harmonic mean of a sample. For vectors, harmmean(x) is the harmonic mean of the elements in x. For matrices, harmmean(X) is a row vector containing the harmonic means of each column. The harmonic mean is:

$$m = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
Examples
harmonic = harmmean(x)
harmonic =
    0.3200    0.3710    0.0540    0.4936    0.0907    0.9741
average = mean(x)
average =
    0.5319    1.0088    0.8122    1.3509    1.1583
See Also
mean, median, geomean
hist
Purpose
Plot histograms.
Syntax
hist(y)
hist(y,nb)
hist(y,x)
[n,x] = hist(y,...)
Description
Examples
See Also
histfit
Purpose
Syntax
histfit(data)
histfit(data,nbins)
Description
Example
r = normrnd(10,1,100,1);
histfit(r)
[Figure: histogram of r with a superimposed fitted normal density]
See Also
hist, normfit
hougen
Purpose
Syntax
yhat = hougen(beta,X)
Description
yhat = hougen(beta,x) gives the predicted values of the reaction rate, yhat,
as a function of the vector of parameters, beta, and the matrix of data, X. beta
must have 5 elements and X must have three columns.
hougen is a utility function for rsmdemo.
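hougen evaluates the Hougen-Watson rate equation; a minimal sketch of a direct call (the parameter and pressure values below are made-up illustrations):

beta = [1.25 0.06 0.04 0.11 1.19];    % five model parameters (assumed values)
x = [470 300 10];                     % one observation: three partial pressures
yhat = hougen(beta,x)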
Reference
Bates, D., and D. Watts, Nonlinear Regression Analysis and Its Applications, Wiley, 1988, pp. 271-272.
See Also
rsmdemo
hygecdf
Purpose
Syntax
P = hygecdf(X,M,K,N)
Description
except that scalar arguments function as constant matrices of the common size
of the other arguments.
The hypergeometric cdf is:
K M K
i N i
-----------------------------p = F ( x M, K, N ) =
M
N
i=0
x
Examples
Suppose you have a lot of 100 floppy disks and you know that 20 of them are
defective. What is the probability of drawing zero to two defective floppies if
you select 10 at random?
p = hygecdf(2,100,20,10)
p =
0.6812
hygeinv
Purpose
Syntax
X = hygeinv(P,M,K,N)
Description
Examples
hygepdf
Purpose
Syntax
Y = hygepdf(X,M,K,N)
Description
except that scalar arguments function as constant matrices of the common size
of the other arguments.
The parameters M, K, and N must be positive integers. Also X must be less than
or equal to all the parameters and N must be less than or equal to M.
The hypergeometric pdf is:
K M K
x N x
y = f ( x M, K, N ) = ----------------------------- M
N
The result, y, is the probability of drawing exactly x items of a possible K in n
drawings without replacement from group of M objects.
Examples
Suppose you have a lot of 100 floppy disks and you know that 20 of them are
defective. What is the probability of drawing 0 through 5 defective floppy disks
if you select 10 at random?
p = hygepdf(0:5,100,20,10)
p =
    0.0951    0.2679    0.3182    0.2092    0.0841    0.0215
hygernd
Purpose
Syntax
Description
Examples
numbers = hygernd(1000,40,50)
numbers =
1
hygestat
Purpose
Syntax
[MN,V] = hygestat(M,K,N)
Description
Examples
[m,v] = hygestat(10.^(1:4),10.^(0:3),9)
m =
    0.9000    0.9000    0.9000    0.9000
v =
    0.0900    0.7445    0.8035    0.8094
[m,v] = binostat(9,0.1)
m =
    0.9000
v =
    0.8100
icdf
Purpose
Syntax
X = icdf('name',P,A1,A2,A3)
Description
icdf is a utility routine allowing you to access all the inverse cdfs in the
Statistics Toolbox using the name of the distribution as a parameter.
icdf('name',P,A1,A2,A3) returns a matrix of critical values, X. 'name' is a
string containing the name of the distribution. P is a matrix of probabilities, and A1, A2, and A3 are matrices of distribution parameters. Depending on the distribution, some of the parameters may not be necessary.
The arguments P, A1, A2, and A3 must all be the same size except that scalar
arguments function as constant matrices of the common size of the other
arguments.
Examples
x = icdf('Normal',0.1:0.2:0.9,0,1)
x =
   -1.2816   -0.5244         0    0.5244    1.2816
x = icdf('Poisson',0.1:0.2:0.9,1:5)
x =
     0     1     3     5     8
inconsistent
Purpose
Syntax
Y = inconsistent(Z)
Y = inconsistent(Z,d)
Description
Y = inconsistent(Z) computes the inconsistency coefficient for each link of the hierarchical cluster tree, Z. Y = inconsistent(Z,d) uses links up to d levels below each link in the computation.
Example
rand('seed',12);
X = rand(10,2);
Y = pdist(X);
Z = linkage(Y,'centroid');
W = inconsistent(Z,3)
W =
    0.0423         0    1.0000         0
    0.1406         0    1.0000         0
    0.1163    0.1047    2.0000    0.7071
    0.2101         0    1.0000         0
    0.2054    0.0886    3.0000    0.6792
    0.1742    0.1762    3.0000    0.6568
    0.2336    0.1317    4.0000    0.6408
    0.3081    0.2109    5.0000    0.7989
    0.4610    0.3728    4.0000    0.8004
See Also
iqr
Purpose
Syntax
y = iqr(X)
Description
iqr(X) computes the difference between the 75th and the 25th percentiles of
the sample in X. The IQR is a robust estimate of the spread of the data, since
changes in the upper and lower 25% of the data do not affect it.
If there are outliers in the data, then the IQR is more representative than the
standard deviation as an estimate of the spread of the body of the data. The
IQR is less efficient than the standard deviation as an estimate of the spread,
when the data is all from the normal distribution.
Multiply the IQR by 0.7413 to estimate σ (the second parameter of the normal distribution).
Examples
This Monte Carlo simulation shows the relative efficiency of the IQR to the
sample standard deviation for normal data.
x = normrnd(0,1,100,100);
s = std(x);
s_IQR = 0.7413*iqr(x);
efficiency = (norm(s - 1)./norm(s_IQR - 1)).^2
efficiency =
0.3297
See Also
kurtosis
Purpose
Sample kurtosis.
Syntax
k = kurtosis(X)
Description
k = kurtosis(X) returns the sample kurtosis of X. For vectors, kurtosis(x) is the kurtosis of the elements of x. For matrices, kurtosis(X) is a row vector containing the sample kurtosis of each column. Kurtosis is:

$$k = \frac{E(x-\mu)^4}{\sigma^4}$$
Example
X = randn([5 4])
X =
    1.1650    1.6961   -1.4462   -0.3600
    0.6268    0.0591   -0.7012   -0.1356
    0.0751    1.7971    1.2460    1.3493
    0.3516    0.2641    0.6390   -1.2704
   -0.6965    0.8717    0.5774    0.9846
k = kurtosis(X)
k =
    2.1658    1.6378    1.9589    1.2967
See Also
leverage
Purpose
Syntax
h = leverage(DATA)
h = leverage(DATA,'model')
Description
h = leverage(DATA) finds the leverage of each row (point) in the matrix, DATA, for a linear additive regression model.
h = leverage(DATA,'model') finds the leverage on a regression, using a specified model type.
Example
One rule of thumb is to compare the leverage to 2p/n where n is the number of
observations and p is the number of parameters in the model. For the Hald
dataset this value is 0.7692.
load hald
h = max(leverage(ingredients,'linear'))
h =
0.7004
Since 0.7004 < 0.7692, there are no high leverage points using this rule.
Algorithm
[Q,R] = qr(x2fx(DATA,'model'));
leverage = (sum(Q'.*Q'))'
Reference
See Also
regstats
linkage
Purpose
Syntax
Z = linkage(Y)
Z = linkage(Y,method)
Description
Z = linkage(Y) creates a hierarchical cluster tree, using the single linkage algorithm, from the distance output, Y, of the pdist function.
Z = linkage(Y,method) computes the cluster tree using the specified algorithm. The method argument can be any of these character strings:

'single'       Shortest distance (default)
'complete'     Largest distance
'average'      Average distance
'centroid'     Centroid distance
'ward'         Incremental sum of squares
For example, consider a case with 30 initial nodes. If the tenth cluster formed
by the linkage function combines object 5 and object 7 and their distance is 1.5,
then row 10 of Z will contain the values (5,7,1.5). This newly formed cluster will
have the index 10+30=40. If cluster 40 shows up in a later row, that means this
newly formed cluster is being combined again into some bigger cluster.
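For instance, a minimal sketch (Y is assumed to be a pdist output over 30 objects, as in this example):

Z = linkage(Y);
Z(10,:)    % a row such as [5 7 1.5] records that clusters 5 and 7
           % merged at distance 1.5, forming cluster 10+30 = 40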
Mathematical Definitions. The method argument is a character string that
specifies the algorithm used to generate the hierarchical cluster tree
information. These linkage algorithms are based on various measurements of
proximity between two groups of objects. If nr is the number of objects in cluster
r and ns is the number of objects in cluster s, and xri is the ith object in cluster
r, the definitions of these various measurements are as follows:
Single linkage, also called nearest neighbor, uses the smallest distance
between objects in the two groups.
$$d(r,s) = \min\big(\mathrm{dist}(x_{ri}, x_{sj})\big), \quad i \in (1,\ldots,n_r),\; j \in (1,\ldots,n_s)$$
Complete linkage, also called furthest neighbor, uses the largest distance
between objects in the two groups.
$$d(r,s) = \max\big(\mathrm{dist}(x_{ri}, x_{sj})\big), \quad i \in (1,\ldots,n_r),\; j \in (1,\ldots,n_s)$$
Average linkage uses the average distance between all pairs of objects in cluster r and cluster s:

$$d(r,s) = \frac{1}{n_r n_s} \sum_{i=1}^{n_r} \sum_{j=1}^{n_s} \mathrm{dist}(x_{ri}, x_{sj})$$
Centroid linkage uses the distance between the centroids of the two groups:

$$d(r,s) = d(\bar{x}_r, \bar{x}_s)$$

where

$$\bar{x}_r = \frac{1}{n_r} \sum_{i=1}^{n_r} x_{ri}$$
Ward linkage uses the incremental sum of squares, that is, the increase in the total within-group sum of squares that results from joining clusters r and s:

$$d(r,s) = \frac{n_r n_s\, d_{rs}^2}{n_r + n_s}$$

where $d_{rs}$ is the distance between the centroids of the two clusters.
Example
The last two columns of the cluster tree, Z, returned by linkage list the merged cluster index and the distance at each step:
     5.0000    0.2000
     4.0000    0.5000
     6.0000    0.5099
     7.0000    0.7000
     9.0000    1.2806
    10.0000    1.3454
See Also
cluster, cophenet, dendrogram, inconsistent, pdist, squareform
logncdf
Purpose
Syntax
P = logncdf(X,MU,SIGMA)
Description
logncdf(X,MU,SIGMA) computes the lognormal cdf with parameters MU and SIGMA at the values in X.
The size of P is the common size of X, MU, and SIGMA. A scalar input functions as a constant matrix of the same size as the other inputs.
The lognormal cdf is:

$$p = F(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \int_0^x \frac{e^{\frac{-(\ln(t)-\mu)^2}{2\sigma^2}}}{t}\,dt$$

Example
x = (0:0.2:10);
y = logncdf(x,0,1);
plot(x,y);grid;xlabel('x');ylabel('p')
[Figure: lognormal cdf, p vs. x]
Reference
See Also
logninv
Purpose
Syntax
X = logninv(P,MU,SIGMA)
Description
logninv(P,MU,SIGMA) computes the inverse lognormal cdf with parameters MU and SIGMA for the probabilities in P.
We define the lognormal inverse function in terms of the lognormal cdf:

$$x = F^{-1}(p \mid \mu, \sigma) = \{\, x : F(x \mid \mu, \sigma) = p \,\}$$

where

$$p = F(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \int_0^x \frac{e^{\frac{-(\ln(t)-\mu)^2}{2\sigma^2}}}{t}\,dt$$

Example
p = (0.005:0.01:0.995);
crit = logninv(p,1,0.5);
plot(p,crit)
xlabel('Probability');ylabel('Critical Value');grid
[Figure: critical values of the lognormal distribution, Critical Value vs. Probability]
Reference
See Also
lognpdf
Purpose
Syntax
Y = lognpdf(X,MU,SIGMA)
Description
lognpdf(X,MU,SIGMA) computes the lognormal pdf with parameters MU and SIGMA at the values in X.
The size of Y is the common size of X, MU, and SIGMA. A scalar input functions as a constant matrix of the same size as the other inputs.
The lognormal pdf is:

$$y = f(x \mid \mu, \sigma) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{\frac{-(\ln(x)-\mu)^2}{2\sigma^2}}$$
Example
x = (0:0.02:10);
y = lognpdf(x,0,1);
plot(x,y);grid;xlabel('x');ylabel('p')
[Figure: lognormal pdf, y vs. x]
Reference
Mood, A. M., F.A. Graybill, and D.C. Boes, Introduction to the Theory of Statistics, Third Edition, McGraw Hill, 1974, pp. 540-541.
See Also
lognrnd
Purpose
Syntax
Description
Example
r = lognrnd(0,1,4,3)
r =
    3.2058    0.4983    1.3022
    1.8717    5.4529    2.3909
    1.0780    1.0608    0.2355
    1.4213    6.0320    0.4960
Reference
See Also
lognstat
Purpose
Syntax
[M,V] = lognstat(MU,SIGMA)
Description
[M,V] = lognstat(MU,SIGMA) returns the mean and variance of the lognormal distribution with parameters MU and SIGMA. The mean is $e^{\mu+\sigma^2/2}$ and the variance is $e^{2\mu+2\sigma^2} - e^{2\mu+\sigma^2}$.
Example
[m,v]= lognstat(0,1)
m =
1.6487
v =
7.0212
Reference
Mood, A. M., F.A. Graybill, and D.C. Boes, Introduction to the Theory of Statistics, Third Edition, McGraw Hill, 1974, pp. 540-541.
See Also
lsline
Purpose
Syntax
Description
lsline superimposes the least squares line on each line object in the current axes (except LineStyles '-', '--', and '.-').
h = lsline returns the handles to the line objects.
Example
See Also
polyfit, polyval
mad
Purpose
Syntax
y = mad(X)
Description
mad(X) computes the average of the absolute differences between a set of data
and the sample mean of that data. For vectors, mad(x) returns the mean
absolute deviation of the elements of x. For matrices, mad(X) returns the MAD
of each column of X.
The MAD is less efficient than the standard deviation as an estimate of the
spread, when the data is all from the normal distribution.
Multiply the MAD by 1.3 to estimate σ (the second parameter of the normal distribution).
Examples
This example shows a Monte Carlo simulation of the relative efficiency of the
MAD to the sample standard deviation for normal data.
x = normrnd(0,1,100,100);
s = std(x);
s_MAD = 1.3*mad(x);
efficiency = (norm(s - 1)./norm(s_MAD - 1)).^2
efficiency =
0.5972
See Also
std, range
mahal
Purpose
Mahalanobis distance.
Syntax
d = mahal(Y,X)
Description
d = mahal(Y,X) computes the Mahalanobis distance of each row of Y from the sample of observations in X.
The number of columns of Y must equal the number of columns in X, but the number of rows may differ. The number of rows in X must exceed the number of columns.
The Mahalanobis distance is a multivariate measure of the separation of a data
set from a point in space. It is the criterion minimized in linear discriminant
analysis.
Example
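A minimal sketch; the reference sample and test points below are assumptions:

X = mvnrnd([0 0],eye(2),100);    % reference sample
Y = [1 1; 10 10];                % points to measure
d = mahal(Y,X)                   % the second point is far from the sample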
See Also
classify
mean
Purpose
Syntax
m = mean(X)
Description
mean calculates the sample average:

$$\bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij}$$
For vectors, mean(x) is the mean value of the elements in vector x. For
matrices, mean(X) is a row vector containing the mean value of each column.
Example
These commands generate five samples of 100 normal random numbers with mean, zero, and standard deviation, one. The sample averages in xbar are much less variable (0.00 ± 0.10).
x = normrnd(0,1,100,5);
xbar = mean(x)
xbar =
    0.0727    0.0264    0.0351    0.0424    0.0752
See Also
median
Purpose
Syntax
m = median(X)
Description
For vectors, median(x) is the median value of the elements in vector x. For
matrices, median(X) is a row vector containing the median value of each
column. Since median is implemented using sort, it can be costly for large
matrices.
Examples
xodd = 1:5;
modd = median(xodd)
modd =
3
xeven = 1:4;
meven = median(xeven)
meven =
2.5000
See Also
mle
Purpose
Syntax
phat = mle('dist',data)
[phat,pci] = mle('dist',data)
[phat,pci] = mle('dist',data,alpha)
[phat,pci] = mle('dist',data,alpha,p1)
Description
phat = mle('dist',data) returns the maximum likelihood estimates (MLEs) for the distribution specified in 'dist' using the sample in the vector, data.
[phat,pci] = mle('dist',data) returns the MLEs and 95% confidence intervals.
[phat,pci] = mle('dist',data,alpha) returns the MLEs and 100(1-alpha) percent confidence intervals given the data and the specified alpha.
[phat,pci] = mle('dist',data,alpha,p1) is used for the binomial distribution only, where p1 is the number of trials.
Example
rv = binornd(20,0.75)
rv =
16
[p,pci] = mle('binomial',rv,0.05,20)
p =
0.8000
pci =
0.5634
0.9427
See Also
moment
Purpose
Syntax
m = moment(X,order)
Description
m = moment(X,order) returns the central moment of X of the specified order. Note that the central first moment is zero, and the second central moment is the variance computed using a divisor of n rather than n-1, where n is the length of the vector, x, or the number of rows in the matrix, X.
The central moment of order k of a distribution is defined as:

$$m_k = E(x-\mu)^k$$
Example
X = randn([6 5])
X =
    1.1650    0.0591    1.2460    1.2704    0.0562
    0.6268    1.7971    0.6390    0.9846    0.5135
    0.0751    0.2641    0.5774    0.0449    0.3967
    0.3516    0.8717    0.3600    0.7989    0.7562
    0.6965    1.4462    0.1356    0.7652    0.4005
    1.6961    0.7012    1.3493    0.8617    1.3414
m = moment(X,3)
m =
    0.0282    0.0571    0.1253    0.1460    0.4486
See Also
mvnrnd
Purpose
Syntax
r = mvnrnd(mu,SIGMA,cases)
Description
r = mvnrnd(mu,SIGMA,cases) returns a matrix, r, of multivariate normal random numbers with mean vector, mu, and covariance matrix, SIGMA. The number of rows of r is cases; mu is a 1-by-d vector and SIGMA is a d-by-d symmetric positive definite matrix.
Example
mu = [2 3];
sigma = [1 1.5; 1.5 3];
r = mvnrnd(mu,sigma,100);
plot(r(:,1),r(:,2),'+')
[Figure: scatter plot of the 100 bivariate normal random pairs]
See Also
normrnd
nanmax
Purpose
Syntax
Description
m = nanmax(a) returns the maximum with NaNs treated as missing. For vectors, nanmax(a) is the largest non-NaN element in a. For matrices, nanmax(A) is a row vector containing the maximum non-NaN element from each column.
[m,ndx] = nanmax(a) also returns the indices of the maximum values in vector ndx.
m = nanmax(a,b) returns the larger of a or b, which must match in size.
Example
m = magic(3);
m([1 6 8]) = [NaN NaN NaN]
m =
   NaN     1     6
     3     5   NaN
     4   NaN     2
[nmax,maxidx] = nanmax(m)
nmax =
     4     5     6
maxidx =
     3     2     1
See Also
nanmean
Purpose
Syntax
y = nanmean(X)
Description
For vectors, nanmean(x) is the mean of the non-NaN elements of x. For matrices,
nanmean(X) is a row vector containing the mean of the non-NaN elements in
each column.
Example
m = magic(3);
m([1 6 8]) = [NaN NaN NaN]
m =
   NaN     1     6
     3     5   NaN
     4   NaN     2
nmean = nanmean(m)
nmean =
    3.5000    3.0000    4.0000
See Also
nanmedian
Purpose
Syntax
y = nanmedian(X)
Description
For vectors, nanmedian(x) is the median of the non-NaN elements of x. For matrices, nanmedian(X) is a row vector containing the median of the non-NaN elements in each column.
Example
m = magic(4);
m([1 6 9 11]) = [NaN NaN NaN NaN]
m =
   NaN     2   NaN    13
     5   NaN    10     8
     9     7   NaN    12
     4    14    15     1
nmedian = nanmedian(m)
nmedian =
    5.0000    7.0000   12.5000   10.0000
See Also
nanmin
Purpose
Syntax
Description
m = nanmin(a) returns the minimum with NaNs treated as missing. For vectors,
nanmin(a) is the smallest non-NaN element in a. For matrices, nanmin(A) is a
row vector containing the minimum non-NaN element from each column.
[m,ndx] = nanmin(a) also returns the indices of the minimum values in vector
ndx.
m = nanmin(a,b) returns the smaller of a or b, which must match in size.
Example
m = magic(3);
m([1 6 8]) = [NaN NaN NaN]
m =
   NaN     1     6
     3     5   NaN
     4   NaN     2
[nmin,minidx] = nanmin(m)
nmin =
     3     1     2
minidx =
     2     1     3
See Also
nanstd
Purpose
Syntax
y = nanstd(X)
Description
For vectors, nanstd(x) is the standard deviation of the non-NaN elements of x. For matrices, nanstd(X) is a row vector containing the standard deviations of the non-NaN elements in each column.
Example
m = magic(3);
m([1 6 8]) = [NaN NaN NaN]
m =
   NaN     1     6
     3     5   NaN
     4   NaN     2
nstd = nanstd(m)
nstd =
    0.7071    2.8284    2.8284
See Also
nansum
Purpose
Syntax
y = nansum(X)
Description
For vectors, nansum(x) is the sum of the non-NaN elements of x. For matrices,
nansum(X) is a row vector containing the sum of the non-NaN elements in each
column of X.
Example
m = magic(3);
m([1 6 8]) = [NaN NaN NaN]
m =
   NaN     1     6
     3     5   NaN
     4   NaN     2
nsum = nansum(m)
nsum =
     7     6     8
See Also
nbincdf
Purpose
Syntax
Y = nbincdf(X,R,P)
Description
The size of Y is the common size of the input arguments. A scalar input
functions as a constant matrix of the same size as the other inputs.
The negative binomial cdf is:
x
y = F ( x r, p ) =
r + i 1 r i
p q I ( 0, 1, ) ( i )
i
i=0
The motivation for the negative binomial is performing successive trials each
having a constant probability, P, of success. What you want to find out is how
many extra trials you must do to observe a given number, R, of successes.
Example
x = (0:15);
p = nbincdf(x,3,0.5);
stairs(x,p)
[Figure: stairstep plot of the negative binomial cdf]
See Also
nbininv
Purpose
Syntax
X = nbininv(Y,R,P)
Description
The size of X is the common size of the input arguments. A scalar input
functions as a constant matrix of the same size as the other inputs.
The negative binomial models consecutive trials each having a constant
probability, P, of success. The parameter, R, is the number of successes required
before stopping.
Example
How many times would you need to flip a fair coin to have a 99% probability of
having observed 10 heads?
flips = nbininv(0.99,10,0.5) + 10
flips =
33
Note that you have to flip at least 10 times to get 10 heads. That is why the
second term on the right side of the equals sign is a 10.
See Also
nbinpdf
Purpose
Syntax
Y = nbinpdf(X,R,P)
Description
nbinpdf(X,R,P) computes the negative binomial pdf with parameters R and P at the values in X. The size of Y is the common size of the input arguments. A scalar input functions as a constant matrix of the same size as the other inputs.
Example
x = (0:10);
y = nbinpdf(x,3,0.5);
plot(x,y,'+')
set(gca,'Xlim',[0.5,10.5])
[Figure: plot of the negative binomial pdf]
See Also
nbinrnd
Purpose
Syntax
Description
Example
Suppose you want to simulate a process that has a defect probability of 0.01.
How many units might Quality Assurance inspect before finding three
defective items?
r = nbinrnd(3,0.01,1,6) + 3
r =
   496   142   420   396   851   178
See Also
nbinstat
Purpose
Syntax
[M,V] = nbinstat(R,P)
Description
Example
p = 0.1:0.2:0.9;
r = 1:5;
[R,P] = meshgrid(r,p);
[M,V] = nbinstat(R,P)
M =
    9.0000   18.0000   27.0000   36.0000   45.0000
    2.3333    4.6667    7.0000    9.3333   11.6667
    1.0000    2.0000    3.0000    4.0000    5.0000
    0.4286    0.8571    1.2857    1.7143    2.1429
    0.1111    0.2222    0.3333    0.4444    0.5556
V =
   90.0000  180.0000  270.0000  360.0000  450.0000
    7.7778   15.5556   23.3333   31.1111   38.8889
    2.0000    4.0000    6.0000    8.0000   10.0000
    0.6122    1.2245    1.8367    2.4490    3.0612
    0.1235    0.2469    0.3704    0.4938    0.6173
See Also
ncfcdf
Purpose
Syntax
P = ncfcdf(X,NU1,NU2,DELTA)
Description
The size of P is the common size of the input arguments. A scalar input
functions as a constant matrix of the same size as the other inputs.
The noncentral F cdf is:
j
1
---
2
2 --2- 1 x 1
- e ------------------------- ------ + j, ------
F ( x 1, 2, ) =
-----------2
j!
2 + 1 x 2
j = 0
Example
Compare the noncentral F cdf with δ = 10 to the F cdf with the same number of numerator and denominator degrees of freedom (5 and 20 respectively).
x = (0.01:0.1:10.01)';
p1 = ncfcdf(x,5,20,10);
p = fcdf(x,5,20);
plot(x,p,' ',x,p1,'')
[Figure: F cdf (solid) and noncentral F cdf (dashed)]
References
ncfinv
Purpose
Syntax
X = ncfinv(P,NU1,NU2,DELTA)
Description
The size of X is the common size of the input arguments. A scalar input
functions as a constant matrix of the same size as the other inputs.
Example
One hypothesis test for comparing two sample variances is to take their ratio and compare it to an F distribution. If the numerator and denominator degrees of freedom are 5 and 20 respectively, then you reject the hypothesis that the first variance is equal to the second variance if their ratio exceeds the critical value below:
critical = finv(0.95,5,20)
critical =
2.7109
Suppose the truth is that the first variance is twice as big as the second
variance. How likely is it that you would detect this difference?
prob = 1 - ncfcdf(critical,5,20,2)
prob =
0.1297
References
See Also
ncfpdf
Purpose
Syntax
Y = ncfpdf(X,NU1,NU2,DELTA)
Description
The size of Y is the common size of the input arguments. A scalar input
functions as a constant matrix of the same size as the other inputs.
The F distribution is a special case of the noncentral F where δ = 0. As δ increases, the distribution flattens, like the plot in the example.
Example
Compare the noncentral F pdf with δ = 10 to the F pdf with the same number of numerator and denominator degrees of freedom (5 and 20 respectively).
x = (0.01:0.1:10.01)';
p1 = ncfpdf(x,5,20,10);
p = fpdf(x,5,20);
plot(x,p,' ',x,p1,'')
[Figure: F pdf (solid) and noncentral F pdf (dashed)]
References
See Also
ncfrnd
Purpose
Syntax
Description
Example
    0.8824    0.8220    1.4485    1.4415    1.4864    1.0967
    0.9681    2.0096    0.6598
r1 = frnd(10,100,1,6)
r1 =
    0.9826    0.5911
References
See Also
ncfstat
Purpose
Syntax
[M,V] = ncfstat(NU1,NU2,DELTA)
Description
The mean is:

$$\frac{\nu_2(\nu_1 + \delta)}{\nu_1(\nu_2 - 2)}$$

where ν₂ > 2.
The variance is:

$$2\left(\frac{\nu_2}{\nu_1}\right)^2 \frac{(\nu_1+\delta)^2 + (2\delta + \nu_1)(\nu_2-2)}{(\nu_2-2)^2(\nu_2-4)}$$

where ν₂ > 4.
Example
[m,v]= ncfstat(10,100,4)
m =
1.4286
v =
3.9200
References
See Also
nctcdf
Purpose
Syntax
P = nctcdf(X,NU,DELTA)
Description
The size of P is the common size of the input arguments. A scalar input
functions as a constant matrix of the same size as the other inputs.
Example
Compare the noncentral T cdf with DELTA = 1 to the T cdf with the same number
of degrees of freedom (10).
x = (5:0.1:5)';
p1 = nctcdf(x,10,1);
p = tcdf(x,10);
plot(x,p,' ',x,p1,'')
[Figure: t cdf (solid) and noncentral t cdf (dashed)]
References
See Also
nctinv
Purpose
Syntax
X = nctinv(P,NU,DELTA)
Description
The size of X is the common size of the input arguments. A scalar input
functions as a constant matrix of the same size as the other inputs.
Example
x = nctinv([.1 .2],10,1)
x =
   -0.2914    0.1618
References
See Also
nctpdf
Purpose
Syntax
Y = nctpdf(X,V,DELTA)
Description
The size of Y is the common size of the input arguments. A scalar input
functions as a constant matrix of the same size as the other inputs.
Example
Compare the noncentral T pdf with DELTA = 1 to the T pdf with the same
number of degrees of freedom (10).
x = (5:0.1:5)';
p1 = nctpdf(x,10,1);
p = tpdf(x,10);
plot(x,p,' ',x,p1,'')
[Figure: t pdf (solid) and noncentral t pdf (dashed)]
References
See Also
nctrnd
Purpose
Syntax
Description
Example
nctrnd(10,1,5,1)
ans =
1.6576
1.0617
1.4491
0.2930
3.6297
References
See Also
nctstat
Purpose
Syntax
[M,V] = nctstat(NU,DELTA)
Description
( 2 )
( ( 1) 2)
The mean is ------------------------------------------------------------( 2)
where > 1.
2 2 (( 1) 2)
The variance is ----------------- ( 1 + ) --- --------------------------------( 2)
( 2)
2
where > 2.
Example
[m,v] = nctstat(10,1)
m =
1.0837
v =
1.3255
References
See Also
ncx2cdf
Purpose
Syntax
P = ncx2cdf(X,V,DELTA)
Description
The size of P is the common size of the input arguments. A scalar input
functions as a constant matrix of the same size as the other inputs.
Some texts refer to this distribution as the generalized Rayleigh,
Rayleigh-Rice, or Rice distribution.
The noncentral chi-square cdf is:
j
1
--- ---
2
2
2
- e Pr [
F ( x , ) =
x]
----------- + 2j
j!
j = 0
Example
x = (0:0.1:10)';
p1 = ncx2cdf(x,4,2);
p = chi2cdf(x,4);
plot(x,p,' ',x,p1,'')
[Figure: chi-square cdf (solid) and noncentral chi-square cdf (dashed)]
References
See Also
ncx2inv
Purpose
Syntax
X = ncx2inv(P,V,DELTA)
Description
The size of X is the common size of the input arguments. A scalar input
functions as a constant matrix of the same size as the other inputs.
Algorithm
ncx2inv uses an iterative approach (Newton's method) to converge to the solution.
Example
    1.1498    1.7066
References
ncx2pdf
Purpose
Syntax
Y = ncx2pdf(X,V,DELTA)
Description
The size of Y is the common size of the input arguments. A scalar input
functions as a constant matrix of the same size as the other inputs.
Some texts refer to this distribution as the generalized Rayleigh,
Rayleigh-Rice, or Rice distribution.
Example
References
See Also
ncx2rnd
Purpose
Syntax
Description
Example
ncx2rnd(4,2,6,3)
ans =
    6.8552    5.9650   11.2961
    5.2631    4.2640    5.9495
    9.1939    6.7162    3.8315
   10.3100    4.4828    7.1653
    2.1142    1.9826    4.6400
    3.8852    5.3999    0.9282
References
See Also
ncx2stat
Purpose
Syntax
[M,V] = ncx2stat(NU,DELTA)
Description
[M,V] = ncx2stat(NU,DELTA) returns the mean and variance of the noncentral chi-square distribution with NU degrees of freedom and noncentrality parameter DELTA. The mean is ν + δ and the variance is 2(ν + 2δ).
Example
[m,v] = ncx2stat(4,2)
m =
6
v =
16
References
See Also
nlinfit
Purpose
Syntax
Description
Example
load reaction
betafit = nlinfit(reactants,rate,'hougen',beta)
betafit =
1.2526
0.0628
0.0400
0.1124
1.1914
See Also
nlintool
nlintool
Purpose
Syntax
Description
nlintool(x,y,'model',beta0) is a graphical user interface to the nlinfit function, displaying an interactive prediction plot of the fitted response with confidence bounds.
The default value for alpha is 0.05, which produces 95% confidence intervals.
nlintool(x,y,'model',beta0,alpha,'xname','yname') labels the plot using
the string matrix, 'xname' for the X variables and the string 'yname' for the Y
variable.
You can drag the dotted white reference line and watch the predicted values
update simultaneously. Alternatively, you can get a specific prediction by
typing the value for X into an editable text field. Use the pop-up menu labeled
Export to move specified variables to the base workspace.
Example
See Also
nlinfit, rstool
nlparci
Purpose
Syntax
ci = nlparci(beta,r,J)
Description
ci = nlparci(beta,r,J) returns 95% confidence intervals, ci, on the nonlinear least squares parameter estimates, beta, given the residuals, r, and the Jacobian matrix, J, at the solution.
Example
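A minimal sketch of the usual pattern, continuing the reaction-rate example from nlinfit (the three-output form of nlinfit is assumed):

load reaction
[betafit,resid,J] = nlinfit(reactants,rate,'hougen',beta);
ci = nlparci(betafit,resid,J)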
ci =
    3.3445
    0.1689
    0.1145
    0.2941
    3.7321
See Also
nlpredci
Purpose
Syntax
Description
ypred = nlpredci('model',inputs,beta,r,J) returns the predicted responses, ypred, given the fitted parameters, beta, the residuals, r, and the Jacobian matrix, J.
Example
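A minimal sketch, reusing the assumed reaction-rate variables from the nlparci example:

[betafit,resid,J] = nlinfit(reactants,rate,'hougen',beta);
yhat = nlpredci('hougen',reactants,betafit,resid,J)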
See Also
normcdf
Purpose
Syntax
P = normcdf(X,MU,SIGMA)
Description
normcdf(X,MU,SIGMA) computes the normal cdf with parameters MU and SIGMA at the values in X. The arguments must all be the same size except that scalar arguments function as constant matrices of the common size of the other arguments.
The parameter SIGMA must be positive.
The normal cdf is:

$$p = F(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{\frac{-(t-\mu)^2}{2\sigma^2}}\,dt$$
Examples
What is the probability that an observation from a standard normal distribution will fall on the interval [-1 1]?
p = normcdf([-1 1]);
p(2) - p(1)
ans =
    0.6827
More generally, about 68% of the observations from a normal distribution fall within one standard deviation, σ, of the mean, µ.
normfit
Purpose
Syntax
[muhat,sigmahat,muci,sigmaci] = normfit(X)
[muhat,sigmahat,muci,sigmaci] = normfit(X,alpha)
Description
[muhat,sigmahat,muci,sigmaci] = normfit(X) returns estimates of the normal distribution parameters given the data in X, along with 95% confidence intervals. The top row of each interval is the lower bound of the confidence interval and the bottom row is the upper bound.
[muhat,sigmahat,muci,sigmaci] = normfit(X,alpha) gives estimates and
100(1alpha) percent confidence intervals. For example, alpha = 0.01 gives
99% confidence intervals.
Example
In this example the data is a two-column random normal matrix. Both columns have µ = 10 and σ = 2. Note that the confidence intervals below contain the true values.
r = normrnd(10,2,100,2);
[mu,sigma,muci,sigmaci] = normfit(r)
mu =
   10.1455   10.0527
sigma =
    1.9072    2.1256
muci =
    9.7652    9.6288
   10.5258   10.4766
sigmaci =
    1.6745    1.8663
    2.2155    2.4693
See Also
norminv
Purpose
Syntax
X = norminv(P,MU,SIGMA)
Description
the same size except that scalar arguments function as constant matrices of the
common size of the other arguments.
The parameter SIGMA must be positive and P must lie on [0 1].
We define the normal inverse function in terms of the normal cdf.
1
x = F ( p , ) = { x:F ( x , ) = p }
1
where p = F ( x , ) = -------------- 2
( t )
--------------------2
2
2
dt
The result, x, is the solution of the integral equation above with the parameters
and where you supply the desired probability, p.
Examples
Find an interval that contains 95% of the values from a standard normal
distribution.
x = norminv([0.025 0.975],0,1)
x =
   -1.9600    1.9600
Note the interval x is not the only such interval, but it is the shortest.
xl = norminv([0.01 0.96],0,1)
xl =
   -2.3263    1.7507
The interval xl also contains 95% of the probability, but it is longer than x.
normpdf
Purpose
Syntax
Y = normpdf(X,MU,SIGMA)
Description
normpdf(X,MU,SIGMA) computes the normal pdf with parameters MU and SIGMA at the values in X. The arguments must all be the same size except that scalar arguments function as constant matrices of the common size of the other arguments.
The parameter SIGMA must be positive.
The normal pdf is:

$$y = f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{\frac{-(x-\mu)^2}{2\sigma^2}}$$
Examples
mu = [0:0.1:2];
[y i] = max(normpdf(1.5,mu,1));
MLE = mu(i)
MLE =
1.5000
normplot
Purpose
Syntax
normplot(X)
h = normplot(X)
Description
normplot(X) displays a normal probability plot of the data in X. For matrix X, normplot displays a line for each column.
The plot has the sample data displayed with the plot symbol '+'. Superimposed on the plot is a line joining the first and third quartiles of each column of X (a robust linear fit of the sample order statistics). This line is extrapolated out to the ends of the sample to help evaluate the linearity of the data.
If the data does come from a normal distribution, the plot will appear linear.
Other probability density functions will introduce curvature in the plot.
h = normplot(X) returns a handle to the plotted lines.
Examples
[Figure: normal probability plot of the sample, Probability vs. Data]
The plot is linear, indicating that you can model the sample by a normal
distribution.
normrnd
Purpose
Syntax
Description
Examples
n1 = normrnd(1:6,1./(1:6))
n1 =
    2.1650    2.3134    3.0250    4.0879    4.8607    6.2827
n2 = normrnd(0,1,[1 5])
n2 =
    0.0591    1.7971    0.2641    0.8717   -1.4462
n3 = normrnd([2 3;5 6],0.1,2,2)
n3 =
    1.9361    2.9640
    5.0577    5.9864
normspec
Purpose
Syntax
p = normspec(specs,mu,sigma)
Description
p = normspec(specs,mu,sigma) plots the normal density between the lower and upper limits in the two-element vector, specs, shading the region inside the specification limits, and returns the probability, p, of a sample falling between the limits.
Example
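A minimal sketch; the specification limits, mean, and standard deviation below are assumptions:

p = normspec([10 Inf],11.5,1.25)    % probability of exceeding the lower spec limit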
[Figure: shaded normal density, Density vs. Critical Value]
See Also
normstat
Purpose
Syntax
[M,V] = normstat(MU,SIGMA)
Description
[M,V] = normstat(MU,SIGMA) returns the mean and variance of the normal distribution with parameters MU and SIGMA. The mean is µ and the variance is σ².
Examples
n = 1:5;
[m,v] = normstat(n'*n,n'*n)
m =
     1     2     3     4     5
     2     4     6     8    10
     3     6     9    12    15
     4     8    12    16    20
     5    10    15    20    25
v =
     1     4     9    16    25
     4    16    36    64   100
     9    36    81   144   225
    16    64   144   256   400
    25   100   225   400   625
pareto
Purpose
Syntax
pareto(y)
pareto(y,names)
Description
pareto(y,names) displays a Pareto chart where the values in the vector y are
drawn as bars in descending order. Each bar is labeled with the associated
value in the string matrix names. pareto(y) labels each bar with the index of
the corresponding element in y.
Example
Create a Pareto chart from data measuring the number of manufactured parts
rejected for various types of defects.
defects = ['pits ';'cracks';'holes ';'dents '];
quantity = [5 3 19 25];
pareto(quantity,defects)
[Figure: Pareto chart with bars in descending order: dents, holes, pits, cracks]
See Also
pcacov
Purpose
Syntax
[pc,variances,explained] = pcacov(X)
Description
[pc,variances,explained] = pcacov(X) performs principal components analysis on the covariance matrix, X. It returns the principal component loadings in pc, the variances of each component in variances, and the percentage of the total variance explained by each component in explained.
Example
load hald
[pc,variances,explained] = pcacov(cov(ingredients))
pc =
    0.6460    0.5673    0.5062
    0.0200    0.5440    0.4933
    0.7553    0.4036    0.5156
    0.1085    0.4684    0.4844
variances =
  517.7969
   67.4964
   12.4054
    0.2372
explained =
   86.5974
   11.2882
    2.0747
    0.0397
References
Jackson, J. E., A User's Guide to Principal Components, John Wiley and Sons, Inc., 1991, pp. 1-25.
See Also
pcares
Purpose
Syntax
residuals = pcares(X,ndim)
Description
residuals = pcares(X,ndim) returns the residuals obtained by retaining ndim dimensions of the principal components analysis of X.
Example
This example shows the drop in the residuals from the first row of the Hald
data as the number of component dimensions increase from one to three.
load hald
r1 = pcares(ingredients,1);
r2 = pcares(ingredients,2);
r3 = pcares(ingredients,3);
r11 = r1(1,:)
r11 =
    2.0350    2.8304    6.8378    3.0879
r21 = r2(1,:)
r21 =
    2.4037    2.6930    1.6482    2.3425
r31 = r3(1,:)
r31 =
    0.2008    0.1957    0.2045    0.1921
Reference
Jackson, J. E., A User's Guide to Principal Components, John Wiley and Sons, Inc., 1991, pp. 1-25.
See Also
pdf
Purpose
Probability density function (pdf) for a specified distribution.
Syntax
Y = pdf('name',X,A1,A2,A3)
Description
pdf('name',X,A1,A2,A3) returns a matrix of densities, where 'name' is a string containing the name of the distribution, X is a matrix of values, and A1, A2, and A3 are matrices of distribution parameters. Depending on the distribution, some of the parameters may not be necessary.
The arguments X, A1, A2, and A3 must all be the same size except that scalar
arguments function as constant matrices of the common size of the other
arguments.
pdf is a utility routine allowing access to all the pdfs in the Statistics Toolbox
using the name of the distribution as a parameter.
Examples
p = pdf('Normal',-2:2,0,1)
p =
    0.0540    0.2420    0.3989    0.2420    0.0540
p = pdf('Poisson',0:4,1:5)
p =
    0.3679    0.2707    0.2240    0.1954    0.1755
pdist
Purpose
Pairwise distance between observations.
Syntax
Y = pdist(X)
Y = pdist(X,'metric')
Y = pdist(X,'minkowski',p)
Description
Y = pdist(X) computes the Euclidean distance between pairs of objects in the data matrix X. The optional 'metric' argument selects one of these distance measures:
'Euclid'       Euclidean distance (default)
'SEuclid'      Standardized Euclidean distance
'Mahal'        Mahalanobis distance
'CityBlock'    City Block metric
'Minkowski'    Minkowski metric
For an m-by-n matrix X, treated as m (1-by-n) row vectors x1, x2, ..., xm, the various distances between the vectors xr and xs are defined as follows:
Euclidean distance:
$$ d_{rs}^2 = (x_r - x_s)(x_r - x_s)' $$
Standardized Euclidean distance:
$$ d_{rs}^2 = (x_r - x_s) D^{-1} (x_r - x_s)' $$
where D is the diagonal matrix of squared standard deviations of the columns of X.
Mahalanobis distance:
$$ d_{rs}^2 = (x_r - x_s) V^{-1} (x_r - x_s)' $$
where V is the sample covariance matrix.
City Block metric:
$$ d_{rs} = \sum_{j=1}^{n} \left| x_{rj} - x_{sj} \right| $$
Minkowski metric:
$$ d_{rs} = \left( \sum_{j=1}^{n} \left| x_{rj} - x_{sj} \right|^p \right)^{1/p} $$
Notice that when p = 1, it is the City Block case, and when p = 2, it is the Euclidean case.
Examples
X = [1 2; 1 3; 2 2; 3 1]
X =
     1     2
     1     3
     2     2
     3     1
Y = pdist(X,'mahal')
Y =
    2.3452    2.0000    2.3452    1.2247    2.4495    1.2247
Y = pdist(X)
Y =
    1.0000    1.0000    2.2361    1.4142    2.8284    1.4142
squareform(Y)
ans =
         0    1.0000    1.0000    2.2361
    1.0000         0    1.4142    2.8284
    1.0000    1.4142         0    1.4142
    2.2361    2.8284    1.4142         0
See Also
perms
Purpose
All permutations.
Syntax
P = perms(v)
Description
P = perms(v), where v is a row vector of length n, creates a matrix whose rows consist of all possible permutations of the n elements of v. P has n! rows and n columns.
Example
perms([2 4 6])
ans =
     6     4     2
     4     6     2
     6     2     4
     2     6     4
     4     2     6
     2     4     6
poisscdf
Purpose
Poisson cumulative distribution function (cdf).
Syntax
P = poisscdf(X,LAMBDA)
Description
poisscdf(X,LAMBDA) computes the Poisson cdf at each of the values in X using the corresponding parameters in LAMBDA. The arguments X and LAMBDA must be the same size,
except that a scalar argument functions as a constant matrix of the same size
as the other argument. The parameter LAMBDA must be positive.
The Poisson cdf is:
$$ p = F(x \mid \lambda) = e^{-\lambda} \sum_{i=0}^{\lfloor x \rfloor} \frac{\lambda^i}{i!} $$
Examples
Suppose a manufacturing process is shut down if an inspector finds more than four flaws on a disk, and the average number of flaws (λ) has risen to four. The probability of finding four or fewer flaws is:
probability = poisscdf(4,4)
probability =
    0.6288
This means that this faulty manufacturing process continues to operate after
this first inspection almost 63% of the time.
poissfit
Purpose
Parameter estimates and confidence intervals for Poisson data.
Syntax
[lambdahat,lambdaci] = poissfit(X)
[lambdahat,lambdaci] = poissfit(X,alpha)
Description
poissfit(X) returns the maximum likelihood estimate (MLE) of the parameter of the Poisson distribution, λ, given the data X. The sample mean is the MLE of λ:
$$ \hat{\lambda} = \frac{1}{n} \sum_{i=1}^{n} x_i $$
Example
r = poissrnd(5,10,2);
[l,lci] = poissfit(r)
l =
    7.4000    6.3000
lci =
    5.8000    4.8000
    9.1000    7.9000
See Also
poissinv
Purpose
Inverse of the Poisson cumulative distribution function (cdf).
Syntax
X = poissinv(P,LAMBDA)
Description
poissinv(P,LAMBDA) returns the smallest value, X, such that the Poisson cdf
evaluated at X equals or exceeds P.
Examples
If the average number of defects (λ) is two, what is the 95th percentile of the
number of defects?
poissinv(0.95,2)
ans =
5
poisspdf
Purpose
Poisson probability density function (pdf).
Syntax
Y = poisspdf(X,LAMBDA)
Description
poisspdf(X,LAMBDA) computes the Poisson pdf at each of the values in X using the corresponding parameters in LAMBDA. The arguments X and LAMBDA must be the same size,
except that a scalar argument functions as a constant matrix of the same size
as the other argument.
The parameter λ must be positive.
The Poisson pdf is:
$$ y = f(x \mid \lambda) = \frac{\lambda^x}{x!}\, e^{-\lambda}\, I_{(0,1,\ldots)}(x) $$
x can be any non-negative integer. The density function is zero unless x is an
integer.
Examples
A computer hard disk manufacturer has observed that flaws occur randomly in
the manufacturing process at the average rate of two flaws in a 4 Gb hard disk
and has found this rate to be acceptable. What is the probability that a disk will
be manufactured with no defects?
In this problem, λ = 2 and x = 0.
p = poisspdf(0,2)
p =
0.1353
poissrnd
Purpose
Random numbers from the Poisson distribution.
Syntax
R = poissrnd(LAMBDA)
R = poissrnd(LAMBDA,m)
R = poissrnd(LAMBDA,m,n)
Description
R = poissrnd(LAMBDA) generates Poisson random numbers with mean LAMBDA. The size of R is the size of LAMBDA.
Examples
lambda = 2;
random_sample3 = poissrnd(lambda(ones(1,10)))
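A fuller sketch of the calling forms (the parameter and size values here are illustrative):
lambda = 2;
r1 = poissrnd(lambda,1,10);    % 1-by-10 row of Poisson(2) counts
r2 = poissrnd(lambda,[1 10]);  % equivalent size specification
r3 = poissrnd(5,2,3)           % 2-by-3 matrix of Poisson(5) counts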
poisstat
Purpose
Mean and variance for the Poisson distribution.
Syntax
M = poisstat(LAMBDA)
[M,V] = poisstat(LAMBDA)
Description
[M,V] = poisstat(LAMBDA) returns the mean and variance for the Poisson distribution. For the Poisson distribution, both the mean and the variance are equal to the parameter λ.
Examples
Find the mean and variance for the Poisson distribution with λ = [1 2; 3 4]:
[m,v] = poisstat([1 2; 3 4])
m =
     1     2
     3     4
v =
     1     2
     3     4
polyconf
Purpose
Polynomial evaluation and confidence interval estimation.
Syntax
[Y,DELTA] = polyconf(p,X,S)
[Y,DELTA] = polyconf(p,X,S,alpha)
Description
[Y,DELTA] = polyconf(p,X,S) uses the optional output S generated by polyfit to give 95% confidence intervals, Y ± DELTA. This assumes the errors in the data input to polyfit are independent normal with constant variance. [Y,DELTA] = polyconf(p,X,S,alpha) gives 100(1-alpha)% confidence intervals.
Examples
This example gives predictions and 90% confidence intervals for computing
time for LU factorizations of square matrices with 100 to 200 columns.
n = [100 100:20:200];
for i = n
   A = rand(i,i);
   tic
   B = lu(A);
   t(ceil((i-80)/20)) = toc;
end
[p,S] = polyfit(n(2:7),t,3);
[time,delta_t] = polyconf(p,n(2:7),S,0.1)
time =
    0.0829    0.1476    0.2277    0.3375    0.4912    0.7032
delta_t =
    0.0064    0.0057    0.0055    0.0055    0.0057    0.0064
polyfit
Purpose
Polynomial curve fitting.
Syntax
[p,S] = polyfit(x,y,n)
Description
p = polyfit(x,y,n) finds the coefficients of a polynomial p(x) of degree n that fits the data in a least-squares sense:
$$ p(x) = p_1 x^n + p_2 x^{n-1} + \cdots + p_n x + p_{n+1} $$
You may omit S if you are not going to pass it to polyval or polyconf for
calculating error estimates.
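A minimal sketch of typical usage (the data here are illustrative, not the manual's original example):
x = 1:10;
y = 2*x + 1 + normrnd(0,0.5,1,10);   % noisy straight-line data
[p,S] = polyfit(x,y,1);              % p holds the fitted coefficients
% S can now be passed to polyval or polyconf for error estimates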
See Also
polytool
Purpose
Interactive plot for prediction of fitted polynomials.
Syntax
polytool(x,y)
polytool(x,y,n)
polytool(x,y,n,alpha)
Description
polytool(x,y) fits a line to the column vectors x and y and displays an interactive plot of the result. The plot shows the fitted curve plus 95% confidence intervals for new predicted values.
polytool fits by least-squares using the regression model
$$ y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_n x_i^n + \varepsilon_i $$
$$ \varepsilon_i \sim N(0,\,\sigma^2), \qquad \mathrm{Cov}(\varepsilon_i,\varepsilon_j) = 0 \;\; \forall\, i \neq j $$
Evaluate the function by typing a value in the x-axis edit box or dragging the
vertical reference line on the plot. The shape of the pointer changes from an
arrow to a cross hair when you are over the vertical line to indicate that the line
is draggable. The predicted value of y will update as you drag the reference line.
The argument, n, controls the degree of the polynomial fit. To change the
degree of the polynomial, choose from the pop-up menu at the top of the figure.
When you are done, press the Close button.
polyval
Purpose
Polynomial evaluation.
Syntax
Y = polyval(p,X)
[Y,DELTA] = polyval(p,X,S)
Description
Y = polyval(p,X) returns the predicted value of a polynomial given its coefficients, p, evaluated at the values in X. [Y,DELTA] = polyval(p,X,S) uses the optional output S generated by polyfit to generate error estimates, Y ± DELTA.
Examples
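A minimal sketch of the two calling forms (the data and fit are illustrative):
x = 1:10;
y = 2*x + 1 + normrnd(0,0.5,1,10);
[p,S] = polyfit(x,y,1);        % fit a line, keeping S for error estimates
Y = polyval(p,x);              % predicted values
[Y,DELTA] = polyval(p,x,S);    % predictions with error estimates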
See Also
prctile
Purpose
Percentiles of a sample.
Syntax
Y = prctile(X,p)
Description
Y = prctile(X,p) calculates a value that is greater than p percent of the values in X. The values of p must lie in the interval [0 100]. For matrices, prctile gives the percentiles of each column of X.
Examples
x = (1:5)'*(1:5)
x =
     1     2     3     4     5
     2     4     6     8    10
     3     6     9    12    15
     4     8    12    16    20
     5    10    15    20    25
y = prctile(x,[25 50 75])
y =
    1.7500    3.5000    5.2500    7.0000    8.7500
    3.0000    6.0000    9.0000   12.0000   15.0000
    4.2500    8.5000   12.7500   17.0000   21.2500
princomp
Purpose
Principal Components Analysis (PCA).
Syntax
[PC,SCORE,latent,tsquare] = princomp(X)
Description
[PC,SCORE,latent,tsquare] = princomp(X) takes a data matrix X and returns the principal components in PC, the so-called Z-scores in SCORE, the eigenvalues of the covariance matrix of X in latent, and Hotelling's T² statistic for each data point in tsquare.
The Z-scores are the data formed by transforming the original data into the
space of the principal components. The values of the vector, latent, are the
variance of the columns of SCORE. Hotelling's T2 is a measure of the
multivariate distance of each observation from the center of the data set.
Example
Compute principal components for the ingredients data in the Hald dataset,
and the variance accounted for by each component.
load hald;
[pc,score,latent,tsquare] = princomp(ingredients);
pc,latent
pc =
    0.0678   -0.6460    0.5673    0.5062
    0.6785   -0.0200   -0.5440    0.4933
   -0.0290    0.7553    0.4036    0.5156
   -0.7309   -0.1085   -0.4684    0.4844
latent =
  517.7969
   67.4964
   12.4054
    0.2372
Reference
Jackson, J. E., A User's Guide to Principal Components, John Wiley and Sons, Inc., 1991, pp. 1–25.
See Also
qqplot
Purpose
Quantile-quantile plot of two samples.
Syntax
qqplot(x,y)
h = qqplot(x,y)
Description
qqplot(x,y) displays a quantile-quantile plot of two samples. If the samples do come from the same distribution, the plot will be linear.
Examples
Generate two normal samples with different means and standard deviations.
Then make a quantile-quantile plot of the two samples.
x = normrnd(0,1,100,1);
y = normrnd(0.5,2,50,1);
qqplot(x,y);
[Figure: quantile-quantile plot of the two samples; y-axis: Y Quantiles, x-axis: X Quantiles]
random
Purpose
Random numbers from a specified distribution.
Syntax
y = random('name',A1,A2,A3,m,n)
Description
y = random('name',A1,A2,A3,m,n) returns a matrix of random numbers, where 'name' is a string containing the name of the distribution, and A1, A2, and A3 are matrices of distribution parameters. Depending on the distribution, some of the parameters may not be necessary. The last two parameters are the size of the output matrix, y.
random is a utility routine allowing you to access all the random number generators in the Statistics Toolbox using the name of the distribution as a parameter.
Examples
rn = random('Normal',0,1,2,4)
rn =
    1.1650    0.0751   -0.6965    0.0591
    0.6268    0.3516    1.6961    1.7971
rp = random('Poisson',1:6,1,6)
rp =
     0
randtool
Purpose
Interactive random number generation using histograms for display.
Syntax
randtool
r = randtool('output')
Description
The randtool command sets up a graphic user interface for exploring the
effects of changing parameters and sample size on the histogram of random
samples from the supported probability distributions.
The M-file calls itself recursively using the action and flag parameters. For
general use call randtool without parameters.
To output the current set of random numbers, press the Output button. The
results are stored in the variable ans. Alternatively, the command
r = randtool('output') places the sample of random numbers in the vector,
r.
To sample repetitively from the same distribution, press the Resample button.
To change the distribution function, choose from the pop-up menu of functions
at the top of the figure.
To change the parameter settings, move the sliders or type a value in the edit
box under the name of the parameter. To change the limits of a parameter, type
a value in the edit box at the top or bottom of the parameter slider.
To change the sample size, type a number in the Sample Size edit box.
When you are done, press the Close button.
For an extensive discussion, see The disttool Demo on page 1-125.
See Also
disttool
range
Purpose
Sample range.
Syntax
y = range(X)
Description
range(X) returns the difference between the maximum and the minimum of a
sample. For vectors, range(x) is the range of the elements. For matrices,
range(X) is a row vector containing the range of each column of X.
Example
rv = normrnd(0,1,1000,4);
near6 = range(rv)
near6 =
    6.4986    6.2909    5.8894    7.0002
See Also
ranksum
Purpose
Wilcoxon rank sum test that two populations are identical.
Syntax
p = ranksum(x,y,alpha)
[p,h] = ranksum(x,y,alpha)
Description
p = ranksum(x,y,alpha) returns the significance probability that the populations generating two independent samples, x and y, are identical. x and y are vectors but can have different lengths. alpha is the desired level of significance and must be a scalar between zero and one. [p,h] = ranksum(x,y,alpha) also returns h = 1 if the populations are significantly different at the level alpha.
Example
This example tests the hypothesis of equality of means for two samples
generated with poissrnd.
x = poissrnd(5,10,1);
y = poissrnd(2,20,1);
[p,h] = ranksum(x,y,0.05)
p =
0.0028
h =
1
See Also
raylcdf
Purpose
Rayleigh cumulative distribution function (cdf).
Syntax
P = raylcdf(X,B)
Description
P = raylcdf(X,B) computes the Rayleigh cdf at the values in X with parameter B. The Rayleigh cdf is
$$ y = F(x \mid b) = \int_0^x \frac{t}{b^2}\, e^{\frac{-t^2}{2b^2}}\, dt $$
Example
x = 0:0.1:3;
p = raylcdf(x,1);
plot(x,p)
[Figure: Rayleigh cdf for b = 1, plotted over x = 0 to 3]
Reference
See Also
raylinv
Purpose
Inverse of the Rayleigh cumulative distribution function.
Syntax
X = raylinv(P,B)
Description
X = raylinv(P,B) returns the inverse of the Rayleigh cdf with parameter B at the probabilities in P.
Example
x = raylinv(0.9,1)
x =
2.1460
See Also
raylpdf
Purpose
Rayleigh probability density function (pdf).
Syntax
Y = raylpdf(X,B)
Description
Y = raylpdf(X,B) computes the Rayleigh pdf at the values in X with parameter B. The Rayleigh pdf is
$$ y = f(x \mid b) = \frac{x}{b^2}\, e^{\frac{-x^2}{2b^2}} $$
Example
x = 0:0.1:3;
p = raylpdf(x,1);
plot(x,p)
[Figure: Rayleigh pdf for b = 1, plotted over x = 0 to 3]
See Also
raylrnd
Purpose
Random matrices from the Rayleigh distribution.
Syntax
R = raylrnd(B)
R = raylrnd(B,m)
R = raylrnd(B,m,n)
Description
R = raylrnd(B) returns a matrix of random numbers chosen from the Rayleigh distribution with parameter B. The size of R is the size of B.
Example
r = raylrnd(1:5)
r =
    1.7986    0.8795    3.3473    8.9159    3.5182
See Also
raylstat
Purpose
Mean and variance for the Rayleigh distribution.
Syntax
[M,V] = raylstat(B)
Description
[M,V] = raylstat(B) returns the mean and variance of the Rayleigh distribution with parameter B. The mean is b·sqrt(π/2) and the variance is (2 − π/2)·b².
Example
[mn,v] = raylstat(1)
mn =
1.2533
v =
0.4292
See Also
rcoplot
Purpose
Residual case order plot.
Syntax
rcoplot(r,rint)
Description
rcoplot(r,rint) displays an errorbar plot of the confidence intervals on the residuals from a regression. The residuals appear in the plot in case order.
Example
X = [ones(10,1) (1:10)'];
y = X*[10;1] + normrnd(0,0.1,10,1);
[b,bint,r,rint] = regress(y,X,0.05);
rcoplot(r,rint);
[Figure: errorbar plot of residuals against Case Number; y-axis: Residuals]
The figure shows a plot of the residuals with error bars showing 95% confidence
intervals on the residuals. All the error bars pass through the zero line,
indicating that there are no outliers in the data.
See Also
regress
refcurve
Purpose
Add a polynomial curve to the current plot.
Syntax
h = refcurve(p)
Description
refcurve adds a graph of the polynomial, p, to the current axes. The function
for a polynomial of degree n is:
$$ y = p_1 x^n + p_2 x^{n-1} + \cdots + p_n x + p_{n+1} $$
Example
Plot data for the height of a rocket against time, and add a reference curve
showing the theoretical height (assuming no air friction). The initial velocity of
the rocket is 100 m/sec.
h = [85 162 230 289 339 381 413 437 452 458 456 440 400 356];
plot(h,'+')
refcurve([-4.9 100 0])
[Figure: rocket height data ('+') with the theoretical reference curve overlaid; x-axis: time 0 to 14 seconds, y-axis: height 0 to 500 meters]
See Also
refline
Purpose
Add a reference line to the current axes.
Syntax
refline(slope,intercept)
refline(slope)
h = refline(slope,intercept)
refline
Description
refline(slope,intercept) adds a reference line with the given slope and intercept to the current axes. refline(slope), where slope is a two-element vector, adds the line
y = SLOPE(2) + SLOPE(1)*x
to the figure.
h = refline(slope,intercept) returns the handle to the line.
refline with no input arguments superimposes the least squares line on each
line object in the current figure (except LineStyles '-', '--', and '.-'). This
behavior is equivalent to lsline.
Example
y = [3.2 2.6 3.1 3.4 2.4 2.9 3.0 3.3 3.2 2.1 2.6]';
plot(y,'+')
refline(0,3)
[Figure: data points ('+') with a horizontal reference line at y = 3]
See Also
regress
Purpose
Multiple linear regression using least squares.
Syntax
b = regress(y,X)
[b,bint,r,rint,stats] = regress(y,X)
[b,bint,r,rint,stats] = regress(y,X,alpha)
Description
b = regress(y,X) returns the least squares estimate of β in the linear model
$$ y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0,\,\sigma^2 I) $$
for β, where:
y is an n-by-1 vector of observations,
X is an n-by-p matrix of regressors,
β is a p-by-1 vector of parameters, and
ε is an n-by-1 vector of random disturbances.
[b,bint,r,rint,stats] = regress(y,X) returns an estimate of β in b, and a 95%
confidence interval for β in the p-by-2 vector bint. The residuals are in r, and
a 95% confidence interval for each residual is in the n-by-2 vector rint. The
vector, stats, contains the R² statistic along with the F and p values for the
regression.
[b,bint,r,rint,stats] = regress(y,X,alpha) gives 100(1-alpha)%
confidence intervals for bint and rint. For example, alpha = 0.2 gives 80%
confidence intervals.
Examples
X = [ones(10,1) (1:10)']
X =
     1     1
     1     2
     1     3
     1     4
     1     5
     1     6
     1     7
     1     8
     1     9
     1    10
y = X*[10;1] + normrnd(0,0.1,10,1)
y =
11.1165
12.0627
13.0075
14.0352
14.9303
16.1696
17.0059
18.1797
19.0264
20.0872
[b,bint] = regress(y,X,0.05)
b =
10.0456
1.0030
bint =
    9.9165   10.1747
    0.9822    1.0238
Compare b to [10 1]'. Note that bint includes the true model values.
Reference
regstats
Purpose
Regression diagnostics for linear models.
Syntax
regstats(responses,DATA)
regstats(responses,DATA,'model')
Description
regstats(responses,DATA) fits a multiple regression of the measurements in the vector responses on the values in the matrix DATA, using an additive model with a constant term.
The function creates a figure with a group of checkboxes that save diagnostic
statistics to the base workspace using variable names you can specify.
regstats(responses,data,'model') controls the order of the regression
model. 'model' can be one of these strings: 'linear', 'interaction',
'quadratic', or 'purequadratic'.
Algorithm
The usual textbook definition of the least squares estimate is
$$ \hat{\beta} = b = (X'X)^{-1} X'y $$
However, this definition has poor numeric properties. Particularly dubious is
the computation of $(X'X)^{-1}$, which is both expensive and imprecise.
Numerically stable MATLAB code for b, based on the QR decomposition X = QR, is: b = R\(Q'*y);
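A sketch of the stable computation (the design matrix and response below are illustrative):
X = [ones(10,1) (1:10)'];            % illustrative regressors
y = X*[10;1] + normrnd(0,0.1,10,1);  % illustrative response
[Q,R] = qr(X,0);                     % economy-size QR factorization
b = R\(Q'*y)                         % least squares estimate without forming inv(X'*X)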
Reference
See Also
ridge
Purpose
Parameter estimates for ridge regression.
Syntax
b = ridge(y,X,k)
Description
b = ridge(y,X,k) returns the ridge regression coefficients, b, for the linear model y = Xβ + ε, where X is an n-by-p matrix, y is an n-by-1 vector of observations, and k is a scalar ridge parameter.
Example
[Figure: ridge trace, coefficient estimates plotted against the ridge parameter k from 0 to 1]
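A sketch of how a ridge trace like the figure can be generated (the grid of k values is illustrative):
load hald
k = 0:0.05:1;
b = zeros(4,length(k));
for i = 1:length(k)
   b(:,i) = ridge(heat,ingredients,k(i));  % coefficients at each ridge parameter
end
plot(k,b')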
See Also
regress, stepwise
rowexch
Purpose
D-optimal design of experiments using a row exchange algorithm.
Syntax
settings = rowexch(nfactors,nruns)
[settings,X] = rowexch(nfactors,nruns)
[settings,X] = rowexch(nfactors,nruns,'model')
Description
settings = rowexch(nfactors,nruns) generates the factor settings matrix, settings, for a D-optimal design using a linear additive model. [settings,X] = rowexch(nfactors,nruns) also generates the associated design matrix, X.
[settings,X] = rowexch(nfactors,nruns,'model') produces a design for
fitting a specified regression model. The input, 'model', can be one of these
strings: 'linear', 'interaction', 'quadratic', or 'purequadratic'.
Example
This example illustrates that the D-optimal design for three factors in eight
runs, using an interactions model, is a two level full-factorial design.
s = rowexch(3,8,'interaction')
s =
    -1    -1    -1
     1    -1    -1
    -1     1    -1
     1     1    -1
    -1    -1     1
     1    -1     1
    -1     1     1
     1     1     1
See Also
rsmdemo
Purpose
Demo of design of experiments and surface fitting.
Syntax
rsmdemo
Description
rsmdemo creates a GUI that simulates a chemical reaction. To start, you have
a budget of 13 test reactions. Try to find out how changes in each reactant affect
the reaction rate. Determine the reactant settings that maximize the reaction
rate. Estimate the run-to-run variability of the reaction. Now run a designed
experiment using the model pop-up. Compare your previous results with the
output from response surface modeling or nonlinear modeling of the reaction.
See Also
rstool
Purpose
Interactive fitting and visualization of a response surface.
Syntax
rstool(x,y)
rstool(x,y,'model')
Description
rstool(x,y) displays an interactive prediction plot of the fitted response surface with 95% global confidence intervals for predictions.
Drag the dotted white reference line and watch the predicted values update
simultaneously. Alternatively, you can get a specific prediction by typing the
value of x into an editable text field. Use the pop-up menu labeled Model to
interactively change the model. Use the pop-up menu labeled Export to move
specified variables to the base workspace.
See Also
nlintool
schart
Purpose
Chart of standard deviation for Statistical Process Control.
Syntax
schart(DATA)
schart(DATA,conf)
schart(DATA,conf,specs)
[outliers,h] = schart(DATA,conf,specs)
Description
schart(DATA) displays an S chart of the grouped responses in DATA. The rows
of DATA contain replicate observations taken at a given time and must be
in time order. The upper and lower control limits are a 99% confidence interval
on a new observation from the process. So, roughly 99% of the plotted points
should fall between the control limits.
schart(DATA,conf) allows control of the confidence level of the upper and
lower plotted confidence limits. For example, conf = 0.95 plots 95% confidence
intervals.
schart(DATA,conf,specs) plots the specification limits in the two element
vector, specs.
[outliers,h] = schart(data,conf,specs) returns outliers, a vector of
indices to the rows where the standard deviation of DATA is out of control, and h, a vector of
handles to the plotted lines.
Example
Plot an S chart of measurements on newly machined parts, taken at one hour
intervals for 36 hours. Each row of the runout matrix contains the
measurements for four parts chosen at random. The values indicate,
in thousandths of an inch, the amount the part radius differs from the target
radius.
load parts
schart(runout)
[Figure: S chart of the runout data; y-axis: Standard Deviation, x-axis: Sample Number (0 to 40), with UCL and LCL control limits]
Reference
See Also
signrank
Purpose
Wilcoxon signed rank test of equality of medians.
Syntax
p = signrank(x,y,alpha)
[p,h] = signrank(x,y,alpha)
Description
p = signrank(x,y,alpha) returns the significance probability that the medians of two matched samples, x and y, are equal. x and y must be vectors of equal length. alpha is the desired level of significance and must be a scalar between zero and one. [p,h] = signrank(x,y,alpha) also returns h = 1 if the two medians are significantly different at the level alpha.
Example
This example tests the hypothesis of equality of means for two samples
generated with normrnd. The samples have the same theoretical mean but
different standard deviations.
x = normrnd(0,1,20,1);
y = normrnd(0,2,20,1);
[p,h] = signrank(x,y,0.05)
p =
0.2568
h =
0
See Also
signtest
Purpose
Sign test for paired samples.
Syntax
p = signtest(x,y,alpha)
[p,h] = signtest(x,y,alpha)
Description
p = signtest(x,y,alpha) returns the significance probability that the medians of two matched samples, x and y, are equal. x and y must be vectors of equal length, or y may be a scalar. alpha is the
desired level of significance and must be a scalar between zero and one.
[p,h] = signtest(x,y,alpha) also returns the result of the hypothesis test,
h. h is zero if the difference in medians of x and y is not significantly different
from zero. h is one if the two medians are significantly different.
p is the probability of observing a result equally or more extreme than the one
using the data (x and y) if the null hypothesis is true. p is calculated using the
signs (plus or minus) of the differences between corresponding elements in x
and y. If p is near zero, this casts doubt on this hypothesis.
Example
This example tests the hypothesis of equality of means for two samples
generated with normrnd. The samples have the same theoretical mean but
different standard deviations.
x = normrnd(0,1,20,1);
y = normrnd(0,2,20,1);
[p,h] = signtest(x,y,0.05)
p =
0.8238
h =
0
See Also
skewness
Purpose
Sample skewness.
Syntax
y = skewness(X)
Description
Skewness is a measure of the asymmetry of the data around the sample mean.
If skewness is negative, the data are spread out more to the left of the mean
than to the right. If skewness is positive, the data are spread out more to the
right. The skewness of the normal distribution (or any perfectly symmetric
distribution) is zero.
The skewness of a distribution is defined as:
$$ y = \frac{E(x - \mu)^3}{\sigma^3} $$
Example
X = randn([5 4])
X =
    1.1650    1.6961   -1.4462   -0.3600
    0.6268    0.0591   -0.7012    0.1356
    0.0751    1.7971    1.2460   -1.3493
    0.3516    0.2641    0.6390   -1.2704
   -0.6965    0.8717    0.5774    0.9846
y = skewness(X)
y =
   -0.2933    0.0482    0.2735    0.4641
See Also
squareform
Purpose
Reformat the output of pdist into a square matrix.
Syntax
S = squareform(Y)
Description
S = squareform(Y) reformats the output of pdist, Y, from
a vector into a square matrix. In this format, S(i,j) denotes the distance
between the i and j observations in the original data.
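A minimal sketch (the three points are illustrative):
Y = pdist([0 0; 0 3; 4 0])   % pairwise distances 3, 4, 5
S = squareform(Y)            % 3-by-3 symmetric matrix with zeros on the diagonal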
See Also
See pdist.
std
Purpose
Standard deviation of a sample.
Syntax
y = std(X)
Description
std(X) computes the sample standard deviation of the data in X. For vectors,
std(x) is the standard deviation of the elements in x. For matrices, std(X) is
a row vector containing the standard deviation of each column of X.
std(X) normalizes by n−1, where n is the sequence length. For normally
distributed data, the square of the standard deviation is the minimum variance
unbiased estimator of σ² (the second parameter).
$$ s = \left[ \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \right]^{1/2}, \qquad \text{where } \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i $$
Examples
y = std(-1:2:1)
y =
    1.4142
See Also
cov, var
std is a function in MATLAB.
stepwise
Purpose
Interactive environment for stepwise regression.
Syntax
stepwise(X,y)
stepwise(X,y,inmodel)
stepwise(X,y,inmodel,alpha)
Description
stepwise(X,y) fits a regression model of y on the columns of X and displays figure windows for interactively controlling the stepwise addition and removal of model terms.
The least squares coefficient is plotted with a green filled circle. A coefficient is
not significantly different from zero if its confidence interval crosses the white
zero line. Significant model terms are plotted using solid lines. Terms not
significantly different from zero are plotted with dotted lines.
Click on the confidence interval lines to toggle the state of the model
coefficients. If the confidence interval line is green, the term is in the model. If
the confidence interval line is red, the term is not in the model.
Use the pop-up menu, Export, to move variables to the base workspace.
Example
Reference
See Also
surfht
Purpose
Interactive contour plot.
Syntax
surfht(Z)
surfht(x,y,Z)
Description
surfht(Z) is an interactive contour plot of the matrix Z treated as a surface.
There are vertical and horizontal reference lines on the plot whose intersection
defines the current x-value and y-value. You can drag these dotted white
reference lines and watch the interpolated z-value (at the top of the plot)
update simultaneously. Alternatively, you can get a specific interpolated
z-value by typing the x-value and y-value into editable text fields on the x-axis
and y-axis respectively.
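A minimal sketch, assuming the single-matrix calling form surfht(Z):
z = peaks(25);   % illustrative surface data
surfht(z)        % opens the interactive contour plot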
tabulate
Purpose
Frequency table.
Syntax
table = tabulate(x)
tabulate(x)
Description
table = tabulate(x) takes a vector of positive integers, x, and returns a matrix, table.
The first column of table contains the values of x. The second contains the
number of instances of this value. The last column contains the percentage of
each value.
tabulate with no output arguments displays a formatted table in the
command window.
Example
tabulate([1 2 4 4 3 4])
Value    Count    Percent
    1        1     16.67%
    2        1     16.67%
    3        1     16.67%
    4        3     50.00%
See Also
pareto
tblread
Purpose
Read tabular data from the file system.
Syntax
[data,varnames,casenames] = tblread
[data,varnames,casenames] = tblread('filename')
Description
[data,varnames,casenames] = tblread displays the File Open dialog box for interactive selection of the tabular data file. The file format has variable names in the first row, case names in the first column, and data starting in the (2,2) position. The returned values are:
Return Value    Description
data            Numeric matrix with a value for each variable-case pair
varnames        String matrix containing the variable names in the first row
casenames       String matrix containing the names of each case in the first column
Example
[data,varnames,casenames] = tblread('sat.dat')
data =
   470   530
   520   480
varnames =
Male
Female
casenames =
Verbal
Quantitative
See Also
caseread, tblwrite
tblwrite
Purpose
Writes tabular data to the file system.
Syntax
tblwrite(data,'varnames','casenames')
tblwrite(data,'varnames','casenames','filename')
Description
tblwrite(data,'varnames','casenames') displays the File Open dialog box
for interactive specification of the tabular data output file. The file format has
variable names in the first row, case names in the first column and data
starting in the (2,2) position.
'varnames' is a string matrix containing the variable names. 'casenames' is
a string matrix containing the names of each case in the first column. data is
a numeric matrix with a value for each variable-case pair.
tblwrite(data,'varnames','casenames','filename') allows command line
specification of a file in the current directory, or the complete pathname of any
file in the string, 'filename'.
Example
              Male  Female
Verbal         470     530
Quantitative   520     480
See Also
casewrite, tblread
tcdf
Purpose
Student's t cumulative distribution function (cdf).
Syntax
P = tcdf(X,V)
Description
tcdf(X,V) computes Student's t cdf at the values in X with V degrees of freedom. The t cdf is
$$ p = F(x \mid \nu) = \int_{-\infty}^{x} \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)}\, \frac{1}{\sqrt{\nu\pi}}\, \frac{1}{\left(1 + \frac{t^2}{\nu}\right)^{\frac{\nu+1}{2}}}\, dt $$
Examples
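A short example in place of the lost original (the particular values are illustrative):
p = tcdf(1,[1 5 30 100])   % P(T <= 1) approaches normcdf(1) = 0.8413 as V grows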
tinv
Purpose
Inverse of Student's t cumulative distribution function (cdf).
Syntax
X = tinv(P,V)
Description
tinv(P,V) computes the inverse of Student's t cdf with parameter V for the
probabilities in P. The arguments P and V must be the same size except that a
scalar argument functions as a constant matrix of the size of the other
argument.
The degrees of freedom, V, must be a positive integer and P must lie in the
interval [0 1].
The t inverse function in terms of the t cdf is
$$ x = F^{-1}(p \mid \nu) = \{ x : F(x \mid \nu) = p \} $$
where
$$ p = F(x \mid \nu) = \int_{-\infty}^{x} \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)}\, \frac{1}{\sqrt{\nu\pi}}\, \frac{1}{\left(1 + \frac{t^2}{\nu}\right)^{\frac{\nu+1}{2}}}\, dt $$
The result, x, is the solution of the integral equation of the t cdf with parameter
ν, where you supply the desired probability p.
Examples
What is the 99th percentile of the t distribution for one to six degrees of
freedom?
percentile = tinv(0.99,1:6)
percentile =
   31.8205    6.9646    4.5407    3.7469    3.3649    3.1427
tpdf
Purpose
Student's t probability density function (pdf).
Syntax
Y = tpdf(X,V)
Description
tpdf(X,V) computes Student's t pdf at the values in X with V degrees of freedom. Student's t pdf is
$$ y = f(x \mid \nu) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)}\, \frac{1}{\sqrt{\nu\pi}}\, \frac{1}{\left(1 + \frac{x^2}{\nu}\right)^{\frac{\nu+1}{2}}} $$
Examples
The mode of the t distribution is at x = 0. This example shows that the value of
the function at the mode is an increasing function of the degrees of freedom.
tpdf(0,1:6)
ans =
0.3183
0.3536
0.3676
0.3750
0.3796
0.3827
trimmean
Purpose
Mean of a sample of data excluding extreme values.
Syntax
m = trimmean(X,percent)
Description
m = trimmean(X,percent) calculates the mean of a sample X excluding the highest and lowest (percent/2)% of the observations. The trimmed mean is a robust estimate of the location of a sample.
Examples
This example shows a Monte Carlo simulation of the relative efficiency of the
10% trimmed mean to the sample average for normal data.
x = normrnd(0,1,100,100);
m = mean(x);
trim = trimmean(x,10);
sm = std(m);
strim = std(trim);
efficiency = (sm/strim).^2
efficiency =
0.9702
See Also
trnd
Purpose
Random numbers from Student's t distribution.
Syntax
R = trnd(V)
R = trnd(V,m)
R = trnd(V,m,n)
Description
R = trnd(V) generates random numbers from Student's t distribution with V degrees of freedom. The size of R is the size of V.
Examples
noisy = trnd(ones(1,6))
noisy =
   19.7250    0.3488    0.2843    0.4034    0.4816    2.4190
numbers = trnd(3,2,6)
tstat
Purpose
Mean and variance for Student's t distribution.
Syntax
[M,V] = tstat(NU)
Description
[M,V] = tstat(NU) returns the mean and variance for Student's t distribution with NU degrees of freedom. The mean is zero for ν > 1 and the variance is ν/(ν−2) for ν > 2; both are undefined otherwise.
Examples
Find the mean and variance for the t distribution with ν = 1 to 30.
nu = reshape(1:30,6,5);
[m,v] = tstat(nu)
m =
     0     0     0     0     0
     0     0     0     0     0
     0     0     0     0     0
     0     0     0     0     0
     0     0     0     0     0
     0     0     0     0     0
v =
       NaN    1.4000    1.1818    1.1176    1.0870
       NaN    1.3333    1.1667    1.1111    1.0833
    3.0000    1.2857    1.1538    1.1053    1.0800
    2.0000    1.2500    1.1429    1.1000    1.0769
    1.6667    1.2222    1.1333    1.0952    1.0741
    1.5000    1.2000    1.1250    1.0909    1.0714
Note that the variance does not exist for one and two degrees of freedom.
ttest
Purpose
Hypothesis testing for a single sample mean.
Syntax
h = ttest(x,m)
h = ttest(x,m,alpha)
[h,sig,ci] = ttest(x,m,alpha,tail)
Description
ttest(x,m) performs a t-test of the null hypothesis that the data in the vector x come from a distribution with mean m. The test statistic is
$$ T = \frac{\bar{x} - m}{s / \sqrt{n}} $$
where n is the number of observations and s is the sample standard deviation.
sig is the probability that the observed value of T could be as large or larger by
chance under the null hypothesis that the mean of x is equal to m.
Example
This example generates 100 normal random numbers with theoretical mean
zero and standard deviation one. The observed mean and standard deviation
are different from their theoretical values, of course. We test the hypothesis
that there is no true difference.
x = normrnd(0,1,1,100);
[h,sig,ci] = ttest(x,0)
h =
     0
sig =
    0.4474
ci =
   -0.1165    0.2620
The result, h = 0, means that we cannot reject the null hypothesis. The
significance level is 0.4474, which means that by chance we would have
observed values of T more extreme than the one in this example in 45 of 100
similar experiments. A 95% confidence interval on the mean is [-0.1165
0.2620], which includes the theoretical (and hypothesized) mean of zero.
ttest2
Purpose
Hypothesis testing for the difference in means of two samples.
Syntax
[h,significance,ci] = ttest2(x,y)
[h,significance,ci] = ttest2(x,y,alpha)
ttest2(x,y,alpha,tail)
Description
h = ttest2(x,y) performs a t-test to determine whether two samples could have the same mean. The test statistic is
$$ T = \frac{\bar{x} - \bar{y}}{s \sqrt{\frac{1}{n} + \frac{1}{m}}} $$
where s is the pooled sample standard deviation and n and m are the numbers of observations in x and y.
significance is the probability that the observed value of T could be as large
or larger by chance under the null hypothesis that the mean of x is equal to the
mean of y.
ci is a 95% confidence interval for the true difference in means.
[h,significance,ci] = ttest2(x,y,alpha) gives control of the significance
level, alpha. For example if alpha = 0.01, and the result, h, is 1, you can reject
the null hypothesis at the significance level 0.01. ci in this case is a
100(1alpha)% confidence interval for the true difference in means.
ttest2(x,y,alpha,tail) allows specification of one- or two-tailed tests, where tail is a flag: 0 for a two-sided alternative and 1 or -1 for the corresponding one-sided alternatives.
Examples
This example generates 100 normal random numbers with theoretical mean
zero and standard deviation one. We then generate 100 more normal random
numbers with theoretical mean one half and standard deviation one. The
observed means and standard deviations are different from their theoretical
values, of course. We test the hypothesis that there is no true difference
between the two means. Notice that the true difference is only one half of the
standard deviation of the individual observations.
x = normrnd(0,1,1,100);
y = normrnd(0.5,1,1,100);
[h,significance,ci] = ttest2(x,y)
h =
     1
significance =
    0.0017
ci =
   -0.7352   -0.1720
The result, h = 1, means that we can reject the null hypothesis. The
significance is 0.0017, which means that by chance we would have
observed values of T more extreme than the one in this example in only 17
of 10,000 similar experiments! A 95% confidence interval on the difference in
means is [-0.7352 -0.1720], which includes the theoretical (and hypothesized)
difference of -0.5.
unidcdf
Purpose
Discrete uniform cumulative distribution (cdf) function.
Syntax
P = unidcdf(X,N)
Description
P = unidcdf(X,N) computes the discrete uniform cdf at the values in X with sample-space size N.
Examples
What is the probability of drawing a number 20 or less from a hat with the
numbers from 1 to 50 inside?
probability = unidcdf(20,50)
probability =
0.4000
unidinv
Purpose
Inverse of the discrete uniform cumulative distribution function.
Syntax
X = unidinv(P,N)
Description
unidinv(P,N) returns the smallest integer X such that the discrete uniform cdf
evaluated at X is equal to or exceeds P. You can think of P as the probability of
drawing a number as large as X out of a hat with the numbers 1 through N
inside.
The argument P must lie on the interval [0 1] and N must be a positive integer.
Each element of X is a positive integer.
Examples
x = unidinv(0.7,20)
x =
14
y = unidinv(0.7 + eps,20)
y =
15
A small change in the first parameter produces a large jump in output. The cdf
and its inverse are both step functions. The example shows what happens at a
step.
unidpdf
Purpose
Discrete uniform probability density function (pdf).
Syntax
Y = unidpdf(X,N)
Description
Y = unidpdf(X,N) computes the discrete uniform pdf at the values in X with sample-space size N. The density is 1/N for integer x from 1 to N, and zero otherwise.
Examples
For fixed N, the discrete uniform pdf is a constant.
y = unidpdf(1:5,10)
y =
    0.1000    0.1000    0.1000    0.1000    0.1000
Now fix x and vary N.
likelihood = unidpdf(5,5:9)
likelihood =
    0.2000    0.1667    0.1429    0.1250    0.1111
unidrnd
Purpose
Random numbers from the discrete uniform distribution.
Syntax
R = unidrnd(N)
R = unidrnd(N,mm)
R = unidrnd(N,mm,nn)
Description
R = unidrnd(N) generates discrete uniform random numbers with maximum N. The size of R is the size of N.
Examples
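A minimal sketch (the die-rolling interpretation is illustrative):
rolls = unidrnd(6,1,10)   % ten rolls of a fair six-sided die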
unidstat
Purpose
Mean and variance for the discrete uniform distribution.
Syntax
[M,V] = unidstat(N)
Description
[M,V] = unidstat(N) returns the mean and variance for the discrete uniform distribution with parameter N. The mean is (N+1)/2 and the variance is
$$ \frac{N^2 - 1}{12} $$
Examples
[m,v] = unidstat(1:6)
m =
    1.0000    1.5000    2.0000    2.5000    3.0000    3.5000
v =
         0    0.2500    0.6667    1.2500    2.0000    2.9167
unifcdf
Purpose
Continuous uniform cumulative distribution function (cdf).
Syntax
P = unifcdf(X,A,B)
Description
P = unifcdf(X,A,B) computes the uniform cdf at the values in X with parameters A and B (the minimum and maximum values, respectively). The uniform cdf is
$$ p = F(x \mid a,b) = \frac{x - a}{b - a}\, I_{[a,b]}(x) $$
Examples
Examples
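A minimal example in place of the lost original (values are illustrative):
probability = unifcdf(0.75,0,1)   % P(X <= 0.75) = 0.7500 for a standard uniform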
unifinv
Purpose
Inverse continuous uniform cumulative distribution function (cdf).
Syntax
X = unifinv(P,A,B)
Description
X = unifinv(P,A,B) computes the inverse of the uniform cdf with parameters A and B at the values in P. The arguments P, A, and B must all be the same size,
except that scalar arguments function as constant matrices of the common size
of the other arguments.
A and B are the minimum and maximum values respectively.
$$ x = F^{-1}(p \mid a,b) = a + p(b - a)\, I_{[0,1]}(p) $$
The standard uniform distribution has A = 0 and B = 1.
Examples
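A minimal example in place of the lost original (values are illustrative):
median_value = unifinv(0.5,0,1)   % the median of the standard uniform is 0.5000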
unifit
Purpose
Parameter estimates for uniformly distributed data.
Syntax
[ahat,bhat] = unifit(X)
[ahat,bhat,ACI,BCI] = unifit(X,alpha)
Description
[ahat,bhat] = unifit(X) returns the maximum likelihood estimates (MLEs) of the parameters of the uniform distribution given the data in X. [ahat,bhat,ACI,BCI] = unifit(X,alpha) also returns 100(1-alpha)% confidence intervals for the parameter estimates.
Example
r = unifrnd(10,12,100,2);
[ahat,bhat,aci,bci] = unifit(r)
ahat =
   10.0154   10.0060
bhat =
   11.9989   11.9743
aci =
    9.9551    9.9461
   10.0154   10.0060
bci =
   11.9989   11.9743
   12.0592   12.0341
See Also
unifpdf
Purpose
Continuous uniform probability density function (pdf).
Syntax
Y = unifpdf(X,A,B)
Description
Y = unifpdf(X,A,B) computes the continuous uniform pdf at the values in X with parameters A and B. The arguments X, A, and B must all be the same size, except
that scalar arguments function as constant matrices of the common size of the
other arguments.
The parameter B must be greater than A.
The continuous uniform distribution pdf is:
$$ y = f(x \mid a,b) = \frac{1}{b - a}\, I_{[a,b]}(x) $$
The standard uniform distribution has A = 0 and B = 1.
Examples
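A minimal example in place of the lost original (values are illustrative):
y = unifpdf([0.5 1.5],0,2)   % density is 1/(b-a) = 0.5 everywhere inside [0,2]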
unifrnd
Purpose
Random numbers from the continuous uniform distribution.
Syntax
R = unifrnd(A,B)
R = unifrnd(A,B,m)
R = unifrnd(A,B,m,n)
Description
R = unifrnd(A,B) generates uniform random numbers with lower endpoint A and upper endpoint B. The size of R is the common size of A and B if both are matrices.
Examples
random = unifrnd(0,1:6)
random =
    0.2190    0.0941    2.0366    2.7172    4.6735    2.3010
random = unifrnd(0,1,2,3)
random =
    0.0077    0.0668    0.6868
    0.3834    0.4175    0.5890
unifstat
Purpose
Mean and variance for the continuous uniform distribution.
Syntax
[M,V] = unifstat(A,B)
Description
[M,V] = unifstat(A,B) returns the mean and variance for the continuous uniform distribution with parameters A and B. The mean is (a + b)/2 and the variance is
$$ \frac{(b - a)^2}{12} $$
Examples
a = 1:6;
b = 2.*a;
[m,v] = unifstat(a,b)
m =
    1.5000    3.0000    4.5000    6.0000    7.5000    9.0000
v =
    0.0833    0.3333    0.7500    1.3333    2.0833    3.0000
var
Purpose
Variance of a sample.
Syntax
y = var(X)
y = var(X,1)
y = var(X,w)
Description
var(X) computes the variance of the data in X. For vectors, var(x) is the
variance of the elements in x. For matrices, var(X) is a row vector containing
the variance of each column of X.
var(x) normalizes by n−1, where n is the sequence length. For normally
distributed data, this makes var(x) the minimum variance unbiased estimator
of σ². var(X,1) normalizes by n and yields the second moment of the sample
about its mean, which is the maximum likelihood estimate (MLE) of σ².
var(X,w) computes the variance using the vector of positive weights, w.
Examples
x = [-1 1];
w = [1 3];
v1 = var(x)
v1 =
2
v2 = var(x,1)
v2 =
1
v3 = var(x,w)
v3 =
0.7500
See Also
cov, std
2-251
weibcdf
Purpose
Weibull cumulative distribution function (cdf).
Syntax
P = weibcdf(X,A,B)
Description
P = weibcdf(X,A,B) computes the Weibull cdf at the values in X with parameters A and B. The arguments X, A, and B must all be the same size, except that scalar arguments function as constant matrices of the common size of the other arguments. The parameters A and B must be positive.
The Weibull cdf is
$$ p = F(x \mid a,b) = \int_0^x a b t^{b-1} e^{-a t^b}\, dt = \left(1 - e^{-a x^b}\right) I_{(0,\infty)}(x) $$
Examples
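A minimal example in place of the lost original (the parameter values are illustrative):
probability = weibcdf(0.5,0.15,0.8)   % P(X <= 0.5) for a Weibull with a = 0.15, b = 0.8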
weibfit
Purpose
Parameter estimates and confidence intervals for Weibull data.
Syntax
phat = weibfit(x)
[phat,pci] = weibfit(x,alpha)
Description
weibfit(x) returns the maximum likelihood estimates, phat, of the parameters of the Weibull distribution given the data in the vector x. The Weibull pdf is
$$ y = f(x \mid a,b) = a b x^{b-1} e^{-a x^b}\, I_{(0,\infty)}(x) $$
[phat,pci] = weibfit(x,alpha) also returns 100(1-alpha)% confidence intervals for the parameter estimates in pci.
Example
r = weibrnd(0.5,0.8,100,1);
[phat,pci] = weibfit(r)
phat =
    0.4746    0.7832
pci =
    0.3851    0.6367
    0.5641    0.9298
See Also
weibinv
Purpose
Inverse of the Weibull cumulative distribution function.
Syntax
X = weibinv(P,A,B)
Description
X = weibinv(P,A,B) computes the inverse of the Weibull cdf with parameters A and B at the values in P. The arguments P, A, and B must all be the same
size, except that scalar arguments function as constant matrices of the common
size of the other arguments.
The parameters A and B must be positive.
The inverse of the Weibull cdf is:
$$ x = F^{-1}(p \mid a,b) = \left[ \frac{1}{a} \ln\!\left( \frac{1}{1 - p} \right) \right]^{1/b} I_{[0,1]}(p) $$
Examples
A batch of light bulbs have lifetimes (in hours) distributed Weibull with
parameters a = 0.15 and b = 0.24. What is the median lifetime of the bulbs?
life = weibinv(0.5,0.15,0.24)
life =
588.4721
weiblike
Purpose
Weibull negative log-likelihood function.
Syntax
logL = weiblike(params,data)
Description
logL = weiblike(params,data) returns the Weibull negative log-likelihood for the parameters params given the data. params is a two-element vector whose elements are the a and b parameters, respectively.
The Weibull negative log-likelihood is:
$$ -\log L = -\log \prod_{i=1}^{n} f(a,b \mid x_i) = -\sum_{i=1}^{n} \log f(a,b \mid x_i) $$
Example
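A minimal sketch (the sample and parameter values are illustrative):
r = weibrnd(0.5,0.8,100,1);    % Weibull sample
logL = weiblike([0.5 0.8],r)   % negative log-likelihood at the true parameters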
Reference
See Also
2-255
weibpdf
Purpose
Weibull probability density function (pdf).
Syntax
Y = weibpdf(X,A,B)
Description
Y = weibpdf(X,A,B) computes the Weibull pdf at the values in X with parameters A and B. The arguments X, A, and B must all be the same size, except that scalar arguments function as constant matrices of the common size of the other arguments. The parameters A and B must be positive. The Weibull pdf is
$$ y = f(x \mid a,b) = a b x^{b-1} e^{-a x^b}\, I_{(0,\infty)}(x) $$
Some references refer to the Weibull distribution with a single parameter. This
corresponds to weibpdf with A = 1.
Examples
The exponential distribution is a special case of the Weibull distribution (b = 1).
lambda = 1:6;
y = weibpdf(0.1:0.1:0.6,lambda,1)
y =
    0.9048    1.3406    1.2197    0.8076    0.4104    0.1639
y1 = exppdf(0.1:0.1:0.6,1./lambda)
y1 =
    0.9048    1.3406    1.2197    0.8076    0.4104    0.1639
Reference
weibplot
Purpose
Weibull probability plot.
Syntax
weibplot(X)
h = weibplot(X)
Description
weibplot(X) displays a Weibull probability plot of the data in X. If the data do come from a Weibull distribution, the plot will appear linear; other distributions will introduce curvature in the plot.
Example
r = weibrnd(1.2,1.5,50,1);
weibplot(r)
[Figure: Weibull probability plot; y-axis: Probability (0.01 to 0.99), x-axis: Data on a log scale]
See Also
normplot
weibrnd
Purpose
Random numbers from the Weibull distribution.
Syntax
R = weibrnd(A,B)
R = weibrnd(A,B,m)
R = weibrnd(A,B,m,n)
Description
R = weibrnd(A,B) generates Weibull random numbers with parameters A and B. The size of R is the common size of A and B if both are matrices.
Examples
n1 = weibrnd(0.5:0.5:2,0.5:0.5:2)
n1 =
    0.0093    1.5189    0.8308    0.7541
n2 = weibrnd(1/2,1/2,[1 6])
n2 =
   29.7822    0.9359    2.1477   12.6402    0.0050    0.0121
Reference
weibstat
Purpose
Mean and variance for the Weibull distribution.
Syntax
[M,V] = weibstat(A,B)
Description
[M,V] = weibstat(A,B) returns the mean and variance for the Weibull distribution with parameters A and B. The mean of the Weibull distribution with parameters a and b is
$$ a^{-1/b}\, \Gamma(1 + b^{-1}) $$
and the variance is
$$ a^{-2/b} \left[ \Gamma(1 + 2b^{-1}) - \Gamma(1 + b^{-1})^2 \right] $$
Examples
[m,v] = weibstat(1:4,1:4)
m =
    1.0000    0.6267    0.6192    0.6409
v =
    1.0000    0.1073    0.0506    0.0323
weibstat(0.5,0.7)
ans =
    3.4073
x2fx
Purpose
Transform a factor settings matrix to a design matrix.
Syntax
D = x2fx(X)
D = x2fx(X,'model')
Description
D = x2fx(X) transforms a matrix of system inputs, X, to a design matrix for a linear additive model with a constant term. D = x2fx(X,'model') allows control of the model order; 'model' can be 'interaction', 'quadratic', or 'purequadratic'.
Example
x = [1 4; 2 5; 3 6];
D = x2fx(x,'quadratic')
D =
     1     1     4     4     1    16
     1     2     5    10     4    25
     1     3     6    18     9    36
Let x1 be the first column of x and x2 be the second. Then, the first column of D
is for the constant term. The second column is x1. The third column is x2. The fourth
is x1.*x2. The fifth is x1.^2 and the last is x2.^2.
See Also
xbarplot
Purpose
X-bar chart for Statistical Process Control.
Syntax
xbarplot(DATA)
xbarplot(DATA,conf)
xbarplot(DATA,conf,specs)
[outlier,h] = xbarplot(DATA,conf,specs)
Description
xbarplot(DATA) displays an x-bar chart of the grouped responses in DATA. The rows of DATA contain replicate observations taken at a given time and
must be in time order. The upper and lower control limits are a 99% confidence
interval on a new observation from the process. So, roughly 99% of the plotted
points should fall between the control limits.
xbarplot(DATA,conf) allows control of the confidence level of the upper
and lower plotted confidence limits. For example, conf = 0.95 plots 95%
confidence intervals.
xbarplot(DATA,conf,specs) plots the specification limits in the two element
vector, specs.
[outlier,h] = xbarplot(DATA,conf,specs) returns outlier, a vector of
indices to the rows where the mean of DATA is out of control, and h, a vector of
handles to the plotted lines.
Example
Plot an x-bar chart of measurements on newly machined parts, taken at one
hour intervals for 36 hours. Each row of the runout matrix contains the
measurements for four parts chosen at random. The values indicate, in
thousandths of an inch, the amount the part radius differs from the target
radius.
load parts
xbarplot(runout,0.999,[-0.5 0.5])
[Figure: Xbar chart of the runout data; y-axis: Measurements, x-axis: Samples (0 to 40), with UCL/LCL control limits, USL/LSL specification limits at ±0.5, and samples 21 and 25 flagged out of control]
See Also
zscore
Purpose
Standardized Z score.
Syntax
Z = zscore(D)
Description
Z = zscore(D) returns the centered and scaled version of the data in D. zscore operates on each column, subtracting the column mean and dividing by the column standard deviation, so each column of Z has mean zero and standard deviation one.
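A minimal example (the data matrix is illustrative):
D = [1 2; 3 4; 5 6];
Z = zscore(D)   % each column of Z has mean 0 and standard deviation 1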
ztest
Purpose
Hypothesis testing for the mean of one sample with known variance.
Syntax
h = ztest(x,m,sigma)
h = ztest(x,m,sigma,alpha)
[h,sig,ci] = ztest(x,m,sigma,alpha,tail)
Description
ztest(x,m,sigma) performs a Z test of the null hypothesis that the data in the vector x come from a distribution with mean m and known standard deviation sigma. The test statistic is
$$ z = \frac{\bar{x} - m}{\sigma / \sqrt{n}} $$
where n is the number of observations in x. sig is the p-value associated with the Z statistic: the probability that the observed value of Z could be as large or larger by
chance under the null hypothesis that the mean of x is equal to m.
Example
This example generates 100 normal random numbers with theoretical mean
zero and standard deviation one. The observed mean and standard deviation
are different from their theoretical values, of course. We test the hypothesis
that there is no true difference.
x = normrnd(0,1,100,1);
m = mean(x)
m =
0.0727
[h,sig,ci] = ztest(x,0,1)
h =
0
sig =
0.4669
ci =
   -0.1232    0.2687
The result, h = 0, means that we cannot reject the null hypothesis. The
significance level is 0.4669, which means that by chance we would have
observed values of Z more extreme than the one in this example in 47 of 100
similar experiments. A 95% confidence interval on the mean is [-0.1232
0.2687], which includes the theoretical (and hypothesized) mean of zero.
Index
A
absolute deviation 1-44
additive 1-67
alternative hypothesis 1-85
analysis of variance 1-24
ANOVA 1-65
anova1 2-12, 2-16
anova2 2-12, 2-20
average linkage 2-121
B
bacteria counts 1-65
barttest 2-13
baseball odds 2-31, 2-33
Bernoulli random variables 2-35
beta distribution 1-12, 1-13
betacdf 2-3, 2-24
betafit 2-3, 2-25
betainv 2-5, 2-26
betalike 2-3, 2-27
betapdf 2-4, 2-28
betarnd 2-6, 2-29
betastat 2-8, 2-30
binocdf 2-3, 2-31
binofit 2-3, 2-32
binoinv 2-5, 2-33
binomial distribution 1-12, 1-15
binopdf 2-4, 2-34
binornd 2-6, 2-35
binostat 2-8, 2-36
bootstrap 2-37
C
capability studies 1-113
capable 2-11, 2-41
capaplot 2-11
caseread 2-14, 2-44
casewrite 2-14, 2-45
cdf 1-6, 1-7
cdf 2-3, 2-46
census 2-14
Central Limit Theorem 1-31
centroid linkage 2-121
Chatterjee and Hadi example 1-72
chi2cdf 2-3, 2-47
chi2inv 2-5, 2-48
chi2pdf 2-4, 2-49
chi2rnd 2-6, 2-50
chi2stat 2-8, 2-51
chi-square distribution 1-12, 1-17
circuit boards 2-34
cities 2-14
City Block metric
in cluster analysis 2-179
classify 2-52
cluster 2-11, 2-53
coin 2-95
combnk 2-57
descriptive 2-2
descriptive statistics 1-42
Design of Experiments 1-115
D-optimal designs 1-118
fractional factorial designs 1-117
full factorial designs 1-116
discrete uniform distribution 1-12, 1-20
discrim 2-14
dissimilarity matrix
creating 1-51
distributions 1-2, 1-5
disttool 2-14, 2-68
DOE. See Design of Experiments
D-optimal designs 1-118
dummyvar 2-69
corrcoef 2-60
cov 2-61
erf 1-31
error function 1-31
errorbar 2-10, 2-70
estimate 1-128
Euclidean distance
in cluster analysis 2-179
EWMA charts 1-112
ewmaplot 2-11, 2-71
expcdf 2-3, 2-73
expfit 2-3, 2-74
expinv 2-5, 2-75
exponential distribution 1-12, 1-21
exppdf 2-4, 2-76
exprnd 2-6, 2-77
expstat 2-8, 2-78
extrapolated 2-194
crosstab 2-62
D
data 2-2
daugment 2-13, 2-63
dcovary 2-13, 2-64
grpstats 2-104
G
gamcdf 2-3, 2-88
gamfit 2-3, 2-89
gaminv 2-5, 2-90
gamlike 2-3, 2-91
H
hadamard 2-13
hald 2-14
harmmean 2-9, 2-105
histogram 1-130
hogg 2-14
Hotelling's T-squared 1-102
hougen 2-108
I
icdf 2-114
interpolated 2-224
interquartile range (iqr) 1-44
inverse cdf 1-6, 1-7
iqr 2-9, 2-117
K
kurtosis 2-9, 2-118
L
lawdata 2-14
least-squares 2-189
leverage 2-119
M
mad 2-9, 2-129
mahal 2-130
N
nanmax 2-9, 2-136
nanmean 2-9, 2-137
nanmedian 2-9, 2-138
nanmin 2-9, 2-139
NaNs 1-45
nanstd 2-9, 2-140
nansum 2-9, 2-141
nbincdf 2-4, 2-142
nbininv 2-6, 2-143
nbinpdf 2-5, 2-144
nbinrnd 2-7, 2-145
nbinstat 2-8, 2-146
ncfcdf 2-4, 2-147
ncfinv 2-6, 2-148
ncfpdf 2-5, 2-149
ncfrnd 2-7, 2-150
null 1-85
null hypothesis 1-85
P
pareto 2-10, 2-174
parts 2-15
using 1-51
percentiles 1-46
perms 2-181
plots 1-46, 2-2
poisscdf 2-4, 2-182
poissfit 2-3, 2-183
poissinv 2-6, 2-184
Poisson distribution 1-13, 1-33
poisspdf 2-5, 2-185
poissrnd 2-7, 2-186
poisstat 2-8, 2-187
polyconf 2-12, 2-188
polydata 2-15
polyfit 2-12, 2-189
polynomial 1-126
polytool 1-125, 2-14, 2-190
polyval 2-12, 2-191
popcorn 2-21
popcorn 2-15
prctile 2-9, 2-192
Q
qqplot 2-10, 2-194
QR decomposition 1-70
quality assurance 2-34
quantile-quantile plots 1-103, 1-106
R
random 2-195
S
S charts 1-111
sat 2-15
schart 2-11, 2-216
Scree plot 1-101
segmentation analysis 1-50
significance level 1-85
signrank 2-13, 2-218
signtest 2-13, 2-219
similarity matrix
creating 1-51
simulation 2-117
single linkage 2-121
skewness 1-103
skewness 2-9, 2-220
SPC. See Statistical Process Control
squareform 2-11, 2-221
T
t distribution 1-13, 1-36
tabulate 2-225
U
unbiased 2-222, 2-250
V
var 2-9, 2-250
W
weibfit 2-253
weibinv 2-6, 2-254
weiblike 2-255
weibpdf 2-5, 2-256
weibplot 2-10, 2-257
weibrnd 2-7, 2-258
weibstat 2-8, 2-259
X
x2fx 2-260
Z
zscore 2-263
ztest 2-13, 2-264