ERM 4b Final
Binomial Distribution

P(X) = \frac{n!}{X!\,(n-X)!}\, p^{X} (1-p)^{n-X}

P(X): probability of X successes given n and p
X: number of "successes" in sample (X = 0, 1, ..., n)
p: the probability of each "success"
n: sample size

Tails in 2 Tosses of Coin
X    P(X)
0    1/4 = .25
1    2/4 = .50
2    1/4 = .25
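A minimal Python sketch of this formula (standard library only); it reproduces the coin-toss table above:

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X) = n! / (X! (n-X)!) * p^X * (1-p)^(n-X)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Tails in 2 tosses of a fair coin (n = 2, p = 0.5)
for x in range(3):
    print(x, binom_pmf(x, 2, 0.5))   # 0.25, 0.50, 0.25
```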
Poisson Distribution
(named for Siméon Poisson)
Discrete events (“successes”) occurring in a given
area of opportunity (“interval”)
“Interval” can be time, length, surface area, etc.
The probability of a “success” in a given “interval” is
the same for all the “intervals”
The number of “successes” in one “interval” is
independent of the number of “successes” in other
“intervals”
The probability of two or more “successes” occurring
in an “interval” approaches zero as the “interval”
becomes smaller
E.g., # customers arriving in 15 minutes
E.g., # defects per case of light bulbs
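A short Python sketch of the Poisson probability mass function, P(X) = e^(-λ) λ^X / X!. The arrival rate λ = 3 per 15-minute interval is an illustrative assumption, not a value from the slides:

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X) = e^(-lambda) * lambda^X / X!"""
    return exp(-lam) * lam**x / factorial(x)

lam = 3.0  # assumed average of 3 customer arrivals per 15-minute interval
print(round(poisson_pmf(0, lam), 4))                         # P(no arrivals)
print(round(sum(poisson_pmf(x, lam) for x in range(5)), 4))  # P(at most 4 arrivals)
```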
The Normal Distribution
f(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{X-\mu}{\sigma}\right)^{2}}

f(X): density of random variable X
π ≈ 3.14159; e ≈ 2.71828
μ: population mean
σ: population standard deviation
X: value of random variable X
Many Normal Distributions
There are an Infinite Number of Normal Distributions
[Figure: any normal distribution of X (mean μ, standard deviation σ) can be transformed to the standardized normal Z, with μ_Z = 0 and σ_Z = 1.]
Finding Probabilities
Probability is the area under the curve:

P(c ≤ X ≤ d) = ?

[Figure: density f(X) with the area between X = c and X = d shaded.]
Standardizing Example
Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = 0.12

[Figure: the normal distribution with μ = 5, σ = 10 at X = 6.2 corresponds to the standardized normal distribution with μ_Z = 0, σ_Z = 1 at Z = 0.12.]
Example:
P(2.9 ≤ X ≤ 7.1) = .1664

Z = \frac{X - \mu}{\sigma} = \frac{2.9 - 5}{10} = -.21 \qquad Z = \frac{X - \mu}{\sigma} = \frac{7.1 - 5}{10} = .21

[Figure: the normal distribution with μ = 5, σ = 10 has area .0832 on each side of the mean between X = 2.9 and X = 7.1, matching the standardized normal distribution (μ_Z = 0, σ_Z = 1) between Z = -.21 and Z = .21.]

Recovering an X value for a known probability (Z = 0.30):

X = \mu + Z\sigma = 5 + (0.30)(10) = 8
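A small Python sketch (standard library only) that reproduces these calculations by standardizing and using the standard normal CDF:

```python
from math import erf, sqrt

def norm_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Normal CDF evaluated by standardizing: Phi((x - mu) / sigma)."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

mu, sigma = 5.0, 10.0
z = (6.2 - mu) / sigma                                  # standardizing: Z = 0.12
p = norm_cdf(7.1, mu, sigma) - norm_cdf(2.9, mu, sigma)
x = mu + 0.30 * sigma                                   # recovering X from Z = 0.30
print(z, round(p, 4), x)                                # 0.12, about .166 (slide: .1664), 8.0
```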
More Examples of Normal
Distribution Using PHStat
A set of final exam grades was found to be normally
distributed with a mean of 73 and a standard deviation of 8.
What is the probability of getting a grade no higher than 91
on this exam?
X ~ N(73, 8^2)

P(X ≤ 91) = ?

PHStat inputs: Mean = 73, Standard Deviation = 8
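Instead of PHStat, the same probability can be sketched with the normal CDF in Python; the expected result is Z = 2.25 and P(X ≤ 91) ≈ .9878:

```python
from math import erf, sqrt

def norm_cdf(x, mu=0.0, sigma=1.0):
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

# Exam grades: X ~ N(73, 8^2); probability of a grade no higher than 91.
mu, sigma = 73, 8
print((91 - mu) / sigma)                  # Z = 2.25
print(round(norm_cdf(91, mu, sigma), 4))  # about 0.9878
```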
[Figure: normal probability plots (X from 30 to 90 against Z from -2 to 2) for a rectangular distribution and a U-shaped distribution.]
Unbiasedness of X̄

[Figure: density f(X̄) for an unbiased estimator (centered on μ) versus a biased estimator (centered away from μ).]
Effect of Large Sample
For sampling with replacement:
As n increases, σ_X̄ decreases.

[Figure: the density f(X̄) is narrower for the larger sample size than for the smaller sample size.]
When the Population is Normal
Population distribution: μ = 50, σ = 10

Central tendency: μ_X̄ = μ
Variation: σ_X̄ = σ/√n

Sampling distributions:
n = 4: σ_X̄ = 5        n = 16: σ_X̄ = 2.5

[Figure: the population distribution of X and the sampling distributions of X̄ for n = 4 and n = 16, all centered at 50, with the sampling distributions narrower as n grows.]
When the Population is
Not Normal
Population distribution: μ = 50, σ = 10

Central tendency: μ_X̄ = μ
Variation: σ_X̄ = σ/√n

Sampling distributions:
n = 4: σ_X̄ = 5        n = 30: σ_X̄ = 1.8

[Figure: a non-normal population distribution of X and the sampling distributions of X̄ for n = 4 and n = 30; the sampling distribution becomes approximately normal as n grows.]
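A simulation sketch of these ideas in Python: the population below is an assumed uniform (non-normal) distribution chosen to have μ = 50 and σ = 10, not the slides' actual population, and the standard error of the sample mean should shrink roughly as σ/√n:

```python
import random
from statistics import stdev

random.seed(1)

# Uniform(32.68, 67.32) has mean 50 and standard deviation of about 10.
def sample_mean(n: int) -> float:
    return sum(random.uniform(32.68, 67.32) for _ in range(n)) / n

for n in (4, 30):
    means = [sample_mean(n) for _ in range(5000)]
    # Observed standard error should be near 10 / sqrt(n): about 5 and 1.8.
    print(n, round(stdev(means), 2))
```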
Level of Significance and the
Rejection Region
Rejection regions:

H0: μ ≤ 3.5   H1: μ > 3.5   — rejection region of size α in the upper tail
H0: μ = 3.5   H1: μ ≠ 3.5   — rejection regions of size α/2 in each tail

[Figure: sampling distributions centered at 0 with the shaded rejection regions described above.]
One-Tail Z Test for Mean
(σ Known)
Assumptions
Population is normally distributed
If not normal, requires large samples
Null hypothesis has ≤ or ≥ sign only
σ is known
Z Test Statistic
Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}
Rejection Region
H0: μ ≥ μ0, H1: μ < μ0 — reject H0 in the lower tail (area α); Z must be significantly below 0 to reject H0.
H0: μ ≤ μ0, H1: μ > μ0 — reject H0 in the upper tail (area α); small values of Z don't contradict H0, so don't reject H0.

[Figure: standard normal curves with shaded lower-tail and upper-tail rejection regions of area α.]
Reject and Do Not Reject
Regions
H0: μ ≤ 368   H1: μ > 368

α = .05, critical value Z = 1.645

The sample mean X̄ = 372.5 gives Z = 1.5, which falls in the do-not-reject region.

[Figure: rejection region to the right of Z = 1.645 (equivalently, X̄ above the corresponding critical value); Z = 1.5 lies in the do-not-reject region.]
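A minimal Python sketch of this one-tail Z test, using the values shown on the slides (μ0 = 368, X̄ = 372.5, σ = 15, n = 25):

```python
from math import sqrt

# One-tail Z test: H0: mu <= 368 vs H1: mu > 368 (sigma known).
mu0, xbar, sigma, n = 368.0, 372.5, 15.0, 25
z = (xbar - mu0) / (sigma / sqrt(n))   # 1.5
z_crit = 1.645                         # upper-tail critical value at alpha = .05
print(z, "reject H0" if z > z_crit else "do not reject H0")
```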
Connection to Confidence
Intervals
For X̄ = 372.5, σ = 15 and n = 25, the 95% confidence interval is:

372.5 - 1.96 \cdot \frac{15}{\sqrt{25}} \le \mu \le 372.5 + 1.96 \cdot \frac{15}{\sqrt{25}}

or

366.62 ≤ μ ≤ 378.38
We are 95% confident that the population mean is
between 366.62 and 378.38.
If this interval contains the hypothesized mean (368),
we do not reject the null hypothesis.
It does. Do not reject.
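The same interval can be sketched directly in Python:

```python
from math import sqrt

xbar, sigma, n, z = 372.5, 15.0, 25, 1.96   # 95% confidence
half_width = z * sigma / sqrt(n)
print(round(xbar - half_width, 2), round(xbar + half_width, 2))   # 366.62 378.38
```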
t Test: σ Unknown
Assumption
Population is normally distributed
If not normal, requires a large sample
σ is unknown
t Test Statistic with n-1 Degrees of Freedom
t = \frac{\bar{X} - \mu}{S/\sqrt{n}}
Example Solution: One-Tail
H0: μ ≤ 368   H1: μ > 368
α = 0.01, n = 36, df = 35

Test statistic:

t = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{372.5 - 368}{15/\sqrt{36}} = 1.80

Critical value: t_{35} = 2.4377

Decision: Do not reject at α = .01.

Conclusion: Insufficient evidence that the true mean is more than 368.

[Figure: t_{35} distribution with the rejection region beyond 2.4377; the test statistic 1.80 falls in the do-not-reject region.]
p-Value Solution
The p-value is between .025 and .05, which is greater than α = 0.01, so do not reject H0.

[Figure: t_{35} distribution with the rejection region beyond the critical value 2.4377 (α = 0.01); the test statistic 1.80 is in the do-not-reject region.]
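A Python sketch of this one-tail t test and its p-value; it assumes SciPy is available for the t distribution:

```python
from math import sqrt
from scipy import stats   # assumed available

# One-tail t test: H0: mu <= 368 vs H1: mu > 368 (sigma unknown).
mu0, xbar, s, n, alpha = 368.0, 372.5, 15.0, 36, 0.01
t_stat = (xbar - mu0) / (s / sqrt(n))       # 1.80
t_crit = stats.t.ppf(1 - alpha, df=n - 1)   # about 2.4377
p_value = stats.t.sf(t_stat, df=n - 1)      # about .04, between .025 and .05
print(round(t_stat, 2), round(t_crit, 4), round(p_value, 4))
print("reject H0" if t_stat > t_crit else "do not reject H0")
```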
Potential Pitfalls and Ethical Issues
Data Collection Method is Not Randomized to Reduce
Selection Biases
Human Subjects Are Manipulated Without Informed Consent
Data Snooping is Used to Choose between One-Tail
and Two-Tail Tests, and to Determine the Level of
Significance
Potential Pitfalls and Ethical Issues (continued)
Assumptions
Samples are randomly and independently drawn
This condition must be met
40% of the time you will reject the null hypothesis of equal means
μ1 = μ2 = μ3
One-Way ANOVA
(Treatment Effect Present)
H0: μ1 = μ2 = ... = μc
H1: Not all μi are the same

The null hypothesis is NOT true.

[Figure: the group distributions do not all have the same mean (μ1, μ2, μ3 differ).]
One-Way ANOVA
(Partition of Total Variation)
Total variation:

SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2 = (X_{11} - \bar{X})^2 + (X_{21} - \bar{X})^2 + \cdots + (X_{n_c c} - \bar{X})^2

where \bar{X} is the grand mean of all n observations.

[Figure: response X plotted for each group, with deviations measured from the grand mean.]
One-Way ANOVA F Test Statistic

F = \frac{MSA}{MSW}

MSA is the mean squares among groups
MSW is the mean squares within groups

Degrees of freedom: df1 = c - 1, df2 = n - c
One-Way ANOVA
Summary Table
Source of        Degrees of   Sum of            Mean Squares        F
Variation        Freedom      Squares           (Variance)          Statistic

Among (Factor)   c - 1        SSA               MSA = SSA/(c - 1)   MSA/MSW
Within (Error)   n - c        SSW               MSW = SSW/(n - c)
Total            n - 1        SST = SSA + SSW
Features of One-Way ANOVA F Statistic
The F Statistic is the Ratio of the Among Estimate
of Variance and the Within Estimate of Variance
The ratio must always be positive
df1 = c -1 will typically be small
df2 = n - c will typically be large
The Ratio Should be Close to 1 if the Null is True
If the Null Hypothesis is False
The numerator should be greater than the denominator
The ratio should be larger than 1
One-Way ANOVA F Test
Example
As production manager, you want to see if 3 filling machines have different mean filling times. You assign 15 similarly trained & experienced workers, 5 per machine, to the machines. At the .05 significance level, is there a difference in mean filling times?

Machine1   Machine2   Machine3
25.40      23.40      20.00
26.31      21.80      22.20
24.10      23.50      19.75
23.74      22.75      20.60
25.10      21.60      20.40
One-Way ANOVA
Example: Scatter Diagram
[Figure: scatter diagram of the filling times for the three machines.]
One-Way ANOVA Example
Computations
Machine1   Machine2   Machine3
25.40      23.40      20.00          X̄1 = 24.93     nj = 5
26.31      21.80      22.20          X̄2 = 22.61     c = 3
24.10      23.50      19.75          X̄3 = 20.59     n = 15
23.74      22.75      20.60
25.10      21.60      20.40          X̄ (grand mean) = 22.71
Critical value: F = 3.89 (df1 = 2, df2 = 12, α = 0.05)

Decision: Reject at α = 0.05.

Conclusion: There is evidence that at least one μi differs from the rest.

[Figure: F distribution with the rejection region beyond 3.89.]
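A Python sketch of the one-way ANOVA computations for the filling-machine data; the resulting F statistic (about 25.6) is compared with the critical value 3.89 from the slide:

```python
machines = {
    "Machine1": [25.40, 26.31, 24.10, 23.74, 25.10],
    "Machine2": [23.40, 21.80, 23.50, 22.75, 21.60],
    "Machine3": [20.00, 22.20, 19.75, 20.60, 20.40],
}

n = sum(len(v) for v in machines.values())                     # 15
c = len(machines)                                              # 3
grand_mean = sum(x for v in machines.values() for x in v) / n  # 22.71

ssa = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in machines.values())
ssw = sum((x - sum(v) / len(v)) ** 2 for v in machines.values() for x in v)
msa, msw = ssa / (c - 1), ssw / (n - c)
f_stat = msa / msw                                             # about 25.6
print(round(f_stat, 2), "reject H0" if f_stat > 3.89 else "do not reject H0")
```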
Two-Way ANOVA
Assumptions
Normality
Populations are normally distributed
Homogeneity of Variance
Populations have equal variances
Independence of Errors
Independent random samples are drawn
Two-Way ANOVA
Total Variation Partitioning
[Figure: partition of the total variation; among the components is SSB, the variation due to factor B, with d.f. = c - 1.]

Simple Linear Regression Model

Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i

[Figure: the population regression line (the conditional mean μ_{Y|X}) relating the dependent (response) variable Y to the independent (explanatory) variable X.]
Simple Linear Regression Model
(continued)
Observed value of Y:  Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i

ε_i = random error

Conditional mean:  \mu_{Y|X} = \beta_0 + \beta_1 X_i

[Figure: an observed value of Y at X_i, with the random error ε_i measured from the population regression line.]
Linear Regression Equation
Sample regression line provides an estimate of
the population regression line as well as a
predicted value of Y
Sample regression equation:

Y_i = b_0 + b_1 X_i + e_i

b_0 = sample Y intercept, b_1 = sample slope coefficient, e_i = residual

b_0 and b_1 are chosen to minimize the sum of squared residuals:

\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2

b_0 provides an estimate of \beta_0
b_1 provides an estimate of \beta_1
Linear Regression Equation
(continued)
Sample:  Y_i = b_0 + b_1 X_i + e_i        Population:  Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i

Sample regression line:  \hat{Y}_i = b_0 + b_1 X_i

Population regression line (conditional mean):  \mu_{Y|X} = \beta_0 + \beta_1 X_i

[Figure: observed value of Y, residual e_i, intercept b_0, and slope b_1 plotted against X.]
Interpretation of the Slope
and Intercept
β1 measures the change in E(Y|X) per unit change in X: the change in the average value of Y as a result of a one-unit change in X.
Interpretation of the Slope
and Intercept (continued)
b1 is the estimated change in Ê(Y|X) per unit change in X: the estimated change in the average value of Y as a result of a one-unit change in X.
Simple Linear Regression:
Example
You wish to examine the linear dependency of the annual sales of produce stores on their sizes in square footage. Sample data for 7 stores were obtained. Find the equation of the straight line that fits the data best.

Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760
Scatter Diagram: Example
[Figure: scatter diagram of Annual Sales ($000), 0 to 12,000, versus Square Feet, 0 to 6,000, for the 7 stores.]
Excel Output
Simple Linear Regression
Equation: Example
\hat{Y}_i = b_0 + b_1 X_i = 1636.415 + 1.487 X_i
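A Python sketch of the least-squares computation for the 7-store data; it should reproduce approximately the intercept and slope above:

```python
# Produce-store data: square feet (X) and annual sales in $000 (Y).
X = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
Y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar
print(round(b0, 3), round(b1, 3))   # roughly 1636.4 and 1.487
```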
Residual Analysis

Purposes
Examine linearity
Evaluate violations of assumptions
Graphical Analysis of Residuals
Plot residuals vs. X and time
Residual Analysis for Linearity
[Figure: plots of Y vs. X and residuals e vs. X for a not-linear relationship (curved residual pattern) and a linear relationship (randomly scattered residuals).]
Residual Analysis for
Homoscedasticity
[Figure: plots of Y vs. X and standardized residuals (SR) vs. X under heteroscedasticity (spread changes with X) and homoscedasticity (constant spread).]
Residual Analysis: Excel Output
for Produce Stores Example
Excel output:

Observation   Predicted Y    Residuals
1             4202.344417    -521.3444173
2             3928.803824    -533.8038245
3             5822.775103     830.2248971
4             9894.664688    -351.6646882
5             3557.14541     -239.1454103
6             4918.90184      644.0981603
7             3588.364717     171.6352829

[Figure: residual plot against Square Feet.]
Residual Analysis for
Independence
The Durbin-Watson Statistic
Used when data is collected over time to detect
autocorrelation (residuals in one time period are
related to residuals in another period)
Measures violation of independence assumption
D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}

D should be close to 2. If not, examine the model for autocorrelation.
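A direct Python implementation sketch of this formula; the residual series below simply reuses the produce-store residuals for illustration, even though Durbin-Watson is meaningful only for time-ordered data:

```python
def durbin_watson(residuals):
    """D = sum_{i=2..n} (e_i - e_{i-1})^2 / sum_{i=1..n} e_i^2."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

e = [-521.3, -533.8, 830.2, -351.7, -239.1, 644.1, 171.6]
print(round(durbin_watson(e), 2))   # values far from 2 suggest autocorrelation
```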
Sample Observations from
Various r Values
[Figure: five scatter plots of Y vs. X illustrating r = -1, r = -.6, r = 0, r = .6, and r = 1.]
Features of ρ and r
Unit Free
Range between -1 and 1
The Closer to -1, the Stronger the Negative
Linear Relationship
The Closer to 1, the Stronger the Positive
Linear Relationship
The Closer to 0, the Weaker the Linear
Relationship
t Test for Correlation
Hypotheses
H0: ρ = 0 (no correlation)
H1: ρ ≠ 0 (correlation)

Test Statistic

t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}}

where

r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}
Example: Produce Stores
Is there any evidence of a linear relationship between the annual sales of a store and its square footage at the .05 level of significance?

From the Excel printout (Regression Statistics):

Multiple R          0.9705572
R Square            0.94198129
Adjusted R Square   0.93037754
Standard Error      611.751517
Observations        7

H0: ρ = 0 (no association)
H1: ρ ≠ 0 (association)
α = .05
df = 7 - 2 = 5
Example: Produce Stores
Solution
t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{.9706 - 0}{\sqrt{\dfrac{1 - .9420}{5}}} = 9.0099

Critical values: ±2.5706 (α/2 = .025 in each tail, df = 5)

Decision: Reject H0.

Conclusion: There is evidence of a linear relationship at the 5% level of significance.

The value of the t statistic is exactly the same as the t statistic value for the test on the slope coefficient.

[Figure: t distribution with rejection regions beyond ±2.5706.]
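A Python sketch that recomputes r and the t statistic from the 7-store data; the results should match the .9706 and 9.0099 shown above:

```python
from math import sqrt

X = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
Y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
sxx = sum((x - x_bar) ** 2 for x in X)
syy = sum((y - y_bar) ** 2 for y in Y)

r = sxy / sqrt(sxx * syy)                   # about .9706
t = (r - 0) / sqrt((1 - r ** 2) / (n - 2))  # about 9.01; compare with +/- 2.5706
print(round(r, 4), round(t, 2))
```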
Estimation of Mean Values
Confidence interval estimate for μ_{Y|X=Xi}, the mean of Y given a particular Xi:

\hat{Y}_i \pm t_{n-2}\, S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}

t_{n-2} is the t value from the table with df = n - 2, and S_{YX} is the standard error of the estimate. The size of the interval varies according to the distance of Xi from the mean X̄.
Prediction of Individual Values
Prediction Interval for Individual Response
Yi at a Particular Xi
\hat{Y}_i \pm t_{n-2}\, S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}
Interval Estimates for Different
Values of X
[Figure: fitted line with the confidence interval for the mean of Y (narrower) and the prediction interval for an individual Yi (wider); both widen as X moves away from X̄, shown at a given X.]
Example: Produce Stores
Data for 7 stores:

Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760

Consider a store with 2,000 square feet.

Regression model obtained:  \hat{Y}_i = 1636.415 + 1.487 X_i
Estimation of Mean Values:
Example
Confidence interval estimate for μ_{Y|X=Xi}:

\hat{Y}_i \pm t_{n-2}\, S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} = 4610.45 \pm 612.66

3997.02 ≤ μ_{Y|X=Xi} ≤ 5222.34
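A Python sketch of this interval, using the regression equation, the standard error of the estimate S_YX = 611.75 from the Excel output, and t5 = 2.5706; small differences from the slide come from rounding the coefficients:

```python
from math import sqrt

X = [1726, 1542, 2816, 5555, 1292, 2208, 1313]   # square feet for the 7 stores
n = len(X)
x_bar = sum(X) / n
ssx = sum((x - x_bar) ** 2 for x in X)

x_i = 2000
y_hat = 1636.415 + 1.487 * x_i      # about 4610.4
s_yx, t_crit = 611.7515, 2.5706     # standard error of estimate, t with df = 5
half_width = t_crit * s_yx * sqrt(1 / n + (x_i - x_bar) ** 2 / ssx)
print(round(y_hat - half_width, 2), round(y_hat + half_width, 2))
# close to the slide's 3997.02 to 5222.34
```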
What is a Time Series?
Time-series components: Trend, Cyclical, Seasonal, Random
Trend Component
[Figure: sales plotted over time showing a long-run trend.]

Cyclical Component
[Figure: sales plotted over time showing multi-year cyclical swings.]

Seasonal Component
[Figure: sales plotted by season (Winter, Spring, Summer, Fall) over time, monthly or quarterly.]
Random or Irregular Component
[Figure: sales plotted against time, showing random (irregular) fluctuations.]
Example: Quarterly Retail Sales with
Seasonal Components Removed
[Figure: quarterly retail sales Y(t) plotted against time with the seasonal components removed.]
Multiplicative Time-Series Model
Y_i = T_i \times S_i \times C_i \times I_i

Ti = Trend, Si = Seasonal, Ci = Cyclical, Ii = Irregular
Moving Averages
[Figure: annual sales data for 1994-1999 with a smoothed moving-average series overlaid.]
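A minimal Python sketch of a centered moving average; the sales series below is illustrative only, since the chart's underlying values are not given:

```python
def moving_average(series, window=3):
    """Centered moving average; None where the window is incomplete."""
    half = window // 2
    out = []
    for i in range(len(series)):
        if i < half or i >= len(series) - half:
            out.append(None)
        else:
            out.append(sum(series[i - half:i + half + 1]) / window)
    return out

sales = [2, 5, 2, 2, 7, 6]              # illustrative annual sales
print(moving_average(sales, window=3))  # smoothed series
```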
Linear Trend Model
Use the method of least squares to obtain the
linear trend forecasting equation:
\hat{Y}_i = b_0 + b_1 X_i

[Figure: fitted linear trend over coded years X = 0 to 5, projected to year 2001.]
The Quadratic Trend Model
Use the method of least squares to obtain
the quadratic trend forecasting equation:
\hat{Y}_i = b_0 + b_1 X_i + b_2 X_i^2

Year   Coded X   Sales (Y)
95     0         2
96     1         5
97     2         2
98     3         2
99     4         7
00     5         6
The Quadratic Trend Model
(continued)
\hat{Y}_i = b_0 + b_1 X_i + b_2 X_i^2 = 2.857 - .33 X_i + .214 X_i^2

Excel output (coefficients):

Intercept   2.85714286
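A Python sketch (assuming NumPy is available) that refits the quadratic trend to the coded data above; it should reproduce roughly 2.857, -.33, and .214:

```python
import numpy as np   # assumed available

# Coded year X and sales Y from the quadratic-trend example.
x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 5, 2, 2, 7, 6], dtype=float)

b2, b1, b0 = np.polyfit(x, y, deg=2)              # highest power first
print(round(b0, 3), round(b1, 3), round(b2, 3))   # roughly 2.857, -0.329, 0.214

# Forecast for the next coded year (X = 6):
print(round(b0 + b1 * 6 + b2 * 6 ** 2, 2))
```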