Simple Linear Regression Part I - Updated FA18

The document discusses simple linear regression and how it can be used to predict a variable from another correlated variable. It provides examples of using linear regression to predict psychological symptom scores, first from the mean score alone, and then more precisely by also considering an individual's stress level.

ERM 680

Simple Linear Regression


Part I
Simple Linear Regression
Students should be able to:
• Calculate and describe b, a, Ŷ, r², and the standard error (given the appropriate equations)
• Construct and use a regression equation
• Describe error variance
• Describe the meaning and uses of r²
What is Regression?
• Regression is the use of correlational data to predict one variable from another.
• Regression is also the plotting of a "best-fitting" line through a scatter plot.
• The plotted regression line is the graphical representation of the prediction.
Predicting a Score
• If we know nothing else about the variable we want to predict, our best prediction is the mean of that variable.
• Remember, if we randomly draw a score from a population, chances are it will be closer to the mean of the population than far from it.
Predicting a Score
• Suppose all we know is that the mean symptom score of students at UNCG is 90.7 (the score is a measure of psychological problems).
• If we know a student comes from UNCG and are asked about his symptom score (how many psychological problems he might have), the closest we can say is "about 90.7," because that's the mean, the best estimate we can have.

[Scatter plot: symptoms (y-axis, roughly 75–175) vs. stress (x-axis, 0–75)]
Predicting a Score
• Let's assume we want to find a way to predict the number of symptoms students in ERM 680 experience.
• We could collect scores on a symptom scale from a sample and compute a mean to estimate the average in the population.
• However, this average may be higher or lower than the actual scores for many individuals.
• Assume we also had scores on the level of stress for these individuals. We could then predict the symptom scores for each individual more precisely if we can establish a relationship between the number of symptoms reported and the associated stress level.

[Scatter plot: symptoms vs. stress, with stress values 18 and 42 marked on the x-axis]
Predicting a Score
• We can predict a symptom score "given" a specific stress score.
• E.g., if a student's stress score is 42, his estimated symptom score may be 105.
• E.g., if a student's stress score is 18, his estimated symptom score may be 92.

[Scatter plot: symptoms vs. stress, with the predicted values read off the line at stress = 18 and stress = 42]
But how?
• We are looking for the best-fitting line. One logical way to find the best line is to find one that minimizes errors of prediction.
• If Y represents the observed response value and Ŷ (Y-hat) represents the predicted value (the value on the line), the difference between them is called the residual, represented as (Y − Ŷ).
• Like deviations in the computation of the standard deviation, we cannot simply sum the residuals, because some of them are positive and some are negative. To estimate the overall error, we have to square the residuals before finding the sum. The error in predicting Y (i.e., Y − Ŷ) is derived from the sum of squared residuals.
But how?...
• The goal of regression is to find the best-fitting line, i.e., the one that minimizes the sum of squared residuals, or SS_error, represented by Σ(Y − Ŷ)².
• Our approach to finding the best line is called "least squares regression."
• One of the conditions for the regression line is that it must always pass through the point (mean of X, mean of Y), that is, (X̄, Ȳ).
The Regression Line
• The line that meets this criterion can actually be derived numerically.
• Think back to your high school math class... the equation for a straight line is:

Ŷ = a + bX

where Ŷ is the predicted value of Y, a is the intercept, b is the slope of the regression line, and X is the value of X.
The Regression Line

Ŷ = a + bX

• When our values for Y are predicted or estimated, we indicate this with Ŷ (Y-hat).
• When the values are actual observed scores, they're indicated with Y.
• a is the intercept, or the value of Ŷ when X = 0.
• b is the slope, or the rate of change in Y: for every one-unit increase in X, we will see an increase or decrease of b units in the predicted Ŷ. Whether it is an increase or a decrease depends on the sign of the slope, b.
The Regression Equation

Ŷ = bX + a

where
Ŷ = pronounced "Y-hat," the predicted value of Y
b = the slope of the regression line
a = the intercept (i.e., the predicted value of Y when X = 0)

a and b are called unstandardized regression coefficients.
Example Regression

[Scatter plot: Test Performance (y-axis, roughly 8–18) vs. Anxiety (x-axis, 0–20), with a fitted regression line; R² = 0.5965]
Defining the Line
• Finding the regression line is the same as finding the a and b that define the line.
Finding b
The slope, b, is derived from the covariance of X and Y and the variance of X:

b = cov_XY / s_X²

Notice that the equation for b is similar to the correlation equation. You can also find b using r:

r = cov_XY / (s_X s_Y)        b = r (s_Y / s_X)
Finding a
The intercept, a, is derived from the slope, b:

a = Ȳ − bX̄
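To make the two formulas concrete, here is a minimal numpy sketch that computes b and a exactly as defined above. The data arrays are hypothetical, chosen only for illustration.

import numpy as np

# Hypothetical (x, y) data for illustration -- not the course dataset.
x = np.array([10., 14., 18., 22., 26., 30., 34., 38., 42., 46.])
y = np.array([88., 90., 95., 93., 99., 104., 101., 108., 105., 112.])

cov_xy = np.cov(x, y, ddof=1)[0, 1]   # sample covariance of X and Y
var_x  = np.var(x, ddof=1)            # sample variance of X (s_X squared)

b = cov_xy / var_x                    # slope: b = cov_XY / s_X^2
a = y.mean() - b * x.mean()           # intercept: a = Ybar - b * Xbar

print(f"b = {b:.3f}, a = {a:.3f}")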
Another View of the Regression Equation

Ŷ = r (s_Y / s_X)(X − X̄) + Ȳ

• (X − X̄) is the difference between X and its mean.
• s_Y / s_X puts that difference on Y's metric; like a z score, it reflects how many SDs X is above or below its mean.
• r adjusts the difference in proportion to the strength of the relationship.
• Ȳ, the mean of Y, is your estimate if you know nothing else.
Plotting a Regression Line
• To plot a regression line, you solve the regression equation for a few values of X.
• Place the points on your graph.
• Connect the dots.
• Or just have SPSS do it! (A minimal plotting sketch follows.)
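If SPSS isn't at hand, the same recipe works in a few lines of matplotlib: fit the line, solve the equation at two X values, and connect them. A minimal sketch with hypothetical data:

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data for illustration.
x = np.array([10., 14., 18., 22., 26., 30., 34., 38., 42., 46.])
y = np.array([88., 90., 95., 93., 99., 104., 101., 108., 105., 112.])

b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)   # slope
a = y.mean() - b * x.mean()                           # intercept

x_line = np.array([x.min(), x.max()])   # two X values define the line
plt.scatter(x, y, label="observed")
plt.plot(x_line, a + b * x_line, color="red", label="regression line")
plt.xlabel("X"); plt.ylabel("Y"); plt.legend()
plt.show()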
Scatter Plot of SATV vs. Score (Table 9.6 in Howell)

Compare the two charts and comment.

[Scatter plot: SATV vs. Score (Table 9.6), with fitted regression line; Y-intercept = 373.736]

• The equation in SPSS, Y = 3.74E2 + 4.87X, should actually be written as:
• Ŷ = 374 + 4.87X
SPSS Example

b = cov_XY / s_X² = 220.317 / 6.729² = 4.865

a = Ȳ − bX̄ = 598.57 − (4.865 × 46.21) = 373.73

[SPSS coefficients output: the slope is significantly different from 0]
SPSS Example

Slope: b = 4.865
Intercept: a = 373.73

Ŷ = 4.865X + 373.73

For every 1-point increase in the score X, we expect a 4.865-point increase in SATV scores.
Predicting Y Given X
You can now use your equation to make a prediction:

Ŷ = 4.865X + 373.73

When X = 40, we predict Ŷ = 4.865(40) + 373.73 = 568.33.
When X = 53, we predict Ŷ = 4.865(53) + 373.73 = 631.58.

[Scatter plot of SATV vs. Score, with Ŷ = 631.58 marked at X = 53]

ID (i)   Score Xi   Yi    Ŷi       Residual
18       53         700   631.58    68.42
24       53         580   631.58   -51.58
27       53         630   631.58    -1.58
28       53         620   631.58   -11.58
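The same prediction can be scripted. A small Python sketch using the fitted SATV equation from this slide (the helper name predict_satv is ours, not from the course materials):

# Fitted equation from the slides: Yhat = 4.865*X + 373.73
def predict_satv(x):
    """Predicted SATV score for a given Score value X."""
    return 4.865 * x + 373.73

print(predict_satv(40))         # ~568.33
print(predict_satv(53))         # ~631.58

# Residual for ID 18 (X = 53, observed SATV = 700):
print(700 - predict_satv(53))   # ~68.42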
Errors of Prediction
• When you predict a score using regression, you may not obtain the exact observed score, i.e., Y ≠ Ŷ.
• The stronger the correlation r, the more accurate your prediction.
• The difference between the predicted and actual value is called the residual:

residual = (Y − Ŷ)
Residuals
• What do you think the residuals should add up to?

sum of residuals = Σ(Y − Ŷ)

sum of squared residuals = Σ(Y − Ŷ)²

error variance = s²_Y−Ŷ = Σ(Y − Ŷ)² / (n − 2)

Note: the error variance is also written as s²_Y·X.
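A short numpy sketch, on hypothetical data, confirming the two facts above: the residuals from a least-squares fit sum to (essentially) zero, and the error variance divides the sum of squared residuals by n − 2.

import numpy as np

# Hypothetical data; any least-squares fit will do for the demonstration.
x = np.array([3., 5., 7., 9., 11., 13.])
y = np.array([8., 11., 13., 18., 19., 24.])

b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
a = y.mean() - b * x.mean()
y_hat = a + b * x

residuals = y - y_hat
print(residuals.sum())              # ~0: residuals always sum to zero
ss_error = np.sum(residuals ** 2)   # SS_error = sum of squared residuals
err_var = ss_error / (len(y) - 2)   # error variance: n - 2 in the denominator
see = np.sqrt(err_var)              # standard error of the estimate
print(err_var, see)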
Table of Residuals (Prediction Errors)

ID   Score (X)   SATV (Y, observed)   Predicted SATV (Ŷ)   Residual (Y − Ŷ)
1    58          590                  655.9096             -65.9096
2    48          580                  607.2590             -27.2590
3    34          550                  539.1483              10.8517
4    38          550                  558.6085              -8.6085
5    41          560                  573.2037             -13.2037
6    55          800                  641.3144             158.6856
7    43          650                  582.9338              67.0663
8    47          660                  602.3940              57.6060
9    47          600                  602.3940              -2.3940
10   46          610                  597.5289              12.4711
11   40          620                  568.3386              51.6614
12   39          560                  563.4735              -3.4735
13   50          570                  616.9891             -46.9891
14   46          510                  597.5289             -87.5289
15   48          590                  607.2590             -17.2590
16   41          490                  573.2037             -83.2037
17   43          580                  582.9338              -2.9338
18   53          700                  631.5843              68.4157
19   60          690                  665.6397              24.3603
20   44          600                  587.7988              12.2012
21   49          580                  612.1241             -32.1241
22   33          590                  534.2832              55.7168
23   40          540                  568.3386             -28.3386
24   53          580                  631.5843             -51.5843
25   45          600                  592.6639               7.3361
26   47          560                  602.3940             -42.3940
27   53          630                  631.5843              -1.5843
28   53          620                  631.5843             -11.5843
ID   Residual (Y − Ŷ)   Squared Residual (Y − Ŷ)²
1    -65.91              4344.08
2    -27.26               743.05
3     10.85               117.76
4     -8.61                74.11
5    -13.20               174.34
6    158.69             25181.12
7     67.07              4497.88
8     57.61              3318.45
9     -2.39                 5.73
10    12.47               155.53
11    51.66              2668.90
12    -3.47                12.07
13   -46.99              2207.98
14   -87.53              7661.31
15   -17.26               297.87
16   -83.20              6922.86
17    -2.93                 8.61
18    68.42              4680.71
19    24.36               593.43
20    12.20               148.87
21   -32.12              1031.96
22    55.72              3104.36
23   -28.34               803.08
24   -51.58              2660.94
25     7.34                53.82
26   -42.39              1797.25
27    -1.58                 2.51
28   -11.58               134.20

Sum: Σ(Y − Ŷ) = 0.00     SS_error = SS_Y−Ŷ = Σ(Y − Ŷ)² = 73402.75

Error variance (variance of residuals) = Σ(Y − Ŷ)² / (n − 2) = 2823.18

Standard error of estimate: s_Y·X = s_Y−Ŷ = √[Σ(Y − Ŷ)² / (n − 2)] = 53.13
Error Variance
• Error variance is used to describe how much error is in the predictions.
• Just as the variance describes variability around the mean of a distribution, in regression the error variance describes variability around the predicted values.
• If our predictions were perfect (i.e., r = 1), there would be no error at all and the error variance would be zero.
• The regression coefficients minimize the error variance.
Conditional Distribution of Errors / Assumption of Homoscedasticity

[Figures: conditional distributions of the errors around the regression line at each value of X; under homoscedasticity these distributions have equal variance across values of X]
Standard Error of the Estimate (SEE)
• As with the regular variance, there is a standard-deviation measure for the error variance: the standard error of the estimate.

s_Y·X = s_Y−Ŷ = √(error variance) = √[Σ(Y − Ŷ)² / (n − 2)]
SEE as a Measure of the Accuracy of Prediction
SEE is one way we can assess how well our regression equation is working.
If the regression equation works well, it's safe to assume the predicted values of Y should be very close to the real values. Referring to the equation on the previous slide: as the predicted and actual values get very close, SEE gets smaller. A smaller SEE implies more accurate predictions.
It should be made clear that in practice this value can be quite sizeable. Predicting with an independent variable is better than predicting without one, but it is not without error. We rarely have a dataset that falls exactly on one straight line.
r² as a Measure of Accuracy of Prediction
• SS_error again: remember Σ(Y − Ŷ)² is the sum of the squared residuals. Residuals are "leftovers."
• SS_error is the amount of the variance in the Y scores left unaccounted for even when X is used as a predictor (it is the amount of variance not explained by X).
• We can also sum up the squared differences between Y and Ȳ. This is the difference between the actual observed values and the grand mean of Y.
• We will denote it as SS_Y, or Σ(Y − Ȳ)².
• This is the total variance, or the variance in the original Y (without taking X into account). It appears as SS_total in SPSS.
ID   SATV (Y, observed)   Deviation from Mean (Y − Ȳ)   Squared Deviation (Y − Ȳ)²
1    590                   -8.5714                          73.4694
2    580                  -18.5714                         344.8980
3    550                  -48.5714                        2359.1837
4    550                  -48.5714                        2359.1837
5    560                  -38.5714                        1487.7551
6    800                  201.4286                       40573.4694
7    650                   51.4286                        2644.8980
8    660                   61.4286                        3773.4694
9    600                    1.4286                           2.0408
10   610                   11.4286                         130.6122
11   620                   21.4286                         459.1837
12   560                  -38.5714                        1487.7551
13   570                  -28.5714                         816.3265
14   510                  -88.5714                        7844.8980
15   590                   -8.5714                          73.4694
16   490                 -108.5714                       11787.7551
17   580                  -18.5714                         344.8980
18   700                  101.4286                       10287.7551
19   690                   91.4286                        8359.1837
20   600                    1.4286                           2.0408
21   580                  -18.5714                         344.8980
22   590                   -8.5714                          73.4694
23   540                  -58.5714                        3430.6122
24   580                  -18.5714                         344.8980
25   600                    1.4286                           2.0408
26   560                  -38.5714                        1487.7551
27   630                   31.4286                         987.7551
28   620                   21.4286                         459.1837

Ȳ = 598.5714     Σ(Y − Ȳ) = 0.00     SS_Y = Σ(Y − Ȳ)² = 102342.86   (SS_total in SPSS)
ID   SATV (Ŷ, predicted)   Deviation from Mean (Ŷ − Ȳ)   Squared Deviation (Ŷ − Ȳ)²
1    655.9096               57.3382                        3287.666
2    607.2590                8.6876                          75.474
3    539.1483              -59.4231                        3531.108
4    558.6085              -39.9629                        1597.036
5    573.2037              -25.3677                         643.522
6    641.3144               42.7430                        1826.962
7    582.9338              -15.6376                         244.535
8    602.3940                3.8226                          14.612
9    602.3940                3.8226                          14.612
10   597.5289               -1.0425                           1.087
11   568.3386              -30.2328                         914.024
12   563.4735              -35.0979                        1231.865
13   616.9891               18.4177                         339.211
14   597.5289               -1.0425                           1.087
15   607.2590                8.6876                          75.474
16   573.2037              -25.3677                         643.522
17   582.9338              -15.6376                         244.535
18   631.5843               33.0129                        1089.850
19   665.6397               67.0683                        4498.153
20   587.7988              -10.7726                         116.050
21   612.1241               13.5527                         183.675
22   534.2832              -64.2882                        4132.976
23   568.3386              -30.2328                         914.024
24   631.5843               33.0129                        1089.850
25   592.6639               -5.9075                          34.899
26   602.3940                3.8226                          14.612
27   631.5843               33.0129                        1089.850
28   631.5843               33.0129                        1089.850

Mean of Ŷ = Ȳ = 598.5714     Σ(Ŷ − Ȳ) = 0.00     SS_Ŷ = Σ(Ŷ − Ȳ)² = 28940.12

This is SS_regression in SPSS, also referred to as SS_model. Since the regression line passes through the mean of Y, the mean of the predicted Y values (Ŷ-bar) is the same as the mean of the observed Y.
R² as a Measure of Accuracy of Prediction
• It can be shown that SS_error = SS_Y (1 − r²).
• Expanding and rearranging, we have r² = (SS_Y − SS_error) / SS_Y.
• SS_Y, the total variability in Y, can be decomposed into two parts:

SS_total = SS_regression + SS_error
SS_Y = SS_Ŷ + SS_Y−Ŷ

a. The part that can be explained/predicted, SS_Ŷ: accounted for by (or attributable to) X, as a result of the linear relationship between X and Y.
b. The part of the variability in Y that is independent of X: SS_Y−Ŷ.
Partitioning of the Sum of Squares and r²

r² = SS_Ŷ / SS_Y = 28940.123 / 102342.86 = .283

Note that

r² = SS_regression / SS_total = (SS_total − SS_error) / SS_total

i.e.,

SS_total = SS_regression + SS_error,  or  SS_Y = SS_Ŷ + SS_Y−Ŷ

[Diagram: the total sum of squares (102342.86) partitioned into SS_Ŷ (28940.123) and SS_error (73402.734)]
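The partition can be verified numerically. A sketch with hypothetical data, computing the three sums of squares and checking both SS_total = SS_regression + SS_error and r² = SS_regression / SS_total:

import numpy as np

# Hypothetical data for illustration.
x = np.array([3., 5., 7., 9., 11., 13.])
y = np.array([8., 11., 13., 18., 19., 24.])

b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
a = y.mean() - b * x.mean()
y_hat = a + b * x

ss_total = np.sum((y - y.mean()) ** 2)       # SS_Y
ss_reg   = np.sum((y_hat - y.mean()) ** 2)   # SS_Yhat (SS_regression)
ss_error = np.sum((y - y_hat) ** 2)          # SS_(Y - Yhat)

print(np.isclose(ss_total, ss_reg + ss_error))   # True: the partition holds
print(ss_reg / ss_total)                          # r-squared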
Predictable Variability & r²
• The higher the r², the better the predictors are working (the more variance in Y is explained by the predictors).
• If a correlation (r) is found to be 0.8, we calculate r² = 0.8² = 0.64.
• How do we interpret this?
• This means that 64% of the variance of Y can be explained by the variability in X.
• You can use the phrases on the previous slide interchangeably.
• Remember, this does NOT mean that 64% of Y is caused by X.
Factors That Affect Regression

[Two scatter plots with fitted lines: one with r = 0.75, one with r = 0.22; the stronger correlation shows points clustered more tightly around the line]
Alternate Standard Error Equation
• This equation does not require the computation of all the residuals.
• Notice how, as r increases to 1, the standard error decreases to 0.

s_Y·X = s_Y √[(1 − r²)(n − 1)/(n − 2)]

For large n, this is approximately

s_Y·X ≈ s_Y √(1 − r²)
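As a sketch, the alternate equation written as a small function and checked against the SATV values from these slides (the function name see_from_r is ours, introduced only for illustration):

import numpy as np

def see_from_r(s_y, r, n):
    """SEE without residuals: s_Y * sqrt((1 - r^2) * (n - 1) / (n - 2))."""
    return s_y * np.sqrt((1 - r ** 2) * (n - 1) / (n - 2))

# SATV example values from the slides: s_Y = 61.567, r = .532, n = 28.
print(see_from_r(61.567, 0.532, 28))   # ~53.1, matching the residual-based value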
Standard Error of Estimate for the SATV Example

s_Y−Ŷ = s_Y √[(1 − r²)(n − 1)/(n − 2)]
s_Y−Ŷ = 61.567 √[(1 − .532²)(27/26)]
s_Y−Ŷ = 61.567 √.7445
s_Y−Ŷ = 53.1
Another Example
Use the data below to obtain the regression equation and the standard error of estimate for modeling Test Performance as a function of Anxiety.

SPSS Correlations output (N = 22):
• Pearson correlation, Anxiety with Test Performance: r = −.772** (Sig. 2-tailed = .000)
• Sum of squares and cross-products: Anxiety 693.500; cross-products −404.500; Test Performance 395.500
• Covariance: Anxiety 33.024; Anxiety × Test Performance −19.262; Test Performance 18.833
• **. Correlation is significant at the 0.01 level (2-tailed).

Descriptive statistics:
• Anxiety: mean = 10.5000, SD = 5.74663, N = 22
• Test Performance: mean = 9.5000, SD = 4.33974, N = 22

Apply:
b = cov_XY / s_X²
a = Ȳ − bX̄
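Before looking at SPSS's answer, the slope and intercept can be recovered directly from the summary statistics in the output above. A quick sketch:

# Slope and intercept from the summary statistics reported in the slides.
cov_xy = -19.262   # covariance of Anxiety and Test Performance
var_x  = 33.024    # variance of Anxiety (5.74663 squared)
mean_x, mean_y = 10.5, 9.5

b = cov_xy / var_x         # ~ -0.583
a = mean_y - b * mean_x    # ~ 15.62
print(b, a)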
SPSS’s Answer
Coefficientsa

Unstandardized Standardized
Coefficients Coefficients
Model B Std. Error Beta t Sig.
1 (Constant) 15.624 1.277 12.234 .000
Anxiety -.583 .107 -.772 -5.438 .000
a. Dependent Variable: Test Performance

ˆ
Y  .583 X  15.62
For every 1 point increase in Anxiety (X) we expect
a .583 point decrease in Test Performance (Y)
Que: What test score would you predict for
someone with an anxiety score of 16?
 6.3
Standard Error From SPSS

Model Summary (predictors: (Constant), Anxiety):
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .772   .597       .576                2.82459

s_Y−Ŷ = s_Y √(1 − r²)
s_Y−Ŷ = 4.34 √(1 − .772²)
s_Y−Ŷ = 4.34 √.404
s_Y−Ŷ ≈ 2.8

(SPSS's 2.82459 additionally includes the (n − 1)/(n − 2) correction.)
Significance Testing of the Regression Slope
• Regression coefficients are simply estimates of the true parameters in the population. As such, we can perform hypothesis tests on them.
• For example, we can perform a t test on the regression slope:
• H0: b* = 0, Ha: b* ≠ 0
• Because b denotes the sample regression coefficient, we use b* to represent the true slope of the regression line in the population.
• With single-predictor regression equations, you will get the same result whether you test the correlation coefficient r or the slope b.
• This applies only to simple linear regression, i.e., when you have only one predictor in the model.
Significance Testing of the Regression Slope for the SATV Data
Using the SATV data, we test whether the slope b is significantly different from zero at the 5% level of significance.

t = b / s_b,  where the standard error of the slope is  s_b = s_Y·X / (s_X √(n − 1))  and  s_Y·X = s_Y √[(1 − r²)(n − 1)/(n − 2)]

t_slope = b (s_X) √(n − 1) / s_Y·X = 4.865 (6.729) √27 / 53.13 = 170.104 / 53.13 ≈ 3.20

• t_cv = t.05/2, df=26 = ±2.056
• The t_obs (≈3.20) is more extreme than the critical t (2.056); we therefore reject H0 and conclude that the regression slope is different from zero.
• I.e., not reading the passage (intelligent guessing) is still a significant predictor of the SAT Verbal scores.
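In practice the whole test is one call. A sketch using scipy.stats.linregress on hypothetical data; it returns the slope, its standard error, and the two-sided p value for H0: b* = 0:

import numpy as np
from scipy import stats

# Hypothetical data for illustration.
x = np.array([3., 5., 7., 9., 11., 13., 15., 17.])
y = np.array([8., 11., 13., 18., 19., 24., 22., 27.])

res = stats.linregress(x, y)
t_obs = res.slope / res.stderr   # t = b / s_b, df = n - 2
print(t_obs, res.pvalue)         # reject H0 if the p value < .05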
Significance Testing of the Regression Slope - the Easy Way
• SPSS reports the t_obs value for the slope directly in the coefficients table.
• Significance is achieved if the accompanying p value (the "Sig." column) is less than your alpha level (usually .05).
Assumptions in Regression
• There are no firm assumptions underlying regression (or correlation) when the statistics are used only to describe data.
• A regression line is the "best fit" no matter what the properties of the data are.
• However, when we conduct significance testing, or if we want to create confidence intervals, we make assumptions:
• Homogeneity of variance and normality in the predicted variable (e.g., Y), independence of observations, normality of the errors of prediction, etc.
• More on this next week.
Potential Problems with Regression
• The same issues arise in regression as occur with correlation; they are two faces of the same procedure:
• Curvilinear relationships, range restriction, subsamples, outliers, spurious correlations.
• No proof of causality.
Other Notes about Regression
• Avoid making predictions where you have no data (i.e., outside the range of observed data).
• If you were predicting first-year Engineering School GPA from SATM scores, you would not want to make a prediction based on an SATM of 200.
• The prediction is likely not accurate because the program isn't likely to have enrolled a student with an SATM score that low.
Standardized Regression
• In many situations the y-intercept, a, is meaningless, such as when predicting the weight (Y) of a baby when height (X) is zero.
• We could then transform (standardize) the raw scores to z scores before conducting the regression.
• The z scores for Y (Z_Y) are regressed onto the z scores for X (Z_X).
• Then the y-intercept is equal to the mean of Z_Y corresponding to the mean of Z_X, i.e., (0, 0) on the z-score scale. This is consistent with the mean of Y corresponding to the mean of X on the raw-score scale.
Standardized Regression
• β = r when there is only one independent variable, i.e., this is true only for simple linear regression.
• The standardized value of the y-intercept: a = mean of Z_Y = 0.
• Note that the intercept point corresponds to Ȳ on the Y metric but to zero on the z metric.
• Standardized coefficients are generally used with multiple linear regression.
Standardized Coefficients
• When you perform regression on standardized data (i.e., z scores), the regression coefficients are standardized. In the case of simple linear regression, the standardized slope is called β instead of b.
• β = r when there is only one independent variable, i.e., only one predictor in the model.
• The standardized constant, sometimes referred to as β₀, is always equal to zero.
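A small numpy sketch, on hypothetical data, verifying both claims: standardizing X and Y makes the fitted slope equal r, and the fitted intercept zero.

import numpy as np

# Hypothetical data; standardizing both variables makes the slope equal r.
x = np.array([3., 5., 7., 9., 11., 13.])
y = np.array([8., 11., 13., 18., 19., 24.])

zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)

beta = np.cov(zx, zy, ddof=1)[0, 1] / np.var(zx, ddof=1)  # slope on z scores
r = np.corrcoef(x, y)[0, 1]

print(np.isclose(beta, r))   # True: beta = r in simple linear regression
# Intercept: zy.mean() - beta * zx.mean() = 0, since both means are 0.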
Regression Model in Terms of Standardized Coefficients

ẑ_Y = β z_X

• β is also sometimes called the beta weight.
• Note the intercept term is absent, because in the standardized model the mean of the z scores for both X and Y is zero, i.e., the line passes through (0, 0).
• In our SATV example, β = .532.
• The regression equation can be written as ẑ_Y = .532 z_X.
Interpretation of Standardized Coefficients
• The slope .532 would be interpreted as the expected increase in the standardized SATV score for every one-standard-deviation increase in the score when not reading the passage.
• Note the unit for .532 is standard deviations, not raw scores.
• I.e., an increase of one standard deviation in the score when not reading the passage will lead to an increase of slightly more than half a standard deviation in the predicted SATV scores.
Unstandardized vs. Standardized
• The β slope is not very stable from sample to sample.
• In simple linear regression, most researchers prefer to use the unstandardized slope, b, as it tends to be more consistent from sample to sample.
• When there are several predictors in a model, the regression is referred to as multiple linear regression. If the predictors are on scales which are not commensurate, the β's are used instead of the b's to compare the relative importance of each predictor.
• But this comes with its own issues, because the β's are affected by the variances and covariances of both the included predictors and the predictors not included in the model.
