Regression 1

regression

Uploaded by

dosavo8504

Regression
Prepared by
Sajana Shakya
Definition:
Regression is a statistical measure used to
determine the probable or average form
of the relationship between two or more
variables. The objective of this
analysis is to predict the value of
an unknown variable corresponding to
a given value of another variable.
Two types of Regression:
1) Linear Regression
2) Non-linear Regression

Two types of Linear Regression:


1) Simple Regression
2) Multiple Regression
1. Regression line of Y on X
The line of regression of Y on X is
given by y = a + bx, where a and b
are unknown constants known as the
intercept and slope of the equation.
This is used to predict the unknown
value of variable Y when the value of
variable X is known.
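$\hat{y} = a + bx$ ----> (i)
Residual $= y - \hat{y}$
$= y - (a + bx)$
$= y - a - bx$
Sum of the squares of errors is
$E = \sum (y - \hat{y})^2 = \sum (y - a - bx)^2$ --------------(ii)
According to the principle of least squares, we have
to determine constants a and b such that E is minimum.
From calculus, a necessary condition for a minimum is
$\dfrac{\partial E}{\partial a} = 0$ and $\dfrac{\partial E}{\partial b} = 0$
$\dfrac{\partial E}{\partial a} = \sum 2(y - a - bx)(-1) = 0$ [or, $\sum y - \sum a - \sum bx = 0$]
$\sum y = na + b\sum x$ -------------(1)
[$\sum a = na$]
$\dfrac{\partial E}{\partial b} = \sum 2(y - a - bx)(-x) = 0$ [or, $\sum xy - \sum ax - \sum bx^2 = 0$]
$\sum xy = a\sum x + b\sum x^2$ -------------(2)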
These equations are called normal equations.
Solving these two normal equations we get
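$a = \dfrac{\sum y - b\sum x}{n} = \bar{y} - b\bar{x}$
$b = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$
Putting these values in (i), we get the required fitted
regression line of y on x as $\hat{y} = a + bx$, where $\hat{y}$ is
the estimated value of y for a given x.
a = y-intercept
b = $b_{yx}$ = slope of the regression line = regression
coefficient of y on x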
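The solution of the two normal equations can be sketched in a few lines of Python. This is only an illustration: the function name `fit_y_on_x` and the data are made up for the example and are not part of the original slides.

```python
def fit_y_on_x(x, y):
    """Fit y-hat = a + b*x by solving the two normal equations."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Slope: obtained by eliminating a between the two normal equations.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: from the first normal equation, sum_y = n*a + b*sum_x.
    a = (sum_y - b * sum_x) / n
    return a, b


# Illustrative data (made up): points lying exactly on y = 1 + 2x.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
a, b = fit_y_on_x(x, y)
print(f"fitted line: y-hat = {a:.3f} + {b:.3f} x")
```

Because the illustrative points lie exactly on a straight line, the fitted intercept and slope reproduce that line; with scattered data the same formulas give the least-squares line.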
2. Regression line of X on Y
On the other hand, the line of regression
of X on Y is given by x = c + dy, which is
used to predict the unknown value of
variable X using the known value of
variable Y.
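$\hat{x} = c + dy$ ----> (ii)
Residual $= x - \hat{x}$
$= x - (c + dy)$
$= x - c - dy$
Sum of the squares of errors is
$E = \sum (x - \hat{x})^2 = \sum (x - c - dy)^2$
$\dfrac{\partial E}{\partial c} = 0$ and $\dfrac{\partial E}{\partial d} = 0$
$\sum x = nc + d\sum y$ ------------(1)
$\sum xy = c\sum y + d\sum y^2$ ------------(2)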
Solving these two normal equations we get
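$c = \dfrac{\sum x - d\sum y}{n} = \bar{x} - d\bar{y}$
$d = \dfrac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2}$
Putting these values in (ii), we get the
required fitted regression line of x on y
as $\hat{x} = c + dy$, where $\hat{x}$ is the estimated value of
x for a given y.
c = x-intercept
d = $b_{xy}$ = slope of the regression line =
regression coefficient of x on y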
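As a rough sketch, the line of x on y can be fitted by swapping the roles of the two variables in the normal equations. The function name `fit_x_on_y` and the data below are assumptions made for illustration only.

```python
def fit_x_on_y(x, y):
    """Fit x-hat = c + d*y by solving the normal equations for x on y."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_y2 = sum(yi * yi for yi in y)
    # Regression coefficient of x on y.
    d = (n * sum_xy - sum_x * sum_y) / (n * sum_y2 - sum_y ** 2)
    # Intercept on the x-axis, from sum_x = n*c + d*sum_y.
    c = (sum_x - d * sum_y) / n
    return c, d


# Illustrative data (made up for this sketch).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
c, d = fit_x_on_y(x, y)

# Predict x for a known value of y.
y_known = 4.5
x_hat = c + d * y_known
print(f"x-hat = {c:.3f} + {d:.3f} y, so x-hat({y_known}) = {x_hat:.3f}")
```

Note that this line is generally not the algebraic inverse of the line of y on x, since it minimises squared errors in x rather than in y.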
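Properties of Regression Coefficient:
• The correlation coefficient is the geometric mean of the two
regression coefficients.
$r = \pm\sqrt{b_{yx} \cdot b_{xy}}$
• The arithmetic mean of the two regression coefficients
is greater than or equal to the correlation coefficient (in absolute value).
$\left|\dfrac{b_{yx} + b_{xy}}{2}\right| \ge |r|$
• The regression coefficient is independent of change of origin but not of
scale.
$b_{yx} = b_{vu}$ if $u = x - A$ and $v = y - B$
$b_{yx} \ne b_{vu}$ in general if $u = \dfrac{x - A}{h}$ and $v = \dfrac{y - B}{k}$; in that case $b_{yx} = \dfrac{k}{h}\, b_{vu}$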
Properties of Regression Coefficient:
• The two regression coefficients and the
correlation coefficient have the same sign.
• If one regression coefficient is greater
than unity, then the other is less than unity.
If $b_{yx} > 1$ then $b_{xy} < 1$
• If r = 0, the two regression lines are parallel
to the x-axis and y-axis respectively, i.e. the two lines are
mutually perpendicular.
• If r = ±1, the two regression lines coincide.
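Several of these properties can be checked numerically. The sketch below uses made-up data and hypothetical helper names (`deviations`, `coefficients`); it is a sanity check on one data set, not a proof.

```python
from math import sqrt


def deviations(x, y):
    """Return the corrected sums Sxx, Syy and Sxy."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxx, syy, sxy


def coefficients(x, y):
    """Return (b_yx, b_xy, r) for paired data x, y."""
    sxx, syy, sxy = deviations(x, y)
    return sxy / sxx, sxy / syy, sxy / sqrt(sxx * syy)


# Illustrative data (made up for this sketch).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b_yx, b_xy, r = coefficients(x, y)

# |r| equals the geometric mean of the two regression coefficients.
geometric_mean = sqrt(b_yx * b_xy)
# The arithmetic mean is at least |r|.
arithmetic_mean = (b_yx + b_xy) / 2

# Change of origin (A = 3, B = 4) leaves the coefficient unchanged.
u = [xi - 3 for xi in x]
v = [yi - 4 for yi in y]
b_vu_shift = coefficients(u, v)[0]

# Change of scale (h = 2, k = 4) changes it by the factor h/k.
u2 = [xi / 2 for xi in x]
v2 = [yi / 4 for yi in y]
b_vu_scale = coefficients(u2, v2)[0]

print(f"b_yx = {b_yx:.4f}, b_xy = {b_xy:.4f}, r = {r:.4f}")
print(f"geometric mean = {geometric_mean:.4f}, arithmetic mean = {arithmetic_mean:.4f}")
print(f"after shift: {b_vu_shift:.4f}, after rescaling: {b_vu_scale:.4f}")
```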
Multiple Regression:
The study of the average form of the
relationship between one
dependent variable and two or
more independent variables is
called multiple regression.
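The regression plane of y on $x_1$
and $x_2$ is
$y = b_0 + b_1 x_1 + b_2 x_2$
The normal equations are:
$\sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2$ ----------(1)
$\sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2$ ----------(2)
$\sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2$ -----------(3)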
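The three normal equations form a 3×3 linear system in $b_0$, $b_1$ and $b_2$. A minimal sketch of solving it in plain Python follows; the helper names `solve_linear_system` and `fit_plane` and the data are assumptions made for illustration.

```python
def solve_linear_system(A, rhs):
    """Solve A * sol = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        # Bring the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    # Back substitution.
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * sol[k] for k in range(i + 1, n))
        sol[i] = (M[i][n] - s) / M[i][i]
    return sol


def fit_plane(x1, x2, y):
    """Build and solve the three normal equations for y = b0 + b1*x1 + b2*x2."""
    n = len(y)
    s1, s2 = sum(x1), sum(x2)
    s11 = sum(p * p for p in x1)
    s22 = sum(q * q for q in x2)
    s12 = sum(p * q for p, q in zip(x1, x2))
    A = [[n, s1, s2],
         [s1, s11, s12],
         [s2, s12, s22]]
    rhs = [sum(y),
           sum(p * t for p, t in zip(x1, y)),
           sum(q * t for q, t in zip(x2, y))]
    return solve_linear_system(A, rhs)


# Illustrative data (made up): points lying exactly on y = 1 + 2*x1 + 3*x2.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * p + 3 * q for p, q in zip(x1, x2)]
b0, b1, b2 = fit_plane(x1, x2, y)
print(f"y = {b0:.3f} + {b1:.3f} x1 + {b2:.3f} x2")
```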
Standard error of estimate:
It is a measure of the difference between the observed
sample y-values and the predicted values obtained
from the regression line. It is denoted by $S_e$ and given as
$S_e = \sqrt{\dfrac{\sum (y - \hat{y})^2}{n - 2}}$ (where $\hat{y}$ is the predicted y value)
This formula is not convenient, as it requires
calculating the estimated value of y for every observation. The equivalent
formula is
$S_e = \sqrt{\dfrac{\sum y^2 - a\sum y - b\sum xy}{n - 2}}$
The smaller the value of the standard error of estimate,
the closer the dots are to the regression line and
the better the estimate based on the equation of the
line.
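The sketch below computes the standard error of estimate in two ways, from the residuals and from the shortcut sums, to show that they agree. It assumes the sample form with divisor n − 2; the data and variable names are made up for illustration.

```python
from math import sqrt

# Illustrative data (made up for this sketch).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)
sum_y2 = sum(yi * yi for yi in y)

# Least-squares line of y on x.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

# Definition: based on the residuals y - y_hat.
sse_residuals = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
se_from_residuals = sqrt(sse_residuals / (n - 2))

# Shortcut: needs only the sums already used to fit the line.
sse_shortcut = sum_y2 - a * sum_y - b * sum_xy
se_from_shortcut = sqrt(sse_shortcut / (n - 2))

print(f"from residuals: {se_from_residuals:.4f}")
print(f"from shortcut:  {se_from_shortcut:.4f}")
```

The shortcut is only valid when a and b are the least-squares estimates, because it relies on the normal equations holding.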
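Standard error of a $= S_e \sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum (x - \bar{x})^2}}$
Standard error of b $= \dfrac{S_e}{\sqrt{\sum (x - \bar{x})^2}}$
where $S_e$ is the standard error of estimate.

             Intercept   Slope
Population   α           β
Sample       a           b

Confidence interval for the intercept and slope:
(1-α)100% confidence interval
For intercept α : $a \pm t_{\alpha/2,\, n-2} \cdot SE(a)$
For slope β : $b \pm t_{\alpha/2,\, n-2} \cdot SE(b)$

             Population   Sample
Mean         µ            x̄
Proportion   P            p
Variance     σ²           s²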
Problem 2:
Ten steel wires of diameter 0.5 mm and length 2.5 m
were extended in a laboratory by applying vertical
forces of varying magnitudes. Results are as follows:

Force (kg), x:               15   19   25   35   42   48   53   56   62   65
Increase in length (mm), y:  1.7  2.1  2.5  3.4  3.9  4.9  5.4  5.7  6.6  7.2

a) Estimate the parameters of a simple linear
regression model with force as the explanatory
variable.
b) Find the 95% confidence interval for the intercept and
slope of the line.
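One possible way to work through this problem numerically is sketched below. It assumes the usual t-based intervals with n − 2 = 8 degrees of freedom and takes the critical value t(0.025, 8) = 2.306 from a standard t-table; the variable names are chosen for illustration.

```python
from math import sqrt

# Data from the problem statement.
x = [15, 19, 25, 35, 42, 48, 53, 56, 62, 65]            # force (kg)
y = [1.7, 2.1, 2.5, 3.4, 3.9, 4.9, 5.4, 5.7, 6.6, 7.2]  # increase in length (mm)
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# a) Least-squares estimates of slope and intercept.
b = sxy / sxx
a = y_bar - b * x_bar

# Standard error of estimate with n - 2 degrees of freedom.
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
s_e = sqrt(sse / (n - 2))

# Standard errors of the intercept and slope.
se_a = s_e * sqrt(1 / n + x_bar ** 2 / sxx)
se_b = s_e / sqrt(sxx)

# b) 95% confidence intervals; critical value assumed from a t-table (8 d.f.).
T_CRIT = 2.306
ci_a = (a - T_CRIT * se_a, a + T_CRIT * se_a)
ci_b = (b - T_CRIT * se_b, b + T_CRIT * se_b)

print(f"a = {a:.4f}, b = {b:.4f}, S_e = {s_e:.4f}")
print(f"95% CI for intercept: ({ci_a[0]:.4f}, {ci_a[1]:.4f})")
print(f"95% CI for slope:     ({ci_b[0]:.4f}, {ci_b[1]:.4f})")
```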
