Simple Linear Regression

The document provides an overview of simple linear regression, detailing its purpose of modeling the relationship between two variables through a regression line. It explains the estimation of regression parameters, including the intercept and slope, using the method of least squares and provides a step-by-step example. Additionally, it discusses inferences about regression parameters, significance testing, and assumptions necessary for valid results.


KAMPALA INTERNATIONAL UNIVERSITY

WESTERN CAMPUS
FACULTY: EDUCATION.
DEPARTMENT: POST GRADUATE STUDIES.
COURSE UNIT: EDUCATIONAL RESEARCH METHODS
YEAR OF STUDY: 2024/2025
YEAR: ONE
SEMESTER: ONE
NAME OF LECTURER: ASSOC. PROF. AUGUSTINE PWASONG DAVOU (PhD)
NAME OF STUDENT:
1. ARIYO ROBERTS: 2025-01-34789
TEL NO: 0782476224
PROGRAMME: MED
DATE OF SUBMISSION: 22ND MARCH 2025
COURSEWORK
Simple linear regression
a) Estimating the parameters of a simple linear regression
b) Inferences about regression parameters
Simple linear regression is a statistical method used to model the relationship
between two variables. It involves finding a straight line, called the regression
line, that best fits the data points. The line represents how one variable changes
in relation to the other.
It works with two variables:
1. Independent variable (X): the predictor or input variable.
2. Dependent variable (Y): the outcome or response variable, which depends on the independent variable.
LINEAR RELATIONSHIP
Simple linear regression assumes that there is a linear relationship between the
two variables, meaning that a change in the dependent variable is proportional to
the change in the independent variable.
EQUATION OF THE LINE
The relationship is represented by the equation of a straight line:
Y = β0 + β1X + ε, where
Y is the dependent variable (what you are trying to predict),
X is the independent variable (the predictor),
β0 is the Y-intercept (the value of Y when X = 0),
β1 is the slope of the line (how much Y changes for each unit change in X), and
ε is the error term (the difference between the actual value and the predicted value).
ESTIMATION OF SIMPLE LINEAR REGRESSION
In simple linear regression, the goal is to estimate the two parameters of the
regression line: the intercept (β0) and the slope (β1).
The parameters are estimated using the method of least squares, which minimizes
the sum of the squared differences between the observed values and the values
predicted by the model (the residuals).
The formula for the fitted regression line is therefore
Y = β0 + β1X
ESTIMATION OF PARAMETERS
1. To calculate the slope (β1), the formula used is
β1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²
where
xi and yi are the individual data points, and
x̄ and ȳ are the means of the x values and y values respectively.
2. To calculate the intercept (β0), the formula used is
β0 = ȳ − β1x̄
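The two estimation formulas above can be sketched as a short plain-Python function (a minimal sketch; the function name `least_squares` is just an illustrative choice):

```python
def least_squares(x, y):
    """Estimate the intercept (b0) and slope (b1) by the method of least squares."""
    n = len(x)
    x_bar = sum(x) / n  # mean of x
    y_bar = sum(y) / n  # mean of y
    # Slope: sum of cross-deviations over sum of squared x-deviations
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    den = sum((xi - x_bar) ** 2 for xi in x)
    b1 = num / den
    b0 = y_bar - b1 * x_bar  # intercept from the means and the slope
    return b0, b1

b0, b1 = least_squares([1, 2, 3, 4, 5], [50, 55, 60, 65, 70])
print(b0, b1)  # 45.0 5.0
```

Running it on the study-time data used in the worked example below reproduces the intercept 45 and slope 5 derived by hand.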
Step-by-step process
Using an example of simple linear regression, we estimate the relationship
between time of study (in hours) and exam score (in %), as illustrated in the
table below.

Time of study (hours), x    Exam score (%), y
1                           50
2                           55
3                           60
4                           65
5                           70
Step 1
Calculate the means of x and y. The formula for the mean is
x̄ = Σx / n,   ȳ = Σy / n
where n is the number of data points; in this case, n = 5.

To calculate x̄ (the mean of time of study):
x̄ = (1 + 2 + 3 + 4 + 5) / 5 = 15 / 5 = 3
To calculate ȳ (the mean of exam scores):
ȳ = (50 + 55 + 60 + 65 + 70) / 5 = 300 / 5 = 60
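The Step 1 arithmetic can be checked in plain Python (the data are the five observations from the table above):

```python
x = [1, 2, 3, 4, 5]       # time of study in hours
y = [50, 55, 60, 65, 70]  # exam scores in %
x_bar = sum(x) / len(x)   # (1 + 2 + 3 + 4 + 5) / 5 = 15 / 5
y_bar = sum(y) / len(y)   # (50 + 55 + 60 + 65 + 70) / 5 = 300 / 5
print(x_bar, y_bar)       # 3.0 60.0
```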

Step 2
Calculate the slope (β1). The formula for the slope is
β1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²

We first compute the deviations as follows:

Time of study (hours), x | Exam score (%), y | xi − x̄     | yi − ȳ        | (xi − x̄)(yi − ȳ) | (xi − x̄)²
1                        | 50                | 1 − 3 = −2 | 50 − 60 = −10 | (−2)(−10) = 20    | (−2)² = 4
2                        | 55                | 2 − 3 = −1 | 55 − 60 = −5  | (−1)(−5) = 5      | (−1)² = 1
3                        | 60                | 3 − 3 = 0  | 60 − 60 = 0   | (0)(0) = 0        | (0)² = 0
4                        | 65                | 4 − 3 = 1  | 65 − 60 = 5   | (1)(5) = 5        | (1)² = 1
5                        | 70                | 5 − 3 = 2  | 70 − 60 = 10  | (2)(10) = 20      | (2)² = 4

Substituting in the values:

β1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 50 / 10 = 5

Slope (β1) = 5
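The deviation table and the resulting slope can be rebuilt column by column in plain Python:

```python
x = [1, 2, 3, 4, 5]
y = [50, 55, 60, 65, 70]
x_bar, y_bar = 3, 60
cross = [(xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)]  # (xi - x̄)(yi - ȳ)
sq_dev = [(xi - x_bar) ** 2 for xi in x]                       # (xi - x̄)²
b1 = sum(cross) / sum(sq_dev)  # 50 / 10
print(cross)   # [20, 5, 0, 5, 20]
print(sq_dev)  # [4, 1, 0, 1, 4]
print(b1)      # 5.0
```

The two printed lists match the last two columns of the table, and their sums give the 50 and 10 substituted into the slope formula.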
Step 3
Calculate the intercept: β0 = ȳ − β1x̄
β0 = 60 − 5(3) = 60 − 15
β0 = 45
Step 4
The regression equation is
Y = β0 + β1X
Y = 45 + 5X
STEP 5
Predicting the exam score for any given study time
For example, if a student studies for 6 hours, we can predict their exam score
using the regression equation:
Y = 45 + 5(6) = 45 + 30 = 75
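The prediction step is a one-line function once the fitted equation Y = 45 + 5X is in hand (the function name `predict_score` is just an illustrative choice):

```python
def predict_score(hours, b0=45, b1=5):
    """Predicted exam score (%) for a given number of study hours, using Y = b0 + b1*X."""
    return b0 + b1 * hours

print(predict_score(6))  # 75
```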

CONCLUSION
The estimated regression equation is
y = 45 + 5x
This means that for each additional hour of study, the exam score is expected to
increase by 5 points.
If a student studies for 6 hours, the predicted exam score is 75.
This can be illustrated by the regression line graph below.

A GRAPH OF THE EFFECT OF STUDY TIME (IN HOURS) ON EXAM SCORES (%) FROM THE DATA ABOVE
INFERENCES ABOUT REGRESSION PARAMETERS
Here are the inferences about regression parameters.
1. Intercept (β0)
The intercept represents the expected value of the dependent variable (Y) when
the independent variable (X) is 0.
2. Slope (β1)
The slope represents the rate of change in the dependent variable (Y) for a
one-unit increase in the independent variable (X).
3. Significance of parameters
I. p-value
A small p-value (typically < 0.05) indicates that the parameter is statistically
significant, meaning that there is evidence that the corresponding coefficient
(β0 or β1) is different from 0. If the p-value is large, the coefficient may not
be significantly different from 0.
II. Confidence interval
The confidence interval for a parameter provides a range within which the true
value of the parameter is likely to lie, with a certain level of confidence
(e.g. 95%).
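A minimal plain-Python sketch of these inferences for the slope, using hypothetical noisy scores (the worked example above fits perfectly, which makes the standard error zero, so slightly scattered data illustrate the idea better). The critical value 3.182 is t(0.025, df = 3) taken from a t-table, not computed here:

```python
from math import sqrt

x = [1, 2, 3, 4, 5]
y = [52, 54, 61, 63, 70]  # hypothetical noisy exam scores (illustration only)
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)      # residual variance estimate
se_b1 = sqrt(mse / sxx)  # standard error of the slope
t_stat = b1 / se_b1      # test statistic for H0: beta1 = 0
t_crit = 3.182           # t(0.025, 3), for a 95% confidence interval
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
print(b1, se_b1, t_stat, ci)
```

Here t = 9.0 exceeds 3.182, so the slope is significant at the 5% level, and the 95% interval (about 2.91 to 6.09) excludes 0, which says the same thing.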
4. Assumptions about parameters
Linearity
The regression model assumes that the relationship between the independent and
dependent variables is linear. If the true relationship is non-linear, the
regression parameters may not provide an accurate representation.
Homoscedasticity
The error variance should be constant across all levels of the independent
variable. If this assumption is violated, it could affect the reliability of the
regression parameters.
5. Estimation
The parameters β0 and β1 are typically estimated using the method of ordinary
least squares (OLS), which minimizes the sum of squared differences between the
observed and predicted values of the dependent variable.
The estimates of the parameters (β0 and β1) are used to predict future values of
Y based on future values of X.
6. Interpretation of parameter estimates
The estimated intercept gives an estimate of the dependent variable's value when
the independent variable is 0.
The estimated slope gives an estimate of how much the dependent variable changes
for each one-unit increase in the independent variable.
7. Multicollinearity
In multiple regression, multicollinearity occurs when independent variables are
highly correlated with each other, making it difficult to estimate the individual
effect of each variable. This can cause instability in the regression
coefficients, making them less reliable.
8. Model fit
The strength of the relationship between the independent and dependent variables
is often summarized using the R-squared statistic, which indicates the proportion
of variation in the dependent variable explained by the model.
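R-squared is computed as 1 minus the ratio of residual to total variation. A short sketch, again using hypothetical noisy scores (the worked example above fits its line exactly, so its R-squared would simply be 1):

```python
x = [1, 2, 3, 4, 5]
y = [52, 54, 61, 63, 70]  # hypothetical noisy exam scores (illustration only)
b0, b1 = 46.5, 4.5        # least-squares fit for these data
y_bar = sum(y) / len(y)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # residual variation
sst = sum((yi - y_bar) ** 2 for yi in y)                       # total variation
r_squared = 1 - sse / sst
print(round(r_squared, 3))  # 0.964
```

An R-squared of about 0.96 would mean the model explains about 96% of the variation in exam scores.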
In conclusion, regression parameters are essential for understanding the
relationship between variables. The intercept and slope provide insights into the
starting point and rate of change of the dependent variable, while significance
tests and confidence intervals help assess the reliability of these parameters.
Proper evaluation of assumptions ensures that these parameters provide valid and
meaningful results.
REFERENCE BOOKS

1. Sanford Weisberg (2005). Applied Linear Regression, 3rd edition. Wiley, Hoboken, NJ.
2. Trevor Hastie, Robert Tibshirani, Jerome Friedman (2009). The Elements of Statistical Learning, 2nd edition. Springer, New York.
3. Ronald A. Fisher (1970). Statistical Methods for Research Workers, 14th edition. Oliver and Boyd, Edinburgh, Scotland.
4. Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani (2013). An Introduction to Statistical Learning with Applications in R. Springer, New York.
5. Andrew Gelman, Jennifer Hill (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, Cambridge, UK.
