Part 8 Linear Regression & Anova

Uploaded by

natashasamanga5

Linear Regression and ANOVA Notes

PART 8 LINEAR REGRESSION AND ANOVA NOTES

In quantitative methods for business, regression analysis is a statistical method used to
establish a relationship between two or more variables, where one variable (the dependent
variable) is predicted or explained by one or more independent variables.
An example of a regression equation is:

$Y_i = \beta_0 + \beta_1 X_i$ ……………….. Equation 1
Where;
$Y_i$ = the dependent variable = the regressand = the explained variable.
$X_i$ = the independent variable = the regressor = the explanatory variable.
$\beta_0$, $\beta_1$ = the coefficients of the equation (the intercept and the gradient).

There are many uses of regression models. They can be used in forecasting, identifying
relationships, optimization, etc.

The main types of linear regression models are:


i. Simple Linear Regression
ii. Multiple Linear Regression

8.1 Simple Regression


A simple linear regression model is a statistical model that describes the linear relationship
between a dependent variable ($Y$) and a single independent variable ($X$). A simple linear
regression model can be expressed as:

$Y = \beta_0 + \beta_1 X$ ………………. Equation 2

Where $Y$ is the dependent variable and $X$ is the independent variable, $\beta_0$ is the
y-intercept, or the value of $Y$ when $X = 0$, and $\beta_1$ is the gradient, or the change in
$Y$ for a one-unit change in $X$.
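
To make the intercept and gradient concrete, the sketch below estimates both by least squares using only the Python standard library. The function name `fit_simple_regression`, the names `b0` and `b1`, and the advertising and sales figures are assumptions made for illustration; they do not come from these notes.

```python
def fit_simple_regression(x, y):
    """Return (intercept, gradient) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Gradient: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data: advertising spend (x) and sales (y).
advertising = [1, 2, 3, 4, 5]
sales = [5, 7, 9, 11, 13]

intercept, gradient = fit_simple_regression(advertising, sales)
print(f"intercept = {intercept:.2f}, gradient = {gradient:.2f}")
```

Because the hypothetical sales figures lie exactly on a straight line, the estimated intercept is the value of sales when advertising is zero, and the estimated gradient is the increase in sales for each extra unit of advertising.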

Group Discussion Questions:


1. What is the primary purpose of a Simple Linear Regression model?
2. How do you interpret the intercept and the gradient in a Simple Linear Regression model?
3. How do you handle outliers and influential observations in a Simple Linear Regression
analysis?
4. How do you evaluate the goodness of fit of a Simple Linear Regression model?
5. How do you test for linearity between the independent and dependent variables in a
Simple Linear Regression model?


8.2 Multiple Regression


A multiple linear regression model is a statistical model that describes the linear relationship
between a dependent variable ($Y$) and two or more independent variables ($X_1, X_2, \dots, X_n$).
A multiple linear regression model can be expressed as:

$Y = \beta_0 + \sum_{i=1}^{n} \beta_i X_i$ ………………. Equation 3

Where $Y$ is the dependent variable and $X_i$ is the $i$-th independent variable, with $i$ being
any integer from 1 to $n$, the total number of independent variables. $\beta_0$ is the intercept
and $\beta_i$ is the coefficient of $X_i$.

Since $i$ is any integer from 1 to the total number of variables, if $n = 1$, it becomes a
simple linear regression model. If $n = 2$, then the model will be written as:

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$ ………………. Equation 4

If $n = 3$, then the model will be written as:

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3$ ………………. Equation 5

Therefore, a multiple linear regression model with $n$ variables can be expressed as:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$ ………………. Equation 5
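
As a rough illustration of how the coefficients of a multiple regression can be estimated, the sketch below solves the least-squares normal equations with plain Python. The helper names `solve_linear_system` and `fit_multiple_regression`, and the price, advertising and sales figures, are assumptions made for this example only. In practice a spreadsheet or statistics package would do this step.

```python
def solve_linear_system(a, b):
    """Solve a * z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so row operations act on both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution from the last row upwards.
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - known) / m[r][r]
    return z


def fit_multiple_regression(rows, y):
    """Return [b0, b1, ..., bn] for y = b0 + b1*x1 + ... + bn*xn."""
    # Prepend a constant 1 to each row so that b0 acts as the intercept.
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    # Normal equations: (X'X) b = X'y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: (price, advertising spend) for five periods.
regressors = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 5)]
# Sales are generated exactly as 10 - 2*price + 3*advertising, for illustration.
sales = [10 - 2 * p + 3 * a for p, a in regressors]

coefficients = fit_multiple_regression(regressors, sales)
print([round(b, 4) for b in coefficients])
```

Since the hypothetical sales contain no error, the fitted coefficients should recover the intercept and the two slopes used to generate the data, up to rounding.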

Group Discussion Questions:


1. What is the difference between Multiple Regression and Simple Linear Regression?
2. Discuss the advantages and limitations of using Multiple Regression analysis in research.
3. What are some common applications of Multiple Regression in business and economics?
4. List any 5 assumptions of Classical Linear Regression.
5. A company wants to predict the sales of a new product based on the price, advertising
expenditure, and seasonality. How would you use Multiple Regression analysis to solve
this problem?
6. A researcher wants to examine the relationship between the dependent variable (job
satisfaction) and several independent variables (age, experience, salary, and education
level). How would you use Multiple Regression analysis to analyze this data?

8.3 Assumptions of Classical Linear Regression


• The relationship between the independent variable(s) and the dependent variable is linear.
• The observations are independent of one another, and no explanatory variable is an exact
linear combination of the others (no perfect multicollinearity).
• There are no extreme outliers or unduly influential observations in the data.
• The variance of the errors (residuals) is constant across all observations and all levels of the
independent variable(s) (homoscedasticity).
• The errors (residuals) are normally distributed.
• The errors (residuals) are not auto-correlated, meaning that the errors at one observation are
not related to the errors at another observation.
• The errors (residuals) have an expected value of zero.
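
Some of these assumptions can be checked from the residuals of a fitted model. The sketch below checks that the residuals average to zero and computes the Durbin-Watson statistic, a common check for auto-correlation that these notes do not name. It is introduced here only as an illustration, and the residual values are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic, which always lies between 0 and 4.

    Values near 2 suggest no first-order auto-correlation, values well
    below 2 suggest positive auto-correlation, and values well above 2
    suggest negative auto-correlation.
    """
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals from a fitted model, listed in time order.
residuals = [0.5, -0.4, 0.3, -0.6, 0.2, 0.4, -0.3, -0.1]

mean_residual = sum(residuals) / len(residuals)
dw = durbin_watson(residuals)
print(f"mean residual = {mean_residual:.3f}")
print(f"Durbin-Watson = {dw:.3f}")
```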


8.4 Regressors
Regressors are variables that are used to predict or explain the value of the dependent
variable. They are also known as independent variables, predictor variables, or explanatory
variables. Regressors can be continuous or categorical variables.
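
A continuous regressor can enter a regression equation directly, but a categorical regressor usually has to be converted to numbers first. One common approach is dummy (0/1) coding, sketched below. The function name `dummy_code` and the season data are assumptions made for illustration.

```python
def dummy_code(values, baseline):
    """Turn a categorical regressor into 0/1 columns, omitting the baseline level."""
    levels = sorted(set(values) - {baseline})
    # One column per remaining level: 1 where the observation has that level.
    return {level: [1 if v == level else 0 for v in values] for level in levels}


# Hypothetical categorical regressor: the season in which each sale occurred.
season = ["summer", "winter", "autumn", "summer", "spring"]

dummies = dummy_code(season, baseline="spring")
for level, column in dummies.items():
    print(level, column)
```

The baseline level is left out so that the dummy columns are not an exact linear combination of the constant term. Each dummy coefficient is then read as the difference from the baseline.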

Group Discussion Questions:


1. What is the role of a Regressor in a regression model?
2. How do you determine the number of Regressors to include in a regression model?
3. What is the difference between a Continuous Regressor and a Categorical Regressor?
4. List any 3 regressors that can be used in an experiment to determine the Stock Prices of
Econet?

8.5 Regressands
A regressand is the dependent variable or outcome variable in a regression model. It is the
variable that is being predicted or explained by the Regressors (independent variables).
Regressands are typically continuous variables, but can also be binary or categorical
variables. Examples of regressands are yearly sales, the stock price of a company, customer
loyalty, GDP growth rate, etc.


8.6 Best Linear Unbiased Estimator (BLUE)


An estimator is said to be the Best Linear Unbiased Estimator (BLUE) if it satisfies the
following conditions:
i. The estimator is a linear function of the sample data.
ii. The estimator is Unbiased. The expected value of the estimator is equal to the true
population parameter.
iii. The estimator has Minimum Variance among all unbiased linear estimators.

The Best Linear Unbiased Estimator (BLUE) plays a crucial role in statistical inference. Its
properties of linearity, unbiasedness, and minimum variance make it the most efficient
estimator among all unbiased linear estimators. BLUE has numerous applications in
statistics, econometrics, and time series analysis.

Not every unbiased estimator is BLUE. An unbiased estimator is correct on average, but a
BLUE estimator also has the smallest variance among all unbiased linear estimators, so its
estimates are the least spread out around the true population parameter. By the Gauss-Markov
theorem, the Ordinary Least Squares (OLS) estimator is the BLUE for the linear regression
model when the errors have zero mean, constant variance, and are uncorrelated with one another.
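
The unbiasedness and minimum-variance properties can be illustrated with a small simulation. The sketch below repeatedly draws samples from a hypothetical population line and compares the OLS gradient with a cruder linear unbiased estimator (the slope through the first and last points). The population parameters, the error distribution, the sample size and the function names are all assumptions chosen for the illustration. A simulation like this is not a proof.

```python
import random
from statistics import mean, pvariance


def ols_slope(x, y):
    """Ordinary Least Squares estimate of the gradient."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def endpoint_slope(x, y):
    """A cruder linear unbiased estimate: the slope through the first and last points."""
    return (y[-1] - y[0]) / (x[-1] - x[0])


random.seed(0)  # fixed seed so the simulation is repeatable
true_intercept, true_slope = 2.0, 0.5  # hypothetical population parameters
x = list(range(1, 21))

ols_estimates, endpoint_estimates = [], []
for _ in range(2000):
    # Draw a fresh sample with independent, constant-variance errors.
    y = [true_intercept + true_slope * xi + random.gauss(0, 1) for xi in x]
    ols_estimates.append(ols_slope(x, y))
    endpoint_estimates.append(endpoint_slope(x, y))

print(f"mean of OLS estimates          = {mean(ols_estimates):.3f}")
print(f"mean of endpoint estimates     = {mean(endpoint_estimates):.3f}")
print(f"variance of OLS estimates      = {pvariance(ols_estimates):.5f}")
print(f"variance of endpoint estimates = {pvariance(endpoint_estimates):.5f}")
```

Both estimators should average close to the true gradient (unbiasedness), while the OLS estimates should show the smaller variance of the two.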

Group Discussion Questions:


List any 3 properties of the Best Linear Unbiased Estimator (BLUE).


8.7 Error term concept


The goal of regression analysis is to create a mathematical model that can be used to predict
the value of the dependent variable based on the values of the independent variables.
However, no regression model is perfect, and there is always some degree of error or
uncertainty associated with the predictions made by the model. This is why the regression
equation includes an error term, which captures the factors that are not included in the
model.

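Therefore, the correct regression model has to account for the excluded factors. It can be written
as:
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ ………………. Equation 6

Where $\varepsilon_i$ is the error term, with the variables $Y_i$ and $X_i$ extracted from the
sample, $\beta_0$ the intercept, and $\beta_1$ the gradient. Therefore, the value of the error
term is:

$\varepsilon_i = Y_i - \beta_0 - \beta_1 X_i$ ………………. Equation 7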

The error term is also known as the stochastic disturbance term. Its sample counterpart,
computed from the fitted model, is the residual. The error term represents the random variation
in the dependent variable (y) that cannot be explained by the independent variables (x) in the
regression model. It also absorbs the effect of omitted variables and of model misspecification.
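
As a small worked illustration, the sketch below fits a least-squares line to hypothetical data and computes the residuals, that is, the observed values minus the fitted values. The data and the function name `fit_line` are assumptions made for this example.

```python
def fit_line(x, y):
    """Return (intercept, gradient) of the least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


# Hypothetical sample in which y is roughly, but not exactly, linear in x.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

b0, b1 = fit_line(x, y)
fitted = [b0 + b1 * xi for xi in x]
# Residual = observed value minus the value predicted by the fitted line.
residuals = [yi - fi for yi, fi in zip(y, fitted)]

print("residuals:", [round(e, 3) for e in residuals])
print("sum of residuals:", round(sum(residuals), 6))
print("sum of squared residuals:", round(sum(e ** 2 for e in residuals), 4))
```

When the model includes an intercept, the least-squares residuals sum to zero (up to rounding). The sum of squared residuals is the quantity that least squares minimises.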

Group Discussion Questions:


Define the error term in a regression equation.
What are some common causes of error terms in regression equations?
Discuss the role of the error term in regression analysis.

8.8 Interpretation of regression output (Excel or EViews)


With the advancement of technology, various software programs have been developed to
facilitate regression analysis. Among these, Microsoft Excel has emerged as one of the most
popular and widely used tools for regression analysis.

Excel provides an easy-to-use interface for regression analysis. The software offers a built-in
regression tool that allows users to perform regression analysis with just a few clicks. This
makes it accessible to users who may not have extensive statistical knowledge or
programming skills. The regression analysis in your group assignment:


8.9 ANOVA
ANOVA is the acronym for Analysis of Variance. It is a statistical technique used to compare
the means of two or more groups to determine if there is a significant difference between
them.

ANOVA relies on the following assumptions about the data:


i. The data in each group should be approximately normally distributed.
ii. The variances of the groups should be equal.
iii. The observations should be independent of each other.

Steps to perform ANOVA:


1. State the null and alternative hypotheses
The null hypothesis states that the means of all the groups are equal, while the alternative
hypothesis states that at least one group mean differs from the others.
2. Choose the level of significance
The level of significance is the maximum probability of rejecting the null hypothesis
when it is true.
3. Calculate the F-statistic
The F-statistic is a ratio of the variance between groups to the variance within groups.
4. Determine the critical F-value
The critical F-value is the value of the F-statistic that corresponds to the chosen level of
significance.
5. Compare the calculated F-statistic to the critical F-value
If the calculated F-statistic is greater than the critical F-value, the null hypothesis is
rejected, indicating a significant difference between the means of the groups.

A large F-statistic (one that exceeds the critical F-value) indicates a significant difference
between the means of the groups. Equivalently, a p-value smaller than the chosen level of
significance indicates a significant difference between the means of the groups.

Situational Example:
A researcher wants to compare the average scores of students who were taught using an online
teaching method (Group A) with those who were taught using the traditional teaching method
(Group B). The researcher has collected data on the scores of 20 students in each group.

Research Question:
Is there a significant difference in the average scores of students who received the online
teaching method (Group A) and those who received the traditional teaching method (Group
B)?
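
The ANOVA steps can be sketched for this kind of question as follows. The scores are hypothetical (five per group rather than the 20 collected by the researcher), the function name `one_way_anova` is an assumption, and the critical value is the tabulated F-value for 1 and 8 degrees of freedom at the 5% level of significance.

```python
def one_way_anova(groups):
    """Return (F-statistic, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores, not the researcher's actual data.
group_a = [78, 82, 85, 80, 75]  # online teaching method
group_b = [70, 68, 74, 72, 66]  # traditional teaching method

f_stat, df_between, df_within = one_way_anova([group_a, group_b])
critical_f = 5.32  # tabulated F(1, 8) at the 5% level of significance

print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
if f_stat > critical_f:
    print("Reject the null hypothesis: the group means differ significantly.")
else:
    print("Do not reject the null hypothesis.")
```

With these hypothetical scores the calculated F-statistic exceeds the critical value, so the null hypothesis of equal means would be rejected at the 5% level.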

Group Discussion Questions:


What are the steps involved in performing ANOVA?
What is the purpose of the F-statistic in ANOVA?
