MZB127 Topic 11 Lecture Notes (Unannotated Version)
Linear Regression
Preface
In this final week, we look at the case of two numerical variables and consider whether they may
have a linear (straight line) relationship. The associated analysis method is known as linear
regression; because it is based on some statistical assumptions about our observations, it also
allows us to test statistically whether there is evidence of such a relationship between these
variables. We will be heavily relying upon Microsoft Excel for this week’s content.
11.1 Exploring Linear Relationships
In practice, however, the random scatter in real data means that we cannot be completely confident in determining what value of the response variable would be observed in
association with a particular value of the explanatory variable. In other words, we may not be
completely sure what the true values of β0 or β1 are, and hence we are not completely sure
what value of y would result from a particular value of x. We might naturally then ask how
well we can estimate these values, which immediately reminds us of the questions we asked
in preceding chapters about using sample data to estimate true values of parameters. Thus,
it will be useful here to introduce a statistical approach to determining the (proposed linear)
relationship between the variables y and x. We motivate and illustrate this approach throughout
this chapter using the “Fishing expedition” dataset described in Example 11.1.1.
Examples
Figure 11.1: Data for first 30 of 57 fish caught on the fishing expedition.
1. Select all of the data (this will consist of two columns and multiple rows,
excluding the column headings).
2. Select “Insert”, then in the “Charts” section, select “Insert Scatter (X, Y) or
Bubble Chart”, then select the top-left option “Scatter”. This will produce
the scatterplot.
3. If you wish to change text shown on the scatterplot (e.g. modify the "Chart
Title"), click twice on the text of interest and then type your changes.
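The same plot can also be produced outside Excel. Below is a minimal Python sketch, assuming (purely for illustration) that the data have been exported to a hypothetical file fish.csv with columns named Length and Weight; neither the file name nor the column names come from the dataset itself:

```python
# Minimal sketch: scatterplot of fish weight vs length.
# Assumes a hypothetical file "fish.csv" with columns "Length" and "Weight".
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv("fish.csv")
plt.scatter(data["Length"], data["Weight"])
plt.xlabel("Length")
plt.ylabel("Weight")
plt.title("Fish weight vs length")
plt.show()
```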
(a) Treating fish weight as the dependent variable y, and treating fish length as the
independent variable x, will it be possible to exactly fit the equation
$$y = \beta_0 + \beta_1 x$$
to the data shown in the scatterplot of fish weight vs fish length (Figure 11.2)?
(b) In your opinion, does it look like the proposed relationship between fish weight (y)
and fish length (x) shown in Figure 11.2 could be approximately linear?
Since real data will rarely fall exactly on a straight line, we account for the scatter by adding a random error term to the proposed linear relationship, giving the linear regression model

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad i = 1, \ldots, n,$$

where $\varepsilon_i$ is the error in the observation of $y_i$. (Note in Example 11.1.1 that $n = 57$, so $i$ can take values of 1, 2, 3, etc. up to 57.)
Equivalently, we can consider $\varepsilon_i = y_i - \beta_0 - \beta_1 x_i$ as the difference between the observed value $y_i$ and the value it should take according to the underlying linear model. This difference is called the residual (for that observation). Now, if we are considering $\varepsilon_i$ as a random error or scatter, it makes sense to assume a probability distribution for these quantities $\varepsilon_i$. In practice, we make the following (reasonable) assumptions about the residuals:

1. The errors $\varepsilon_i$ and $\varepsilon_j$ are identically distributed but independent of one another for all $i \neq j$.

2. The errors have mean zero: $E(\varepsilon_i) = 0$.

3. Each error is normally distributed with variance $\sigma^2$: $\varepsilon_i \sim N(0, \sigma^2)$.

Note that Assumption 1 implies that $\sigma^2$ in Assumption 3 is the same for all observations.
These assumptions can be summarised in the statement $\varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2)$, where "iid" is short for independent and identically distributed. Having assumed a distribution for these errors, we can also make equivalent statements about the distribution of the observed response values $y_i$, as follows:
$$\begin{aligned}
E(y_i) &= \beta_0 + \beta_1 x_i + E(\varepsilon_i) = \beta_0 + \beta_1 x_i, \\
\mathrm{Var}(y_i) &= 0 + 0 + \mathrm{Var}(\varepsilon_i) = \sigma^2,
\end{aligned}$$

so $y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$ for all $i$. (Recall that $\beta_0$ and $\beta_1$ are (unknown) constants and that we have assumed the $x_i$ values have been observed without any error, so they are effectively known constants here.)
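To make these statements concrete, here is a short simulation sketch: if we repeatedly generate $y$ values at a fixed $x$ according to the model, their sample mean and variance should match $\beta_0 + \beta_1 x$ and $\sigma^2$. All parameter values below are illustrative assumptions, not taken from the notes:

```python
# Sketch: simulate y = b0 + b1*x + e, with e ~ N(0, sigma^2) iid,
# at a fixed x, and check the implied mean and variance of y.
import numpy as np

rng = np.random.default_rng(1)
b0, b1, sigma = -190.0, 2.4, 68.0   # illustrative values only
x0 = 250.0                          # a fixed value of the explanatory variable
reps = 100_000

y = b0 + b1 * x0 + rng.normal(0.0, sigma, size=reps)
print(y.mean())   # should be close to b0 + b1*x0 = 410.0
print(y.var())    # should be close to sigma**2 = 4624.0
```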
Not assessed: It is possible to show that, for a set of data $(x_i, y_i)$, $i = 1, \ldots, n$, fitted to the linear model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where $\varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2)$:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x},$$

$$s^2 = \frac{1}{n-2} \sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i \right)^2,$$

where $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ and $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
Whilst we could calculate $\hat{\beta}_0$, $\hat{\beta}_1$ and $s$ by hand using the formulas above, in practice we would usually use statistical software packages (e.g. the "Trendline" option and/or the "Regression" Analysis Tool in Microsoft Excel) to do these calculations for us. Furthermore, these packages can give us several additional useful quantities, beyond just the sample estimates $\hat{\beta}_0$, $\hat{\beta}_1$ and $s$.
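As an illustration of what such software does internally, here is a minimal Python sketch applying the "by hand" formulas above. The x and y values are made up purely for illustration (they are not the fishing data), and the result is cross-checked against numpy's built-in line fitting:

```python
# Sketch: the "by hand" formulas for the slope, intercept and s,
# applied to illustrative (made-up) data.
import numpy as np

x = np.array([200.0, 220.0, 250.0, 270.0, 300.0])
y = np.array([290.0, 340.0, 420.0, 470.0, 540.0])
n = len(x)
xbar, ybar = x.mean(), y.mean()

b1_hat = (np.sum(x * y) - n * xbar * ybar) / (np.sum(x**2) - n * xbar**2)
b0_hat = ybar - b1_hat * xbar
s = np.sqrt(np.sum((y - b0_hat - b1_hat * x) ** 2) / (n - 2))

print(b1_hat, b0_hat, s)
print(np.polyfit(x, y, 1))  # cross-check: returns [slope, intercept]
```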
For example, we may be interested in how close $\hat{\beta}_0$ and $\hat{\beta}_1$ are to the true values $\beta_0$ and $\beta_1$. To assess this, we can obtain sample estimates for the standard deviations of $\hat{\beta}_0$ and $\hat{\beta}_1$, denoted as $s_{\hat{\beta}_0}$ and $s_{\hat{\beta}_1}$.
Not assessed: It is possible to show that, for a set of data $(x_i, y_i)$, $i = 1, \ldots, n$, fitted to the linear model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where $\varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2)$:

$$s_{\hat{\beta}_0}^2 = \frac{s^2 \sum_{i=1}^{n} x_i^2}{n \sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad s_{\hat{\beta}_1}^2 = \frac{s^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}.$$
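These two formulas are also quick to evaluate directly. A sketch on the same made-up data as before (the fitting steps are repeated so the block runs on its own):

```python
# Sketch: sample standard deviations of the intercept and slope estimates,
# using the formulas above on illustrative (made-up) data.
import numpy as np

x = np.array([200.0, 220.0, 250.0, 270.0, 300.0])
y = np.array([290.0, 340.0, 420.0, 470.0, 540.0])
n = len(x)
b1_hat = (np.sum(x * y) - n * x.mean() * y.mean()) / (np.sum(x**2) - n * x.mean()**2)
b0_hat = y.mean() - b1_hat * x.mean()
s2 = np.sum((y - b0_hat - b1_hat * x) ** 2) / (n - 2)

Sxx = np.sum((x - x.mean()) ** 2)
s_b0 = np.sqrt(s2 * np.sum(x**2) / (n * Sxx))  # std deviation of intercept estimate
s_b1 = np.sqrt(s2 / Sxx)                       # std deviation of slope estimate
print(s_b0, s_b1)
```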
Furthermore, we may be interested in the proportion of variation in the response variable ($y$) that is explained by fitting the linear regression model. This quantity is labelled $R^2$, and takes a value between 0 and 1 inclusive. If $R^2$ is closer to 1, then a greater proportion of the variation in $y$ is explained by the regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$. If $R^2$ is closer to 0, then a smaller proportion of the variation in $y$ is explained by this regression model.
Not assessed: It is possible to show that, for a set of data $(x_i, y_i)$, $i = 1, \ldots, n$, fitted to the linear model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, where $\varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2)$:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},$$

where $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ is the fitted value for observation $i$.
All of these quantities can be calculated by hand but we usually use statistical software to
perform the calculations for us instead.
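A sketch of the $R^2$ formula on the same made-up data, with a cross-check against the squared correlation coefficient (which equals $R^2$ for simple linear regression):

```python
# Sketch: R^2 from its definition, on illustrative (made-up) data.
import numpy as np

x = np.array([200.0, 220.0, 250.0, 270.0, 300.0])
y = np.array([290.0, 340.0, 420.0, 470.0, 540.0])
n = len(x)
b1_hat = (np.sum(x * y) - n * x.mean() * y.mean()) / (np.sum(x**2) - n * x.mean()**2)
b0_hat = y.mean() - b1_hat * x.mean()
y_fit = b0_hat + b1_hat * x   # fitted values y_hat_i

R2 = 1 - np.sum((y - y_fit) ** 2) / np.sum((y - y.mean()) ** 2)
print(R2)
print(np.corrcoef(x, y)[0, 1] ** 2)  # cross-check: squared correlation
```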
If you are only interested in obtaining the sample estimate of the intercept ($\hat{\beta}_0$), the sample estimate of the slope ($\hat{\beta}_1$), and/or the proportion of the variation in the response variable explained by the linear regression model ($R^2$), then:
1. Click on the scatterplot, then click the “+” icon that appears to the top-right of
the scatterplot, then check “Trendline”.
2. A linear trendline will appear on your scatterplot. Right-click on the trendline, then
click “Format Trendline...”.
3. In “Trendline Options”, ensure that “Linear” is checked. Check “Display Equation
on chart” and “Display R-squared value on chart”.
4. A textbox will appear on your scatterplot in the format

   $y = \hat{\beta}_1 x + \hat{\beta}_0$
   $R^2 = \ldots$

   with the calculated values of the linear regression outputs $\hat{\beta}_0$, $\hat{\beta}_1$ and $R^2$ shown in place of the symbols above (see Figure 11.3 for an example).
Figure 11.3: Scatterplot of the weight vs length of fish caught (data from Example 11.1.1),
including trendline and linear regression parameter estimates β̂0 = −190.13, β̂1 = 2.444 and
R2 = 0.8639.
1. Select the "Data" tab, then click "Data Analysis" (in the "Analysis" section of
the ribbon).
2. In the list of Analysis Tools, select "Regression", then click OK.
3. For "Input Y Range", select all cells where the data is stored for the dependent
variable (y).
4. For "Input X Range", select all cells where the data is stored for the independent
variable (x).
5. Check the "Line Fit Plots" option.
6. Select "Output Range" and choose a cell that has no data below or to the right of
it.
7. Click OK.
(a) Several tables of linear regression outputs will appear, positioned with their
top-left corner in the cell you chose in Step 6 (see Figure 11.5 for an example).
In these tables:

- The sample estimate of the intercept ($\hat{\beta}_0$) is given by the number in the
"Coefficients" column and "Intercept" row of the third table.
- The sample estimate of the slope ($\hat{\beta}_1$) is given by the number in the
"Coefficients" column and "X Variable 1" row of the third table.
- The standard error of the estimate ($s$) is given by the number next to
"Standard Error" in the first table.
- The sample standard deviation of the intercept ($s_{\hat{\beta}_0}$) is given by the number
in the "Standard Error" column and "Intercept" row of the third table.
- The sample standard deviation of the slope ($s_{\hat{\beta}_1}$) is given by the number in
the "Standard Error" column and "X Variable 1" row of the third table.
- The proportion of the variation in the response variable explained by the linear
regression model ($R^2$) is given by the number next to "R Square" in the
first table.
(b) A "line fit plot", which appears because of the option you checked in Step 5 (see
Figure 11.4 for an example). To change the fitted line from dots to a line, right-click on the
dots making up the line, and select “Format Data Series...”. There are many
options there to make the fitted line plot prettier!
Figure 11.4: Line fit plot obtained from a linear regression analysis applied to the data for
weight vs length of fish caught (data from Example 11.1.1).
Figure 11.5: Outputs of a linear regression analysis applied to the data for weight vs length of
fish caught (data from Example 11.1.1). In this example, the "Residual Output" table has rows
for all 57 observations (only the first 29 observations are shown here!). This output gives us
that $\hat{\beta}_0 \approx -190.13$, $\hat{\beta}_1 \approx 2.444$, $s \approx 67.944$, $s_{\hat{\beta}_0} \approx 35.52$, $s_{\hat{\beta}_1} \approx 0.131$, and $R^2 \approx 0.8639$.
Under the model assumptions described earlier, it can be shown that the following quantities follow a $t$-distribution with $d = n - 2$ degrees of freedom:

$$t = \frac{\hat{\beta}_0 - \beta_0}{s_{\hat{\beta}_0}}, \qquad d = n - 2,$$

$$t = \frac{\hat{\beta}_1 - \beta_1}{s_{\hat{\beta}_1}}, \qquad d = n - 2.$$
(Not assessed: The use of n − 2 degrees of freedom here is related to the fact that we have
used linear regression to estimate two quantities, β0 and β1 .)
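A sketch of these $t$ statistics in Python, again on the made-up data from the earlier sketches; the hypothesised true values below are set to zero purely for illustration:

```python
# Sketch: t statistics for the intercept and slope, with d = n - 2.
import numpy as np

x = np.array([200.0, 220.0, 250.0, 270.0, 300.0])
y = np.array([290.0, 340.0, 420.0, 470.0, 540.0])
n = len(x)
b1_hat = (np.sum(x * y) - n * x.mean() * y.mean()) / (np.sum(x**2) - n * x.mean()**2)
b0_hat = y.mean() - b1_hat * x.mean()
s2 = np.sum((y - b0_hat - b1_hat * x) ** 2) / (n - 2)
Sxx = np.sum((x - x.mean()) ** 2)
s_b0 = np.sqrt(s2 * np.sum(x**2) / (n * Sxx))
s_b1 = np.sqrt(s2 / Sxx)

beta0_hyp, beta1_hyp = 0.0, 0.0      # hypothesised true values (illustrative)
t0 = (b0_hat - beta0_hyp) / s_b0     # compare to t-distribution with d = n - 2
t1 = (b1_hat - beta1_hyp) / s_b1
print(t0, t1, n - 2)
```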
Combining the formulas above with what we have learned in Chapter 10, this means that we
can:
1. Construct confidence intervals for the true value of the intercept ($\beta_0$) and the true value
of the slope ($\beta_1$) – see Section 10.1; and

2. Perform hypothesis tests to compare the sample values of the intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$)
obtained from our $(x, y)$ data to separate pre-existing hypotheses about the true values of
the intercept ($\beta_0 = $ some number) and slope ($\beta_1 = $ some number) – see Section 10.2.
Following on from the first point above, and using similar mathematical procedures to those
described in Sections 10.1.1 and 10.1.2, we can obtain that the confidence intervals, for a
confidence level of $(1 - \alpha)$, for the true values of the parameters $\beta_0$ and $\beta_1$, are given by:

$$\hat{\beta}_0 \pm t^* s_{\hat{\beta}_0} \qquad \text{and} \qquad \hat{\beta}_1 \pm t^* s_{\hat{\beta}_1},$$

where $t^*$ is the critical value of the $t$-distribution with $d = n - 2$ degrees of freedom, chosen as in Section 10.1 so that the interval has confidence level $(1 - \alpha)$.
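A sketch of these confidence intervals, using scipy to obtain the $t$ critical value (same made-up data as in the earlier sketches):

```python
# Sketch: 95% confidence intervals for beta0 and beta1 (illustrative data).
import numpy as np
from scipy import stats

x = np.array([200.0, 220.0, 250.0, 270.0, 300.0])
y = np.array([290.0, 340.0, 420.0, 470.0, 540.0])
n = len(x)
b1_hat = (np.sum(x * y) - n * x.mean() * y.mean()) / (np.sum(x**2) - n * x.mean()**2)
b0_hat = y.mean() - b1_hat * x.mean()
s2 = np.sum((y - b0_hat - b1_hat * x) ** 2) / (n - 2)
Sxx = np.sum((x - x.mean()) ** 2)
s_b0 = np.sqrt(s2 * np.sum(x**2) / (n * Sxx))
s_b1 = np.sqrt(s2 / Sxx)

alpha = 0.05
tcrit = stats.t.ppf(1 - alpha / 2, n - 2)              # t critical value, d = n - 2
print((b0_hat - tcrit * s_b0, b0_hat + tcrit * s_b0))  # CI for beta0
print((b1_hat - tcrit * s_b1, b1_hat + tcrit * s_b1))  # CI for beta1
```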
Examples
Following on from the second point (about hypothesis tests), a natural test to perform – and
this answers our original question posed at the start of this section “what is the strength of the
evidence [from the data] that a linear relationship exists between variables x and y?” – is:
How much evidence is there that the slope parameter (β1 ) is different from zero?
A value of the slope parameter of β1 = 0 would imply a horizontal line when y is plotted against
x, which in turn implies that x does not explain any variation in y! So comparing the null
hypothesis H0 : β1 = 0 against the alternative hypothesis H1 : β1 ̸= 0 determines the evidence
for a linear relationship between variables x and y.
Because this is such an important test to perform, the p-value associated with this hypothesis
test (H0 : β1 = 0 versus H1 : β1 ̸= 0) is already pre-calculated in the output of most linear
regression analysis software packages.
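As a sketch of where that p-value comes from, the following recomputes it from the $t$ statistic and cross-checks against scipy's built-in simple linear regression (same made-up data as in the earlier sketches):

```python
# Sketch: two-sided p-value for H0: beta1 = 0 versus H1: beta1 != 0.
import numpy as np
from scipy import stats

x = np.array([200.0, 220.0, 250.0, 270.0, 300.0])
y = np.array([290.0, 340.0, 420.0, 470.0, 540.0])
n = len(x)

res = stats.linregress(x, y)        # slope, intercept, p-value, etc.
t1 = res.slope / res.stderr         # t statistic for H0: beta1 = 0
p = 2 * stats.t.sf(abs(t1), n - 2)  # two-sided p-value, d = n - 2
print(p, res.pvalue)                # the two values should agree
```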
Microsoft Excel also calculates the p-value for the evidence associated with the intercept of
the fitted line between y-data and x-data being different from zero (and thus the evidence
for whether or not the fitted line goes through the origin (x = 0, y = 0)). This p-value is
listed in the “P-value” column and “Intercept” row in the third table, and is associated
with the hypothesis test H0 : β0 = 0 versus H1 : β0 ̸= 0.
Examples
(c) Using the Microsoft Excel regression analysis output shown in Figure 11.5 for
the fishing expedition data described in Example 11.1.1, confirm that Microsoft
Excel is correctly calculating the p-value for the hypothesis test H0 : β0 = 0 versus
H1 : β0 ̸= 0.
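One way to approach part (c) is sketched below: take $\hat{\beta}_0 \approx -190.13$ and $s_{\hat{\beta}_0} \approx 35.52$ from Figure 11.5 (with $n = 57$), compute the $t$ statistic by hand, and convert it to a two-sided p-value to compare against the number in Excel's "P-value" column:

```python
# Sketch for part (c): recompute the intercept p-value from the
# Figure 11.5 estimates, for comparison with Excel's output.
from scipy import stats

b0_hat, s_b0, n = -190.13, 35.52, 57
t0 = (b0_hat - 0.0) / s_b0          # t statistic for H0: beta0 = 0
p = 2 * stats.t.sf(abs(t0), n - 2)  # two-sided p-value, d = 55
print(t0, p)                        # t is about -5.35; p is far below 0.05
```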