Linear Regression
The aim of this learning module is to test whether an apparently linear relationship
between two variables is real, or whether it could have happened by chance because of
variability. To do this we use regression analysis, strictly speaking linear regression.
This is based on the 'method of least squares' that we've already met in using Excel to fit
a trendline to data on a spreadsheet chart.
However, there are many cases when one measurement is clearly independent of the
other one. Some examples:
1. Ages and weights of people: the weight of a growing person clearly depends on their
age, but their age is independent of their weight.
2. Time and intracellular pH in cells: you can measure the pH in cells at particular times,
but it would be meaningless to plan to measure the time at particular pH values.
3. Reaction rate and temperature: you can control the temperature and this affects the
rate of the reaction, but you can't set a reaction rate that will affect the temperature of
the experiment.
Statistically speaking, you shouldn't investigate these cases using correlation. Instead
you should plot the data correctly and use regression analysis to see if there is a linear
association.
Regression analysis calculates the "line of best fit" through the data points.
It does this by finding the straight line, y = a + bx, that minimises the sum of the
squares of the vertical distances, Σsᵢ², of the points from the line.
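As a sketch of how the method of least squares works, the slope and intercept can be computed directly from the standard formulas. The x and y values below are made up for illustration; they are not data from this module:

```python
# Hypothetical example data (not from this module)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates for the line y = a + b*x:
# b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)²,  a = ȳ - b*x̄
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
    sum((xi - x_bar) ** 2 for xi in x)
a = y_bar - b * x_bar

# Sum of squared vertical distances, Σsᵢ², which this line minimises
ss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
print(round(a, 3), round(b, 3), round(ss, 3))
```

Any other line through these points would give a larger value of Σsᵢ² than the one printed here.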
The problem
You could get exactly the same line and exactly the same equation just by chance even
if there is no association between the independent variable and the dependent one. This
can easily happen when the data points are very scattered, as in panel (b) of the graphs
below.
In panel (a) you can clearly see that there is a significant linear relationship between the
two variables. This will be indicated by the fact that the sum of squares Σsᵢ² is low.
In panel (b) the points are all over the place, so the value of Σsᵢ² is likely to be very large,
indicating that there is not a significant association, even though the line of best fit is
exactly the same as in panel (a).
To illustrate the use of linear regression analysis in Prism, let's consider the relationship
between time and the mass of a batch of eggs. It looks as if the mass of the eggs fell as
they got older, but is this a significant fall? To determine this we must test whether the
slope of the regression line is significantly less than zero. The null hypothesis is that the
mass of the eggs does not change over time; in other words, that the slope of the line is
zero.
To enter the data we use an XY data table in Prism, putting the time values (the
independent variable in this example) in the X column and the mass of the eggs (the
dependent variable) in the first Y column:
After pressing the Analyze button we select Linear Regression from the list of XY
analyses:
Accepting all the default options in the next dialogue box we get to a Results page
looking like this:
Here we can see that the line of best fit has a Slope of -1.361 with a standard error of
0.0951, and a Y-intercept of 89.44 with a standard error of 2.279.
The equation of the line Y = -1.361*X + 89.44 is shown at the bottom of the window.
The P value for the difference between the slope and zero is much less than 0.05 so we
can reject our null hypothesis and conclude that the mass of the eggs does decrease
significantly with time.
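The P value Prism reports comes from a t-test on the slope. As a rough check, the t statistic can be reproduced from the Results page figures in a few lines of Python (the number of points, N = 25, is an assumption here for illustration, chosen to match the critical value tcrit = 2.069 used later in this module):

```python
# Slope and its standard error, taken from the Prism Results page
slope, se_slope = -1.361, 0.0951

# t statistic for the null hypothesis that the slope is zero
t = slope / se_slope  # about -14.3

# Assuming N = 25 points, df = N - 2 = 23, and the two-tailed
# 5% critical value is 2.069
t_crit = 2.069
print(abs(t) > t_crit)  # True: reject the null hypothesis
```

Because |t| is far beyond the critical value, the corresponding P value is much less than 0.05, matching Prism's conclusion.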
The Graph page shows an XY plot of the data together with the fitted regression line:
You can also carry out your own t-tests on the results of the linear regression obtained
with Prism, to work out whether the slope or intercept differs from any particular
value.
For example, you could test whether eggs are significantly lighter than, say, 90 g when
they are laid. In other words, test whether the intercept is significantly lower than 90.
We do this as follows:
1. Calculate the t statistic: t = (intercept − hypothesised value) / standard error of the
intercept.
2. Substitute the values from the Results page: t = (89.44 − 90) / 2.279 = −0.246.
3. Compare t with the critical value for N − 2 degrees of freedom, where N = number
of points:
tcrit = 2.069
Since |t| = 0.246 is well below tcrit, we cannot reject the null hypothesis: the intercept
is not significantly lower than 90, so we cannot conclude that the eggs weigh less than
90 g when they are laid.
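The same intercept test can be sketched in Python using the values from the regression output:

```python
# Intercept and its standard error from the Prism Results page
intercept, se_int = 89.44, 2.279
hypothesised = 90.0

# t statistic for the null hypothesis that the intercept equals 90
t = (intercept - hypothesised) / se_int
print(round(t, 3))  # -0.246

# Compare |t| with the critical value for N - 2 = 23 degrees of freedom
t_crit = 2.069
print(abs(t) > t_crit)  # False: cannot reject the null hypothesis
```

Because |t| is far smaller than the critical value, the data give no evidence that the intercept is below 90 g.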