Investigation 4-Worksheet FINAL
Interpret scatter diagrams and regression lines for bivariate data, including recognition
of scatter diagrams which include distinct sections of the population
Understand informal interpretation of correlation.
Investigation 4 (a)
Investigate if there is a correlation between daily mean total cloud cover and daily mean
pressure.
The data
Open the Excel workbook Pearson Edexcel GCE AS and AL Mathematics data set.xlsx.
Select the Information worksheet.
The fraction of the celestial dome covered by cloud. The unit is oktas.
2. Read the information in cell A19. What units are used to measure daily mean pressure?
Hectopascals (hPa)
Independent
It is difficult to analyse these data as they are presented in the dataset. The headers need to be in
row 1.
Copy the data into a new workbook
Select the whole worksheet
Click on the small blue square in the left-hand corner. This will select
the whole worksheet.
Investigation 4 - Correlation and Regression - Worksheet
Delete rows 1 – 5
Select rows 1 – 5, right-click, then choose Delete.
Save workbook as Leuchars2015
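The same tidy-up can be sketched outside Excel. The Python example below is only an illustration: it assumes the worksheet has been exported as comma-separated text, and the five leading rows, the column names and the two data rows are invented placeholders, not values from the data set.

```python
import csv

# Hypothetical stand-in for the exported worksheet: five leading rows,
# then the header row, then the data. All values here are invented.
RAW = """\
Leading row 1
Leading row 2
Leading row 3
Leading row 4
Leading row 5
Date,Daily Mean Total Cloud,Daily Mean Pressure
01/05/2015,5,1020
02/05/2015,7,1008
"""

def load_with_header_in_row_1(text, rows_to_skip=5):
    """Drop the leading rows so the header row becomes row 1, then read the data."""
    lines = text.splitlines()[rows_to_skip:]
    return list(csv.DictReader(lines))

records = load_with_header_in_row_1(RAW)
print(records[0]["Daily Mean Pressure"])
```

Once the headers are in the first row, each record can be looked up by column name, which is the reason for deleting rows 1 – 5 in the workbook.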
Bivariate data are data on each of two variables, where each value of one of the variables is
paired with a value of the other variable.
In bivariate data, if one of the variables is controlled (or explains the other variable), it is
known as the independent (or explanatory) variable.
A dependent (or response) variable is a variable whose value depends on the value of another
variable.
The dependent variable is usually plotted on the vertical axis. Note: if you are using a
regression model to predict a value, the variable for the value you wish to predict should be
the Y variable and plotted on the vertical axis of a scatter diagram.
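To show why the choice of Y variable matters, here is a minimal Python sketch of a least-squares line of y on x, which is the usual meaning of a line of regression. The function name and the sample points are assumptions made for illustration; the points were chosen to lie exactly on y = 2x + 1 so the result is easy to check by hand.

```python
def regression_line(xs, ys):
    """Return (gradient, intercept) of the least-squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    gradient = s_xy / s_xx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept

# Invented points lying exactly on y = 2x + 1, used only to check the function.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
gradient, intercept = regression_line(xs, ys)
print(gradient, intercept)
```

The calculation treats x and y differently, so swapping the two variables generally gives a different line. That is why the variable to be predicted should be the Y variable.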
For bivariate data if all the points in a scatter diagram seem to lie near a straight line, there is
a linear correlation between the two variables.
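How closely the points lie to a straight line is commonly summarised by the product moment correlation coefficient. The Python sketch below is one way to calculate it, under the assumption that this is the coefficient intended; the function name and the sample values are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Product moment correlation coefficient of the paired values xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Invented values: one set rises exactly in a straight line, the other falls.
r_rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r_rising, r_falling)
```

Values close to 1 or -1 indicate points lying near a straight line with positive or negative gradient, while values near 0 indicate little or no linear correlation.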
Process
Plot a scatter diagram to investigate if there is a correlation between daily mean total cloud
cover and daily mean pressure. In this case there is no controlled variable. Later in this
investigation a prediction for daily mean total cloud cover will be calculated based on the
regression equation for daily mean total cloud cover against daily mean pressure.
6. Which variable should be plotted on the vertical axis?
[Scatter diagram; horizontal axis: Daily mean pressure (hPa)]
Add a title
[Scatter diagram; vertical axis: Daily Mean Total Cloud (oktas)]
Click on the title, type Daily mean total cloud cover vs Daily mean pressure Leuchars
2015, then press Enter.
Add a vertical axis title
Click on the chart, select the Layout tab, then select Axis Titles, then Primary Vertical Axis
Title, then Vertical Title. Type Daily mean total cloud cover (oktas) and press Enter.
[Scatter diagram; horizontal axis: Daily mean pressure (hPa)]
Figure 1
[Scatter diagram with line of regression; horizontal axis: Daily mean pressure (hPa); equation shown on chart: y = -0.07x + 77.53]
10. Write the line of regression using the names of the variables.
y = -0.0078x + 14.019
So Daily mean total cloud cover = -0.0078 × daily mean pressure + 14.019
11. Interpret the gradient of the line of regression for daily mean total cloud cover against
daily mean pressure.
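-0.0078: for each 1 hPa increase in daily mean pressure, the predicted daily mean total cloud cover decreases by 0.0078 oktas.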
Yes
13. Use the regression model (equation of regression) to predict the daily mean total cloud
cover for a daily mean pressure of 1030 hPa.
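As a check on the arithmetic for question 13, the Python sketch below substitutes the pressure into the regression equation written in question 10. The function and constant names are assumptions made for illustration; the coefficients are the ones stated in the worksheet.

```python
# Coefficients from the worksheet's regression equation in question 10.
GRADIENT = -0.0078   # oktas per hPa
INTERCEPT = 14.019   # oktas

def predict_cloud_cover(pressure_hpa):
    """Predicted daily mean total cloud cover (oktas) for a given pressure (hPa)."""
    return GRADIENT * pressure_hpa + INTERCEPT

prediction = predict_cloud_cover(1030)
print(round(prediction, 3))
```

By hand, -0.0078 × 1030 + 14.019 = -8.034 + 14.019 = 5.985, so the model predicts roughly 6 oktas.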
Report
Investigation 4 (b)
Plot a scatter diagram to investigate the relationship between daily mean total cloud cover
and daily mean pressure for Leuchars 2015, split by season.
To plot the scatter diagram daily mean total cloud cover against daily mean pressure for
Leuchars 2015, split by season
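Splitting by season amounts to putting each day's pair of values into one of two groups, so that each group can be plotted as its own series. The Python sketch below shows the idea. It assumes the sample runs from May to October and that May to August count as Spring/summer and September to October as Autumn; these month boundaries, the function names and the sample rows are all assumptions for illustration.

```python
from collections import defaultdict

def season_of(month):
    """Assumed grouping: months 5-8 are Spring/summer, months 9-10 are Autumn."""
    return "Spring/summer" if month <= 8 else "Autumn"

def split_by_season(rows):
    """Group (pressure, cloud) pairs by season; rows are (month, pressure, cloud)."""
    groups = defaultdict(list)
    for month, pressure, cloud in rows:
        groups[season_of(month)].append((pressure, cloud))
    return dict(groups)

# Invented rows: (month number, pressure in hPa, cloud cover in oktas).
rows = [(5, 1015, 6), (7, 1021, 4), (9, 1008, 7), (10, 1026, 3)]
groups = split_by_season(rows)
for season, points in groups.items():
    print(season, points)
```

In Excel the equivalent is to add each group of rows to the chart as a separate data series.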
[Scatter diagram split by season; legend: Spring/summer, Autumn; horizontal axis: Daily mean pressure (hPa)]
Report
15. Comment on the split of the data between Spring/Summer and Autumn.
Investigation 4 (c)
Investigate if there is a correlation between daily mean visibility and daily mean temperature.
Use the same random sample i.e. Leuchars 2015.
The data
In the Excel workbook Pearson Edexcel GCE AS and AL Mathematics data set.xlsx,
select the Information worksheet.
Process
Plot a scatter diagram of daily mean visibility against daily mean temperature.
4. Which variable should be plotted on the vertical axis?
Investigation 4 (d)
Investigate if there is a correlation between daily mean air temperature and daily mean
pressure in Beijing May to October 2015.
The data
The data are provided in the Excel workbook Edexceldataset.xlsx.
Process
Plot a scatter diagram to investigate if there is a correlation between daily mean air
temperature and daily mean pressure in Beijing, May to October 2015. Daily mean pressure
is the explanatory variable.
Correlation coefficient =
7. Interpret the gradient of the line of regression for daily mean air temperature against daily
mean pressure.
8. Interpret the intercept of the line of regression for daily mean air temperature against
daily mean pressure.
9. Does this regression model seem to fit the data? Give a reason for your answer.
10. Use the regression model (line of regression) to predict the daily mean air temperature for
a daily mean pressure of 1005 hPa.
11. Comment on the accuracy of the predicted daily mean air temperature in question 10.
12. Plot a scatter diagram to show the relationship between daily mean air temperature and
daily mean pressure for Beijing, May to October 2015, split by season.
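For questions 10 and 11 above, one general consideration when judging a prediction is whether the chosen pressure lies inside the range of the observed data (interpolation) or outside it (extrapolation). The Python sketch below shows that check. It is a general point rather than the worksheet's model answer, and the function name and pressures are invented for illustration.

```python
def is_interpolation(x, observed_xs):
    """True if x lies within the range of the observed explanatory values."""
    return min(observed_xs) <= x <= max(observed_xs)

# Invented pressures in hPa, standing in for the observed explanatory variable.
observed = [998, 1003, 1010, 1017]
inside = is_interpolation(1005, observed)
outside = is_interpolation(1030, observed)
print(inside, outside)
```

A prediction made by extrapolation is generally less reliable, because there are no data to show that the linear pattern continues outside the observed range.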
Report