Theme 6 Week 12 - Tutorial With Answers

The document discusses the effects of carbon content on the elongation of welded joints using linear regression analysis. It includes calculations for estimating changes in elongation, hypothesis testing for the linear model's appropriateness, and evaluations of residual plots. Additionally, it covers the relationship between age and height in South African boys, providing a least-squares line, correlation coefficient, and outlier analysis.

Uploaded by

blade-blaze-4o

UNIVERSITY OF PRETORIA

Theme 6:
Week 12 - Tutorial
Linear regression and correlation

Question 1
The ability of a welded joint to elongate under stress is affected by the
chemical composition of the weld metal. In an experiment to determine
the effect of carbon content (x) on elongation (y), 39 welds were
stressed until fracture, and both carbon content (in parts per thousand)
and elongation (in percent) were measured. The following summary
statistics were calculated:
$\sum_{i=1}^{n} (x_i - \bar{x})^2 = 0.6561$, $\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = -3.9097$, $s = 4.3319$
Assuming that $x$ and $y$ follow a linear model, compute the estimated
change in elongation due to an increase of one part per thousand in
carbon content. Then test whether it is appropriate to use the linear
model to predict elongation from carbon content.
Solution 1
Given: $\sum_{i=1}^{n} (x_i - \bar{x})^2 = 0.6561$, $\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = -3.9097$, $s = 4.3319$, $n = 39$
In the linear model $y = \beta_0 + \beta_1 x + \varepsilon$, the change in elongation due to
an increase of one part per thousand in carbon content is the slope $\beta_1$. We
can estimate it as:

$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{-3.9097}{0.6561} = -5.959$

Now, we can conduct a hypothesis test.

1. Define $H_0$ and $H_A$.
$H_0: \beta_1 = 0$, $H_A: \beta_1 \neq 0$
2. Assume $H_0$ is true.
$\beta_1 = 0$
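3. Compute the test statistic.
$t = \frac{\hat{\beta}_1 - \beta_1}{S_{\hat{\beta}_1}}$

We must first find $S_{\hat{\beta}_1}$:
$S_{\hat{\beta}_1} = \frac{s}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}} = \frac{4.3319}{\sqrt{0.6561}} = 5.348$

$t = \frac{\hat{\beta}_1 - \beta_1}{S_{\hat{\beta}_1}} = \frac{-5.959 - 0}{5.348} = -1.114$

4. Compute the p-value: from the t table with df = 39 - 2 = 37,
p-value > 0.20 > $\alpha = 0.05$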
5. Conclude:
We fail to reject the null hypothesis. We cannot conclude that the
linear model is useful for predicting elongation from carbon content.
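
The five steps above can be reproduced numerically. The following is a minimal sketch in Python using only the summary statistics given in Question 1. The helper names `t_pdf` and `two_sided_p` are illustrative assumptions, not part of the tutorial, and the p-value is obtained by numerically integrating the t density rather than by reading a table.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value, using Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central = total * h / 3          # P(0 < T < |t|)
    return 1 - 2 * central           # P(|T| > |t|)

# Summary statistics from Question 1
sxx = 0.6561      # sum of (x_i - x_bar)^2
sxy = -3.9097     # sum of (x_i - x_bar)(y_i - y_bar)
s = 4.3319        # residual standard deviation
n = 39

slope = sxy / sxx                    # estimate of beta_1
se_slope = s / math.sqrt(sxx)        # standard error of the slope
t_stat = (slope - 0) / se_slope      # test statistic under H0: beta_1 = 0
p_value = two_sided_p(t_stat, n - 2)

alpha = 0.05
reject_h0 = p_value < alpha
print(round(slope, 3), round(se_slope, 3), round(t_stat, 3), reject_h0)
```

Because the computed p-value is larger than 0.20, `reject_h0` is `False`, which agrees with the conclusion in step 5.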
Question 2
Determine if a linear model is appropriate for the following residual
plots.
Solution 2
(The residual plots are not reproduced here; the answers follow the order of the plots.)

First plot: A linear model is not appropriate. The errors are not random; there is a
pattern.

Second plot: A linear model is not appropriate. The variance of the errors increases
along the regression line.

Third plot: A linear model is not appropriate. The errors are not independent.

Fourth plot: A linear model is not appropriate. The mean of the errors is not 0.
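
To see what a patterned residual plot looks like in numbers, here is a small sketch in Python. The data are hypothetical (a deliberately curved relationship, y = x^2) and are not taken from the tutorial. The function `least_squares` is an illustrative helper that applies the same slope and intercept formulas used in this tutorial.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical curved data: a straight line is the wrong model here.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

intercept, slope = least_squares(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

for x, e in zip(xs, residuals):
    print(x, round(e, 2))
```

The residuals are positive at both ends and negative in the middle. That U shape is the kind of systematic pattern that indicates a linear model is not appropriate, even though the residuals still average to zero.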


Question 3
The table below shows the average heights for South African boys
in 2010. Let a = age and b = height.
(a) Decide which variable should be the independent variable and which should
be the dependent variable.
(b) What is the slope of the least-squares (best-fit) line? Interpret the slope.
(c) Calculate the least-squares line.
(d) Find the correlation coefficient.
(e) Find the estimated average height for an eleven-year-old.
(f) Are there any outliers in the data?
Solution 3

a) Decide which variable should be the independent variable and
which should be the dependent variable.
Independent variable: x = age
Dependent variable: y = height

b) What is the slope of the least-squares (best-fit) line? Interpret the slope.
$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{1013.54}{142.85} = 7.095$

As the age of a South African boy increases by one year, the average
height tends to increase by 7.095 cm.

c) Calculate the least-squares line.
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 106.64 - 7.095(5.86) = 65.06$
The least-squares line is given by $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$:
$\hat{y} = 65.06 + 7.095x$

d) Find the correlation coefficient.
$\hat{\beta}_1 = r \times \frac{S_y}{S_x}$

$r = \hat{\beta}_1 \times \frac{S_x}{S_y} = 7.095 \times \frac{4.88}{35.47} = 0.976$

e) Find the estimated average height for an eleven-year-old.
$\hat{y} = 65.06 + 7.095(11) = 143.105$ cm
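
Parts (b) to (e) can be checked with a short Python sketch. It uses only the summary statistics quoted in the solutions above, since the raw table is not reproduced here. The variable names and the `predict_height` helper are illustrative assumptions.

```python
# Summary statistics quoted in Solution 3
sxy = 1013.54                 # sum of (x_i - x_bar)(y_i - y_bar)
sxx = 142.85                  # sum of (x_i - x_bar)^2
x_bar, y_bar = 5.86, 106.64   # mean age (years) and mean height (cm)
s_x, s_y = 4.88, 35.47        # sample standard deviations of age and height

slope = sxy / sxx                    # (b) estimated slope
intercept = y_bar - slope * x_bar    # (c) estimated intercept
r = slope * s_x / s_y                # (d) correlation coefficient

def predict_height(age):
    """(e) Estimated average height in cm at the given age in years."""
    return intercept + slope * age

print(round(slope, 3), round(intercept, 2), round(r, 3))
print(round(predict_height(11), 2))
```

The sketch carries the unrounded slope and intercept through to the prediction, so its last decimals can differ slightly from the hand calculation, which uses the rounded values 65.06 and 7.095.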
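
f) Are there any outliers in the data?
A point is treated as an outlier if its residual is more than two standard errors from the line: $|y_i - \hat{y}_i| > 2s$.

$s = \sqrt{\frac{(1 - r^2) \sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 2}} = \sqrt{\frac{(1 - 0.976^2)(7546.86)}{7 - 2}} = 8.46$

The outlier threshold is $2(8.46) = 16.92$. Check every residual $y_i - \hat{y}_i$ using $\hat{y} = 65.06 + 7.095x$.
No, there are no outliers.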
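
The outlier rule can be sketched in Python as follows. The standard error and threshold are computed from the values quoted above. The `flag_outliers` helper is an illustrative assumption: it shows how each residual would be compared with the threshold once the ages and heights from the table are supplied.

```python
import math

r = 0.976        # correlation coefficient from part (d)
syy = 7546.86    # sum of (y_i - y_bar)^2
n = 7            # number of data points

s = math.sqrt((1 - r ** 2) * syy / (n - 2))   # standard error of the estimate
threshold = 2 * s                             # outlier cut-off on |residual|

def flag_outliers(ages, heights, intercept=65.06, slope=7.095, limit=threshold):
    """Return (age, height, residual) for every point with |residual| > limit."""
    flagged = []
    for x, y in zip(ages, heights):
        residual = y - (intercept + slope * x)
        if abs(residual) > limit:
            flagged.append((x, y, residual))
    return flagged

print(round(s, 2), round(threshold, 2))
```

Calling `flag_outliers` with the ages and heights from the table should return an empty list if, as stated above, no residual exceeds the threshold.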
Thank you!
Happy studying ☺
