
228371
Statistical Modelling for Engineers & Technologists

LECTURE SET 3: More Regression

Lecture 1 (Monday 11am): Polynomial Regression  ← We are here
Lecture 2 (Weds 8am): One-way ANOVA
Lecture 3 (Thurs 12pm): Two-way ANOVA
Lab (2 hours on Friday): "Polynomial Regression, One and Two-way ANOVA"

Lab 1 (Friday 8am – 10am) OR Lab 2 (Friday 12pm – 2pm)


Lecture 1: Polynomial Regression

Regression → Fitting equations to data

Linear regression → Linear in parameters [does not mean straight line]

Polynomial regression → STILL LINEAR! [Just with powers of x]

$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \beta_3 x_i^3 + \beta_4 x_i^4 + \dots + \epsilon_i$

- Assumptions of normality and constant variance still need to be satisfied by y
- Enter your powers in order, and test for significance in order (2, 3, 4, …)
- In practical terms, "new" variables equal to $x^2$, $x^3$, etc. are calculated and added to the model.

Example: shotputt_powerclean.csv – Look at data


http://www.stat.ufl.edu/~winner/data/ → shotputt_powerclean.csv
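The loading step was shown only as a screenshot; a minimal sketch, assuming the CSV has been downloaded into the working directory (the data frame name shot matches the code below):

shot <- read.csv("shotputt_powerclean.csv")  # hypothetical local copy of the file
str(shot)  # expect columns power.clean and shot.putt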

plot(shot.putt ~ power.clean, data = shot)
abline(lm(shot.putt ~ power.clean, data = shot))

Example: shotputt_powerclean.csv – Build model
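The fitted model on this slide appeared only as R output; a sketch of the straight-line fit it presumably contained (m1 is the name used on the next slide):

m1 <- lm(shot.putt ~ power.clean, data = shot)  # simple linear model
summary(m1)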



Example: shotputt_powerclean.csv – Evaluate model


plot(m1, which = 1)

Residual plot shows evidence of curvature

Try a quadratic model



Example: shotputt_powerclean.csv – Build model


m2 <- lm(shot.putt ~ power.clean + I(power.clean^2), data = shot)
summary(m2)

Looking for significance of the coefficient of power.clean^2

Example: shotputt_powerclean.csv – Look at model


plot(shot.putt ~ power.clean, data = shot)
x <- seq(30, 150)
y <- predict.lm(m2, data.frame(power.clean = x))
points(y ~ x, type = "l", lwd = 2, col = "red")

Example: shotputt_powerclean.csv – Evaluate model


plot(m2, which = 1)

Residual plot shows no consistent curvature

Note:
plot(m2, 1) # same output

Comparing polynomial models

- Each model needs to be fitted in turn

- You always include every term of lower order in each model
  - e.g., a cubic should have quadratic and linear terms

- Compare using residual error, adjusted R², and the standard residual analysis graphics.
- We look more at adjusted R² later ☺

Comparing polynomial models

summary(m1)

summary(m2)

Can compare these values as they are, or use a statistical test (next slides).
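A sketch of pulling those comparison values out of the fitted models directly, rather than reading them off the printed summaries:

summary(m1)$sigma          # residual standard error, linear model
summary(m2)$sigma          # residual standard error, quadratic model
summary(m1)$adj.r.squared  # adjusted R-squared, linear model
summary(m2)$adj.r.squared  # adjusted R-squared, quadratic model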

Comparing polynomial models: F-test

The linear model is a subset of the quadratic model (if the coefficient of the quadratic term is zero, it reduces to the linear model).
→ This means we can compare the models with an F-test.

H0: “Linear (smaller) model is correct”


Ha: “Quadratic model is correct”
anova(m1, m2)

Looking for p-value to reject (or not) H0

If the p-value is low, the null must go


Conclude that quadratic model is preferable

Extrapolation

- Be careful not to over-interpret polynomial models

- Is curvature real (based on science etc.)? Or due to outliers?

- Polynomials are, in general, very poor for extrapolation (prediction outside data range)
- They rapidly become large (positive or negative)

- They may be good for prediction within the range of the x-values

- Prediction intervals get larger outside the range of x-values, BUT they still assume that the model
is correct
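A sketch illustrating the last point with m2; the grid here deliberately extends well beyond the observed power.clean values (an assumed illustrative range):

newx <- data.frame(power.clean = seq(0, 250, by = 25))  # extends outside the data range
predict(m2, newx, interval = "prediction")  # intervals widen, but still assume the quadratic holds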

Alternative fitting method


We can use the poly() function in R to build the model; it makes more complicated models easier to code:
m2 <- lm(shot.putt ~ power.clean + I(power.clean^2), data = shot) # original
m2 <- lm(shot.putt ~ poly(power.clean, 2, raw = TRUE), data = shot) # with poly()
summary(m2)

Example: Indianapolis.csv – Look at data


Indy500 <- read.csv("Indianapolis.csv")
str(Indy500)

head(Indy500)

Example: Indianapolis.csv – Look at data & Build set of models


plot(Speed ~ Year, data = Indy500)

Indy500$Yr <- Indy500$Year - 1900  # rescale Year so the polynomial powers stay numerically small

m1<-lm(Speed ~ Yr, data = Indy500)


m2<-lm(Speed ~ poly(Yr, 2, raw = TRUE), data=Indy500)
m3<-lm(Speed ~ poly(Yr, 3, raw = TRUE), data=Indy500)
m4<-lm(Speed ~ poly(Yr, 4, raw = TRUE), data=Indy500)

Example: Indianapolis.csv – Evaluate models


anova(m1,m2,m3,m4)

Looking for p-value to reject (or not) H0
If the p-value is low, the null must go

Conclude that the cubic model (m3) is preferable; the quartic (m4) is not significant.

Example: Indianapolis.csv
plot(Indy500$Year, Indy500$Speed, xlab = "Year", ylab = "Speed")
lines(Indy500$Year, fitted(m3), col = 3, lwd = 3)

Example: Indianapolis.csv
summary(m3)
Lecture 2: One-way ANOVA

F-test
There is an overall test to see if the model is useful in predicting change in the response.

This is called the F-test; it uses Mean Squares (MS), which are related to the Sums of Squares.

For a model with p explanatory variables:

F-test statistic: $F = \frac{MS_{Reg}}{MS_{Res}} = \frac{SS_{Reg}/p}{SS_{Res}/(n-p-1)}$

The null hypothesis is that the model is NOT significant; under H0 the test statistic follows an F distribution with p and n − p − 1 degrees of freedom.

ANOVA is "ANalysis Of VAriance"

Remember – the real world is messy; we are not just interested in fitting a model, we are interested in the stuff around it – the variance.

ANOVA provides information about variability within our regression models, and allows us to test for significance.

For polynomials in the last lecture, we were using ANOVA to test $H_0: \beta_i = 0$ against $H_a: \beta_i \neq 0$ using the F-statistic.

Factors

Factors are discrete valued variables. The values a factor takes are called levels. For example:

- Drug treatment: Control, Drug A, Drug B


- Sex: Male, Female
- Month: Jan, Feb, Mar, Apr, …, Dec
- Water Temperature: Cold, Warm, Hot

Factor levels can be ordered or nominal:

- Ordered: Water Temperature (Cold < Warm < Hot)
- Nominal: Diet (Corn, Grain, Soybean – no natural ordering of levels)
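A sketch of declaring the two kinds of factor in R (the values are illustrative):

temp <- factor(c("Cold", "Hot", "Warm"),
               levels = c("Cold", "Warm", "Hot"), ordered = TRUE)  # ordered factor
diet <- factor(c("Corn", "Grain", "Soybean"))  # nominal factor (the default)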

Numerical variables as factors

Numerical variables with discrete levels can be modelled as factors.

- This is common in experimental design because you can set the levels

- We model factors by changes in the mean response for each level (not with regression lines)

One-way ANOVA

Used to analyse experiments with ONE factor.

Model: Fit = overall effect + treatment effect

Fit: mean of response for treatment group, $\bar{y}_i$
Overall effect: mean of all responses, $\bar{y}$
Treatment effect: fit − overall effect, $\bar{y}_i - \bar{y}$

Let $y_{ij}$ be the jth replicate of treatment i; then:

$y_{ij} = \bar{y} + (\bar{y}_i - \bar{y}) + (y_{ij} - \bar{y}_i)$

Observation = overall + treatment + residual = fit + residual

Example: fabrics.txt
Q → are any of the fabrics more or less flammable than the others?
fabrics <- read.table("fabrics.txt", header = TRUE)
# Data are "stacked" → need unstack()
attach(fabrics)
fabrics_table <- unstack(as.data.frame(fabrics), burntime ~ fabric)
names(fabrics_table) <- paste0("Fabric", 1:4)

Example: fabrics.txt
Q → are any of the fabrics more or less flammable than the others?

$\bar{y} = 12.69$ (mean of all 20 observations)

e.g., Fabric 1 effect: $\bar{y}_1 - \bar{y} = 16.78 - 12.69 = 4.09$

Residual for 5th observation of Fabric 1: $y_{15} - \bar{y}_1 = 15.0 - 16.78 = -1.78$
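These numbers can be reproduced from the stacked data frame; a sketch:

ybar <- mean(fabrics$burntime)  # overall mean, 12.69
tapply(fabrics$burntime, fabrics$fabric, mean) - ybar  # treatment effects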



Example: fabrics.txt – EDA: Exploratory Data Analysis


plot(burntime ~ fabric, data = fabrics)
abline(h = mean(fabrics$burntime), lty = 2)

Example: fabrics.txt – EDA: Exploratory Data Analysis


plot(burntime ~ factor(fabric), data = fabrics)
abline(h = median(fabrics$burntime))

Fabrics 2 and 4 burn times are similar

Fabric 1 burn times seem much higher

Fabric 3 burn times seem a bit lower



Decompose Variation

$\sum_i \sum_j (y_{ij} - \bar{y})^2 = \sum_i \sum_j (\bar{y}_i - \bar{y})^2 + \sum_i \sum_j (y_{ij} - \bar{y}_i)^2$

Total Sum of Squares = Factor Sum of Squares + Residual Sum of Squares

Same as regression, but with $\bar{y}_i$ in place of the fitted value $\hat{y}_i$.


Let k = number of treatments (factor levels) and n = total number of observations (all groups).
Then: Total df = n − 1, Factor df = k − 1, and Residual df = n − k.

Mean Sq = Sum Sq / df
F = Factor MS / Residual MS

(i.e., the same as for regression)
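A sketch of building these quantities by hand for the fabrics data, to be checked against the anova() output on the next slide:

k <- length(unique(fabrics$fabric))  # number of treatments
n <- nrow(fabrics)                   # total number of observations
fit <- tapply(fabrics$burntime, fabrics$fabric, mean)[as.character(fabrics$fabric)]
SSfac <- sum((fit - mean(fabrics$burntime))^2)  # factor sum of squares
SSres <- sum((fabrics$burntime - fit)^2)        # residual sum of squares
(SSfac / (k - 1)) / (SSres / (n - k))           # F statistic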



One-way ANOVA in R – Example: fabrics.txt

Let $\mu_i$ be the true mean of the ith group (i = 1, 2, …, k).

$H_0: \mu_1 = \mu_2 = \dots = \mu_k$ (no difference in means)
$H_a:$ not $H_0$ (i.e., at least one difference)

oneway <- lm(burntime ~ factor(fabric)) # tell R that fabric is a factor


anova(oneway)
Looking for p-value to reject (or not) H0
If the p-value is low, the null must go

Conclude that burn times differ across fabrics



One-way ANOVA in R – Example: fabrics.txt


summary(oneway)

Fabric type explains 72% of the variation in burn times

One-way ANOVA in R – Example: fabrics.txt

The R command lm() creates a linear model for the fit: the equation takes the value of i (fabric type) and outputs the corresponding treatment mean, $\bar{y}_i$.

It does this by creating indicator variables for each level after the first level, e.g., for the 2nd fabric:

$I_2 = \begin{cases} 1 & \text{if the factor is at level 2} \\ 0 & \text{if the factor is at another level} \end{cases}$

So the R output tells us that the regression equation is:

Expected burn time = 16.78 − 5.02·I2 − 6.54·I3 − 4.80·I4
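A sketch for inspecting those indicator variables directly from the fitted one-way model:

head(model.matrix(oneway))  # columns: intercept, I2, I3, I4
coef(oneway)  # intercept = Fabric 1 mean; the rest are differences from it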



One-way ANOVA in R – TukeyHSD()

To explore differences between means (i.e., if H0 is rejected), use TukeyHSD():


Produces confidence intervals for every paired difference $(\mu_i - \mu_j)$

Cannot use two-sample t-tests, because:

1. The comparisons are not all independent of one another (they share the same means), and
2. By chance, we would expect 1 in 20 differences to be significant even if H0 were true

TukeyHSD() gives joint 95% confidence for all intervals simultaneously – note: it only works on aov() objects,
e.g., TukeyHSD(aov(oneway))

HSD: Honest Significant Difference (doing separate t-tests would be ‘dishonest’)



One-way ANOVA in R – TukeyHSD()

MC <- TukeyHSD(aov(oneway))
MC

Differences between Fabric 1 and the other burn times are ALL significant

Differences between the others are NOT significant

One-way ANOVA in R – TukeyHSD()


plot(MC)

Remember these are differences of means:

Top 3 intervals don't span 0 → there IS a difference between means

Bottom 3 intervals do span 0 → there is NO difference between means

Important note on ANOVA & TukeyHSD()

IGNORE the coefficients part of the summary() output for treatments/factors.

Always use TukeyHSD() to explore significant differences between treatments.

Only use summary() for the R² and overall p-value.



ANOVA assumptions
[similar to regression assumptions]

1. Responses are normally distributed about their means
   Check: using residuals as before

2. Variances are constant across treatment groups
   Check: with two different tests for equal variances: Bartlett's (assumes normality) and Levene's.

If necessary, transform response variable to satisfy assumptions as before.
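A sketch of the first check, applying the usual residual graphics to the one-way model:

plot(oneway, which = 2)  # normal Q-Q plot of residuals (normality)
plot(oneway, which = 1)  # residuals vs fitted (spread across groups)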



Tests for equal variances in R

bartlett.test(burntime ~ factor(fabric), data = fabrics)

library("car")  # must install.packages("car") first
leveneTest(burntime ~ factor(fabric), data = fabrics)

Both tests have large p-values → cannot reject the null hypothesis of equal variance.
Conclude that the equal-variance assumption is satisfied.
Lecture 3: Two-way ANOVA

Two-way ANOVA – example concrete.txt


Used to analyse experiments with TWO factors.
Represent data in a table, rows for factor 1 levels, and columns for factor 2 levels.

concrete <- read.table("concrete.txt", header = TRUE)
View(concrete)
concrete$aggregate <- paste0("Agg", concrete$aggregate)
concrete$cement <- paste0("Cem", concrete$cement)
attach(concrete)
concrete.tab <- tapply(strength, list(cement, aggregate), mean)
addmargins(concrete.tab, FUN = mean)

Two-way ANOVA

Model: fit = overall effect + row effect + column effect

Decomposition of variation and ANOVA table similar to one-way, but now there are two factor SumSqs
giving two F tests and p-values.

FOUR possible test outcomes:


- neither significant,
- factor 1 significant,
- factor 2 significant,
- both factors significant.

Two-way ANOVA

Overall mean: $\bar{y} = 14.42$
e.g., row 2 effect = 13.67 − 14.42 = −0.75
e.g., col 1 effect = 5.50 − 14.42 = −8.92

Effect = (row or column) mean − overall mean

Model for, e.g., the first cell (row 1, col 1):
[Model: fit = overall effect + row effect + column effect]

Fitted = 14.42 − 2.42 − 8.92 = 3.08    Residual = 4 − 3.08 = 0.92



Two-way ANOVA – example concrete.txt
concrete <- read.table("concrete.txt", header = TRUE)
attach(concrete)

rowmeans <- sapply(split(strength, cement), mean)


rowmeans

colmeans <- sapply(split(strength, aggregate), mean)


colmeans

Note: Can also use rowMeans() and colMeans() if you make it into a table first – see slide 41

Main effects plot


Plot row means and plot column means to explore magnitude of differences due to each factor.
agg <- c(1, 2, 3)
plot(agg, colmeans, type = "o", xlab = "Aggregate", xaxp = c(1, 3, 2))
abline(h = mean(strength), lty = 2)

cem <- c(1, 2, 3, 4)
plot(cem, rowmeans, type = "o", xlab = "Cement", xaxp = c(1, 4, 3))
abline(h = mean(strength), lty = 2)

Aggregate seems to have a greater effect than Cement

Two-way ANOVA in R
twoway <- lm(strength ~ factor(aggregate) + factor(cement))
anova(twoway)

Aggregate: p-value is low → Aggregate IS significant
Cement: p-value is high → Cement is NOT significant

Choose whichever cement you want (cost, availability, etc.), but choose Aggregate 3 to maximise concrete strength

Two-way ANOVA with Interaction

Suppose we hold one factor constant, and vary the levels of the other factor:

If the changes are the same (similar) regardless of which first factor level was chosen, then there is no
(little) interaction.

However, if the response changes are not similar, then the two factors are interacting with one another.

Need to investigate whether interaction is present, and if so, it needs to be put into the fitted model.

Note: Need at least 2 replicates per cell.



Example: mixingtime.txt

A large paddle is used to mix milk that has been collected and stored in large vats.

The optimal mixing time depends on the diameter of the paddle, and its rotation speed (3 levels for
each).

mixing <- read.table("mixingtime.txt", header = TRUE)


attach(mixing)

rowmeans <- sapply(split(mixtime, speed), mean)


colmeans <- sapply(split(mixtime, diameter), mean)
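Before fitting an interaction it is worth confirming the at-least-2-replicates-per-cell requirement noted on the previous slide; a sketch:

table(mixing$speed, mixing$diameter)  # replicates per speed × diameter cell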

Example: mixingtime.txt – Plot main effects


spd <- c(1, 2, 3)
plot(spd, rowmeans, type = "o", xlab = "Speed", xaxp = c(1, 3, 2))
abline(h = mean(mixtime), lty = 2)

dia <- c(1, 2, 3)
plot(dia, colmeans, type = "o", xlab = "Diameter", xaxp = c(1, 3, 2))
abline(h = mean(mixtime), lty = 2)

Speed seems to have a greater effect than Diameter

Interaction Plots: mixingtime.txt

interaction.plot(factor(speed), factor(diameter), mixtime, type = "b")



Interaction Plots: mixingtime.txt

interaction.plot(factor(diameter), factor(speed), mixtime, type = "b")



Two-way model with interaction

Model: fit = overall effect + row effect + column effect + interaction effect

Two factor SumSqs plus an interaction SumSq giving three F tests and p-values.

Interaction is represented using a multiplication symbol (× or *) between factors.

If the interaction term is not significant, then we can refit the ANOVA model without it.

Use TukeyHSD() to explore differences as before.



Two-way model with interaction


twowayintact <- lm(mixtime ~ factor(diameter) * factor(speed))
anova(twowayintact)

All p-values are low.


Diameter, speed, and
interaction terms are all
significant in determining
optimal mixing time.

Note:
df(row factor) = r − 1, where r is the number of row levels (there are three speeds)
df(col factor) = c − 1, where c is the number of column levels (there are three diameters)
df(interaction) = (r − 1)(c − 1)
df(residuals) = (n − 1) − (r − 1) − (c − 1) − (r − 1)(c − 1)

ANCOVA – ANalysis of COVAriance


Indicator variables allow us to explore the effect of a factor on a regression situation (quantitative explanatory and
response variables) – the quantitative variable is called a covariate.

Analysis allows different regression lines (different slopes and/or intercepts) for different levels of the factor by
testing for significance using the linear model.

Response variable: Number of cats
Explanatory variable: pink hair or not
Covariate: Allergies? Home-owner?



ANCOVA – Example: restaurant.txt


Restaurant sales (Y, the response) depend on the number of households (X, the covariate), and the restaurant location (the explanatory factor)

ANCOVA – Example: restaurant.txt


Restaurant <- read.table("restaurant.txt", header = TRUE)
attach(Restaurant)
plot(homes, sales, type = "n", xlab = "Homes", ylab = "Sales")
text(homes, sales, as.character(location))

Looks like two lines are needed: one for Mall, and one for Highway + Street.

Possibly parallel lines (same slope, different intercepts)

ANCOVA Model

$Y = \beta_0 + \beta_1 I_2 + \beta_2 I_3 + (\beta_3 + \beta_4 I_2 + \beta_5 I_3)X$

Indicators are like before:

$I_2 = \begin{cases} 1 & \text{if the factor is at level 2} \\ 0 & \text{if the factor is at another level} \end{cases}$

This model allows (up to) three different lines:

Highway: $I_2 = I_3 = 0 \Rightarrow Y = \beta_0 + \beta_3 X$
Mall: $I_2 = 1, I_3 = 0 \Rightarrow Y = (\beta_0 + \beta_1) + (\beta_3 + \beta_4)X$
Street: $I_2 = 0, I_3 = 1 \Rightarrow Y = (\beta_0 + \beta_2) + (\beta_3 + \beta_5)X$

Interaction model with X and a factor


anc <- lm(sales ~ homes*location)
summary(anc)

Only these two are significant:
Homes (X) and Mall (I2)

Interaction model with X and a factor

model.matrix(anc) shows the indicator columns R constructed.
Extract I2, and then regress Y on X and I2:

I2 <- model.matrix(anc)[, 3]
final <- lm(sales ~ homes + I2)
summary(final)

Interaction model with X and a factor

summary(final)$coefficients

$b_0 = -3.298$, $b_1 = 23.84$, $b_3 = 0.906$

Full model: $Y = \beta_0 + \beta_1 I_2 + \beta_2 I_3 + (\beta_3 + \beta_4 I_2 + \beta_5 I_3)X$
Final model: $Y = \beta_0 + \beta_1 I_2 + \beta_3 X$

Highway: $I_2 = I_3 = 0 \Rightarrow \hat{y} = -3.298 + 0.906x$
Mall: $I_2 = 1, I_3 = 0 \Rightarrow \hat{y} = (-3.298 + 23.84) + 0.906x$
Street: $I_2 = 0, I_3 = 1 \Rightarrow \hat{y} = -3.298 + 0.906x$

Interaction model with X and a factor


plot(homes, sales, type = "n", xlab = "Homes", ylab = "Sales")
text(homes, sales, as.character(location))
abline(coef = c(-3.298, 0.906), lty = 1)
abline(coef = c(20.54, 0.906), lty = 2)
legend("bottomright", lty = 1:2, c("Highway & Street", "Mall"))

