BIA B350F - 2022 Autumn - Assignment 1
Weighting: 30%
Due date: 14 November 2022 (Monday)
Learning outcomes:
Apply matrix algebra to summarize and process multivariate data.
Construct linear and logistic regression models to solve business prediction problems.
Instructions: (Marks will be deducted if you fail to follow the instructions below.)
In answering the questions of the assignment, show clearly the steps you take in arriving at
your solution to each question. Keep at least four decimal places in the final answer for
statistical computations unless otherwise specified.
Except for Question 5, which requires you to use R, all questions must be answered
manually.
A soft copy of the handwritten or typed answers to Questions 1 to 4 and the analysis report for
Question 5 (in Word) must be uploaded to OLE by the due date. The R program for Question
5 must also be uploaded to “Assignment 1 – R program”.
This is an individual assignment. Copying some or all of another student’s assignment or R
program is plagiarism. (Note: The assignment will be checked by Turnitin and a mark of zero
will be given to plagiarized work.)
Question 1 (8 marks)
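Let $A = \begin{bmatrix} -5 & 3 \\ 1 & 2 \end{bmatrix}$, $B = \begin{bmatrix} 3 & 2 \\ -2 & 6 \end{bmatrix}$, $C = \begin{bmatrix} -2 & 3 & 1 \end{bmatrix}'$ and $D = \begin{bmatrix} -1 \\ 4 \\ 2 \end{bmatrix}$.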
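Let $\Sigma = \begin{bmatrix} 4 & -2 & 1 \\ -2 & 16 & 3 \\ 1 & 3 & 25 \end{bmatrix}$ be the covariance matrix of the random vector $X = \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix}$.
(a) Determine $V^{1/2}$, $(V^{1/2})^{-1}$ and $\rho$. (6 marks)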
(b) Find the covariance matrix for the linear combination $2X_1 + 3X_2 - X_3$. (3 marks)
(c) Find the covariance matrix for the following linear combinations of $X_1$, $X_2$ and $X_3$:
$Z_1 = 2X_1 + X_2 + X_3$
$Z_2 = X_1 - X_2 + 2X_3$
(3 marks)
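For reference, the quantities in parts (a) to (c) are related as sketched below. This assumes the notation follows the usual multivariate convention, since the sheet itself does not define $V^{1/2}$ or $\rho$. The symbols $c$ (a coefficient vector) and $C$ (a matrix whose rows hold the coefficients of each linear combination) are introduced here for illustration only, and $\sigma_{ii}$ denotes the $i$-th diagonal entry of $\Sigma$.

```latex
V^{1/2} = \operatorname{diag}\left(\sqrt{\sigma_{11}},\ \sqrt{\sigma_{22}},\ \sqrt{\sigma_{33}}\right),
\qquad
\rho = \left(V^{1/2}\right)^{-1} \Sigma \left(V^{1/2}\right)^{-1}

\operatorname{Var}(c'X) = c' \Sigma c,
\qquad
\operatorname{Cov}(CX) = C \Sigma C'
```

Under this reading, part (b) uses $c = (2, 3, -1)'$, and in part (c) the rows of $C$ would be $(2, 1, 1)$ and $(1, -1, 2)$.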
Question 5 (60 marks)
A researcher wishes to predict concrete compressive strength using a multiple linear regression
model. Concrete compressive strength is a highly nonlinear function of age and ingredients.
A random sample of 1030 concrete laboratory testing results was collected to form the dataset
“Concrete_data.csv”. The dataset includes the following nine variables:
Variable       Description
Cement         Cement (kg/m³)
Slag           Blast Furnace Slag (kg/m³)
Ash            Fly Ash (kg/m³)
Water          Water (kg/m³)
Plasticizer    Superplasticizer (kg/m³)
C_aggregate    Coarse Aggregate (kg/m³)
F_aggregate    Fine Aggregate (kg/m³)
Age            Number of days after mixing (1-365 days)
Strength       Concrete compressive strength (MPa)
The dependent variable is “Strength”. The “Concrete_data.csv” dataset can be downloaded from the
OLE.
(a) Use R to determine a multiple linear regression model that predicts the Strength of concrete,
using forward stepwise regression to decide which of the other given variables to include as
independent variables. You are expected to perform relevant model checking, including plotting
relevant graphs, after the desired model is formulated. All R programs must be included in the
answer; marks will be deducted otherwise.
(40 marks)
(b) Perform the relevant hypothesis tests to assess the validity of the multiple linear regression
model obtained, as well as the validity of the individual regression coefficients. (5 marks)
(d) Write a reflective journal of not more than 200 words that summarizes your learning experience
in applying the knowledge and skills acquired in the course to build the regression model for the
given problem, and that explains how this experience could enrich your ability to apply course
knowledge to real-life applications. (10 marks)
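As a starting point for parts (a) and (b), the workflow might be sketched in R as follows. This is a minimal sketch under stated assumptions, not a model answer: the helper names `forward_select` and `regression_tests` are hypothetical, the CSV column names are assumed to match the variable table exactly, and `step()` selects by AIC by default, which may differ from the entry criterion taught in the course.

```r
# Sketch only: helper names below are hypothetical, not part of the assignment.

# Forward stepwise selection, starting from the intercept-only model.
# `concrete` is assumed to be a data frame whose columns are named exactly
# as in the variable table (Cement, Slag, ..., Age, Strength).
forward_select <- function(concrete) {
  null_model <- lm(Strength ~ 1, data = concrete)
  full_scope <- Strength ~ Cement + Slag + Ash + Water + Plasticizer +
    C_aggregate + F_aggregate + Age
  # step() adds one predictor at a time; its default criterion is AIC,
  # which may differ from the entry rule used in the course.
  step(null_model, scope = list(lower = ~ 1, upper = full_scope),
       direction = "forward", trace = 0)
}

# Overall F-test and per-coefficient t-tests from a fitted lm object
# (the model must contain at least one predictor).
regression_tests <- function(fit) {
  s <- summary(fit)
  f <- s$fstatistic  # named vector: value, numdf, dendf
  list(
    f_statistic = unname(f["value"]),
    f_p_value = unname(pf(f["value"], f["numdf"], f["dendf"],
                          lower.tail = FALSE)),
    coefficients = s$coefficients  # Estimate, Std. Error, t value, Pr(>|t|)
  )
}

# Usage, assuming Concrete_data.csv is in the working directory:
# concrete <- read.csv("Concrete_data.csv")
# fit <- forward_select(concrete)
# regression_tests(fit)
# par(mfrow = c(2, 2)); plot(fit)  # standard lm diagnostic plots
```

The diagnostic plots from `plot(fit)` cover residuals against fitted values, a normal Q-Q plot, scale-location, and leverage, which is one common way to approach the model checking requested in part (a).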