CMPT 726: Assignment 1 (Fall 2024)
Instructor: Steven Bergner
Important Note: The university policy on academic dishonesty (cheating) will be taken very seriously in this course.
You are encouraged to discuss the concepts involved in the questions with other students. If you are in doubt as to what constitutes acceptable discussion, please ask! Further, please take advantage of office hours offered by the instructor and the TA if you are having difficulties with this assignment.
DO NOT:
• Provide or use any solution, in whole or in part, to or by another student.
DO:
• Meet with other students to discuss the assignment (it is best not to take any notes during such meetings, and to re-work the assignment on your own).
• Use online resources (e.g., Wikipedia) to understand the concepts needed to solve the assignment.
The assignment must be submitted online on Coursys. You must submit a report in PDF format. You may typeset your assignment in LaTeX or Word, or submit neatly handwritten and scanned solutions. We will not be able to give credit to solutions that are not legible.
For the last question you are also required to submit a Python script (linreg_submission.py) containing your complete code for each of the linear regression tasks. You do not need to include the actual Python code in your report, but please provide a clear discussion of your results.
1. Linear Algebra
a) Let $U$ be a subspace of $\mathbb{R}^5$ defined by
$$U = \left\{ \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix} \in \mathbb{R}^5 : x_1 = 3x_2 \text{ and } x_4 = 2x_5 \right\}$$
Figure 1: In this case, $x = (1, 0)$ and $y = (0, 1)$ are both rotated by $\theta = \pi/4$.
Show that $U$ and $V^\top$ are both rotation matrices and find their corresponding rotation angles $\theta_U$ and $\theta_{V^\top}$.
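For reference (this standard definition is supplied here for convenience), a $2 \times 2$ rotation matrix by angle $\theta$ has the form
$$R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$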
$$f(\vec{x}) = 5x_1^2 + 3x_2^2 + 2x_3^2 + 4x_1x_2 - 2x_1x_3 + 6x_2x_3$$
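It may help to note (a standard rewriting, supplied here for convenience) that $f$ can be expressed as a quadratic form $f(\vec{x}) = \vec{x}^\top A \vec{x}$ with the symmetric matrix
$$A = \begin{pmatrix} 5 & 2 & -1 \\ 2 & 3 & 3 \\ -1 & 3 & 2 \end{pmatrix}$$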
• $f(t\vec{x} + (1-t)\vec{y}) \le t f(\vec{x}) + (1-t) f(\vec{y})$
• If $f$ is differentiable: $f(\vec{y}) \ge f(\vec{x}) + (\nabla f(\vec{x}))^\top (\vec{y} - \vec{x})$
• If $f$ is twice differentiable: $H_f(\vec{x}) \succeq 0$
a) Given $x \in \mathbb{R}$ and only using the definition of convex functions given above, prove that the rectified linear unit function, $\mathrm{ReLU}(x) := \max(x, 0)$, is convex.
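As a sanity check on your argument (a numerical spot-check, not a proof), the first characterization can be tested on random points; the sketch below assumes NumPy.

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

# Spot-check f(t*x + (1-t)*y) <= t*f(x) + (1-t)*f(y) on random inputs.
rng = np.random.default_rng(0)
for _ in range(10_000):
    x, y = rng.uniform(-10, 10, size=2)
    t = rng.uniform(0, 1)
    assert relu(t * x + (1 - t) * y) <= t * relu(x) + (1 - t) * relu(y) + 1e-12
print("No counterexample found in 10,000 random trials.")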
5. Linear Regression – House Price Prediction with Polynomial Features and Ridge Regression
You are working on predicting house prices in a real estate market using a dataset that consists of 500 examples,
each with multiple features: number of rooms, house age, area size, etc. The target variable is the house price. You
decide to apply linear regression and Ridge regression (a regularized form of linear regression) to build predictive
models. Your goal is to assess the generalization of these models using cross-validation and to experiment with
different polynomial degrees and regularization strengths to improve the model’s performance.
Data Loading: Use the following code to load the Boston Housing dataset from GitHub and initialize the DataFrame (the URL below is a stand-in; use the link provided with the assignment if it differs). For the model, only use the variables rm (number of rooms) and lstat (lower status population, percentage), and the target variable medv (median house value in $1,000s).

import pandas as pd

# Load the Boston Housing dataset from GitHub (URL is a stand-in; see note above)
url = "https://raw.githubusercontent.com/selva86/datasets/master/BostonHousing.csv"
df = pd.read_csv(url)

# Features: 'rm' (number of rooms) and 'lstat' (lower status population, percentage)
X = df[['rm', 'lstat']]
y = df['medv']  # Target: 'medv' (median house value in $1,000s)
Implementation Hints: For managing your dataset, use pandas DataFrames; for the models and training tools, use scikit-learn. Plotting can be done with pandas' built-in plotting functions or with matplotlib. You can refer to the official scikit-learn documentation for functions such as train_test_split, cross_val_score, LinearRegression, and Ridge.
For coding environments:
• You can work in a Jupyter notebook in Google Colab, and export it as a .py script for final submission.
• If you are already using VS Code, consider using #%% cell separators in your Python script, allowing you
to run parts of your script like Jupyter notebook cells.
Ensure your script includes the code for each task, such as the MSE computation and the best parameter choices, and include the resulting outputs in your PDF report.
a. Train-Test Split and Cross-Validation for Linear Regression with Polynomial Features
Split the dataset into a training set (80%) and a test set (20%). Use 5-fold cross-validation on the training set
to evaluate the performance of linear regression models with polynomial features. For each polynomial degree
(from 1 to 5), compute the average mean squared error (MSE) over the five folds and report your results.
Hint: Use PolynomialFeatures from scikit-learn to create polynomial features of different degrees. When using the cross_val_score function, set the scoring parameter to 'neg_mean_squared_error'.
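A minimal sketch of this step (the loop structure and variable names are illustrative; X and y come from the loading code above):

from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# 80/20 train-test split; fixed random_state for reproducibility
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# 5-fold CV of linear regression with polynomial features, degrees 1 to 5
for degree in range(1, 6):
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    scores = cross_val_score(model, X_train, y_train, cv=5,
                             scoring='neg_mean_squared_error')
    print(f"degree={degree}: average CV MSE = {-scores.mean():.3f}")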
b. Cross-Validation for Ridge Regression with Grid Search
Ridge regression introduces a regularization term controlled by a hyperparameter α. Perform 5-fold cross-validation on the training set with Ridge regression, using polynomial features (degrees 1 to 5) and different values of α (e.g., 0.1, 1, 10, 100). Use grid search to find the best α and the optimal degree, and report the values that minimize the cross-validation error along with the corresponding average MSE.
Hint: Use GridSearchCV from scikit-learn to automate the search over the regularization parameter α and the polynomial degree.
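One way to set this up (a sketch; the poly__/ridge__ prefixes follow scikit-learn's Pipeline step-naming convention):

from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge

# Pipeline: polynomial feature expansion followed by Ridge regression
pipe = Pipeline([('poly', PolynomialFeatures()), ('ridge', Ridge())])

# Search jointly over the polynomial degree and the regularization strength
param_grid = {'poly__degree': [1, 2, 3, 4, 5],
              'ridge__alpha': [0.1, 1, 10, 100]}
search = GridSearchCV(pipe, param_grid, cv=5,
                      scoring='neg_mean_squared_error')
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Best average CV MSE:", -search.best_score_)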
c. Final Training and Test-Set Evaluation
Now that the best polynomial degree and regularization strength have been identified in part (b), train both the linear regression model and the Ridge regression model on the training set, and evaluate their MSE on the test set. Compare the performance of the two models, and discuss which model generalizes better to unseen data and why.
Hint: Use mean_squared_error from scikit-learn to evaluate the models on the test set.
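A sketch of the final evaluation (best_degree and best_alpha are hypothetical names for the values found by the grid search in part (b)):

from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression, Ridge

# best_degree and best_alpha: hypothetical names for the grid-search results
models = {
    'Linear': make_pipeline(PolynomialFeatures(best_degree), LinearRegression()),
    'Ridge': make_pipeline(PolynomialFeatures(best_degree), Ridge(alpha=best_alpha)),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name} regression test MSE: {mse:.3f}")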
d. Theoretical Considerations
Explain why the regularization in Ridge regression helps prevent overfitting, especially when using polynomial
features. How does the choice of the regularization parameter α and the polynomial degree influence the model?
What might happen if α is too small or too large, or if the degree is too high?
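For reference in your discussion, Ridge regression (in the form scikit-learn uses) minimizes the least-squares loss plus a squared-$\ell_2$ penalty on the weights:
$$\min_{\vec{w}} \; \|\vec{y} - X\vec{w}\|_2^2 + \alpha \|\vec{w}\|_2^2$$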