
Tutorial 1 Machine Learning

Universiti Malaya


WIA1006/WID3006, Semester II, Session 2022/2023

Tutorial 1

1.1 Intro to ML

1. A computer program is said to learn from experience E with respect to some task T and
some performance measure P if its performance on T, as measured by P, improves with
experience E. Suppose we feed a learning algorithm a lot of historical trajectory data of
vehicles, and have it learn to predict traffic speed. In this setting what is E?
2. Suppose you are working on traffic prediction, and you would like to predict whether traffic
will be heavy at 5pm tomorrow or not. You want to use a learning algorithm for this. Would
you treat this as a classification or a regression problem?

3. Some of the problems below are best addressed using a supervised learning algorithm, and
the others with an unsupervised learning algorithm. Which of the following would you
apply supervised learning to? (Select all that apply.) In each case, assume some appropriate
dataset is available for your algorithm to learn from.

Problems (answer Supervised or Unsupervised for each):

Take a collection of 1000 essays written on the US Economy, and find a way
to automatically group these essays into a small number of groups of essays
that are somehow "similar" or "related".

Given 50 articles written by male authors, and 50 articles written by female
authors, learn to predict the gender of a new manuscript's author (when the
identity of this author is unknown).

Examine a large collection of emails that are known to be spam email, to
discover if there are sub-types of spam mail.

Examine a web page, and classify whether the content on the web page
should be considered "child friendly" (e.g., non-pornographic, etc.) or
"adult."

1.2 Linear Regression (Univariate)

1. Consider the problem of predicting the number of sunny days in each week of 2023 given the
number of sunny days in each week of 2022.
In this scenario, x is the number of sunny days in a given week of 2022, and y is the number
of sunny days in the corresponding week of 2023, which we want to predict.
The following training set is a sample of a few weeks, with the number of sunny days in each
of them.
Recall that in linear regression our hypothesis is $h_\theta(x) = \theta_0 + \theta_1 x$, and we use m to
denote the number of training examples.

x y
3 2
2 3
4 5
1 1
5 4

For the training set given above, what is the value of m?

2. For this question, continue using the data provided in (1). Recall the definition of cost
function for linear regression is
$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

What is J(0,1)?

3. Suppose we set $\theta_0 = -1$, $\theta_1 = 0.5$. What is $h_\theta(3)$?
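The quantities in questions 1 to 3 can be checked numerically. The following is a minimal sketch, assuming the hypothesis has the usual univariate form (intercept plus slope times x); the function names `hypothesis` and `cost` are illustrative and do not come from the tutorial.

```python
# Training set from question 1 of this section: (x, y) pairs.
xs = [3, 2, 4, 1, 5]
ys = [2, 3, 5, 1, 4]


def hypothesis(theta0, theta1, x):
    """Univariate linear hypothesis: theta0 + theta1 * x (assumed form)."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost: 1/(2m) times the sum of squared residuals."""
    m = len(xs)  # number of training examples
    squared_errors = [
        (hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)
    ]
    return sum(squared_errors) / (2 * m)


print("m =", len(xs))
print("J(0, 1) =", cost(0, 1, xs, ys))
print("h(3) with theta0=-1, theta1=0.5:", hypothesis(-1, 0.5, 3))
```

Working the same sums by hand first and then comparing against the printed values is a reasonable way to catch arithmetic slips.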

4. Three different classifiers are trained on the same data. Their decision
boundaries are shown below. Which of the following statements are
true?

☐ The leftmost classifier has high robustness, poor fit.


☐ The leftmost classifier has poor robustness, high fit.

☐ The rightmost classifier has poor robustness, high fit.
☐ The rightmost classifier has high robustness, poor fit.

5. What is the difference between a local minimum and a global minimum in gradient descent?
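To make the distinction in question 5 concrete, here is a small sketch that runs gradient descent on a non-convex one-dimensional function from two starting points. The function f(x) = x^4 - 3x^2 + x, the learning rate, and the starting points are all assumptions chosen for illustration; none of them appear in the tutorial.

```python
def f(x):
    """Illustrative non-convex function with two separate minima."""
    return x**4 - 3 * x**2 + x


def f_prime(x):
    """Derivative of f, used as the gradient."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, alpha=0.01, iterations=1000):
    """Plain gradient descent on f starting from x0."""
    x = x0
    for _ in range(iterations):
        x -= alpha * f_prime(x)
    return x


x_from_right = gradient_descent(2.0)   # settles in the right-hand basin
x_from_left = gradient_descent(-2.0)   # settles in the left-hand basin

print("start at  2.0 -> x =", round(x_from_right, 3), "f(x) =", round(f(x_from_right), 3))
print("start at -2.0 -> x =", round(x_from_left, 3), "f(x) =", round(f(x_from_left), 3))
```

Both runs stop where the gradient is (nearly) zero, but only one of the two points has the lowest value of f overall. For the squared-error cost of linear regression this situation does not arise, because that cost is convex and any local minimum is also the global one.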

1.3 Linear Regression (Multivariate)

1. Suppose we have m = 4 houses, and each house has area and number of bedrooms which
can be used to predict the house price. A dataset of the features is as follows:

    Bedrooms  Sqft_area  Price
1.  1         880        490,000
2.  3         1930       630,000
3.  4         1940       640,000
4.  3         1350       570,000

You’d like to use polynomial regression to predict a house price from its number of
bedrooms and sqft_area. Concretely, suppose you wish to fit a model of the form.

where $x_1$ is the number of bedrooms and $x_2$ is sqft_area. Further, you plan to use both
feature scaling (dividing by the max-min, or range, of a feature) and mean normalization.

What is the normalized feature $x_2^{(4)}$ (i.e., $x_2$ for the fourth training example)?
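The normalization described in this question can be sketched as follows. This assumes the raw sqft_area column is normalized directly (subtract the mean, divide by max minus min); the helper name `mean_normalize` is illustrative.

```python
# Sqft_area column from the table above, in row order 1..4.
sqft_area = [880, 1930, 1940, 1350]


def mean_normalize(values):
    """Mean normalization with range scaling: (v - mean) / (max - min)."""
    mean = sum(values) / len(values)
    value_range = max(values) - min(values)
    return [(v - mean) / value_range for v in values]


normalized_area = mean_normalize(sqft_area)

# Index 3 is the fourth training example (Python lists are 0-indexed).
print("normalized sqft_area, example 4:", round(normalized_area[3], 3))
```

After this transformation the feature has mean zero and spans a range of exactly one, which tends to help gradient descent converge when features differ widely in scale.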

2. You run gradient descent for 12 iterations with α = 0.2 and compute J(θ) after each iteration.
You find that the value of J(θ) increases over time. What would you do to correct this issue?
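The behaviour described in question 2 can be reproduced on a small example. The sketch below is an assumption-laden illustration: it reuses the five-point training set from Section 1.2 (the question itself does not say which data was used), starts from zero parameters, and compares α = 0.2 with a hypothetical smaller value of 0.02.

```python
# Assumed data: the univariate training set from Section 1.2.
xs = [3, 2, 4, 1, 5]
ys = [2, 3, 5, 1, 4]


def cost(theta0, theta1):
    """Squared-error cost 1/(2m) * sum((theta0 + theta1*x - y)^2)."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def run_gradient_descent(alpha, iterations=12):
    """Batch gradient descent; returns the cost before and after each step."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    history = [cost(theta0, theta1)]
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
        history.append(cost(theta0, theta1))
    return history


history_large = run_gradient_descent(alpha=0.2)
history_small = run_gradient_descent(alpha=0.02)

print("alpha=0.2 :", [round(j, 2) for j in history_large[:4]], "...")
print("alpha=0.02:", [round(j, 2) for j in history_small[:4]], "...")
```

On this data the cost grows at every step with the larger step size and shrinks at every step with the smaller one, which is the usual symptom of a learning rate that overshoots the minimum.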

3. Suppose you have m = 23 training examples with n = 5 features (excluding the additional all-
ones feature for the intercept term, which you should add). The normal equation is
$\theta = (X^T X)^{-1} X^T y$. For the given values of m and n, what are the dimensions of θ, X, and y in this
equation?

1. X is 23 × 6, y is 23 × 6, θ is 6 × 6

2. X is 23 × 5, y is 23 × 1, θ is 5 × 1

3. X is 23 × 6, y is 23 × 1, θ is 6 × 1

4. X is 23 × 6, y is 23 × 1, θ is 5 × 5
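One way to reason about the options above is to track matrix shapes through the normal equation. The sketch below only does shape bookkeeping, not the actual linear algebra; the helper names are illustrative, and it assumes X includes the extra all-ones column mentioned in the question.

```python
def transpose_shape(shape):
    """Shape of the transpose of a matrix with the given (rows, cols)."""
    rows, cols = shape
    return (cols, rows)


def matmul_shape(a, b):
    """Shape of the product of matrices with shapes a and b."""
    if a[1] != b[0]:
        raise ValueError(f"cannot multiply {a} by {b}")
    return (a[0], b[1])


def normal_equation_shapes(m, n):
    """Shapes of X, y and theta for m examples and n features (plus intercept)."""
    x = (m, n + 1)            # one extra column of ones for the intercept
    y = (m, 1)                # one target value per example
    xt = transpose_shape(x)
    xtx = matmul_shape(xt, x)             # square; its inverse has the same shape
    xtx_inv_xt = matmul_shape(xtx, xt)
    theta = matmul_shape(xtx_inv_xt, y)
    return {"X": x, "y": y, "theta": theta}


print(normal_equation_shapes(m=23, n=5))
```

The `ValueError` branch is a quick check that each product in the formula is actually defined for the chosen shapes.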

4. Suppose you have a dataset with m = 1000000 examples and n = 200000 features for each
example. You want to use multivariate linear regression to fit the parameters θ to our data.
Should you prefer gradient descent or the normal equation?
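A rough size estimate may help with question 4. The normal equation requires forming and inverting the (n+1) × (n+1) matrix $X^T X$. The sketch below estimates the memory for that matrix alone, assuming dense storage with 8-byte floating-point values; both assumptions and the function name are illustrative.

```python
def gram_matrix_bytes(n_features, bytes_per_value=8):
    """Bytes needed to store a dense (n+1) x (n+1) matrix such as X^T X."""
    size = n_features + 1  # +1 for the intercept column
    return size * size * bytes_per_value


n = 200_000
print("X^T X storage: about", round(gram_matrix_bytes(n) / 1e9), "GB")
```

Inverting a dense matrix of this size also takes time that grows roughly with the cube of n, whereas each gradient descent step only needs passes over the data.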

5. Which of the following statements are true?

1. MAE doesn’t add any additional weight to the distance between points. The error
growth is linear.
2. MSE errors grow exponentially with larger values of distance. It’s a metric that adds a
massive penalty to points that are far away and a minimal penalty for points that are
close to the expected result.
3. It is necessary to prevent the normal equation from getting stuck in local optima.
4. It prevents the matrix $X^T X$ (used in the normal equation) from being non-invertible
(singular/degenerate).
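Statements 1 and 2 concern how MAE and MSE respond to larger errors. A minimal sketch for experimenting with this is shown below; the function names and the sample values are illustrative and not taken from the tutorial.

```python
def mean_absolute_error(y_true, y_pred):
    """MAE: average of |prediction - target|."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """MSE: average of (prediction - target)^2."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# A single prediction that misses the target 0 by 1, 2, 4 and 8 units.
for miss in [1, 2, 4, 8]:
    mae = mean_absolute_error([0], [miss])
    mse = mean_squared_error([0], [miss])
    print(f"miss={miss}: MAE={mae}, MSE={mse}")
```

Each time the miss doubles, MAE doubles while MSE is multiplied by four, so the printed values can be compared directly against the wording of the two statements.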
