Lecture 1.2 Basics and Prerequisites
[Figures: scatter plots of house price against house age, illustrating one-dimensional data (1 feature) and two-dimensional data (2 features). The features are also called attributes / variables. One plot shows data belonging to 3 classes.]
Notations related to datasets
• Assume we have a set of $n$ houses.
• Let $x_j^{(i)}$ be the $j^{\text{th}}$ feature value of the $i^{\text{th}}$ house.
First house: $x_1^{(1)} = 88$, $x_2^{(1)} = 3$, $x_3^{(1)} = 4$
Second house: $x_1^{(2)} = 20$, $x_2^{(2)} = 2$, $x_3^{(2)} = 3$
…
• The whole data is represented as a matrix of $n$ rows and $d$ columns (here $d = 3$ features).
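A minimal NumPy sketch of this matrix for the two example houses (the meaning of each feature is not named on the slide; the comments below only echo the numbers):

```python
import numpy as np

# Each row is one data-point x^(i); each column is one feature x_j.
X = np.array([
    [88, 3, 4],   # first house:  x_1^(1) = 88, x_2^(1) = 3, x_3^(1) = 4
    [20, 2, 3],   # second house: x_1^(2) = 20, x_2^(2) = 2, x_3^(2) = 3
])

n, d = X.shape    # n data-points (rows), d features (columns)
print(n, d)       # 2 3
```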
• We want to train a supervised ML algorithm to predict the price of new houses.
• The $i^{\text{th}}$ house has a price $y^{(i)}$ (a scalar value) and is characterized by a feature-vector $x^{(i)}$. So, our training dataset is: $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(n)}, y^{(n)})\}$.
• Learning (or training) means finding the optimal parameters on a given dataset.
• In this example, as we have one feature (house size), the input $x$ is a scalar value.
• $h_\theta(x)$ is the predicted price for the input $x$ using the model $h_\theta$.
[Figure: house price plotted against house size, with the hypothesis (model) $h_\theta$ drawn as a line through the data.]
Notations related to models
[Figure: the line $h_\theta(x) = \theta_0 + \theta_1 x$; $\theta_0$ is the intercept, and the slope is $\theta_1 = \frac{a}{b}$ (rise $a$ over run $b$).]
• How to choose $\theta_0$ and $\theta_1$? We will see this in the next lecture.
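To make the notation concrete, here is a minimal sketch of evaluating this model (the parameter values are invented for illustration):

```python
import numpy as np

def h(x, theta0, theta1):
    """Predicted price h_theta(x) = theta0 + theta1 * x for a house of size x."""
    return theta0 + theta1 * x

theta0, theta1 = 50.0, 1.2          # made-up intercept and slope
sizes = np.array([88.0, 20.0])      # house sizes
print(h(sizes, theta0, theta1))     # predicted prices: [155.6  74. ]
```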
• How would you write the equation of $h_\theta(x)$ in a more compact format (using vectors)?

Help: the dot product between two vectors $u$ and $v$ of the same dimension is a scalar value:

$u^T v = \sum_i u_i v_i = u_0 v_0 + u_1 v_1 + \cdots$
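The standard compact form (stated here as the usual convention, since the slide leaves it as a question) is $h_\theta(x) = \theta^T x$, after prepending a constant feature $x_0 = 1$ so that $\theta_0$ acts as the intercept. A sketch:

```python
import numpy as np

theta = np.array([50.0, 1.2])   # [theta_0, theta_1]
x = np.array([1.0, 88.0])       # [x_0 = 1, house size]

prediction = theta @ x          # dot product theta^T x
print(prediction)               # 155.6, same as theta_0 + theta_1 * 88
```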
The cost function (or error)
• Given a dataset $\{(x^{(i)}, y^{(i)})\}_{i=1}^{n}$, the error compares $h_\theta(x^{(i)})$, the predicted output for the data-point $x^{(i)}$ (e.g. the predicted price of the $i^{\text{th}}$ house), with $y^{(i)}$, the true output for $x^{(i)}$ (e.g. the true price of the $i^{\text{th}}$ house).

NOTE: The error function is also sometimes called "cost function" or "loss function".
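The slide's error formula itself was an image; a common choice, assumed here (the lecture's exact formula may differ, e.g. by a factor of $\frac{1}{2}$), is the mean squared error:

```python
import numpy as np

def mse(predictions, targets):
    """Mean squared error: average of (h_theta(x^(i)) - y^(i))^2 over all data-points.
    NOTE: this exact form is an assumption; the slide's formula was not recoverable."""
    return np.mean((predictions - targets) ** 2)

y_pred = np.array([150.0, 74.0])   # predicted prices h_theta(x^(i))
y_true = np.array([140.0, 80.0])   # true prices y^(i)
print(mse(y_pred, y_true))         # (10^2 + (-6)^2) / 2 = 68.0
```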
Notations to remember
• $x^{(i)} \in \mathbb{R}^d$: the $i^{\text{th}}$ data-point (or feature-vector). It is a $d$-dimensional vector.
• $x_j^{(i)} \in \mathbb{R}$: the value of the $j^{\text{th}}$ feature (or attribute, or variable) in the data-point $x^{(i)}$.
• $y^{(i)}$: the value of the output variable (or target variable) for the $i^{\text{th}}$ data-point. $y^{(i)} \in \mathbb{R}$ in regression, and $y^{(i)} \in \{\ldots\}$ (a discrete set of labels) in classification.
• $h_\theta(x)$: the output predicted by the model $h_\theta$ for the data-point $x$.
Matrix multiplication

[Figure: multiplying the $n \times 3$ dataset matrix by a $3 \times 1$ parameter vector gives an $n \times 1$ vector of predictions.]
• Example: to predict the outputs of all data-points in a dataset using several linear models ($h_\theta$, $g_\theta$, $f_\theta$), just multiply the dataset matrix by a matrix that contains on each column the parameters of one model.

[Figure: the $n \times 3$ dataset matrix times a $3 \times 3$ matrix, where each column is the parameters of one model, gives an $n \times 3$ matrix whose columns are the predictions of $h$, $g$, and $f$.]
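A minimal NumPy sketch of this idea (all values invented for illustration, with an assumed constant feature $x_0 = 1$ in the first column):

```python
import numpy as np

# Dataset matrix: n = 2 data-points, 3 features each.
X = np.array([
    [1.0, 88.0, 3.0],
    [1.0, 20.0, 2.0],
])                      # shape (n, 3)

# Each column holds the parameters of one model: h, g, f.
Theta = np.array([
    [50.0, 40.0, 0.0],
    [ 1.2,  1.5, 2.0],
    [ 3.0,  0.0, 5.0],
])                      # shape (3, 3)

P = X @ Theta           # shape (n, 3): column k = predictions of model k
print(P)
```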
Matrix multiplication properties
• Matrix multiplication is not commutative: in general, $AB \neq BA$.
• It is associative: $(AB)C$ and $A(BC)$ give the same result.
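A quick NumPy check of both properties (matrices chosen arbitrarily):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [0, 2]])

print(np.array_equal(A @ B, B @ A))              # False: not commutative
print(np.array_equal((A @ B) @ C, A @ (B @ C)))  # True: associative
```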
Identity matrix, inverse, and transpose
• Identity matrix $I$: ones on the diagonal, zeros elsewhere, so that $AI = IA = A$.
• Transpose $A^T$: rows and columns swapped, $(A^T)_{ij} = A_{ji}$.
• Inverse of a matrix: if $A$ is an $n \times n$ matrix, and if it has an inverse $A^{-1}$, then: $AA^{-1} = A^{-1}A = I$.
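In NumPy (a small sketch):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

I = np.eye(2)                     # identity matrix
print(np.array_equal(A @ I, A))   # True: A I = A

A_inv = np.linalg.inv(A)          # inverse (exists here since det(A) = 5 != 0)
print(np.allclose(A @ A_inv, I))  # True: A A^{-1} = I

print(A.T)                        # transpose: rows and columns swapped
```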
Norm of a vector
• The 2-norm (or $\ell_2$ norm, or Euclidean norm) of a vector $x$ is: $\|x\|_2 = \sqrt{\sum_j x_j^2}$
• More generally, the $p$-norm is: $\|x\|_p = \left( \sum_j |x_j|^p \right)^{1/p}$
• Euclidean distance: the Euclidean distance between two vectors $x$ and $z$ is the Euclidean norm of their difference: $\|x - z\|_2 = \sqrt{\sum_j (x_j - z_j)^2}$
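With NumPy these are one-liners:

```python
import numpy as np

x = np.array([3.0, 4.0])
z = np.array([0.0, 0.0])

print(np.linalg.norm(x))           # 2-norm: sqrt(3^2 + 4^2) = 5.0
print(np.linalg.norm(x, ord=1))    # 1-norm: |3| + |4| = 7.0
print(np.linalg.norm(x - z))       # Euclidean distance between x and z: 5.0
```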
Derivatives
• Question: compute the derivative of the error function $E$ with respect to each parameter of the linear model $h_\theta(x) = \theta_0 + \theta_1 x$.
• Example: compute the derivative of the function $E$, where: [equation shown on the slide].
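As a worked sketch, assuming $E$ is the usual sum of squared errors (an assumption; the slide's exact definition was not recoverable), the chain rule gives:

```latex
% Assumed error function (sum of squared errors over the n data-points):
E(\theta_0, \theta_1) = \sum_{i=1}^{n} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 ,
\qquad h_\theta(x) = \theta_0 + \theta_1 x

% Chain rule: derivative of the square times derivative of the inner term.
\frac{\partial E}{\partial \theta_0}
  = \sum_{i=1}^{n} 2 \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot 1

\frac{\partial E}{\partial \theta_1}
  = \sum_{i=1}^{n} 2 \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x^{(i)}
```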