
Machine Learning Course, Fall 2017
School of Computer and Communication Sciences, EPFL
Martin Jaggi & Rüdiger Urbanke
mlo.epfl.ch/page-146520.html
[email protected]

Problem Set 1, Sept 21, 2017
(Efficient Python/NumPy Programming)

Introduction
For computational efficiency in typical machine learning operations, it is very beneficial to use NumPy arrays
together with vectorized commands instead of explicit for loops. Vectorized commands are better optimized and
bring the performance of Python code (and, similarly, of e.g. Matlab code) closer to lower-level languages like C.
In this exercise, you are asked to write efficient implementations for three small problems that are typical for the
field of machine learning.

Getting Started
Follow the Python setup tutorial provided on our github repository here:

github.com/epfml/ML_course/tree/master/labs/ex01/python_setup_tutorial.md

After you are set up, clone or download the repository, and start by filling in the template notebooks in the folder
/labs/ex01, for each of the 3 tasks below.
To get more familiar with vector and matrix operations using NumPy arrays, it is also recommended to go through
the npprimer.ipynb notebook in the same folder.

Note: The following three exercises could be solved with for loops. While that is fine to get started, the goal of
this exercise sheet is to use the more efficient vectorized commands instead.

Useful Commands
We give a short overview of some commands that prove useful for writing vectorized code; a brief demonstration
follows the list below. You can read the full documentation and examples by issuing help(func).
At the beginning: import numpy as np

• a * b, a / b: element-wise multiplication and division of matrices (arrays) a and b


• a.dot(b): matrix-multiplication of two matrices a and b
• a.max(0): find the maximum element for each column of matrix a (note that NumPy uses zero-based
indices, while Matlab uses one-based)
• a.max(1): find the maximum element for each row of matrix a
• np.mean(a), np.std(a): compute the mean and standard deviation of all entries of a
• a.shape: return the array dimensions of a
• a.shape[k]: return the size of array a along dimension k
• np.sum(a, axis=k): sum the elements of matrix a along dimension k
• np.linalg.inv(a): return the inverse of a square matrix a
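
As a quick illustration, here is a minimal sketch combining several of these commands on a small example matrix; the values of a are made up purely for demonstration:

```python
import numpy as np

# A small 3x2 example matrix; the values are chosen purely for illustration.
a = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

print(a.shape)                 # (3, 2): 3 rows, 2 columns
print(a.max(0))                # column-wise maxima: [5. 6.]
print(a.max(1))                # row-wise maxima: [2. 4. 6.]
print(np.mean(a), np.std(a))   # mean and standard deviation over all entries
print(np.sum(a, axis=0))       # column sums: [ 9. 12.]
print(a * a)                   # element-wise product, same shape as a
b = a.T.dot(b := a) if False else a.T.dot(a)  # matrix product, a 2x2 result
print(np.linalg.inv(b))        # inverse of that square (and here invertible) matrix
```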

A broader tutorial can be found here: http://www.engr.ucsb.edu/~shell/che210d/numpy.pdf


For users who are more familiar with Matlab, a nice comparison of the analogous functions can be found here:
https://docs.scipy.org/doc/numpy-dev/user/numpy-for-matlab-users.html
Figure 1: Two sets of points in the plane. The circles are a subset of the dots and have been perturbed randomly.

Task A: Matrix Standardization


The different dimensions or features of a data sample often show different variances. For some subsequent
operations, it is a beneficial preprocessing step to standardize the data, i.e. subtract the mean and divide by the
standard deviation for each dimension. After this processing, each dimension has zero mean and unit variance.
Note that this is not equivalent to data whitening, which additionally de-correlates the dimensions (by means of
a coordinate rotation).
Write a function that accepts a data matrix x ∈ R^(n×d) as input and outputs the same data after standardization.
Here n is the number of samples and d the number of dimensions, i.e. rows contain samples and columns contain features.
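
A minimal vectorized sketch of such a function follows; the function name and the guard against zero-variance columns are our own choices, not part of the exercise statement:

```python
import numpy as np

def standardize(x):
    """Standardize an (n x d) data matrix: each column gets zero mean and unit variance."""
    mean = np.mean(x, axis=0)      # per-feature means, shape (d,)
    std = np.std(x, axis=0)        # per-feature standard deviations, shape (d,)
    std[std == 0] = 1.0            # avoid division by zero for constant features
    return (x - mean) / std        # broadcasting subtracts/divides row by row
```

For example, standardize(np.random.rand(100, 5)) returns a 100 × 5 matrix whose columns each have (numerically) zero mean and unit standard deviation.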

Task B: Pairwise Distances in the Plane


One application of machine learning to computer vision is interest point tracking. The location of corners in an
image is tracked along subsequent frames of a video signal (see Figure 1 for a synthetic example). In this context,
one is often interested in the pairwise distance of all points in the first frame to all points in the second frame.
Matching points according to minimal distance is a simple heuristic that works well if many interest points are
found in both frames and perturbations are small.
Write a function that accepts two matrices P ∈ R^(p×2) and Q ∈ R^(q×2) as input, where each row contains the (x, y)
coordinates of an interest point. Note that the numbers of points p and q do not have to be equal. As output,
compute the pairwise distances of all points in P to all points in Q and collect them in a matrix D, where element D_{i,j}
is the Euclidean distance of the i-th point in P to the j-th point in Q.
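
A loop-free sketch using broadcasting is given below; the function name is our own choice, and an equivalent approach expands ‖p − q‖² = ‖p‖² + ‖q‖² − 2 p·q instead:

```python
import numpy as np

def pairwise_distances(p, q):
    """Return D with D[i, j] = Euclidean distance between row i of p and row j of q."""
    # p[:, None, :] has shape (p, 1, 2) and q[None, :, :] has shape (1, q, 2);
    # broadcasting yields all pairwise differences in a single (p, q, 2) array.
    diff = p[:, np.newaxis, :] - q[np.newaxis, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=2))
```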

Task C: Likelihood of a Data Sample


In this exercise, you are not required to understand the statistics and machine learning concepts described here
yet. The goal here is just to practically implement the assignment of data to two given distributions, in Python.
A subtask of many machine learning algorithms is to compute the likelihood p(x_n | θ) of a sample x_n under a given
density model with parameters θ. Given k models, we now want to assign x_n to the model for which the likelihood
is maximal: a_n = arg max_m p(x_n | θ_m), where m = 1, . . . , k. Here θ_m = (µ_m, Σ_m) are the parameters of the
m-th density model (µ_m ∈ R^d is the mean, and Σ_m is the so-called covariance matrix).
We ask you to implement the assignment step for the two-model case, i.e. k = 2. As input, your function receives
a set of data examples x_n ∈ R^d (indexed by 1 ≤ n ≤ N) as well as the two sets of parameters θ_1 = (µ_1, Σ_1)
and θ_2 = (µ_2, Σ_2) of two given multivariate Gaussian distributions:
$$ p(x_n \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(x_n - \mu)^\top \Sigma^{-1} (x_n - \mu)\right). $$

Here |Σ| is the determinant of Σ and Σ^(−1) its inverse. Your function must return the 'most likely' assignment
a_n ∈ {1, 2} for each input point n, where a_n = 1 means that x_n has been assigned to model 1. In other words,
if a_n = 1, then p(x_n | µ_1, Σ_1) > p(x_n | µ_2, Σ_2).
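
A possible sketch of this assignment step is shown below; the function names are our own, and we compare log-likelihoods, which gives the same assignment because the logarithm is monotone:

```python
import numpy as np

def log_gaussian(x, mu, sigma):
    """Log-density of the multivariate Gaussian for each row of x (shape N x d)."""
    d = x.shape[1]
    diff = x - mu                                                 # shape (N, d)
    quad = np.sum(diff.dot(np.linalg.inv(sigma)) * diff, axis=1)  # (x_n - mu)^T Sigma^{-1} (x_n - mu)
    log_norm = -0.5 * (d * np.log(2 * np.pi) + np.log(np.linalg.det(sigma)))
    return log_norm - 0.5 * quad

def assign_to_model(x, mu1, sigma1, mu2, sigma2):
    """Return a_n in {1, 2} for each row of x: 1 if model 1 is more likely, else 2."""
    ll1 = log_gaussian(x, mu1, sigma1)
    ll2 = log_gaussian(x, mu2, sigma2)
    return np.where(ll1 > ll2, 1, 2)
```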

Theory Questions
In addition to the practical exercises you do in the labs, such as the ones above, future labs will also include some
theory-oriented questions to prepare you for the final exam. As with the rest of the exercises, solving them is not
mandatory, but we recommend that you at least look at, and try, some of them during the semester. From last year's
experience, many students were surprised by the heavy theoretical focus of the final exam after having worked on the
two very practical projects. Do not fall into this trap! Passing the course requires both a practical and a theoretical
understanding of the material, and these exercises should help you with the latter.
However, please note that these exercises may not match the difficulty of the exam and are not enough by themselves;
you should also read the additional material given at the end of the lectures and do the exercises in the recommended
books (see the course info sheet).
Note that we will try to provide solutions, but cannot guarantee it.
This week, as we have just started the course, there are no exercises. Instead, you should refresh your memory of the
prerequisites, especially the following topics.

• Make sure your linear algebra is fresh in memory, especially


– Matrix manipulation (Multiplication, Transpose, Inverse)
– Ranks, Linear independence
– Eigenvalues and Eigenvectors
You can use the following resources to help you get up to speed if needed.

– The Linear Algebra handout


– Gilbert Strang’s Introduction to Linear Algebra. Some chapters are available online, and the book
(along with many other textbooks on linear algebra) is available at the EPFL Library.

• If it has been a while since your last calculus class, make sure you know how to handle gradients. You can
find a quick summary and useful identities in The Matrix Cookbook.
• For probability and statistics, you should at least know about

– Conditional and joint probability distributions


– Bayes' theorem
– Random variables, independence, variance, expectation
– The Gaussian distribution

If you need a refresher, check Chapter 2 of Pattern Recognition and Machine Learning by Christopher Bishop,
available at the EPFL Library.
