1 Lecture 1: Introduction To Machine Learning
We hear a lot about machine learning (or ML for short) in the news.
But what is it, really?
You use machine learning every day when you run a search engine query.
Machine learning also powers the speech recognition, question answering, and other intelligent capabilities of smartphone assistants like Apple Siri.
ML systems are also used by credit card companies and banks to automatically detect fraudulent
behavior.
One of the most exciting and cutting-edge uses of machine learning algorithms is in autonomous vehicles.
A self-driving car system uses dozens of components that include detection of cars, pedestrians,
and other objects.
In practice, it’s almost impossible for a human to specify all the edge cases.
11 Self-Driving Cars: An ML Approach
The machine learning approach is to teach a computer how to do detection by showing it many
examples of different objects.
No manual programming is needed: the computer learns what defines a pedestrian or a car on its
own!
Machine learning is a field of study that gives computers the ability to learn without
being explicitly programmed. (Arthur Samuel, 1959.)
This principle can be applied to countless domains: medical diagnosis, factory automation, machine
translation, and many more!
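As a minimal sketch of this learn-from-examples idea (the synthetic dataset and the choice of logistic regression here are purely illustrative, not part of the lecture):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Generate a synthetic labeled dataset: 200 examples, two classes
X, y = make_classification(n_samples=200, random_state=0)

# No hand-written rules: the model infers a decision boundary from examples
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("Training accuracy:", clf.score(X, y))
```

The same two lines of fitting code work for any labeled dataset; only the data changes.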
14 Supervised Learning
Consider a simple dataset for supervised learning: house prices in Boston.

• Each datapoint is a house.
• We know its price, neighborhood, size, etc.
[13]: # We will load the dataset from the sklearn ML library
# Note: load_boston was removed in scikit-learn 1.2, so this cell
# requires scikit-learn < 1.2
from sklearn import datasets
boston = datasets.load_boston()
We will visualize two variables in this dataset: house price and the education level in the neighborhood.
[14]: import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [12, 4]
plt.scatter(boston.data[:,12], boston.target)
plt.ylabel("Median house price ($K)")
plt.xlabel("% of adults in neighborhood that don't have a high school diploma")
plt.title("House prices as a function of average neighborhood education level")
import numpy as np
from sklearn.linear_model import LinearRegression

# The fitting cell is missing from this export; a LinearRegression fit is assumed
reg = LinearRegression().fit(boston.data[:, [12]], boston.target)
line_x = np.linspace(2, 35)
predictions = reg.predict(line_x[:, np.newaxis])

# Visualize the results
plt.scatter(boston.data[:, [12]], boston.target, alpha=0.25)
plt.plot(line_x, predictions, c='red')
plt.ylabel("Median house price ($K)")
plt.xlabel("% of adults in neighborhood that don't have a high school diploma")
plt.title("House prices as a function of average neighborhood education level")
Many of the most important applications of machine learning are supervised:

• Classifying medical images.
• Translating between pairs of languages.
• Detecting objects in a self-driving car.
18 Unsupervised Learning
Here, we have a dataset without labels. Our goal is to learn something interesting about the structure of the data:

• Clusters hidden in the dataset.
• Outliers: particularly unusual and/or interesting datapoints.
• Useful signal hidden in noise, e.g. human speech over a noisy phone line.
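For instance, a clustering algorithm can recover hidden groups without ever seeing a label. A small sketch with k-means on synthetic data (the two blobs and the choice of KMeans are illustrative, not from the lecture):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two well-separated blobs of points; the algorithm receives no labels
X = np.concatenate([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
# The learned cluster assignments recover the two hidden groups
print(kmeans.labels_)
```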
# Load the Iris dataset (the loading cell was elided in this export)
from sklearn import datasets
iris = datasets.load_iris()

plt.scatter(iris.data[:,0], iris.data[:,1], alpha=0.5)
plt.ylabel("Sepal width (cm)")
plt.xlabel("Sepal length (cm)")
plt.title("Dataset of Iris flowers")
We can use this dataset of examples to fit an unsupervised learning model.

• The model defines a probability distribution over the inputs.
• The probability distribution identifies multiple components (multiple peaks).
• The components indicate structure in the data.
[21]: # Fit a mixture of three Gaussians to the Iris inputs
from sklearn.mixture import GaussianMixture
model = GaussianMixture(n_components=3).fit(iris.data)
# (tail of a plotting cell; the rest of the cell was elided in the export)
plt.legend(['Datapoints', 'Probability peaks'])
21 Applications of Unsupervised Learning
22 Reinforcement Learning
In reinforcement learning, an agent is interacting with the world over time. We teach it good
behavior by providing it with rewards.
Image by Lily Weng
Applications of reinforcement learning include:

• Creating agents that play games such as Chess or Go.
• Controlling the cooling systems of datacenters to use energy more efficiently.
• Designing new drug compounds.
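A toy illustration of reward-driven learning is the multi-armed bandit: the agent tries actions, observes rewards, and gradually prefers the highest-paying action. (This epsilon-greedy sketch is for intuition only; it is not the algorithm behind the applications above.)

```python
import random

random.seed(0)
payout = [0.2, 0.5, 0.8]   # hypothetical reward probability of each action
values = [0.0, 0.0, 0.0]   # the agent's running reward estimate per action
counts = [0, 0, 0]
epsilon = 0.1              # fraction of steps spent exploring at random

for step in range(5000):
    if random.random() < epsilon:
        action = random.randrange(3)                      # explore
    else:
        action = max(range(3), key=lambda a: values[a])   # exploit
    reward = 1.0 if random.random() < payout[action] else 0.0
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]  # incremental mean

best = max(range(3), key=lambda a: values[a])
print("Best action:", best)
```

After enough steps the agent's estimates approach the true payout rates, so it settles on the most rewarding action, without ever being told which one that is.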
Machine learning is often discussed in the context of two related fields: artificial intelligence and deep learning.

• AI is about building machines that exhibit intelligence.
• ML enables machines to learn from experience, a useful tool for AI.
• Deep learning focuses on a family of learning algorithms loosely inspired by the brain.
# Part 3: About the Course
Next, let’s look at the machine learning topics that we will cover.
25 Teaching Approach
The focus of this course is on applied machine learning.

• We will cover a broad toolset of core algorithms from many different subfields of ML.
• We will emphasize applications and show how to implement and apply algorithms via examples and exercises.
Why are we following this approach?

• Applying machine learning is among the most in-demand industry skills right now.
• There can be a gap between theory and practice, especially in modern machine learning.
• Often, the best way to understand how an algorithm works is to implement it.
26 What You Will Learn
• The core algorithms of ML and how to define them in mathematical language.
• How to implement algorithms from scratch as well as with ML libraries, and how to apply them to problems in computer vision, language processing, medical analysis, and more.
• Why machine learning algorithms work, and how to use that knowledge to debug and improve them.
You will use Python and popular machine learning libraries such as:

• scikit-learn. It implements most classical machine learning algorithms.
• tensorflow, keras, pytorch. Standard libraries for modern deep learning.
• numpy, pandas. Linear algebra and data processing libraries used to implement algorithms from scratch.
The core materials for this course (including the slides!) are created using Jupyter notebooks.

• We are going to embed and execute code directly in the slides and use that to demonstrate algorithms.
• These slides can be downloaded locally, and all the code can be reproduced.
[29]: import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, neural_network
plt.rcParams['figure.figsize'] = [12, 4]

# Load the digits dataset (this line was elided in the export)
digits = datasets.load_digits()

# The data that we are interested in is made of 8x8 images of digits; let's
# have a look at the first 4 images.
_, axes = plt.subplots(1, 4)
images_and_labels = list(zip(digits.images, digits.target))
for ax, (image, label) in zip(axes, images_and_labels[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Label: %i' % label)
We can now load and train this algorithm inside the slides.
[30]: np.random.seed(0)
# To apply a classifier to this data, we need to flatten each image,
# turning the data into a (samples, features) matrix:
data = digits.images.reshape((len(digits.images), -1))
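The training cell itself appears to be missing from this export; a plausible continuation, fitting the MLPClassifier imported above on the flattened data (the train/test split and hyperparameters are assumptions, not the lecture's settings):

```python
import numpy as np
from sklearn import datasets, neural_network
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))

X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, random_state=0)

# A small multi-layer perceptron; these hyperparameters are illustrative
clf = neural_network.MLPClassifier(hidden_layer_sizes=(32,),
                                   max_iter=500, random_state=0)
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))
```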
# Part 4: Logistics and Other Information
We will go over some practical bits of information.
The format of this course will be that of the “reverse classroom”.

• Pre-recorded lecture videos will be made available online ahead of time. You should watch them ahead of each weekly lecture.
• In-class discussions will focus on answering student questions, going over homework problems, and doing tutorials.
30 Course Content
The course spans about 25 lectures, approximately divided into a set of blocks:

1. Supervised and unsupervised algorithms.
2. Foundations of machine learning.
3. Applying machine learning in practice.
4. Advanced topics and guest lectures.
• Supervised learning algorithms: linear models and extensions, kernel machines, tree-based
algorithms.
• Unsupervised learning algorithms: density estimation, clustering, dimensionality reduction.
• Introduction to deep learning models.
• The basic language of machine learning: datasets, features, models, objective functions.
• Tools for machine learning: optimization, probability, linear algebra.
• Why do algorithms work in practice? Probabilistic and statistical foundations.
33 Applying Machine Learning
35 Course Assignments
There are two main types of assignments.
This course is designed for a very general technical audience. The main requirements are:

• Programming experience (at least 1 year), preferably in Python.
• College-level linear algebra: matrix operations, the SVD, etc.
• College-level probability: probability distributions, random variables, Bayes’ rule, etc.
37 Other Logistics