Introduction to
Machine Learning
https://introml.mit.edu
Spring 2023!

Marzyeh Ghassemi ([email protected])
Tomas Lozano-Perez ([email protected])
Wojciech Matusik ([email protected])
Vince Monardo ([email protected])
Shen Shen ([email protected])
Ashia Wilson ([email protected])
Full Staff
plus ~7 awesome LAs
Section 4 staff (Recitation + Lab and Lab roles)
Rest of Today
● Start our ML journey with an overview
● Work through recitation handout with others at your table
● Ask questions by putting yourself in the help queue
● No worries if you don't have introml access yet; it's a great chance to get to know
your neighbor (ask them to put you in the queue)
What we're teaching: Machine Learning!
Given:
• a collection of examples (gene sequences, documents, tree sections)
• an encoding of those examples in a computer (as vectors)
Derive:
• a computational model (called a hypothesis) that describes relationships
within and among the examples, and that is expected to characterize new
examples from that same population well, so as to make good predictions or decisions
A model might:
• classify images of cells as to whether they're cancerous
• specify groupings (clusters) of documents that address similar topics
• steer a car appropriately given lidar images of the surroundings
Very roughly, ML can be categorized into supervised, unsupervised, and reinforcement learning
(the categorization can be refined, e.g. there are active learning, semi-supervised, selective, contrastive,
few-shot, inverse reinforcement learning… )
[Figure 2: Two-dimensional PCA projection of the 1000-dimensional Skip-gram vectors of countries and their
capital cities; visible labels include Turkey, Ankara, Tokyo, Poland, Greece, Athens, Rome, Spain]
which is used to replace every log P(w_O | w_I) term in the Skip-gram objective. Thus the task is to
distinguish the target word w_O from draws from the noise distribution P_n(w) using logistic
regression, where there are k negative samples for each data sample. Our experiments indicate that
values of k in the range 5–20 are useful for small training datasets, while for large datasets the k can
be as small as 2–5. The main difference between the Negative sampling and NCE is that NCE needs
both samples and the numerical probabilities of the noise distribution, while Negative sampling uses
only samples. And while NCE approximately maximizes the log probability of the softmax, this
property is not important for our application.
Both NCE and NEG have the noise distribution P_n(w) as a free parameter. We investigated a number
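For concreteness, a minimal NumPy sketch of the NEG objective for a single (input word, context word) pair, assuming the word vectors are already given as arrays; the function name and arguments are illustrative, not from the paper or the course code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neg_objective(v_in, v_out_pos, v_out_negs):
    # v_in:       input vector of the center word w_I, shape (d,)
    # v_out_pos:  output vector of the observed context word w_O, shape (d,)
    # v_out_negs: output vectors of k words drawn from the noise
    #             distribution P_n(w), shape (k, d)
    pos_term = np.log(sigmoid(v_out_pos @ v_in))               # score w_O as a "real" pair
    neg_term = np.sum(np.log(sigmoid(-(v_out_negs @ v_in))))   # score noise words as "fake"
    return pos_term + neg_term  # maximized over the word vectors
```

Maximizing this is exactly the logistic-regression task described above: distinguish w_O from the k noise samples.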
In very large corpora, the most frequent words can easily occur hundreds of millions of times (e.g.,
“in”, “the”, and “a”). Such words usually provide less information value than the rare words. For
example, while the Skip-gram model benefits from observing the co-occurrences of “France” and
“Paris”, it benefits much less from observing the frequent co-occurrences of “France” and “the”, as
nearly every word co-occurs frequently within a sentence with “the”. This idea can also be applied
in the opposite direction; the vector representations of frequent words do not change significantly
after training on several million examples.
[Slides adapted from 6.790]
To counter the imbalance between the rare and frequent words, we used a simple subsampling
approach: each word w_i in the training set is discarded with probability computed by the formula

P(w_i) = 1 − √(t / f(w_i))          (5)

where f(w_i) is the frequency of word w_i and t is a chosen threshold, typically around 10^-5.
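A minimal sketch of this discard rule (the function name and the clamping at zero are my additions; the threshold default follows the paper's suggestion of about 10^-5):

```python
import numpy as np

def discard_prob(freq, t=1e-5):
    # freq: relative frequency f(w_i) of the word in the corpus
    # P(w_i) = 1 - sqrt(t / f(w_i)), clamped at 0 for rare words
    return max(0.0, 1.0 - np.sqrt(t / freq))

# A very frequent word ("the", f ≈ 0.05) is discarded most of the time,
# while a rare word (f = 1e-7) is always kept.
print(discard_prob(0.05))   # ≈ 0.986
print(discard_prob(1e-7))   # 0.0
```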
ChatGPT
Reinforcement learning
• Feature vector
• Label
• Training data
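To make these three ingredients concrete, here is a minimal sketch (array names and shapes are illustrative assumptions, not the course's code) of a dataset of n feature vectors in R^d with one label each:

```python
import numpy as np

n, d = 100, 3                 # n examples, each with d features
X = np.random.randn(n, d)    # feature vectors, one per row
y = np.random.randn(n)       # labels, one per example (real-valued here)

# The training data is the collection of (feature vector, label) pairs
training_data = list(zip(X, y))
print(X.shape, y.shape, len(training_data))   # (100, 3) (100,) 100
```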
What do we want?
We want a “good” way to label new feature
vectors
• How to label? Learn a hypothesis
How well our hypothesis labels new feature vectors depends largely
on how expressive the hypothesis class is.
What do we want?
We may consider the class of linear
regressors:
• Hypotheses take the form: h(x; θ, θ₀) = θᵀx + θ₀
• Parameters to learn: Θ = (θ, θ₀)
• What we really want is to generalize to future data!
• What we don’t want:
• Model does not capture the input-output relationship (e.g.,
hypothesis class not expressive enough) → Underfitting
• Model too specific to training data → Overfitting
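As a sketch of what a hypothesis from this class looks like in code (the least-squares fit and all names below are illustrative assumptions, not the course's reference implementation):

```python
import numpy as np

def linear_hypothesis(x, theta, theta_0):
    # h(x; theta, theta_0) = theta^T x + theta_0
    return theta @ x + theta_0

def fit_linear(X, y):
    # One way to pick parameters from training data: ordinary least squares,
    # after appending a column of 1s so theta_0 is learned too.
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
    return w[:-1], w[-1]

# Example on synthetic data generated by a known linear rule
X = np.random.randn(50, 2)
y = X @ np.array([2.0, -1.0]) + 0.5
theta, theta_0 = fit_linear(X, y)
print(linear_hypothesis(np.array([1.0, 1.0]), theta, theta_0))   # ≈ 1.5
```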
How good is a hypothesis?
Hopefully it predicts well on future data
• How good is a regressor at one point? Measure a loss, e.g. squared loss
L(g, a) = (g − a)^2
g: guess,
a: actual
• Training error: average loss over the n training examples,
E_train(h) = (1/n) * sum over i of L(h(x^(i)), y^(i))
• What we want: low error on new data from the same population (test error), not just low training error
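A minimal sketch of these two quantities (squared loss at one point, and training error as the average loss over the training set); the hypothesis h used below is an illustrative stand-in:

```python
import numpy as np

def squared_loss(guess, actual):
    # Loss at one point: L(g, a) = (g - a)^2
    return (guess - actual) ** 2

def training_error(h, X, y):
    # Average loss of hypothesis h over the n training examples
    guesses = np.array([h(x) for x in X])
    return float(np.mean(squared_loss(guesses, y)))

# Example: a linear hypothesis evaluated on data it fits exactly
theta, theta_0 = np.array([2.0, -1.0]), 0.5
h = lambda x: theta @ x + theta_0
X = np.random.randn(50, 2)
y = X @ theta + theta_0
print(training_error(h, X, y))   # 0.0: every guess equals the actual label
```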