
Topic 0: Introduction

STAT 37710/CAAM 37710/CMSC 35400 Machine Learning


Risi Kondor, The University of Chicago
Instructors

Risi Kondor (Associate Professor)
Crerar 221
[email protected]

TAs:
Su Yeong Lee (CAAM)
Kexiang Wang (CAAM)

Topics
1. Clustering
2. Dimensionality reduction
3. Manifold learning
4. Regression
5. Online algorithms
6. Kernel methods (Hilbert space algorithms)
7. Bayesian learning
8. Deep learning
9. Generative models

Note: this list is provisional and almost certain to change.

Prerequisites

• Competence in coding in some programming language.
• Mathematical maturity: ML is a mathematical subject.
• Specific areas of math needed:
◦ Calculus
◦ Linear algebra
◦ Probability (minimal statistics)
◦ A little bit of optimization

Support

Recitations:
• On an as-needed basis; place and time TBD
Office Hours:
• Fridays, 1pm, Crerar 221
Online:
• canvas.uchicago.edu (slides, lecture notes, assignments and grades)

Resources
Books (Strictly optional! More for “further reading” than anything else.)
• Kevin Murphy: Machine Learning: A probabilistic perspective (2012)
Warning: very Bayesian
• Zhang, Lipton, Li and Smola: Dive into deep learning (d2l.ai)
• Hastie, Tibshirani, Friedman: The Elements of Statistical Learning (2008)
(available electronically on the library’s web site)

Online Resources
• Andrew White’s book “Deep learning for molecules and materials”
https://dmol.pub/index.html

Links to more books, papers and videos will be posted on Canvas.

Credit

• Assignments/projects (posted on Canvas): ∼ 50%
◦ Project-centered course: one assignment for each topic.
◦ Projects involve coding up algorithms discussed in class and running them on data.
◦ Recommended language: Python.
◦ Submitted work must be your own. Discussing problems is okay but must be acknowledged. Code and parts of the writeup cannot be shared.
◦ Submission in .pdf via Canvas. Penalty for late submissions: 20% for 24 hours, 40% for 48 hours. No partial late homeworks.
◦ For typing up assignments, LaTeX is strongly preferred.
• Midterm: ∼ 20%
• Final: ∼ 30%

What is Machine Learning?
Two types of programming

1. Explicit: write a program that tells the computer what to do.
2. Learning: write a program that tells the computer how to learn what to do from data. → This is what Machine Learning is about.
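
To make the contrast concrete, here is a minimal sketch in Python (not from the slides; the task, numbers, and decision rule are invented for illustration) of the same toy problem solved both ways:

import numpy as np

# Toy task: classify a message as spam from one feature, the fraction
# of capitalized words. All numbers here are invented.

def explicit_classifier(caps_fraction):
    # Explicit programming: a human hard-codes the decision rule.
    return "spam" if caps_fraction > 0.5 else "not spam"

def learn_classifier(train_x, train_y):
    # Learning: estimate the decision threshold from labeled data
    # (here, simply the midpoint between the two class means).
    threshold = (train_x[train_y == 1].mean() + train_x[train_y == 0].mean()) / 2
    return lambda caps_fraction: "spam" if caps_fraction > threshold else "not spam"

train_x = np.array([0.9, 0.8, 0.7, 0.2, 0.1, 0.05])  # observed features
train_y = np.array([1, 1, 1, 0, 0, 0])               # 1 = spam, 0 = not spam
f = learn_classifier(train_x, train_y)
print(explicit_classifier(0.6), f(0.6))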

Machine Learning in the abstract

Given a training set {(x_1, y_1), (x_2, y_2), …, (x_m, y_m)}, learn a function

f : x ↦ y

to predict the y's corresponding to future x's. In particular, those in the test set

{(x'_1, y'_1), (x'_2, y'_2), …, (x'_{m'}, y'_{m'})}.

Actually, this is supervised learning. Modern ML also encompasses many other types of learning problems.
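
As a concrete (purely illustrative) sketch of this setup, here is a tiny 1-nearest-neighbor learner on made-up one-dimensional data:

import numpy as np

# Training set {(x_1, y_1), ..., (x_m, y_m)}: toy one-dimensional inputs.
train_x = np.array([0.0, 1.0, 2.0, 3.0])
train_y = np.array([0, 0, 1, 1])

def f(x):
    # Learned hypothesis: copy the label of the nearest training input.
    return train_y[np.argmin(np.abs(train_x - x))]

# Predict the y's corresponding to future (test) x's.
test_x = [0.4, 2.6]
print([f(x) for x in test_x])  # -> [0, 1]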

Nomenclature

• Each (x, y) pair is called an example (or learning instance).
• x is called the input (x ∈ X, where X is the input space).
• y is called the output (y ∈ Y, where Y is the output space).
• The learned function
f : X → Y
is called the hypothesis (because the algorithm can never be sure how close it is to the “truth”).
• The space F from which the algorithm chooses f is called the hypothesis class.
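
For instance (an illustrative sketch with invented data): if the hypothesis class F is the set of linear functions f(x) = ax + b, learning amounts to choosing the member of F that best fits the training examples, e.g. by least squares:

import numpy as np

# Training examples (x, y); y is roughly 2x + 1 plus noise (invented data).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

# Hypothesis class F = {f(x) = a*x + b}; pick f in F by least squares.
A = np.stack([x, np.ones_like(x)], axis=1)
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)

f = lambda x_new: a * x_new + b  # the chosen hypothesis
print(f(4.0))                    # prediction for a future input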

Deductive vs. inductive inference

• Deductive inference:
rules −→ data

• Inductive inference:
data −→ rules

ML is all about inductive inference → “Brave New Science of Data”.

Humans are experts at induction. However, ML takes a different approach.

Question: Give examples of inductive vs. deductive inferential processes.


Question: What are the relative strengths of humans vs. machines in
learning?

Typical ML task 1: Regression

Typical ML task 2: Classification

Typical ML task 3: Ranking

(Figures: internet search, elections, sports)

Typical ML task 4: Clustering

Typical ML task 5: Dimensionality reduction

Applied vs. theoretical ML

• Practitioners focus on solving real-world problems with ML (building autonomous cars, finding disease genes, earning lots of money, etc.).
• Theorists work on devising new general purpose learning algorithms and analyzing their behavior.
“Much of the art of machine learning is to reduce a range of disparate
problems to a fairly narrow set of prototypes. Much of the science of machine
learning is to then solve those problems and provide good guarantees.”
(Smola & Vishwanathan)

This course will focus on the fundamental algorithms rather than specific
applications.

Origins: Classical Artificial Intelligence
AI vs. ML

• AI: attempts to replicate human intelligence in general.
• ML: solves practical problems which humans think require intelligence.

Early attempts

The “Mechanical Turk” (Wolfgang von Kempelen, 1770)

Formal reasoning = intelligence?

• Formal logic (Frege (1879) and others)


• Mathematics as a formal system (Russell & Whitehead, ∼ 1910)
• Gödel’s incompleteness results (1931)
• Turing machines and universality (1936)

“Since formal systems are the pinnacle of human achievement, intelligence must be synonymous with formal reasoning.”

Is the brain just a computer?
McCulloch & Pitts show that neurons appear to perform simple logical operations (1943).

“So if all that the brain does is such mechanistic operations, then it should be
easy to imitate on Turing machines (i.e., computers)”

The Turing test
In his landmark 1950 paper “Computing Machinery and Intelligence” Turing
proposes a positivist approach: “If a machine can fool a human into thinking
that it is a human, then it must be intelligent” → Weak AI

Prediction: “By the year 2000, machines with 120 MB of memory would be able to fool 30% of human judges in a 5-minute test.”

Objections to the Turing test
Even if a computer passes the Turing test it cannot be truly intelligent
because...
1. Theological: computers have no soul
2. “Head in the sand”: it would be too scary
3. Mathematical: Gödel incompleteness and such
4. Consciousness: Searle’s Chinese room argument
5. Disabilities: a machine will never be able to fall in love/invent jokes/tell right from wrong/etc.
6. Lady Lovelace’s: will never do anything original
7. The brain is not digital
8. The brain is not predictable
9. Extra-sensory perception

The Dartmouth conference (1956)

John McCarthy (1927–2011), Marvin Minsky (1927–2016), Allen Newell (1927–1992), Herbert Simon (1916–2001)

“within a generation ... the problem of creating ’artificial intelligence’ will substantially be solved” (Minsky)

True beginnings: from philosophy to building things
“We propose that a 2 month, 10 man study of artificial intelligence be carried
out during the summer of 1956 at Dartmouth College in Hanover, New
Hampshire. The study is to proceed on the basis of the conjecture that every
aspect of learning or any other feature of intelligence can in principle be so
precisely described that a machine can be made to simulate it. An attempt
will be made to find how to make machines use language, form abstractions
and concepts, solve kinds of problems now reserved for humans, and
improve themselves. We think that a significant advance can be made in one
or more of these problems if a carefully selected group of scientists work on it
together for a summer.”

McCarthy et al., 1955

Early successes

• Newell and Simon’s “General Problem Solver” (1959)
• ELIZA (Weizenbaum, 1966)
• SHRDLU’s block world (Winograd, 1968–70)
• Prolog and expert systems (1970s–)

AI winters: 1974–80, 1987–93

New beginnings: Machine Learning
The birth of Machine Learning

Starting in the late ’80s, AI was transformed by a sequence of outside influences:
• Efficiently trainable neural network models
• Input from Physics community
• Influence of Bayesian Statistics
• Black box “geometric” learning algorithms
• Huge influence of the internet
• Firm foundations in Statistics
• Strong connections to optimization, signal processing, harmonic analysis,
probability, CS theory, ...
• MASSIVE PRACTICAL DEMAND

The old vs. the new AI

Early: aiming for “general intelligence”, trying to imitate humans, tangled up in formal systems and philosophy.

New: pragmatic, focused on specific tasks, much closer ties to math and
statistics than neuroscience and logic, driver behind lots of technologies

Question: Classically, the subject that deals with the art of learning from data
is Statistics. So is ML just a branch of Statistics? No.

(Diagram: Machine Learning at the intersection of neighboring fields)

• Statistics: nonparametric statistics, Bayesian statistics, probability, empirical process theory
• Computer Science: artificial intelligence, computational learning theory, complexity theory, randomized algorithms, databases, distributed systems
• Mathematics: functional analysis, random geometry, optimization, numerical analysis

Applications
(Diagram: Machine Learning and its application areas)

• NLP: speech recognition, translation, summarization, grading
• Computer Vision: object detection, object recognition, structure from motion
• Search & recommendation: web search, collaborative filtering, ad placement
• Robotics: autonomous vehicles, robot assistants
• Medical: detection & imaging, automated diagnosis
• Finance: high frequency trading, portfolio selection, risk analysis
• Computational Biology: protein structure, systems biology
• etc., etc.
Hallmarks of ML
ML is ambitious:
• Datasets are often very high dimensional (∼ O(10⁵)).
• Data is often abstract (structured objects vs. just vectors).
• Datasets are massive (∼ O(10⁸) examples).
• Really want to build actual systems that work.

ML is brutal:
• Don’t need to think hard about the domain because with enough data,
even black box algorithms work really well (really?).
• Butcher the statistics as much as necessary to get an algorithm which
actually runs.
• Insist on algorithms that run in time O(m³) → O(m²) → O(m) → o(m).

Taxonomy of Machine Learning
Taxonomy of machine learning 1.

Based on the output space Y :

• Classification: Y = {+1, −1}
Examples: spam/not spam, genuine/fraud, boy/girl, …
(generalization: multiclass classification Y = {1, 2, . . . , k})
• Regression: Y = R
Examples: predict temperature tomorrow, price of a stock, …
(generalization: Y = R^d)
• Ranking: Y = S_n (the group of permutations)
• Structured outputs: Y = anything
Examples: translate from Chinese to English, predict folding of a protein, …
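
A small sketch of how the choice of Y changes the task (toy invented data; scikit-learn is used here only for brevity and is an assumption, not a course requirement):

import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])  # shared inputs

# Classification: Y = {+1, -1}
y_class = np.array([-1, -1, +1, +1])
clf = LogisticRegression().fit(X, y_class)
print(clf.predict([[1.8]]))  # a label from {+1, -1}

# Regression: Y = R
y_reg = np.array([0.1, 1.2, 1.9, 3.1])
reg = LinearRegression().fit(X, y_reg)
print(reg.predict([[1.8]]))  # a real number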

Taxonomy of machine learning 2.

Based on the nature of the training data:

• Supervised learning: given {(x_i, y_i)}_{i=1}^m, learn f : X → Y.
Examples: classification, regression, …
• Unsupervised learning: given {x_i}_{i=1}^m, say something.
Examples: clustering, density estimation, dimensionality reduction, …
• Semi-supervised learning: given a (small) amount of labeled data {(x_i, y_i)}_{i=1}^m and a (large) amount of unlabeled data {x_i}_{i=m+1}^p, learn f : X → Y.
Examples: learning parse trees, image search
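
In code the distinction is simply whether the learner is given the y's at all (again a toy sketch, using scikit-learn conventions as an assumption):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[0.0], [0.2], [0.1], [5.0], [5.2], [4.9]])
y = np.array([0, 0, 0, 1, 1, 1])

# Supervised: the learner is given the inputs AND their labels.
clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)

# Unsupervised: the learner sees only the inputs and must "say something"
# about their structure -- here, group them into two clusters.
km = KMeans(n_clusters=2, n_init=10).fit(X)

print(clf.predict([[4.5]]), km.labels_)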

Taxonomy of machine learning 3.

Based on how the data is presented to the learner:

• Batch learning: see whole training set first, then predict on test
examples.
• Online learning: examples are presented one by one; first try to predict y_t, then find out what y_t really is and learn from it.
• Transductive learning: like batch, but the test x'_i's are known at training time.
• Active learning: the algorithm can ask for the next data point.
• Reinforcement learning: exploring the world incurs a cost (games, robotic control).
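
As an illustrative sketch of the online protocol (a classic perceptron update on synthetic data; the hidden rule is invented):

import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(2)  # current hypothesis: predict sign(w . x)

for t in range(200):
    # Examples arrive one by one; labels follow a hidden linear rule.
    x = rng.normal(size=2)
    y_true = 1 if x[0] + 2 * x[1] > 0 else -1

    # First try to predict y_t ...
    y_pred = 1 if w @ x > 0 else -1

    # ... then find out what y_t really is and learn from the mistake.
    if y_pred != y_true:
        w += y_true * x  # perceptron update

print(w)  # points roughly along the hidden direction (1, 2)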

Taxonomy of machine learning 4.

Based on the nature of the relationship between x and y :

• Deterministic: x fully determines y, so there is some f_true out there such that
y = f_true(x).
• Stochastic: x does not fully determine y; rather, for any given x, y is drawn from some probability distribution p_x(y).

In practical problems we invariably cannot assume a deterministic relationship between inputs and outputs, so we use the stochastic model.
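
For example (a sketch with invented numbers), a common stochastic model takes p_x(y) to be a Gaussian centered at some underlying f_true(x), so repeated draws at the same x give different y's:

import numpy as np

rng = np.random.default_rng(0)
f_true = lambda x: 2.0 * x + 1.0  # underlying (unknown) regression function

def sample_y(x, noise_std=0.5):
    # Stochastic model: y ~ p_x(y) = N(f_true(x), noise_std^2),
    # so x does not fully determine y.
    return f_true(x) + noise_std * rng.normal()

print([sample_y(1.0) for _ in range(3)])  # three different y's for the same x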

