(Fall 2024) Intro To ML


Introduction to ML

By: ML@B Edu Team


Outline
● Meet your course staff!
● Logistics
● Course introduction
● Intro to ML
Meet your course staff!
Course staff

Saathvik Selvan, Vanessa Teo, Rohan Viswanathan, Derek Xu, Chuyi Shang, Eric Wang

Logistics
● Website: ml.berkeley.edu/decal
● Edstem: https://edstem.org/us/join/bdSzpg
● Gradescope code: PY37RE
● Office hours time/location are TBD
○ Most likely some time on Thursday or Friday, and most likely somewhere in Cory or Soda
● Enrollment codes have been sent out, make sure to use them!
● Syllabus is on the website, make sure to read it for more details and policies!
● We will take attendance starting Wednesday. Everyone is excused today.
● Course communication: make a private post on Edstem!
Brief Outline

Deep Learning Fundamentals
● Intro to ML
● Intro to Neural Networks
● Optimization and Modern Deep Learning
● Representations and Transfer Learning

CV
● Images and Convolutions
● Convolutional Neural Networks
● Advanced CNNs
● Object Detection
● Segmentation
● Advanced Detection and Segmentation

Transformers
● Sequence Modeling
● Transformers
● Vision Transformers
● Multimodality
● Self-supervised Learning

Generative AI
● Variational Autoencoders
● Intro to GANs
● Advanced GANs
● Vector Quantization
● Intro to Diffusion
● Diffusion Applications
Grading Breakdown
This course is graded on a PNP basis. You need a 70% to pass the course. Here is how
the points will be distributed this semester:

● Attendance (10%) — excused just for today


● Weekly Quizzes (20%) — Quiz 1 due next Monday, Sep 16
● Programming Assignment 1 (10%) — Homework 1 due Monday, Sep 23
● Programming Assignment 2 (20%)
● Programming Assignment 3 (20%)
● Programming Assignment 4 (20%)
Course Introduction
What is Computer Vision?
● Computer vision is broadly the subset of AI that deals with images
● It is used to clear checks, deliver mail, drive cars, and create art
● This course is a bootcamp in computer vision as it intersects with deep learning
● Example areas: basic computer vision tasks, 3D vision, large-scale unsupervised learning, text-based image generation
Introduction to ML
What is ML?
● ML is the paradigm of approximating a
function from data
○ A function here is just a set of rules that takes
in an input and spits out some output (like a
label or a predicted value)

● Why ML instead of programming the functions ourselves?


○ Sometimes we can’t possibly understand the patterns in our data, so it is extremely hard to come up
with these rules!
○ ML is fundamentally the process of allowing our data to guide a function’s creation
Challenge: write a function to classify digits?

A hand-coded attempt might count bright pixels along one row of the input image:

    count = 0
    for i in range(10, 30):
        if image[10][i] > 0.5:
            count += 1
    if low_thresh < count < high_thresh:
        return 7

Nope. Is there some way of separating 7's from other digits?


Narrowing in on ML
● Think of it as template creation!
○ When we usually define a function by hand, we have to
specify EVERYTHING
○ With ML, we are going to define a function (with math), but
leave out a few free parameters that will be learned from the
data: these will dictate the exact behavior of the function
○ For now, don’t think about the process of learning good
choices of parameters… that will come later!
● Example:
○ We will define our function to have the form:
if (input < a) -> output1, else -> output2,
and learn the best value of a from our data
○ Here ‘a’ is the free parameter that specifies the exact
behavior of our example function
Note: This function is just a hypothesized function that we hope will work well based on what the data looks like.
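The template idea above can be sketched in code. This is a toy illustration, not from the slides: the dataset, the candidate values, and the helper names are all made up. We "learn" the free parameter a by trying candidates and keeping the one that classifies a small labeled dataset best.

```python
# Toy 1D threshold model: if input < a -> class 0, else class 1.
# The data points and the candidate values for 'a' are invented.
inputs = [0.2, 0.5, 0.8, 1.3, 1.6, 2.0]
labels = [0, 0, 0, 1, 1, 1]

def accuracy(a):
    # Fraction of points the threshold 'a' classifies correctly
    preds = [0 if x < a else 1 for x in inputs]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# "Learning" here is just picking the best candidate value of 'a'
best_a = max([0.0, 0.5, 1.0, 1.5, 2.0], key=accuracy)
print(best_a)  # 1.0 separates this toy data perfectly
```

Real ML replaces this brute-force search with an optimization procedure, but the pattern is the same: a fixed function template plus a free parameter learned from data.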

What would a good value for ‘a’ be?

(Figure: data points plotted along the input dimension)

Probably a = 1

Some function choices work better than others, no matter how well you choose your parameters.
Why might this function not work as well?
(Hint: we switched the inequality)
Previously, we had a single point, above which things were blue, and red otherwise. However, this strategy doesn’t really work in 2D…

Now, we might instead hypothesize that a 1D line separates the data, above which all points are blue and below which they are red.

This is our FUNCTION that we are hypothesizing exists: a 1D line of the form

y = mx + b

In this case, our parameters are m (the slope) and b (the intercept/offset).
2D Example
This line

y = -1.09*x + 2.09

ends up being about as good as we can possibly get, if we classify everything above the line as blue and everything below as red.

This isn’t perfect, but again, it’s not really possible to do any better.

Don’t worry for now about how these values were calculated or why it’s the best line.
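The slide’s line can be turned into a classifier directly. A minimal sketch (the sample points are invented; “blue”/“red” follow the slide’s convention):

```python
# Classify 2D points with the line y = -1.09*x + 2.09 from the slide:
# points above the line are "blue", points below are "red".
m, b = -1.09, 2.09

def classify(x, y):
    return "blue" if y > m * x + b else "red"

print(classify(0.0, 3.0))   # above the line -> "blue"
print(classify(2.0, -1.0))  # below the line -> "red"
```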

3D and so on…
This idea continues well beyond 2D. Here, our data is in 3D and we hypothesize that a 2D plane can separate the data: points above it are marked blue, points below it are marked red. This, again, is our function definition.

This can further continue on forever into higher dimensions!

The challenge is that we can’t immediately visualize higher-dimensional data, so it will be difficult to say whether the data will nicely separate along some linear boundary like this or not…

Narrowing in on ML
● The art of ML is the following:
○ What form our function takes → this can be referred to as a model class
○ What specific parts of this function we are allowed to learn → these are our parameters
○ How we learn these parameters to approximate their “best” possible values
■ We will talk about this more later
● Every ML algorithm you will ever learn follows this pattern
○ Describe the generic form of a function with free parameters
○ Use the data to decide what free parameters will work best
● This is super important, PLEASE ASK QUESTIONS IF YOU HAVE THEM,
PLEASE ASK THEM, YOU ARE EXPECTED TO STILL BE SOMEWHAT IN THE
DARK HERE
Taxonomy of ML
● We’ve now got a definition of ML that describes ALL of ML in a way that is broad
enough to capture everything
● The set of problems in ML are super varied and it is often useful to have some
framework for how to classify different types of problems
Types of Machine Learning
Vocab
● Function / Model
○ These terms are used interchangeably
○ These refer to the function template (the “model class”) we have chosen for our problem
● Weights (and Biases)
○ Another way to denote the parameters in ML models that are learned from data
● Hyperparameter
○ This is some non-learnable parameter (like model size, model type, details about training procedure,
etc) that further specifies our overall learnable function
○ We need to manually choose these ourselves before we start learning the learnable parameters
● Loss Function / Cost Function / Risk Function
○ We haven’t introduced these terms yet, but they will come up later; just note that they are the same
(at least for our purposes)
Vocab
● “Feature”
○ This can refer to bits of our data (either the inputs themselves or some representation of them) that
we feed as input to a model
○ Ex: for a house, you might input quantities like its “number of bedrooms”, “number of floors”, “area in
square feet”, “cost of construction” etc. into a model that is trying to predict its price
○ Ex: for an image input, you squish its pixel values into a vector OR extract things like corners, edges,
shapes from it — these are both different “features” of the same image that can be fed into a model!
ML Pipeline
1. Define the Problem
2. Prepare the Data
3. Define the model + loss function
4. Minimize the loss function (train the model)
5. DONE!
Define the Problem
Define the Problem
● What task are you trying to solve with ML?
● What do your inputs look like?
● What should your outputs look like?
● What is our metric for success on a project level? What do we hope to achieve?
Prepare the Data
Data Representation / Preparation
● Collecting the data
○ Don’t take this for granted in the real world… garbage in ⇒ garbage out
● We need to represent our data with numbers
○ We need to go from text –> numbers
○ We need to go from image files –> numbers
○ Every data point needs to be represented with numbers in some way
● Feature Selection / Scaling
○ Finding which parts of the data are important and should be included as inputs to a model
○ May want to rescale some features so they’re all in the same range of values: normalization
● Vectors are one of the most basic and important representations of data
○ Basically take the important numbers and put them all in a vector (1d matrix) in a specific order
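As a sketch of feature vectors and normalization (the house features and values below are hypothetical), min-max scaling rescales each feature into [0, 1] so that large-unit features like area don’t dominate:

```python
# Each house is a feature vector in a fixed order: [bedrooms, floors, area_sqft].
# The numbers are made up for illustration.
houses = [
    [3, 1, 1500],
    [4, 2, 2400],
    [2, 1, 900],
]

def min_max_normalize(data):
    cols = list(zip(*data))      # transpose: one tuple per feature
    lo = [min(c) for c in cols]  # per-feature minimum
    hi = [max(c) for c in cols]  # per-feature maximum
    return [[(v - l) / (h - l) for v, l, h in zip(row, lo, hi)]
            for row in data]

print(min_max_normalize(houses))  # every feature now lies in [0, 1]
```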
Case Study: Representing Labels
● One Hot Labeling
○ One of the most common labeling schemes for multi-class classification
■ Classification is a problem where you want a model to discern between ‘n’ different kinds of inputs,
like the problem of digit recognition
○ Instead of having a label of “4” to indicate the 4th class, make the label look like:
■ [0, 0, 0, 1, 0, 0, 0, … ]
○ In other words, put a 1 in the ith position of an all zeros vector to indicate the ith class
○ This scheme lets us view labels as probability distributions
■ Instead of simply saying that a data point is labeled as class 4 (see example above), we can say that it
has a 100% probability of belonging to class 4 and 0% probability of belonging to any other class
■ This is especially useful since, as we will see next time, our models will output a probability
distribution over classes as well. For example, [0, 0, 0.1, 0.75, 0.15, 0, 0, …] might be an output where
the model thinks that a sample has 10% probability of belonging to class 3, 75% probability for class
4 and 15% for class 5.
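The one-hot scheme is straightforward to write down. A minimal sketch (using 0-indexed classes, unlike the slide’s 1-indexed wording):

```python
# One-hot label for class `label` out of `num_classes` classes (0-indexed):
# a 1 in the position of the true class, zeros everywhere else.
def one_hot(label, num_classes):
    vec = [0] * num_classes
    vec[label] = 1
    return vec

print(one_hot(3, 10))  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```

Viewed as a probability distribution, this vector assigns probability 1 to the true class and 0 to every other class.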
Augmenting the Data
● We might want more data than we have, what can we do?
● We will find a bunch of transforms that don’t semantically change our data, i.e.,
both an input and its transformed version should have the same label
● Images:
○ We can add noise to images or blur/sharpen them slightly
○ We can rotate images or warp them a little bit
● Text:
○ We can replace some words with known synonyms
● This artificially gives us more examples to use during training
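Two of the image transforms above can be sketched with plain lists. This toy version and its tiny “image” are just for illustration; a real pipeline would use a library such as torchvision:

```python
import random

# A grayscale "image" as a nested list of pixel values in [0, 1]
image = [[0.0, 0.5], [1.0, 0.25]]

def add_noise(img, scale=0.05):
    # Perturb each pixel slightly, clamping back into [0, 1]
    return [[min(1.0, max(0.0, p + random.uniform(-scale, scale)))
             for p in row] for row in img]

def horizontal_flip(img):
    # Mirror each row left-to-right
    return [row[::-1] for row in img]

print(horizontal_flip(image))  # [[0.5, 0.0], [0.25, 1.0]]
```

Each transformed copy keeps the original’s label, artificially enlarging the training set.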
Augmented Data

Note: be very aware of what your data looks like before selecting an augmentation. Is a rotated 9 still a 9?
Partitioning the Data
● In ML, we want to know how well our systems generalize
● We want to see how well these models perform on data they haven’t seen before
○ ML is useless unless it can work for new data in the real world
○ We need to have specific data that we set aside to test generalization with: data our model hasn’t
seen before during its training phase
● We make 3 splits of our data (the ratios of these splits vary):
○ Training data: data used for optimizing the parameters
○ Validation data: data used to diagnose the training stage; to help select the kinds of models and
techniques that perform the best for the current problem
○ Testing data: data used for testing a model’s generalization only at the very end of the process
● There are more advanced ways of doing this (not covered here)
○ Ex. K-fold cross validation
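The three-way split can be sketched as follows (the 80/10/10 ratio is just one common choice, not prescribed by the slides):

```python
import random

def split_data(data, train_frac=0.8, val_frac=0.1, seed=0):
    # Shuffle a copy so the splits aren't biased by the original ordering
    data = data[:]
    random.Random(seed).shuffle(data)
    n_train = int(len(data) * train_frac)
    n_val = int(len(data) * val_frac)
    return (data[:n_train],
            data[n_train:n_train + n_val],
            data[n_train + n_val:])     # the rest is the test set

train, val, test = split_data(list(range(100)))
print(len(train), len(val), len(test))  # 80 10 10
```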
Define the Model and Loss Function
Define the Model
● This is where we define our model
○ We are going to talk a LOT about different kinds of models in this course; don’t worry too much if
you don’t know any of them right now
● Which model can be used to solve our ML problem?
○ Some modeling techniques only work for very specific tasks
● Which model can best capture the structure of our data?
○ You may not know this right away but, for now, take an educated guess!
○ Remember what your model outputs should look like:
■ Will it be a single value, multiple values, images, text, etc?
○ We can also train different models on our training data, test them on the validation data and choose
the best-performing one
■ This is called hyperparameter tuning
○ Don’t spam model tests, quality over quantity
Define the Loss
● Reminder: we want to learn an optimal selection of parameters
○ We need some metric to optimize for
○ Different models will have different ways of learning specific parameters, but ALL of them will try to
optimize some kind of metric/function
● Once you have a model and your data, your job will be to minimize a loss function
○ High loss ⇒ bad parameters, low loss ⇒ good parameters
● Example: Supervised Learning
○ In supervised learning, we want our model’s output to match some labels, both of which are vectors
of the same shape
○ We can define the loss as the Mean Squared Error between our labels and model predictions
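The Mean Squared Error for the supervised example above can be written directly. The label and prediction vectors here are made up for illustration:

```python
# MSE between a label vector and a prediction vector of the same length
def mse(labels, preds):
    return sum((y - p) ** 2 for y, p in zip(labels, preds)) / len(labels)

label = [0, 0, 0, 1, 0]            # one-hot label for class index 3
good  = [0.0, 0.0, 0.1, 0.8, 0.1]  # prediction close to the label
bad   = [0.3, 0.3, 0.2, 0.1, 0.1]  # prediction far from the label
print(mse(label, good), mse(label, bad))  # lower loss = better parameters
```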
Train the Model,
Minimize the Loss
Training
● By now, we have our data, model, and loss function selected
● Now is the phase when we apply different algorithms to select the best
parameters using the training dataset
● Different models have different training procedures
○ These will be covered when we introduce each model
DONE!
Finishing Steps
● Use your testing set to measure your model’s final performance
● If it is a common, widely-available dataset, you can use it to compare your results
against state-of-the-art systems and see how you stack up!
○ MNIST (this is the handwritten digits dataset you saw earlier) is often used for proof-of-concepts
○ ImageNet is a major benchmark for image classification and generation
Generalization in ML
Generalization
● How well a model generalizes can be
characterized by the difference between
its performance on data it has seen vs
not seen
● If a model is made more “complex”, it
might be able to learn more “complex
patterns” but we also risk simply
memorizing the training data instead of
truly learning anything from it
Bias / Variance
● Bias:
○ A tendency towards certain predictions
○ How wrong is the model on average, regardless
of its training data?
● Variance:
○ How sensitive is the model to changes in the
training data?
○ Small changes in dataset → large changes in our
model and its predictions
● A good model needs to be both firm and
flexible: able to capture varying and
complex data yet robust enough to
generalize beyond just the training samples
How different hyperparameter settings (“model complexity”) can
affect generalization
Overfitting and Underfitting
● When we train (or perform hyperparameter tuning), we care about generalization
○ We need to hold out a small segment of our data to test our model with as we train (validation set)
○ We care about the discrepancy between training and validation metrics, as it is a good proxy for the
model’s final generalization capabilities on the unseen test set
● This involves a balance between the bias and variance errors
○ Underfitting: the model performs poorly on both the training and validation data
■ This indicates that you can likely increase model complexity without taking too much of a hit
to generalization performance
■ In terms of bias/variance, this means you have a high bias error
○ Overfitting: the model performs great on the training data but poorly on the validation data
■ This indicates that our model is in some way too complex and has started memorizing instead
of learning; it needs to be scaled down
■ In terms of bias/variance, this means you have a high variance error
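Overfitting can be demonstrated with a toy experiment (not from the slides): fit polynomials of two different complexities to noisy linear data and compare training vs. validation error.

```python
import numpy as np

# Noisy samples from an underlying linear function y = 2x
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2 * x_train + rng.normal(0, 0.1, 10)
x_val = np.linspace(0.05, 0.95, 10)
y_val = 2 * x_val + rng.normal(0, 0.1, 10)

errors = {}
for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    val_err = float(np.mean((np.polyval(coeffs, x_val) - y_val) ** 2))
    errors[degree] = (train_err, val_err)

# The degree-9 polynomial passes through every training point (near-zero
# training error) but typically does worse on the held-out validation data
print(errors)
```

The gap between training and validation error is exactly the diagnostic described above: the degree-9 model has memorized the noise rather than learned the trend.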
Wrapping Up
Important Takeaways
● ML is template creation
● ML Pipeline
○ Define the problem
○ Prepare the data
○ Define the model and a loss function
○ Train the model
○ Report results
● There frequently exists a tradeoff between model complexity and generalization,
known as the bias-variance tradeoff
Contributors
● Slides by Jake Austin and Brian Liu
● Edited by Aryan Jain
