Week 3: Learning I
Lectures on learning
§ Learning: a process for improving the performance of an agent
through experience
§ Learning I (today):
§ The general idea: generalization from experience
§ Supervised learning: classification and regression
§ Learning II: neural networks and deep learning
§ Reinforcement learning: learning complex V and Q functions
Supervised learning
§ To learn an unknown target function f
§ Input: a training set of labeled examples (xj,yj) where yj = f(xj)
§ E.g., xj is an image, f(xj) is the label “giraffe”
§ E.g., xj is a seismic signal, f(xj) is the label “explosion”
§ Output: hypothesis h that is “close” to f, i.e., predicts well on unseen
examples (“test set”)
§ Many possible hypothesis families for h
§ Linear models, logistic regression, neural networks, decision trees, instance-based methods (nearest-neighbor), grammars, kernelized separators, etc.
§ Classification = learning f with discrete output values
§ Regression = learning f with real-valued outputs
Inductive Learning (Science)
§ Simplest form: learn a function from examples
§ A target function: g
§ Examples: input-output pairs (x, g(x))
§ E.g. x is an email and g(x) is spam / ham
§ E.g. x is a house and g(x) is its selling price
§ Problem:
§ Given a hypothesis space H
§ Given a training set of examples (xi, g(xi))
§ Find a hypothesis h(x) such that h ~ g
§ Includes:
§ Classification (outputs = class labels)
§ Regression (outputs = real numbers)
Classification example: Object recognition
[Figure: an input image x; the task is to output the label f(x)]
Example: Spam Filter
§ Input: an email
§ Output: spam/ham
§ Setup:
§ Get a large collection of example emails, each labeled “spam” or “ham” (by hand)
§ Learn to predict labels of new incoming emails
§ Classifiers reject 200 billion spam emails per day
§ Features: the attributes used to make the ham/spam decision
§ Words: FREE!
§ Text Patterns: $dd, CAPS
§ Non-text: SenderInContacts, AnchorLinkMismatch
§ …
[Example emails shown on the slide:]
“Dear Sir. First, I must solicit your confidence in this transaction, this is by virture of its nature as being utterly confidencial and top secret. …”
“TO BE REMOVED FROM FUTURE MAILINGS, SIMPLY REPLY TO THIS MESSAGE AND PUT "REMOVE" IN THE SUBJECT. 99 MILLION EMAIL ADDRESSES FOR ONLY $99”
“Ok, I know this is blatantly OT but I'm beginning to go insane. Had an old Dell Dimension XPS sitting in the corner and decided to put it to use, I know it was working pre being stuck in the corner, but when I plugged it in, hit the power nothing happened.”
Example: Digit Recognition
[Figure: a handwritten digit image, e.g. the digit “1”]
§ Features: The attributes used to make the digit decision
§ Pixels: (6,8)=ON
§ Shape Patterns: NumComponents, AspectRatio, NumLoops
§ …
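As a minimal sketch of how such features might be computed (the feature names and image layout here are assumptions for illustration, not a prescribed pipeline):

```python
import numpy as np

def digit_features(img):
    """Extract simple features from a binarized digit image.

    img: 2-D numpy array of 0/1 pixel values (at least 7 rows, 9 columns,
    so the (6,8) pixel from the slide exists).
    """
    rows = np.any(img, axis=1)               # rows containing any ON pixel
    cols = np.any(img, axis=0)               # columns containing any ON pixel
    height = max(int(rows.sum()), 1)         # bounding-box height
    width = int(cols.sum())                  # bounding-box width
    return {
        "pixel_6_8_on": bool(img[6, 8]),     # Pixels: (6,8)=ON
        "aspect_ratio": width / height,      # Shape pattern: AspectRatio
        "num_on_pixels": int(img.sum()),     # a simple extra feature
    }
    # NumComponents and NumLoops would need connected-component analysis,
    # omitted here to keep the sketch short.
```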
Other Classification Tasks
§ Medical diagnosis
§ input: symptoms
§ output: disease
§ Automatic essay grading
§ input: document
§ output: grades
§ Fraud detection
§ input: account activity
§ output: fraud / no fraud
§ Email routing
§ input: customer complaint email
§ output: which department needs to ignore this email
§ Fruit and vegetable inspection
§ input: image (or gas analysis)
§ output: moldy or OK
§ … many more
Regression example: Curve fitting
[A sequence of five figures showing different curves fit to the same data points]
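A minimal sketch of the idea behind a curve-fitting sequence like this (synthetic data and the particular degrees are assumptions for illustration): fitting polynomials of increasing degree to the same points drives training error toward zero, but the high-degree fit oscillates between the points.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)  # noisy samples

for degree in (1, 3, 9):
    coeffs = np.polyfit(x, y, degree)            # least-squares polynomial fit
    train_err = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {degree}: training MSE = {train_err:.4f}")
# Training error shrinks as the degree grows; the degree-9 curve passes
# (nearly) through every point yet generalizes poorly between them.
```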
Basic questions
§ Which hypothesis space H to choose?
§ How to measure degree of fit?
§ How to trade off degree of fit vs. complexity?
§ “Ockham’s razor”
§ How do we find a good h?
§ How do we know if a good h will predict well?
Training and Testing
A few important points about learning
§ Data: labeled instances, e.g. emails marked spam/ham
§ Training set
§ Held-out set (validation set)
§ Test set
§ Features: attribute-value pairs which characterize each x
§ Experimentation cycle
§ Learn parameters (e.g. model probabilities) on the training set
§ (Tune hyperparameters on the held-out set)
§ Compute accuracy on the test set
§ Very important: never “peek” at the test set!
§ Evaluation
§ Accuracy: fraction of instances predicted correctly
§ Overfitting and generalization
§ Want a classifier which does well on test data
§ Overfitting: fitting the training data very closely, but not generalizing well
§ Underfitting: fits the training set poorly
[Diagram: the data divided into Training Data, Held-Out Data (validation set), and Test Data]
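A minimal sketch of this experimentation cycle (the split fractions and the model's fit/predict interface are assumptions, not part of the lecture):

```python
import numpy as np

def split_data(X, y, train_frac=0.6, heldout_frac=0.2, seed=0):
    """Shuffle once, then split into training, held-out, and test sets."""
    idx = np.random.default_rng(seed).permutation(len(X))
    n_train = int(train_frac * len(X))
    n_held = int(heldout_frac * len(X))
    train = idx[:n_train]
    held = idx[n_train:n_train + n_held]
    test = idx[n_train + n_held:]        # never touched until the very end
    return (X[train], y[train]), (X[held], y[held]), (X[test], y[test])

def accuracy(model, X, y):
    """Fraction of instances predicted correctly."""
    return float(np.mean(model.predict(X) == y))

# Cycle: train on the training set, pick hyperparameters by held-out
# accuracy, and only then compute accuracy once on the test set.
```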
Linear regression
[Figure: house price ($1000s) vs. house size in square feet, with a fitted line through the data]
§ Prediction: hw(x) = w0 + w1x
[Figure: the vertical gap between an observation y and the prediction hw(x) is the error, or “residual”]
Find w
§ Define loss function
§ Loss = ____________________________
§ We want the weights w* that minimize loss
§ At w* the derivatives of loss w.r.t. each weight are zero:
§ ∂Loss/∂w0 = __________________________
§ ∂Loss/∂w1 = __________________________
§ Exact solutions for N examples:
§ w1 = [N Σj xjyj – (Σj xj)(Σj yj)] / [N Σj xj² – (Σj xj)²] and w0 = (1/N)[Σj yj – w1 Σj xj]
§ For the general case where x is an n-dimensional vector:
§ X is the data matrix (all the data, one example per row); y is the column of labels
§ w* = (XᵀX)⁻¹Xᵀy
Least squares: Minimizing squared error
§ L2 loss function: sum of squared errors over all examples
§ Loss = Σj (yj – hw(xj))² = Σj (yj – (w0 + w1xj))²
§ We want the weights w* that minimize loss
§ At w* the derivatives of loss w.r.t. each weight are zero:
§ ∂Loss/∂w0 = –2 Σj (yj – (w0 + w1xj)) = 0
§ ∂Loss/∂w1 = –2 Σj (yj – (w0 + w1xj)) xj = 0
§ Exact solutions for N examples:
§ w1 = [N Σj xjyj – (Σj xj)(Σj yj)] / [N Σj xj² – (Σj xj)²] and w0 = (1/N)[Σj yj – w1 Σj xj]
§ For the general case where x is an n-dimensional vector:
§ X is the data matrix (all the data, one example per row); y is the column of labels
§ w* = (XᵀX)⁻¹Xᵀy
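A sketch of both closed-form solutions in NumPy (solving the normal equations with np.linalg.solve rather than forming an explicit inverse, a standard numerical choice):

```python
import numpy as np

def fit_line(x, y):
    """Closed-form simple linear regression: the w1 and w0 from the slide."""
    N = len(x)
    w1 = (N * np.sum(x * y) - np.sum(x) * np.sum(y)) / \
         (N * np.sum(x ** 2) - np.sum(x) ** 2)
    w0 = (np.sum(y) - w1 * np.sum(x)) / N
    return w0, w1

def fit_linear(X, y):
    """General case: w* = (X^T X)^{-1} X^T y, with a column of 1s for w0."""
    X1 = np.column_stack([np.ones(len(X)), X])   # prepend intercept feature
    return np.linalg.solve(X1.T @ X1, X1.T @ y)  # solve the normal equations
```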
Regression vs Classification
§ Linear regression when output is binary, y ∈ {−1, 1}
§ hw(x) = w0 + w1x
[Figure: binary-labeled data with the regression line w0 + w1x]
§ Linear classification
§ Used with discrete output values
§ Threshold a linear function:
§ hw(x) = 1 if w0 + w1x ≥ 0
§ hw(x) = −1 if w0 + w1x < 0
§ w: weight vector
§ Activation function g
[Figure: the step function g(w0 + w1x)]
Threshold perceptron as linear classifier
Binary Decision Rule
§ A threshold perceptron is a single unit that outputs
§ y = hw(x) = 1 when w·x ≥ 0
§ y = hw(x) = −1 when w·x < 0
§ In the input vector space
§ Examples are points x
§ The equation w·x = 0 defines a hyperplane
§ One side corresponds to y = 1 (SPAM)
§ The other corresponds to y = −1 (HAM)
[Figure: feature space with axes “free” and “money”; with weights w0 = −3, wfree = 4, wmoney = 2, the line w·x = 0 separates the SPAM side (y = 1) from the HAM side (y = −1)]
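Using the weights from the figure (w0 = −3, wfree = 4, wmoney = 2), the decision rule is just the sign of a dot product. A minimal sketch, with word counts as the assumed feature values:

```python
import numpy as np

w = np.array([-3.0, 4.0, 2.0])   # [w0, w_free, w_money] from the figure

def classify(x):
    """Threshold perceptron: +1 (SPAM) if w.x >= 0, else -1 (HAM)."""
    return 1 if np.dot(w, x) >= 0 else -1

# x = [1 (bias), count of "free", count of "money"]
print(classify(np.array([1.0, 1.0, 1.0])))  # -3 + 4 + 2 =  3 -> +1 (SPAM)
print(classify(np.array([1.0, 0.0, 1.0])))  # -3 + 0 + 2 = -1 -> -1 (HAM)
```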
Example
[Figure: two example emails classified by the perceptron with weights w0 = −3, wfree = 4, wmoney = 2. One reads “Dear Stuart, I’m leaving Macrosoft to return to academia. The money is …”; the other (partly cut off) asks “… undergraduates! Do I need to finish my BA first before applying?” Each email’s feature vector (x0 = 1, plus counts of “free” and “money”) is dotted with w to decide SPAM vs HAM]
Non-Separable
Example: Earthquakes vs nuclear explosions
[Figures: (left) seismic events plotted in (x1, x2) feature space — the two classes overlap, so the data are not linearly separable; (right) proportion correct (0.4–1.0) vs. number of weight updates (0–700) for perceptron learning]
Non-Separable
§ Convergence: if the training data are non-separable, perceptron learning will converge to a minimum-error solution, provided the learning rate α is decayed appropriately (e.g., α = 1/t)
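A sketch of perceptron learning with the decayed rate α = 1/t (the data arrays and the mistake-driven form of the update are assumptions; this is the classic update w ← w + α·y·x on errors):

```python
import numpy as np

def train_perceptron(X, y, epochs=100):
    """Perceptron learning with decayed learning rate alpha = 1/t.

    X: (N, d) array with a bias feature in column 0; y: labels in {-1, +1}.
    """
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for xj, yj in zip(X, y):
            t += 1
            alpha = 1.0 / t                        # decayed learning rate
            pred = 1 if np.dot(w, xj) >= 0 else -1
            if pred != yj:                         # update only on mistakes
                w += alpha * yj * xj
    return w
```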
Perceptron learning with fixed α
[Figures: (left) the earthquake/explosion data in (x1, x2) space with the learned boundary; (right) proportion correct vs. number of weight updates (0–100,000) — with a fixed learning rate, accuracy keeps fluctuating]
Perceptron learning with decayed α
[Figures: the same data and axes; with the decayed learning rate α = 1/t, the proportion correct settles as updates accumulate]
[Figure: the threshold function Threshold(w · x) plotted against x — a step function]
Perceptrons hopeless for XOR function
[Figure: three panels plotting x2 vs. x1 with points at the corners of the unit square: (a) x1 and x2 and (b) x1 or x2 are linearly separable, but (c) x1 xor x2 is not]
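One quick way to see this concretely: brute-force a grid of weight vectors and observe that none classifies all four XOR points correctly (the grid itself is an illustrative assumption; the impossibility holds for every linear rule):

```python
import itertools
import numpy as np

# The four XOR examples, with a bias feature of 1 in column 0.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
y = np.array([-1, 1, 1, -1])   # x1 xor x2, encoded as -1/+1

best = 0
for w in itertools.product(np.linspace(-3, 3, 13), repeat=3):
    preds = np.where(X @ np.array(w) >= 0, 1, -1)
    best = max(best, int(np.sum(preds == y)))
print(best)   # prints 3: every linear rule gets at most 3 of the 4 points right
```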
Basic questions
§ Which hypothesis space H to choose?
§ How to measure degree of fit?
§ How to trade off degree of fit vs. complexity?
§ “Ockham’s razor”
§ How do we find a good h?
§ How do we know if a good h will predict well?
Classical stats/ML: Minimize loss function
§ Which hypothesis space H to choose?
§ E.g., linear combinations of features: hw(x) = wTx
§ How to measure degree of fit?
§ Loss function, e.g., squared error Σj (yj – wTxj)2
§ How to trade off degree of fit vs. complexity?
§ Regularization: complexity penalty, e.g., ||w||2
§ How do we find a good h?
§ Optimization (closed-form, numerical); discrete search
§ How do we know if a good h will predict well?
§ Try it and see (cross-validation, bootstrap, etc.)
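“Try it and see” can be as simple as k-fold cross-validation. A minimal sketch (the make_model factory and its fit/predict interface are assumed for illustration):

```python
import numpy as np

def cross_val_accuracy(make_model, X, y, k=5, seed=0):
    """k-fold cross-validation: average accuracy over k held-out folds."""
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = make_model()                 # fresh model for each fold
        model.fit(X[train], y[train])
        scores.append(np.mean(model.predict(X[test]) == y[test]))
    return float(np.mean(scores))
```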
Probabilistic: Max. likelihood, max. a posteriori
§ Which hypothesis space H to choose?
§ Probability model P(y | x,h) , e.g., Y ~ N(wTx,σ2)
§ How to measure degree of fit?
§ Data likelihood Πj P(yj | xj,h)
§ How to trade off degree of fit vs. complexity?
§ Regularization or prior: argmaxh P(h) Πj P(yj | xj,h) (maximum a posteriori, MAP)
§ How do we find a good h?
§ Optimization (closed-form, numerical); discrete search
§ How do we know if a good h will predict well?
§ Empirical process theory (generalizes Chebyshev, CLT, PAC…);
§ Key assumption: the data are i.i.d. (independent, identically distributed)
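For the Gaussian model on this slide, the MAP objective with a Gaussian prior on w reduces to least squares plus the familiar ridge penalty. A sketch (the prior and the λ parameterization are stated assumptions):

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """MAP estimate for Y ~ N(w^T x, sigma^2) with prior w ~ N(0, tau^2 I).

    Equivalent to minimizing sum_j (y_j - w^T x_j)^2 + lam * ||w||^2,
    with lam = sigma^2 / tau^2. Closed form: (X^T X + lam I)^{-1} X^T y.
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
```

Setting lam=0 recovers the maximum-likelihood (plain least-squares) solution.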
Bayesian: Computing posterior over H
§ Which hypothesis space H to choose?
§ All hypotheses with nonzero a priori probability
§ How to measure degree of fit?
§ Data probability, as for MLE/MAP
§ How to trade off degree of fit vs. complexity?
§ Use prior, as for MAP
§ How do we find a good h?
§ Don’t! Bayes predictor: P(y|x,D) = Σh P(y|x,h) P(h|D) ∝ Σh P(y|x,h) P(D|h) P(h)
§ How do we know if a good h will predict well?
§ Silly question! Bayesian prediction is optimal!!
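A sketch of the Bayes predictor over a tiny discrete hypothesis space (representing each hypothesis as a function returning P(y | x, h) is an illustrative choice, not part of the lecture):

```python
import numpy as np

def bayes_predict(hypotheses, priors, data, x_new, y_new):
    """Posterior-weighted prediction: P(y|x,D) = sum_h P(y|x,h) P(h|D).

    hypotheses: list of functions h(y, x) returning P(y | x, h)
    priors: P(h) for each hypothesis; data: list of (xj, yj) pairs.
    """
    # Unnormalized posterior: P(h) * prod_j P(yj | xj, h)
    post = np.array([p * np.prod([h(yj, xj) for xj, yj in data])
                     for h, p in zip(hypotheses, priors)])
    post /= post.sum()                       # normalize to get P(h|D)
    return sum(w * h(y_new, x_new) for h, w in zip(hypotheses, post))
```

No single h is ever selected; every hypothesis contributes in proportion to its posterior probability.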
Acknowledgement