Linear Classifier
by
Dr. Sanjeev Kumar
Associate Professor
Department of Mathematics
IIT Roorkee, Roorkee-247 667, India
[email protected]
Linear models
A strong high-bias assumption is linear separability:
in two dimensions, the classes can be separated by a line
Goals:
Explore a number of linear training algorithms
The model (hyperplane):
$$0 = b + \sum_{j=1}^{m} w_j f_j$$

Indicator function:
$$1[x] = \begin{cases} 1 & \text{if } x = \text{True} \\ 0 & \text{if } x = \text{False} \end{cases}$$
Distance from the hyperplane:
$$\text{distance} = b + \sum_{j=1}^{m} w_j x_j = w \cdot x + b$$

Total number of mistakes on the training data, aka the 0/1 loss:
$$\text{0/1 loss} = \sum_{i=1}^{n} 1[\, y_i (w \cdot x_i + b) \le 0 \,]$$
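To make these two quantities concrete, here is a minimal NumPy sketch (my own illustration, not from the slides): the function name and the toy data are made up, and the labels are assumed to be in {-1, +1} so that a mistake corresponds to y_i(w·x_i + b) ≤ 0.

```python
import numpy as np

def zero_one_loss(w, b, X, y):
    """Distance-like scores w.x + b and the number of mistakes (0/1 loss).
    X: (n, m) feature matrix, y: (n,) labels in {-1, +1} (assumption)."""
    scores = X @ w + b              # b + sum_j w_j x_ij for every example
    mistakes = y * scores <= 0      # 1[ y_i (w.x_i + b) <= 0 ]
    return scores, int(mistakes.sum())

# toy usage with made-up data
X = np.array([[1.0, 2.0], [2.0, -1.0], [-1.0, -1.0]])
y = np.array([+1, -1, -1])
w, b = np.array([0.5, 0.5]), -0.5
print(zero_one_loss(w, b, X, y))    # (scores, number of mistakes)
```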
Model-based machine learning
1. pick a model
$$0 = b + \sum_{j=1}^{m} w_j f_j$$

Find $w$ and $b$ that minimize the 0/1 loss:
$$\operatorname*{argmin}_{w,b} \; \sum_{i=1}^{n} 1[\, y_i (w \cdot x_i + b) \le 0 \,]$$
How do we do this?
How do we minimize a function?
Why is it hard for this function?
Minimizing 0/1 in one dimension
$$\operatorname*{argmin}_{w,b} \; \sum_{i=1}^{n} 1[\, y_i (w \cdot x_i + b) \le 0 \,] \qquad \text{(find } w, b \text{ that minimize the 0/1 loss)}$$

[Plot: the 0/1 loss as a function of a single weight w]
What property/properties do we want from our loss function?
More manageable loss functions
Ideas?
Some function that is a proxy for error, but is continuous and convex.
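As one concrete (illustrative, not from the slides) way to compare candidates, the snippet below evaluates the 0/1 loss against two standard convex surrogates, the hinge loss and the exponential loss that these slides use later, as functions of the margin y(w·x + b).

```python
import numpy as np

# margin z = y * (w.x + b); z > 0 means the example is classified correctly
z = np.linspace(-2.0, 2.0, 9)

zero_one    = (z <= 0).astype(float)   # 1[ y(w.x + b) <= 0 ] -- discontinuous, not convex
hinge       = np.maximum(0.0, 1 - z)   # convex, continuous upper bound on 0/1
exponential = np.exp(-z)               # convex, smooth; the surrogate used later on

for name, vals in [("0/1", zero_one), ("hinge", hinge), ("exp", exponential)]:
    print(f"{name:5s}", np.round(vals, 2))
```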
Surrogate loss functions
You’re blindfolded, but you can see out of the bottom of the
blindfold to the ground right by your feet. I drop you off
somewhere and tell you that you’re in a convex-shaped valley
and escape is at the bottom/minimum. How do you get out?
Finding the minimum
[Plot: a convex loss as a function of w]
One approach: gradient descent
Approach:
pick a starting point (w)
repeat:
pick a dimension
move a small amount in that dimension towards decreasing loss (using the derivative)
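A minimal sketch of this procedure (my own illustration): pick one dimension at a time and step against a finite-difference estimate of the derivative along it. The loss function, step size, and names here are made up for the example.

```python
import numpy as np

def coordinate_descent(loss, w0, eta=0.1, steps=200, eps=1e-6):
    """Repeat: pick a dimension, estimate the derivative along it,
    and move a small amount towards decreasing loss."""
    w = np.array(w0, dtype=float)
    for t in range(steps):
        j = t % len(w)                                   # pick a dimension (cycled)
        e = np.zeros_like(w)
        e[j] = eps
        deriv = (loss(w + e) - loss(w - e)) / (2 * eps)  # derivative in dimension j
        w[j] -= eta * deriv                              # small step downhill
    return w

# made-up convex loss with minimum at (1, -2)
loss = lambda w: (w[0] - 1) ** 2 + (w[1] + 2) ** 2
print(coordinate_descent(loss, w0=[0.0, 0.0]))           # approaches [1, -2]
```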
Gradient descent
For the exponential (surrogate) loss:
$$\frac{d}{dw_j}\,\text{loss} = \frac{d}{dw_j} \sum_{i=1}^{n} \exp(-y_i (w \cdot x_i + b))$$
$$= \sum_{i=1}^{n} \exp(-y_i (w \cdot x_i + b)) \, \frac{d}{dw_j}\big[ -y_i (w \cdot x_i + b) \big]$$
$$= \sum_{i=1}^{n} -y_i x_{ij} \exp(-y_i (w \cdot x_i + b))$$
Gradient descent
$$w_j = w_j + \eta \sum_{i=1}^{n} y_i x_{ij} \exp(-y_i (w \cdot x_i + b))$$
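In code, one batch step of this update might look like the sketch below (an illustration under the slide's assumptions: exponential loss, labels in {-1, +1}). The bias update is not written out on the slide, so treating b like a feature that is always 1 is my assumption.

```python
import numpy as np

def exp_loss_step(w, b, X, y, eta=0.1):
    """One gradient-descent step on sum_i exp(-y_i (w.x_i + b))."""
    c = np.exp(-y * (X @ w + b))        # exp(-y_i (w.x_i + b)) for every example
    w = w + eta * (y * c) @ X           # w_j += eta * sum_i y_i x_ij exp(...)
    b = b + eta * np.sum(y * c)         # bias treated as an always-1 feature (assumption)
    return w, b
```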
Model:
$$\text{prediction} = b + \sum_{j=1}^{m} w_j f_j$$

Objective (exponential surrogate loss), where $y_i$ is the label and $w \cdot x_i + b$ is the prediction:
$$\operatorname*{argmin}_{w,b} \; \sum_{i=1}^{n} \exp(-y_i (w \cdot x_i + b))$$
$$0 = b + \sum_{j=1}^{m} w_j f_j$$
Any preferences?
$$\operatorname*{argmin}_{w,b} \; \sum_{i=1}^{n} \text{loss}(y\, y') + \lambda\, \text{regularizer}(w, b)$$
Common regularizers
Sum of the weights:
$$r(w, b) = \sum_{w_j} |w_j|$$

Sum of the squared weights (2-norm):
$$r(w, b) = \sqrt{\sum_{w_j} w_j^2}$$
p-norm:
$$r(w, b) = \sqrt[p]{\sum_{w_j} |w_j|^p} = \|w\|_p$$
For example, if $w_1 = 0.5$, the value of $w_2$ that gives $\|w\|_p = 1$:

p     w2
1     0.5
1.5   0.75
2     0.87
3     0.95
∞     1
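These numbers are easy to reproduce (my own check; the slide's values are rounded): for w1 = 0.5, solve ||(w1, w2)||_p = 1 for w2.

```python
w1 = 0.5
for p in [1, 1.5, 2, 3]:
    w2 = (1 - w1 ** p) ** (1 / p)      # ||(w1, w2)||_p = 1  =>  w2 = (1 - w1^p)^(1/p)
    print(f"p = {p:<3}  w2 = {w2:.2f}")
print("p = inf  w2 = 1.00")            # infinity-norm: max(|w1|, |w2|) = 1
```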
p-norms visualized
$$\operatorname*{argmin}_{w,b} \; \sum_{i=1}^{n} \text{loss}(y\, y') + \lambda\, \text{regularizer}(w)$$
The regularizer should be convex.
Convexity revisited
Prove:
$$z(t x_1 + (1-t) x_2) \le t\, z(x_1) + (1-t)\, z(x_2) \qquad \forall\; 0 < t < 1$$
for
$$r(w, b) = \sqrt[p]{\sum_{w_j} |w_j|^p} = \|w\|_p$$
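A short proof sketch (my addition, assuming $p \ge 1$ so that $\|\cdot\|_p$ is a norm): convexity follows from the triangle (Minkowski) inequality together with absolute homogeneity.

```latex
% Convexity of z(w) = ||w||_p for p >= 1, for 0 < t < 1:
\begin{align*}
\|t w_1 + (1-t) w_2\|_p
  &\le \|t w_1\|_p + \|(1-t) w_2\|_p   && \text{(Minkowski / triangle inequality)} \\
  &= t\,\|w_1\|_p + (1-t)\,\|w_2\|_p   && \text{(absolute homogeneity, } t,\, 1-t \ge 0\text{)}
\end{align*}
```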
Putting it together with the exponential loss and the 2-norm regularizer:
$$\operatorname*{argmin}_{w,b} \; \sum_{i=1}^{n} \exp(-y_i (w \cdot x_i + b)) + \frac{\lambda}{2} \|w\|^2$$
Some more maths
$$\frac{d}{dw_j}\,\text{objective} = \frac{d}{dw_j}\left[ \sum_{i=1}^{n} \exp(-y_i (w \cdot x_i + b)) + \frac{\lambda}{2} \|w\|^2 \right]$$

… (some math happens: the first term was differentiated above, and $\frac{d}{dw_j}\,\frac{\lambda}{2}\|w\|^2 = \lambda w_j$)

$$= -\sum_{i=1}^{n} y_i x_{ij} \exp(-y_i (w \cdot x_i + b)) + \lambda w_j$$
Gradient descent
$$w_j = w_j + \eta \sum_{i=1}^{n} y_i x_{ij} \exp(-y_i (w \cdot x_i + b)) - \eta \lambda w_j$$
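Put together as a full training loop, the regularized update might look like this (an illustrative sketch: the function name, toy data, step size, and the unregularized bias update are my choices, not from the slides):

```python
import numpy as np

def train_exp_loss_l2(X, y, lam=0.1, eta=0.01, steps=500):
    """Batch gradient descent on sum_i exp(-y_i (w.x_i + b)) + (lam/2)||w||^2,
    with labels y in {-1, +1}."""
    m = X.shape[1]
    w, b = np.zeros(m), 0.0
    for _ in range(steps):
        c = np.exp(-y * (X @ w + b))                # exp(-y_i (w.x_i + b))
        w = w + eta * (y * c) @ X - eta * lam * w   # the regularized update above
        b = b + eta * np.sum(y * c)                 # bias left unregularized (assumption)
    return w, b

# toy usage on made-up separable data
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([+1, +1, -1, -1])
w, b = train_exp_loss_l2(X, y)
print(np.sign(X @ w + b))                           # should match y
```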
The update
$$w_j = w_j + \eta\, y_i x_{ij} \exp(-y_i (w \cdot x_i + b)) - \eta \lambda\, w_j$$
With the 1-norm regularizer, $\lambda \|w\|_1$:
$$\frac{d}{dw_j}\,\text{objective} = \frac{d}{dw_j}\left[ \sum_{i=1}^{n} \exp(-y_i (w \cdot x_i + b)) + \lambda \|w\|_1 \right]$$
$$= -\sum_{i=1}^{n} y_i x_{ij} \exp(-y_i (w \cdot x_i + b)) + \lambda\, \text{sign}(w_j)$$
L1 regularization
$$w_j = w_j + \eta\, y_i x_{ij} \exp(-y_i (w \cdot x_i + b)) - \eta \lambda\, \text{sign}(w_j)$$
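For a single example, the 2-norm and 1-norm versions of the update differ only in the last term; a small sketch (illustrative names; the bias update is handled analogously as an assumption):

```python
import numpy as np

def per_example_update(w, b, x_i, y_i, eta=0.01, lam=0.1, reg="l2"):
    """Per-example update for the exponential loss with L2 or L1 regularization."""
    c = np.exp(-y_i * (np.dot(w, x_i) + b))          # exp(-y_i (w.x_i + b))
    penalty = lam * w if reg == "l2" else lam * np.sign(w)
    w = w + eta * y_i * x_i * c - eta * penalty      # ... - eta*lam*w_j  or  - eta*lam*sign(w_j)
    b = b + eta * y_i * c                            # bias: analogous, unregularized (assumption)
    return w, b
```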
Lp:
$$w_j = w_j + \eta\, \big(\text{loss\_correction} - \lambda\, c\, w_j^{\,p-1}\big)$$