Machine Learning: Probabilistic View of Linear Regression, Logistic Regression, Hyperplane-Based Classifiers and Perceptron

The document discusses machine learning topics including linear regression, logistic regression, and hyperplane-based classifiers. It begins by introducing probabilistic linear regression, describing it using a maximum likelihood estimation framework. It then discusses regularized linear regression using a maximum a posteriori estimation approach, introducing a prior over the model parameters to encourage simpler models.

Machine Learning by ambedkar@IISc

I Probabilistic view of linear regression

I Logistic regression

I Hyperplane based classifiers and perceptron

Topics

Probabilistic View of Linear Regression

Logistic Regression

Hyperplane based classifiers and Perceptron
Probabilistic View of Linear Regression

Maximum Likelihood Estimation

I Let X = x1, x2, ..., xN, where xn ∈ R^d, be some data that is generated from xn ∼ P(x|θ).
I Recall: In the statistical approach to machine learning, we assume that there is an underlying probability distribution from which the data is sampled.
I Hence θ denotes the parameters of the distribution.
I For example, xn ∼ N(x|µ, σ). That is, θ = (µ, σ).
I Assumption: The data in X is generated i.i.d. (independent and identically distributed). This is a very important assumption and we will see it very often.
I Aim: Learn θ given the data X = x1, x2, ..., xN.
Diversion: Some Probability

I We say two random variables X, Y are identical if their probability distributions are the same.
I Two Gaussian random variables are identical if and only if their means and variances (covariance matrices) are the same.
I We say two random variables X, Y are independent if

  P(X, Y) = P(X)P(Y)
Maximum Likelihood Estimation (contd...)

I Given X = x1, x2, ..., xN, with xn ∼ P(x|θ).
I Learn P so that the likelihood that x1, x2, ..., xN are sampled from P is maximum.
I Equivalently, learn or estimate θ so that the likelihood that x1, x2, ..., xN are sampled from P is maximum.
I By the i.i.d. assumption,

  P(X|θ) = P(x1, x2, ..., xN|θ) = ∏_{n=1}^N P(xn|θ)

I P(X|θ) is the likelihood.
Maximum Likelihood Estimation (contd...)

How do we estimate θ given the data X?

Find the value of θ that makes the observed data most probable.

Find θ that maximizes the log-likelihood function

  L(θ) = log P(X|θ) = Σ_{n=1}^N log P(xn|θ)
Maximum Likelihood Estimation (contd...)

  θ_MLE = arg max_θ L(θ) = arg max_θ Σ_{n=1}^N log P(xn|θ)
Maximum Likelihood Estimation (contd...)

Example:
Suppose xn is a binary random variable that follows a Bernoulli distribution, i.e. P(x|θ) = θ^x (1 − θ)^(1−x). Then

  L(θ) = Σ_{n=1}^N log P(xn|θ) = Σ_{n=1}^N [ xn log θ + (1 − xn) log(1 − θ) ]

  ∂L(θ)/∂θ = (1/θ) Σ_{n=1}^N xn − (1/(1 − θ)) Σ_{n=1}^N (1 − xn)
           = (1/θ) Σ_{n=1}^N xn − (1/(1 − θ)) (N − Σ_{n=1}^N xn)
Maximum Likelihood Estimation (contd...)

Setting this derivative to zero,

  θ*_MLE = (Σ_{n=1}^N xn) / N

[In a coin-tossing experiment, this is just the fraction of heads.]
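As a quick illustration (not part of the original slides), here is a minimal NumPy sketch of this result on a made-up coin-flip sample; it also checks the closed form against a brute-force grid search over the log-likelihood:

```python
import numpy as np

# Hypothetical coin-flip data: 1 = heads, 0 = tails.
x = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])

# Closed-form MLE for the Bernoulli parameter: the fraction of heads.
theta_mle = x.mean()

# Sanity check: maximize the log-likelihood L(theta) on a grid.
thetas = np.linspace(0.01, 0.99, 99)
log_lik = x.sum() * np.log(thetas) + (len(x) - x.sum()) * np.log(1 - thetas)
theta_grid = thetas[np.argmax(log_lik)]

print(theta_mle, theta_grid)  # both are (close to) the fraction of heads, here 0.7
```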
Maximum a Posteriori Estimate

I We will place a prior on the parameter θ, i.e. P(θ).
  θ is no longer a mere number; it is a random variable.
I One can encode prior knowledge about θ.
I The prior acts as a regularizer (we will see this later).
I Bayes rule:

  P(θ|X) = P(X|θ) P(θ) / P(X)

  P(θ|X): posterior
  P(X|θ): likelihood
  P(θ): prior
  P(X): evidence
Maximum a Posteriori Estimate (contd...)

Bayes rule:

  P(θ|X) = P(X|θ) P(θ) / P(X)
Maximum a Posteriori Estimate (contd...)

MAP Estimate

  θ_MAP = arg max_θ log P(θ|X)
        = arg max_θ [ log P(X|θ) + log P(θ) ]
        = arg max_θ [ Σ_{n=1}^N log P(xn|θ) + log P(θ) ]

Note: When P(θ) is a uniform distribution, MAP reduces to MLE.
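The following sketch (an illustration with an assumed Beta(2, 2) prior, not from the slides) continues the coin-flip example: it maximizes log-likelihood + log-prior on a grid; with a flat prior the MAP estimate would coincide with the MLE, as the note above says.

```python
import numpy as np

# Same hypothetical coin-flip data as in the MLE sketch.
x = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])

thetas = np.linspace(0.01, 0.99, 99)
log_lik = x.sum() * np.log(thetas) + (len(x) - x.sum()) * np.log(1 - thetas)

# Assumed prior: Beta(2, 2), which gently prefers theta near 0.5.
a, b = 2.0, 2.0
log_prior = (a - 1) * np.log(thetas) + (b - 1) * np.log(1 - thetas)

theta_mle = thetas[np.argmax(log_lik)]
theta_map = thetas[np.argmax(log_lik + log_prior)]
print(theta_mle, theta_map)  # the MAP estimate is pulled slightly toward 0.5 by the prior
```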
Linear Regression : Probabilistic Setting

I Each response is generated by a linear model plus Gaussian noise:

  y = w^T x + ε

I That is, we are given N i.i.d. training samples {(xn, yn)}_{n=1}^N with xn ∈ R^d and yn ∈ R.
I εn ∼ N(0, σ²)
I yn ∼ N(w^T xn, σ²)

  ⟹ P(y|x, w) = N(y|w^T x, σ²) = (1/(σ√(2π))) exp( −(y − w^T x)² / (2σ²) )
Linear Regression : ML Estimation

Log Likelihood

  log L(w) = log P(D|w) = log P(y|X, w)
           = log ∏_{n=1}^N P(yn|xn, w)
           = Σ_{n=1}^N log P(yn|xn, w)
           = Σ_{n=1}^N [ −(1/2) log(2πσ²) − (yn − w^T xn)² / (2σ²) ]
Linear Regression : ML Estimation (contd...)

  w*_MLE = arg max_w −(1/(2σ²)) Σ_{n=1}^N (yn − w^T xn)²
         = arg min_w (1/(2σ²)) Σ_{n=1}^N (yn − w^T xn)²

i.e. ML estimation under Gaussian noise ≡ the least squares objective for regression.
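A minimal sketch of this equivalence, assuming simulated data and NumPy's least squares solver:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (assumed for illustration): y_n = w_true . x_n + Gaussian noise.
N, d = 200, 3
w_true = np.array([1.5, -2.0, 0.5])
X = rng.normal(size=(N, d))
y = X @ w_true + rng.normal(scale=0.1, size=N)

# MLE under Gaussian noise = ordinary least squares.
w_mle, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w_mle)  # close to w_true
```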
Linear Regression : MAP Estimate

I Here we introduce a prior on the parameter w.
  ⇒ This will lead to regularization of the model.
I Remember: in MAP we treat the parameters as random variables.
I P(w) = N(w | 0, λ^{-1} I)   (mean 0, covariance λ^{-1} I)
I The multivariate Gaussian is

  N(x; µ, Σ) = (1/√((2π)^D |Σ|)) exp( −(1/2)(x − µ)^T Σ^{-1} (x − µ) )

  so that

  P(w) = (1/((2π)^{D/2} (1/λ)^{D/2})) exp( −(λ/2) w^T w )
Linear Regression : MAP Estimate (contd...)

I Log posterior probability:

  log P(w|D) = log [ P(D|w) P(w) / P(D) ]
             = log P(w) + log P(D|w) − log P(D)

  w_MAP = arg max_w log P(w|D)
        = arg max_w [ log P(w) + log P(D|w) − log P(D) ]
        = arg max_w [ log P(w) + log P(D|w) ]    (P(D) does not depend on w)
Linear Regression : MAP Estimate (contd...)

  w_MAP = arg max_w log P(w|D)
        = arg max_w [ −(D/2) log 2π − (λ/2) w^T w
                      + Σ_{n=1}^N ( −(1/2) log(2πσ²) − (yn − w^T xn)² / (2σ²) ) ]
        = arg min_w [ (1/(2σ²)) Σ_{n=1}^N (yn − w^T xn)² + (λ/2) w^T w ]

The MAP estimate under Gaussian noise and a Gaussian prior ≡ the least squares objective with L2 regularization.
MLE vs MAP

MAP estimate shrinks the estimate of w towards the prior.
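A small sketch of that shrinkage effect (simulated data; the ridge parameter lam is an assumed value standing in for λσ² from the derivation above):

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 50, 3
w_true = np.array([1.5, -2.0, 0.5])
X = rng.normal(size=(N, d))
y = X @ w_true + rng.normal(scale=0.5, size=N)

# MLE (least squares) vs. MAP with a zero-mean Gaussian prior (ridge regression).
# lam stands in for lambda * sigma^2 from the derivation above (an assumed value).
lam = 10.0
w_mle = np.linalg.solve(X.T @ X, X.T @ y)
w_map = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

print(np.linalg.norm(w_mle), np.linalg.norm(w_map))  # the MAP weights are shrunk toward 0
```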

Logistic Regression
Problem Set Up

I Two-class classification.
I Instead of the exact labels, estimate the probabilities of the labels.
I i.e. predict

  P(yn = 1|xn, w) = µn
  P(yn = 0|xn, w) = 1 − µn
The Logistic Regression Model

  µn = f(xn) = σ(w^T xn) = 1/(1 + exp(−w^T xn)) = exp(w^T xn)/(1 + exp(w^T xn))

I Here σ is the sigmoid or logistic function.
I The model first computes a real-valued score

  w^T x = Σ_{i=1}^d wi xi

  and non-linearly squashes it into (0, 1) to turn it into a probability.

[Figure: the sigmoid µ = σ(w^T x), an S-shaped curve from 0 to 1 as a function of the score w^T x.]
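A tiny sketch of the squashing step, using some hypothetical score values w^T x:

```python
import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + exp(-z)) = exp(z) / (1 + exp(z))
    return 1.0 / (1.0 + np.exp(-z))

scores = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])  # hypothetical values of the score w.x
probs = sigmoid(scores)
print(probs)        # roughly [0.007, 0.269, 0.5, 0.731, 0.993], all in (0, 1)
print(probs > 0.5)  # predicts y = 1 exactly when the score w.x is positive
```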
The Decision Boundary

If w^T x > 0 ⟹ P(yn = 1|xn, w) > P(yn = 0|xn, w)
If w^T x < 0 ⟹ P(yn = 1|xn, w) < P(yn = 0|xn, w)

w^T x = 0 defines the decision boundary.

[Figure: logistic regression decision boundary.]
Loss Function Optimization

I Squared loss:

  ℓ(yn, f(xn)) = (yn − f(xn))² = (yn − σ(w^T xn))²

I This is non-convex in w and not easy to optimize.
I Cross-entropy loss:

  ℓ(yn, f(xn)) = −log(µn)       when yn = 1
               = −log(1 − µn)   when yn = 0

  that is,

               = −log P(yn = 1|xn, w)   when yn = 1
               = −log P(yn = 0|xn, w)   when yn = 0
Cross Entropy loss

  ℓ(yn, f(xn)) = −yn log(µn) − (1 − yn) log(1 − µn)
               = −yn log(σ(w^T xn)) − (1 − yn) log(1 − σ(w^T xn))

I Cross-entropy loss over the entire data:

  L(w) = Σ_{n=1}^N ℓ(yn, f(xn))
       = Σ_{n=1}^N [ −yn log(µn) − (1 − yn) log(1 − µn) ]
       = −Σ_{n=1}^N [ yn w^T xn − log(1 + exp(w^T xn)) ]
Cross Entropy loss

I Adding an L2 regularizer:

  L(w) = −Σ_{n=1}^N [ yn w^T xn − log(1 + exp(w^T xn)) ] + λ‖w‖²
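A minimal sketch of evaluating this regularized loss (the data, weights, and λ below are made up for illustration; log(1 + exp(s)) is computed with np.logaddexp for numerical stability):

```python
import numpy as np

def regularized_cross_entropy(w, X, y, lam):
    """L(w) = -sum_n [ y_n w.x_n - log(1 + exp(w.x_n)) ] + lam * ||w||^2."""
    scores = X @ w
    # np.logaddexp(0, s) = log(1 + exp(s)), computed in a numerically stable way.
    nll = -np.sum(y * scores - np.logaddexp(0.0, scores))
    return nll + lam * np.dot(w, w)

# Made-up data, weights, and regularization strength.
X = np.array([[1.0, 2.0], [-1.0, 0.5], [2.0, -1.0]])
y = np.array([1, 0, 1])
print(regularized_cross_entropy(np.zeros(2), X, y, lam=0.1))  # = 3 * log(2) at w = 0
```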
Logistic Regression: MLE formulation

I AIM: Learn w from the data so that we can predict the probability of xn belonging to class 1 or 0.
I Log likelihood: Given D = {(x1, y1), ..., (xN, yN)},

  log L(w) = log P(D|w)
           = log P(Y|X, w)
           = log ∏_{n=1}^N P(yn|xn, w)
           = log ∏_{n=1}^N µn^{yn} (1 − µn)^{1−yn}

I since each yn is a Bernoulli random variable:

  P(yn = 1|xn, w) = µn
  P(yn = 0|xn, w) = 1 − µn
Logistic Regression: MLE formulation (contd...)

  log L(w) = Σ_{n=1}^N [ yn log µn + (1 − yn) log(1 − µn) ]

We have µn = exp(w^T xn) / (1 + exp(w^T xn)), so

  ⟹ log L(w) = Σ_{n=1}^N [ yn w^T xn − log(1 + exp(w^T xn)) ]

Maximizing this is the same as minimizing the cross-entropy loss.
Logistic Regression: MAP estimate

I MAP → We assume a prior distribution on the weight vector w, that is,

  P(w) = N(w | 0, λ^{-1} I)
       = (1/((2π)^{D/2} (1/λ)^{D/2})) exp( −(λ/2) w^T w )

I Note: The multivariate Gaussian is defined as

  N(x | µ, Σ) = (1/√((2π)^D |Σ|)) exp( −(1/2)(x − µ)^T Σ^{-1} (x − µ) )

I Then the MAP estimate is

  w_MAP = arg max_w log P(w|D)
Logistic Regression: MAP estimate (contd...)

I We have

  w_MAP = arg max_w log P(w|D)
        = arg max_w [ log P(D|w) + log P(w) ]
        = arg max_w [ −(D/2) log 2π − (λ/2) w^T w − Σ_{n=1}^N log(1 + exp(−yn w^T xn)) ]
        = arg min_w [ Σ_{n=1}^N log(1 + exp(−yn w^T xn)) + (λ/2) w^T w ]

  (here the likelihood is written for labels yn ∈ {−1, +1})

which is the same as minimizing the regularized cross-entropy loss.
Logistic Regression: Some Comments

I The objective function of logistic regression is the same as that of SVMs, except for the loss function:

  Logistic regression → log loss
  SVM → hinge loss

I Logistic regression can be extended to the multiclass case: just use the softmax function.

  P(Y = k|w, x) = exp(wk^T x) / Σ_{l=1}^K exp(wl^T x),   k = 1, 2, ..., K classes
Optimization is the Key

I Almost all problems in machine learning lead to optimization problems.
I The following two factors decide the fate of any method:
  I What kind of optimization problem we are led to
  I What optimization methods are available to us
I There are several methods available for optimization; among these, gradient descent methods are the most popular.
Gradient Descent Methods are Used in ...

I Linear Regression
I Logistic Regression
  I It is just classification, but instead of labels it gives us class probabilities.
I Support Vector Machines
I Neural Networks
  I The backbone of neural networks is the back-propagation algorithm.
Example of an objective

I Most often, we do not even have the functional form of the objective.
I Given x, we can only compute f(x).
I Sometimes this may involve simulating a system.
I Computing each f(x) can be time-consuming.
I This becomes even more difficult when x is a D-dimensional vector and D is very large.
Multivariate Functions

[Figure: example surfaces (a) f(x, y) = x² + y², (b) f(x, y) = −x² + y², (c) f(x, y) = cos²(x) + y², (d) f(x, y) = cos²(x) + cos²(y).]
Partial Derivatives

[Figure: (a) the surface f(x, y) = 9 − x² − y²; (b) the plane y = 1; (c) their intersection f(x, 1) = 8 − x² is a curve, and f′(x) = −2x is the derivative (slope) of that curve.]
Idea of Gradient Descent Algorithm

I Start at some random point (of course, the final result will depend on this).
I Take steps based on the gradient vector at the current position until convergence.
I The gradient vector gives the direction and rate of fastest increase at any point.
I At any point x, if the gradient is nonzero, then the direction of the gradient is the direction in which the function increases most quickly from x.
I The magnitude of the gradient is the rate of increase in that direction.
Idea of Gradient Descent Algorithm

(Credits for all the images in this section go to Michailidis and Maiden.)
Gradient Descent

I AIM: Minimize the function

  L(w) = −Σ_{n=1}^N [ yn w^T xn − log(1 + exp(w^T xn)) ]

I We do this by calculating the derivative of L with respect to w.
I Note: Since the log-likelihood is concave in w, L(w) is convex and has a unique minimum.
Gradient Descent

I AIM: Minimize the function

  L(w) = −Σ_{n=1}^N [ yn w^T xn − log(1 + exp(w^T xn)) ]

I Gradient:

  ∂L/∂w = −Σ_{n=1}^N [ yn xn − (exp(w^T xn) / (1 + exp(w^T xn))) xn ]
        = −Σ_{n=1}^N (yn − µn) xn = X^T (µ − y)

  where µ = (µ1, ..., µN)^T, y = (y1, ..., yN)^T, and X is the N × D matrix whose rows are x1^T, ..., xN^T.
Gradient Descent (contd...)

I Since there is no closed-form solution, we take recourse to iterative methods like gradient descent.
I Gradient Descent:
  1 Initialize w^(1) ∈ R^D randomly.
  2 Iterate until convergence:

    w^(t+1) = w^(t) − η Σ_{n=1}^N (µn^(t) − yn) xn

    (the sum is the gradient at the previous value)

I µn^(t) = σ(w^(t)T xn)
I η is the learning rate.
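A minimal sketch of this batch gradient descent loop for logistic regression (toy data and hyperparameters are assumed for illustration; no bias term or regularizer):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_gd(X, y, eta=0.1, n_iters=1000):
    """Batch gradient descent for logistic regression (no bias, no regularizer)."""
    N, D = X.shape
    w = np.zeros(D)                # step 1: initialize (random init also works)
    for _ in range(n_iters):       # step 2: iterate until (approximate) convergence
        mu = sigmoid(X @ w)        # mu_n = sigma(w . x_n)
        grad = X.T @ (mu - y)      # sum_n (mu_n - y_n) x_n
        w = w - eta * grad
    return w

# Tiny separable toy data (assumed for illustration).
X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, 0, 0])
w = logistic_gd(X, y)
print(sigmoid(X @ w))  # near 1 for the first two points, near 0 for the last two
```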
Gradient Descent (contd...)

I We have the following update:

  w^(t+1) = w^(t) − η Σ_{n=1}^N (µn^(t) − yn) xn

  (the sum is the gradient at the previous value)

I Note: Calculating the gradient in each iteration requires all the data. When N is large this may not be feasible.
I Stochastic Gradient Descent: Use mini-batches to compute the gradient.
Gradient Descent: Some Remarks

Note on the learning rate:

I Sometimes choosing the learning rate is difficult.
I Larger learning rate → too much fluctuation.
I Smaller learning rate → slow convergence.

To deal with this problem:

I Choose an optimal step size ηt at each iteration using line search.
I Add momentum to the update:

  w^(t+1) = w^(t) − η^(t) g^(t) + αt (w^(t) − w^(t−1))

I Use second-order methods like Newton's method to exploit the curvature of the loss function (but then we need to compute the Hessian matrix).
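A small sketch of the momentum update on an assumed quadratic objective (the objective and the η, α values are illustrative only):

```python
import numpy as np

def gd_momentum(grad_fn, w0, eta=0.1, alpha=0.9, n_iters=500):
    """Gradient descent with momentum:
    w_{t+1} = w_t - eta * g_t + alpha * (w_t - w_{t-1})."""
    w_prev = w0.copy()
    w = w0.copy()
    for _ in range(n_iters):
        g = grad_fn(w)
        w_next = w - eta * g + alpha * (w - w_prev)
        w_prev, w = w, w_next
    return w

# Assumed objective: a simple quadratic f(w) = 0.5 * w.(A w), with gradient A w.
A = np.diag([1.0, 10.0])
w_final = gd_momentum(lambda w: A @ w, w0=np.array([5.0, 5.0]))
print(w_final)  # close to the minimizer [0, 0]
```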
Multiclass Logistic or Softmax Regression

I Logistic regression can be extended to the multiclass case.
I Let yn ∈ {0, 1, ..., K − 1}.
I Define

  P(yn = k|xn, W) = exp(wk^T xn) / Σ_{l=1}^K exp(wl^T xn) = µnk

I µnk is the probability that the nth sample belongs to the kth class, and Σ_{l=1}^K µnl = 1.
I Softmax: the class k with the largest wk^T xn dominates the probability.
Multiclass Logistic or Softmax Regression

I P(yn = k|xn, W) = exp(wk^T xn) / Σ_{l=1}^K exp(wl^T xn)
I W = [w1 w2 ... wK]_{D×K}
I We can think of the yn as drawn from a multinoulli (categorical) distribution:

  P(y|X, W) = ∏_{n=1}^N ∏_{l=1}^K µnl^{ynl}   (likelihood function)

I where ynl = 1 if the true class of example n is l, and ynl = 0 for all other l.
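A short sketch of the softmax computation (shapes and data are assumed for illustration; the max-subtraction is a standard numerical-stability trick, not something the slides discuss):

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax; subtracting the row max is a standard stability trick."""
    z = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Assumed shapes: X is N x D, and W stacks one weight vector per class (D x K).
rng = np.random.default_rng(0)
N, D, K = 4, 3, 5
X = rng.normal(size=(N, D))
W = rng.normal(size=(D, K))

P = softmax(X @ W)        # P[n, k] = exp(w_k . x_n) / sum_l exp(w_l . x_n)
print(P.sum(axis=1))      # each row sums to 1
print(P.argmax(axis=1))   # predicted class: the k with the largest w_k . x_n
```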
Hyperplane Based Classifiers and Perceptron

Linear as Optimization

Supervised Learning Problem

I Given data {(xn, yn)}_{n=1}^N, find f : X → Y that best approximates the relation between X and Y.
I Determine f in such a way that the loss ℓ(y, f(x)) is minimum.
I f and ℓ are specific to the problem and the method that we choose.
Linear Regression

I Data: {(xn, yn)}_{n=1}^N
  I xn ∈ R^D is a D-dimensional input
  I yn ∈ R is the output
I The aim is to find a hyperplane that best fits these points.
I Here a hyperplane is the model of choice, i.e.,

  f(x) = Σ_{j=1}^D xj wj + b = w^T x + b

I Here w1, ..., wD and b are the model parameters.
I "Best" is determined by some loss function:

  Loss(w) = Σ_{n=1}^N [yn − f(xn)]²

I Aim: Determine the model parameters that minimize the loss.
Logistic Regression

Problem Set-Up

I Two-class classification.
I Instead of the exact labels, estimate the probabilities of the labels, i.e.

  predict P(yn = 1|xn, w) = µn
          P(yn = 0|xn, w) = 1 − µn

I Here (xn, yn) is an input-output pair.
Logistic Regression (contd...)

Problem
Find a function f such that

  µn = f(xn)

Model

  µn = f(xn) = σ(w^T xn) = 1/(1 + exp(−w^T xn)) = exp(w^T xn)/(1 + exp(w^T xn))
Logistic Regression (contd...)

Sigmoid Function

I Here σ(·) is the sigmoid function.
I The model first computes a real-valued score w^T x = Σ_{i=1}^D wi xi and then nonlinearly "squashes" it into (0, 1) to turn it into a probability.
Logistic Regression (contd...)

Loss Function: Here we use the cross-entropy loss instead of the squared loss.
The cross-entropy loss is defined as:

  L(yn, f(xn)) = −log(µn)      when yn = 1
               = −log(1 − µn)  when yn = 0
               = −yn log(µn) − (1 − yn) log(1 − µn)
               = −yn log(σ(w^T xn)) − (1 − yn) log(1 − σ(w^T xn))

And now the empirical risk is

  L(w) = −Σ_{n=1}^N [ yn w^T xn − log(1 + exp(w^T xn)) ]
Logistic Regression (contd...)

Taking the derivative with respect to w,

  ∂L/∂w = Σ_{n=1}^N (µn − yn) xn

I The gradient descent algorithm is:
  1 Initialize w^(1) ∈ R^D randomly.
  2 Iterate until convergence:

    w^(t+1) = w^(t) − η Σ_{n=1}^N (µn^(t) − yn) xn

    (new parameters = previous value − learning rate × gradient at the previous value)

I Note: Here µn^(t) = σ(w^(t)T xn).
Logistic Regression (contd...)

Let us take a look at the update equation again:

  w^(t+1) = w^(t) − η Σ_{n=1}^N (µn^(t) − yn) xn

  (new parameters = previous value − learning rate × gradient at the previous value)

What do we notice here?

Problem: Calculating the gradient in each iteration requires all the data. When N is large this may not be feasible.
Stochastic Gradient Descent

I Strategy: Approximate the gradient using a single randomly chosen data point (xn, yn):

  w^(t+1) = w^(t) − ηt (µn^(t) − yn) xn

I Also: Replace the predicted label probability µn^(t) by the predicted binary label ŷn^(t), where

  ŷn^(t) = 1 if µn^(t) > 0.5 (i.e. w^(t)T xn > 0)
         = 0 if µn^(t) < 0.5 (i.e. w^(t)T xn < 0)
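A minimal sketch of this single-sample update for logistic regression (toy data and learning rate are assumed for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_sgd(X, y, eta=0.1, n_epochs=50, seed=0):
    """Stochastic gradient descent: one randomly chosen (x_n, y_n) per update."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    w = np.zeros(D)
    for _ in range(n_epochs):
        for n in rng.permutation(N):
            mu_n = sigmoid(w @ X[n])              # predicted probability for sample n
            w = w - eta * (mu_n - y[n]) * X[n]    # single-sample gradient step
    return w

# Same toy data as the batch sketch (assumed for illustration).
X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, 0, 0])
print(sigmoid(X @ logistic_sgd(X, y)))
```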
Stochastic Gradient Descent (contd...)

I Hence: The update rule becomes

  w^(t+1) = w^(t) − ηt (ŷn^(t) − yn) xn

I This is a mistake-driven update rule.
I w^(t) gets updated only when there is a misclassification, i.e. ŷn^(t) ≠ yn.
Stochastic Gradient Descent (contd...)

We will make one more simple change:

I Change the class labels to {−1, +1}.

  ⟹ ŷn^(t) − yn = −2yn if ŷn^(t) ≠ yn
                 = 0    if ŷn^(t) = yn

I Hence: Whenever there is a misclassification,

  w^(t+1) = w^(t) + 2η^(t) yn xn

I ⟹ This is the perceptron learning algorithm, which is a hyperplane-based learning algorithm.
Hyperplanes

I A hyperplane separates a d-dimensional space into two half-spaces (positive and negative).
I The equation of the hyperplane is

  w^T x = 0

I By adding a bias b ∈ R:

  w^T x + b = 0

  b > 0 moves the hyperplane parallel to itself along w; b < 0 moves it in the opposite direction.
Hyperplane based classification

I Classification rule:

  y = sign(w^T x + b)

  w^T x + b > 0 ⟹ y = +1
  w^T x + b < 0 ⟹ y = −1
The Perceptron Algorithm (Rosenblatt, 1958)

I The aim is to learn a linear hyperplane that separates the two classes.
I It is a mistake-driven online learning algorithm.
I It is guaranteed to find a separating hyperplane if the data is linearly separable.
Perceptron Algorithm

I Given training data D = {(x1, y1), ..., (xN, yN)}
I Initialize w_old = [0, ..., 0], b_old = 0
I Repeat until convergence:
  I Pick a random (xn, yn) ∈ D
  I If yn (w^T xn + b) ≤ 0
    [or sign(w^T xn + b) ≠ yn, i.e. a mistake is made]
    I w_new = w_old + yn xn
    I b_new = b_old + yn
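A minimal sketch of this perceptron loop (toy linearly separable data assumed; labels are in {−1, +1} as in the previous slides):

```python
import numpy as np

def perceptron(X, y, max_epochs=100, seed=0):
    """Perceptron for labels y in {-1, +1}; stops after a mistake-free epoch."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    w, b = np.zeros(D), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for n in rng.permutation(N):
            if y[n] * (w @ X[n] + b) <= 0:   # mistake (or exactly on the boundary)
                w = w + y[n] * X[n]
                b = b + y[n]
                mistakes += 1
        if mistakes == 0:                    # converged on linearly separable data
            break
    return w, b

# Linearly separable toy data (assumed for illustration).
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron(X, y)
print(np.sign(X @ w + b))  # matches y
```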
Perceptron Convergence Theorem (Block and Novikoff)

"Roughly": If the data is linearly separable, the perceptron algorithm converges.
What if the data is not linearly separable?

Yes! In practice, most often the data is not linearly separable. Then:

I Make the data linearly separable using kernel methods.
I (Or) Use a multilayer perceptron.

What are these?

I The first leads to Support Vector Machines, which ruled machine learning for decades.
I The second leads to Deep Learning!
What did we learn?

I Maximum Likelihood Estimates

I Bayes again! MAP

I Probabilistic view of Linear and Logistic Regression

I Hyperplanes and Perceptrons

I The two very big paradigms in ML
