Lecture 2
Volodymyr Kuleshov
Cornell Tech
Then
p(y, x1, . . . , xn) = p(y) ∏_{i=1}^{n} p(xi | y)
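The factorization above can be sketched numerically. This is a minimal illustration with hypothetical probability tables (binary y and three binary features); the numbers are assumptions, not values from the lecture:

```python
import numpy as np

# Hypothetical Naive Bayes parameters for binary Y and n = 3 binary features.
p_y = np.array([0.6, 0.4])            # p(Y = 0), p(Y = 1)
# p_x_given_y[i, y] = p(X_i = 1 | Y = y)
p_x_given_y = np.array([[0.2, 0.7],
                        [0.5, 0.9],
                        [0.1, 0.4]])

def joint(y, x):
    """p(y, x1, ..., xn) = p(y) * prod_i p(xi | y) under the factorization."""
    probs = np.where(x == 1, p_x_given_y[:, y], 1.0 - p_x_given_y[:, y])
    return p_y[y] * probs.prod()
```

Summing `joint` over all 2 × 2³ configurations of (y, x) gives 1, confirming it is a valid joint distribution.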
p(Y = 1 | x; α) = f(x, α)
2 Linear dependence: let z(α, x) = α0 + ∑_{i=1}^{n} αi xi.
p(Y = 1 | x; α) = σ(z(α, x)), where σ(z) = 1/(1 + e^{−z}) is the
logistic function
Dependence might be too simple
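The linear parameterization above is ordinary logistic regression. A minimal sketch, with illustrative (assumed) parameter values α0 and αi:

```python
import numpy as np

def sigmoid(z):
    """Logistic function: sigma(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters for n = 2 inputs.
alpha0 = -1.0
alpha = np.array([2.0, -0.5])

def p_y1_given_x(x):
    """p(Y = 1 | x; alpha) = sigma(alpha0 + sum_i alpha_i x_i)."""
    return sigmoid(alpha0 + alpha @ x)
```

Note σ(0) = 1/2, so inputs with z(α, x) = 0 sit exactly on the decision boundary.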
3 Non-linear dependence: let h(A, b, x) = g (Ax + b) be a non-linear
transformation of the inputs (features).
pNeural(Y = 1 | x; α, A, b) = σ(α0 + ∑_{i=1}^{h} αi hi)
More flexible
More parameters: A, b, α
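The non-linear parameterization is a one-hidden-layer neural network. A minimal sketch, where the shapes, the random parameter values, and the choice g = tanh are all assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical dimensions and randomly initialized parameters A, b, alpha.
rng = np.random.default_rng(0)
n, h = 3, 4                 # input dimension, number of hidden features
A = rng.normal(size=(h, n))
b = rng.normal(size=h)
alpha0 = 0.0
alpha = rng.normal(size=h)

def p_neural(x):
    """p_Neural(Y = 1 | x; alpha, A, b) = sigma(alpha0 + sum_i alpha_i h_i)."""
    hidden = np.tanh(A @ x + b)            # h(A, b, x) = g(Ax + b)
    return sigmoid(alpha0 + alpha @ hidden)
```

The logistic output layer is unchanged; flexibility comes from replacing the raw inputs x with learned features h(A, b, x).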
Fully General
Bayes Net
Chain rule, Bayes rule, etc. all still apply. For example,
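As an illustration of Bayes rule in this setting, here is a small numeric sketch (all probability values are hypothetical): computing p(Y = 1 | x) from the class prior and the class-conditional likelihoods by marginalizing over y.

```python
# Hypothetical quantities for a binary class Y and a fixed observation x.
p_y1 = 0.4           # prior p(Y = 1)
p_x_given_y1 = 0.7   # likelihood p(x | Y = 1)
p_x_given_y0 = 0.2   # likelihood p(x | Y = 0)

# Chain rule + marginalization: p(x) = sum_y p(x | y) p(y)
p_x = p_x_given_y1 * p_y1 + p_x_given_y0 * (1 - p_y1)

# Bayes rule: p(Y = 1 | x) = p(x | Y = 1) p(Y = 1) / p(x)
posterior = p_x_given_y1 * p_y1 / p_x
```

With these numbers the posterior is 0.28 / 0.40 = 0.7.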