ANN Theory

1. Deep learning uses neural networks with multiple hidden layers to analyze large amounts of data.
2. Weights and biases are adjusted during training using backpropagation to reduce loss and update the network to better predict target outputs.
3. The sigmoid activation function works well for binary classification but can cause the vanishing gradient problem in deep networks, where gradients become very small and weights are not adequately updated during training.


Deep Learning

Why deep learning?
1) We have a huge amount of data.
2) Advancements in compute hardware (GPUs).

Perceptron (single-layered neural network)
[Diagram: I/P layer (study hr, play hr, sleep hr) -> hidden layer -> O/P layer neuron -> output; loss function (y - ŷ) computed on the output. We can have any number of neurons in the hidden layers.]

Dataset (binary classification): features are study hr, play hr, sleep hr; target is pass/fail.

Forward propagation (through hidden layer HL1): each neuron computes z = Σ xᵢwᵢ + b (weights + bias), then applies an activation Act(z). Back propagation then flows the error backwards to update the weights.

Weights are the real values that are attached to each i/p feature, and they convey the importance of that corresponding feature in predicting the final o/p.
Weights convey the following things:
1) Importance of a feature in predicting the o/p value: weight ↑ => importance of that feature value ↑.
2) Relationship b/w a feature and the target/output.
Ex: car price (↑), car popularity (↑) => buy car or not.
w₁(car price) + w₂(car popularity) + b = o/p
If price ↑ and w₁ is large and +ve => we buy the car; but this we don't want, so we'll assign w₁ a lower or -ve value.
3) Weights play an important role in changing the orientation of the hyperplane that separates the data classes.

Bias is used to shift the activation function to the left or right. Bias initialisation is done automatically by the neural network.

Sigmoid activation function (for binary classification):
sigmoid(z) = 1 / (1 + e^(-z)), where z = Σ xᵢwᵢ + b
O/p is 0 or 1: ŷ >= 0.5 => 1, ŷ < 0.5 => 0.

1) Forward propagation:
input -> assign weights and bias -> z = Σ(xᵢ × wᵢ) + b -> activation(z) -> output ŷ, compared against the actual value y.
Loss (y - ŷ): we want to reduce the loss close to zero. For this we need to update the weights. This is done using back propagation, using optimizers.
2) Back propagation: update the wts using optimizers to reduce the loss (y - ŷ).
Steps (a minimal code sketch of this loop follows):
1) I/P layer.
2) Weights & bias get added in forward propagation.
3) Bias in each hidden layer.
4) Activation function.
5) Loss function f(y - ŷ).
6) Optimizer -> backward propagation.
7) Update the wts.
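A minimal sketch of steps 1-5 for a single neuron, assuming made-up input values, weights, and bias (none of these numbers come from the notes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([3.0, 5.0, 7.0])    # study hr, play hr, sleep hr (example values)
w = np.array([0.4, -0.2, 0.1])   # illustrative initial weights
b = 0.5                          # bias

z = np.dot(x, w) + b             # z = sum(x_i * w_i) + b
y_hat = sigmoid(z)               # activation -> predicted output
y = 1.0                          # actual label (pass)
loss = (y - y_hat) ** 2          # squared-error loss for one record
print(z, y_hat, loss)
```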

[Diagram: inputs x₁, x₂ with weights and a bias feed neuron O₁, whose output feeds O₂, from which the loss (y - ŷ) is computed.]
z = Σ wᵢxᵢ + b, a = sigmoid(z) -> fwd propagation.
We can solve non-linear problems using this. We need to update the wts using back propagation.

1) Weight updation formula (chain rule differentiation):

[Diagram: fwd propagation carries x₁, x₂, x₃ through weights w₁...w₄ to the loss (y - ŷ); bwd propagation flows the gradient back.]

w_new = w_old - η × (∂Loss/∂w_old), where η = learning rate and ∂Loss/∂w_old = loss gradient (slope).

[Plot: loss vs weight; bowl-shaped curve with the global minima at W.]

Case 1: slope = -ve. Say we are at point A and want to reach W (the global minima):
w_new = w_old - η(-ve) = w_old + η(const) => w increases.
Updating this multiple times, we will reach the global minima.

Case 2: slope = +ve:
w_new = w_old - η(+ve) = w_old - η(const) => updating this multiple times, we will reach the global minima.

Importance of the learning rate (η):
1) It should be a small number so that we slowly reach the global minima (a toy loop is sketched below).
2) For a larger learning rate we may not reach the global minima.
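A toy sketch of this update rule on an assumed quadratic loss L(w) = (w - 2)², whose global minima is at w = 2 (the loss and starting point are made up for illustration):

```python
eta = 0.1                # learning rate
w = -4.0                 # starting point "A" (negative-slope side)
for step in range(50):
    grad = 2 * (w - 2)   # dL/dw for L(w) = (w - 2)^2
    w = w - eta * grad   # w_new = w_old - eta * dL/dw_old
print(round(w, 4))       # ~2.0: we reach the global minima
```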
2) Bias updation (chain rule differentiation formula):
b_new = b_old - η × (∂Loss/∂b_old)

Chain rule: for a network x -> O₁ -> O₂ -> loss (y - ŷ):
w₁_new = w₁_old - η × (∂L/∂w₁_old)
∂L/∂w₁_old = (∂L/∂O₂) × (∂O₂/∂O₁) × (∂O₁/∂w₁_old)
So the chain rule writes the loss gradient as a product of the intermediate derivatives.
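A numeric sketch of that product on an assumed two-neuron sigmoid chain (the input, weight, and label values are made up):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, w1, y = 1.5, 0.8, 1.0
o1 = sigmoid(w1 * x)               # first neuron output
o2 = sigmoid(o1)                   # second neuron output
# L = (y - O2)^2, so dL/dw1 = dL/dO2 * dO2/dO1 * dO1/dw1
dL_dO2  = -2 * (y - o2)
dO2_dO1 = o2 * (1 - o2)            # sigmoid derivative
dO1_dw1 = o1 * (1 - o1) * x
print(dL_dO2 * dO2_dO1 * dO1_dw1)  # the chained gradient
```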

Vanishing gradient problem

Consider a very deep neural network with sigmoid activation (weights w₁, w₂, w₃..., biases b₁, b₂, b₃..., layer outputs O₁₁, O₂₁, O₃₁, loss (y - ŷ)).

w₁_new = w₁_old - η × (∂L/∂w₁_old)
∂L/∂w₁_old = (∂L/∂O₃₁) × (∂O₃₁/∂O₂₁) × (∂O₂₁/∂O₁₁) × (∂O₁₁/∂w₁_old)

Sigmoid activation: σ(z) = 1 / (1 + e^(-z)).
[Plot: sigmoid curve rising from 0 to 1, and its derivative peaking at 0.25.]
The derivative of sigmoid, σ'(z) = σ(z)(1 - σ(z)), lies between 0 and 0.25.

Conclusion: chaining these derivatives, e.g. 0.2 × 0.15 × 0.04 ≈ 0.001, the derivative value decreases to a very small value.
w_new = w_old - η × (small number)
Since ∂L/∂w is small, w_new ≈ w_old => the wts will not get updated, or get updated very very slowly. This is called the vanishing gradient problem.
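A small sketch of this shrinking product, with assumed pre-activation values at four sigmoid layers:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_deriv(z):
    s = sigmoid(z)
    return s * (1 - s)                  # always <= 0.25

zs = np.array([0.5, 1.0, 2.0, 3.0])     # pre-activations at 4 layers (made up)
factors = sigmoid_deriv(zs)
print(factors)                          # each factor lies between 0 and 0.25
print(np.prod(factors))                 # ~0.0002: the gradient vanishes
```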

Solution => change the activation function.

1) Sigmoid activation function
2) Tanh
3) ReLU
4) Leaky ReLU
5) PReLU

1) Sigmoid: σ(x) = 1 / (1 + e^(-x)); σ'(x) = σ(x)(1 - σ(x)) has range 0 to 0.25 => gradient vanishing problem. O/p is not zero-centered => wt updation issue.
Adv:
1) Smooth gradient, prevents jumps in o/p values.
2) Since o/p is b/w 0 and 1, it normalises the o/p of each neuron.
3) Clear predictions, very close to 0 or 1.
Disadv:
1) Prone to gradient vanishing in deep neural networks.
2) Function o/p is not zero-centered, which makes wt updation harder.
3) Sigmoid has an exponential operation, which is slower for computers to calculate.
2) Tanh function (hyperbolic tangent function):
tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)) => o/p ranges from -1 to 1.
d(tanh(x))/dx = 1 - tanh²(x) => derivative ranges from 0 to 1.
Adv: 1) Zero-centric output.
Disadv: 1) Can still create the vanishing gradient problem.
For binary classification: tanh in the hidden layers, sigmoid in the o/p layer.
3) ReLU function (Rectified Linear Unit):
ReLU(x) = max(0, x)
d(ReLU(x))/dx = 1 for x > 0, 0 for x < 0; not differentiable at zero; not zero-centric.
Adv:
1) When ReLU is active (x > 0) the derivative is 1, so no vanishing gradient.
2) Fast calculation.
3) When the input is +ve, no saturation problem.
Disadv:
1) Not a zero-centric function, since the o/p is either 0 or +ve.
2) When the input is -ve, ReLU is completely inactive (dead neuron). This is a problem.
3) The ReLU derivative is 0 for inputs <= 0 => gradient = zero => same problem as sigmoid and tanh.
4) Leaky ReLU function: f(x) = max(0.01x, x)
d(f(x))/dx = 1 for x > 0, 0.01 for x < 0.
Adv: 1) Solves the dead neuron problem in ReLU.

5) Parametric ReLU function: f(x) = max(αx, x)
α = 0 => f(x) is ReLU; α = 0.01 => f(x) is Leaky ReLU; if α is also learnable => PReLU.
6) ELU (Exponential Linear Units) function:
f(x) = x if x > 0, α(e^x - 1) otherwise
d(f(x))/dx = 1 if x > 0, α·e^x otherwise.
It solves the problems of ReLU.
Adv: 1) No dead ReLU issue. 2) Zero-centred. 3) Better than ReLU (improved).
Disadv: 1) Computation-heavy due to the exponential calculation.
8) Swish (self-gated) function: f(x) = x × sigmoid(x). Used for gating in LSTMs.

9) Maxout: f(x) = max(w₁ᵀx + b₁, w₂ᵀx + b₂)
A generalised version of ReLU and Leaky ReLU: for ReLU, w₁, b₁ = 0; for Leaky ReLU, w₁, b₁ = 0 and w₂ = 0.01.
It is a learnable function.

10) Softplus: f(x) = ln(1 + e^x)
Differentiable throughout; similar to ReLU, but relatively smooth. O/p range: 0 to ∞.
(NumPy sketches of these activations follow.)
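NumPy sketches of the activations above (the α defaults are illustrative choices, not fixed by the notes):

```python
import numpy as np

def sigmoid(x):    return 1.0 / (1.0 + np.exp(-x))
def tanh(x):       return np.tanh(x)
def relu(x):       return np.maximum(0.0, x)
def leaky_relu(x, alpha=0.01): return np.where(x > 0, x, alpha * x)
def elu(x, alpha=1.0):         return np.where(x > 0, x, alpha * (np.exp(x) - 1))
def swish(x):      return x * sigmoid(x)
def softplus(x):   return np.log1p(np.exp(x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (sigmoid, tanh, relu, leaky_relu, elu, swish, softplus):
    print(f.__name__, f(x))
```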
Technique: which activation function should we use?
1) Binary classification:
   - O/p layer: use sigmoid.
   - Hidden layers: always use ReLU; if convergence is not happening, use PReLU, Leaky ReLU, ELU.
2) Multi-class classification:
   - O/p layer: use the softmax activation fn.
   - Hidden layers: use ReLU; if convergence is not happening, PReLU, ELU.
3) Regression:
   - O/p layer: use a linear activation fn (there is also a separate loss function).
   - Hidden layers: any variation of ReLU.
(A Keras sketch of these choices follows.)
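A hedged Keras sketch of these choices; the layer widths, input size, and class count are made-up examples, not part of the notes:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Binary classification: ReLU hidden, sigmoid o/p
binary_model = keras.Sequential([
    keras.Input(shape=(3,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# Multi-class classification: ReLU hidden, softmax o/p (4 classes assumed)
multiclass_model = keras.Sequential([
    keras.Input(shape=(3,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(4, activation="softmax"),
])

# Regression: ReLU hidden, linear o/p
regression_model = keras.Sequential([
    keras.Input(shape=(3,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="linear"),
])
```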
Loss functions (DL / ANN)

Regression:
1) MSE
2) MAE
3) Huber loss

Classification:
1) Binary cross entropy (BCE)
2) Categorical CE (multi-class)

* Loss vs cost function:
Loss is for 1 record in fwd prop: loss = (y - ŷ)² => a single value.
The cost function is for a batch of inputs: cost fn = (1/n) Σ (y - ŷ)².
Regression
1) Mean Squared Error: Loss = (y - ŷ)²; a quadratic eq. => gradient descent works well.
Adv:
1) It is differentiable.
2) It has only one local/global minima.
3) It converges faster.
Disadv:
1) Not robust to outliers, since we are squaring (and so heavily penalising) the error.
2) Mean Absolute Error: LF = |y - ŷ|; CF = (1/n) Σ |y - ŷ|.
Adv: 1) Robust to outliers.
Disadv:
1) Not differentiable at 0 (needs a sub-gradient).
2) Time-consuming due to the sub-gradient.
3) Huber loss: MSE + MAE, with hyperparameter δ:
L = ½(y - ŷ)² if |y - ŷ| <= δ; δ|y - ŷ| - ½δ² otherwise.
(A sketch of the three cost functions follows.)
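A sketch of the three cost functions over a small made-up batch; the outlier in the last point shows how differently they penalise it:

```python
import numpy as np

def mse(y, y_hat):  return np.mean((y - y_hat) ** 2)
def mae(y, y_hat):  return np.mean(np.abs(y - y_hat))
def huber(y, y_hat, delta=1.0):
    err = y - y_hat
    small = np.abs(err) <= delta
    return np.mean(np.where(small, 0.5 * err ** 2,
                            delta * np.abs(err) - 0.5 * delta ** 2))

y     = np.array([3.0, 5.0, 200.0])   # last point is an outlier
y_hat = np.array([2.5, 5.5, 10.0])
print(mse(y, y_hat), mae(y, y_hat), huber(y, y_hat))  # MSE blows up on the outlier
```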
Classification
1) Binary cross entropy / log loss:
loss = -y × log(ŷ) - (1 - y) × log(1 - ŷ)
i.e. loss = -log(1 - ŷ) if y = 0; loss = -log(ŷ) if y = 1.
We use sigmoid in the last layer for this calculation.
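A one-record sketch of the formula, with a made-up label and sigmoid output:

```python
import numpy as np

y, y_hat = 1.0, 0.9   # true label and sigmoid output (made up)
bce = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
print(bce)            # = -log(0.9), since y = 1
```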
2) Categorical cross entropy (CCE):
The target is one-hot encoded, e.g. categories good/bad/neutral become the columns of a 0/1 matrix, with one row per record and a single 1 marking each record's true category.
L(y, ŷ) = -Σⱼ yᵢⱼ × log(ŷᵢⱼ), where i is the row (record), j is the category column, and yᵢⱼ = 1 for the true category, 0 otherwise.
We use the softmax activation function in the o/p layer:
softmax(zᵢ) = e^(zᵢ) / Σⱼ e^(zⱼ) => a probability for each class.
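A sketch of softmax plus CCE for one record; the o/p-layer logits are made up:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract max for numerical stability
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])       # o/p-layer logits for 3 classes
y = np.array([1.0, 0.0, 0.0])       # one-hot true label
y_hat = softmax(z)                  # class probabilities, sum to 1
cce = -np.sum(y * np.log(y_hat))    # L = -sum_j y_j * log(y_hat_j)
print(y_hat, cce)
```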
Conclusion:
Problem      | Activation (hidden, o/p) | Loss
Multi-class  | ReLU, softmax            | CCE
Binary       | ReLU, sigmoid            | BCE
Regression   | ReLU, linear             | MSE, MAE, Huber loss
* Optimizers
1) Gradient descent
2) SGD (stochastic GD)
3) Mini-batch SGD
4) SGD with momentum
5) Adagrad
6) RMSProp
7) Adam optimizer
1) Gradient Descent
Wt updation formula: w_new = w_old - η × (∂L/∂w_old), where η = learning rate.
[Plot: loss/cost fn vs weight; bowl-shaped curve with the global minima at the bottom.]
We use the optimizer in back propagation to update the wts so that we can reduce the loss/error. In GD the full dataset is used for every wt update (once per epoch).
Disadv:
1) Resource-expensive technique (more RAM required).

Epoch: training the neural network with the full dataset for one cycle (fwd and bwd propagation).
Ex: a dataset of 100,000 points passed through once = 1 epoch.

2) Stochastic Gradient Descent (SGD)
1 epoch, 1 million records in the dataset:
1st record -> loss -> iteration 1 -> update wts
2nd record -> loss -> iteration 2 -> update wts
...
To complete 1 epoch => 1 million iterations.
Say we need 10 epochs to train the model => 10 epochs × 1 million iterations = 10 million total iterations.
Disadv:
1) Convergence is very slow.
2) Time complexity is high.
Adv:
1) Low memory required.
3) Mini-Batch SGD
Batch size = 1000 records; 1 million records => iterations per epoch = 1 million / 1000 = 1000 (1st 1000 records -> iteration 1, ..., last 1000 records -> iteration 1000). A counting sketch follows this list.
Adv:
1) Less resource-intensive.
2) Better/improved convergence.
3) Time complexity improved over SGD.
[Plot: paths to the global minima. Gradient descent is smooth; SGD is very noisy; mini-batch SGD is noisy, but less so.]
Noise: SGD > mini-batch SGD > GD.
To remove the noise => use momentum.
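The counting sketch, using the batch numbers from above:

```python
n_records, batch_size = 1_000_000, 1000

print("GD:        ", 1)                        # full dataset -> 1 update per epoch
print("SGD:       ", n_records)                # 1 record per iteration
print("mini-batch:", n_records // batch_size)  # 1000 iterations per epoch
```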
4) SGD with momentum
In mini-batch SGD, momentum is used to eliminate the noise using moving averages, i.e. exponentially weighted averages (this is also used in time series).
w_new = w_old - η × V_dw (and b_new = b_old - η × V_db).

Exponentially weighted average: used for smoothening a noisy time series.
For values a₁, a₂, ..., aₙ at times t₁, t₂, ..., tₙ:
Vt₁ = a₁
Vt₂ = β × Vt₁ + (1 - β) × a₂
...
Vtₙ = β × Vtₙ₋₁ + (1 - β) × aₙ, where β is a hyperparameter in (0, 1).
This is applied in DL as:
V_dw(t) = β × V_dw(t-1) + (1 - β) × ∂L/∂w(t)
w(t) = w(t-1) - η × V_dw(t)
Adv (a toy momentum loop follows this list):
1) Smoothening / reducing the noise in the loss/cost fn.
2) Quicker convergence.
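A toy momentum loop on the same assumed quadratic loss L(w) = (w - 2)² used earlier (the η and β values are illustrative):

```python
eta, beta = 0.1, 0.9
w, v_dw = -4.0, 0.0
for step in range(200):
    grad = 2 * (w - 2)                      # dL/dw (a noisy mini-batch grad in practice)
    v_dw = beta * v_dw + (1 - beta) * grad  # exponentially weighted average of grads
    w = w - eta * v_dw                      # w_t = w_{t-1} - eta * V_dw
print(round(w, 4))                          # close to 2.0, with oscillations damped
```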

5) Adagrad => Adaptive Gradient descent
w(t) = w(t-1) - η′ × (∂L/∂w(t-1))
η was fixed in the above optimizers; replace it with
η′ = η / sqrt(αₜ + ε), where αₜ = Σ (∂L/∂w)² over all past iterations, and ε is a small no. to avoid division by zero.
Requirement: a larger learning rate initially will make for quicker convergence, but as we approach the global minima the learning rate should reduce with every epoch so that we do not miss the global minima.
αₜ increases as t increases => η′ decreases => we reach near to the global minima. We have made the learning rate adaptive.
Disadv:
For a deep neural network αₜ will become very large, so η′ will become almost negligible => the model will converge very slowly, or stop converging, to the global minima.
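A toy Adagrad loop on the same assumed quadratic loss (η and ε are illustrative):

```python
import numpy as np

eta, eps = 0.5, 1e-8
w, alpha = -4.0, 0.0
for step in range(200):
    grad = 2 * (w - 2)       # dL/dw
    alpha += grad ** 2       # alpha_t = running sum of squared gradients
    w = w - (eta / np.sqrt(alpha + eps)) * grad   # eta' = eta / sqrt(alpha_t + eps)
print(round(w, 2))           # approaches 2; the effective learning rate keeps shrinking
```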
6) Adadelta and RMSProp
Instead of αₜ, use S_dw (initialised S_dw = 0), an exponentially weighted average:
S_dw(t) = β × S_dw(t-1) + (1 - β) × (∂L/∂w)²
η′ = η / sqrt(S_dw(t) + ε)
The learning rate now decreases slowly, by implementing S_dw, i.e. an exponentially weighted average. Here the smoothening (i.e. V_dw) is missing.
7) Adam optimizer => momentum + RMSProp intuition.
V_dw => smoothening; S_dw => adaptive learning rate.
Initialise V_dw, V_db, S_dw, S_db = 0.
V_dw(t) = β₁ × V_dw(t-1) + (1 - β₁) × ∂L/∂w
S_dw(t) = β₂ × S_dw(t-1) + (1 - β₂) × (∂L/∂w)²
w(t) = w(t-1) - η × V_dw(t) / (sqrt(S_dw(t)) + ε)
Adv (a toy Adam loop follows this list):
1) Smoothening of the gradient descent / noise.
2) The learning rate is adaptive.
3) Quick convergence.
4) Less resource-intensive.
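A toy Adam loop combining the two updates above, following the notes' (un-bias-corrected) formulas; the β₁, β₂, η, ε values are common illustrative defaults:

```python
import numpy as np

eta, beta1, beta2, eps = 0.05, 0.9, 0.999, 1e-8
w, v_dw, s_dw = -4.0, 0.0, 0.0
for step in range(500):
    grad = 2 * (w - 2)                              # dL/dw
    v_dw = beta1 * v_dw + (1 - beta1) * grad        # momentum term (smoothening)
    s_dw = beta2 * s_dw + (1 - beta2) * grad ** 2   # adaptive learning-rate term
    w = w - eta * v_dw / (np.sqrt(s_dw) + eps)
print(round(w, 2))                                  # close to the global minima at 2
```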

* For which algorithms is feature scaling required?
Required: ANN, LR (logistic regression), Linear Regression, KNN, K-means => distance-based, and optimizers are involved, so scaling helps quicker convergence.
Not required: DT, RF, XGBoost, AdaBoost.
