Learning Algorithm
Uploaded by palasek182

Advanced Learning Algorithm
Demand Prediction

To illustrate how neural networks work, let's start with an example from demand prediction, in which you look at a product and try to predict: will this product be a top seller or not? Let's take a look.


In this example, you're selling T-shirts, and you would like to know whether a particular T-shirt will be a top seller, yes or no. You have collected data on different T-shirts that were sold at different prices, as well as which ones became top sellers. This type of application is used by retailers today to plan better inventory levels as well as marketing campaigns: if you know what's likely to be a top seller, you would plan, for example, to purchase more of that stock in advance.

In this example, the input feature x is the price of the T-shirt, and that's the input to the learning algorithm. If you apply logistic regression to fit a sigmoid function to the data, the output of your prediction is 1 / (1 + e^(-(wx + b))). Previously, we had written this as f(x), the output of the learning algorithm.


Now we'll use the letter a to denote the output of this logistic regression algorithm. We can think of it as a very simplified model of a single neuron in the brain.
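A minimal NumPy sketch of this single "neuron": the sigmoid of wx + b. The values of w, b, and the price x below are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1 / (1 + np.exp(-z))

# A single "neuron": activation a = g(wx + b) for one input feature x (price)
w, b = -0.5, 2.0   # hypothetical parameters
x = 3.0            # e.g. a T-shirt price
a = sigmoid(w * x + b)
```

The output a is always between 0 and 1, which we read as the probability of being a top seller.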

Activation functions (linear, softmax, ...): we will learn more about them in the following lectures.

In this example, we previously had just one feature; now we're going to have 4 features. Affordability, awareness, and perceived quality are the activations.

Input layer: 4 nodes
Hidden layer: 3 nodes (affordability, awareness, perceived quality)
Output layer: 1 node


We will learn later in this course how to choose an appropriate architecture for a neural network. Choosing the right number of hidden layers and the number of hidden units per layer can have an impact on the performance of the learning algorithm as well.

We can also call this model a multilayer perceptron.


Example: Recognizing Images

Face recognition: input image → output: the identity of the person.

First, we flatten the image into an array (a vector), the input for the model.
But how do the hidden layers work? We can see that:

Going from a layer with fewer units to a layer with more units, the network separates the windows into smaller windows. On the contrary, going from a layer with more units to a layer with fewer units, it groups the previous windows to create new windows.

In the last hidden layer, we compare the object with the windows to get accuracy for the output.

With other data, it can operate in the same way.
Neural network layer

Layer 1:

In this we can see that each unit computes its activation from w·x, and:

The output of layer 1 is the input of layer 2. Then we have a[2], the output of the neural network. Now we can use a[2] to predict, with threshold 0.5.
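A NumPy sketch of this chaining, with hypothetical weights: layer 1's output a1 is fed into layer 2, and the final activation a2 is thresholded at 0.5.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def dense(a_in, W, b):
    """One fully connected layer: each column of W holds one unit's weights."""
    return sigmoid(a_in @ W + b)

# Hypothetical parameters: 4 inputs -> 3 hidden units -> 1 output unit
W1 = np.ones((4, 3)) * 0.1
b1 = np.zeros(3)
W2 = np.ones((3, 1)) * 0.1
b2 = np.zeros(1)

x = np.array([200.0, 17.0, 180.0, 0.5])   # made-up feature vector
a1 = dense(x, W1, b1)              # output of layer 1 is the input of layer 2
a2 = dense(a1, W2, b2)             # a2 is the output of the network
yhat = 1 if a2[0] >= 0.5 else 0    # threshold at 0.5
```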
More complex neural network
Inference : Making predictions
(forward propagation)
Inference in code :

TensorFlow Implementation

Good-tasting coffee?

We can see that a not nicely roasted set of beans comes from: not long enough or too low a temperature; or too long or too high a temperature.

Simple neural network:

Now we implement it as a more complex neural network:


Data in TensorFlow

Warning! Note that we have 2 types of data:

In TensorFlow: a tensor (matrix).
In NumPy: an array.

But don't worry, we can use functions to convert between them.
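For example (assuming TensorFlow and NumPy are available), tf.constant converts a NumPy array into a tensor, and a tensor's .numpy() method converts back:

```python
import numpy as np
import tensorflow as tf

x_np = np.array([[200.0, 17.0]])   # NumPy 2-D array (1 row, 2 columns)
x_tf = tf.constant(x_np)           # NumPy array -> TensorFlow tensor
back = x_tf.numpy()                # tensor -> NumPy array
```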
Building a neural network

Layer 1, Layer 2

First, we need to determine the architecture. Then prepare the data x and y.

Let's go to an example about digit classification. We can convert a pandas DataFrame to a NumPy array with the function df.to_numpy().

Forward prop in a single layer

Each unit's activation is computed from the previous layer's activations: a_j = g(w_j · a_prev + b_j).
Artificial General Intelligence (AGI)

What is it?

Vectorization (optional)

How are neural networks implemented efficiently?
Matrix Multiplication

Then the result is:

The rule is that the number of columns in matrix A must equal the number of rows in matrix B.
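A small NumPy check of this rule: A has shape (2, 3) and B has shape (3, 2), so the product is defined (3 columns match 3 rows) and has shape (2, 2).

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])   # shape (2, 3)
B = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])       # shape (3, 2)

# Valid: columns of A (3) == rows of B (3); the result has shape (2, 2)
C = np.matmul(A, B)
```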
TensorFlow Implementation

We have 3 steps:

1. Specify the model.
2. Compile the model (using a specific loss function such as BinaryCrossentropy).
3. Train the model.
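The three steps can be sketched in Keras like this; the architecture and the toy data below are illustrative, not from the lecture:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.losses import BinaryCrossentropy

# Step 1: specify the model (layer sizes are an example)
model = Sequential([
    Dense(3, activation='sigmoid', input_shape=(2,)),
    Dense(1, activation='sigmoid'),
])

# Step 2: compile with a loss such as binary cross-entropy
model.compile(loss=BinaryCrossentropy(), optimizer='adam')

# Step 3: train on (toy) data
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 0.])
model.fit(X, y, epochs=5, verbose=0)
```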


Training Details

Overview:

Let's go into the detail of each step:

We can change the loss function at compile time; the optimizer then tunes the parameters to minimize it.

Back propagation is a gradient estimation method used to train neural networks.
Alternatives to the sigmoid activation

ReLU:

Common activations:

But how can we choose them?

Choosing an activation function:

When working on binary classification: sigmoid for the output layer. For regression problems: we can use a linear activation. For regression without negative values: ReLU.
With hidden layers, the reasons ReLU is more popular than sigmoid:

1. ReLU is a bit faster to compute, because it only requires computing max(0, z).
2. The ReLU function goes flat in only one part of its graph, but sigmoid goes flat in two.
Summary:

But why do we need activation functions? What would happen if we were to use a linear activation function for all nodes in the neural network? It would become no different than linear regression; the result is just like linear regression.
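We can verify this collapse numerically: two stacked linear layers with (hypothetical) random weights compute exactly the same function as a single linear layer whose parameters combine them.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=4)
W2, b2 = rng.normal(size=(4, 1)), rng.normal(size=1)
x = rng.normal(size=3)

# Two stacked layers with LINEAR activation...
a1 = x @ W1 + b1
a2 = a1 @ W2 + b2

# ...equal one linear layer with W = W1 @ W2 and b = b1 @ W2 + b2
W, b = W1 @ W2, b1 @ W2 + b2
same = x @ W + b
```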
Multiclass

Softmax

Neural network with softmax output

Digit classification:

Implement softmax in TensorFlow:

Improved implementation of softmax in a neural network

Maybe we will have some numerical roundoff errors:

The result is that:

This improvement also works for logistic regression, but the numerical error matters more with softmax multiclass classification:

Softmax:

Logistic regression:
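A common way to get the numerically safer version in TensorFlow is to make the output layer linear (producing logits) and pass from_logits=True to the loss; the layer sizes below are just an example:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.losses import SparseCategoricalCrossentropy

# Output LINEAR activations (logits) and let the loss apply softmax
# internally: from_logits=True is more numerically accurate
model = Sequential([
    Dense(25, activation='relu', input_shape=(400,)),
    Dense(15, activation='relu'),
    Dense(10, activation='linear'),   # logits, not probabilities
])
model.compile(loss=SparseCategoricalCrossentropy(from_logits=True),
              optimizer='adam')

# To read off probabilities, apply softmax to the logits afterwards
logits = model.predict(np.zeros((1, 400)), verbose=0)
probs = tf.nn.softmax(logits).numpy()
```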
Advanced Optimization

We learned that gradient descent is used to minimize the cost of the algorithm, but with huge data it's slow, and we have another algorithm that is faster than it: Adam.

This algorithm can increase or decrease the learning rate α automatically. With gradient descent, you have only a single α, but with Adam you have a separate α for each w and for b.
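In Keras you switch to Adam simply by passing it at compile time; 1e-3 below is just a common initial global learning rate (Adam then adapts per-parameter step sizes itself), and the one-layer model is a placeholder:

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([Dense(1, input_shape=(2,))])
# Adam maintains a separate adaptive step size for each parameter
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss='mse')
```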
Additional Layer Types
Convolutional layer :

What is a derivative? We can use the SymPy package to calculate derivatives.

Computation graph

The output of each calculation is written on the arrow.

Forward prop goes left to right. Back prop is the opposite: right to left.

Let's check:

So why do we use back prop to compute the derivatives?

For computing the cost function: use left-to-right (forward prop).
For computing all the derivatives: use right-to-left (back prop).
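For instance, SymPy can differentiate a small cost expression symbolically; the expression here is just an example:

```python
import sympy

# J(w) = (2 + 3w)^2, so dJ/dw = 2 * (2 + 3w) * 3 = 12 + 18w
w = sympy.Symbol('w')
J = (2 + 3 * w) ** 2
dJ_dw = sympy.diff(J, w)
```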


Larger neural network
Deciding what to try next
Evaluating the model
Logistic algorithm

Model selection and train / cross-validation / test sets

J_test for the fifth-order polynomial (w5, b5) turns out to be the lowest. But when you estimate how well this model performs, it turns out to be a slightly flawed procedure to report the test set error J_test(w5, b5), because the test set was already used to choose the model.
To modify the training and testing procedure: instead of selecting the model after splitting the dataset into train/test sets, we split the dataset into 3 sets (train / cross-validation / test). Instead of evaluating on the test set, we evaluate on the cross-validation set, then choose the model with the lowest CV error.

Finally, we use the test set to report the estimate of the generalization error, i.e. how well this model does on new data.
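One common way to get the three sets, sketched with scikit-learn on made-up data: split off 60% for training, then split the remaining 40% evenly into cross-validation and test.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(-1, 1).astype(float)
y = 2 * X.ravel() + 1

# 60% train, then the remaining 40% split in half: 20% CV, 20% test
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=1)
X_cv, X_test, y_cv, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=1)
```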
Example with a neural network:

→ pick the model with the lowest CV error
Diagnosing bias and variance

Plotting the error against model complexity, the middle model will be the best.

But in some cases, it is possible to simultaneously have high bias and high variance. This turns out to happen in neural networks. To recognize this situation, you can see that J_train is very high. For part of the input, you have a very complicated function that overfits, so it overfits for that part of the input.

But then for some record , for other parts of the


doesn't fit the and
input ,
it
training well ,
so it underfits

for parts of the input .

=> If the algorithm does


poorly on the
training
ret and it even does much worke than the
, on
training
Ret .
Regularization and Bias/Variance

How regularization can impact the overall performance of the algorithm:

With large λ, the algorithm is highly motivated to keep the parameters w small, and so you end up with w ≈ 0. With the regularization parameter equal to zero, the algorithm overfits the training set.

We will choose an intermediate λ: use the CV set to evaluate each candidate, then pick the corresponding w, b; use the test set to report.

λ behaves opposite to the degree of the polynomial: when λ increases, J_train increases, but when the degree of the polynomial increases, J_train decreases.
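A sketch of picking λ by cross-validation, using scikit-learn's Ridge (whose alpha plays the role of λ) on synthetic data; the candidate list is arbitrary:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 5))
y_train = X_train @ np.arange(1., 6.) + rng.normal(size=40)
X_cv = rng.normal(size=(20, 5))
y_cv = X_cv @ np.arange(1., 6.) + rng.normal(size=20)

# Try several lambda values; keep the one with the lowest CV error
lambdas = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
cv_errors = []
for lam in lambdas:
    model = Ridge(alpha=lam).fit(X_train, y_train)
    cv_errors.append(mean_squared_error(y_cv, model.predict(X_cv)))
best_lambda = lambdas[int(np.argmin(cv_errors))]
```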


Establishing a baseline level of performance

Looking at the picture, we can see that J_train is high and J_cv is even higher. This leads us to conclude that it has high bias.

But wait: look at the human-level performance (human error). J_train is only a little higher than the human error. There are some reasons for this; one of them is the noisy sounds in the audio. So we should consider human-level performance.

What is a baseline level of performance? If the gap from the baseline to J_train is small but J_cv is much higher than J_train: high variance. If J_train is much higher than the baseline: high bias.
Learning curves

When you have a larger training set, it's harder for a quadratic function to fit all the examples perfectly. When you have more and more training data, the growth rate of J_train gradually decreases, and the descent speed of J_cv gradually decreases.

Let's go to high bias: you can see that when we increase the training set size, J_cv and J_train don't change much. They will both flatten out and probably just continue to be flat like that.

That gives this conclusion, maybe a little bit surprising: if a learning algorithm has high bias, getting more training data will not, by itself, help that much.
With high variance: (in contrast, getting more training data is likely to help)

Deciding what to try next, revisited

If your algorithm makes unacceptably large errors in its predictions, what do you try next? There are several ideas:

With high variance: get more training data, or simplify the model.

With high bias: make your model more powerful, giving it more flexibility to fit more complex or wiggly functions.
Bias, variance and neural networks

High J_train → use a bigger network. High J_cv → get more data.

With a high bias problem, we can use a larger network. But if the network is too big, it will cause a high variance problem. For this case, we can regularize the larger network.
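In Keras, regularizing a larger network can look like this; the layer sizes and λ = 0.01 are illustrative:

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.regularizers import l2

# A larger network, with L2 (weight decay) regularization on each layer
model = Sequential([
    Dense(128, activation='relu', kernel_regularizer=l2(0.01),
          input_shape=(20,)),
    Dense(64, activation='relu', kernel_regularizer=l2(0.01)),
    Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam')
```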
Iterative loop of ML development

Choose the architecture, train the model, run diagnostics (bias, variance, error analysis), then adjust: a larger model, more regularization, or more data, depending on overfitting vs. underfitting.

Go to the spam classification example to look at how it works. The first way: ... Or a second way: use the quantity of each word in the document.
Error analysis

But when we have more data, how can we work with it?
Adding data

Instead of adding more data of everything under the sun, we just need to add data where error analysis has indicated it might help.

In this example, we can add more pharma-related spam data.

But that takes a lot of time and may be expensive. Then we can use data augmentation.

Example: train an OCR algorithm to read text from images, recognizing the digit in the center of a window.
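A minimal NumPy sketch of data augmentation for digit images: small random shifts plus noise turn one labeled example into several. The 20x20 random "digit" below is a stand-in for real data.

```python
import numpy as np

def augment(image, rng):
    """Create a distorted copy of a digit image: a small random shift
    plus noise (a shifted digit is still a valid digit)."""
    dy, dx = rng.integers(-2, 3, size=2)
    shifted = np.roll(np.roll(image, dy, axis=0), dx, axis=1)
    noisy = shifted + rng.normal(scale=0.05, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

rng = np.random.default_rng(0)
digit = rng.random((20, 20))    # stand-in for a 20x20 digit image
extra = [augment(digit, rng) for _ in range(5)]   # 5 new training examples
```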
We have 2 research directions: model-centric and data-centric. In the past, models and algorithms were prioritized for development, but now models and algorithms are already quite good. In the current era of data explosion, we can develop in a data-centric direction.
Transfer learning: using data from a different task

We have 2 ways:

Option 1: train only the output layer's parameters. It's suitable for a small training set.

Option 2: train all the parameters, if you have a larger training set.
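A Keras sketch of the two options (the "pretrained" network here is an untrained stand-in, just to show the mechanics): drop the old output layer, add a new one, and either freeze the reused layers (Option 1) or fine-tune everything (Option 2).

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# Stand-in for a network pretrained on a large, different task
pretrained = Sequential([
    Dense(25, activation='relu', input_shape=(400,)),
    Dense(15, activation='relu'),
    Dense(1000, activation='linear'),   # original output layer
])

# Reuse every layer except the output, then add a new output layer
base = Sequential(pretrained.layers[:-1])
base.trainable = False   # Option 1: freeze, train only the new head
model = Sequential([base, Dense(10, activation='linear')])
# For Option 2 (larger training set): set base.trainable = True instead
model.compile(loss='mse', optimizer='adam')

out = model.predict(np.zeros((1, 400)), verbose=0)
```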
Full cycle of a machine learning project

(Deployment must scale to a large number of users, and not have too high a computational cost.)

It may also relate to user privacy: make sure that consent allows you to store this data.

A monitoring system allows us to figure out data shifting and how much less accurate the algorithm has become. Then we can retrain the model and carry out a model update to replace the old model.
MLOps: machine learning operations

This refers to how to build, deploy, and maintain machine learning systems.

Fairness, bias, and ethics
Error metrics for skewed datasets

We can use the confusion matrix to evaluate the models.

Trade-off between precision and recall

We choose the algorithm with the highest F1 score.
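From the confusion-matrix counts, precision, recall, and F1 can be computed directly; the labels below are made up:

```python
import numpy as np

y_true = np.array([1, 1, 0, 1, 0, 0, 0, 1])
y_pred = np.array([1, 0, 0, 1, 1, 0, 0, 1])

tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))   # false negatives

precision = tp / (tp + fp)                    # of predicted 1s, how many real
recall = tp / (tp + fn)                       # of real 1s, how many found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
```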

Decision tree model

The decision tree is a very powerful model. It is widely used in many applications.

Let's go to a binary example.

Measuring purity

Entropy:

Information gain = reduction in entropy
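A NumPy sketch of entropy and information gain for a binary split; the labels and feature values are made up:

```python
import numpy as np

def entropy(p):
    """Entropy of a binary label distribution with positive fraction p."""
    if p == 0 or p == 1:
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def information_gain(y, mask):
    """Reduction in entropy from splitting labels y by a boolean mask."""
    left, right = y[mask], y[~mask]
    w_left = len(left) / len(y)
    w_right = 1 - w_left
    return entropy(y.mean()) - (w_left * entropy(left.mean())
                                + w_right * entropy(right.mean()))

y = np.array([1, 1, 1, 0, 0, 0, 1, 0, 0, 0])                      # labels
feature = np.array([1, 0, 0, 1, 1, 1, 0, 1, 0, 0], dtype=bool)    # a split
gain = information_gain(y, feature)
```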
Decision tree learning, putting it together

* Split on the feature that gives the highest information gain.

One-hot encoding: convert a categorical feature into multiple binary classification features.


Continuous valued features

Split with thresholds, then calculate the information gain of each to find the highest.

But how can we choose the thresholds? We have a rule: choose candidate values around the midpoints between consecutive values in the sorted list.
Regression Tree

* In this problem, we choose the split with the highest information gain (reduction in variance).

Random forest: