Chapter 11
Neural Network
Basic idea
• Combine input information in a complex & flexible neural
net “model”
• Model “coefficients” are continually tweaked in an
iterative process
• The network’s interim performance in classification and
prediction informs successive tweaks
Basic idea
[Figure: a "black box" with inputs X1, X2, X3 and output Y]

X1  X2  X3 | Y
 1   0   0 | 0
 1   0   1 | 1
 1   1   0 | 1
 1   1   1 | 1
 0   0   1 | 0
 0   1   0 | 0
 0   1   1 | 1
 0   0   0 | 0

Output Y is 1 if at least two of the three inputs are equal to 1.
Basic idea
[Figure: the same black box, now modeled as a single output node; input nodes X1, X2, X3 each connect to the output node with weight 0.3, and the output node applies threshold t = 0.4]

$Y = I(0.3\,X_1 + 0.3\,X_2 + 0.3\,X_3 - 0.4 > 0)$

where $I(z) = 1$ if $z$ is true, $0$ otherwise
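A minimal Python sketch of this single-node model, using the weights (0.3) and threshold (0.4) from the diagram, confirms that it reproduces the rule from the truth table:

```python
# Single-node "black box": each input has weight 0.3, threshold t = 0.4.
from itertools import product

def perceptron(x1, x2, x3, w=0.3, t=0.4):
    """Return 1 if the weighted sum of the inputs exceeds the threshold."""
    return 1 if w * x1 + w * x2 + w * x3 - t > 0 else 0

# Check every row of the truth table: the node outputs 1 exactly when
# at least two of the three inputs are equal to 1.
for x1, x2, x3 in product([0, 1], repeat=3):
    assert perceptron(x1, x2, x3) == (1 if x1 + x2 + x3 >= 2 else 0)
```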
Basic idea
• Model is an assembly of inter-connected nodes and weighted links
• The output node sums up each of its input values according to the weights of its links
• The output node's sum is compared against some threshold t
[Figure: input nodes X1, X2, X3 connected to a single output node Y by links with weights w1, w2, w3; the output node applies threshold t]
General structure
• Multiple layers
- Input layer (raw observations)
- Hidden layers
- Output layer
• Nodes
• Weights (like coefficients, subject to iterative
adjustment)
• Bias values (also like coefficients, and also subject
to iterative adjustment)
General structure
[Figure: a multi-layer perceptron with an input layer (x1, x2, x3, x4, x5), a hidden layer, and an output layer producing y. Each neuron i receives inputs I1, I2, I3 over links with weights wi1, wi2, wi3, forms the weighted sum Si, and passes it through an activation function g(Si), subject to a threshold t, to produce its output Oi.]
Training an ANN means learning the weights of the neurons.
Learning algorithm
• Initialize the weights (w0, w1, …, wk)
• Adjust the weights in such a way that the output of the ANN
is consistent with the class labels of the training examples
- Objective function: $E = \sum_i \left[ Y_i - f(w_i, X_i) \right]^2$
- Find the weights $w_i$ that minimize the above objective
function, e.g., via the backpropagation algorithm
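A minimal sketch of this idea for a single logistic node, minimizing the squared-error objective by plain gradient descent (the toy data and learning rate below are illustrative, not the book's example):

```python
import numpy as np

# Toy training data: two predictors, binary outcome (illustrative only).
X = np.array([[0.2, 0.9], [0.1, 0.1], [0.9, 0.8], [0.8, 0.2]])
Y = np.array([1.0, 0.0, 1.0, 0.0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
w = rng.uniform(-0.05, 0.05, size=2)   # weights, small random start
theta = rng.uniform(-0.05, 0.05)       # bias, small random start
lr = 0.5                               # learning rate

for epoch in range(1000):
    y_hat = sigmoid(theta + X @ w)            # f(w, X_i) for every record
    delta = (Y - y_hat) * y_hat * (1 - y_hat)
    w += lr * X.T @ delta                     # move the weights downhill on E
    theta += lr * delta.sum()

print("squared error E:", np.sum((Y - sigmoid(theta + X @ w)) ** 2))
```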
Example: Fat & Salt Content to Predict
Consumer Acceptance of Cheese
Example: Data
Moving through the
Network
Input layer
• For input layer, input = output
- e.g., for record #1:
Fat input = output = 0.2
Salt input = output = 0.9
• Output of input layer = input into hidden
layer
Hidden layer
• In this example, it has 3 nodes
• Each node receives as input the output
of all input nodes
• Output of each hidden node is a
function of the weighted sum of inputs
$\text{output}_j = g\!\left(\theta_j + \sum_{i=1}^{p} w_{ij} x_i\right)$
Weights
• The bias values θ (theta) and the weights w are typically
initialized to random values in the range
-0.05 to +0.05
• Equivalent to a model with random
prediction (in other words, no predictive
value)
• These initial weights are used in the first
round of training
Initial pass of the network
Output of node 3, if g is a logistic function:

$\text{output}_j = g\!\left(\theta_j + \sum_{i=1}^{p} w_{ij} x_i\right)$

$\text{output}_3 = \dfrac{1}{1 + e^{-(\theta_3 + w_{13} x_1 + w_{23} x_2)}}$
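A small sketch of this calculation for hidden node 3, using record #1 (fat = 0.2, salt = 0.9); the bias and weights below are hypothetical small random starting values, not the book's:

```python
import numpy as np

def node_output(theta, weights, inputs):
    """Logistic function applied to the bias plus the weighted sum of inputs."""
    return 1.0 / (1.0 + np.exp(-(theta + np.dot(weights, inputs))))

x = [0.2, 0.9]            # record #1: fat = 0.2, salt = 0.9
theta_3 = -0.02           # hypothetical initial bias for node 3
w_3 = [0.05, 0.01]        # hypothetical initial weights into node 3

print(node_output(theta_3, w_3, x))   # roughly 0.5, as expected for near-zero weights
```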
Output layer
• The outputs of the last hidden layer become the inputs
for the output layer
• Uses the same function as above, i.e., a function g of
the weighted sum

$\text{output}_6 = \dfrac{1}{1 + e^{-(\theta_6 + w_{36}\,\text{output}_3 + w_{46}\,\text{output}_4 + w_{56}\,\text{output}_5)}}$
Output node
If the cutoff for class "1" is 0.5 and the output node's
value exceeds it, classify the record as "1"
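Putting the pieces together, a sketch of one full forward pass for a single record, ending with the 0.5 cutoff; all bias and weight values here are hypothetical illustrations:

```python
import numpy as np

def node_output(theta, weights, inputs):
    return 1.0 / (1.0 + np.exp(-(theta + np.dot(weights, inputs))))

x = np.array([0.2, 0.9])   # record #1: fat = 0.2, salt = 0.9

# Hidden layer: nodes 3, 4, 5, each fed by the two inputs (hypothetical weights).
hidden_thetas = [-0.02, 0.03, 0.01]
hidden_weights = [[0.05, 0.01], [-0.01, 0.03], [0.02, -0.04]]
hidden_out = [node_output(t, w, x) for t, w in zip(hidden_thetas, hidden_weights)]

# Output layer: node 6, fed by the three hidden-node outputs (hypothetical weights).
y_hat = node_output(0.015, [0.01, 0.05, 0.015], hidden_out)

# Classify with a cutoff of 0.5 on the output node's value.
print(y_hat, "->", 1 if y_hat > 0.5 else 0)
```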
Relation to linear regression
• A net with a single output node and no hidden layers,
where g is the identity function, takes the same form as
a linear regression model
$\hat{y} = \theta + \sum_{i=1}^{p} w_i x_i$
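For illustration, with the identity function as g and no hidden layer, the "network" prediction is literally the linear-regression formula (the numbers here are made up):

```python
import numpy as np

theta = 0.4                    # bias plays the role of the intercept
w = np.array([1.5, -2.0])      # link weights play the role of coefficients
x = np.array([0.2, 0.9])

y_hat = theta + np.dot(w, x)   # identity activation: same form as linear regression
```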
Training the model
Preprocessing Steps
• Scale variables to 0-1
• Categorical variables
- If equidistant categories, map to equidistant
interval points in 0-1 range
- Otherwise, create dummy variables
• Transform (e.g., log) skewed variables
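A minimal preprocessing sketch along these lines, using pandas; the column names and values are made up for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "fat": [0.1, 3.2, 1.8],                 # numeric predictor
    "income": [20000, 85000, 40000],        # skewed numeric predictor
    "region": ["north", "south", "west"],   # nominal categorical predictor
})

# Transform the skewed variable first (log transform).
df["income"] = np.log1p(df["income"])

# Scale numeric variables to the 0-1 range (min-max scaling).
for col in ["fat", "income"]:
    df[col] = (df[col] - df[col].min()) / (df[col].max() - df[col].min())

# Nominal categories are not equidistant, so create dummy variables.
df = pd.get_dummies(df, columns=["region"])
```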
Initial Pass Through Network
• Goal: Find weights that yield best
predictions
• The following process is repeated for each record:
- At each record, compare the prediction to the actual value
- The difference is the error for the output node
- The error is propagated back, distributed to all the
hidden nodes, and used to update their weights
Back Propagation (“back-prop”)
• Output from output node k: $\hat{y}_k$
• Error associated with that node:
$\text{err}_k = \hat{y}_k (1 - \hat{y}_k)(y_k - \hat{y}_k)$
Error is Used to Update Weights
$\theta_j^{new} = \theta_j^{old} + l\,(\text{err}_j)$
$w_{i,j}^{new} = w_{i,j}^{old} + l\,(\text{err}_j)$
l = constant between 0 and 1, reflects the “learning
rate” or “weight decay parameter”
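A sketch of one such update for the output node, mirroring the simplified rule above (err as defined on the previous slide; the starting values and learning rate are illustrative):

```python
def update_output_node(theta, weights, y_actual, y_hat, lr=0.1):
    """One back-prop update of the output node's bias and weights:
    new = old + l * err, with err for a logistic output node."""
    err = y_hat * (1 - y_hat) * (y_actual - y_hat)
    theta_new = theta + lr * err
    weights_new = [w + lr * err for w in weights]
    return theta_new, weights_new

# Example: the network predicted 0.43 for a record whose actual class is 1.
theta, weights = update_output_node(0.015, [0.01, 0.05, 0.015], y_actual=1, y_hat=0.43)
```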
Case Updating
• Weights are updated after each record
is run through the network
• Completion of all records through the
network is one epoch (also called
sweep or iteration)
• After one epoch is completed, return to
first record and repeat the process
Batch Updating
• All records in the training set are fed to
the network before updating takes
place
• In this case, the error used for updating
is the sum of all errors from all records
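A self-contained sketch contrasting the two regimes for a single logistic node (toy data; here each weight's update is also scaled by its input, a common refinement of the simplified rule shown earlier):

```python
import numpy as np

X = np.array([[0.2, 0.9], [0.1, 0.1], [0.9, 0.8], [0.8, 0.2]])  # toy predictors
Y = np.array([1.0, 0.0, 1.0, 0.0])                              # toy outcomes

predict = lambda theta, w, x: 1.0 / (1.0 + np.exp(-(theta + np.dot(w, x))))
err_term = lambda y, y_hat: y_hat * (1 - y_hat) * (y - y_hat)
lr = 0.5

# Case updating: weights change after every record; one full pass is one epoch.
theta, w = 0.0, np.zeros(2)
for epoch in range(100):
    for x, y in zip(X, Y):
        e = err_term(y, predict(theta, w, x))
        theta += lr * e
        w += lr * e * x

# Batch updating: errors from all records are summed before one update per epoch.
theta_b, w_b = 0.0, np.zeros(2)
for epoch in range(100):
    es = np.array([err_term(y, predict(theta_b, w_b, x)) for x, y in zip(X, Y)])
    theta_b += lr * es.sum()
    w_b += lr * (es[:, None] * X).sum(axis=0)
```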
Common Criteria to Stop Updating
• When weights change very little from
one iteration to the next
• When the misclassification rate reaches
a required threshold
• When a limit on runs is reached
XLMiner Output: Final Weights
Note: XLMiner uses two output nodes (P[1] and P[0]); diagrams show just one output
node (P[1])
Fat/Salt Example: Final Weights
XLMiner: Final Classifications
Avoiding Overfitting
• With sufficient hidden nodes and training
iterations, neural net can easily overfit
the data
• To avoid overfitting:
- Track error in validation data
- Limit iterations
- Limit complexity of network
User Inputs
Specify Network Architecture
• Number of hidden layers
- Most popular – one hidden layer
• Number of nodes in hidden layer(s)
- More nodes capture complexity, but increase
chances of overfit
• Number of output nodes
- For classification, one node per class (in binary
case can also use one)
- For numerical prediction use one
Network Architecture, cont.
• Learning Rate l
- Low values “downweight” the new
information from errors at each iteration
- This slows learning, but reduces tendency
to overfit to local structure
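As a concrete example of these user inputs, a scikit-learn multilayer perceptron can be configured with the same choices (one hidden layer of three nodes, a low learning rate, and a cap on iterations); this library is an assumption here, not something the chapter itself uses:

```python
from sklearn.neural_network import MLPClassifier

net = MLPClassifier(
    hidden_layer_sizes=(3,),    # one hidden layer with 3 nodes
    activation="logistic",      # logistic activation, as in the chapter's example
    learning_rate_init=0.1,     # low learning rate: slower but less prone to overfit
    max_iter=500,               # limit on training iterations
    early_stopping=True,        # track error on held-out validation data
)
# net.fit(X_train, y_train) would then train the weights by back-propagation.
```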
Advantages
• Good predictive ability
• Can capture complex relationships
• No need to specify a model
Disadvantages
• Considered a “black box” prediction machine,
with no insight into relationships between
predictors and outcome
• No variable-selection mechanism, so you have
to exercise care in selecting variables
• Heavy computational requirements if there are
many variables (additional variables
dramatically increase the number of weights to
calculate)
Summary
• Neural networks can be used for classification and prediction
• They can capture a very flexible/complicated relationship between the outcome and a set of predictors
• The network "learns" and updates its model iteratively as more data are fed into it
• Major danger: overfitting
• Requires large amounts of data
• Good predictive performance, yet "black box" in nature