
MLP_In_Practice

A Multilayer Perceptron (MLP) is a type of neural network that consists of multiple layers of neurons, including an input layer, one or more hidden layers, and an output layer. It is used for supervised learning tasks, particularly classification and regression. MLPs are fully connected, meaning each neuron in one layer is connected to every neuron in the next layer.

When implementing an MLP in practice, the process generally involves these steps:

1. Data Preprocessing
• Before training an MLP, the data needs to be preprocessed.
• Normalization/Standardization: Features are usually scaled to have a mean of 0 and a standard deviation of 1. This helps the network learn more efficiently.
• In machine learning, especially for models like neural networks,
normalization and standardization are crucial for efficient learning.
These techniques ensure that features are on a similar scale,
preventing some variables from dominating others due to differences
in magnitude.
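As a minimal sketch (assuming scikit-learn and a NumPy feature matrix X), standardization can be done with StandardScaler:

import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix: 5 samples, 2 features on very different scales
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 250.0],
              [4.0, 400.0],
              [5.0, 350.0]])

scaler = StandardScaler()           # learns the per-feature mean and standard deviation
X_scaled = scaler.fit_transform(X)  # each column now has mean 0 and std 1

print(X_scaled.mean(axis=0))  # approximately [0, 0]
print(X_scaled.std(axis=0))   # approximately [1, 1]

In practice the scaler is usually fitted on the training split only and then reused to transform the test split (see the split below).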
• Train-Test Split: You divide the data into a training set and a test set.
The training set is used to train the model, and the test set is used to
evaluate the model's performance.
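A minimal sketch of the split, assuming scikit-learn and arrays X (features) and y (labels):

from sklearn.model_selection import train_test_split

# Hold out 20% of the data for evaluation; fix the random seed for reproducibility
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)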
• Categorical Encoding (for classification tasks): For categorical
features, you might need to apply techniques like one-hot encoding.
• When working with categorical features in machine learning, we must
convert them into a numerical format because most machine learning
models work with numbers, not text. Categorical encoding is the
process of transforming categorical variables into numerical
representations.
(A) One-Hot Encoding (OHE)
• Converts categories into binary columns (0s and 1s).
• Each unique category becomes a separate column.
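A minimal sketch of one-hot encoding, assuming pandas and a hypothetical 'color' column:

import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# Each unique category becomes its own 0/1 column
encoded = pd.get_dummies(df, columns=["color"], dtype=int)
print(encoded)
#    color_blue  color_green  color_red
# 0           0            0          1
# 1           0            1          0
# 2           1            0          0
# 3           0            1          0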
2. Model Architecture
• MLPs typically have:
• Input Layer: Takes the input features of your data.
• Hidden Layers: One or more layers of neurons. These layers perform
transformations of the data.
• Output Layer: Provides the output predictions (for classification or
regression).
• Each neuron in a layer takes a weighted sum of the inputs, passes it
through an activation function, and then forwards it to the next layer.
• Common activation functions:
I. Sigmoid: f(x) = 1 / (1 + exp(-x))
3. Loss Function
• The loss function measures how well the network's predictions match
the actual outputs. Common choices are:
• Mean Squared Error (MSE): For regression tasks.
• Cross-Entropy Loss: For classification tasks.
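Both losses take only a few lines in NumPy; this sketch assumes y_true and y_pred arrays of the same shape:

import numpy as np

def mse(y_true, y_pred):
    # Mean Squared Error: average squared difference (regression)
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Cross-entropy for binary classification; eps avoids log(0)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))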
4. Training the Network
• Training an MLP involves the following:
1) Forward Propagation: The input data is passed through the network,
layer by layer, to produce an output.
2) Backpropagation: The error (difference between predicted and actual
values) is propagated backward through the network to adjust the
weights using gradient descent or other optimization algorithms.
3) Gradient Descent: The model's weights are updated using an optimization algorithm such as stochastic gradient descent (SGD) or RMSprop (Root Mean Square Propagation).
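As an illustration only (the slides do not prescribe a framework), a PyTorch-style training loop ties the three steps together; the data and layer sizes here are hypothetical:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 16), nn.Sigmoid(), nn.Linear(16, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

X = torch.randn(100, 4)            # hypothetical inputs
y = torch.randn(100, 1)            # hypothetical regression targets

for epoch in range(10):
    y_hat = model(X)               # 1) forward propagation
    loss = loss_fn(y_hat, y)
    optimizer.zero_grad()
    loss.backward()                # 2) backpropagation (compute gradients)
    optimizer.step()               # 3) gradient descent (update the weights)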
What is SGD?
• SGD is an optimization algorithm that updates the weights using the
gradient of the loss function computed from a single random sample
(or a small batch) at each step.

w = w - η * (dL/dw)

where:
• η (eta) is the learning rate and controls the step size.
• dL/dw is the gradient of the loss with respect to the weight.
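As a worked sketch of that update rule for a single weight (all values hypothetical):

# One SGD step: w = w - eta * dL/dw
eta = 0.1           # learning rate
w = 2.0             # current weight
grad = 0.8          # dL/dw computed from one sample (or a small batch)

w = w - eta * grad  # 2.0 - 0.1 * 0.8 = 1.92
print(w)            # 1.92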
5. Evaluation
• After training, the model is evaluated on the test set using metrics like:
1) Accuracy (for classification)
2) Precision, Recall, F1-score (for classification)
3) Mean Absolute Error (MAE) or Mean Squared Error (MSE) (for
regression)
• Accuracy is a common metric for classification tasks, defined as the ratio of correctly predicted instances to the total instances in the dataset:

Accuracy = (number of correct predictions) / (total number of instances)
• Mean Squared Error (MSE): Measures the average squared difference between actual and predicted values:

MSE = (1/n) * Σ (y_i - ŷ_i)^2
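A minimal evaluation sketch, assuming scikit-learn, a held-out test set, and model predictions y_pred:

from sklearn.metrics import accuracy_score, f1_score, mean_squared_error

# Classification: y_test and y_pred are class labels
print("Accuracy:", accuracy_score(y_test, y_pred))
print("F1-score:", f1_score(y_test, y_pred, average="macro"))

# Regression: y_test and y_pred are continuous values
print("MSE:", mean_squared_error(y_test, y_pred))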
6. Hyperparameter Tuning
• You can improve the MLP's performance by tuning hyperparameters
such as:
1) Number of hidden layers and neurons
2) Learning rate
3) Batch size
4) Optimizer choice (SGD, etc.)
Number of Hidden Layers and Neurons
• What it does?
• Hidden layers and neurons control the model's complexity.
• More layers → Higher ability to capture complex patterns (but risk of
overfitting).
• More neurons per layer → More capacity to learn, but increases
computational cost.

• How to tune?

• Start with one hidden layer and gradually increase.
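As an illustration (assuming scikit-learn's MLPClassifier and existing training/validation splits; the slides do not prescribe a library), a few candidate architectures can be compared directly:

from sklearn.neural_network import MLPClassifier

# Hypothetical candidates: one, two, then three hidden layers
for hidden in [(32,), (64, 32), (128, 64, 32)]:
    model = MLPClassifier(hidden_layer_sizes=hidden, max_iter=500, random_state=0)
    model.fit(X_train, y_train)
    print(hidden, "validation accuracy:", model.score(X_val, y_val))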


Learning Rate (η)
• What it does?
• Controls how much the model adjusts weights during training.
• A high learning rate → Faster learning, but risks overshooting the minimum and unstable training.
• A low learning rate → More stable learning, but may take longer to converge.
• How to tune?

• Common values: 0.01, 0.001, 0.0001.
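A minimal sweep over those values, again assuming scikit-learn's MLPClassifier (learning_rate_init is its name for the initial learning rate) and existing data splits:

from sklearn.neural_network import MLPClassifier

for lr in [0.01, 0.001, 0.0001]:
    model = MLPClassifier(hidden_layer_sizes=(64,), learning_rate_init=lr,
                          max_iter=500, random_state=0)
    model.fit(X_train, y_train)
    print("lr =", lr, "validation accuracy:", model.score(X_val, y_val))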


Batch Size
• What it does?

• Determines how many samples are used to compute gradients before updating
weights.
• Small batch size (e.g., 32, 64) → More noise, but better generalization.
• Large batch size (e.g., 256, 512) → Faster training, but may lead to poor
generalization.
• How to tune?

• Start with 32 or 64, and experiment with larger sizes.


• If training is unstable, reduce batch size.
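A small worked sketch of the trade-off, assuming a hypothetical training set of 10,000 samples: smaller batches mean noisier gradients but more weight updates per epoch.

n_samples = 10_000
for batch_size in [32, 64, 256, 512]:
    updates_per_epoch = n_samples // batch_size  # how often the weights change each epoch
    print(batch_size, "->", updates_per_epoch, "weight updates per epoch")
# 32 -> 312, 64 -> 156, 256 -> 39, 512 -> 19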
Optimizer Choice (SGD, Adam, RMSprop, etc.)
• What it does?
• Optimizers adjust weights based on gradient updates.
• SGD (Stochastic Gradient Descent) → Works well but can be slow.
• Adam (Adaptive Moment Estimation) → Combines momentum with per-parameter adaptive learning rates; works well for most cases.
• RMSprop → Good for recurrent networks; adapts the learning rate using a running average of squared gradients.
• How to tune?
• Start with Adam (default: learning rate = 0.001).
• If training is unstable, try SGD with momentum (0.9).
• If gradients explode or vanish, try RMSprop.
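For illustration, this is how those three choices look in PyTorch (an assumption; any framework with these optimizers is configured similarly). 'model' refers to the network from the earlier training-loop sketch:

import torch

optimizer = torch.optim.Adam(model.parameters(), lr=0.001)                # good default first choice
# optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)  # if training is unstable
# optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)           # if gradients explode or vanish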
