Soft Max

The softmax function converts a real-valued vector into a probability distribution by normalizing it so the elements sum to 1 and fall in the range of 0 to 1. It differs from an element-wise logistic function by applying to the entire vector. A common use is as the output layer in a neural network for classification problems, where the softmax output can represent the probability that the input belongs to each class. This combines well with cross entropy loss for training.

Uploaded by

Pooja Patwari


Softmax function

• Its purpose is to convert a real-valued array into probabilities
(each in the range 0 to 1), rather than just to introduce a nonlinearity.
• It differs from the logistic function in that it
does not operate element-wise on a vector.
Rather, the softmax applies to the entire vector.
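The contrast between the two functions can be illustrated with a minimal sketch. The helper names `logistic` and `softmax` and the input vectors are chosen here for illustration and are not from the slides. Changing one component of the input leaves the other logistic outputs untouched, but it changes every softmax output.

```python
import math

def logistic(z):
    """Element-wise logistic (sigmoid): each output depends only on its own input."""
    return [1.0 / (1.0 + math.exp(-v)) for v in z]

def softmax(z):
    """Softmax: each output depends on every component of z."""
    exps = [math.exp(v) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

z_a = [3.0, 1.0, -3.0]
z_b = [3.0, 5.0, -3.0]  # only the second component differs from z_a

# The logistic output for the first component is identical in both cases.
print(logistic(z_a)[0], logistic(z_b)[0])
# The softmax output for the first component changes, because the
# normalizing sum runs over the whole vector.
print(softmax(z_a)[0], softmax(z_b)[0])
# Logistic outputs need not sum to 1; softmax outputs do.
print(sum(logistic(z_a)), sum(softmax(z_a)))
```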
The softmax function
The use of Softmax
• Softmax layer as the output layer

Ordinary layer: each output is computed from its own input alone, $y_i = \sigma(z_i)$ for $i = 1, 2, 3$.

In general, the output of the network can be any value, which may not be easy to interpret.

Softmax layer: the outputs can be read as probabilities:
• $1 > y_i > 0$
• $\sum_i y_i = 1$

$$y_i = \frac{e^{z_i}}{\sum_{j=1}^{3} e^{z_j}}, \qquad i = 1, 2, 3$$

Example:
• $z_1 = 3$: $e^{z_1} \approx 20$, so $y_1 \approx 0.88$
• $z_2 = 1$: $e^{z_2} \approx 2.7$, so $y_2 \approx 0.12$
• $z_3 = -3$: $e^{z_3} \approx 0.05$, so $y_3 \approx 0$
Softmax for multi-class classification
• Softmax pushes the largest component of the vector towards 1
while pushing all the other components towards zero. Also, all the
outputs sum to 1, regardless of the sum of the components of the
input vector. Thus, the output of the softmax function can be
interpreted as a probability distribution.

• A common application is to use softmax in the output layer for a
classification problem. The output vector has a component
corresponding to each target class, and the softmax output is
interpreted as the probability of the input belonging to the
corresponding class.
• It combines well with cross-entropy loss (an assignment problem
on this will be given).
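One reason softmax and cross-entropy loss pair well is that the gradient of the combined loss with respect to the softmax inputs takes a simple form: the softmax output minus the one-hot target. The sketch below checks this against finite differences. All function names, the input vector, and the target class are illustrative assumptions rather than anything given in the slides.

```python
import math

def softmax(z):
    """Softmax with the maximum subtracted for numerical stability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(z, target):
    """Cross-entropy loss -log(y[target]) for inputs z and an integer class index."""
    return -math.log(softmax(z)[target])

def grad_wrt_inputs(z, target):
    """Analytic gradient of the loss with respect to z: softmax(z) minus the one-hot target."""
    y = softmax(z)
    return [y_i - (1.0 if i == target else 0.0) for i, y_i in enumerate(y)]

z = [3.0, 1.0, -3.0]  # hypothetical network outputs before softmax
target = 0            # hypothetical true class

loss = cross_entropy(z, target)
grad = grad_wrt_inputs(z, target)

# Central finite differences as an independent check of the analytic gradient.
eps = 1e-6
numeric = []
for i in range(len(z)):
    z_plus = list(z)
    z_minus = list(z)
    z_plus[i] += eps
    z_minus[i] -= eps
    numeric.append((cross_entropy(z_plus, target) - cross_entropy(z_minus, target)) / (2 * eps))

print(loss)
print(grad)
print(numeric)
```

The analytic and numerical gradients should agree closely, and the gradient components sum to zero because both the softmax output and the one-hot target sum to 1.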
