Softmax classification with cross-entropy (2/2)

This tutorial describes the softmax function used to model multiclass classification problems. We will provide derivations of the gradients used for optimizing any parameters with respect to the cross-entropy.

The previous section described how to represent classification of 2 classes with the help of the logistic function. For multiclass classification there exists an extension of this logistic function, called the softmax function, which is used in multinomial logistic regression. What follows will explain the softmax function and how to derive it.

This is the second part of a 2-part tutorial on classification models trained by cross-entropy:

- Part 1: Logistic classification with cross-entropy
- Part 2: Softmax classification with cross-entropy (this)

    # Python imports
    %matplotlib inline
    %config InlineBackend.figure_format = 'svg'
    import numpy as np
    import matplotlib
    import matplotlib.pyplot as plt  # Plotting library
    from matplotlib.colors import colorConverter, ListedColormap
    from mpl_toolkits.mplot3d import Axes3D  # 3D plots
    from matplotlib import cm  # Colormaps
    import seaborn as sns  # Fancier plots

    # Set matplotlib and seaborn plotting style
    sns.set_style('darkgrid')

Softmax function

The logistic output function described in the previous section can only be used for the classification between two target classes $t=1$ and $t=0$. This logistic function can be generalized to output a multiclass categorical probability distribution by the softmax function. This softmax function $\varsigma$ takes as input a $C$-dimensional vector $\mathbf{z}$ and outputs a $C$-dimensional vector $\mathbf{y}$ of real values between 0 and 1. This function is a normalized exponential and is defined as:

$$y_c = \varsigma(\mathbf{z})_c = \frac{e^{z_c}}{\sum_{d=1}^{C} e^{z_d}} \quad \text{for } c = 1 \cdots C$$

The denominator $\sum_{d=1}^{C} e^{z_d}$ acts as a normalizer to make sure that $\sum_{c=1}^{C} y_c = 1$.
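As a quick numerical illustration of this definition, here is a minimal sketch assuming NumPy. The helper name `softmax_shifted` and the example vectors are illustrative and not part of the original tutorial. Subtracting the maximum of $\mathbf{z}$ before exponentiating is a common stabilization rather than something the tutorial itself does: the factor $e^{-\max(\mathbf{z})}$ cancels between numerator and denominator, so the result is unchanged, but large inputs no longer overflow.

```python
import numpy as np


def softmax_shifted(z):
    """Softmax of a 1-D vector, computed on inputs shifted by their maximum.

    Hypothetical helper for illustration; mathematically equal to
    exp(z) / sum(exp(z)).
    """
    z = np.asarray(z, dtype=float)
    exp_z = np.exp(z - np.max(z))  # largest exponent is 0, so no overflow
    return exp_z / np.sum(exp_z)


z = np.array([1.0, 2.0, 3.0])  # arbitrary example input
y = softmax_shifted(z)
print(y)          # three values in (0, 1), largest for the largest z_c
print(np.sum(y))  # the outputs sum to 1 up to floating-point rounding

# Adding a constant to every input leaves the softmax output unchanged.
# Evaluating exp(1000) directly would overflow; the shifted form does not.
y_large = softmax_shifted(np.array([1000.0, 1001.0, 1002.0]))
print(y_large)
```

Because only the differences between the inputs matter, `y_large` should agree with `y` here.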
As the output layer of a neural network, the softmax function can be represented graphically as a layer with $C$ neurons.

We can write the probabilities that the class is $t=c$ for $c = 1 \ldots C$ given input $\mathbf{z}$ as:

$$
\begin{bmatrix} P(t=1|\mathbf{z}) \\ \vdots \\ P(t=C|\mathbf{z}) \end{bmatrix}
= \begin{bmatrix} \varsigma(\mathbf{z})_1 \\ \vdots \\ \varsigma(\mathbf{z})_C \end{bmatrix}
= \frac{1}{\sum_{d=1}^{C} e^{z_d}} \begin{bmatrix} e^{z_1} \\ \vdots \\ e^{z_C} \end{bmatrix}
$$

Where $P(t=c|\mathbf{z})$ is thus the probability that the class is $c$ given the input $\mathbf{z}$.

These probabilities of the output $P(t=1|\mathbf{z})$ for an example system with 2 classes ($t=1$, $t=2$) and input $\mathbf{z} = [z_1, z_2]$ are shown in the figure below. The other probability $P(t=2|\mathbf{z})$ will be complementary.

    def softmax(z):
        """Softmax function"""
        return np.exp(z) / np.sum(np.exp(z))

    # Plot the softmax output for 2 dimensions for both classes
    # Plot the output as a function of the inputs
    # Define a vector of inputs for which we want to plot the output
    nb_of_zs = 33
    zs = np.linspace(-10, 10, num=nb_of_zs)  # input
    zs_1, zs_2 = np.meshgrid(zs, zs)  # generate grid
    y = np.zeros((nb_of_zs, nb_of_zs, 2))  # initialize output
    # Fill the output matrix for each combination of input z's
    for i in range(nb_of_zs):
        for j in range(nb_of_zs):
            y[i, j, :] = softmax(np.asarray([zs_1[i, j], zs_2[i, j]]))
    # Plot the softmax output surface
    with sns.axes_style("whitegrid"):
        fig = plt.figure(figsize=(6, 4))
        ax = fig.add_subplot(1, 1, 1, projection='3d')
        # Plot the output surface for t=1
        surf = ax.plot_surface(
            zs_1, zs_2, y[:, :, 0], linewidth=0, cmap=cm.magma)
        ax.view_init(elev=30, azim=70)
        cbar = fig.colorbar(surf)
        ax.set_xlabel('$z_1$', fontsize=12)
        ax.set_ylabel('$z_2$', fontsize=12)
        ax.set_zlabel('$y_1$', fontsize=12)
        ax.set_title(r'$P(t=1|\mathbf{z})$')
        cbar.ax.set_ylabel(r'$P(t=1|\mathbf{z})$', fontsize=12)
        plt.show()

[Figure: surface plot of $P(t=1|\mathbf{z})$ as a function of $z_1$ and $z_2$, with a color bar for $P(t=1|\mathbf{z})$.]

Derivative of the softmax function

To use the softmax function in neural networks, we need to compute its derivative.
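If we define $\Sigma_C = \sum_{d=1}^{C} e^{z_d}$, so that $y_c = e^{z_c} / \Sigma_C$ for $c = 1 \cdots C$, then this derivative $\partial y_i / \partial z_j$ of the output $\mathbf{y}$ of the softmax function with respect to its input $\mathbf{z}$ can be calculated as:

$$
\text{if } i = j: \quad
\frac{\partial y_i}{\partial z_i}
= \frac{\partial \frac{e^{z_i}}{\Sigma_C}}{\partial z_i}
= \frac{e^{z_i}\Sigma_C - e^{z_i}e^{z_i}}{\Sigma_C^2}
= \frac{e^{z_i}}{\Sigma_C}\,\frac{\Sigma_C - e^{z_i}}{\Sigma_C}
= \frac{e^{z_i}}{\Sigma_C}\left(1 - \frac{e^{z_i}}{\Sigma_C}\right)
= y_i (1 - y_i)
$$

$$
\text{if } i \neq j: \quad
\frac{\partial y_i}{\partial z_j}
= \frac{\partial \frac{e^{z_i}}{\Sigma_C}}{\partial z_j}
= \frac{0 - e^{z_i}e^{z_j}}{\Sigma_C^2}
= -\frac{e^{z_i}}{\Sigma_C}\,\frac{e^{z_j}}{\Sigma_C}
= -y_i y_j
$$

Note that if $i = j$ this derivative is similar to the derivative of the logistic function.

Cross-entropy loss function for the softmax function

To derive the loss function for the softmax function we start out from the likelihood function that a given set of parameters $\theta$ of the model can result in prediction of the correct class of each input sample, as in the derivation for the logistic loss function. The maximization of this likelihood can be written as:

$$\underset{\theta}{\text{argmax}}\; \mathcal{L}(\theta|\mathbf{t},\mathbf{z})$$

The likelihood $\mathcal{L}(\theta|\mathbf{t},\mathbf{z})$ can be rewritten as the joint probability of generating $\mathbf{t}$ and $\mathbf{z}$ given the parameters $\theta$: $P(\mathbf{t},\mathbf{z}|\theta)$, which can be decomposed as a conditional distribution and a marginal:

$$P(\mathbf{t},\mathbf{z}|\theta) = P(\mathbf{t}|\mathbf{z},\theta)\,P(\mathbf{z}|\theta)$$

Since we are not interested in the probability of $\mathbf{z}$ we can reduce this to $\mathcal{L}(\theta|\mathbf{t},\mathbf{z}) = P(\mathbf{t}|\mathbf{z},\theta)$, which can be written as $P(\mathbf{t}|\mathbf{z})$ for fixed $\theta$. Since each $t_c$ is dependent on the full $\mathbf{z}$, and only 1 class can be activated in $\mathbf{t}$, we can write:

$$P(\mathbf{t}|\mathbf{z}) = \prod_{c=1}^{C} P(t_c|\mathbf{z})^{t_c} = \prod_{c=1}^{C} \varsigma(\mathbf{z})_c^{t_c} = \prod_{c=1}^{C} y_c^{t_c}$$

As was noted during the derivation of the loss function of the logistic function, maximizing this likelihood can also be done by minimizing the negative log-likelihood:

$$-\log \mathcal{L}(\theta|\mathbf{t},\mathbf{z}) = \xi(\mathbf{t},\mathbf{z}) = -\log \prod_{c=1}^{C} y_c^{t_c} = -\sum_{c=1}^{C} t_c \cdot \log(y_c)$$

This is the cross-entropy error function $\xi$. Note that for a 2-class system the outputs satisfy $t_2 = 1 - t_1$ and $y_2 = 1 - y_1$, which results in the same error function as for logistic regression: $\xi(\mathbf{t},\mathbf{y}) = -t_1 \log(y_1) - (1 - t_1) \log(1 - y_1)$.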
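To make the cross-entropy formula concrete, the following minimal sketch (assuming NumPy) computes the loss for a one-hot target and checks the two-class reduction to the logistic loss. The function names `softmax` and `cross_entropy` and all example values are chosen for illustration only; the max-shift inside `softmax` is a stabilization that does not change the result.

```python
import numpy as np


def softmax(z):
    """Softmax of a 1-D vector z."""
    exp_z = np.exp(z - np.max(z))  # shift for numerical stability
    return exp_z / np.sum(exp_z)


def cross_entropy(t, y):
    """Cross-entropy -sum_c t_c * log(y_c) for a single sample."""
    return -np.sum(t * np.log(y))


# Three-class example: the true class is the second one.
z = np.array([2.0, 1.0, -1.0])  # arbitrary softmax input
t = np.array([0.0, 1.0, 0.0])   # one-hot target
y = softmax(z)
loss = cross_entropy(t, y)      # equals -log(y[1]) because t is one-hot

# Two-class case: t_2 = 1 - t_1 and y_2 = 1 - y_1, so the loss matches
# the logistic form -t_1 log(y_1) - (1 - t_1) log(1 - y_1).
y2 = softmax(np.array([0.5, -0.5]))
t2 = np.array([1.0, 0.0])
loss_softmax = cross_entropy(t2, y2)
loss_logistic = -t2[0] * np.log(y2[0]) - (1 - t2[0]) * np.log(1 - y2[0])
print(loss, loss_softmax, loss_logistic)
```

With a one-hot target only the term of the true class survives the sum, so the loss is simply the negative log of the probability assigned to the correct class.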
The cross-entropy error function over a batch of multiple samples of size $n$ can be calculated as:

$$\xi(T,Y) = \sum_{i=1}^{n} \xi(\mathbf{t}_i,\mathbf{y}_i) = -\sum_{i=1}^{n} \sum_{c=1}^{C} t_{ic} \cdot \log(y_{ic})$$

Where $t_{ic}$ is 1 if and only if sample $i$ belongs to class $c$, and $y_{ic}$ is the output probability that sample $i$ belongs to class $c$.

Derivative of the cross-entropy loss function for the softmax function

The derivative $\partial \xi / \partial z_i$ of the loss function with respect to the softmax input $z_i$ can be calculated as:

$$
\begin{aligned}
\frac{\partial \xi}{\partial z_i}
&= -\sum_{j=1}^{C} \frac{\partial\, t_j \log(y_j)}{\partial z_i}
= -\sum_{j=1}^{C} t_j \frac{\partial \log(y_j)}{\partial z_i}
= -\sum_{j=1}^{C} t_j \frac{1}{y_j} \frac{\partial y_j}{\partial z_i} \\
&= -\frac{t_i}{y_i} \frac{\partial y_i}{\partial z_i} - \sum_{j \neq i} \frac{t_j}{y_j} \frac{\partial y_j}{\partial z_i}
= -\frac{t_i}{y_i}\, y_i (1 - y_i) - \sum_{j \neq i} \frac{t_j}{y_j} (-y_j y_i) \\
&= -t_i + t_i y_i + \sum_{j \neq i} t_j y_i
= -t_i + y_i \sum_{j=1}^{C} t_j
= y_i - t_i
\end{aligned}
$$

Note that we already derived $\partial y_j / \partial z_i$ for $i = j$ and $i \neq j$ above. The last step uses $\sum_{j=1}^{C} t_j = 1$, since exactly one class is active in $\mathbf{t}$.

The result that $\partial \xi / \partial z_i = y_i - t_i$ for all $i = 1 \cdots C$ is the same as the derivative of the cross-entropy for the logistic function, which had only one output node.

To see the softmax function in action on a minimal neural network, please read part 4 of this series on how to implement a neural network in NumPy.

    # Versions used
    %load_ext watermark
    %watermark --python
    %watermark --iversions

    Python implementation: CPython
    Python version       : 3.9.8
    IPython version      : 7.23.1

    seaborn   : 0.11.1
    numpy     : 1.20.2
    matplotlib: 3.4.2

This post at peterroelants.github.io is generated from an IPython notebook file.
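The two derivative results of this tutorial, $\partial y_i / \partial z_j$ for the softmax and $\partial \xi / \partial z_i = y_i - t_i$ for the cross-entropy, can be sanity-checked numerically. The sketch below assumes NumPy and compares both closed forms against central finite differences. The helper `numerical_gradient`, the step size `eps`, and the example vectors are illustrative choices, not part of the original notebook.

```python
import numpy as np


def softmax(z):
    """Softmax of a 1-D vector z (shifted by max(z) for stability)."""
    exp_z = np.exp(z - np.max(z))
    return exp_z / np.sum(exp_z)


def cross_entropy(t, y):
    """Cross-entropy -sum_c t_c * log(y_c) for a single sample."""
    return -np.sum(t * np.log(y))


def numerical_gradient(f, z, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at z."""
    grad = np.zeros_like(z)
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = eps
        grad[i] = (f(z + step) - f(z - step)) / (2 * eps)
    return grad


z = np.array([0.3, -1.2, 2.0, 0.5])  # arbitrary softmax input
t = np.array([0.0, 0.0, 1.0, 0.0])   # one-hot target
y = softmax(z)

# Softmax Jacobian: J[i, j] = dy_i/dz_j, which is y_i (1 - y_i) on the
# diagonal and -y_i y_j elsewhere, i.e. diag(y) - y y^T.
jacobian_analytic = np.diag(y) - np.outer(y, y)
jacobian_numeric = np.array([
    numerical_gradient(lambda v, i=i: softmax(v)[i], z)  # row i: grad of y_i
    for i in range(z.size)
])

# Gradient of the cross-entropy with respect to z: y - t.
grad_analytic = y - t
grad_numeric = numerical_gradient(lambda v: cross_entropy(t, softmax(v)), z)

print(np.max(np.abs(jacobian_analytic - jacobian_numeric)))
print(np.max(np.abs(grad_analytic - grad_numeric)))
```

Both printed differences should be tiny, limited only by the finite-difference approximation and floating-point rounding.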