1. Introduction
This report explains the training process of a neural network implemented in the provided code. The model consists of a single hidden layer and uses the following key components: a sigmoid activation for the hidden layer, a softmax output trained with the cross-entropy loss, and stochastic gradient descent (SGD) for the parameter updates.
We will also delve into how the derivatives of the loss function are computed with respect to weights and biases, which form the basis of the backpropagation
process.
2. Model Architecture
The network has a single hidden layer. The input features ( X ), with a bias term appended, are mapped to hidden activations ( z ) by the weight matrix ( \alpha ); the hidden activations, again with a bias term appended, are mapped to the predicted class probabilities ( \hat{y} ) by the weight matrix ( \beta ).
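Since the provided code is not reproduced in this report, the following is a minimal sketch of how the two parameter matrices could be laid out, assuming ( D ) input features, ( H ) hidden units, ( K ) output classes, and a bias weight folded into the first column of each matrix; the sizes, names, and uniform initialization are illustrative assumptions rather than details of the provided code.

```python
import numpy as np

# Hypothetical layout of the two parameter matrices (sizes and uniform
# initialization are assumptions, not taken from the provided code).
D, H, K = 4, 8, 3                                 # input features, hidden units, classes
rng = np.random.default_rng(0)
alpha = rng.uniform(-0.1, 0.1, size=(H, D + 1))   # hidden-layer weights, bias in first column
beta = rng.uniform(-0.1, 0.1, size=(K, H + 1))    # output-layer weights, bias in first column
```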
3. Forward Pass
The forward pass computes the predictions for each data sample using the following steps: a bias term is appended to the input features ( X ); the result is multiplied by the hidden-layer weights ( \alpha ) and passed through the sigmoid function to obtain the hidden activations ( z ); a bias term is appended to ( z ); and the result is multiplied by the output-layer weights ( \beta ) and passed through a softmax to obtain the predicted class probabilities ( \hat{y} ).
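A minimal NumPy sketch of these steps is given below; the helper names (add_bias, sigmoid, softmax, forward) and the convention of prepending a column of ones for the bias are illustrative assumptions, not the provided implementation.

```python
import numpy as np

def add_bias(a):
    """Prepend a column of ones so the bias weight is folded into the weight matrix."""
    return np.hstack([np.ones((a.shape[0], 1)), a])

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))  # subtract the row max for numerical stability
    return e / e.sum(axis=1, keepdims=True)

def forward(X, alpha, beta):
    """X: (N, D) inputs; alpha: (H, D+1) hidden weights; beta: (K, H+1) output weights."""
    X_bias = add_bias(X)              # (N, D+1) inputs with bias column
    z = sigmoid(X_bias @ alpha.T)     # (N, H) hidden activations
    z_bias = add_bias(z)              # (N, H+1) hidden activations with bias column
    y_hat = softmax(z_bias @ beta.T)  # (N, K) predicted class probabilities
    return X_bias, z, z_bias, y_hat
```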
4. Loss Function
The cross-entropy loss for a dataset of size ( N ) is: [ \text{Loss} = -\frac{1}{N} \sum_{i=1}^N \log(\hat{y}_{i, y_i}) ] where ( \hat{y}_{i, y_i} ) is the predicted probability of the true class ( y_i ).
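As a point of reference, the loss can be computed directly from the predicted probabilities and the integer class labels; the names y_hat and y below are assumptions carried over from the forward-pass sketch.

```python
import numpy as np

def cross_entropy(y_hat, y):
    """Mean cross-entropy: y_hat is (N, K) predicted probabilities, y is (N,) integer labels."""
    N = y.shape[0]
    return -np.mean(np.log(y_hat[np.arange(N), y]))
```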
5. Backward Pass
The backward pass calculates the gradients of the loss with respect to the weights and biases so that they can be updated during training. The following steps are performed (a code sketch of these steps follows below):
Error at Output Layer: [ \delta_{\text{output}} = \hat{y} - y_{\text{one-hot}} ] where ( y_{\text{one-hot}} ) is the one-hot encoding of the true labels.
Gradient of Output Weights: [ \nabla_{\beta} = \frac{1}{N} \delta_{\text{output}}^T \cdot z_{\text{bias}} ]
Error at Hidden Layer: [ \delta_{\text{hidden}} = (\delta_{\text{output}} \cdot \beta_{\text{no-bias}}) \odot \sigma'(z) ] where ( \beta_{\text{no-bias}} ) excludes the bias weights, and ( \sigma'(z) = z \cdot (1 - z) ) is the derivative of the sigmoid function, written in terms of the hidden activations ( z ) themselves.
Gradient of Hidden Weights: [ \nabla_{\alpha} = \frac{1}{N} \delta_{\text{hidden}}^T \cdot X_{\text{bias}} ]
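The sketch below mirrors these steps, reusing the quantities returned by the forward-pass sketch; the convention that the bias weights occupy the first column of ( \beta ) (so that beta[:, 1:] corresponds to ( \beta_{\text{no-bias}} )) is an assumption, not a detail of the provided code.

```python
import numpy as np

def backward(X_bias, z, z_bias, y_hat, y, beta):
    """Gradients of the mean cross-entropy loss with respect to alpha and beta."""
    N, K = y_hat.shape
    y_one_hot = np.eye(K)[y]                                     # (N, K) one-hot true labels
    delta_output = y_hat - y_one_hot                             # error at the output layer
    grad_beta = delta_output.T @ z_bias / N                      # (K, H+1)
    delta_hidden = (delta_output @ beta[:, 1:]) * z * (1.0 - z)  # error at the hidden layer
    grad_alpha = delta_hidden.T @ X_bias / N                     # (H, D+1)
    return grad_alpha, grad_beta
```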
6. Parameter Updates
Using stochastic gradient descent (SGD), the weights are updated as: [ \alpha \leftarrow \alpha - \eta \nabla_{\alpha}, \quad \beta \leftarrow \beta - \eta
\nabla_{\beta} ] where ( \eta ) is the learning rate.
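Expressed as code, the update is one step per parameter matrix; the function name and the in-place style below are illustrative assumptions.

```python
def sgd_step(alpha, beta, grad_alpha, grad_beta, eta):
    """Move each parameter matrix against its gradient, scaled by the learning rate eta."""
    alpha -= eta * grad_alpha
    beta -= eta * grad_beta
    return alpha, beta
```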
7. Training Process
Each training iteration performs a forward pass, evaluates the cross-entropy loss, runs the backward pass to obtain the gradients, and applies the SGD update to ( \alpha ) and ( \beta ).
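Putting the pieces together, a training loop could look like the following sketch, which reuses the forward, cross_entropy, backward, and sgd_step functions sketched in the previous sections; the number of epochs, the learning rate value, and the use of full-batch updates are assumptions rather than details of the provided code.

```python
# Assumes the forward, cross_entropy, backward, and sgd_step sketches above,
# plus training data X of shape (N, D) and integer labels y of shape (N,).
eta = 0.1          # learning rate (assumed value)
num_epochs = 100   # number of passes over the data (assumed value)

for epoch in range(num_epochs):
    X_bias, z, z_bias, y_hat = forward(X, alpha, beta)                   # forward pass
    loss = cross_entropy(y_hat, y)                                       # track training loss
    grad_alpha, grad_beta = backward(X_bias, z, z_bias, y_hat, y, beta)  # backward pass
    alpha, beta = sgd_step(alpha, beta, grad_alpha, grad_beta, eta)      # parameter update
```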
9. Conclusion
The training process effectively leverages backpropagation to optimize the weights and biases by minimizing the cross-entropy loss. Because each gradient is taken with respect to the loss, every parameter is adjusted in the direction that reduces the classification error over time.