Misc ML Concepts

Batch normalization normalizes the inputs of a layer during training by adjusting for the mini-batch mean and standard deviation, ensuring zero mean and unit variance. It then applies scale and shift transformations to allow the network to learn optimal scaling and bias. This process helps maintain reasonable activation ranges, reducing the vanishing gradient problem and enabling more stable training of deep neural networks.
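
In symbols (a standard textbook formulation; the small constant ε for numerical stability is an implementation detail not stated in the note), an input x_i with mini-batch mean μ_B and variance σ_B² is transformed as

    \hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \varepsilon}}, \qquad y_i = \gamma \, \hat{x}_i + \beta

where γ (scale) and β (shift) are the learned parameters referred to above.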

Batch Normalization

- During training, for each mini-batch of training examples, batch normalization
normalizes the inputs of a specific layer by subtracting the mini-batch mean and
dividing by the mini-batch standard deviation. This normalization step ensures that
the inputs to the layer have zero mean and unit variance.
- After normalization, batch normalization applies a scale and shift transformation
to the normalized inputs. The scale parameter allows the network to learn the
optimal scaling for the normalized inputs, while the shift parameter allows the
network to learn the optimal bias (a minimal code sketch of these two steps follows
this list).
- By normalizing the inputs, batch normalization helps to keep the activations in a
reasonable range throughout the network. This mitigates the vanishing gradient
problem, as gradients are less likely to become extremely small or vanish as they
propagate backward through the layers, enabling more stable and efficient training
of deep neural networks.
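
To make the two steps concrete, here is a minimal NumPy sketch of the batch-normalization
forward pass during training (the function and variable names are illustrative; it omits
the running statistics used at inference time and the backward pass):

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Batch-normalize a mini-batch x of shape (batch_size, features).

    gamma, beta: learnable scale and shift parameters, shape (features,).
    eps: small constant to avoid division by zero (implementation detail).
    """
    # Step 1: normalize each feature using the mini-batch statistics.
    mu = x.mean(axis=0)                    # mini-batch mean, per feature
    var = x.var(axis=0)                    # mini-batch variance, per feature
    x_hat = (x - mu) / np.sqrt(var + eps)  # zero mean, unit variance

    # Step 2: scale and shift, so the network can learn the optimal
    # scaling (gamma) and bias (beta) for the normalized inputs.
    return gamma * x_hat + beta

# Example usage: a random mini-batch of 32 examples with 4 features.
x = np.random.randn(32, 4) * 10 + 3
gamma = np.ones(4)    # scale initialized to 1
beta = np.zeros(4)    # shift initialized to 0
y = batchnorm_forward(x, gamma, beta)
print(y.mean(axis=0))  # approximately 0 per feature
print(y.std(axis=0))   # approximately 1 per feature
```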
