Aiml-Qb - Unit 5
5. Show the perceptron that calculates parity of its three inputs (Nov/Dec 2023)
14. What is stochastic gradient descent and why is it used in the training of neural
networks?
Stochastic Gradient Descent (SGD) is an optimization algorithm that can be used to train
neural network models. It requires the gradient of the loss to be calculated with respect to
each parameter in the model, and these gradients are then used to compute new values for
the parameters.
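A minimal NumPy sketch of a single SGD update is given below; it assumes a simple linear
model y_hat = w·x + b with squared-error loss, and the names sgd_step, w, b and lr are
illustrative rather than part of any standard API.

import numpy as np

# Minimal sketch of one SGD update for a linear model with squared-error loss.
# All names here (sgd_step, w, b, lr) are illustrative.
def sgd_step(w, b, x, y, lr=0.01):
    y_hat = np.dot(w, x) + b      # prediction of the linear model
    error = y_hat - y             # prediction error for this single example
    grad_w = error * x            # gradient of 0.5 * error**2 with respect to w
    grad_b = error                # gradient with respect to b
    w = w - lr * grad_w           # move the parameters against the gradient
    b = b - lr * grad_b
    return w, b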
15. What is stochastic gradient descent and why is it used in the training of neural
networks? (April/May 2024)
Stochastic Gradient Descent (SGD) is an optimization algorithm used to minimize the loss
function during the training of neural networks. It is a variant of gradient descent where,
instead of using the entire dataset to compute the gradient, it uses a single randomly selected
training example (or a small batch) at each step.
Why it is used:
Efficiency: SGD is computationally more efficient because it updates the model parameters
after each training example or small batch, rather than waiting for the entire dataset.
Faster Convergence: Due to its stochastic nature, SGD can escape local minima and find
better solutions, often converging faster than batch gradient descent, especially with large
datasets.
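As a rough illustration of the per-mini-batch updates described above, the following Python
sketch assumes the data fit in NumPy arrays and uses a hypothetical compute_gradients
helper that returns the loss gradients for a batch; it is not a specific library API.

import numpy as np

# Sketch of a mini-batch SGD training loop; compute_gradients is a hypothetical
# helper returning a dict of gradients with the same keys as params.
def train_sgd(params, X, y, compute_gradients, lr=0.01, batch_size=32, epochs=10):
    n = len(X)
    for epoch in range(epochs):
        order = np.random.permutation(n)            # shuffle the examples each epoch
        for start in range(0, n, batch_size):
            batch = order[start:start + batch_size]
            grads = compute_gradients(params, X[batch], y[batch])
            # parameters are updated immediately after every mini-batch,
            # rather than once per pass over the whole dataset
            params = {k: params[k] - lr * grads[k] for k in params}
    return params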
16. What are the disadvantages of stochastic gradient descent?
Although SGD is much faster per update, its convergence path is noisier than that of the
original (full-batch) gradient descent. This is because each step computes only an
approximation of the true gradient from a single example or small batch, so the cost shows
a lot of fluctuations and may oscillate around the minimum rather than settling exactly on it.
17. How do you solve the vanishing gradient problem within a deep neural network?
The vanishing gradient problem is caused by the derivative of the activation function used
in the network: with saturating functions such as the sigmoid, these derivatives are small
and shrink further as they are multiplied back through many layers. The simplest solution
is to replace the activation function of the network: instead of the sigmoid, use an
activation function such as ReLU, whose derivative is 1 for positive inputs.
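The small NumPy sketch below illustrates why this replacement helps: the sigmoid
derivative is at most 0.25 and decays toward zero for large inputs, while the ReLU
derivative is exactly 1 for every positive input (the functions are written out directly here
for illustration).

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)                # never exceeds 0.25, vanishes for large |x|

def relu_grad(x):
    return (x > 0).astype(float)      # 1 for positive inputs, 0 otherwise

x = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
print(sigmoid_grad(x))                # small values, at most 0.25
print(relu_grad(x))                   # only 0s and 1s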
20. Why is ReLU better than Softmax? Give the equation of both (April/May 2024)
ReLU and Softmax serve different purposes, so the choice depends on the task and on
where the function is used in the network. ReLU is generally used in the hidden layers
because it avoids the vanishing gradient problem and is computationally cheap, while
Softmax is used in the final output layer to turn the network's raw scores into a probability
distribution over the classes.
Equations:
ReLU: f(x) = max(0, x)
Softmax: f(z_i) = exp(z_i) / Σ_j exp(z_j), computed for each output unit i
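A minimal NumPy sketch of the two equations above (the max shift inside softmax is only
for numerical stability and does not change the result):

import numpy as np

def relu(x):
    return np.maximum(0.0, x)              # ReLU(x) = max(0, x)

def softmax(z):
    z = z - np.max(z)                      # shift for numerical stability
    e = np.exp(z)
    return e / np.sum(e)                   # outputs are positive and sum to 1

print(relu(np.array([-2.0, 0.5, 3.0])))    # [0.  0.5 3. ]
print(softmax(np.array([1.0, 2.0, 3.0])))  # a probability distribution over 3 classes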
Part – B
1. Draw the architecture of a single layer perceptron (SLP) and explain its operation.
Mention its advantages and disadvantages. (April/May 2024)
2. Draw the architecture of a Multilayer perceptron (MLP) and explain its operation.
Mention its advantages and disadvantages.
3. Explain the stochastic optimization methods for weight determination.
4. Explain the steps in the backpropagation learning algorithm. What is its importance in
designing neural networks? (APRIL/MAY 2023)
5. Explain a deep feedforward network with a neat sketch (APRIL/MAY 2023)
6. Elaborate the process of training hidden layers with ReLU in deep networks (Nov/Dec
2023)
7. Briefly explain hints and the different ways they can be used (Nov/Dec 2023)
8. List the factors that affect the performance of multilayer feed-forward neural network.
9. Difference between a Shallow Net and a Deep Learning Net.
10. How do you tune hyperparameters for better neural network performance? Explain in
detail. (April/May 2024)