Unit - 4
Mathematical Representation
The perceptron computes the weighted sum of the inputs and applies
an activation function to produce the output: y = f(w1*x1 + w2*x2 + ... + wn*xn + b),
where wi are the input weights, b is the bias, and f is the activation
function (a step function in the classic perceptron).
Problem: Classify emails as spam (1) or not spam (0) based on binary
features such as the presence of specific keywords.
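A minimal Python sketch of this computation for the spam example; the keyword
features, weight values, bias, and threshold below are illustrative assumptions,
not values from the text:

import numpy as np

def perceptron(x, w, b):
    # Weighted sum of the inputs plus bias, passed through a step activation.
    z = np.dot(w, x) + b
    return 1 if z >= 0 else 0      # 1 = spam, 0 = not spam

# Illustrative binary features: whether the words "free", "winner", and
# "meeting" appear in the email (assumed features, not from the text).
w = np.array([0.6, 0.8, -0.5])     # assumed learned weights
b = -0.4                           # assumed bias

email = np.array([1, 1, 0])        # contains "free" and "winner"
print(perceptron(email, w, b))     # -> 1 (classified as spam)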
Each neuron in a layer is connected to every neuron in the next layer via
weighted connections. The network learns by adjusting these weights
during training.
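As a concrete illustration, one fully connected layer amounts to a matrix-vector
product followed by an activation function; a minimal NumPy sketch (the layer
sizes and the sigmoid activation are illustrative assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense_forward(x, W, b):
    # Every input feeds every neuron: one matrix-vector product plus biases.
    return sigmoid(W @ x + b)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))        # 3 inputs fully connected to 4 neurons
b = np.zeros(4)
x = np.array([0.5, 0.2, 0.9])
print(dense_forward(x, W, b))      # activations of the 4 neurons in the next layer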
1. Backpropagation Algorithm
Backpropagation is the process of propagating the error backward through
the network to update the weights and biases. It involves the following
steps:
1. Forward Propagation:
o Input data is passed through the network to compute the
output.
o The loss (error) is calculated by comparing the predicted
output with the actual output.
2. Backward Propagation:
o The error is propagated backward through the network.
o The gradients of the loss with respect to the weights and
biases are computed.
o The weights and biases are updated using gradient descent.
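A minimal NumPy sketch of one such training step (forward pass, backward pass
via the chain rule, and a gradient-descent update) for a tiny one-hidden-layer
network with sigmoid activations and squared-error loss; the layer sizes,
example data, and learning rate are illustrative assumptions:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.random(3)                     # one training example (3 features)
y = np.array([1.0])                   # target output

W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # hidden layer: 3 -> 4
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # output layer: 4 -> 1
lr = 0.1                              # learning rate (assumed)

# 1. Forward propagation: compute the output and the loss.
h = sigmoid(W1 @ x + b1)              # hidden activations
y_hat = sigmoid(W2 @ h + b2)          # predicted output
loss = 0.5 * np.sum((y_hat - y) ** 2) # squared-error loss

# 2. Backward propagation: apply the chain rule layer by layer.
delta2 = (y_hat - y) * y_hat * (1 - y_hat)    # dL/dz at the output layer
dW2 = np.outer(delta2, h)
db2 = delta2
delta1 = (W2.T @ delta2) * h * (1 - h)        # dL/dz at the hidden layer
dW1 = np.outer(delta1, x)
db1 = delta1

# Gradient-descent update of the weights and biases.
W2 -= lr * dW2; b2 -= lr * db2
W1 -= lr * dW1; b1 -= lr * db1
print(loss)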
2. Strengths of Backpropagation
1. Efficiency:
o Backpropagation efficiently computes gradients using the
chain rule of calculus, making it suitable for training large
neural networks.
2. Versatility:
o It can be applied to various network architectures, including
feedforward networks, convolutional neural networks (CNNs),
and recurrent neural networks (RNNs).
3. Scalability:
o Backpropagation scales well with the size of the dataset and
the complexity of the model.
4. Automatic Feature Learning:
o It enables neural networks to automatically learn hierarchical
features from raw data, reducing the need for manual feature
engineering.
5. Wide Applicability:
o It is used in a wide range of applications, including image
recognition, natural language processing, and speech
recognition.
3. Limitations of Backpropagation
1. Vanishing/Exploding Gradients:
o In deep networks, gradients can become very small or very large, slowing
or destabilizing training.
2. Local Minima and Saddle Points:
o Gradient descent may settle in poor regions of the loss surface rather
than the global minimum.
3. Data and Hyperparameter Sensitivity:
o Training typically requires large labeled datasets and careful tuning of
the learning rate, initialization, and other settings.
4. Practical Considerations
1. Learning Rate:
o The learning rate determines the size of the weight updates.
o Too high: The algorithm may diverge.
o Too low: The algorithm may converge slowly.
o Solution: Use learning rate schedules or adaptive learning rate
methods (e.g., Adam).
2. Batch Size:
o The batch size affects the stability and speed of training.
o Smaller batches: noisier gradient estimates but more frequent weight
updates.
o Larger batches: smoother gradient estimates but higher memory use and
fewer updates per epoch.
o Solution: Experiment with different batch sizes.
3. Weight Initialization:
o Proper weight initialization is crucial for effective training.
o Poor initialization can lead to vanishing or exploding gradients.
o Solution: Use techniques like Xavier or He initialization.
4. Regularization:
o Regularization techniques prevent overfitting and improve
generalization.
o Common methods: L2 regularization (weight decay), dropout,
and early stopping.
5. Monitoring Training:
o Monitor training and validation loss to detect overfitting or
underfitting.
o Use techniques like learning curves and validation checks (a combined
training sketch covering these considerations follows this list).
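A minimal PyTorch-style training sketch combining these considerations: He
(Kaiming) weight initialization, dropout and L2 regularization via weight decay,
the Adam optimizer with a step learning-rate schedule, and monitoring of
validation loss for early stopping. The architecture, toy data tensors, and
hyperparameter values are illustrative assumptions, not prescribed settings.

import torch
import torch.nn as nn

# Assumed toy data: 200 training and 50 validation samples with 10 features.
X_train, y_train = torch.randn(200, 10), torch.randint(0, 2, (200,)).float()
X_val, y_val = torch.randn(50, 10), torch.randint(0, 2, (50,)).float()

model = nn.Sequential(
    nn.Linear(10, 32), nn.ReLU(),
    nn.Dropout(p=0.2),                 # dropout regularization
    nn.Linear(32, 1),
)

# Weight initialization: He (Kaiming) for the ReLU layers.
for layer in model:
    if isinstance(layer, nn.Linear):
        nn.init.kaiming_normal_(layer.weight, nonlinearity="relu")
        nn.init.zeros_(layer.bias)

loss_fn = nn.BCEWithLogitsLoss()
# Adam with L2 regularization (weight decay) and a decaying learning rate.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)

best_val, patience, bad_epochs = float("inf"), 10, 0
for epoch in range(100):
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(X_train).squeeze(1), y_train)
    loss.backward()                    # backpropagation
    optimizer.step()
    scheduler.step()

    # Monitor validation loss to detect overfitting (early stopping).
    model.eval()
    with torch.no_grad():
        val_loss = loss_fn(model(X_val).squeeze(1), y_val).item()
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            print(f"Early stopping at epoch {epoch}, best val loss {best_val:.4f}")
            break

For clarity the sketch trains on the full dataset each epoch; in practice the
data would be split into mini-batches (e.g., with torch.utils.data.DataLoader)
so that the batch-size trade-offs above can be explored.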
Example Application: Employee Face Recognition with an ANN
Steps:
1. Data Collection:
o Collect facial images of all employees.
2. Data Preprocessing:
o Resize images to 64x64 pixels and convert to
grayscale.
o Normalize pixel values to [0, 1] (see the preprocessing sketch after
these steps).
3. Model Training:
o Train the ANN model on the preprocessed dataset.
4. Deployment:
o Deploy the trained model to a real-time system.
o Use a camera to capture facial images and predict the
employee's identity.
5. Monitoring:
o Continuously monitor the system's performance and retrain
the model as needed.
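A minimal sketch of the preprocessing in step 2 (grayscale conversion, 64x64
resize, normalization to [0, 1]) using OpenCV; the file name and the flattening
into a 4096-element input vector are illustrative assumptions:

import cv2
import numpy as np

def preprocess_face(image_path):
    # Load a facial image, convert to grayscale, resize to 64x64,
    # and normalize pixel values to [0, 1].
    img = cv2.imread(image_path)                      # BGR image from disk
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)      # convert to grayscale
    resized = cv2.resize(gray, (64, 64))              # resize to 64x64 pixels
    return resized.astype(np.float32) / 255.0         # normalize to [0, 1]

# Illustrative usage (the path is a placeholder, not from the text):
features = preprocess_face("employee_001.jpg").flatten()   # 4096 inputs to the ANN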