Deep Learning Module-02 - Search Creators
Module-02
• A feedforward neural network is the simplest form of artificial neural network (ANN)
• Information moves in only one direction: forward, from input nodes through hidden nodes to output nodes
1. Origins
2. Evolution
1. Input Layer
o No computation performed
2. Hidden Layers
3. Output Layer
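A minimal NumPy sketch of this layer structure (the 4-8-3 layer sizes, the parameter-dictionary layout, and the one-column-per-example convention are illustrative assumptions, not taken from the notes):
import numpy as np

# Illustrative sizes: 4 input features, one hidden layer of 8 units, 3 outputs
layer_sizes = [4, 8, 3]

# Layer l gets a weight matrix W_l of shape (n_l, n_{l-1}) and a bias b_l of shape (n_l, 1)
params = {}
for l in range(1, len(layer_sizes)):
    params[f"W{l}"] = np.random.randn(layer_sizes[l], layer_sizes[l - 1]) * 0.01
    params[f"b{l}"] = np.zeros((layer_sizes[l], 1))

print({name: p.shape for name, p in params.items()})
# {'W1': (8, 4), 'b1': (8, 1), 'W2': (3, 8), 'b2': (3, 1)}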
Activation Functions
1. Sigmoid (Logistic)
o Range: [0,1]
o Properties:
▪ Smooth gradient
2. Tanh (Hyperbolic Tangent)
o Range: [-1,1]
o Properties:
▪ Zero-centered
▪ Stronger gradients
3. ReLU (Rectified Linear Unit)
o Properties:
▪ Computationally efficient
4. Leaky ReLU
o Properties:
▪ Allows a small, non-zero gradient for negative inputs, avoiding dead neurons
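The four activations above can be written directly in NumPy; a sketch (the function names and the 0.01 Leaky ReLU slope are common conventions, not fixed by the notes):
import numpy as np

def sigmoid(z):
    # Logistic function: smooth gradient, output squashed into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Hyperbolic tangent: zero-centered output in (-1, 1), stronger gradients than sigmoid
    return np.tanh(z)

def relu(z):
    # Rectified linear unit: max(0, z), computationally cheap
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    # Keeps a small slope alpha for negative inputs instead of zeroing them
    return np.where(z > 0, z, alpha * z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(z), tanh(z), relu(z), leaky_relu(z), sep="\n")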
2. Gradient-Based Learning
1. Definition
2. Properties
1. Mean Squared Error (MSE)
o Properties:
▪ Always positive
▪ Differentiable
2. Cross-Entropy Loss
o Properties:
3. Huber Loss
o Formula: L_δ = ½(y - ŷ)² if |y - ŷ| ≤ δ, else δ(|y - ŷ| - ½δ)
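A sketch of the three losses in NumPy (binary 0/1 targets are assumed for cross-entropy, and δ = 1.0 for Huber is only an illustrative default):
import numpy as np

def mse(y, y_hat):
    # Mean squared error: always non-negative and differentiable everywhere
    return np.mean((y - y_hat) ** 2)

def binary_cross_entropy(y, y_hat, eps=1e-12):
    # Cross-entropy for 0/1 targets; eps guards against log(0)
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def huber(y, y_hat, delta=1.0):
    # Quadratic for small errors, linear for large errors (robust to outliers)
    a = y - y_hat
    quadratic = 0.5 * a ** 2
    linear = delta * (np.abs(a) - 0.5 * delta)
    return np.mean(np.where(np.abs(a) <= delta, quadratic, linear))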
a) Gradient Descent (SGD)
o Formula: θ = θ - α∇J(θ)
b) RMSprop
c) Momentum
o Reduces oscillation
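The three update rules sketched as single NumPy steps (the learning rates, β = 0.9, and ε = 1e-8 are common defaults, not values from the notes):
import numpy as np

def gradient_descent_step(theta, grad, lr=0.01):
    # Plain update: theta = theta - alpha * grad
    return theta - lr * grad

def momentum_step(theta, grad, v, lr=0.01, beta=0.9):
    # Velocity accumulates past gradients, which damps oscillation
    v = beta * v - lr * grad
    return theta + v, v

def rmsprop_step(theta, grad, s, lr=0.001, rho=0.9, eps=1e-8):
    # Scale each step by a running average of squared gradients
    s = rho * s + (1 - rho) * grad ** 2
    return theta - lr * grad / (np.sqrt(s) + eps), s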
1. Mathematical Basis
1. Input Processing
o Data normalization
o Weight initialization
o Bias addition
2. Layer Computation
Z = np.dot(W, A) + b # Linear transformation (matrix product; assumes NumPy as np)
3. Output Generation
o Prediction computation
o Error calculation
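Putting the three steps together, a forward pass for one hidden layer might look like the following sketch (the ReLU hidden layer, sigmoid output, and one-column-per-example shapes are illustrative assumptions):
import numpy as np

def forward_pass(X, params):
    # X has shape (n_features, m), one column per example
    W1, b1, W2, b2 = params["W1"], params["b1"], params["W2"], params["b2"]

    Z1 = np.dot(W1, X) + b1          # linear transformation of the (normalized) inputs
    A1 = np.maximum(0.0, Z1)         # ReLU hidden activation
    Z2 = np.dot(W2, A1) + b2         # linear transformation of the hidden activations
    A2 = 1.0 / (1.0 + np.exp(-Z2))   # sigmoid output = prediction

    cache = (X, Z1, A1, Z2, A2)      # kept for error calculation and the backward pass
    return A2, cache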
1. Error Calculation
2. Weight Updates
3. Detailed Steps
# Output layer (assumes NumPy as np; error term for an MSE-style output)
dZ = A - Y
dW = (1/m) * np.dot(dZ, A_prev.T)
db = (1/m) * np.sum(dZ, axis=1, keepdims=True)
# Hidden layers: dA is the gradient passed back from the layer above, np.dot(W_next.T, dZ_next)
dZ = dA * activation_derivative(Z)
dW = (1/m) * np.dot(dZ, A_prev.T)
db = (1/m) * np.sum(dZ, axis=1, keepdims=True)
4.1 L1 Regularization
1. Mathematical Form
o Formula: L1 = λΣ|w|
o Promotes sparsity
2. Properties
4.2 L2 Regularization
1. Mathematical Form
o Formula: L2 = λΣw²
2. Properties
o No sparse solutions
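Both penalties are plain sums over the weight matrices; a sketch of computing them so they can be added to the data loss (the λ value and the "weights as a list of matrices" layout are assumptions):
import numpy as np

def l1_penalty(weights, lam=0.01):
    # L1 = lambda * sum |w|  -> pushes many weights to exactly zero (sparsity)
    return lam * sum(np.sum(np.abs(W)) for W in weights)

def l2_penalty(weights, lam=0.01):
    # L2 = lambda * sum w^2  -> shrinks weights smoothly, no sparse solutions
    return lam * sum(np.sum(W ** 2) for W in weights)

# total_loss = data_loss + l1_penalty(weights)   # or + l2_penalty(weights)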
4.3 Dropout
1. Basic Concept
2. Implementation Details
A = A * mask # mask is a random binary matrix that zeroes out dropped units
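A sketch of the usual inverted-dropout form of this idea (keep_prob is a hyperparameter; dividing by keep_prob keeps the expected activation unchanged, so nothing needs rescaling at test time):
import numpy as np

def dropout_forward(A, keep_prob=0.8):
    # Zero out each unit with probability (1 - keep_prob) during training only
    mask = np.random.rand(*A.shape) < keep_prob
    A = A * mask          # drop the selected activations
    A = A / keep_prob     # inverted dropout: rescale the survivors
    return A, mask        # mask is reused in the backward pass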
1. Implementation
2. Benefits
o Prevents overfitting
5. Advanced Concepts
1. Purpose
o Speeds up training
2. Algorithm
Weight Initialization
1. Xavier/Glorot Initialization
2. He Initialization
o Variance = 2/n_in
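A sketch of both schemes (the function names are mine; Xavier is shown in its 2/(n_in + n_out) form, He in the 2/n_in form noted above):
import numpy as np

def xavier_init(fan_in, fan_out):
    # Xavier/Glorot: variance ≈ 2 / (n_in + n_out), common for tanh/sigmoid layers
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return np.random.randn(fan_out, fan_in) * std

def he_init(fan_in, fan_out):
    # He: variance = 2 / n_in, suited to ReLU layers
    std = np.sqrt(2.0 / fan_in)
    return np.random.randn(fan_out, fan_in) * std

W1 = he_init(fan_in=4, fan_out=8)   # e.g. weights for a ReLU hidden layer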
6. Practical Implementation
1. Architecture Choices
o Number of layers
o Activation functions
2. Hyperparameter Selection
o Learning rate
o Batch size
o Regularization strength
1. Data Preparation
o Splitting data
o Normalization
o Augmentation
2. Training Loop
o Forward pass
o Loss computation
o Backward pass
o Parameter updates
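A minimal end-to-end sketch of this loop for a one-hidden-layer binary classifier (the layer sizes, ReLU/sigmoid choice, learning rate, and epoch count are illustrative assumptions):
import numpy as np

def train(X, Y, hidden=8, lr=0.1, epochs=1000):
    # X: (n_features, m) normalized inputs, Y: (1, m) with 0/1 labels
    n_x, m = X.shape
    W1 = np.random.randn(hidden, n_x) * np.sqrt(2.0 / n_x); b1 = np.zeros((hidden, 1))
    W2 = np.random.randn(1, hidden) * np.sqrt(2.0 / hidden); b2 = np.zeros((1, 1))

    for epoch in range(epochs):
        # Forward pass
        Z1 = np.dot(W1, X) + b1
        A1 = np.maximum(0.0, Z1)              # ReLU hidden layer
        Z2 = np.dot(W2, A1) + b2
        A2 = 1.0 / (1.0 + np.exp(-Z2))        # sigmoid output

        # Loss computation (binary cross-entropy)
        eps = 1e-12
        loss = -np.mean(Y * np.log(A2 + eps) + (1 - Y) * np.log(1 - A2 + eps))

        # Backward pass
        dZ2 = A2 - Y
        dW2 = np.dot(dZ2, A1.T) / m
        db2 = np.sum(dZ2, axis=1, keepdims=True) / m
        dA1 = np.dot(W2.T, dZ2)
        dZ1 = dA1 * (Z1 > 0)                  # ReLU derivative
        dW1 = np.dot(dZ1, X.T) / m
        db1 = np.sum(dZ1, axis=1, keepdims=True) / m

        # Parameter updates (plain gradient descent)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    return W1, b1, W2, b2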
1. Basic Concepts
2. Mathematical Problems
3. Implementation Challenges
1. Activation Functions
2. Loss Functions
3. Regularization
o L1 = λΣ|w|
o L2 = λΣw²
4. Gradient Descent
o Update: w = w - α∇J(w)
o Momentum: v = βv - α∇J(w), then w = w + v
1. Vanishing Gradients
2. Overfitting
o Add dropout
o Use regularization
3. Poor Convergence