Module 1 DL
A biological neuron has three main structural parts:
1. Cell Body (Soma): The cell body contains the nucleus and is the neuron's control center. It
processes incoming signals and maintains cell health.
2. Dendrites: These are branching structures that receive signals from other neurons and convey
them toward the cell body. Dendrites increase the neuron's surface area, allowing it to connect
with many other neurons.
3. Axon: The axon is a long, thin projection that transmits signals away from the cell body to other
neurons, muscles, or glands. The axon often ends in a structure called the axon terminal, where
neurotransmitters are released.
Key characteristics of biological neurons:
• Efficient Information Processing: Neurons can process complex information quickly, enabling fast
responses.
• Plasticity: Neurons adapt by strengthening or weakening connections, supporting learning and
memory.
• Vulnerability to Damage: Neurons are sensitive and can be damaged by injury, disease, or aging.
• Energy-Intensive: Neurons require a lot of energy to maintain their function.
McCulloch-Pitts Neuron
The MCP neuron, named after Warren McCulloch and Walter Pitts, is a fundamental concept in the
history of artificial neurons and neural networks. It is a simplified model of the biological neuron.
An MCP neuron is a simplified version of a biological neuron: it receives binary inputs (0 or 1) and generates
a binary output. The inputs are weighted equally (all weights set to 1), and the neuron's output is
determined by computing the weighted sum of its inputs and comparing it to a threshold. If the weighted sum is
greater than or equal to the threshold, the neuron outputs 1; otherwise, it outputs 0.
The structure of an MCP neuron is elegantly straightforward. It comprises input connections, each
associated with a weight representing its importance, a summation function that aggregates these weighted
inputs, and an activation function that determines the neuron's output based on the summation result.
One of the key features of the MCP neuron is its activation threshold. If the summation of weighted inputs
exceeds this threshold, the neuron “fires,” producing an output signal. Otherwise, it remains inactive. This
binary nature of output — either firing or not firing — mirrors the basic behavior of biological neurons.
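A minimal sketch of an MCP neuron in Python; the specific threshold values and the AND/OR demonstration are illustrative assumptions, not part of the original formulation:

```python
def mcp_neuron(inputs, threshold):
    """McCulloch-Pitts neuron: binary inputs, unit weights, hard threshold."""
    # All weights are 1, so the weighted sum is simply the sum of the inputs.
    weighted_sum = sum(inputs)
    # Fire (output 1) if the sum reaches the threshold, otherwise stay inactive (output 0).
    return 1 if weighted_sum >= threshold else 0

# Example: with two inputs, threshold 2 behaves like AND and threshold 1 like OR.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "AND:", mcp_neuron([x1, x2], 2), "OR:", mcp_neuron([x1, x2], 1))
```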
A multi-layer perceptron, also known as an MLP, is a neural network with multiple layers. It is built from
fully connected (dense) layers, which can transform an input of any dimension to the desired output
dimension. To create such a network, we combine neurons so that the outputs of some neurons are the
inputs of other neurons.
A multi-layer perceptron has one input layer with one neuron (or node) per input, one output layer with a
single node per output, and any number of hidden layers, each of which can have any number of nodes. A
schematic diagram of a Multi-Layer Perceptron (MLP) is depicted below.
In the multi-layer perceptron diagram above, there are three inputs and thus three input nodes, and the
hidden layer has three nodes. The output layer gives two outputs, so there are two output nodes. The nodes
in the input layer take the input and forward it for further processing: each input node forwards its output
to each of the three nodes in the hidden layer, and in the same way the hidden layer processes the
information and passes it to the output layer.
Every node in the multi-layer perceptron uses a sigmoid activation function. The sigmoid activation
function takes real values as input and converts them to numbers between 0 and 1 using the sigmoid
formula:
σ(x) = 1 / (1 + exp(-x))
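A small sketch of the sigmoid activation and a single forward pass through an MLP with the layer sizes described above (3 inputs, 3 hidden nodes, 2 outputs); the random weights and the use of NumPy are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    # sigma(x) = 1 / (1 + exp(-x)); squashes any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Layer sizes matching the description: 3 inputs -> 3 hidden nodes -> 2 outputs.
W1, b1 = rng.normal(size=(3, 3)), np.zeros(3)   # input -> hidden
W2, b2 = rng.normal(size=(3, 2)), np.zeros(2)   # hidden -> output

x = np.array([0.5, -1.2, 3.0])                  # one example with 3 features
hidden = sigmoid(x @ W1 + b1)                   # hidden layer activations
output = sigmoid(hidden @ W2 + b2)              # two output values, each in (0, 1)
print(output)
```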
Deep Networks: Fundamentals
Deep networks, also known as deep neural networks (DNNs), are complex architectures in artificial
intelligence and machine learning composed of multiple layers of interconnected neurons. They are an
extension of simple neural networks, designed to capture intricate patterns and relationships in data,
making them highly effective for tasks like image recognition, natural language processing, and game
playing.
1. Architecture:
o Input Layer: Receives raw data (e.g., images, text) and passes it to the network.
o Hidden Layers: Multiple layers (often many, in deep networks) that transform inputs into more
abstract features. These layers enable deep networks to learn complex representations, which is the
foundation of their strength.
o Output Layer: Produces the final prediction or classification.
2. Activation Functions:
o These introduce non-linearity, allowing the network to model complex relationships in the data.
Common functions include ReLU (Rectified Linear Unit), Sigmoid, and Tanh.
3. Forward and Backward Propagation:
o Forward Propagation: Input data moves forward through the network, layer by layer, until it reaches
the output.
o Backward Propagation (Backprop): A process for adjusting weights in response to the error in the
output. It computes the gradient of the error with respect to each weight, which gradient descent then
uses to update the weights and minimize prediction errors.
4. Loss Function:
o Measures the difference between predicted and actual outputs. Common choices are Mean Squared
Error (MSE) for regression tasks and Cross-Entropy Loss for classification tasks.
5. Optimization and Training:
o Uses optimization algorithms, like gradient descent and its variants (e.g., Adam, RMSprop), to
minimize the loss function iteratively and refine weights.
6. Regularization:
o Techniques like dropout and L2 regularization help prevent overfitting by adding constraints or noise
during training.
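A minimal sketch tying these pieces together: a two-layer network trained with forward propagation, a mean-squared-error loss, manual backpropagation, and plain gradient descent. The toy data, layer sizes, and learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: learn y = 2*x1 - x2 (an assumed target function).
X = rng.normal(size=(64, 2))
y = (2 * X[:, 0] - X[:, 1]).reshape(-1, 1)

# Two-layer network: 2 inputs -> 8 hidden units (ReLU) -> 1 output.
W1, b1 = rng.normal(scale=0.5, size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.5, size=(8, 1)), np.zeros(1)
lr = 0.05  # learning rate

for epoch in range(200):
    # Forward propagation: input -> hidden (ReLU) -> output.
    h_pre = X @ W1 + b1
    h = np.maximum(h_pre, 0.0)
    y_hat = h @ W2 + b2

    # Loss function: mean squared error between prediction and target.
    loss = np.mean((y_hat - y) ** 2)

    # Backward propagation: gradients of the loss with respect to each weight.
    d_yhat = 2 * (y_hat - y) / len(X)
    dW2 = h.T @ d_yhat
    db2 = d_yhat.sum(axis=0)
    dh = d_yhat @ W2.T
    dh_pre = dh * (h_pre > 0)        # ReLU gradient
    dW1 = X.T @ dh_pre
    db1 = dh_pre.sum(axis=0)

    # Gradient descent update: step each weight against its gradient.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final loss:", loss)
```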
Single-Layer Perceptron
A single-layer perceptron (SLP) is a feed-forward network based on a threshold transfer function. The SLP is
the simplest type of artificial neural network and can only classify linearly separable cases with a binary target.
Activation functions are mathematical equations that determine the output of a neural network. The
function is attached to each neuron in the network, and determines whether it should be activated or not,
based on whether each neuron’s input is relevant for the model’s prediction.
A single-layer perceptron is a type of perceptron that is limited to learning linearly separable patterns. It is
effective for tasks where the data can be divided into distinct categories by a straight line. While
powerful in its simplicity, it struggles with more complex problems where the relationship between
inputs and outputs is non-linear.
A perceptron is a linear classifier; that is, it is an algorithm that classifies input by separating two categories
with a straight line. Input is typically a feature vector x multiplied by weights w and added to a bias b:
y = f(w · x + b), where f is the threshold (step) function.
A single-layer perceptron does not include hidden layers, which allow neural networks to model a feature
hierarchy. It is, therefore, a shallow neural network, which prevents it from performing non-linear
classification.
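A minimal sketch of a single-layer perceptron with the classic perceptron learning rule, trained on a linearly separable toy problem (logical AND); the learning rate and the number of epochs are illustrative assumptions:

```python
import numpy as np

# Linearly separable toy data: logical AND of two binary inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

w = np.zeros(2)   # weights
b = 0.0           # bias
lr = 0.1          # learning rate

def predict(x):
    # Threshold (step) transfer function applied to w . x + b.
    return 1 if np.dot(w, x) + b >= 0 else 0

# Perceptron learning rule: nudge the weights toward correcting each mistake.
for epoch in range(20):
    for xi, target in zip(X, y):
        error = target - predict(xi)
        w = w + lr * error * xi
        b = b + lr * error

print([predict(xi) for xi in X])  # expected: [0, 0, 0, 1]
```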
Basic Terminologies of Deep Learning
Neuron: The basic unit in a neural network, similar to a neuron in the brain. It receives inputs, processes
them with weights, and passes the result through an activation function to produce an output.
Layer: A group of neurons that process data at the same stage of a neural network. Layers include the
input layer, hidden layers, and the output layer. Hidden layers allow deep networks to learn complex
patterns.
Activation Function: A function applied to the output of each neuron to introduce non-linearity, helping
the network learn complex patterns. Common activation functions include ReLU (Rectified Linear Unit),
Sigmoid, and Tanh.
Forward Propagation: The process of passing input data through the network layers to generate
predictions or outputs.
Loss Function (Cost Function): A function that measures the difference between the predicted output
and the actual target output. Examples include Mean Squared Error (MSE) for regression and Cross-
Entropy Loss for classification.
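Short illustrative implementations of the two loss functions mentioned above; the use of NumPy and the sample values are assumptions:

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean Squared Error: average squared difference, used for regression.
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Cross-Entropy Loss for binary classification; eps avoids log(0).
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(mse(np.array([1.0, 2.0]), np.array([1.5, 1.0])))            # 0.625
print(binary_cross_entropy(np.array([1, 0]), np.array([0.9, 0.2])))
```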
Backward Propagation (Backpropagation): The algorithm for updating weights by calculating the
gradient of the loss function with respect to each weight. It involves moving backward from the output layer
to the input layer to minimize prediction error.
Gradient Descent: An optimization algorithm that adjusts weights to minimize the loss function by
iteratively moving in the direction of the steepest decrease in loss.
Learning Rate: A hyperparameter that determines the step size in weight adjustments during training. A
smaller rate leads to slower but more stable convergence, while a larger rate may speed up training but
risks overshooting the optimal solution.
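A tiny sketch of gradient descent on a one-dimensional loss, showing how the learning rate sets the step size; the loss function, starting point, and learning rates are illustrative assumptions:

```python
# Minimize loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3); the minimum is at w = 3.
def gradient(w):
    return 2 * (w - 3)

for lr in (0.01, 0.1, 1.1):        # small, moderate, and too-large learning rates
    w = 0.0
    for step in range(50):
        w -= lr * gradient(w)      # move against the gradient
    # A small lr converges slowly, a moderate lr reaches ~3, and lr=1.1 overshoots and diverges.
    print(f"lr={lr}: w={w:.3f}")
```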
Epoch: One complete pass of the entire training dataset through the neural network. Training typically
involves multiple epochs to improve model accuracy.
Overfitting: A scenario where the model performs well on training data but poorly on new, unseen data.
Overfitting occurs when the model learns noise in the training data instead of general patterns.
Underfitting: The opposite of overfitting, where the model fails to capture underlying patterns in the
data, resulting in poor performance on both training and test data.
Regularization: Techniques like L2 regularization and dropout are used to prevent overfitting by
introducing constraints or noise to the model during training.
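A brief sketch of how L2 regularization adds a weight penalty to the loss, and how dropout randomly zeroes activations during training; the penalty strength, dropout rate, and use of NumPy are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_regularized_loss(data_loss, weights, lam=0.01):
    # L2 regularization: add lam * sum of squared weights to the data loss,
    # discouraging large weights and thereby reducing overfitting.
    return data_loss + lam * np.sum(weights ** 2)

def dropout(activations, rate=0.5):
    # Dropout: randomly zero a fraction of activations during training,
    # scaling the survivors so the expected activation stays the same.
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

h = np.array([0.2, 1.5, -0.7, 0.9])
print(l2_regularized_loss(0.35, h))
print(dropout(h))
```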
Convolutional Neural Network (CNN): A type of deep learning model commonly used for image data.
CNNs use convolutional layers that process image pixels in local patches, detecting patterns like edges and
shapes.
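A minimal sketch of the local-patch idea behind a convolutional layer: sliding a small filter over an image and taking dot products. The image, the edge-detecting filter values, and the use of NumPy are illustrative assumptions:

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Slide the kernel over every local patch of the image ("valid" padding)
    # and take the dot product, producing one feature-map value per patch.
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
vertical_edge = np.array([[1, -1],
                          [1, -1]], dtype=float)  # responds strongly at vertical edges
print(conv2d_valid(image, vertical_edge))         # large magnitude where the edge sits
```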
Recurrent Neural Network (RNN): A type of network used for sequential data, like time series or natural
language. RNNs retain information from previous steps, making them suitable for tasks where context is
essential.
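A minimal sketch of a single recurrent step, where the hidden state carries information from previous steps in the sequence; the matrix sizes, random weights, and use of NumPy are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

input_size, hidden_size = 3, 4
W_x = rng.normal(scale=0.5, size=(hidden_size, input_size))   # input -> hidden
W_h = rng.normal(scale=0.5, size=(hidden_size, hidden_size))  # hidden -> hidden (recurrence)
b = np.zeros(hidden_size)

sequence = rng.normal(size=(5, input_size))  # 5 time steps of 3 features each
h = np.zeros(hidden_size)                    # hidden state starts empty

for x_t in sequence:
    # The new hidden state mixes the current input with the previous state,
    # so earlier steps in the sequence influence later outputs.
    h = np.tanh(W_x @ x_t + W_h @ h + b)

print(h)  # final hidden state summarizes the whole sequence
```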