AI Module 5

Module 5 discusses Artificial Neural Networks (ANNs), which are inspired by biological neurons and consist of interconnected nodes that process data. It covers the structure and functioning of biological neurons, the mapping to artificial neurons, and various types of ANNs including Single-Layer Feedforward, Multi-Layer Perceptrons, and Convolutional Neural Networks. The document emphasizes the importance of activation functions, learning algorithms like backpropagation, and the applications of ANNs in fields such as image recognition and medical diagnosis.


Module 5: Artificial Neural Network

Biological Neurons (The Inspiration for ANN)


Introduction

Biological neurons are the fundamental units of the human nervous system and serve as the
inspiration for artificial neural networks. Understanding how biological neurons work helps
us grasp the logic behind artificial neurons, which form the basis of machine learning
models like ANNs.

Structure of a Biological Neuron

A biological neuron consists of the following main parts:

1. Dendrites

o Receive electrical signals (inputs) from other neurons.

2. Cell Body (Soma)

o Processes incoming signals and decides whether to transmit the signal further.

3. Axon

o A long fiber that transmits electrical impulses to other neurons.

4. Synapse

o The junction between two neurons where signal transmission occurs via
neurotransmitters.

Working of a Biological Neuron

1. Signals are received through dendrites from other neurons.

2. If the signal strength exceeds a threshold, the cell body activates.

3. The signal is passed through the axon to the next neuron.

4. At the synapse, neurotransmitters are released, passing the signal to the receiving
neuron's dendrites.
Biological to Artificial Mapping

Biological Neuron Part | Equivalent in Artificial Neuron
Dendrites | Inputs
Synapse | Weights
Soma (Cell Body) | Summation function
Activation potential | Threshold/Activation function
Axon | Output

Key Characteristics of Biological Neurons

• Neurons communicate using electrochemical signals.

• Learning in the brain is believed to occur through synaptic plasticity: adjusting the strength of connections (similar to adjusting weights in an ANN).

• The human brain has ~86 billion neurons connected via trillions of synapses; ANNs mimic this structure at a much smaller scale.

Why Are Biological Neurons Important in AI?

• They inspired the computational model of artificial neurons.

• Provided the idea of learning through connections and feedback.

• The concepts of thresholds, activation, and connectivity all originate from neuroscience.

Real-World Analogy

When you touch something hot:

• Sensory neurons (input) detect heat.

• The brain (processing) interprets the signal.

• Motor neurons (output) command the hand to pull away.

This biological flow—input → processing → output—is the core principle behind all neural
network models.

Biological neurons are the foundation of all neural processing in the brain. Their structure
and behavior inspired the design of artificial neurons used in machine learning. Though
artificial models are simplified, they capture the essence of biological learning: using inputs,
adjusting weights, and generating outputs to adapt and learn.

What is an Artificial Neural Network?


An Artificial Neural Network (ANN) is a computational model inspired by the human
brain’s biological neural networks. Just like the brain processes information using a
network of neurons, ANNs are made up of interconnected nodes (called artificial neurons)
that process data and learn to make predictions or decisions.

ANNs are the backbone of deep learning and have been successfully applied to tasks like
image recognition, speech processing, natural language understanding, and medical
diagnosis.

Key Concepts

1. Biological Inspiration

• The human brain contains billions of neurons that communicate through synapses.

• Similarly, an ANN consists of artificial neurons that are connected and pass
information between each other.

Analogy:

• Biological neuron → Artificial neuron (node)

• Synapse → Connection (with weight)

• Firing signal → Activation function output

2. Structure of an ANN

A basic ANN has the following structure:

1. Input Layer – Takes raw input features (e.g., pixels, sensor data).

2. Hidden Layer(s) – Processes input data using weighted connections and activation
functions.

3. Output Layer – Produces final predictions or classifications.

Diagram (Conceptual):

Input Layer → Hidden Layer(s) → Output Layer


Each connection between neurons has an associated weight, which gets adjusted during
training.

3. Forward Propagation

The process of passing inputs through the network from input to output is called forward
propagation. At each neuron, the weighted sum of inputs is computed and passed through
an activation function.
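To make this concrete, here is a minimal forward-propagation sketch in Python, assuming NumPy, a sigmoid activation, and made-up weights (the exact numbers are illustrative only): each layer computes a weighted sum of its inputs and passes it through the activation function.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.array([0.5, 0.3])                  # input features
W_hidden = np.array([[0.4, 0.6],          # weights: 2 inputs -> 2 hidden neurons
                     [0.1, 0.9]])
b_hidden = np.array([0.1, -0.2])          # hidden-layer biases
W_out = np.array([0.3, 0.7])              # weights: 2 hidden neurons -> 1 output
b_out = 0.05

h = sigmoid(W_hidden @ x + b_hidden)      # weighted sum + activation at the hidden layer
y = sigmoid(W_out @ h + b_out)            # weighted sum + activation at the output layer
print(y)                                  # the network's prediction for this input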

Why Use Neural Networks?

• Can model non-linear and complex relationships.

• Learn directly from raw data (no need for manual feature engineering).

• Adaptable to many domains (vision, NLP, robotics, healthcare).

Advantages

• High accuracy with enough data and training.

• Learns from experience without hard-coded rules.

• Can be stacked into deep networks for more learning capacity.

Limitations

• Requires a large amount of data to perform well.

• Computationally intensive; needs GPU/TPU for large models.

• Harder to interpret than traditional models (black-box nature).

Real-Life Examples of ANN Use

• Google Translate uses ANNs for language translation.

• Self-driving cars use ANNs for object detection and navigation.

• Healthcare: Predicting disease risk based on medical records.


Artificial Neural Networks mimic the human brain to solve complex problems where
traditional algorithms fail. They form the foundation of modern AI and deep learning
systems and are widely used across industries for prediction, classification, and decision-
making tasks.

Structure of an Artificial Neuron


Introduction

An artificial neuron (also called a node or perceptron) is the fundamental building block
of an artificial neural network. It is inspired by the structure and function of a biological
neuron in the human brain. The artificial neuron takes multiple inputs, processes them, and
generates an output.

This simple structure enables neural networks to learn from data and make decisions.

Biological vs Artificial Neuron (Analogy)

Biological Neuron | Artificial Neuron
Dendrites | Inputs (features)
Synapse | Weights
Soma (cell body) | Summation + Activation function
Axon | Output

Components of an Artificial Neuron

1. Inputs (x1, x2, ..., xn)

Each input represents a feature (e.g., age, height, pixels in an image).

2. Weights (w1, w2, ..., wn)

Each input is assigned a weight which signifies its importance in the final output.

3. Weighted Sum (Net Input)

The neuron computes a weighted sum of the inputs:

z = w1·x1 + w2·x2 + ... + wn·xn + b

Where b is the bias term that allows shifting the activation.

4. Activation Function
The weighted sum is passed through an activation function to introduce non-linearity and
decide the output.

5. Output

The result of the activation function is the neuron's output.

Common Activation Functions

Common choices include the sigmoid, tanh, and ReLU functions; their roles are discussed further in the section on learning in MLPs.

Example

Let’s assume:

• Inputs: x1 = 0.5, x2 = 0.3

• Weights: w1 = 0.4, w2 = 0.6

• Bias: b = 0.1

Then:

z = (0.5 × 0.4) + (0.3 × 0.6) + 0.1 = 0.2 + 0.18 + 0.1 = 0.48

Apply sigmoid activation:

Output = 1 / (1 + e^(−0.48)) ≈ 0.617
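As a quick check, the same arithmetic can be reproduced in Python using only the standard library:

import math

x1, x2 = 0.5, 0.3
w1, w2 = 0.4, 0.6
b = 0.1

z = x1 * w1 + x2 * w2 + b             # weighted sum: 0.2 + 0.18 + 0.1 = 0.48
output = 1 / (1 + math.exp(-z))       # sigmoid activation
print(round(z, 2), round(output, 3))  # prints 0.48 0.618 (about 0.617, as above)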

Importance of Activation Functions

• Introduces non-linearity, enabling the network to learn complex patterns

• Decides whether a neuron should be activated (fired) or not

• Affects training stability and performance

The structure of an artificial neuron defines how neural networks process information.
With input weights, bias, and activation functions, neurons can simulate the behavior of the
brain and learn complex mappings between inputs and outputs. Understanding this
structure is essential before studying multilayer networks and training techniques like
backpropagation.

Perceptron and Learning Theory


Introduction

The Perceptron is the earliest and simplest type of artificial neural network, introduced by
Frank Rosenblatt in 1958. It models a single neuron and is used primarily for binary
classification problems. The perceptron is a linear classifier, meaning it can only solve
problems where classes are linearly separable.

This topic bridges the gap between biological inspiration and computational learning.

Structure of a Perceptron

A perceptron receives multiple inputs, computes a weighted sum, applies a threshold function, and produces a binary output (0 or 1).

Mathematical Representation:

y = 1 if (w1·x1 + w2·x2 + ... + wn·xn + b) > 0, otherwise y = 0

Where:

• x1, ..., xn are the inputs, w1, ..., wn are the weights, b is the bias, and y is the binary (0/1) output.

Example: with weights w1 = w2 = 1 and bias b = −1.5, the output is 1 only when both inputs are 1, so the perceptron implements the AND gate.

Learning in Perceptron

Perceptrons learn by adjusting weights based on training examples. The goal is to minimize classification error.

Perceptron Learning Rule


If the prediction is incorrect, update each weight as follows:

wi ← wi + η (y − ŷ) xi

Where:

• η is the learning rate, y is the actual label, ŷ is the predicted output, and xi is the corresponding input.

The learning continues until all training examples are classified correctly (or the maximum number of epochs is reached).
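As an illustration of this rule, here is a minimal Python sketch (assuming NumPy, a step activation, and a learning rate of 0.1, all of which are illustrative choices) that trains a perceptron on the AND function:

import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])    # inputs
y = np.array([0, 0, 0, 1])                        # AND targets
w = np.zeros(2)                                   # weights
b = 0.0                                           # bias
eta = 0.1                                         # learning rate

for _ in range(20):                               # a few passes over the data
    for xi, target in zip(X, y):
        pred = 1 if np.dot(w, xi) + b > 0 else 0  # step (threshold) activation
        error = target - pred
        w += eta * error * xi                     # perceptron learning rule
        b += eta * error

print(w, b)  # learned weights and bias that separate the AND classes

Because AND is linearly separable, the convergence theorem below guarantees that such a set of weights is found in a finite number of updates.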

Perceptron Convergence Theorem

If the data is linearly separable, the perceptron learning algorithm is guaranteed to find a
set of weights that classify all points correctly in finite time.

Applications

• AND, OR logic gates

• Binary classification (spam/not spam)

• Image edge detection (in early vision systems)

Limitations

• Cannot solve non-linearly separable problems (e.g., XOR problem)

• Only outputs binary results (0 or 1)

• No hidden layers → Limited learning capacity

This led to the development of multilayer perceptrons (MLP) with backpropagation, which can learn complex, non-linear functions.

Summary of Learning Theory in Perceptron

Term | Meaning
Hypothesis | The function the perceptron is learning
Error | Difference between predicted and actual output
Learning rate | Controls the speed of weight updates
Convergence | Whether weights stabilize over iterations

The perceptron is a foundational concept in neural networks. While simple, it introduces
its strengths and limitations helps appreciate more advanced networks like MLPs and deep
learning models.

Types of Artificial Neural Networks (ANNs)


Introduction

Artificial Neural Networks come in different architectures depending on the nature of the
problem (e.g., classification, regression, image recognition, pattern mapping). The choice of
network type determines how data flows, how training happens, and what kinds of
problems the network can solve.

Each type of ANN has a different structure, learning strategy, and area of application.

1. Single-Layer Feedforward Neural Network (SLFN)

• Consists of one input layer and one output layer.

• No hidden layer.

• Signals flow in one direction: input → output.

Use Case: Basic binary classification (e.g., spam detection with simple features).

Limitation: Can only learn linearly separable patterns (like the Perceptron).

2. Multi-Layer Feedforward Neural Network (MLP)

• Most commonly used ANN architecture.

• Includes one or more hidden layers between input and output layers.

• Each neuron in a layer is connected to every neuron in the next layer.

Uses: Handwritten digit recognition, voice classification, fraud detection.

Key Feature: Uses backpropagation for training.


Structure: Input Layer → Hidden Layer(s) → Output Layer

3. Radial Basis Function (RBF) Network

• Uses radial basis functions as activation functions.

• Two-layer network:

o Hidden layer with RBF neurons

o Output layer with linear weights

Key Characteristic:

• Focuses on distance from center (like similarity-based models).

Applications:

• Function approximation

• Time-series prediction

• Control systems

4. Recurrent Neural Network (RNN)

• Has memory: Can retain previous inputs using feedback loops.

• Suitable for sequential data (like time series or text).

Structure:

• Unlike feedforward networks, outputs from previous steps are fed back into the
network.

Applications:

• Language modeling

• Speech recognition

• Stock market prediction

5. Convolutional Neural Network (CNN)

• Specialized for image and spatial data.


• Uses convolutional layers that scan over input data using filters.

Key Features:

• Feature extraction via convolution

• Downsampling using pooling layers

Applications:

• Image classification

• Face recognition

• Medical imaging

6. Self-Organizing Feature Map (SOFM / Kohonen Network)

• A type of unsupervised neural network.

• Maps high-dimensional input data to a 2D grid, preserving similarity.

Use Case:

• Clustering

• Data visualization

• Pattern discovery

This will also be discussed in detail later.

Comparison Table (Summary)

Type | Learning | Key Feature | Use Case
SLFN | Supervised | No hidden layer | Simple binary classification
MLP | Supervised | Hidden layers, backpropagation | General-purpose ML tasks
RBF | Supervised | Distance-based learning | Function approximation
RNN | Supervised | Memory from past inputs | Text, time-series analysis
CNN | Supervised | Convolution and pooling layers | Image and video processing
SOFM | Unsupervised | Topology-preserving mapping | Clustering, visualization

Different types of ANNs are suited for different kinds of problems. Choosing the right
network depends on the data type, learning task, and computational resources. While
feedforward networks form the foundation, specialized types like CNNs, RNNs, and SOFMs
power many of today’s advanced AI systems.

Learning in Multilayer Perceptron (MLP)


Introduction

A Multilayer Perceptron (MLP) is a class of feedforward neural networks that includes one or more hidden layers between input and output. Unlike single-layer perceptrons,
MLPs can learn non-linear patterns.

The learning in MLP is typically done using the backpropagation algorithm, a supervised
learning technique that updates weights to minimize the prediction error.

Structure of an MLP

• Input Layer: Receives input features.

• Hidden Layer(s): Intermediate layers that apply weights and activation functions.

• Output Layer: Produces final prediction (classification or regression).

Each neuron in one layer is fully connected to neurons in the next.

Common architecture:

Input → Hidden Layer 1 → Hidden Layer 2 → ... → Output

Forward Propagation

1. Inputs are passed through the network.

2. Each neuron computes a weighted sum: z = w1·x1 + w2·x2 + ... + wn·xn + b

3. An activation function is applied (e.g., ReLU, sigmoid).


4. Output flows to the next layer until final predictions are made.
Backpropagation (Learning Algorithm)

Backpropagation is the core of MLP training. It works by comparing predictions to actual outputs, calculating errors, and adjusting weights backward through the network.

Steps of Backpropagation

1. Initialization

• Randomly initialize weights and biases.

2. Forward Pass

• Compute predicted output using current weights.

3. Compute Error

• Calculate error using a loss function.

Example: squared error for a single output, E = ½ (y − ŷ)²

4. Backward Pass (Gradient Descent)

• Use partial derivatives to compute the gradient of the error with respect to each
weight.

• Adjust weights to minimize error:

w ← w − η · ∂E/∂w

Where η is the learning rate.

5. Repeat

• Continue updating weights for multiple epochs until error is minimized.
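The steps above can be seen end to end in the following toy Python sketch of a tiny one-hidden-layer network, assuming NumPy, sigmoid activations, a squared-error loss, and made-up data; it only illustrates the forward pass, error calculation, and gradient-descent updates, not a production implementation.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])  # toy inputs
y = np.array([0.0, 1.0, 1.0, 1.0])                               # toy (OR-like) targets
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)   # input -> hidden
W2, b2 = rng.normal(size=2), 0.0                # hidden -> output
eta = 0.5                                        # learning rate

for _ in range(2000):
    # forward pass
    h = sigmoid(X @ W1 + b1)                     # hidden activations, shape (4, 2)
    out = sigmoid(h @ W2 + b2)                   # predictions, shape (4,)
    # backward pass: gradients of 0.5 * sum((out - y)^2) via the chain rule
    d_out = (out - y) * out * (1 - out)          # error signal at the output
    d_h = np.outer(d_out, W2) * h * (1 - h)      # error signal at the hidden layer
    # gradient-descent updates
    W2 -= eta * h.T @ d_out
    b2 -= eta * d_out.sum()
    W1 -= eta * X.T @ d_h
    b1 -= eta * d_h.sum(axis=0)

print(out.round(2))  # predictions after training; they should be close to the targets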

Role of Activation Functions

• Introduce non-linearity, enabling the network to learn complex mappings.

• Common choices:

o Sigmoid: Good for probabilities


o ReLU: Fast and effective for hidden layers

o Softmax: Used in multi-class classification
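The three functions listed above can be written as short NumPy sketches (illustrative definitions, not the only possible forms):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))        # squashes values to (0, 1); useful for probabilities

def relu(z):
    return np.maximum(0, z)            # passes positive values, zeroes out negatives

def softmax(z):
    e = np.exp(z - np.max(z))          # subtract the max for numerical stability
    return e / e.sum()                 # outputs sum to 1 across classes

print(sigmoid(0.0), relu(-2.0), softmax(np.array([1.0, 2.0, 3.0])).round(2))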

Use of Cost/Loss Functions

• Measure the difference between predicted and actual outputs.

• Common choices:

o Mean Squared Error (MSE) for regression

o Cross-Entropy Loss for classification
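The two losses named above can be sketched as follows (assuming NumPy arrays of targets and predictions, and the binary form of cross-entropy; deep-learning libraries provide equivalent built-ins):

import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)             # mean squared error (regression)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    y_pred = np.clip(y_pred, eps, 1 - eps)             # avoid log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

print(mse(np.array([1.0, 0.0]), np.array([0.9, 0.2])))                    # small error, small loss
print(binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.9, 0.2])))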

Example

• Input: Image pixels

• Hidden Layers: Extract abstract features

• Output: Classifies into digits 0–9

• Backpropagation updates weights layer by layer to reduce classification error.

Advantages of MLP

• Can learn non-linear patterns

• Works well for structured data, image, and speech inputs

• Scalable to deep networks (Deep Learning)

Limitations

• Requires more computational power

• May get stuck in local minima

• Sensitive to hyperparameters (e.g., learning rate, number of layers)

MLPs form the basis of most modern deep learning systems. The backpropagation
algorithm allows these networks to learn from errors and continuously improve.
Understanding this process is critical for training accurate, reliable machine
learning models.

Radial Basis Function (RBF) Neural Network


Introduction

The Radial Basis Function (RBF) Neural Network is a special type of feedforward
neural network that uses radial basis functions as activation functions in the hidden
layer. RBF networks are particularly useful for function approximation, time-series
prediction, and pattern recognition tasks.

They are faster to train and require fewer parameters compared to multilayer perceptrons,
but are generally suitable for smaller datasets.

Architecture of RBF Network

RBF networks consist of three layers:

1. Input Layer

o Accepts the input feature vector.

o No computation is done here.

2. Hidden Layer (RBF Layer)

o Uses radial basis functions (commonly Gaussian) as activation functions.

o Each neuron represents a prototype (center) in input space.

3. Output Layer

o Produces the final output using a linear combination of the hidden neuron
outputs.

Radial Basis Function (Gaussian Function)

The commonly used RBF is the Gaussian function:

φ(x) = exp( −∥x − c∥² / (2σ²) )

Where:

• x: Input vector

• c: Center of the RBF neuron

• σ: Spread (controls the width of the curve)

• ∥x−c∥: Euclidean distance between input and center

Working of RBF Network

1. Compute distance between the input and each RBF center.

2. Apply the RBF (e.g., Gaussian) to get activation values.


3. Use these activations in the output layer to calculate the final prediction.
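As a sketch of these three steps (assuming NumPy and illustrative centers, spread, and output weights):

import numpy as np

def gaussian_rbf(x, center, sigma):
    return np.exp(-np.linalg.norm(x - center) ** 2 / (2 * sigma ** 2))

x = np.array([0.5, 0.3])                                  # input vector
centers = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]    # RBF neuron centers (prototypes)
sigma = 0.5                                               # spread of each Gaussian

activations = np.array([gaussian_rbf(x, c, sigma) for c in centers])  # hidden-layer outputs
weights = np.array([0.8, -0.4])                           # output-layer weights (illustrative)
prediction = weights @ activations                        # linear combination at the output layer
print(activations.round(3), round(float(prediction), 3))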

Visual Representation

You can visualize each RBF neuron as a bell-shaped curve centered at a point in input
space. The neuron activates more when the input is closer to its center.

Learning in RBF Networks

RBF networks typically learn in two stages:

1. Unsupervised stage (clustering)

o Determine the centers of RBFs using k-means or random sampling.

2. Supervised stage (linear learning)

o Learn the weights from the hidden layer to the output layer using least
squares method or gradient descent.
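A minimal sketch of this two-stage training in NumPy, assuming centers chosen by random sampling (one of the options mentioned above) and output weights fit by linear least squares on toy data:

import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 1))             # toy 1-D inputs
y = np.sin(3 * X[:, 0])                          # toy target function to approximate

# Stage 1 (unsupervised): choose centers, here by random sampling from the data
centers = X[rng.choice(len(X), size=8, replace=False)]
sigma = 0.3                                      # shared spread (illustrative)

# Hidden-layer activation matrix: one Gaussian per center
dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
Phi = np.exp(-dists ** 2 / (2 * sigma ** 2))

# Stage 2 (supervised): solve for the output weights with least squares
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(np.mean((Phi @ w - y) ** 2))               # training error of the approximation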

Applications of RBF Networks

• Function approximation

• Classification tasks (e.g., speech recognition)

• Time-series forecasting

• Handwriting recognition

Advantages

• Faster training compared to MLPs

• Good for interpolation problems

• Easier to interpret due to localized response of neurons

Limitations

• Performance degrades with high-dimensional data

• Selecting the optimal number of centers and the spread σ is tricky

• Not suitable for problems with large or noisy datasets

Real-Life Example
An RBF network trained to recognize handwritten digits might use 10 RBF neurons (one
for each digit 0–9). Each neuron is highly responsive to inputs that are similar in shape
to the digit it represents.

RBF neural networks offer a powerful yet simple architecture for classification and
regression. Their structure, based on radial distance, provides a localized response to
input patterns, making them suitable for a wide range of real-world problems with
moderate complexity.

Self-Organizing Feature Map (SOFM) / Kohonen Network


Introduction

A Self-Organizing Feature Map (SOFM), also known as a Kohonen Network, is an unsupervised learning neural network that is used primarily for clustering and
visualization of high-dimensional data.

It is unique because it not only groups similar data but also preserves the topological
structure, meaning similar input patterns are mapped close together on the output
grid.

Key Characteristics

• Unsupervised learning: Does not use labeled data.

• Topology preservation: Inputs that are close in the original space remain close in
the map.

• Dimensionality reduction: Converts high-dimensional data into 2D visual maps.

Architecture

1. Input Layer

o Accepts high-dimensional input vectors.

2. Output Layer (2D Grid)

o Organized as a 1D or 2D grid of neurons (nodes).

o Each neuron has an associated weight vector of the same dimension as the
input.

Each neuron in the output grid competes to represent input data.

Working of SOFM (Kohonen Algorithm)

1. Initialize the weights of all output neurons randomly.

2. For each input vector:


o Compute distance between input and all weight vectors (typically
Euclidean distance).

o Identify the Best Matching Unit (BMU) – neuron closest to the input.

3. Update weights of the BMU and its neighbors to make them more like the input:

w(t+1) = w(t) + η(t) · h(t) · (x − w(t))

Where:

o η(t) is the learning rate

o h(t) is the neighborhood function

4. Repeat for all inputs and for multiple epochs.

Neighborhood Function

• Determines which neurons (besides BMU) should have their weights updated.

• Often defined as a Gaussian centered around BMU.

• The size of the neighborhood shrinks over time, leading to more precise mapping.
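Putting the algorithm and the neighborhood function together, here is a compact NumPy sketch that organizes random RGB colors on a small 2D grid; the grid size, learning-rate schedule, and radius schedule are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
grid_h, grid_w, dim = 5, 5, 3
weights = rng.uniform(0, 1, size=(grid_h, grid_w, dim))   # one weight vector per grid node
colors = rng.uniform(0, 1, size=(200, dim))               # toy RGB inputs

# grid coordinates of every node, used by the neighborhood function
coords = np.array([[i, j] for i in range(grid_h) for j in range(grid_w)]).reshape(grid_h, grid_w, 2)

epochs = 30
for t in range(epochs):
    eta = 0.5 * (1 - t / epochs)              # learning rate decays over time
    radius = 2.0 * (1 - t / epochs) + 0.5     # neighborhood radius shrinks over time
    for x in colors:
        dists = np.linalg.norm(weights - x, axis=2)            # distance from x to every node
        bmu = np.unravel_index(np.argmin(dists), dists.shape)  # best matching unit (BMU)
        grid_dist = np.linalg.norm(coords - np.array(bmu), axis=2)
        h = np.exp(-grid_dist ** 2 / (2 * radius ** 2))        # Gaussian neighborhood around BMU
        weights += eta * h[..., None] * (x - weights)          # pull BMU and neighbors toward x

print(weights[0, 0].round(2))  # after training, neighboring nodes hold similar colors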

Example Visualization

Imagine organizing a color palette:

• Input: RGB color values

• Output grid: 2D map

• Result: Similar shades (e.g., blues, greens) cluster together.

SOFM would visually organize these colors such that related ones appear close together.

Applications of SOFM

• Data clustering

• Market segmentation

• Pattern recognition

• Gene expression analysis

• Dimensionality reduction for visualization

Advantages
• Excellent for visualizing complex, high-dimensional data

• Requires no labeled data

• Preserves input topology

• Learns data patterns naturally

Limitations

• Choosing grid size and training parameters can be tricky

• Not ideal for real-time or large-scale learning

• Convergence can be slow

Summary Table

Feature | Details
Learning type | Unsupervised
Output structure | 1D or 2D grid of neurons
Key mechanism | Competitive learning (BMU)
Preserves topology? | Yes
Use case | Clustering, visualization, dimensionality reduction

The Self-Organizing Feature Map is a powerful tool for unsupervised learning and data
exploration. It provides an intuitive way to understand large datasets by projecting them
into lower dimensions, while maintaining the structure of the original data. Though it's less
common in modern deep learning pipelines, it remains widely used in exploratory data
analysis and research.
