
UNIT 1

Neural Networks and the Human Brain

Definition: Neural networks in machine learning are inspired by the structure and function of the
human brain. Both systems consist of interconnected units (neurons in the brain, nodes in ANNs)
that process information. (Haykin).

Comparing the Basic Neuron Model in ANNs to a Biological Neuron’s Structure:

Biological Neuron Structure

Components of a Biological Neuron

● Dendrites: These are branching extensions of the neuron that receive signals from other neurons. They act as input channels, collecting electrical impulses and transmitting them to the cell body.
● Cell Body (Soma): This is the central part of the neuron that contains the nucleus and organelles. It processes incoming signals and integrates them to determine whether an action potential should be generated.
● Axon: A long, slender projection that transmits electrical impulses away from the cell body to other neurons or muscles.
● Axon Terminals: The endpoints of the axon where neurotransmitters are released to communicate with other neurons across synapses.
● Synapses: Junctions between neurons where chemical signals (neurotransmitters) are exchanged. (Freeman & Skapura)

Function

● Biological neurons process information through complex biochemical interactions and electrical signals. Synaptic plasticity allows neurons to strengthen or weaken their connections based on activity and experience.

Example: When a sensory neuron detects a stimulus, it sends an electrical signal through its
axon to the brain, where the signal is processed and interpreted.

Artificial Neuron Model

Components of an Artificial Neuron

● Inputs: These are numerical values (features) fed into the neuron from the previous layer
in the network.
● Weights: Each input is associated with a weight that adjusts the input's influence on the
neuron's output. Weights are updated during training to minimize error.
● Summation Function: The weighted inputs are summed together, often with a bias term
added, to calculate the neuron's net input.
● Activation Function: This function applies a non-linear transformation to the net input
to produce the neuron's output. Common activation functions include sigmoid, tanh, and
ReLU. (Haykin)

Function

● Artificial neurons perform a mathematical operation on their inputs. They use weights to
scale the inputs, sum them, and then apply an activation function to produce an output.
This output is passed to the next layer in the network or as the final result.

Example: In a neural network for image classification, an artificial neuron might process pixel
values of an image, apply weights to these values, and use an activation function to determine
whether a specific feature (e.g., an edge) is present.
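
To make the summation and activation steps concrete, below is a minimal sketch of a single artificial neuron in Python with NumPy. All values (inputs, weights, bias) are illustrative, not taken from any trained network.

import numpy as np

def sigmoid(z):
    # Sigmoid activation: squashes the net input into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def artificial_neuron(inputs, weights, bias):
    # Summation function: weighted sum of inputs plus a bias term
    net_input = np.dot(weights, inputs) + bias
    # Activation function: non-linear transformation of the net input
    return sigmoid(net_input)

x = np.array([0.2, 0.7, 0.1])    # input features
w = np.array([0.4, -0.6, 0.9])   # one weight per input
b = 0.1                          # bias term
print(artificial_neuron(x, w, b))  # a single output in (0, 1)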

Similarities between Biological neuron and Artificial neuron:

● Basic Operation: Both biological and artificial neurons process information and send signals to other neurons or layers.
● Information Integration: In both systems, inputs are integrated to produce an output. Biological neurons integrate electrical signals, while artificial neurons integrate numerical values.

Differences

● Complexity: Biological neurons are vastly more complex, involving biochemical interactions, varying neurotransmitters, and intricate signaling pathways. Artificial neurons are simplified models focusing on mathematical operations.
● Learning Mechanism: Biological neurons use synaptic plasticity and chemical signals for learning, while artificial neurons use algorithms like backpropagation and gradient descent to adjust weights and improve performance.
● Communication: Biological neurons communicate through chemical signals at synapses, whereas artificial neurons communicate through numerical values passed between layers.

While a biological neuron adapts based on experience and changes in synaptic strength, an artificial neuron adjusts its weights during training to minimize prediction error.

Fig: Biological neuron.

Fig: Representation of Artificial neuron.


Introduction to Neural Network Architectures:

Neural network architectures refer to the various structures and configurations of neural networks designed to solve specific types of problems. Each architecture is optimized for particular tasks such as image recognition, natural language processing, or time-series prediction. (Haykin)

Single-Layered Neural Networks

Structure and Function

● Explanation: Single-layered neural networks, also known as single-layer perceptrons, consist of an input layer directly connected to an output layer with no hidden layers in between. Each input is associated with a weight, and the weighted sum is passed through an activation function to produce the output. (Haykin)

A single-layer perceptron can solve linearly separable problems like basic binary classification tasks. For instance, classifying points on a 2D plane into two categories using a linear boundary.

Limitations

● Explanation: Single-layered networks are limited to solving only linearly separable problems.
They cannot capture complex patterns or relationships in the data. (Haykin)

A single-layer perceptron struggles with the XOR problem, where the decision boundary is not
linear.
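
The sketch below (assuming the classic perceptron update rule with a step activation) makes both points visible: the same training loop converges on the linearly separable AND function, but cannot settle on XOR.

import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    # Single-layer perceptron: weights w[1:] plus a bias weight w[0]
    w = np.zeros(X.shape[1] + 1)
    for _ in range(epochs):
        for xi, target in zip(X, y):
            out = 1 if np.dot(w[1:], xi) + w[0] > 0 else 0
            # Perceptron rule: adjust weights only when the output is wrong
            w[1:] += lr * (target - out) * xi
            w[0] += lr * (target - out)
    return w

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
w_and = train_perceptron(X, np.array([0, 0, 0, 1]))  # AND: linearly separable, finds a valid boundary
w_xor = train_perceptron(X, np.array([0, 1, 1, 0]))  # XOR: updates keep oscillating, never correct for all inputs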
Multi-Layered Neural Networks

Structure and Function

● Explanation: Multi-layered neural networks, also known as Multi-Layer Perceptrons (MLPs), consist of one or more hidden layers between the input and output layers. Each layer is fully connected to the next, allowing the network to model complex, non-linear relationships. (Haykin)

An MLP with multiple hidden layers can handle more complex tasks like image recognition or function approximation. For example, predicting house prices based on various features such as size, location, and number of rooms.

Training

● Explanation: MLPs are trained using algorithms like backpropagation, which adjusts the weights
based on the error between predicted and actual outputs. This process involves propagating the
error backwards through the network to update the weights. (Haykin)

Training an MLP to recognize handwritten digits using the MNIST dataset involves adjusting
weights to minimize the classification error.
Fully Connected Networks

Structure and Function

● Explanation: Fully connected networks, also known as dense networks, are characterized by every neuron in one layer being connected to every neuron in the next layer. This structure allows the network to learn a rich representation of the data. (Haykin)

Fully connected layers are often used in the final stages of convolutional networks to combine
features learned by convolutional and pooling layers into a final classification output.

Applications

● Explanation: Fully connected networks are used in various applications, including feature extraction, classification tasks, and regression. They are a fundamental component of many deep learning models. (Haykin)

In a deep learning model for speech recognition, fully connected layers may process features extracted by convolutional layers to make predictions about spoken words.

Recurrent Neural Networks (RNNs)


Structure and Function

● Explanation: Recurrent Neural Networks (RNNs) are designed to handle sequential data by incorporating cycles in the network architecture. This allows them to maintain a form of memory of previous inputs, making them suitable for tasks involving time-series or sequential data. (Haykin)

RNNs are used in natural language processing for tasks such as language modeling and text generation, where the context of previous words affects the prediction of the next word.

Variants of RNNs

● Long Short-Term Memory (LSTM): An RNN variant designed to overcome the vanishing gradient problem by using gates to control the flow of information and maintain long-term dependencies. (Haykin)
● Gated Recurrent Unit (GRU): A simplified variant of LSTM with fewer gates but similar capabilities in managing long-term dependencies. (Haykin)

LSTMs are used for machine translation, where understanding the context of an entire sentence is essential for translating to another language.
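
The recurrence at the heart of a vanilla RNN fits in a few lines. This is a sketch of a single time step; the weight matrices are random placeholders rather than trained parameters.

import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # New hidden state mixes the current input with the previous state (memory)
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

rng = np.random.default_rng(0)
W_xh = rng.normal(size=(3, 4))   # input-to-hidden weights (4-dim inputs)
W_hh = rng.normal(size=(3, 3))   # hidden-to-hidden (recurrent) weights
b_h = np.zeros(3)

h = np.zeros(3)                        # initial memory
for x_t in rng.normal(size=(5, 4)):    # a sequence of 5 inputs
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)  # h carries context forward in time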

Neural Networks Viewed as Directed Graphs:

Nodes and Edges in Neural Networks


● Explanation: In neural networks, nodes represent neurons or units, and directed
edges represent the connections or weights between neurons. The direction of the
edge signifies the flow of information from one neuron to another. (Haykin)
● Example: In a feedforward neural network, edges direct the flow of data from input neurons through hidden layers to output neurons.

Layers and Connections

● Explanation: Neural networks are structured in layers. Each layer consists of multiple nodes, and connections between nodes form directed edges. In a feedforward network, edges go from input nodes to hidden nodes and then to output nodes. (Haykin)
● Example: In a Convolutional Neural Network (CNN), the directed graph shows how input images pass through convolutional layers, pooling layers, and fully connected layers.

Types of Directed Graphs in Neural Networks

Feedforward Neural Networks

● Explanation: In feedforward networks, the directed graph is acyclic, meaning there are no cycles. Information flows in one direction from input to output. (Haykin)
● Example: The architecture of a simple MLP is a directed acyclic graph where each node in a layer is connected to every node in the subsequent layer.
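
As a sketch, a small MLP can be written down as a plain adjacency list, which makes the acyclic, layer-to-layer flow explicit (the node names are arbitrary labels chosen for illustration):

# Directed acyclic graph of a 2-2-1 MLP: every edge points "forward"
mlp_graph = {
    "in1": ["h1", "h2"],   # input layer -> hidden layer
    "in2": ["h1", "h2"],
    "h1": ["out"],         # hidden layer -> output layer
    "h2": ["out"],
    "out": [],             # no outgoing edges: flow terminates here
}

# Forward propagation visits nodes in topological order (inputs before outputs)
for node, successors in mlp_graph.items():
    for succ in successors:
        print(f"{node} -> {succ}")  # each directed edge carries one weight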

Recurrent Neural Networks (RNNs)

● Explanation: In RNNs, the directed graph can contain cycles because neurons can
have connections to themselves or previous layers. This cyclical structure allows
the network to maintain a memory of previous inputs. (Haykin)
● Example: In an RNN used for time-series prediction, the directed graph includes
cycles that represent the feedback of past information to the network.

Graph Neural Networks (GNNs)


● Explanation: GNNs extend the concept of directed graphs to more complex graph structures where nodes represent entities, and directed edges represent relationships or interactions. These networks can handle arbitrary graph topologies. (Haykin)
● Example: GNNs applied to social network analysis, where nodes represent individuals and directed edges represent interactions or relationships.

Advantages of Viewing Neural Networks as Directed Graphs

Clarity in Architecture

● Explanation: Viewing neural networks as directed graphs provides a clear visualization of how data flows through the network, making it easier to understand and design network architectures. (Haykin)
● Example: Visualizing a neural network for image classification helps in understanding how input data progresses through various layers.

Complexity Management

● Explanation: Directed graphs help manage the complexity of neural networks by providing a structured way to represent and analyze connections and dependencies between neurons. (Haykin)
● Example: In a deep learning model, directed graphs help in visualizing the complex interconnections between multiple layers and units.

Efficient Computation

● Explanation: Directed graphs facilitate the use of algorithms for efficient computation, such as forward propagation, backpropagation, and optimization techniques. (Haykin)
● Example: Backpropagation algorithms use the directed graph structure to compute
gradients and update weights efficiently.
Fig: Neural network as directed graph

Introduction to Knowledge Representation:

Definition: Knowledge representation is the field of artificial intelligence (AI) that deals with how to formally represent information about the world in a way that a computer system can utilize to solve complex tasks, reason, and make decisions. (Haykin)

Example: Representing a person's knowledge of how to bake a cake in a way that a computer can understand and use for generating a recipe.

Error Correction Learning:


Definition

● Explanation: Error Correction Learning is a learning method where an artificial neural network adjusts its weights to minimize the discrepancy between its predicted output and the actual target output. This adjustment is performed iteratively based on the error calculated for each prediction. (Haykin)
● Example: In a neural network used for image classification, the network makes predictions on images, calculates the error between the predicted and actual class labels, and updates the weights to reduce this error.

Key Concepts

Learning Rule

● Explanation: The learning rule defines how weights are updated based on the error. A common rule is the Delta Rule, which adjusts weights proportionally to the error term and the input: Δw_i = η (t − o) x_i, where η is the learning rate, t the target output, o the actual output, and x_i the input. (Haykin)

Backpropagation

● Explanation: Backpropagation is a popular algorithm for error correction learning in multi-layer neural networks. It involves calculating the gradient of the error function with respect to each weight by propagating the error backward through the network. (Haykin)
Steps:

1. Forward Pass: Compute the output of the network using the current
weights.
2. Error Calculation: Calculate the difference between the predicted
output and the target output.
3. Backward Pass: Compute the gradient of the error with respect to
each weight and update the weights accordingly.

Forward Pass

● Explanation: During the forward pass, the input data is fed into the network, and the network computes the output based on the current weights and activation functions. (Haykin)
● Example: For an image classification network, the forward pass involves passing pixel values through layers of the network to produce class scores.

Error Computation

● Explanation: The error is computed as the difference between the network's predicted output and the actual target value. This error is used to determine how much the weights need to be adjusted. (Haykin)
● Example: If the network predicts "cat" but the actual label is "dog," the error will quantify how far the prediction is from the true label.

Backward Pass

● Explanation: The backward pass involves calculating the gradients of the error function with respect to each weight using the chain rule. These gradients are used to adjust the weights to minimize the error. (Haykin)
● Example: In a multi-layer network, gradients are computed for each layer, and weights are updated to reduce the error in the output layer.
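
The sketch below ties the three steps together for a tiny one-hidden-layer network trained on a single example. The sizes, data, learning rate, and squared-error loss are all illustrative assumptions.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
x, t = np.array([0.5, 0.9]), np.array([1.0])   # one training example and target
W1 = rng.normal(size=(3, 2))                   # input -> hidden weights
W2 = rng.normal(size=(1, 3))                   # hidden -> output weights
lr = 0.5

for _ in range(100):
    # 1. Forward pass: compute the output with the current weights
    h = sigmoid(W1 @ x)
    y = sigmoid(W2 @ h)
    # 2. Error calculation: difference between prediction and target
    error = y - t
    # 3. Backward pass: the chain rule gives the gradient for each layer
    delta2 = error * y * (1 - y)             # output-layer error term
    delta1 = (W2.T @ delta2) * h * (1 - h)   # error propagated to the hidden layer
    W2 -= lr * np.outer(delta2, h)           # gradient-descent weight updates
    W1 -= lr * np.outer(delta1, x)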

NUMERICAL PROBLEM:

In a single-layer neural network, you are working with a perceptron that receives an input of 0.5. After performing a forward pass, the network produces an output of 0.54. You aim to train the network to achieve a target output of 0.8. With a learning rate set to 0.1 and an initial weight of 0.4, calculate the following:

1. Determine the change in weight based on the given learning rate and the difference between the desired and actual outputs. Also, find the new weight after applying this update.
2. If the actual output were adjusted to 0.6 instead of 0.54, how would this affect the weight update? Calculate the new weight in this scenario as well.
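
Solution sketch, assuming the Delta Rule Δw = η (t − o) x introduced above:

def delta_rule_update(w, x, target, output, lr):
    # Weight change is learning rate * error * input
    dw = lr * (target - output) * x
    return dw, w + dw

# Case 1: output 0.54 -> dw = 0.1 * (0.8 - 0.54) * 0.5 = 0.013, new weight 0.413
print(delta_rule_update(w=0.4, x=0.5, target=0.8, output=0.54, lr=0.1))

# Case 2: output 0.6 -> dw = 0.1 * (0.8 - 0.6) * 0.5 = 0.01, new weight 0.41
# The smaller error produces a smaller weight update.
print(delta_rule_update(w=0.4, x=0.5, target=0.8, output=0.6, lr=0.1))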

Examples

Perceptron Learning

● Explanation: The Perceptron algorithm is a simple error correction learning method used for binary classification. It adjusts weights based on misclassified examples. (Haykin)
● Example: If the perceptron classifies a sample incorrectly, it adjusts the weights according to the error term.
Multilayer Perceptron (MLP)

● Explanation: An MLP uses backpropagation for error correction learning. It adjusts weights in a multi-layered network to minimize the error across all layers. (Haykin)
● Example: In training an MLP for digit recognition, the network adjusts its weights iteratively to improve classification accuracy on handwritten digits.

Memory-Based Learning
Introduction

Definition

● Explanation: Memory-Based Learning, also known as Instance-Based Learning, is an approach where past examples (instances) are stored and used directly for making predictions or decisions. It contrasts with model-based learning, where a model is built from data to make predictions. (Haykin)

Key Concepts

Instance-Based Learning

● Mathematical Representation: Instance-Based Learning can be expressed as a problem of similarity computation. Given a new instance x, the goal is to find its most similar instances in the training set {(x_i, y_i)}, i = 1, ..., N, where x_i represents instances and y_i represents labels or outcomes.
● Mathematical Formulation: The similarity between two instances x and x_i can be measured using a distance metric d(x, x_i).

Detailed Process

Storage of Instances

● Mathematical Representation: All training instances are stored in memory. For a new instance x, we compute the similarity or distance to all stored instances.
Query Processing
● Mathematical Representation: Given a new query x, the system retrieves the k nearest neighbors based on the chosen similarity measure. Let {x_(i1), x_(i2), ..., x_(ik)} be the k nearest neighbors to x.

Decision Making

● Classification: For a classification task, the class label of the new instance is determined by a majority vote among the k nearest neighbors:

ŷ = argmax_c Σ_(j=1..k) 1(y_(ij) = c)

where ŷ is the predicted class.

● Regression: For a regression task, the prediction is the average of the outcomes of the k nearest neighbors:

ŷ = (1/k) Σ_(j=1..k) y_(ij)

where ŷ is the predicted value.
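
A compact sketch of both decision rules, assuming Euclidean distance as the similarity measure (the toy data is purely illustrative):

import numpy as np
from collections import Counter

def k_nearest(X_train, x, k):
    # Indices of the k stored instances closest to the query x
    dists = np.linalg.norm(X_train - x, axis=1)  # d(x, x_i) for every i
    return np.argsort(dists)[:k]

def knn_classify(X_train, y_train, x, k=3):
    # Majority vote among the k nearest neighbors
    idx = k_nearest(X_train, x, k)
    return Counter(y_train[i] for i in idx).most_common(1)[0][0]

def knn_regress(X_train, y_train, x, k=3):
    # Average outcome of the k nearest neighbors
    idx = k_nearest(X_train, x, k)
    return np.mean([y_train[i] for i in idx])

X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
print(knn_classify(X, np.array(["a", "a", "b", "b"]), np.array([0.95, 1.0])))  # "b"
print(knn_regress(X, np.array([1.0, 1.2, 3.0, 3.2]), np.array([0.95, 1.0])))   # average of nearby outcomes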

Hebbian Learning
Introduction

Definition

● Explanation: Hebbian Learning is a principle based on the idea that the connections between neurons strengthen when they are activated together. This concept is summarized by the phrase "cells that fire together, wire together," proposed by psychologist Donald Hebb. (Haykin)
● Example: In a neural network, if two neurons are activated simultaneously, the connection (synaptic weight) between them strengthens.

Key Concepts
Hebb’s Rule

● Explanation: Hebb’s Rule states that the change in the synaptic weight Δw_ij between two neurons i and j is proportional to the product of their activations: Δw_ij = η x_i x_j, where η is the learning rate.

Weight Update Rule

● Explanation: The weight between neurons is updated based on their activations using Hebb’s Rule.
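
A minimal sketch of the update, assuming the rate-based form Δw_ij = η x_j y_i with learning rate η:

import numpy as np

def hebbian_update(W, x, y, lr=0.1):
    # Hebb's Rule: strengthen w_ij in proportion to joint activity
    return W + lr * np.outer(y, x)  # delta w_ij = lr * y_i * x_j

W = np.zeros((2, 2))            # weights from 2 input to 2 output neurons
x = np.array([1.0, 0.0])        # only the first input neuron fires
y = np.array([1.0, 1.0])        # both output neurons fire with it
W = hebbian_update(W, x, y)     # only connections from the active input grow
print(W)                        # [[0.1, 0.0], [0.1, 0.0]]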

Detailed Process

Activation and Correlation


● Explanation: Hebbian Learning relies on the correlation between the activations of connected neurons. If two neurons are active simultaneously, their connection strength increases.

Updating Weights

● Explanation: Weights are updated during each learning iteration based on current activations, continuing until the network’s connections reflect input patterns.

Examples

Simple Hebbian Learning

● Explanation: In simple neural networks, Hebbian Learning adjusts weights based on the correlation of neuron activations.
● Example: Associative memory networks reinforce connections between neurons representing simultaneous patterns, improving recall.
Pattern Formation

● Explanation: Hebbian Learning is used in pattern formation to reinforce common features based on co-activation.
● Example: In feature recognition networks, Hebbian Learning strengthens connections to neurons representing frequently co-activated features.

Applications
Neural Networks

● Explanation: Hebbian Learning enhances connections in neural networks where neurons frequently activate together, improving performance in tasks like pattern recognition.
● Example: In self-organizing maps, Hebbian Learning organizes input data into clusters by reinforcing neuron connections based on similarity.
Cognitive Systems

● Explanation: Hebbian Learning models learning processes observed in biological systems, simulating aspects of human learning and memory.
● Example: Artificial neural networks use Hebbian Learning to mimic synaptic plasticity and enhance cognitive functions.

Competitive Learning and Boltzmann Learning


Competitive Learning

Introduction

● Explanation: Competitive Learning is a learning approach where neurons compete to become active in response to a given input. Only the "winning" neuron (the one with the highest activation) updates its weights, while other neurons' weights remain unchanged. This process is useful for clustering and organizing input data into distinct categories. (Haykin)
● Example: In a self-organizing map, neurons compete to represent different clusters of input data.

Competitive Learning Algorithm

● Explanation: The Competitive Learning algorithm involves neurons competing to respond to input patterns, with only the winning neuron updating its weights.
Winner-Takes-All

● Explanation: In the Competitive Learning process, the neuron with the highest
similarity to the input pattern (or lowest distance) is selected as the winner.
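
A sketch of one winner-takes-all step, assuming Euclidean distance for the competition and the standard rule of pulling the winner's weight vector toward the input:

import numpy as np

def competitive_step(W, x, lr=0.2):
    # Competition: the neuron whose weight vector is closest to x wins
    winner = np.argmin(np.linalg.norm(W - x, axis=1))
    # Only the winner learns: move its weights toward the input
    W[winner] += lr * (x - W[winner])
    return winner

rng = np.random.default_rng(2)
W = rng.uniform(size=(3, 2))         # three neurons over 2-D inputs
for x in rng.uniform(size=(50, 2)):  # stream of input patterns
    competitive_step(W, x)           # weight vectors drift toward input clusters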

Applications

● Explanation: Competitive Learning is commonly used in unsupervised learning tasks such as clustering and dimensionality reduction.
● Example: In image compression, Competitive Learning can be used to cluster similar image pixels together.
Boltzmann Learning
Introduction

● Explanation: Boltzmann Learning is based on the principles of statistical mechanics and is used in Boltzmann Machines. It involves probabilistic weight adjustments, where the probability of weight changes is governed by the Boltzmann distribution.
● Example: Boltzmann Machines are used in pattern recognition and combinatorial optimization problems.
Boltzmann Machine

● Explanation: A Boltzmann Machine is a type of stochastic neural network where neurons are connected with probabilistic weights. It uses the Boltzmann distribution to model the probability of a given state.

Learning Rule

● Explanation: The learning rule in Boltzmann Machines adjusts weights to minimize the difference between the probability distributions of the observed and model states.
Simulated Annealing

● Explanation: Boltzmann Machines use a process called simulated annealing to escape local minima and find a more optimal solution.
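
A generic sketch of simulated annealing on a toy one-dimensional energy function. The cooling schedule and proposal distribution are illustrative choices, not the specific ones used inside Boltzmann Machines.

import math, random

def simulated_annealing(energy, x0, temp=5.0, cooling=0.99, steps=2000):
    x = x0
    for _ in range(steps):
        candidate = x + random.gauss(0, 0.5)   # propose a nearby state
        dE = energy(candidate) - energy(x)
        # Accept downhill moves always; accept uphill moves with Boltzmann
        # probability exp(-dE / T), which lets the search escape local minima
        if dE < 0 or random.random() < math.exp(-dE / temp):
            x = candidate
        temp *= cooling                        # gradually cool the system
    return x

energy = lambda x: 0.1 * x**2 + math.sin(3 * x)  # toy energy with local minima
print(simulated_annealing(energy, x0=4.0))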

Applications

● Explanation: Boltzmann Learning is used in various applications including optimization problems, complex system modeling, and probabilistic reasoning.
● Example: In combinatorial optimization, Boltzmann Machines can be applied to find optimal solutions for problems such as the traveling salesman problem.
Statistical Nature of the Learning Process

Introduction

Definition

● Explanation: The statistical nature of the learning process refers to the ways in which learning algorithms leverage statistical methods to make inferences and decisions based on data. This approach recognizes that learning involves probabilistic elements and seeks to model and optimize based on statistical principles. (Haykin)
● Example: Learning algorithms often estimate probabilities, variances, and other statistical measures to optimize performance and make predictions.

Key Concepts

Probability Distributions

● Explanation: Many learning algorithms rely on probability distributions to model the uncertainty and variability in the data. These distributions help in making predictions and understanding the underlying data structure.
Expectation and Variance

● Statistical learning methods often use expectation and variance to estimate and quantify the behavior of the learning process.

Maximum Likelihood Estimation (MLE)

● Explanation: MLE is a method for estimating the parameters of a probabilistic model by maximizing the likelihood function, which measures how well the model explains the observed data.
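
For a Gaussian model the MLE has a closed form: the likelihood-maximizing parameters are the sample mean and the (biased) sample variance. A quick sketch with synthetic data:

import numpy as np

rng = np.random.default_rng(3)
data = rng.normal(loc=2.0, scale=0.5, size=1000)  # synthetic observations

# MLE for a Gaussian: choose mu, sigma^2 to maximize the log-likelihood
mu_hat = data.mean()                         # closed-form maximizer for mu
sigma2_hat = ((data - mu_hat) ** 2).mean()   # biased variance (divides by N)

print(mu_hat, np.sqrt(sigma2_hat))  # estimates should be close to 2.0 and 0.5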
Bayesian Inference

● Explanation: Bayesian inference involves updating probabilities based on new evidence, using Bayes' Theorem. It combines prior knowledge with observed data to make predictions or decisions.

Statistical Learning Theory

Bias-Variance Tradeoff

● Explanation: The bias-variance tradeoff is a key concept in statistical learning, highlighting the balance between model complexity and its ability to generalize.
● Formula for Bias-Variance Decomposition:

Expected Error = Bias² + Variance + Irreducible Error

where:

● Bias measures the error due to the model’s assumptions,
● Variance measures the error due to sensitivity to fluctuations in the training data,
● Irreducible Error represents the noise inherent in the data.
Overfitting and Underfitting

● Explanation: Overfitting occurs when a model learns the noise in the training data rather than the underlying pattern, while underfitting happens when the model is too simple to capture the data's complexity.
● Formula for Model Complexity:

Complexity = Number of Parameters

More complex models may fit the training data better but risk overfitting, while simpler models might underfit.
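
A short sketch that makes the tradeoff visible: polynomial models of increasing degree are fit to noisy samples of a sine curve. Training error falls with degree, while held-out error typically rises again once the model starts fitting the noise (all data here is synthetic).

import numpy as np

rng = np.random.default_rng(4)
x_train = np.sort(rng.uniform(0, 3, 20))
y_train = np.sin(x_train) + rng.normal(0, 0.2, 20)  # signal plus irreducible noise
x_test = np.linspace(0, 3, 100)
y_test = np.sin(x_test)

for degree in (1, 3, 12):  # underfit, reasonable, overfit
    coeffs = np.polyfit(x_train, y_train, degree)   # complexity = number of parameters
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(degree, round(train_err, 3), round(test_err, 3))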
