
UNIT – 4

NEURAL NETWORKS

INTRODUCTION TO NEURAL NETWORKS


Neural networks are a class of machine learning algorithms inspired by the structure and function of the
human brain. They are used for various tasks including pattern recognition, classification, regression, and
sequence generation.

TYPES OF NEURAL NETWORKS:


1. Feedforward Neural Networks (FNN): In a feedforward neural network, information flows in one
direction—from the input layer through the hidden layers to the output layer. They are commonly
used for tasks such as classification and regression.
2. Recurrent Neural Networks (RNN): RNNs are designed to process sequential data where the
current output depends not only on the current input but also on past inputs. They are well-suited
for tasks such as time series prediction, natural language processing, and speech recognition.
3. Convolutional Neural Networks (CNN): CNNs are specialized for processing grid-like data such as
images. They use convolutional layers to detect spatial patterns and hierarchical features within
the input data, making them highly effective for tasks like image classification, object detection,
and image segmentation.
4. Long Short-Term Memory Networks (LSTM): LSTMs are a type of recurrent neural network
architecture designed to address the vanishing gradient problem and capture long-term
dependencies in sequential data. They are widely used in tasks such as speech recognition,
language modeling, and machine translation.

TRAINING NEURAL NETWORKS:


Neural networks are trained using a process called backpropagation, which involves adjusting the weights
of connections between neurons based on the error or loss observed during the training process. The goal
is to minimize the difference between the predicted output of the network and the actual output (target)
by iteratively updating the weights using optimization algorithms like gradient descent.
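As a minimal sketch of this idea, the snippet below performs one gradient-descent update for a single linear neuron with a squared-error loss; the function name, learning rate, and example numbers are illustrative assumptions, not part of the text.

    import numpy as np

    def gradient_descent_step(w, x, target, lr=0.1):
        """One weight update for a single linear neuron with squared-error loss."""
        prediction = np.dot(w, x)       # predicted output of the neuron
        error = prediction - target     # difference from the actual (target) output
        gradient = error * x            # gradient of 0.5 * error**2 with respect to w
        return w - lr * gradient        # step against the gradient

    # illustrative usage with made-up numbers
    w = np.array([0.5, -0.2])
    x = np.array([1.0, 2.0])
    print(gradient_descent_step(w, x, target=1.0))

Repeating this update over many samples and passes (epochs) is what the training loop of a full network does layer by layer via backpropagation.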

APPLICATIONS OF NEURAL NETWORKS:


Neural networks have applications across various domains including:

 Image recognition
 Natural language processing
 Sentiment analysis
 Recommendation systems
 Autonomous vehicles
 Medical diagnosis
 Financial forecasting
In summary, neural networks are powerful machine learning models that mimic the behavior of the
human brain and are capable of learning complex patterns and relationships in data, making them
valuable tools for a wide range of applications.
ARCHITECTURE OF NEURAL NETWORKS
Neural Network Architecture refers to the design and structure of an artificial neural network (ANN),
which is a machine learning model inspired by the human brain. It defines how the network's layers,
nodes, and connections are organized to process and analyze data.

The input layer receives the raw data, which is then passed through one or more hidden layers that
perform mathematical computations. The output layer produces the final results, such as predictions or
classifications. The architecture of a neural network determines the complexity, capacity, and
functionality of the model.

BASIC COMPONENTS OF NEURAL NETWORKS:


1. Neurons (Nodes): The basic building blocks of neural networks. Neural networks consist of
interconnected layers of nodes called neurons, which process input data and pass the results to
subsequent layers. Each neuron applies a mathematical function to its input to produce an output,
which is then fed into the next layer.
2. Connections (Edges): Neurons are connected to each other through connections, which have
associated weights. These weights determine the strength of the connection. During the training
phase, the neural network adjusts the weights based on the provided data to improve the network's performance.
3. Layers: Neurons are organized into layers within a neural network. The three main types of layers
are:
 Input Layer: Receives input data and passes it to the next layer.
 Hidden Layers: Intermediate layers between the input and output layers where
computations are performed.
 Output Layer: Produces the final output of the network based on the computations
performed in the hidden layers.
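As a sketch of how these components fit together, the short class below models one fully connected layer: a weight matrix for the connections, a bias per neuron, and an activation applied to each neuron's weighted sum. The class name, the sigmoid activation, and the layer sizes are assumptions made for illustration.

    import numpy as np

    class DenseLayer:
        """One layer of neurons, fully connected to the previous layer."""

        def __init__(self, n_inputs, n_neurons, seed=0):
            rng = np.random.default_rng(seed)
            # each column of W holds the incoming connection weights of one neuron
            self.W = rng.normal(scale=0.1, size=(n_inputs, n_neurons))
            self.b = np.zeros(n_neurons)

        def forward(self, x):
            z = x @ self.W + self.b              # weighted sum for every neuron
            return 1.0 / (1.0 + np.exp(-z))      # sigmoid activation

    # input layer (3 features) -> hidden layer (4 neurons) -> output layer (1 neuron)
    hidden = DenseLayer(3, 4)
    output = DenseLayer(4, 1, seed=1)
    print(output.forward(hidden.forward(np.array([0.2, 0.5, 0.1]))))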
MODELS OF ARTIFICIAL NEURAL NETWORKS
Artificial Neural Networks (ANNs) encompass various models tailored to different tasks, data structures,
and learning paradigms. Here's a breakdown of some prominent models:

 Feedforward Neural Networks (FNNs): Basic neural networks where information moves in one
direction, from input nodes through hidden layers (if any) to output nodes. Multi-layer Perceptrons
(MLPs) are a common type of FNN with multiple layers of neurons.
 Convolutional Neural Networks (CNNs): Specifically designed for processing grid-like data such as images. They use convolutional layers to detect spatial patterns by applying convolution operations, which makes them effective in tasks such as image classification, object detection, and image segmentation.
 Recurrent Neural Networks (RNNs): Ideal for sequential data where the order of elements matters, such as time series, text, and speech. They incorporate feedback loops to process sequences, enabling the network to exhibit temporal dynamics. Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) are popular RNN variants that address the vanishing gradient problem and capture long-term dependencies.
 Generative Adversarial Networks (GANs): Comprises two networks: a generator and a
discriminator, trained adversarially. The generator creates synthetic samples, aiming to fool the
discriminator, which in turn distinguishes between real and fake samples. It is widely used for
generating realistic images, data augmentation, and unsupervised learning tasks.
 Autoencoder Neural Networks: Unsupervised learning models aimed at learning efficient data representations. An autoencoder consists of an encoder that compresses the input into a latent space and a decoder that reconstructs the input from the compressed representation. It is used for dimensionality reduction, feature learning, and anomaly detection (a minimal sketch follows this list).
 Recursive Neural Networks (RecNNs): Tailored for hierarchical structures like parse trees or
linguistic structures. Applies the same set of weights recursively at each level, capturing
hierarchical relationships within the data.
 Self-Organizing Maps (SOMs): Unsupervised learning models that create low-dimensional
representations of input space. It utilizes competitive learning to cluster and visualize high-
dimensional data onto a two-dimensional grid.
 Modular Neural Networks: Consists of multiple interconnected subnetworks, each specialized in
solving a particular subtask. It integrates these subnetworks to perform complex tasks efficiently.
These models represent the breadth of neural network architectures, each tailored to specific data
characteristics and learning requirements. Choosing the appropriate model depends on the nature of the
data and the problem domain.
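To make the autoencoder idea above concrete, the sketch below pairs a linear encoder with a linear decoder around a two-dimensional latent space; the dimensions, the random (untrained) weights, and the reconstruction-error printout are simplifications for illustration only.

    import numpy as np

    rng = np.random.default_rng(0)

    # encoder: 8-dimensional input -> 2-dimensional latent code
    W_enc = rng.normal(scale=0.1, size=(8, 2))
    # decoder: 2-dimensional latent code -> 8-dimensional reconstruction
    W_dec = rng.normal(scale=0.1, size=(2, 8))

    x = rng.normal(size=8)     # example input vector
    z = x @ W_enc              # compressed representation (latent code)
    x_hat = z @ W_dec          # reconstruction of the input
    print("reconstruction error:", np.mean((x - x_hat) ** 2))

In practice the encoder and decoder weights are trained (for example with backpropagation) to drive this reconstruction error down, which is what makes the latent code a useful compressed representation.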
FEEDFORWARD NEURAL NETWORKS
A feedforward neural network is one of the simplest types of artificial neural networks devised. In this
network, the information moves in only one direction—forward—from the input nodes, through the
hidden nodes (if any), and to the output nodes. There are no cycles or loops in the network. Feedforward
neural networks were the first type of artificial neural network invented and are simpler than other neural
network architectures.

Architecture of Feedforward Neural Networks

The architecture of a feedforward neural network consists of three types of layers: the input layer, hidden
layers, and the output layer. Each layer is made up of units known as neurons, and the layers are
interconnected by weights.

Input Layer: This layer consists of neurons that receive inputs and pass them on to the next layer. The
number of neurons in the input layer is determined by the dimensions of the input data.

Hidden Layers: These layers are not exposed to the input or output and can be considered as the
computational engine of the neural network. Each hidden layer's neurons take the weighted sum of the
outputs from the previous layer, apply an activation function, and pass the result to the next layer. The
network can have zero or more hidden layers.

Output Layer: The final layer that produces the output for the given inputs. The number of neurons in the
output layer depends on the number of possible outputs the network is designed to produce.

Each neuron in one layer is connected to every neuron in the next layer, making this a fully connected
network. The strength of the connection between neurons is represented by weights, and learning in a
neural network involves updating these weights based on the error of the output.

How Feedforward Neural Networks Work

The working of a feedforward neural network involves two phases: the feedforward phase and the
backpropagation phase.

Feedforward Phase: In this phase, the input data is fed into the network, and it propagates forward
through the network. At each hidden layer, the weighted sum of the inputs is calculated and passed
through an activation function, which introduces non-linearity into the model. This process continues until
the output layer is reached, and a prediction is made.
Backpropagation Phase: Once a prediction is made, the error (difference between the predicted output
and the actual output) is calculated. This error is then propagated back through the network, and the
weights are adjusted to minimize this error. The process of adjusting weights is typically done using a
gradient descent optimization algorithm.
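The two phases can be sketched in a few lines for a tiny network with one hidden layer; the sigmoid activation, the layer sizes, the learning rate, and the XOR-style toy data are all illustrative assumptions rather than a prescribed implementation.

    import numpy as np

    rng = np.random.default_rng(0)

    # toy data: XOR-like inputs and targets (illustrative only)
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([[0.], [1.], [1.], [0.]])

    # weights and biases: input (2) -> hidden (4) -> output (1)
    W1 = rng.normal(scale=0.5, size=(2, 4))
    b1 = np.zeros(4)
    W2 = rng.normal(scale=0.5, size=(4, 1))
    b2 = np.zeros(1)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    for epoch in range(5000):
        # feedforward phase: weighted sums and activations, layer by layer
        h = sigmoid(X @ W1 + b1)        # hidden layer activations
        out = sigmoid(h @ W2 + b2)      # network prediction

        # backpropagation phase: propagate the error back and adjust the weights
        d_out = (out - y) * out * (1 - out)     # error signal at the output layer
        d_hid = (d_out @ W2.T) * h * (1 - h)    # error signal at the hidden layer
        W2 -= 0.5 * h.T @ d_out
        b2 -= 0.5 * d_out.sum(axis=0)
        W1 -= 0.5 * X.T @ d_hid
        b1 -= 0.5 * d_hid.sum(axis=0)

    print(out.round(2))   # predictions should move toward the targets

Each pass of the loop is one iteration of the feedforward phase followed by the backpropagation phase, with plain gradient descent used to reduce the squared error.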

Challenges and Limitations

While feedforward neural networks are powerful, they come with their own set of challenges and
limitations. One of the main challenges is the choice of the number of hidden layers and the number of
neurons in each layer, which can significantly affect the performance of the network. Overfitting is
another common issue where the network learns the training data too well, including the noise, and
performs poorly on new, unseen data.

In conclusion, feedforward neural networks are a foundational concept in the field of neural networks and
deep learning. They provide a straightforward approach to modeling data and making predictions and
have paved the way for more advanced neural network architectures used in modern artificial intelligence
applications.
LEARNING RULES

A learning rule, or learning process, of an artificial neural network is a method, mathematical logic, or algorithm that improves the network's performance and/or training time. Learning rules update the weights and bias levels of a network as the network is exposed to a specific data environment. Applying a learning rule is an iterative process; it helps a neural network learn from the existing conditions and improve its performance.

The different learning rules in neural networks are as follows:

 Hebbian learning rule
 Perceptron learning rule
 Delta learning rule
 Correlation learning rule
 Outstar learning rule

1. PERCEPTRON LEARNING RULE

In this rule, the network starts its learning by assigning a random value to each weight. The perceptron rule is an error-correction algorithm for supervised learning in single-layer feedforward networks with a linear activation function. Because the algorithm is supervised, it compares the actual output with the expected output in order to calculate the error. If an error is found, it is inferred that the weights of the connections need to be altered.

Each connection in a neural network has an associated weight, which changes in the course of learning. Being an example of supervised learning, the network starts its learning by assigning a random value to each weight. In a perceptron neural network, a learning sample refers to a single instance or data point used during the training process to adjust the weights of the perceptron. The network compares the calculated output value with the expected value and then computes an error function ε, which can be the sum of the squared errors over the individuals in the learning sample.

It is computed as follows:

ε = Σ_i Σ_j (E_ij - O_ij)²

where the first summation runs over the individuals of the learning set and the second summation runs over the output units; E_ij and O_ij are the expected and obtained values of the j-th output unit for the i-th individual.

The network then adjusts the weights of the different units, checking each time whether the error function has increased or decreased. As in conventional regression, this amounts to solving a least-squares problem. Because the expected outputs are supplied along with the inputs, this is an example of supervised learning.
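As a minimal sketch of this mistake-correction process, the function below trains a single-layer perceptron with a threshold (step) output, adjusting the weights only when the obtained output differs from the expected one; the learning rate, epoch count, and AND-gate learning samples are illustrative assumptions.

    import numpy as np

    def train_perceptron(X, targets, lr=0.1, epochs=20, seed=0):
        """Perceptron learning rule: start from random weights, then correct
        the weights only on the samples the network classifies wrongly."""
        rng = np.random.default_rng(seed)
        w = rng.normal(scale=0.1, size=X.shape[1])   # random initial weights
        b = 0.0
        for _ in range(epochs):
            for x, t in zip(X, targets):
                y = 1 if x @ w + b > 0 else 0        # obtained output
                error = t - y                        # expected minus obtained
                w = w + lr * error * x               # alter weights only if error != 0
                b = b + lr * error
        return w, b

    # illustrative AND-gate learning samples
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    t = np.array([0, 0, 0, 1])
    w, b = train_perceptron(X, t)
    print([1 if x @ w + b > 0 else 0 for x in X])    # expect [0, 0, 0, 1]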

2. DELTA LEARNING RULE

Also known as the Least Mean Square (LMS) rule, the Delta Rule was proposed by Bernard Widrow and Marcian Hoff. It minimizes the error over the entire set of training patterns, uses a continuous activation function, and is an example of a supervised learning algorithm.

The rule is based on the gradient-descent approach. The algorithm compares the expected and actual output vectors. If the difference between them is absent or insignificant, the weights of the connections are left unchanged; if a difference is observed, the weights are altered so that the difference becomes zero or insignificant.

In other words, if no difference is observed between the expected and actual output vectors, no learning takes place. For a network with a linear activation function and no hidden layer, plotting the squared error against the weights yields a paraboloid in n-dimensional weight space. Because this surface is concave upward, it has a single lowest value: the vertex of the paraboloid is the point of lowest possible error, and the weight vector corresponding to that vertex is the ideal weight vector we need to find.

The mathematical representation of the rule is:

Δw_i = η (t - y) x_i

where η is the learning rate, t is the expected (target) output, y is the actual output, and x_i is the i-th input. The weight change is thus proportional to the difference between the expected and actual outputs.
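A minimal sketch of this update for a single linear unit follows; the learning rate, the synthetic target function, and the number of steps are illustrative assumptions.

    import numpy as np

    def delta_rule_step(w, x, target, lr=0.05):
        """One LMS/delta update: w <- w + lr * (target - output) * x."""
        output = np.dot(w, x)                    # continuous (linear) activation
        return w + lr * (target - output) * x

    # illustrative usage: learn to reproduce y = 2*x1 + 1*x2 from streamed samples
    rng = np.random.default_rng(0)
    w = np.zeros(2)
    for _ in range(500):
        x = rng.normal(size=2)
        w = delta_rule_step(w, x, target=2 * x[0] + 1 * x[1])
    print(w)   # should approach [2.0, 1.0]

Because the squared-error surface for such a linear unit is the paraboloid described above, repeated delta updates descend that surface toward its single minimum.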
