Unit 4 - Artificial Intelligence
NEURAL NETWORKS
Image recognition
Natural language processing
Sentiment analysis
Recommendation systems
Autonomous vehicles
Medical diagnosis
Financial forecasting
In summary, neural networks are powerful machine learning models that mimic the behavior of the
human brain and are capable of learning complex patterns and relationships in data, making them
valuable tools for a wide range of applications.
ARCHITECTURE OF NEURAL NETWORKS
Neural Network Architecture refers to the design and structure of an artificial neural network (ANN),
which is a machine learning model inspired by the human brain. It defines how the network's layers,
nodes, and connections are organized to process and analyze data.
The input layer receives the raw data, which is then passed through one or more hidden layers that
perform mathematical computations. The output layer produces the final results, such as predictions or
classifications. The architecture of a neural network determines the complexity, capacity, and
functionality of the model.
Feedforward Neural Networks (FNNs): Basic neural networks where information moves in one
direction, from input nodes through hidden layers (if any) to output nodes. Multi-layer Perceptrons
(MLPs) are a common type of FNN with multiple layers of neurons.
Convolutional Neural Networks (CNNs): Specifically designed for processing grid-like data such as
images. They use convolutional layers to detect spatial patterns by applying convolution operations.
CNNs are effective in tasks such as image classification, object detection, and image
segmentation.
Recurrent Neural Networks (RNNs): Ideal for sequential data where the order of elements
matters, like time series, text, and speech. They incorporate feedback loops to process sequences,
enabling the network to exhibit temporal dynamics. Long Short-Term Memory (LSTM) and Gated
Recurrent Unit (GRU) are popular RNN variants, addressing the vanishing gradient problem and
capturing long-term dependencies.
Generative Adversarial Networks (GANs): Comprises two networks: a generator and a
discriminator, trained adversarially. The generator creates synthetic samples, aiming to fool the
discriminator, which in turn distinguishes between real and fake samples. It is widely used for
generating realistic images, data augmentation, and unsupervised learning tasks.
Autoencoder Neural Networks: Unsupervised learning models aimed at learning efficient data
representations. It consists of an encoder that compresses the input into a latent space and a
decoder that reconstructs the input from the compressed representation. It is used for
dimensionality reduction, feature learning, and anomaly detection.
Recursive Neural Networks (RecNNs): Tailored for hierarchical structures like parse trees or
linguistic structures. Applies the same set of weights recursively at each level, capturing
hierarchical relationships within the data.
Self-Organizing Maps (SOMs): Unsupervised learning models that create low-dimensional
representations of input space. It utilizes competitive learning to cluster and visualize high-
dimensional data onto a two-dimensional grid.
Modular Neural Networks: Consists of multiple interconnected subnetworks, each specialized in
solving a particular subtask. It integrates these subnetworks to perform complex tasks efficiently.
These models represent the breadth of neural network architectures, each tailored to specific data
characteristics and learning requirements. Choosing the appropriate model depends on the nature of the
data and the problem domain.
FEEDFORWARD NEURAL NETWORKS
A feedforward neural network is one of the simplest types of artificial neural networks devised. In this
network, the information moves in only one direction—forward—from the input nodes, through the
hidden nodes (if any), and to the output nodes. There are no cycles or loops in the network. Feedforward
neural networks were the first type of artificial neural network invented and are simpler than other neural
network architectures.
The architecture of a feedforward neural network consists of three types of layers: the input layer, hidden
layers, and the output layer. Each layer is made up of units known as neurons, and the layers are
interconnected by weights.
Input Layer: This layer consists of neurons that receive inputs and pass them on to the next layer. The
number of neurons in the input layer is determined by the dimensions of the input data.
Hidden Layers: These layers are not exposed to the input or output and can be considered as the
computational engine of the neural network. Each hidden layer's neurons take the weighted sum of the
outputs from the previous layer, apply an activation function, and pass the result to the next layer. The
network can have zero or more hidden layers.
Output Layer: The final layer that produces the output for the given inputs. The number of neurons in the
output layer depends on the number of possible outputs the network is designed to produce.
Each neuron in one layer is connected to every neuron in the next layer, making this a fully connected
network. The strength of the connection between neurons is represented by weights, and learning in a
neural network involves updating these weights based on the error of the output.
The working of a feedforward neural network involves two phases: the feedforward phase and the
backpropagation phase.
Feedforward Phase: In this phase, the input data is fed into the network, and it propagates forward
through the network. At each hidden layer, the weighted sum of the inputs is calculated and passed
through an activation function, which introduces non-linearity into the model. This process continues until
the output layer is reached, and a prediction is made.
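The feedforward phase can be sketched in plain Python as below. The sigmoid activation, the 2-3-1 layer sizes, and the hand-picked weights are assumptions chosen for illustration; `sigmoid` and `forward` are hypothetical names.

```python
import math

def sigmoid(z):
    """Logistic activation; squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, layers):
    """Propagate input vector x through a list of (weights, biases) layers.

    weights[j][i] is the weight from unit i of the previous layer
    to unit j of the current layer.
    """
    activation = x
    for weights, biases in layers:
        # Weighted sum of the previous layer's outputs, then the activation function.
        activation = [
            sigmoid(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation

# Hypothetical 2-3-1 network with hand-picked weights (illustration only).
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),  # hidden layer
    ([[0.7, -0.5, 0.2]], [0.05]),                                # output layer
]
prediction = forward([1.0, 2.0], layers)
print(prediction)
```

Each pass through the loop corresponds to one layer, so the same function handles any number of hidden layers.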
Backpropagation Phase: Once a prediction is made, the error (difference between the predicted output
and the actual output) is calculated. This error is then propagated back through the network, and the
weights are adjusted to minimize this error. The process of adjusting weights is typically done using a
gradient descent optimization algorithm.
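A minimal sketch of both phases together is given below, for a network with one hidden layer and one output unit. The sigmoid activation, the squared-error loss, per-sample weight updates, the learning rate, and the logical OR learning set are all assumptions made for the example; `train` is a hypothetical name.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, hidden=2, lr=0.5, epochs=2000, seed=0):
    """Train a tiny network by backpropagation; return the error per epoch."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w_h = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden)]
    b_h = [0.0] * hidden
    w_o = [rng.uniform(-1, 1) for _ in range(hidden)]
    b_o = 0.0
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, target in samples:
            # Feedforward phase.
            h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                 for row, b in zip(w_h, b_h)]
            y = sigmoid(sum(w * hj for w, hj in zip(w_o, h)) + b_o)
            err = y - target
            total += 0.5 * err ** 2
            # Backpropagation phase: error signals for output and hidden units.
            delta_o = err * y * (1 - y)
            delta_h = [delta_o * w_o[j] * h[j] * (1 - h[j]) for j in range(hidden)]
            # Gradient-descent weight updates.
            for j in range(hidden):
                w_o[j] -= lr * delta_o * h[j]
                b_h[j] -= lr * delta_h[j]
                for i in range(n_in):
                    w_h[j][i] -= lr * delta_h[j] * x[i]
            b_o -= lr * delta_o
        history.append(total)
    return history

# Toy learning set: logical OR.
or_samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
history = train(or_samples)
print(f"first epoch error: {history[0]:.4f}, last epoch error: {history[-1]:.4f}")
```

The hidden-layer error signals are computed before any weights change, so each update uses the weights that produced the prediction.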
While feedforward neural networks are powerful, they come with their own set of challenges and
limitations. One of the main challenges is the choice of the number of hidden layers and the number of
neurons in each layer, which can significantly affect the performance of the network. Overfitting is
another common issue where the network learns the training data too well, including the noise, and
performs poorly on new, unseen data.
In conclusion, feedforward neural networks are a foundational concept in the field of neural networks and
deep learning. They provide a straightforward approach to modeling data and making predictions and
have paved the way for more advanced neural network architectures used in modern artificial intelligence
applications.
LEARNING RULES
An artificial neural network's learning rule, or learning process, is a method, mathematical logic, or
algorithm that improves the network's performance, its training time, or both. A learning rule updates
the weights and bias levels of a network as the network is run in a specific data environment.
Applying a learning rule is an iterative process. It helps a neural network learn from the existing
conditions and improve its performance.
Under the perceptron learning rule, the network starts learning by assigning a random value to each
weight. The rule is an error-correction algorithm for supervised training of single-layer feedforward
networks with a linear activation function. Because the algorithm is supervised, it compares the actual
output with the expected output to calculate the error. Whenever an error is found, the weights of the
connections need to be adjusted.
Each connection in a neural network has an associated weight, which changes in the course of learning.
In a perceptron, a learning sample is a single instance or data point used during training to adjust
the weights. For each sample, the network compares the calculated output value with the expected value.
It then calculates an error function ε, which can be the sum of squares of the errors over all
individuals in the learning set, computed as follows:

\[ \varepsilon = \sum_{i} \sum_{j} (E_{ij} - O_{ij})^2 \]

The first summation runs over the individuals of the learning set, and the second summation runs over
the output units. E_ij and O_ij are the expected and obtained values of the jth unit for the ith individual.
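The double summation can be sketched in a few lines of Python. This is only an illustration: the function name `sum_squared_error` and the sample values are assumptions, not part of the original notes.

```python
def sum_squared_error(expected, obtained):
    """Sum of squared errors over all individuals (rows) and output units (columns)."""
    return sum(
        (e - o) ** 2
        for e_row, o_row in zip(expected, obtained)  # first summation: individuals i
        for e, o in zip(e_row, o_row)                # second summation: output units j
    )

# Hypothetical learning set: two individuals, two output units each.
expected = [[1.0, 0.0], [0.0, 1.0]]
obtained = [[0.8, 0.1], [0.3, 0.6]]
error = sum_squared_error(expected, obtained)
print(error)  # approximately 0.30 (0.04 + 0.01 + 0.09 + 0.16)
```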
The network then adjusts the weights of the different units, checking each time whether the error function
has increased or decreased. As in a conventional regression, this amounts to solving a least-squares
problem. Because the weights are adjusted against expected outputs supplied by the user, this is an
example of supervised learning.
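A minimal sketch of this error-correction loop is shown below. It makes several assumptions that the notes do not state: a step (threshold) activation as in the classic perceptron, a learning rate of 0.1, and the logical AND function as a toy learning set. The names `step`, `train_perceptron`, and `predict` are hypothetical.

```python
import random

def step(z):
    """Threshold activation (an assumption; the classic perceptron uses this)."""
    return 1 if z >= 0 else 0

def predict(weights, bias, x):
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_perceptron(samples, lr=0.1, max_epochs=1000, seed=0):
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    # Learning starts from random weights.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(max_epochs):
        mistakes = 0
        for x, expected in samples:
            error = expected - predict(weights, bias, x)
            if error != 0:
                # An error means the connection weights must be altered.
                mistakes += 1
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
        if mistakes == 0:  # every learning sample is classified correctly
            break
    return weights, bias

# Toy learning set: logical AND, which is linearly separable.
and_samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(and_samples)
print([predict(weights, bias, x) for x, _ in and_samples])
```

The loop stops as soon as a full pass over the learning set produces no errors, which is only guaranteed to happen when the classes are linearly separable.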
The Delta Rule, also known as the Least Mean Square (LMS) rule, was introduced by Bernard Widrow and
Marcian Hoff. It minimizes the error over all training patterns. It uses a continuous activation
function and is an example of a supervised learning algorithm.
The rule is based on the gradient-descent approach. For each input vector, the algorithm compares the
expected output vector with the actual output vector. If the difference between them is absent or
insignificant, the weights of the connections are left unchanged. If there is a difference, the
weights are altered in such a way that the difference becomes null or
insignificant.
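The update described above can be sketched for a single linear unit as follows. This is an illustrative sketch, not the notes' own formulation: the learning rate, the number of epochs, the toy data (points on the line t = 2x + 1), and the name `train_delta_rule` are all assumptions.

```python
def train_delta_rule(samples, lr=0.1, epochs=500):
    """Delta (LMS) rule for one linear unit: output = w * x + b."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            output = w * x + b        # continuous (linear) activation
            diff = target - output    # expected minus actual output
            # Gradient-descent step on the squared error; when diff is zero,
            # the weights stay unchanged and no learning takes place.
            w += lr * diff * x
            b += lr * diff
    return w, b

# Hypothetical noise-free data generated from t = 2x + 1.
samples = [(x, 2.0 * x + 1.0) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w, b = train_delta_rule(samples)
print(w, b)  # close to 2 and 1
```

The weight change is proportional to the difference between expected and actual output, so the steps shrink as the unit approaches the target mapping.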
In other words, if there is no difference between the output vectors, no learning takes place. For a
network with a linear activation function and no hidden units, plotting the squared difference against
the weights gives a paraboloid in n-space. The paraboloid is concave upward, so it has a single lowest
value. Because the proportionality constant in the weight update is negative, each update moves the
weights down the slope toward that lowest value. The vertex of the paraboloid is the region of lowest
possible error, and the weight vector that corresponds to the vertex is the ideal weight vector for the
network.
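For a single weight, the paraboloid reduces to a parabola, and its vertex can be computed directly. The sketch below assumes one linear unit with no bias, output = w * x, and made-up data; the names `squared_error` and `vertex_weight` are hypothetical.

```python
def squared_error(w, xs, ts):
    """Squared error of a one-weight linear unit (output = w * x) on the data."""
    return sum((t - w * x) ** 2 for x, t in zip(xs, ts))

def vertex_weight(xs, ts):
    """Weight at the vertex of the error parabola: sum(x * t) / sum(x * x)."""
    return sum(x * t for x, t in zip(xs, ts)) / sum(x * x for x in xs)

# Hypothetical inputs and expected outputs.
xs = [1.0, 2.0, 3.0]
ts = [2.1, 3.9, 6.2]
w_best = vertex_weight(xs, ts)

# The error rises symmetrically on both sides of the vertex.
for w in (w_best - 1.0, w_best - 0.5, w_best, w_best + 0.5, w_best + 1.0):
    print(f"w = {w:.3f}  error = {squared_error(w, xs, ts):.4f}")
```

The vertex formula comes from setting the derivative of the squared error with respect to w to zero, which is the least-squares solution for this one-weight case.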