
ADETAYO MATHEW (190808064)

NAME: ADETAYO MATHEW TEMITOPE

Supervisor: DR KS OJO

PROJECT ASSIGNMENT

22/11/2024

ARTIFICIAL NEURAL NETWORKS

What are Artificial Neural Networks?

Artificial Neural Networks (ANNs) are designed to mimic the way the human brain

processes information. They use interconnected nodes (neurons) and layers to simulate the

brain's neural network. Here is a bit more detail on how this works:

Neural Network Structure

1. Neurons: The basic units of a neural network, like neurons in the brain. Each neuron

receives input, processes it, and passes the output to the next layer.

2. Layers: Neural networks are composed of multiple layers:

o Input Layer: Receives the initial data.

o Hidden Layers: Intermediate layers that process the data through weighted

connections.

o Output Layer: Produces the final output.

Learning Process

1. Forward Propagation: Data is passed through the network, layer by layer, to

generate an output.

2. Loss Calculation: The difference between the predicted output and the actual output

is measured.

3. Backpropagation: The network adjusts the weights of the connections to minimize

the error, like how the brain learns from experience.
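
To make these three steps concrete, the following minimal Python/NumPy sketch (an illustration with made-up layer sizes and learning rate, not code from the sources above) runs a one-hidden-layer network forward, measures a mean squared error loss, and applies a backpropagation update to both weight matrices:

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))   # 4 samples, 3 input features
y = rng.normal(size=(4, 1))   # target outputs
W1 = rng.normal(size=(3, 5))  # input layer -> hidden layer weights
W2 = rng.normal(size=(5, 1))  # hidden layer -> output layer weights

for _ in range(100):
    # 1. Forward propagation, layer by layer
    h = np.tanh(x @ W1)
    y_pred = h @ W2
    # 2. Loss calculation (mean squared error)
    loss = np.mean((y_pred - y) ** 2)
    # 3. Backpropagation: gradients of the loss with respect to each weight matrix
    grad_out = 2 * (y_pred - y) / y.shape[0]
    grad_W2 = h.T @ grad_out
    grad_W1 = x.T @ ((grad_out @ W2.T) * (1 - h ** 2))
    W1 -= 0.1 * grad_W1   # adjust weights to reduce the error
    W2 -= 0.1 * grad_W2

Each pass through the loop repeats the three steps above; over many passes the loss shrinks as the weights adapt.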



Brief history

Although the study of the human brain is thousands of years old, the first step towards

neural networks took place in 1943, when Warren McCulloch, a neurophysiologist, and a young

mathematician, Walter Pitts, wrote a paper on how neurons might work. They modelled a simple neural

network with electrical circuits.

In 1949, Donald Hebb reinforced the concept of neurons in his book, The Organization of Behaviour. It

pointed out that neural pathways are strengthened each time they are used.

In the 1950s, Nathaniel Rochester from the IBM research laboratories led the first effort to simulate a

neural network.

In 1956 the Dartmouth Summer Research Project on Artificial Intelligence provided a boost to both

artificial intelligence and neural networks. This stimulated research in AI and in the much lower-level

neural processing part of the brain.

In 1957, John von Neumann suggested imitating simple neuron functions by using telegraph relays or

vacuum tubes.

In 1958, Frank Rosenblatt, a neurobiologist at Cornell, began work on the Perceptron. He was intrigued by the operation of the eye of a fly: much of the processing which tells a fly to flee is done in its eye.

The Perceptron, which resulted from this research, was built in hardware and is the oldest neural

network still in use today. A single-layer perceptron was found to be useful in classifying a continuous-

valued set of inputs into one of two classes. The perceptron computes a weighted sum of the inputs,

subtracts a threshold, and passes one of two possible values out as the result.
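
As a rough illustration of that decision rule (a sketch only, not Rosenblatt's original hardware), the computation can be written in a few lines of Python:

def perceptron(inputs, weights, threshold):
    # weighted sum of the inputs, minus a threshold
    s = sum(w * x for w, x in zip(weights, inputs)) - threshold
    # one of two possible output values
    return 1 if s >= 0 else 0

print(perceptron([0.5, 1.0], weights=[0.8, -0.3], threshold=0.1))  # prints 1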

In 1959, Bernard Widrow and Marcian Hoff of Stanford developed models they called ADALINE and

MADALINE. These models were named for their use of Multiple Adaptive Linear Elements.

MADALINE was the first neural network to be applied to a real-world problem. It is an adaptive filter

which eliminates echoes on phone lines. This neural network is still in commercial use.

In 1969, Marvin Minsky and Seymour Papert demonstrated the limitations of the Perceptron in their book, Perceptrons. Fear, unfulfilled claims, and criticism from respected voices led much of the funding to be halted, and progress on neural network research stalled. This period of stunted growth lasted through 1981.

In 1982, John Hopfield presented a paper to the National Academy of Sciences. His approach was to create useful devices; he was likeable, articulate, and charismatic.

Also in 1982, the US-Japan Joint Conference on Cooperative/Competitive Neural Networks was held, at which Japan announced its Fifth-Generation effort; this left the US worried about being left behind, and soon funding was flowing once again.

In 1985, the American Institute of Physics began what has become an annual meeting, Neural Networks for Computing. By 1987, the Institute of Electrical and Electronics Engineers' (IEEE) first International Conference on Neural Networks drew more than 1,800 attendees.

In 1997, a recurrent neural network framework, Long Short-Term Memory (LSTM), was proposed by Hochreiter and Schmidhuber.

In 1998, Yann LeCun published Gradient-Based Learning Applied to Document Recognition.

Several other steps have been taken to get us to where we are now; today, discussions of neural networks are prevalent; the future is here! Currently, much neural network development is simply proving that the principle works, and this research is developing neural networks that, due to processing limitations, take weeks to learn.

Reservoir computing

Reservoir Computing is a computational framework derived from recurrent neural

network theory. It was proposed independently by Wolfgang Maass, Thomas Natschläger, and Henry Markram in 2002, and by Herbert Jaeger in 2001. The concept involves using a fixed,

random, and recurrent neural network called the reservoir, which transforms input signals

into a high-dimensional dynamic state. The key advantage is that only the output layer is

trained, making the process computationally efficient.

Reservoir Computing has its roots in earlier neural network architectures like

Liquid State Machines (LSMs) and Echo State Networks (ESNs). These models

demonstrated that randomly connected recurrent neural networks could be used for tasks such

as interval and speech discrimination. The framework has since been applied to various

fields, including time-series prediction, pattern recognition, and control systems.
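
The idea that only the output layer is trained can be shown with a minimal echo state network sketch in Python/NumPy (illustrative only; the reservoir size, spectral radius, input scaling, and ridge parameter below are assumptions, not values taken from the text):

import numpy as np

rng = np.random.default_rng(1)
n_res, T = 200, 1000
u = np.sin(0.1 * np.arange(T + 1))               # toy input signal

# fixed, random input and recurrent weights: the "reservoir" is never trained
W_in = rng.uniform(-0.5, 0.5, size=n_res)
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # keep the dynamics stable

states = np.zeros((T, n_res))
s = np.zeros(n_res)
for t in range(T):
    s = np.tanh(W @ s + W_in * u[t])             # high-dimensional dynamic state
    states[t] = s

# only the readout (output layer) is trained, here by ridge regression,
# to predict the next value of the signal
target = u[1:T + 1]
ridge = 1e-6
W_out = np.linalg.solve(states.T @ states + ridge * np.eye(n_res), states.T @ target)
prediction = states @ W_out

Training reduces to a single linear solve for the readout weights, which is what makes the approach computationally efficient.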

History of Physics-Informed Neural Networks (PINNs)

Physics-Informed Neural Networks (PINNs) are a type of neural network that

incorporates physical laws described by partial differential equations into the training

process. This approach ensures that the network's predictions adhere to known physical

principles, making them highly accurate for scientific and engineering applications.

The concept of PINNs emerged from the need to solve complex physical problems

where traditional numerical methods fall short. By embedding physical laws into the neural

network, PINNs can enhance the information content of the available data, helping the learning algorithm capture the right solution even with few training examples. This approach has been particularly useful in fields like fluid dynamics, where the Navier-Stokes equations govern the behavior of fluids.
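
As a minimal sketch of the idea (using PyTorch automatic differentiation and a toy equation chosen only for illustration, not one of the problems mentioned above), a small network can be trained so that its output satisfies the differential equation du/dx = -u with u(0) = 1, whose exact solution is e^(-x):

import torch

net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x = torch.linspace(0.0, 2.0, 100).reshape(-1, 1).requires_grad_(True)
x0 = torch.zeros(1, 1)                                # boundary point x = 0

for step in range(2000):
    u = net(x)
    # du/dx via automatic differentiation
    du_dx = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    physics_loss = ((du_dx + u) ** 2).mean()          # residual of du/dx = -u
    boundary_loss = ((net(x0) - 1.0) ** 2).mean()     # enforce u(0) = 1
    loss = physics_loss + boundary_loss               # physical law is part of the loss
    opt.zero_grad()
    loss.backward()
    opt.step()

The physical law enters through the loss itself, so the network is penalised whenever its predictions violate the equation, even at points where no data are available.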

Integration with Artificial Neural Networks (ANNs)

Both Reservoir Computing and PINNs represent significant advancements in the field

of ANNs. Reservoir Computing extends the capabilities of recurrent neural networks by

simplifying the training process and leveraging the dynamic properties of the reservoir. This

makes it particularly effective for tasks involving temporal patterns and dynamic systems.

On the other hand, PINNs enhance the robustness and accuracy of ANNs by

incorporating physical laws into the training process. This integration allows ANNs to solve

complex physical problems more effectively, bridging the gap between data-driven and

physics-based modeling.

How Does a Neural Network Work?

The ability of a neural network to ‘think’ has revolutionized computing as we know it.

These smart solutions are capable of interpreting data and accounting for context.

Four critical steps that neural networks take to operate effectively are:

• Associating or training enables neural networks to ‘remember’ patterns. If the

computer is shown an unfamiliar pattern, it will associate the pattern with the closest

match present in its memory.

• Classification or organizing data or patterns into predefined classes.

• Clustering or the identification of a unique aspect of each data instance to classify it

even without any other context present.

• Prediction, or the production of expected results using a relevant input, even when all

context is not provided upfront.

Neural networks require high throughput to carry out these functions accurately in near real-time. This is achieved by deploying numerous processors that operate in parallel with each other, arranged in tiers.

The neural networking process begins with the first tier receiving the raw input data.

You can compare this to the optic nerves of a human being receiving visual inputs. After that,

each consecutive tier gets the results from the preceding one. This goes on until the final tier

has processed the information and produced the output.



Every individual processing node contains its own database, including all its past learnings and the rules that it was either originally programmed with or developed over time. These

nodes and tiers are all highly interconnected.

The learning process (also known as training) begins once a neural network is

structured for a specific application. Training can take either a supervised approach or an

unsupervised approach. In the former, the network is provided with correct outputs either

through the delivery of the desired input and output combination or the manual assessment of

network performance. On the other hand, unsupervised training occurs when the network

interprets inputs and generates results without external instruction or support.

Adaptability is one of the essential qualities of a neural network. This characteristic

allows machine learning algorithms to be modified as they learn from their training and

subsequent operations. Learning models are fundamentally centered around the weighting of input streams, wherein each node assigns a weight to the input data it receives from its preceding nodes. Inputs that prove instrumental to deriving the correct answers are given higher weight in subsequent processes.

Apart from adaptability, neural networks leverage numerous principles to define their

operating rules and make determinations.

Fuzzy logic, gradient-based training, Bayesian methods, and genetic algorithms all

play a role in the decision-making process at the node level. This helps individual nodes

decide what should be sent ahead to the next tier based on the inputs received from the

preceding tier.

Basic rules on object relationships can also help ensure higher quality data modeling.

For instance, a facial recognition neural network can be instructed ‘teeth are always below

the nose’ or ‘ears are on each side of a face’. Adding such rules manually can help decrease

training time and aid in the creation of a more efficient neural network model.

However, the addition of rules is not always a good thing. Doing so can also lead to

incorrect assumptions when the algorithm tries to solve problems unrelated to the rules.

Preloading the wrong ruleset can lead to the creation of neural networks that provide

irrelevant, incorrect, unhelpful, or counterproductive results. This makes it essential to choose

the rules that are added to the system carefully.

While neural networking, and especially unsupervised learning, still have a long way

to go before attaining perfection, we might be closer to achieving a defining breakthrough

than we think. It is a fact that the connections within a neural network are nowhere as

numerous or efficient as those in the human brain. However, Moore’s Law, which states that

the average processing power of computers is expected to double every two years, is still

flourishing. This trend gives our expectations from AI and neural networks a definitive

direction.

Types of artificial neural networks with their peculiarities, advantages and disadvantages; activation functions and types

The main types of artificial neural networks (ANNs), and the activation functions they use, are outlined below.

Types of Artificial Neural Networks

1. Feedforward Neural Networks (FNN)

o Peculiarities: The simplest type of ANN where connections between nodes do

not form a cycle. Information moves in one direction—from input nodes,

through hidden nodes (if any), to output nodes.

o Advantages: Easy to design and implement, good for simple pattern

recognition tasks.

o Disadvantages: Limited in handling complex tasks, prone to overfitting.



2. Convolutional Neural Networks (CNN)

o Peculiarities: Primarily used for image and video recognition. They use

convolutional layers to automatically and adaptively learn spatial hierarchies

of features.

o Advantages: Excellent for image processing, reduces the number of

parameters, and handles spatial data well.

o Disadvantages: Requires a large amount of data and computational power, can

be complex to design.

3. Recurrent Neural Networks (RNN)

o Peculiarities: Designed to recognize patterns in sequences of data, such as time

series or natural language. They have connections that form directed cycles,

allowing them to maintain a memory of previous inputs.

o Advantages: Effective for sequential data, good for tasks like language

modeling and time series prediction.

o Disadvantages: Prone to vanishing gradient problem, difficult to train.

4. Long Short-Term Memory Networks (LSTM)

o Peculiarities: A type of RNN designed to overcome the vanishing gradient

problem. They have a more complex architecture that includes memory cells

to maintain information for long periods.

o Advantages: Better at capturing long-term dependencies, effective for tasks

like speech recognition and language translation.

o Disadvantages: Computationally expensive, complex to design and train.



5. Generative Adversarial Networks (GAN)

o Peculiarities: Consist of two neural networks, a generator and a discriminator,

that compete against each other. The generator creates data, and the

discriminator evaluates it.

o Advantages: Capable of generating realistic data, useful for tasks like image

generation and data augmentation.

o Disadvantages: Difficult to train, can suffer from mode collapse.
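
A minimal GAN training-loop sketch (in PyTorch, with toy data and made-up layer sizes chosen only for illustration) shows how the two networks compete:

import torch

G = torch.nn.Sequential(torch.nn.Linear(8, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))   # generator
D = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))   # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = torch.nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 2) * 0.5 + 2.0          # toy "real" data
    fake = G(torch.randn(64, 8))                   # generator creates data from noise

    # discriminator learns to score real samples as 1 and fake samples as 0
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # generator learns to fool the discriminator into scoring its samples as 1
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()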

ANNs AND DYNAMICAL SYSTEMS

The exploration of dynamical systems using ANNs began in earnest with the advent of

Hopfield networks (1982), a form of recurrent neural network (RNN) applied to

optimization problems and associative memory.

Reservoir Computing (2000s), including Echo State Networks (ESNs) and Liquid State

Machines (LSMs), specifically targeted dynamical system modeling.

Deep Learning Era:

With the resurgence of ANNs in the 2010s, thanks to increased computational power and

better algorithms, ANNs were increasingly applied to complex nonlinear dynamical systems.

Techniques like Long Short-Term Memory (LSTM) and Transformer networks now

dominate in dynamical systems analysis.



Efficiency in Modeling Dynamical Systems

Why Use ANNs for Dynamical Systems?

1. Universal Approximation:

ANNs can approximate any continuous function, making them ideal for modeling the

often-nonlinear dynamics of systems like weather, fluid mechanics, or stock markets.

2. Data-Driven Modeling:

Unlike traditional approaches that require a deep understanding of physical laws,

ANNs can model systems purely based on data.

3. Flexibility:

ANNs can handle time-series data, spatial data, or any combination thereof, enabling

them to analyze a wide variety of dynamical behaviors.

Advantages of ANNs in Dynamical Systems

1. Scalability:

o ANNs can handle large datasets and high-dimensional systems.

2. Nonlinear Modeling:

o They excel at capturing nonlinearity, chaos, and other complexities in

dynamical systems.

3. Generalization:

o Once trained, ANNs can predict future states of a dynamical system with high

accuracy.

4. Automation:

o Require minimal domain knowledge compared to traditional mechanistic

models.

Disadvantages of ANNs in Dynamical Systems

1. Data Dependency:

o Performance is highly dependent on the quantity and quality of training data.

2. Interpretability:

o ANNs are often criticized as "black boxes," making it difficult to extract

physical insights.

3. Overfitting:

o Poor generalization may occur if the network is too complex relative to the

available data.

4. Computational Cost:

o Training ANNs for large or high-resolution dynamical systems can be

resource intensive.

Applications in Dynamical Systems

1. Weather and Climate Modeling:

ANNs are used to predict temperature, rainfall, and other meteorological phenomena.

2. Fluid Dynamics:

o Techniques like Physics-Informed Neural Networks (PINNs) integrate

physical laws to predict flow patterns in fluids.

3. Stock Market Prediction:

o Time-series forecasting of financial data, a classic example of chaotic

dynamics.

4. Robotics and Control:

o ANNs help design control systems for robots, including path optimization and

stability.

5. Biological Systems:

o Model neuron activity or the spread of diseases.

Future Directions

• Physics-Informed Neural Networks (PINNs):

Combine data-driven ANN methods with physical laws for more accurate and

interpretable models of dynamical systems.

• Spiking Neural Networks (SNNs):

Mimic the timing-based behavior of biological neurons, potentially offering efficiency

in modeling real-world dynamics.

• Hybrid Models:

Use ANNs to complement traditional numerical methods (e.g., finite element

analysis), bridging data-driven and mechanistic approaches.

Types of Activation Functions

Activation functions can be broadly categorized into the following types:

1. Linear Activation Function

• Definition: The output is proportional to the input.

f(x) = x

• Advantages:

o Simple to implement.

o Useful for regression problems in the output layer.

• Disadvantages:

o Lacks non-linearity, making it unsuitable for hidden layers.

o Stacked linear layers collapse into a single equivalent linear layer, so adding depth gains nothing.



2. Non-Linear Activation Functions

a. Sigmoid Function

• Definition: f(x) = 1 / (1 + e^(−x))

• Range: (0,1)

• Advantages:

o Outputs can be interpreted as probabilities.

o Useful in binary classification problems.

• Disadvantages:

o Vanishing gradient problem for large/small input values.

o Slow convergence during training.

b. Hyperbolic Tangent (Tanh) Function

• Definition: f(x) = tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))

• Range: (−1,1)

• Advantages:

o Zero-centered output, making optimization easier.

o Non-linearity helps model complex patterns.

• Disadvantages:

o Suffers from the vanishing gradient problem.

c. Rectified Linear Unit (ReLU)

• Definition: f(x) = max(0, x)

• Range: [0, ∞)

• Advantages:

o Simple and computationally efficient.

o Mitigates vanishing gradient issues in most cases.

o Sparse activation (activates only some neurons).

• Disadvantages:

o Can suffer from the dying ReLU problem, where neurons output zero

permanently for negative inputs.

d. Leaky ReLU

• Definition: f(x) = x if x > 0, else f(x) = αx, where α is a small constant (e.g., 0.01).

• Range: (−∞,∞)

• Advantages:

o Addresses the dying ReLU problem by allowing small gradients for negative

inputs.

• Disadvantages:

o The choice of α is arbitrary and affects performance.

e. Exponential Linear Unit (ELU)

• Definition: f(x) = x if x > 0, else f(x) = α(e^x − 1), where α > 0.

• Range: (−α, ∞)

• Advantages:

o Smooth transitions, helping convergence.



o Avoids vanishing gradients for negative inputs.

• Disadvantages:

o Computationally expensive due to the exponential term.

f. SoftMax Function

• Definition: f(x)_i = e^(x_i) / Σ_(j=1)^(N) e^(x_j), where x_i is the i-th input and N is the total number of outputs.

• Range: (0,1), sums to 1.

• Advantages:

o Ideal for multi-class classification problems.

o Outputs represent probabilities.

• Disadvantages:

o Prone to numerical instability for large input values.

g. Swish Function

• Definition: f(x) = x · sigmoid(x) = x / (1 + e^(−x))

• Range: (−∞,∞)

• Advantages:

o Smooth, non-monotonic, and helps improve accuracy.

o Suitable for deep networks.

• Disadvantages:

o Computationally intensive.
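
For reference, the functions above can be written compactly in Python/NumPy (an illustrative sketch, not a library implementation; the softmax subtracts the maximum input first, the usual remedy for the numerical instability mentioned above):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def swish(x):
    return x * sigmoid(x)

def softmax(x):
    z = x - np.max(x)            # shift inputs for numerical stability
    e = np.exp(z)
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))   # three probabilities that sum to 1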

Choosing the Right Activation Function



• Input/Hidden Layers:

o Use ReLU or its variants (Leaky ReLU, ELU) for fast training and improved

performance.

• Output Layers:

o Sigmoid: Binary classification.

o SoftMax: Multi-class classification.

o Linear: Regression tasks.

References

Dynamical Systems and Neural Networks. arXiv:2004.11826 (arXiv.org).

Dynamical Systems and Artificial Neural Networks.

Strachnyi, K. Brief History of Neural Networks. Analytics Vidhya, Medium.

What Is a Neural Network and its Types?
