Artificial Neural Network

An artificial neural network is an interconnected group of nodes,
inspired by a simplification of neurons in a brain. In the usual diagram
of such a network, each circular node represents an artificial neuron and
an arrow represents a connection from the output of one artificial neuron
to the input of another.

An ANN is based on a collection of connected units or nodes called
artificial neurons, which loosely model the neurons in a biological brain.
Each connection, like the synapses in a biological brain, can transmit a
signal to other neurons.

An artificial neuron receives a signal, then processes it and can signal
neurons connected to it. The "signal" at a connection is a real number,
and the output of each neuron is computed by some non-linear function
of the sum of its inputs. The connections are called edges.

Neurons and edges typically have a weight that adjusts as learning
proceeds. The weight increases or decreases the strength of the signal
at a connection.

Neurons may have a threshold such that a signal is sent only if the
aggregate signal crosses that threshold.
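
As a concrete illustration of the description above, a single artificial
neuron can be sketched in a few lines of Python; the weights, bias and
step activation below are made-up placeholder values, not taken from the
text.

import numpy as np

def artificial_neuron(inputs, weights, bias, activation):
    # Weighted sum of the incoming signals plus a bias term.
    total = np.dot(weights, inputs) + bias
    # A non-linear activation function turns the sum into the output signal.
    return activation(total)

# Example: a step activation that fires only above a threshold of 0.
step = lambda x: 1.0 if x > 0 else 0.0
print(artificial_neuron(np.array([0.5, -1.0, 2.0]),
                        np.array([0.8, 0.2, 0.4]),
                        bias=-0.5,
                        activation=step))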

Typically, neurons are aggregated into layers. Different layers may
perform different transformations on their inputs. Signals travel from the
first layer (the input layer) to the last layer (the output layer), possibly
after traversing the layers multiple times.

Feedforward neural network

A feedforward neural network is an artificial neural network wherein
connections between the nodes do not form a cycle.[1]

The feedforward neural network was the first and simplest type of
artificial neural network devised.[2] In this network, the information moves
in only one direction, forward, from the input nodes, through the
hidden nodes (if any) and to the output nodes. There are no cycles or
loops in the network.[1]

Single-layer perceptron

The simplest kind of neural network is a single-layer perceptron network,
which consists of a single layer of output nodes; the inputs are fed
directly to the outputs via a series of weights.

The sum of the products of the weights and the inputs is calculated in
each node, and if the value is above some threshold (typically 0) the
neuron fires and takes the activated value (typically 1); otherwise it takes
the deactivated value (typically -1).

Neurons with this kind of activation function are also called artificial
neurons or linear threshold units.
In the literature the term perceptron often refers to networks consisting
of just one of these units.

A perceptron can be created using any values for the activated and
deactivated states as long as the threshold value lies between the two.
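
A minimal sketch of such a single-layer perceptron in Python, assuming a
threshold of 0 and activated/deactivated values of 1 and -1 as described
above (the example weights and inputs are made up):

import numpy as np

def perceptron_output(x, W, b, threshold=0.0):
    # One weighted sum per output node, followed by the hard threshold.
    sums = W @ x + b
    return np.where(sums > threshold, 1.0, -1.0)

x = np.array([1.0, 0.0, -1.0])            # input signals
W = np.array([[0.3, -0.2, 0.5],           # one row of weights per output node
              [0.1,  0.4, -0.6]])
b = np.array([0.0, 0.1])
print(perceptron_output(x, W, b))         # -> [-1.  1.]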

Perceptrons can be trained by a simple learning algorithm that is usually
called the delta rule. It calculates the errors between the calculated
output and sample output data, and uses this to create an adjustment to
the weights, thus implementing a form of gradient descent.

Single-layer perceptrons are only capable of learning linearly
separable patterns.
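
A sketch of this training loop for a single output node, assuming the
usual form of the rule with a small learning rate (the toy data and the
rate are illustrative, not from the text):

import numpy as np

def train_delta_rule(samples, targets, lr=0.1, epochs=20):
    # samples: one input vector per row; targets: desired outputs (+1 or -1).
    w = np.zeros(samples.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            y = 1.0 if np.dot(w, x) + b > 0 else -1.0   # calculated output
            error = t - y                               # error vs. sample output
            w += lr * error * x                         # adjust the weights
            b += lr * error
    return w, b

# Linearly separable toy data: an AND-like function of two inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([-1, -1, -1, 1], dtype=float)
print(train_delta_rule(X, T))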

Multi-layer perceptron

This class of networks consists of multiple layers of computational units,
usually interconnected in a feed-forward way. Each neuron in one layer
has directed connections to the neurons of the subsequent layer. In
many applications the units of these networks apply a sigmoid
function as an activation function.
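
A small forward pass for such a network might look like the following;
the layer sizes and random weights are placeholders chosen only to show
the structure:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, layers):
    # layers: list of (weight matrix, bias vector) pairs, one per layer.
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)    # each layer transforms its inputs
    return a

rng = np.random.default_rng(0)
layers = [(rng.normal(size=(4, 3)), np.zeros(4)),   # input (3) -> hidden (4)
          (rng.normal(size=(2, 4)), np.zeros(2))]   # hidden (4) -> output (2)
print(mlp_forward(np.array([0.2, -0.7, 1.0]), layers))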

The universal approximation theorem for neural networks states that
every continuous function that maps intervals of real numbers to some
output interval of real numbers can be approximated arbitrarily closely by
a multi-layer perceptron with just one hidden layer. This result holds for a
wide range of activation functions, e.g. for the sigmoidal functions.

Multi-layer networks use a variety of learning techniques, the most
popular being back-propagation. Here, the output values are compared
with the correct answer to compute the value of some predefined error
function. By various techniques, the error is then fed back through the
network. Using this information, the algorithm adjusts the weights of
each connection in order to reduce the value of the error function by
some small amount.

After repeating this process for a sufficiently large number of training
cycles, the network will usually converge to some state where the error
of the calculations is small. In this case, one would say that the network
has learned a certain target function. To adjust weights properly, one
applies a general method for non-linear optimization that is
called gradient descent.

For this, the network calculates the derivative of the error function with
respect to the network weights, and changes the weights such that the
error decreases (thus going downhill on the surface of the error
function). For this reason, back-propagation can only be applied to
networks with differentiable activation functions.
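
A compact sketch of one such update for a network with a single hidden
sigmoid layer and a squared-error function, assuming plain gradient
descent (the sizes, data and learning rate below are illustrative):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, target, W1, b1, W2, b2, lr=0.5):
    # Forward pass.
    h = sigmoid(W1 @ x + b1)              # hidden layer activations
    y = sigmoid(W2 @ h + b2)              # network output
    # Derivatives of the squared error with respect to the weights
    # (possible because the sigmoid is differentiable).
    delta_out = (y - target) * y * (1 - y)
    delta_hid = (W2.T @ delta_out) * h * (1 - h)
    # Move each weight a small step downhill on the error surface.
    W2 -= lr * np.outer(delta_out, h);  b2 -= lr * delta_out
    W1 -= lr * np.outer(delta_hid, x);  b1 -= lr * delta_hid
    return 0.5 * np.sum((y - target) ** 2)

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
for _ in range(1000):                     # repeat over many training cycles
    err = backprop_step(np.array([0.5, -0.3]), np.array([0.8]), W1, b1, W2, b2)
print(err)                                # the error shrinks toward zero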
