Neural Networks and Recurrent Neural Networks
A neural network
consists of one layer of input units, one layer of output units, and, in between,
multiple layers of units that are referred to as hidden units. The outputs of the
input units form the inputs for the units of the first hidden layer (i.e., the first
layer of hidden units), and the outputs of the units of each hidden layer form the
input for the subsequent hidden layer. The outputs of the last hidden layer
form the input for the output layer. The output of each unit is a function of
the weighted sum of its inputs. The weights of this weighted sum, computed in
each unit, are learned through gradient-based optimization on training data
that consists of example inputs and the desired outputs for those inputs.
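To make this computation concrete, the following minimal sketch implements the forward pass of such a network in NumPy. The layer sizes, the use of the logistic function as the unit activation, and the random initialization are illustrative assumptions; the gradient-based learning of the weights is omitted.

```python
import numpy as np

def logistic(z):
    # Logistic function: squashes the weighted sum into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Forward pass through a fully connected network.

    weights[i] and biases[i] map the outputs of layer i to the
    units of layer i + 1; each unit applies the logistic function
    to the weighted sum of its inputs.
    """
    a = x
    for W, b in zip(weights, biases):
        a = logistic(W @ a + b)
    return a

# Hypothetical dimensions: 3 input units, one hidden layer of 4 units, 2 output units.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
biases = [np.zeros(4), np.zeros(2)]
print(forward(np.array([0.5, -1.0, 2.0]), weights, biases))
```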
Recurrent Neural Networks (RNNs) are a special type of neural network in which
the hidden state computed at one time step is fed back as input at the next.
RNNs can be unfolded, as shown in Fig. 4. Each step in the unfolding is referred
to as a time step, where $x_t$ is the input at time step $t$. RNNs can take a
sequence of arbitrary length as input, processing one element of the sequence at
each time step. $s_t$ is the hidden state at time step $t$ and contains information
extracted from all time steps up to $t$. The hidden state $s$ is updated with
information of the new input $x_t$ at each time step:
$s_t = f(U x_t + W s_{t-1})$, where $U$ and $W$ are matrices of weights over the
new input and the previous hidden state, respectively. In practice, either the
hyperbolic tangent or the logistic function is generally used for the function $f$,
which is referred to as the activation function.
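This recurrence can be sketched in a few lines of NumPy. The dimensions, the zero-initialized hidden state, and the choice of the hyperbolic tangent as the activation function $f$ are illustrative assumptions, not prescribed by the text above.

```python
import numpy as np

def rnn_forward(xs, U, W, s0=None):
    """Unfold a simple RNN over an input sequence.

    At each time step t the hidden state is updated as
    s_t = tanh(U @ x_t + W @ s_{t-1}), matching the recurrence above.
    Returns the hidden state after the final time step.
    """
    s = np.zeros(W.shape[0]) if s0 is None else s0
    for x in xs:
        # Fold the information of the new input into the hidden state.
        s = np.tanh(U @ x + W @ s)
    return s

# Hypothetical dimensions: inputs of size 3, hidden state of size 5.
rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(5, 3))
W = rng.normal(scale=0.1, size=(5, 5))
sequence = [rng.normal(size=3) for _ in range(7)]  # arbitrary-length input sequence
print(rnn_forward(sequence, U, W))
```

Note that the same weight matrices $U$ and $W$ are applied at every time step; only the hidden state changes as the sequence is consumed one element at a time.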