Deep Learning UNIT-3
The anatomy of a neural network involves understanding its fundamental components, including
layers, neurons, weights, biases, and activation functions.
For example, a Dense layer in Keras can be created with 32 units (neurons) and an input shape
of (784,).
• The input data should have a shape of (784,), which means it should be a 1D array with
784 elements.
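A minimal sketch of such a layer, assuming TensorFlow's bundled Keras is installed. The variable names and the random batch are illustrative only. Older Keras code often passes input_shape=(784,) directly to the first layer; declaring the shape with keras.Input, as here, is an equivalent way to state it.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# A Dense layer with 32 units (neurons).
dense = layers.Dense(32)

# keras.Input declares that each sample is a 1D vector of 784 values.
model = keras.Sequential([
    keras.Input(shape=(784,)),
    dense,
])

# Hypothetical batch: 5 samples, 784 features each.
batch = np.random.rand(5, 784).astype("float32")
output = model(batch)

output_shape = tuple(output.shape)       # one 32-value vector per sample
kernel_shape = tuple(dense.kernel.shape)  # weight matrix: 784 inputs x 32 units
print(output_shape, kernel_shape)
```

The layer maps every 784-element input vector to a 32-element output vector, so its weight matrix has one row per input feature and one column per unit.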
Consider a Sequential model with two Dense layers, each with 32 neurons. The first layer
specifies an input shape of (784,), while the second layer does not specify an input shape;
it automatically infers its input shape from the output shape of the previous layer.
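A sketch of this two-layer model, under the assumption that TensorFlow's Keras is available. The names first and second are illustrative, not part of the original listing.

```python
from tensorflow import keras
from tensorflow.keras import layers

first = layers.Dense(32)   # receives the declared 784 input features
second = layers.Dense(32)  # no shape given: inferred from the first layer's output

model = keras.Sequential([
    keras.Input(shape=(784,)),
    first,
    second,
])

first_kernel_shape = tuple(first.kernel.shape)    # (784, 32)
second_kernel_shape = tuple(second.kernel.shape)  # (32, 32), inferred automatically
print(first_kernel_shape, second_kernel_shape)
```

The second weight matrix has 32 rows because the previous layer outputs 32 values per sample; nothing in the code had to state this explicitly.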
2. Models: networks of layers
In deep learning, a model is typically represented as a directed, acyclic graph (DAG) of layers.
The topology of a network defines its hypothesis space. Common topologies that serve different
purposes and address different types of tasks include:
a) Two-branch networks
b) Multihead networks
c) Inception blocks
b) Multihead networks:
In multihead networks, the model has multiple output branches (heads), each producing its own
output prediction.
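As a rough illustration, a multihead model can be written with the Keras functional API. The feature size and the two heads ("age" and "category") below are hypothetical, chosen only to show one regression head and one classification head sharing a common trunk.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Shared trunk: every head reads the same learned representation.
inputs = keras.Input(shape=(64,), name="features")
shared = layers.Dense(32, activation="relu")(inputs)

# Two heads, each with its own output prediction (hypothetical tasks).
age = layers.Dense(1, name="age")(shared)                                # regression head
category = layers.Dense(5, activation="softmax", name="category")(shared)  # 5-class head

model = keras.Model(inputs=inputs, outputs=[age, category])

output_shapes = [tuple(o.shape) for o in model.outputs]
print(output_shapes)
```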
c) Inception blocks:
These blocks consist of multiple parallel convolutional branches with different filter sizes or
operations (e.g., 1x1, 3x3, 5x5 convolutions), followed by concatenation or merging of their
outputs.
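A simplified sketch of such a block, assuming Keras is available. Real Inception modules also use 1x1 reductions before the larger convolutions and a pooling branch; those are omitted here, and the image size and filter counts are arbitrary.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(32, 32, 3))

# Parallel branches with different filter sizes; padding="same" keeps the
# spatial size identical so the outputs can be concatenated.
branch_1x1 = layers.Conv2D(16, 1, padding="same", activation="relu")(inputs)
branch_3x3 = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
branch_5x5 = layers.Conv2D(16, 5, padding="same", activation="relu")(inputs)

# Merge the branches along the channel axis: 16 + 16 + 16 = 48 channels.
merged = layers.Concatenate(axis=-1)([branch_1x1, branch_3x3, branch_5x5])

model = keras.Model(inputs, merged)
block_output_shape = tuple(model.output_shape)
print(block_output_shape)
```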
Optimizers:
An optimizer is an algorithm that adjusts the weights of the neural network based on the
gradients of the loss function with respect to the weights. It determines how the weights are
updated during the training process.
A neural network that has multiple outputs may have multiple loss functions (one per output).
But the gradient-descent process must be based on a single scalar loss value, so for multiloss
networks all losses are combined (for example, by averaging or a weighted sum) into a single
scalar quantity.
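The combination step can be sketched with plain NumPy. The two outputs, their targets, and the weights below are made-up values for illustration. In Keras, the value minimized for a multi-output model is the weighted sum of the individual losses, with the weights taken from the loss_weights argument of compile().

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error for a regression output."""
    return float(np.mean((y_true - y_pred) ** 2))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Binary cross-entropy for a probability output."""
    p = np.clip(y_prob, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

# Hypothetical predictions from a two-output network.
loss_a = mse(np.array([1.0, 2.0, 3.0]), np.array([1.5, 2.0, 2.0]))
loss_b = binary_crossentropy(np.array([1.0, 0.0]), np.array([0.8, 0.3]))

losses = [loss_a, loss_b]
weights = [1.0, 0.5]  # illustrative weights

combined_mean = sum(losses) / len(losses)                        # plain average
combined_weighted = sum(w * l for w, l in zip(weights, losses))  # weighted sum
print(combined_mean, combined_weighted)
```

Either way, gradient descent sees one scalar, and the weights control how much each output influences the update.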
➢ TensorFlow was released on 9th November 2015 under the Apache License 2.0.
➢ It is built in such a way that it can easily run on multiple CPUs and GPUs as well as on
mobile operating systems.
➢ It provides wrappers in several languages, such as Java, C++, and Python.
Type:
Type describes the data type assigned to Tensor’s elements.
A user needs to consider the following activities for building a Tensor −
➢ Build an n-dimensional array
➢ Convert the n-dimensional array into a Tensor.
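The two steps can be sketched as follows, assuming TensorFlow and NumPy are installed. The array values are arbitrary.

```python
import numpy as np
import tensorflow as tf

# Step 1: build an n-dimensional array (here 2D, with an explicit element type).
array = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]], dtype=np.float32)

# Step 2: convert the n-dimensional array into a Tensor.
tensor = tf.convert_to_tensor(array)

tensor_rank = len(tensor.shape)     # number of dimensions
tensor_shape = tuple(tensor.shape)  # size along each dimension
tensor_type = tensor.dtype          # data type of the elements
print(tensor_rank, tensor_shape, tensor_type)
```

The Tensor keeps the element type of the source array, which is the "Type" property described above.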
➢ TensorFlow provides a flexible system of Tensor operations, which enables users to build
computations as graphs. The modular architecture allows users to use parts of the
TensorFlow graph across various components of an application.
Theano utilizes GPUs for faster computation and efficiently computes gradients by building
symbolic graphs automatically. It is well suited to numerically unstable expressions: it first
detects them and then computes them with more stable algorithms.
Using Theano:
➢ define expression
➢ compile expression
➢ execute expression
CNTK
➢ Models are trained using C++ or Python, and can be loaded from C# or Java for making
predictions.
➢ We can implement CNN, FNN, RNN, Batch Normalisation and Sequence-to-Sequence with
attention.
➢ It provides us the functionality to add new user-defined core-components on the GPU from
Python.
➢ It also provides automatic hyperparameter tuning.
➢ We can implement Reinforcement learning, Generative Adversarial Networks (GANs),
Supervised as well as Unsupervised learning.
➢ For massive datasets, CNTK has built-in optimised readers.
There are two ways to define a model: using the Sequential class (only for linear stacks of layers,
which is the most common network architecture by far) or the functional API (for directed
acyclic graphs of layers, which lets you build completely arbitrary architectures).
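A short sketch comparing the two styles, assuming Keras is available. The layer sizes are arbitrary; both definitions describe the same two-layer network.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sequential class: a linear stack of layers.
seq_model = keras.Sequential([
    keras.Input(shape=(784,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# Functional API: layers are called on tensors, which also allows
# branching and merging (arbitrary directed acyclic graphs).
inputs = keras.Input(shape=(784,))
x = layers.Dense(32, activation="relu")(inputs)
outputs = layers.Dense(10, activation="softmax")(x)
func_model = keras.Model(inputs=inputs, outputs=outputs)

seq_params = seq_model.count_params()
func_params = func_model.count_params()
print(seq_params, func_params)
```

Both models have the same number of trainable parameters, since only the way the graph is written differs.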
The learning process is configured in the compilation step, where you specify the optimizer and
loss function(s) that the model should use, as well as the metrics you want to monitor during
training. Here’s an example with a single loss function, which is by far the most common case:
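A minimal sketch of the compilation step, assuming Keras is available. The small model, the learning rate, and the choice of loss are illustrative.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(784,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# Configure the learning process: optimizer, a single loss, and metrics.
model.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=0.001),
    loss="mse",
    metrics=["accuracy"],
)
```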
Finally, the learning process consists of passing NumPy arrays of input data (and the
corresponding target data) to the model via the fit() method.
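A sketch of a complete define, compile, and fit cycle on random data, assuming Keras and NumPy are available. The data is synthetic, so the resulting loss values carry no meaning; the point is only the shape of the call.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(784,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="rmsprop",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Synthetic inputs and one-hot targets (128 samples, 10 classes).
input_data = np.random.rand(128, 784).astype("float32")
class_ids = np.random.randint(0, 10, size=128)
target_data = np.eye(10, dtype="float32")[class_ids]

# fit() iterates over the data in mini-batches for the given number of epochs.
history = model.fit(input_data, target_data,
                    batch_size=32, epochs=2, verbose=0)
num_epochs_run = len(history.history["loss"])
print(num_epochs_run)
```

The returned History object records the loss and metric values for each epoch.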
• Install frameworks like TensorFlow, PyTorch, or Keras based on your preferences and
project requirements.
Development Environment:
Security:
• Implement security measures such as firewalls, antivirus software, and regular software
updates to protect your workstation from cyber threats.
• Jupyter Notebooks are a popular choice for running deep learning experiments.
• Jupyter Notebooks allow you to write and execute code in a cell-by-cell fashion, enabling
interactive development.
• Jupyter Notebooks support inline plotting, which allows you to visualize data, model
architectures, training curves, and other metrics directly within the notebook.
➢ Launch an EC2 instance using the Deep Learning AMI provided by AWS.
➢ Choose an instance type with appropriate CPU and memory resources for your
experiments.
➢ Install and configure Jupyter Notebooks on the EC2 instance.
➢ Start a Jupyter server and access it through your web browser.
➢ Install the necessary dependencies including Python, TensorFlow, Keras, and other
libraries on your local Unix workstation.
➢ Once everything is set up, you can choose to run your Keras experiments either as
Jupyter notebooks or as a regular Python codebase directly on your local machine.
• RMSprop (Root Mean Square Propagation) is a popular choice of optimizer for neural
networks.
• The loss argument specifies the loss function to use. Since you're performing binary
classification (predicting either 0 or 1), binary cross-entropy is the usual choice.
• The metric is 'accuracy', which measures the proportion of correctly classified samples.
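To make the loss and the metric concrete, both can be computed by hand with NumPy. The labels and predicted probabilities below are invented for illustration, and the 0.5 decision threshold is an assumption.

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0], dtype=np.float64)
y_prob = np.array([0.9, 0.2, 0.6, 0.4, 0.1])  # hypothetical model outputs

# Binary cross-entropy: penalizes confident wrong probabilities heavily.
eps = 1e-7
p = np.clip(y_prob, eps, 1 - eps)  # avoid log(0)
bce = float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

# Accuracy: proportion of samples whose thresholded prediction is correct.
y_pred = (y_prob >= 0.5).astype(int)
accuracy = float(np.mean(y_pred == y_true))

print(bce, accuracy)  # accuracy is 0.8: four of five samples are correct
```

The loss is what the optimizer minimizes; accuracy is only monitored, because it is not differentiable.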
4. Validating your approach
To validate the approach of building and training the neural network model, split the data:
divide your dataset into training and validation sets.
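A simple hold-out split can be sketched with NumPy. The dataset here is synthetic, and the 200-sample validation size is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic dataset: 1000 samples, 20 features, binary labels.
x = rng.random((1000, 20))
y = rng.integers(0, 2, size=1000)

# Shuffle the indices first so the split does not depend on the original order.
indices = rng.permutation(len(x))
num_val = 200
val_idx, train_idx = indices[:num_val], indices[num_val:]

x_val, y_val = x[val_idx], y[val_idx]
x_train, y_train = x[train_idx], y[train_idx]

print(x_train.shape, x_val.shape)
```

The model is trained on the training portion only, and its performance on the validation portion is monitored after each epoch to detect overfitting.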
To vectorize the labels, there are two possibilities: you can cast the label list as an integer tensor,
or you can use one-hot encoding. One-hot encoding is a widely used format for categorical data,
also called categorical encoding.
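Both options can be sketched with NumPy. The helper name to_one_hot and the sample labels are illustrative. Keras also ships keras.utils.to_categorical for the same purpose. Integer labels are typically paired with the sparse_categorical_crossentropy loss and one-hot labels with categorical_crossentropy.

```python
import numpy as np

def to_one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) matrix with a single 1 per row."""
    results = np.zeros((len(labels), num_classes), dtype="float32")
    results[np.arange(len(labels)), labels] = 1.0
    return results

labels = [3, 0, 45]  # hypothetical class indices out of 46 classes

# Option 1: cast the label list as an integer tensor.
int_labels = np.array(labels, dtype="int64")

# Option 2: one-hot (categorical) encoding.
one_hot_labels = to_one_hot(int_labels, num_classes=46)

print(int_labels.shape, one_hot_labels.shape)
```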
Input Layer:
input_shape=(10000,) specifies that each input sample has 10,000 features. This is consistent
with your vectorized input data, where each feature corresponds to one term in a large
vocabulary (assuming a one-hot encoded vector of vocabulary size 10,000).
The first Dense layer has 64 units. With 46 output classes, a much smaller layer (for example,
16 units) could drop information needed to separate the classes; 64 units reduce the
likelihood that this layer acts as a bottleneck.
Hidden Layer:
Another Dense layer with 64 units and 'relu' activation. This adds depth to the network, allowing
it to learn more complex patterns in the data. The 'relu' activation function introduces
non-linearity; without it, the stacked Dense layers would collapse into a single linear
transformation.
Output Layer:
A Dense layer with 46 units corresponds to the 46 different output classes.
The 'softmax' activation function is used because it outputs a probability distribution over the 46
classes, which is suitable for multiclass classification. The i-th output is the probability that
the input belongs to class i, and the 46 outputs sum to 1.
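The three layers described above can be sketched as a Keras model, assuming TensorFlow's Keras is installed. The random input batch is only there to show the shape and the sum of the outputs; the model is untrained.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(10000,)),             # 10,000 features per sample
    layers.Dense(64, activation="relu"),     # input layer
    layers.Dense(64, activation="relu"),     # hidden layer
    layers.Dense(46, activation="softmax"),  # one output per class
])

# Hypothetical batch of 2 vectorized samples.
samples = np.random.rand(2, 10000).astype("float32")
probabilities = model.predict(samples, verbose=0)

row_sums = probabilities.sum(axis=1)  # each row is a probability distribution
print(probabilities.shape, row_sums)
```

The predicted class for a sample is the index of its largest probability, for example np.argmax(probabilities[0]).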