Deep Learning
Chapters
1. Introduction to Deep Learning
2. Introduction to Linear Algebra
3. Artificial Neural Networks
4. Convolutional Neural Networks
5. Transfer Learning
6. Computer Vision
7. Natural Language Processing
8. Recurrent Neural Networks
Chapter 1
1. Introduction to Deep Learning and Neural Networks
• Deep learning is specifically characterized by the use of neural networks that have multiple
layers, allowing the algorithm to learn hierarchical representations of the input data.
• Deep learning algorithms use a process called backpropagation to adjust the weights of the neural
network in order to improve its performance on a given task.
• In general, the more layers a neural network has, the more complex the patterns it can learn to
recognize in the input data.
• In deep learning, the algorithm can automatically learn relevant features from the input data,
saving time and effort.
• Flexibility:
• Deep learning algorithms are highly flexible and can be applied to a wide range of tasks,
from computer vision and speech recognition to natural language processing and
autonomous driving.
Source: https://www.smartsheet.com/sites/default/files/IC-Brain-Neuron-Structure.svg
Brain Neuron
• A typical neuron consists of three main parts:
• Cell body (or soma)
• Dendrites
• Axon
• The cell body contains the nucleus and other organelles that carry out basic cellular functions.
• The dendrites are thin, branching extensions of the cell body that receive input from other
neurons.
• The axon is a long, thin extension that transmits electrical signals away from the cell body to
other neurons.
• When a neuron is stimulated, an electrical signal called an action potential travels down the axon
and causes the release of neurotransmitters from the terminal branches of the axon.
• These neurotransmitters diffuse across the synapse and bind to receptors on the dendrites of the
receiving neuron, causing it to generate its own action potential.
• The structure and function of biological neurons have inspired the development of artificial
neural networks in machine learning.
Artificial Neuron
Source: https://miro.medium.com/v2/resize:fit:1200/1*hkYlTODpjJgo32DoCOWN5w.png
Neural Network
Source: https://www.hotelmize.com/wp-content/uploads/2021/04/How-does-a-neural-network-works.png
Chapter 2
2. Introduction to Linear Algebra
• Linear algebra is the study of vectors, matrices, determinants, and systems of linear equations,
and of their applications in fields such as physics, engineering, computer science, and
economics.
• A central concern of linear algebra is finding solutions to systems of linear equations, which are
used to represent a wide range of real-world problems.
• It has many applications in modern technology and is an essential tool for solving complex
problems in engineering and science.
Scalar
• A scalar is a single number that is used to scale a vector or a matrix.
• A scalar can be any real or complex number; a real scalar can be positive, negative, or zero.
• When a vector is multiplied by a nonzero scalar, the result is a new vector that is parallel to the
original vector, with its magnitude scaled by the absolute value of the scalar.
• If the scalar is negative, the resulting vector points in the opposite direction to the original vector.
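The scaling behavior described above can be sketched in a few lines of plain Python. The helper names `scale` and `norm` and the sample vector are illustrative assumptions, not part of the slides.

```python
import math

def scale(c, v):
    """Multiply every component of vector v by the scalar c."""
    return [c * x for x in v]

def norm(v):
    """Euclidean length (magnitude) of a vector."""
    return math.sqrt(sum(x * x for x in v))

v = [3.0, 4.0]
doubled = scale(2, v)    # same direction, twice the magnitude
flipped = scale(-1, v)   # same magnitude, opposite direction

print(doubled, norm(doubled))
print(flipped, norm(flipped))
```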
• When a scalar is multiplied with a matrix, it results in a new matrix where each element of the
original matrix is multiplied by the scalar. It is one of the fundamental operations in linear
algebra.
• Scalars are essential in linear algebra because they allow us to manipulate vectors and matrices
and perform various operations on them, such as:
• Adding
• Subtracting
• Multiplying
Vectors
• A vector is defined by its components, a set of n real or complex numbers that give
the extent of the vector along each dimension.
• For example, the following is called a column vector:
[10
 20
 30]
• We can apply scalar operations such as addition, subtraction, and multiplication to a vector; the
resulting vector has the same number of components.
• We can apply element-wise vector operations such as addition and subtraction to vectors of the
same size; the resulting vector has the same number of components as the vectors in the operation.
• In linear algebra, vectors are used to represent many types of data, such as position, velocity,
force, and acceleration, among others.
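As a minimal sketch of element-wise vector operations, assuming vectors are stored as plain Python lists (the function names `add` and `subtract` are illustrative):

```python
def add(u, v):
    """Element-wise sum of two vectors of the same size."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return [a + b for a, b in zip(u, v)]

def subtract(u, v):
    """Element-wise difference of two vectors of the same size."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return [a - b for a, b in zip(u, v)]

u = [10, 20, 30]
v = [1, 2, 3]
total = add(u, v)
difference = subtract(u, v)
print(total, difference)
```

The result always has as many components as the operands, which is why mismatched sizes are rejected here.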
Matrix
• A matrix is a rectangular array of numbers, symbols, or expressions arranged in rows and columns.
• The numbers of rows and columns of a matrix are called its dimensions. For example, a matrix with
m rows and n columns is said to be an m×n matrix.
• The elements of a matrix are typically denoted by a subscript notation, such as Aij, where i
denotes the row number and j denotes the column number.
• A 2×2 matrix, with 2 rows and 2 columns, is given below:
10 20
30 40
• A 3×3 identity matrix:
1 0 0
0 1 0
0 0 1
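• Scalar operation on a matrix:
• The resulting matrix has the same number of rows and columns, with each element changed by the scalar
• Matrix operations:
• The numbers of rows and columns of the result depend on the matrices in the operation; for
example, multiplying an m×n matrix by an n×p matrix gives an m×p matrix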
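The difference between scalar and matrix operations can be illustrated with a small sketch, assuming matrices are stored as lists of rows. The helper names `shape`, `scalar_multiply`, and `matmul` are illustrative, not from the slides.

```python
def shape(m):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(m), len(m[0]))

def scalar_multiply(c, m):
    """Multiply every element by c; the shape is unchanged."""
    return [[c * x for x in row] for row in m]

def matmul(a, b):
    """Multiply an m x n matrix by an n x p matrix, giving an m x p matrix."""
    rows_a, cols_a = shape(a)
    rows_b, cols_b = shape(b)
    if cols_a != rows_b:
        raise ValueError("inner dimensions must match")
    return [[sum(a[i][k] * b[k][j] for k in range(cols_a))
             for j in range(cols_b)]
            for i in range(rows_a)]

a = [[10, 20], [30, 40]]      # 2 x 2
b = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
scaled = scalar_multiply(2, a)
product = matmul(a, b)
print(shape(scaled), shape(product))
```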
Tensor
• A tensor is a mathematical object that generalizes the concepts of a vector and a matrix. In practice, a tensor is a
multi-dimensional array of numbers that can represent a wide range of quantities.
• Tensors are used in machine learning and computer vision to represent and manipulate data in
multi-dimensional arrays.
• The rank of a tensor is the number of indices needed to specify its components. For example, a
vector can be represented as a rank-1 tensor, while a matrix is a rank-2 tensor.
• Tensors can be added, multiplied by scalars, and multiplied with other tensors using a special
operation called tensor multiplication, which generalizes matrix multiplication.
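A rough way to see what "rank" means is to count how deeply a nested list is nested. The sketch below assumes regular, non-empty nested lists; `rank` and `tensor_shape` are illustrative helper names.

```python
def rank(t):
    """Number of indices needed to reach a single number (nesting depth)."""
    depth = 0
    while isinstance(t, list):
        depth += 1
        t = t[0]
    return depth

def tensor_shape(t):
    """Size along each axis, assuming a regular (non-ragged) nested list."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return tuple(dims)

scalar = 5
vector = [10, 20, 30]
matrix = [[10, 20], [30, 40]]
cube = [[[10, 20], [30, 40], [5, 6]],
        [[1, 2], [3, 4], [5, 6]]]   # two 3 x 2 matrices stacked together

print(rank(scalar), rank(vector), rank(matrix), rank(cube))
print(tensor_shape(cube))
```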
• A rank-3 tensor made of two 3×2 matrices:
10 20      1 2
30 40      3 4
 5  6      5 6
TensorFlow
• TensorFlow is a popular open-source deep learning framework developed by Google.
• It was first released in 2015 and has since become one of the most widely used deep learning
frameworks in the world.
• TensorFlow is designed to make it easy to build and train machine learning models, particularly
deep neural networks, by providing a flexible and scalable platform that can run on a variety of
devices, including CPUs, GPUs, and TPUs.
• TensorFlow provides a variety of pre-built functions and modules that can be easily customized
and combined to create complex neural network architectures.
• One of the key features of TensorFlow is its ability to perform distributed training, which allows
models to be trained across multiple devices and machines.
• TensorFlow also supports a variety of programming languages, including Python, C++, Java, and
more, making it accessible to developers with different programming backgrounds.
• Keras is a popular high-level API for building neural networks and runs on top of TensorFlow
and other deep learning frameworks.
Chapter 3
3. Artificial Neural Networks
What is ANN?
• An artificial neural network is composed of a large number of interconnected processing units,
known as neurons, that work together to process input data and generate output predictions.
• Each neuron receives input signals from other neurons or from the input data, performs a
calculation, and passes the result to other neurons in the network.
• The connections between neurons are modeled using weights, which are learned during the
training process.
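The calculation a single neuron performs can be sketched as follows, assuming the usual weighted-sum model (inputs multiplied by weights, summed, plus a bias). The slides do not spell out the formula, so the function name `neuron_output` and the sample numbers are illustrative assumptions.

```python
def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias (no activation applied)."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 2.0]   # these are the values adjusted during training
bias = 0.5
output = neuron_output(inputs, weights, bias)
print(output)
```

In a full network this value would be passed through an activation function and then on to the neurons of the next layer.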
• ANNs are mainly categorized as follows:
• Feedforward NN
• Feedback (recurrent) NN
Source: https://www.saedsayad.com/images/ANN_4.png
Source: https://www.thewindowsclub.com/wp-content/uploads/2017/11/Neural-Network.jpg
Activation Function
• An activation function is a mathematical function that is applied to the output of each neuron in a
neural network.
• The activation function determines whether the neuron will be "activated" or "deactivated" based
on the input it receives.
• It helps to introduce nonlinearity into the network, which is important for modeling complex
relationships in the data.
• Commonly used activation functions:
• Linear
• Sigmoid
• Tanh
• ReLU
• Softmax
Source: https://ai-artificial-intelligence.webyes.com.br/wp-content/uploads/2022/09/image-1-967x1024.png
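The activation functions listed above have short standard definitions. The following is a plain-Python sketch for single values (and a list of scores for softmax); in practice a framework such as Keras supplies these by name.

```python
import math

def linear(z):
    return z

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    # Squashes any real number into the range (-1, 1)
    return math.tanh(z)

def relu(z):
    # Passes positive values through and clips negatives to zero
    return max(0.0, z)

def softmax(scores):
    # Turns a list of scores into probabilities that sum to 1;
    # subtracting the maximum first avoids overflow in exp
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0), tanh(0.0), relu(-2.0), softmax([1.0, 2.0, 3.0]))
```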
Optimizers
• Optimizers are algorithms used in artificial neural networks during the training phase to adjust
the weights and biases of the network to minimize the loss/error.
• Some commonly used optimizers are:
• Gradient Descent:
• It works by calculating the gradient of the loss function with respect to the weights and
biases of the network, and adjusting the weights and biases in the opposite direction of
the gradient to minimize the loss.
• Stochastic Gradient Descent (SGD):
• This optimizer is a variation of Gradient Descent that uses a random sample (or small batch) of the
training data for each update. This can help speed up the training process for large
datasets.
• Adagrad:
• Adagrad is an optimizer that adapts the learning rate of each parameter based on the
historical gradients. This can help improve the convergence rate of the model.
• RMSprop:
• RMSprop is an optimizer that uses a moving average of the squared gradients to adjust
the learning rate. This can help prevent the learning rate from being too high or too low.
• Adam:
• Adam, which stands for Adaptive Moment Estimation, is a popular optimizer that uses both the
first and second moments of the gradients to adjust the learning rate during training.
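To make the update rules concrete, here is a minimal sketch of plain gradient descent and of Adam on a one-parameter toy loss f(w) = (w - 3)^2. The toy loss, the starting point, and the hyperparameter values are assumptions chosen for illustration; real training applies the same updates to every weight and bias of the network.

```python
import math

def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)**2."""
    return 2.0 * (w - 3.0)

def gradient_descent(w, lr=0.1, steps=100):
    for _ in range(steps):
        w -= lr * grad(w)            # step against the gradient
    return w

def adam(w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    m = 0.0                          # first moment: moving average of gradients
    v = 0.0                          # second moment: moving average of squared gradients
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)     # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

w_gd = gradient_descent(0.0)
w_adam = adam(0.0)
print(w_gd, w_adam)
```

Both runs should end close to w = 3, the minimum of the toy loss.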
Source: https://www.google.co.in/url?sa=i&url=https%3A%2F%2Fwww.researchgate.net%2Ffigure%2Flocal-minima-vs-global-minimum_fig2_341902041&psig=AOvVaw2WowYiTx4iaP_J3uJ3FE41&ust=1679052356647000&source=images&cd=vfe&ved=0CBAQjRxqFwoTCOCU04ar4P0CFQAAAAAdAAAAABAQ
Source: https://mpopov.com/images/adam-animated.gif
Source: https://user-images.githubusercontent.com/11681225/50016682-39742a80-000d-11e9-81da-ab0406610b9c.gif
# Reading data
import pandas as pd
ds = pd.read_csv('data/Advertisments.csv')
x = ds.iloc[:, :-1].values  # every column except the last is a feature
y = ds.iloc[:, -1].values   # the last column is the target
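from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input
# A single linear output unit over 3 input features (linear regression)
model = Sequential([
    Input(shape=(3,)),
    Dense(1, activation='linear')
])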
ANN BC Model
# importing required libraries
import pandas as pd
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input
# Reading data
bank_ds = pd.read_csv('data/bank.csv', delimiter=';')
# Encoding categorical variables
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
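# train, test split
from sklearn.model_selection import train_test_split
x = bank_ds.drop(['y'], axis=1).values
y = bank_ds['y'].values
# The split call is not shown on the slide; test_size and random_state are assumptions
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)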
# ANN Arch for Binary Classification
model_binary = Sequential()
model_binary.add(Dense(8, activation='relu'))
model_binary.add(Dense(1, activation='sigmoid')) #for binary class classification
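# Evaluating model
from sklearn.metrics import accuracy_score
y_proba = model_binary.predict(x_test)
# Convert predicted probabilities to class labels with a 0.5 threshold
y_pred = []
for proba in y_proba:
    if proba < 0.5:
        y_pred.append(0)
    else:
        y_pred.append(1)
accuracy_score(y_test, y_pred)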
# list(zip(y_proba, y_pred))
from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_pred)
ANN MC Model
from sklearn import datasets
# Loading the iris dataset (4 features, 3 classes)
iris_ds = datasets.load_iris()
x = iris_ds.data
y = iris_ds.target
# ANN Arch for Multi Class Classification
model_mc = Sequential()
model_mc.add(Dense(8, activation='relu'))
model_mc.add(Dense(4, activation='relu'))
model_mc.add(Dense(3, activation='softmax'))
IC-ANN
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import datasets
from tensorflow.keras.utils import to_categorical
mnist_ds = datasets.mnist.load_data()
(x_train, y_train), (x_test, y_test) = mnist_ds
y_train_c = to_categorical(y_train)
y_test_c = to_categorical(y_test)
model = Sequential()
model.add(Dense(250, activation='relu', input_shape=(28*28,)))
model.add(Dense(10, activation='softmax'))
4. Convolutional Neural Networks
What is CNN?
Source: https://www.alphazoneeyeclinic.com/wp-content/uploads/2021/09/sight_and_brain_pathway.png
• CNN stands for Convolutional Neural Network.
• It is commonly used for image and video analysis.
• In a CNN, the input data is processed by a series of convolutional layers, which apply a set
of learnable filters to the input in order to extract features that are important for the task at
hand.
• The output of the convolutional layers is then fed into a set of fully connected layers, which
perform the final classification.
• CNNs are widely used in various applications, such as image classification, object detection,
face recognition, and many others.
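What a convolutional filter does can be sketched in plain Python. This is a minimal example assuming a single-channel image, no padding, and stride 1; the kernel here is hand-picked to respond to vertical edges, whereas a CNN learns its kernel values during training. The function name `conv2d` is illustrative.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products
    at each position (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

# A 4 x 4 image: dark on the left half, bright on the right half
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked 2 x 2 filter that fires where brightness rises from left to right
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is non-zero only in the column where the dark-to-bright edge sits, which is the sense in which a filter "extracts a feature".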
CNN Architecture
IC-CNN
import tensorflow as tf
from tensorflow.keras import datasets
from tensorflow.keras.utils import to_categorical
(x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()
# reshaping x_train into (60000, 784) and y_train into one-hot encoding
x_train_v = x_train.reshape(x_train.shape[0], x_train.shape[1]*x_train.shape[2])
y_train_c = to_categorical(y_train, num_classes=10)
# reshaping x_test into (10000, 784) and y_test into one-hot encoding
x_test_v = x_test.reshape(x_test.shape[0], x_test.shape[1]*x_test.shape[2])
y_test_c = to_categorical(y_test, num_classes=10)
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense
model_cnn = Sequential()
model_cnn.add(Flatten())
model_cnn.add(Dense(500, activation='relu'))
model_cnn.add(Dense(250, activation='relu'))
model_cnn.add(Dense(10, activation='softmax'))
model_cnn.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
model_cnn.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
State-of-the-art CNN
• Several CNN architectures have achieved state-of-the-art results in various computer vision
tasks.
• Here are some examples:
• LeNet-5
• VGGNet
• AlexNet
• GoogLeNet
• ResNet
Link for more data: https://keras.io/api/applications/
ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
• LeNet-5
• VGGNet
• AlexNet
• GoogLeNet
Image Source: https://miro.medium.com/max/1400/1*66hY3zZTf0Lw2ItybiRxyg.png
• ResNet
Image Source: https://miro.medium.com/max/1400/1*S3TlG0XpQZSIpoDIUCQ0RQ.jpeg
5. Transfer Learning
• In transfer learning, the knowledge learned from one or more related tasks is transferred to a new
task in order to improve performance and reduce the amount of data and training time required to
train the new model.
• The idea is that the knowledge learned from a related task can help the model learn more quickly
and effectively on a new, similar task.
IC-TR-VGG16
import tensorflow as tf
import tensorflow_datasets as tfds
from keras.utils import to_categorical
train_ds = tf.image.resize(train_ds, (150, 150))
test_ds = tf.image.resize(test_ds, (150, 150))
base_model.trainable = False
train_ds = preprocess_input(train_ds)
test_ds = preprocess_input(test_ds)
flatten_layer = layers.Flatten()
dense_layer_1 = layers.Dense(50, activation='relu')
dense_layer_2 = layers.Dense(20, activation='relu')
output_layer = layers.Dense(5, activation='softmax')
model = models.Sequential([
    base_model,
    flatten_layer,
    dense_layer_1,
    dense_layer_2,
    output_layer
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
epochs = 2
batch_size = 32
6. Computer Vision
• Computer vision involves the use of various techniques and algorithms to analyze, process, and interpret
images and videos in order to extract useful information.
• Computer vision has a wide range of applications in various industries, such as healthcare,
entertainment, automotive, security, and more.
• Some examples include facial recognition, object detection, autonomous vehicles, medical
imaging, and augmented reality.
• Object Detection:
• This task involves identifying and localizing objects within an image. Object detection is
used in various applications such as self-driving cars, and video surveillance.
• Image Segmentation:
• This task involves dividing an image into different segments or regions based on various
attributes such as color, texture, or shape.
OpenCV
• OpenCV is an open-source computer vision library that provides a wide range of tools and
algorithms for image and video processing.
• It was originally developed by Intel in 1999 and has since been maintained by a community
of developers.
• OpenCV offers various features and functions for image and video processing, including
image and video capture, image filtering, feature detection, object tracking, and machine
learning.
• It is written in C++, but also provides interfaces for Python, Java, and other programming
languages.
• Installing OpenCV
• pip install opencv-python
• Importing OpenCV
• import cv2
• print(cv2.__version__)
• cv2.imshow():
• It is used to display an image in a window. It takes the image data as input and displays
it on the screen.
• cv2.cvtColor():
• It is used to convert an image from one color space to another. It takes the image data
and the desired color space as input and returns the converted image.
• cv2.rectangle():
• It is used to draw a rectangle on an image. It takes the image data, the position of the
top-left corner, the position of the bottom-right corner, and the color and thickness of the
rectangle as input.
• cv2.waitKey():
• This function waits up to a specified delay for a keyboard event. It takes an
integer delay in milliseconds as input (a delay of 0 waits indefinitely) and returns the key
code of the pressed key, or -1 if no key was pressed.
import cv2
# Reading an image
img = cv2.imread('image.jpg')
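import cv2
# Opening the video stream is not shown on the slide; device index 0 (default camera) is an assumption
vid = cv2.VideoCapture(0)
while True:
    # Reading a frame from the video stream
    ret, frame = vid.read()
    if not ret:
        break  # stop when no frame could be read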
7. Natural Language Processing
• NLP tasks:
• Language understanding
• Language generation
• Language translation
• Sentiment analysis
• NLP models typically involve some form of machine learning, such as supervised or
unsupervised learning, to train the models on large datasets of text data.
• These models can be based on various techniques such as rule-based systems, statistical models,
or deep learning models such as recurrent neural networks (RNNs) or transformers.
• As the demand for natural language processing applications continues to grow, there is a need for
continued research and development to improve the accuracy and applicability of NLP models.
NLP Terminology
• Token
• Tokenization
• Stemming
• Lemmatization
• Part-of-speech (POS) tagging
• Named entity recognition (NER)
• Sentiment analysis
• Language modeling
• Machine translation
• Information retrieval
• Text classification
• Token:
• A token is an individual unit of text that has been separated or segmented from a larger body
of text.
• Tokenization:
• The process of splitting a text into individual units or tokens, such as words or subwords, for
further analysis.
• Stemming:
• The process of reducing words to their base or root form, typically by stripping suffixes, such as
converting "walking" to "walk".
• Lemmatization:
• A process of reducing words to their base or dictionary form, such as converting "am", "are",
and "is" to "be".
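The three steps defined above can be sketched with the standard library alone. This is a deliberately crude illustration: the regular expression, the suffix list in `crude_stem`, and the tiny lookup table in `toy_lemmatize` are assumptions made for the example, and libraries such as NLTK provide far more complete versions.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def crude_stem(token):
    """Strip a few common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[:-len(suffix)]
    return token

# Toy dictionary lookup; a real lemmatizer uses a full vocabulary
LEMMAS = {"am": "be", "are": "be", "is": "be"}

def toy_lemmatize(token):
    return LEMMAS.get(token, token)

tokens = tokenize("She was walking and he walked.")
stems = [crude_stem(t) for t in tokens]
lemmas = [toy_lemmatize(t) for t in tokenize("I am, you are, it is")]
print(tokens)
print(stems)
print(lemmas)
```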
• Sentiment analysis:
• A process of determining the sentiment or emotional tone of a text, such as whether a movie
review is positive or negative.
• Language modeling:
• A process of predicting the probability of a sequence of words in a language, which is used
for tasks such as speech recognition and machine translation.
• Machine translation:
• A process of automatically translating text from one language to another.
• Information retrieval:
• A process of retrieving relevant information from a large collection of documents or text,
such as using a search engine to find web pages.
• Text classification:
• A process of categorizing text into predefined categories, such as classifying news articles
into topics like sports, politics, and entertainment.
Vectorization
• Vectorization is the process of converting text data into numerical vectors, which can be used as
input for machine learning algorithms.
• Bag-of-words
• Word embeddings
• Character embeddings
• Bag-of-words:
• The bag-of-words model represents text as a collection of words and their frequency of
occurrence in a document.
• Each word is treated as a separate feature, and the resulting vector represents the frequency of
each word in the document.
• A common refinement is TF-IDF weighting, calculated as the product of the term frequency (how
often a word appears in a document) and the inverse document frequency (which is lower for words
that appear in many documents in the corpus).
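A minimal sketch of bag-of-words counts and TF-IDF weights follows, using a three-document toy corpus invented for the example. It assumes the plain, unsmoothed form idf = log(N / df); libraries such as scikit-learn use smoothed variants, so their numbers will differ.

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]
tokenized = [d.split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})

def bag_of_words(tokens):
    """Count of each vocabulary word in one document."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]

def idf(word):
    """Lower for words that appear in many documents."""
    df = sum(1 for doc in tokenized if word in doc)
    return math.log(len(tokenized) / df)

def tf_idf(tokens):
    """Term frequency (count / document length) times inverse document frequency."""
    counts = Counter(tokens)
    return [counts[w] / len(tokens) * idf(w) for w in vocab]

bow = bag_of_words(tokenized[0])
weights = tf_idf(tokenized[0])
print(vocab)
print(bow)
print([round(x, 3) for x in weights])
```

In the first document "the" occurs twice and "cat" once, yet "cat" receives the larger TF-IDF weight because "the" also appears in another document.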
• Word embeddings:
• Word embeddings are a type of dense vector representation of words, where each word is
represented by a vector of fixed dimensionality.
• Word embeddings capture the semantic meaning of words and their relationships with other
words in the vocabulary.
• Character embeddings:
• Character embeddings are similar to word embeddings, but they represent each character in a
word as a vector.
• Character embeddings are useful for capturing morphological information, such as prefixes
and suffixes, that can help with tasks such as named entity recognition.
NLP Packages
• Natural Language Toolkit (NLTK):
• NLTK is a comprehensive library for NLP in Python. It provides modules for tokenization,
stemming, lemmatization, POS tagging, and more.
• Gensim:
• Gensim is a library for topic modeling and similarity analysis of text.
• It provides modules for text processing, document similarity analysis, and topic modeling.
8. RNN
What is RNN?
• RNN stands for Recurrent Neural Network
• It is a type of neural network architecture designed for processing sequential data, such as speech
and natural language text.
• RNNs can process sequences of variable lengths and maintain a memory of the past inputs they
have processed.
• The key idea behind RNNs is the use of hidden states, which are updated at each time step as the
network processes the input sequence.
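The hidden-state update can be sketched with a one-unit recurrent cell, assuming the common form h_t = tanh(w_x * x_t + w_h * h_(t-1) + b). The scalar weights and sample inputs are made up for illustration; a real RNN uses weight matrices learned during training.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One time step: combine the current input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Process a sequence of any length, returning the hidden state after each step."""
    h = 0.0                      # initial hidden state
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

states_short = run_rnn([1.0, 0.0])
states_long = run_rnn([1.0, 0.0, 0.0, 0.0])
print(states_short)
print(states_long)
```

The second state is non-zero even though the second input is zero: the hidden state carries a memory of the earlier input, and the same function handles both sequence lengths.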
RNN
Source: https://www.ibm.com/content/dam/connectedassets-adobe-cms/worldwide-content/cdp/cf/ul/g/27/80/what-are-recurrent-neural-networks-combined.component.simple-narrative-xl.ts=1671203207332.jpg/content/adobe-cms/us/en/topics/recurrent-neural-networks/jcr:content/root/table_of_contents/intro/simple_narrative/image
RNN Types
Source: https://i.stack.imgur.com/6VAOt.jpg
Limitations of RNN
• One limitation of RNNs is the vanishing gradient problem, which makes it difficult for the
network to propagate gradients over long sequences.
• Variants such as LSTM and GRU address this by using different mechanisms for updating the
hidden state and maintaining memory over long sequences.
Source: https://user-images.githubusercontent.com/15166794/39033683-3020ce04-44ae-11e8-821f-1a9652ff5025.png
SA-LSTM/GRU
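from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
# The loading step is not shown on the slide; num_words=10000 is assumed to match the Embedding layer
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=10000)
word_index = imdb.get_word_index()
MAX_LEN = 256
# Pad or truncate every review to MAX_LEN tokens
x_train = pad_sequences(x_train, maxlen=MAX_LEN, value=word_index['the'],
                        padding='post')
x_test = pad_sequences(x_test, maxlen=MAX_LEN, value=word_index['the'],
                       padding='post')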
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, GRU, Dense
lstm_model = Sequential()
lstm_model.add(Embedding(10000, 128, input_shape=(MAX_LEN, )))
lstm_model.add(LSTM(64, activation='tanh'))
lstm_model.add(Dense(1, activation='sigmoid'))
gru_model = Sequential()
gru_model.add(Embedding(10000, 128, input_shape=(MAX_LEN, )))
gru_model.add(GRU(64, activation='tanh'))
gru_model.add(Dense(1, activation='sigmoid'))