Deep Learning
❖ In the case of a deep learning model, a separate feature extraction step is
unnecessary. The model learns the unique characteristics of, say, a car on its
own and makes correct predictions without human intervention.
TensorFlow
TensorFlow is an open-source machine-learning platform and symbolic math
library, widely used to build and train machine-learning and deep-learning applications.
Keras
Keras is an open-source neural-network library that runs on top of TensorFlow
(and previously Theano). It is designed to make building and experimenting with
deep-learning models fast and easy, and it can be used to construct virtually
any deep-learning architecture we want.
TensorFlow vs. Keras
1. TensorFlow is written in C++, CUDA and Python; Keras is written in Python.
2. TensorFlow is a framework that offers both high- and low-level APIs; Keras is a high-level API.
Processing Power
Deep learning can require significant processing power. Complex models
trained on big-data datasets can take hours, days or even longer to train. The
models we present in this chapter can be trained in minutes to just under
an hour on computers with conventional CPUs, so you'll need only a
reasonably current personal computer. We'll also discuss special
high-performance hardware, the GPUs (Graphics Processing Units) developed
by NVIDIA and the TPUs (Tensor Processing Units) developed by Google, created
to meet the extraordinary processing demands of leading-edge
deep-learning applications.
Keras Built-In Datasets
Here are some of Keras's datasets (from the module
tensorflow.keras.datasets) for practicing deep learning.
❖ MNIST database of handwritten digits
Used for classifying handwritten digit images, this dataset contains
28-by-28 grayscale digit images labeled as 0 through 9 with 60,000 images
for training and 10,000 for testing. We use this dataset in Section 16.6,
where we study convolutional neural networks.
❖ Fashion-MNIST database of fashion articles
Used for classifying clothing images, this dataset contains 28-by-28
grayscale images of clothing labeled in 10 categories with 60,000 images for
training and 10,000 for testing.
❖ IMDb Movie reviews
Used for sentiment analysis, this dataset contains
reviews labeled as positive (1) or negative (0) sentiment with 25,000
reviews for training and 25,000 for testing.
❖ CIFAR10 small-image classification
Used for small-image classification, this dataset contains 32-by-32 color
images labeled in 10 categories with 50,000 images for training and 10,000
for testing.
❖ CIFAR100 small-image classification
Also used for small-image classification, this dataset contains 32-by-32
color images labeled in 100 categories with 50,000 images for training and
10,000 for testing.
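Each of these datasets can be loaded with a single call; for example, here is a minimal sketch using the Fashion-MNIST loader (the other datasets in tensorflow.keras.datasets expose the same load_data() interface):

from tensorflow.keras.datasets import fashion_mnist

# Returns the training and testing splits as NumPy arrays
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()
print(X_train.shape)   # (60000, 28, 28)
print(X_test.shape)    # (10000, 28, 28)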
Neural Network
A neural network is a computational model inspired by the
human brain, composed of layers of interconnected nodes
(neurons) that learn patterns from data.
Just as the human brain has neurons interconnected with one another,
artificial neural networks have interconnected neurons, known as nodes,
arranged in the various layers of the network.
The first figure illustrates a typical biological neural network; the second
shows a typical artificial neural network. Dendrites in the biological neural
network correspond to inputs in the artificial neural network, the cell nucleus
corresponds to nodes, synapses correspond to weights, and the axon corresponds
to the output:
Dendrites → Inputs
Cell nucleus → Nodes
Synapse → Weights
Axon → Output
Input Layer:
The input layer accepts the input features and passes them on to the rest of
the network; no computation is performed here.
Hidden Layer:
The hidden layers lie between the input and output layers. They perform the
calculations that extract hidden features and patterns from the data.
Output Layer:
The inputs go through a series of transformations in the hidden layers, and
the final result is conveyed through the output layer.
The artificial neural network takes the inputs, computes the weighted sum of
the inputs and adds a bias. This computation is represented in the form of
a transfer function.
1. Basic Components
Neuron (Node): Performs a computation on inputs.
Formula:
z = ∑(wi * xi) + b
a = Activation(z)
For two inputs, z = w1*x1 + w2*x2 + b, which has the form A*x1 + B*x2 + C,
the equation of a line.
Layers:
Input Layer: Takes input features.
Hidden Layers: Perform computations.
Output Layer: Produces final prediction
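To make the neuron formula above concrete, here is a minimal NumPy sketch of a single neuron with two inputs (the weights, bias and ReLU activation are illustrative choices, not values from the notes):

import numpy as np

def neuron(x, w, b):
    # Weighted sum of the inputs plus the bias
    z = np.dot(w, x) + b
    # ReLU activation (illustrative choice)
    return np.maximum(0, z)

x = np.array([0.5, -1.0])   # inputs x1, x2
w = np.array([0.8, 0.2])    # weights w1, w2
b = 0.1                     # bias
print(neuron(x, w, b))      # 0.3 = max(0, 0.8*0.5 + 0.2*(-1.0) + 0.1)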
2. Forward Propagation
The process of passing data through the network to make a
prediction.
3. Loss Function
Measures how far the prediction is from the actual value.
Examples:
MSE (Mean Squared Error) for regression.
Cross-Entropy for classification.
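As a quick illustration (the values here are chosen for demonstration, not taken from the notes), both losses can be computed in a few lines of NumPy:

import numpy as np

# Mean Squared Error for a regression example
y_true = np.array([3.0, 5.0, 2.0])
y_pred = np.array([2.5, 5.0, 3.0])
mse = np.mean((y_true - y_pred) ** 2)              # 0.4167

# Cross-entropy for a single classification example
p_true = np.array([0.0, 1.0, 0.0])                 # one-hot label
p_pred = np.array([0.2, 0.7, 0.1])                 # predicted probabilities
cross_entropy = -np.sum(p_true * np.log(p_pred))   # 0.3567

print(mse, cross_entropy)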
4. Backpropagation
The process of propagating the loss backward through the
network to compute the gradient of the loss with respect to each weight.
5. Optimization Algorithm
Gradient Descent: Updates weights to minimize loss.
Variants include: SGD, Adam, RMSProp.
6. Training Process
1. Initialize weights randomly.
2. Forward pass.
3. Compute loss.
4. Backward pass.
5. Update weights.
6. Repeat for many epochs.
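A minimal sketch of this training loop using plain gradient descent on a one-parameter linear model (the toy data, learning rate and epoch count are illustrative assumptions):

import numpy as np

# Toy data: y = 2x
X = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

w, b = np.random.randn(), 0.0           # 1. initialize weights randomly
lr = 0.01                               # learning rate

for epoch in range(200):                # 6. repeat for many epochs
    y_pred = w * X + b                  # 2. forward pass
    loss = np.mean((y_pred - y) ** 2)   # 3. compute loss (MSE)
    # 4. backward pass: gradients of the loss with respect to w and b
    dw = np.mean(2 * (y_pred - y) * X)
    db = np.mean(2 * (y_pred - y))
    # 5. update weights (gradient descent)
    w -= lr * dw
    b -= lr * db

print(w, b)   # w approaches 2, b approaches 0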
7. Popular Architectures
- Feedforward Neural Network (FNN)
- Convolutional Neural Network (CNN) for images
- Recurrent Neural Network (RNN) for sequences
- Transformer for NLP and more
Tensor
A tensor is the basic data structure used to hold and manipulate
data in deep learning and machine learning.
● 0D → A single number (scalar)
● 1D → A line of numbers (vector)
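A short NumPy sketch of tensors of increasing rank (the shapes are chosen purely for illustration):

import numpy as np

scalar = np.array(5)                    # 0D tensor: a single point
vector = np.array([1, 2, 3])            # 1D tensor: a line of numbers
matrix = np.array([[1, 2], [3, 4]])     # 2D tensor: rows and columns
stack  = np.zeros((3, 28, 28))          # 3D tensor: e.g. a stack of images

print(scalar.ndim, vector.ndim, matrix.ndim, stack.ndim)   # 0 1 2 3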
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data    # Features (4 features: sepal length, sepal width, petal length, petal width)
y = iris.target  # Labels (0, 1, 2)

# Check data
print("Features shape:", X.shape)
print("Labels shape:", y.shape)
Output:
Features shape: (150, 4)
Labels shape: (150,)
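The model used for the prediction below is not shown in these notes; here is a minimal sketch of a Keras classifier for this data, assuming a small dense network trained with sparse categorical cross-entropy (the layer sizes, optimizer and epoch count are illustrative choices):

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(10, activation='relu', input_shape=(4,)),   # 4 input features
    Dense(3, activation='softmax')                    # 3 iris classes
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(X, y, epochs=50, verbose=0)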
predictions = model.predict(np.array([[5.1, 3.5, 1.4, 0.2]]))
print(predictions)
Output:
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step
[[0.59903073 0.37916926 0.0218 ]]
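The three values are the predicted probabilities for the three iris classes; np.argmax(predictions) is 0 here, so the model predicts class 0 (setosa) for this flower.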
1. Edge-Detection Filters (Sobel)
● Horizontal Sobel:
[-1 0 1]
[-2 0 2]
[-1 0 1]
● Vertical Sobel:
[-1 -2 -1]
[ 0 0 0]
[ 1 2 1]
2. Sharpening Filter
[ 0 -1 0]
[-1 5 -1]
[ 0 -1 0]
4. Learned Filters
Filter values are not hand-designed but are learned automatically during
training; this is what the convolutional layers of a CNN do.
5. Dilated Filters
Spread out across the input with gaps (dilations), useful for
capturing wider context without losing resolution (common in
segmentation tasks).
7. Separable Filters
Filters that can be factored into two smaller one-dimensional filters (for
example, a column filter followed by a row filter), which reduces computation.
Output size formula:
Output height/width = ⌊(H + 2P − f) / S⌋ + 1
where H is the input height/width, P is the padding, f is the filter size and
S is the stride. For example, a 4×4 input with a 2×2 filter, no padding and
stride 1 gives ⌊(4 + 0 − 2)/1⌋ + 1 = 3, i.e. a 3×3 output, as in the worked
example below.
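The same calculation in Python, using the values from the worked example that follows:

H, P, f, S = 4, 0, 2, 1
out = (H + 2 * P - f) // S + 1
print(out)   # 3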
Steps (convolution → ReLU → max pooling on a 4×4 input):
Filter/kernel (2×2):
[[1, 0],
[0, -1]]
Step-by-step:
conv = ∑(element-wise product)
Example at the top-left (first 2×2 region):
Input slice:
[[1, 2],
[3, 1]]
Filter:
[[1, 0],
[0, -1]]
conv = 1·1 + 2·0 + 3·0 + 1·(−1) = 0, which is the top-left entry of the
convolution output.
2. Apply ReLU
3. Apply Max Pooling
Output:
1. Convolution Output:
[[ 0.  0. -2.]
 [ 3.  0. -1.]
 [ 0. -2.  0.]]
Program:
import numpy as np

# The original 4x4 input image is not reproduced in these notes.
# This illustrative input matches the top-left 2x2 slice [[1, 2], [3, 1]]
# from the worked example above and reproduces the outputs shown below.
input_img = np.array([
    [1, 2, 0, 0],
    [3, 1, 2, 2],
    [0, 0, 1, 3],
    [0, 0, 2, 1]
])

kernel = np.array([
    [1, 0],
    [0, -1]
])

# Convolution
def convolve2d(image, kernel, stride=1):
    k = kernel.shape[0]
    output_dim = (image.shape[0] - k) // stride + 1
    output = np.zeros((output_dim, output_dim))
    for i in range(output_dim):
        for j in range(output_dim):
            region = image[i*stride:i*stride+k, j*stride:j*stride+k]
            output[i, j] = np.sum(region * kernel)
    return output

# ReLU
def relu(x):
    return np.maximum(0, x)

# Max Pooling
def max_pooling(image, pool_size=2, stride=1):
    output_dim = (image.shape[0] - pool_size) // stride + 1
    output = np.zeros((output_dim, output_dim))
    for i in range(output_dim):
        for j in range(output_dim):
            region = image[i*stride:i*stride+pool_size, j*stride:j*stride+pool_size]
            output[i, j] = np.max(region)
    return output

# Run pipeline
conv_output = convolve2d(input_img, kernel)
relu_output = relu(conv_output)
pool_output = max_pooling(relu_output)

# Print results
print("Convolution Output:\n", conv_output)
print("ReLU Output:\n", relu_output)
print("Max Pooling Output:\n", pool_output)
Output:
Convolution Output:
[[ 0. 0. -2.]
[ 3. 0. -1.]
[ 0. -2. 0.]]
ReLU Output:
[[0. 0. 0.]
[3. 0. 0.]
[0. 0. 0.]]
Max Pooling Output:
[[3. 0.]
[3. 0.]]
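The notes next turn to the MNIST handwritten-digit dataset. The code that loads it and prints the shapes below is not shown; here is a minimal sketch using the standard Keras loader:

from tensorflow.keras.datasets import mnist

# Load the training and testing splits as NumPy arrays
(X_train, y_train), (X_test, y_test) = mnist.load_data()

print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)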
Output:
(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)
Visualizing Digits
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(font_scale=2)
import numpy as np
index = np.random.choice(np.arange(len(X_train)), 24, replace=False)
figure, axes = plt.subplots(nrows=4, ncols=6, figsize=(16, 9))
for item in zip(axes.ravel(), X_train[index], y_train[index]):
    axes, image, target = item
    axes.imshow(image, cmap=plt.cm.gray_r)
    axes.set_xticks([])  # remove x-axis tick marks
    axes.set_yticks([])  # remove y-axis tick marks
    axes.set_title(target)
plt.tight_layout()
Output: a 4-by-6 grid of randomly selected training digits, each titled with its label.
Data Preprocessing
# Reshape to (num_samples, height, width, channels)
X_train = X_train.reshape(60000, 28, 28, 1)
X_test = X_test.reshape(10000, 28, 28, 1)
print(X_train.shape, X_test.shape)
Output:
(60000, 28, 28, 1) (10000, 28, 28, 1)
# Normalize the image data to the range [0, 1]
X_train = X_train.astype('float32') / 255
X_test = X_test.astype('float32') / 255
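Build the CNN Model
The first part of the model definition is missing from these notes; the initial layers below are reconstructed from the model summary output that follows (its output shapes and parameter counts imply two Conv2D/MaxPooling2D stages with 32 and 64 3-by-3 filters on the 28-by-28-by-1 inputs):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),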
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')  # 10 classes for digits 0-9
])
model.summary()
Output:
Model: "sequential"
Layer (type)                      Output Shape            Param #
conv2d (Conv2D)                   (None, 26, 26, 32)      320
max_pooling2d (MaxPooling2D)      (None, 13, 13, 32)      0
conv2d_1 (Conv2D)                 (None, 11, 11, 64)      18,496
max_pooling2d_1 (MaxPooling2D)    (None, 5, 5, 64)        0
flatten (Flatten)                 (None, 1600)            0
dense (Dense)                     (None, 128)             204,928
dropout (Dropout)                 (None, 128)             0
dense_1 (Dense)                   (None, 10)              1,290
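Compile and Train the Model
The compile and fit calls are not shown in the notes; here is a sketch consistent with the training log below (5 epochs; 844 batches per epoch suggests batch_size=64 with validation_split=0.1), assuming the Adam optimizer and sparse categorical cross-entropy:

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=64, validation_split=0.1)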
Output:
Epoch 1/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 41s 47ms/step - accuracy: 0.8279 - loss: 0.5348 - val_accuracy: 0.9837 - val_loss: 0.0565
Epoch 2/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 39s 46ms/step - accuracy: 0.9717 - loss: 0.0967 - val_accuracy: 0.9888 - val_loss: 0.0415
Epoch 3/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 41s 46ms/step - accuracy: 0.9808 - loss: 0.0645 - val_accuracy: 0.9893 - val_loss: 0.0332
Epoch 4/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 39s 46ms/step - accuracy: 0.9847 - loss: 0.0521 - val_accuracy: 0.9910 - val_loss: 0.0319
Epoch 5/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 41s 46ms/step - accuracy: 0.9873 - loss: 0.0428 - val_accuracy: 0.9917 - val_loss: 0.0327
<keras.src.callbacks.history.History at 0x78845d371910>
Evaluate the Model
import time
t1 = time.time()
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")
t2 = time.time()
print(f"Total time taken: {t2 - t1:.2f}")
Output:
313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.9857 - loss: 0.0390
Test accuracy: 0.9900
Total time taken: 2.63
Make Predictions
import tensorflow as tf

predictions = model.predict(X_test)
print(f"Prediction for first test image: {tf.argmax(predictions[0]).numpy()}")
Output:
313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step
Prediction for first test image: 7
Dimensionality Reduction
Dimensionality reduction is the process of reducing the
number of features (dimensions) in a dataset while
preserving as much important information as possible.
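A minimal sketch of dimensionality reduction with PCA on the Iris data used earlier (reducing its 4 features to 2 components is an illustrative choice):

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
X = iris.data                            # 150 samples, 4 features

# Project the 4-dimensional data onto its 2 principal components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                   # (150, 2)
print(pca.explained_variance_ratio_)     # variance retained by each component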