Day 2

Machine Learning And Deep
Learning: An Overview
Date : 11.07.2023
Samit Biswas
Assistant Professor
Department Of Computer Science and Technology
IIEST, Shibpur
Lecture Topics
• What is Machine Learning?

• What are the types of Machine learning algorithms?
• Applications
• Neural Networks and Deep Learning
• ML and DL Libraries in Python
• Case Studies
Pioneer of AI
Source : https://www.forbes.com/sites/gilpress/2021/05/28/on-thinking-machines-machine-learning-and-how-ai-took-over-statistics/?sh=397aa8792513
Source : https://towardsdatascience.com/the-inception-of-machine-learning-90b9fc3737ff
Arthur L Samuel. “Some studies in machine learning using the game of checkers”.
In: IBM Journal of research and development 3.3 (1959), pp. 210–229.
Traditional Programming
Input + Program -> Computer -> Output
Machine Learning
Input + Output -> Computer -> Program
AI, Machine Learning and Deep Learning
• AI : Intelligence created by rules
• Machine Learning : Rules learned from data without being explicitly programmed
• Deep Learning : Learn data representations from abstraction
Applications
• Email Detection (Spam or Not)
• Face Detection / Matching
• Prediction in Stock Markets
• Weather Predictions
• Recommender System
• Improved Healthcare System
Types of ML Algorithms
Supervised Learning
• Data is labelled
• Learn from previous feedback
• Predict future outcome
Unsupervised Learning
• Unlabelled data
• No previous feedback
• Analyze hidden patterns within data
Reinforcement Learning
• Decision making
• Maximize reward
• Learn a series of actions
Source : Raschka and Mirjalili (2019). Python Machine Learning, 3rd Edition
Supervised Learning
Labelled data “supervises” the machines to predict correct output
[Figure: Input and Labels (Circle, Rectangle, Triangle) -> Model -> Circle]
Supervised Learning : Types
• Classification
• Regression
Classification Vs Regression
Classification : Image -> Classification Model -> Good or Bad?
Regression : House Area (sq.ft) -> Regression Model -> Price Prediction
Classification
Binary Classification with two features (f1 & f2)
Predict Class Label (L)
[Figure: data points plotted on axes f1 and f2, separated by a linear Decision Boundary]
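A linear decision boundary over two features can be sketched with a perceptron. This is only an illustration: the slides do not name a specific classifier, and the data points, the function names and the learning rate below are assumptions made for the example.

```python
# Perceptron sketch: learns a line w1*f1 + w2*f2 + b = 0 that separates two classes.
# The perceptron and the toy data are assumptions, not taken from the slides.

def train_perceptron(points, labels, epochs=20, lr=1.0):
    """Return (w1, w2, b) after running the perceptron update rule."""
    w1, w2, b = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (f1, f2), label in zip(points, labels):
            predicted = 1 if w1 * f1 + w2 * f2 + b > 0 else 0
            error = label - predicted  # 0 when the point is already classified correctly
            w1 += lr * error * f1
            w2 += lr * error * f2
            b += lr * error
    return w1, w2, b


def predict(weights, point):
    """Predict class label L (0 or 1) from the side of the boundary the point lies on."""
    w1, w2, b = weights
    f1, f2 = point
    return 1 if w1 * f1 + w2 * f2 + b > 0 else 0


# Hypothetical, linearly separable data with two features (f1, f2).
points = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5), (5.5, 6.5)]
labels = [0, 0, 0, 1, 1, 1]

weights = train_perceptron(points, labels)
predictions = [predict(weights, p) for p in points]
print(weights, predictions)
```

The perceptron converges only when the classes are linearly separable, which is why the toy data is chosen that way.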
Regression
[Figure: regression plot of Target T (Output) against f (Input Features / Observations)]
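Regression predicts a continuous target T from an input feature f. A minimal sketch, assuming a single feature and an ordinary least-squares line fit (the slides do not specify the model), using made-up area and price values:

```python
# Least-squares fit of a straight line T = slope * f + intercept.
# The model choice and the data are assumptions for illustration only.

def fit_line(features, targets):
    """Return (slope, intercept) minimising the squared error."""
    n = len(features)
    mean_f = sum(features) / n
    mean_t = sum(targets) / n
    covariance = sum((f - mean_f) * (t - mean_t) for f, t in zip(features, targets))
    variance = sum((f - mean_f) ** 2 for f in features)
    slope = covariance / variance
    intercept = mean_t - slope * mean_f
    return slope, intercept


# Hypothetical data: house area (sq.ft) and price in arbitrary units.
areas = [500.0, 750.0, 1000.0, 1250.0, 1500.0]
prices = [25.0, 37.5, 50.0, 62.5, 75.0]

slope, intercept = fit_line(areas, prices)
predicted_price = slope * 1100.0 + intercept  # prediction for an unseen area
print(slope, intercept, predicted_price)
```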
Unsupervised Learning : Types
• Clustering
• Dimensionality Reduction
Clustering
Find the hidden structure and relationships among unlabelled data
Group similar data points into the same cluster or partition
APPLICATION : To improve Marketing Strategy
Input Data : Ages and Purchase History of customers
Goal : Group customers based on spending
K-Means Clustering
1. Choose the number of clusters K (here 3)
2. Initialize the K centroids
3. Assign each data point to the nearest cluster using the Euclidean distance metric, i.e. choose the cluster whose centroid is at minimum distance from the data point.
4. Recompute each centroid as the average of all data points assigned to that cluster.
5. Repeat Steps 3 & 4 until the assignment of data points to clusters no longer changes.
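The five steps above can be sketched in plain Python. The customer data (age, spending) and the choice of initial centroids are assumptions for illustration; in practice the centroids are usually initialised randomly.

```python
import math

def kmeans(points, initial_centroids, max_iters=100):
    """Run K-Means; K is the number of initial centroids (Steps 1 and 2)."""
    centroids = list(initial_centroids)
    assignments = None
    for _ in range(max_iters):
        # Step 3: assign each point to the centroid at minimum Euclidean distance.
        new_assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 5: stop once the assignments no longer change.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Step 4: recompute each centroid as the mean of its assigned points.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignments, centroids


# Hypothetical customers: (age, monthly spending).
customers = [(20, 10), (22, 12), (21, 11),
             (40, 50), (42, 52), (41, 49),
             (60, 90), (62, 92), (61, 91)]

# K = 3; one point from each visible group is picked as an initial centroid.
assignments, centroids = kmeans(customers, [customers[0], customers[3], customers[6]])
print(assignments)
print(centroids)
```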
Dimensionality Reduction
1. Removes the least important variables and noise in the data
2. Reduces model complexity and helps prevent overfitting
Source : https://towardsdatascience.com/dimensionality-reduction-cheatsheet-15060fee3aa
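As one illustration of dimensionality reduction, the sketch below projects 2-D points onto their direction of largest variance, which is what Principal Component Analysis (PCA) does in two dimensions. PCA is chosen here as an assumption, since the slide does not name a specific technique, and the data is made up.

```python
import math

def pca_2d_to_1d(points):
    """Project 2-D points onto the direction of largest variance (first principal axis)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 covariance matrix.
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n

    # Angle of the principal axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))

    # One coordinate per point instead of two.
    projected = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, projected


# Hypothetical data lying close to the line y = x.
data = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]
direction, projected = pca_2d_to_1d(data)
print(direction, projected)
```

Each point is now described by one number instead of two, with most of the variation in the data preserved.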
Supervised vs Unsupervised Learning
             Supervised Learning    Unsupervised Learning
Discrete     Classification         Clustering
Continuous   Regression             Dimensionality Reduction
Machine Learning (Structured Data)
Source : Stevens et al., Deep Learning with PyTorch. Manning, 2020

Deep Learning vs Machine Learning
Source : Stevens et al., Deep Learning with PyTorch. Manning, 2020

Classification Reminder (Deep Learning)
Input (I) = [image] -> Deep Learning Model -> Output : Label (L) = Flower
Mathematical Mapping
Training set : $\mathcal{T} = \{\langle x^{[i]}, y^{[i]} \rangle,\ i = 1, 2, 3, \ldots, n\}$
Function : $f(x) = O$
Data Representation
Classification : $O : \mathbb{R}^m \rightarrow \mathbb{R}^L$, where $L$ = no. of classes
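The mapping from an m-dimensional input to L class outputs can be made concrete with a single linear layer followed by a softmax. This is a sketch under assumptions: the slide only states the mapping, so the layer type, the softmax, and the weights below are hypothetical.

```python
import math

def softmax(scores):
    """Turn L raw scores into L probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(x, weights, biases):
    """Map an m-dimensional input x to L class probabilities."""
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)


# Hypothetical example with m = 4 input features and L = 3 classes.
x = [0.2, 0.7, 0.1, 0.5]
weights = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0]]
biases = [0.0, 0.0, 0.0]

probabilities = classify(x, weights, biases)
predicted_class = probabilities.index(max(probabilities))
print(probabilities, predicted_class)
```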
Convolutional Neural Network (CNN)
Not invented overnight!
Source : CS231N_2018
Concept Applied First
Source : Illustration of LeCun et al. 1998

Basic Operations on CNN
1. Convolution
2. ReLU
3. Pooling/Downsampling
4. Unpooling/ Upsampling
Convolution
1. Feature Extractor -> Image to Multi-dimensional Feature representation
2. Let I denote the input size at the convolutional layer, k the kernel size, and s the stride, i.e. the number of pixels the kernel slides at each step. The output size o obtained after applying the convolution (without padding) is
$$o = \frac{I - k}{s} + 1$$
Source : CS231N_2018
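The output-size formula can be checked against a small convolution written in plain Python. This is a minimal sketch assuming a square single-channel image, a square kernel and no padding; the function names and the example values are illustrative.

```python
def conv_output_size(input_size, kernel_size, stride):
    """o = (I - k) / s + 1, for a convolution without padding."""
    return (input_size - kernel_size) // stride + 1


def conv2d(image, kernel, stride=1):
    """Slide a square kernel over a square image (cross-correlation, as in most DL libraries)."""
    k = len(kernel)
    out = conv_output_size(len(image), k, stride)
    result = []
    for row in range(out):
        result_row = []
        for col in range(out):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    total += image[row * stride + i][col * stride + j] * kernel[i][j]
            result_row.append(total)
        result.append(result_row)
    return result


# Hypothetical 5x5 image whose pixel value is its flat index, and a 3x3 kernel
# that simply copies the centre pixel of each window.
image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
kernel = [[0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0]]

features = conv2d(image, kernel, stride=1)
print(conv_output_size(5, 3, 1), features)
```

With I = 5, k = 3 and s = 1 the formula gives o = 3, which matches the 3x3 feature map produced by the loop.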
Pooling/Downsampling
1. Reduces the dimension and introduces invariance to small translations in the input image.
2. Let c denote the size of the output from the convolutional layer, k the kernel size, and s the stride, i.e. the number of pixels the kernel slides at each step. The output size o obtained after applying max-pooling is
$$o = \frac{c - k}{s} + 1$$
Source : CS231N_2018
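Max-pooling can be sketched in a few lines. The example assumes a square feature map and uses a 2x2 window with stride 2, a common setting that the slide does not state explicitly; the feature-map values are made up.

```python
def max_pool2d(feature_map, kernel_size=2, stride=2):
    """Keep the maximum of each kernel_size x kernel_size window."""
    out = (len(feature_map) - kernel_size) // stride + 1  # o = (c - k) / s + 1
    return [
        [
            max(
                feature_map[r * stride + i][c * stride + j]
                for i in range(kernel_size)
                for j in range(kernel_size)
            )
            for c in range(out)
        ]
        for r in range(out)
    ]


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 4], [8, 9]]
```

With c = 4, k = 2 and s = 2 the formula gives o = 2, so the 4x4 map shrinks to 2x2.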
Unpooling / Upsampling
1. Performs the reverse operation of pooling.
2. Used to reconstruct input images from their feature representation.
3. When the number of upsampling layers equals the number of downsampling layers, the output size is equal to the input size.
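A simple way to upsample is nearest-neighbour repetition, sketched below. This is one possible choice made for illustration (the slide does not say which upsampling method is used), and it restores the spatial size of the map, not the values discarded by pooling.

```python
def upsample_nearest(feature_map, factor=2):
    """Repeat every value `factor` times along both axes."""
    upsampled = []
    for row in feature_map:
        wide_row = [value for value in row for _ in range(factor)]
        for _ in range(factor):
            upsampled.append(list(wide_row))
    return upsampled


# Hypothetical 2x2 map, e.g. the result of 2x2 max-pooling on a 4x4 input.
pooled = [[6, 4], [8, 9]]

restored = upsample_nearest(pooled)
for row in restored:
    print(row)
```

One 2x upsampling step undoes the size reduction of one 2x2, stride-2 pooling step, which is why matching the number of upsampling and downsampling layers brings the output back to the input size.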
ML and DL Libraries in Python
Some Real-Life Problem Analysis
Text line segmentation from warped printed and Handwritten Document Images
1. Severe distortions such as warping, perspective distortion and multiple folds are present
2. Handwritten documents with touching or overlapping characters
Arpita Dutta, Arpan Garai, Samit Biswas, and Amit Kumar Das (2021). Segmentation of text
lines using multi-scale CNN from warped printed and handwritten document images.
International Journal On Document Analysis and Recognition (IJDAR) 24,
299–313 (2021). https://doi.org/10.1007/s10032-021-00370-8
Pixel Level segmentation problem
Input Output
Multi-Scale CNN Model
Deep Learning Model Configuration
Ground Truth Annotation : Semi Automatic Way
Touching components, overlapping components and splitting lines are marked with red
circles, green circles and blue lines, respectively. The height of the black rectangle
signifies the height of the minimum bounding box of the components that do not
intersect a splitting line. The height of the blue rectangle denotes Tc and
the height of the brown rectangle signifies Oc.
Hc -> the mean of all these heights. Tc >= 2 * Hc
Oc -> the height of an overlapping component, Hc < Oc < 2 * Hc

The width of Hc / 7 pixels marked with red rectangular boxes;
Generated ground truth images

Output on English Warped Document Images
Output on Bangla Warped Document Images
Output On Bengali Handwritten Dataset
Output on English Handwritten Documents
Comic Document Image Analysis
Panel/Character extraction from Comic
Document Page Images
Bounding Box labelling Problem
Localization + Classification Problem.
Arpita Dutta and Samit Biswas (2019). CNN Based Extraction of
Panels/Characters from Bengali Comic Book Page Images. 2019 International
Conference on Document Analysis and Recognition Workshops (ICDARW), 2019,
pp. 38-43. https://doi.org/10.1109/ICDARW.2019.00012
Input Output
Methodology
Backbone Architecture
Proposed Deep Learning Model
Post Processing
Total number of bounding boxes = (52 * 52 * 3) + (26 * 26 * 3) + (13 * 13 * 3) = 10647
Post processing to select relevant bounding boxes
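The bounding-box count above comes from three grid scales with three boxes per grid cell. The sketch below reproduces that arithmetic and then shows one common way to select relevant boxes: a confidence threshold followed by greedy non-maximum suppression (NMS). The slides do not say which post-processing is used, so the NMS step, the thresholds and the example boxes are assumptions.

```python
def total_boxes(grid_sizes=(52, 26, 13), boxes_per_cell=3):
    """Number of predicted boxes over all grid scales."""
    return sum(g * g * boxes_per_cell for g in grid_sizes)


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_boxes(boxes, scores, score_threshold=0.5, iou_threshold=0.5):
    """Drop low-confidence boxes, then greedily drop boxes overlapping a better one."""
    candidates = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in candidates:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


# Hypothetical predictions: two near-duplicates, one separate box, one weak box.
boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 300, 300), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7, 0.2]

print(total_boxes())                # 10647
print(select_boxes(boxes, scores))  # indices of the boxes that survive
```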

Result On Bengali Comic Dataset
Result on Japanese Comic Manga Dataset
Result On French Comics
Speech balloons and narrative Text box segmentation from Comic Images
Constraints : varieties of outlines, structural layouts depending on artists’ choice
Arpita Dutta, Samit Biswas, and Amit Kumar Das (2021). CNN-based segmentation of
speech balloons and narrative text boxes from comic book page images. International
Journal On Document Analysis and Recognition (IJDAR) 24, 49–62 (2021).
https://doi.org/10.1007/s10032-021-00366-4
Methodology
Ground Truth Generation : Semi automatic Way
Proposed Deep Learning Architecture
Attention Module (AM) output
Result On English Comic Dataset
Further Reading Materials and Resources
1. Machine Learning, Tom Mitchell, McGraw Hill, 1997
2. Neural Networks And Learning Machines, Simon Haykin, 2008
3. Pattern Recognition and Machine Learning, Christopher M. Bishop, 2006
4. Deep Learning, Ian Goodfellow, Yoshua Bengio and Aaron Courville, 2016

Thank You !
Contact :
https://in.linkedin.com/in/arpita-dutta-301167191
https://scholar.google.com/citations?hl=en&user=1EX1os8ly-0C
https://orcid.org/0000-0002-4220-3418
https://www.researchgate.net/profile/Arpita-Dutta-11
