Day 2


Machine Learning And Deep

Learning: An Overview
Date : 11.07.2023
Samit Biswas
Assistant Professor
Department Of Computer Science and Technology
IIEST, Shibpur
Lecture Topics

• What is Machine Learning?


• What are the types of Machine learning algorithms?
• Applications
• Neural Networks and Deep Learning
• ML and DL Libraries in PYTHON
• Case Studies
Pioneer of AI

Source : https://www.forbes.com/sites/gilpress/2021/05/28/on-thinking-machines-machine-learning-and-how-ai-took-over-statistics/?sh=397aa8792513

Source : https://towardsdatascience.com/the-inception-of-machine-learning-90b9fc3737ff

Arthur L Samuel. “Some studies in machine learning using the game of checkers”.
In: IBM Journal of research and development 3.3 (1959), pp. 210–229.
Traditional Programming

Input + Program → Computer → Output

Machine Learning

Input + Output → Computer → Program
AI ⊃ Machine Learning ⊃ Deep Learning

• AI : intelligence created by rules
• Machine Learning : rules learned from data without being explicitly programmed
• Deep Learning : learns data representations through abstraction
Applications

• Email Detection (Spam or Not)
• Face Detection / Matching
• Prediction in Stock Markets
• Weather Predictions
• Recommender Systems
• Improved Healthcare Systems
Types of ML Algorithms

Supervised Learning
• Data is labelled
• Learns from previous feedback
• Predicts future outcomes

Unsupervised Learning
• Unlabelled data
• No previous feedback
• Analyzes hidden patterns within the data

Reinforcement Learning
• Decision making
• Maximizes reward
• Learns a series of actions

Source : Raschka and Mirjalili (2019). Python Machine Learning, 3rd Edition
Supervised Learning

Labelled data “supervises” the machine to predict the correct output.

[Diagram: training shapes with labels (circle, rectangle, triangle) are fed to the model, which then predicts “Circle” for a new input]
Supervised Learning : Type

Supervised Learning

• Classification
• Regression
Classification Vs Regression

Classification example : an image → classification model → discrete label (Good or Bad?)

Regression example : house area (sq.ft) → regression model → price prediction (continuous value)
Classification

Binary classification with two features (f1 & f2): predict the class label (L).

[Plot: data points in the f1–f2 feature plane separated by a linear decision boundary]
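A minimal scikit-learn sketch of this setup, assuming synthetic data for the two features f1 and f2 and logistic regression as the linear classifier (neither is specified on the slide):

```python
# Minimal sketch: binary classification with two features (f1, f2) and a
# linear decision boundary, using scikit-learn on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X0 = rng.normal(loc=[-1.0, -1.0], scale=0.5, size=(50, 2))  # class 0 points
X1 = rng.normal(loc=[+1.0, +1.0], scale=0.5, size=(50, 2))  # class 1 points
X = np.vstack([X0, X1])            # columns are the features f1 and f2
y = np.array([0] * 50 + [1] * 50)  # class labels L

clf = LogisticRegression().fit(X, y)   # learns a linear decision boundary
print(clf.predict([[0.8, 1.2]]))       # predicted label for a new point
print(clf.coef_, clf.intercept_)       # parameters of the learned boundary
```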
Regression

[Plot: a regression curve maps the input features / observations (f) to a continuous target T (output)]
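A similarly minimal sketch for regression, reusing the house-price example from the previous slide with scikit-learn's LinearRegression (the data is illustrative, not from the lecture):

```python
# Minimal sketch: regression maps input features f to a continuous target T.
# Synthetic "area (sq.ft) -> price" data is assumed for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

area = np.array([[600.0], [850.0], [1000.0], [1200.0], [1500.0]])  # feature f
price = np.array([30.0, 42.0, 50.0, 61.0, 75.0])                   # target T

reg = LinearRegression().fit(area, price)
print(reg.predict([[1100.0]]))   # predicted price for an unseen area
```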
Unsupervised Learning : Type

Unsupervised Learning

• Clustering
• Dimensionality Reduction
Clustering

Find the hidden structure and relationships among unlabelled data.

Group similar data points into the same clusters or partitions.

APPLICATION : Improving a marketing strategy

• Input data : ages and purchase history of customers
• Goal : group customers based on spending
K-Means Clustering

1. Choose the number of clusters K (here 3).

2. Initialize the centroids.

3. Assign each data point to the nearest cluster using the Euclidean distance metric,
   i.e. choose the cluster whose centroid is at minimum distance from the data point.

4. Reinitialize the centroids by calculating the average of all data points in each cluster.

5. Repeat Steps 3 & 4 until the assignments of data points to clusters no longer change.
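A from-scratch NumPy sketch that follows the five steps above; the customer data is made up purely for illustration:

```python
# From-scratch sketch of the K-Means steps listed above (NumPy only).
import numpy as np

def kmeans(X, K=3, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Steps 1 & 2: choose K and initialize centroids from random data points.
    centroids = X[rng.choice(len(X), K, replace=False)]
    for _ in range(n_iter):
        # Step 3: assign each point to the nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: reinitialize centroids as the mean of each cluster's points.
        new_centroids = np.array([X[labels == k].mean(axis=0) for k in range(K)])
        # Step 5: stop when the centroids (hence the assignments) no longer change.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical customer data: [age, purchase amount]
X = np.array([[22, 300], [25, 320], [40, 150], [43, 170], [60, 800], [58, 760]], dtype=float)
labels, centroids = kmeans(X, K=3)
print(labels, centroids, sep="\n")
```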
Dimensionality Reduction
1. Removes the least important variables and noise from the data.

2. Reduces model complexity and helps prevent overfitting.

Source : https://towardsdatascience.com/dimensionality-reduction-cheatsheet-15060fee3aa
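As an illustration, a short scikit-learn sketch using PCA, one common dimensionality-reduction technique (the slide does not name a specific method):

```python
# Sketch: dimensionality reduction with PCA (one common technique) in scikit-learn.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                            # 200 samples, 10 features
X[:, 5:] = X[:, :5] + 0.01 * rng.normal(size=(200, 5))    # make half the columns nearly redundant

pca = PCA(n_components=5)                # keep only the 5 most informative directions
X_reduced = pca.fit_transform(X)         # lower-dimensional representation
print(X_reduced.shape)                   # (200, 5)
print(pca.explained_variance_ratio_)     # variance retained by each component
```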
Supervised vs Unsupervised Learning

              Supervised Learning      Unsupervised Learning
Discrete      Classification           Clustering
Continuous    Regression               Dimensionality Reduction
Machine Learning (Structured Data)

Source : Stevens et al., Deep Learning with PyTorch. Manning, 2020


Deep Learning vs Machine Learning

Source : Stevens et al., Deep Learning with PyTorch. Manning, 2020


Classification Reminder (Deep Learning)

Input (I) = [flower image] → Deep Learning Model → Output

Label (L) = Flower
Mathematical Mapping

Training set : T = { <x[i], y[i]>, i = 1, 2, 3, ..., n }

Function : ŷ = f(x)

Data representation for classification :

f : R^m -> R^L, where L = no. of classes
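A tiny NumPy sketch of such a mapping f : R^m -> R^L, assuming a linear layer followed by softmax; the weights are random placeholders rather than parameters learned from the training set T:

```python
# Sketch of the mapping f : R^m -> R^L for classification.
# The weights here are random placeholders; a real model would learn them from T.
import numpy as np

m, L = 4, 3                               # m input features, L classes
rng = np.random.default_rng(0)
W, b = rng.normal(size=(L, m)), np.zeros(L)

def f(x):
    scores = W @ x + b                    # linear map from R^m to R^L
    exp = np.exp(scores - scores.max())   # softmax turns scores into class probabilities
    return exp / exp.sum()

x = np.array([0.5, -1.2, 3.0, 0.1])       # one input sample x[i]
print(f(x), f(x).argmax())                # class probabilities and predicted label
```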
Convolutional Neural Network (CNN)

Not invented overnight!

Source : CS231N_2018
Concept Applied First

Source : Illustration of LeCun et al. 1998


Basic Operations on CNN

1. Convolution

2. ReLU

3. Pooling/Downsampling

4. Unpooling/ Upsampling
Convolution
1. Feature extractor : maps an image to a multi-dimensional feature representation.

2. If the input to the convolutional layer is denoted by I, the kernel size by k, and the stride (i.e. the step by which the kernel slides over the input) by s, then the output size o obtained after applying the convolution is

   o = (I − k) / s + 1

Source : CS231n, 2018
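A small helper implementing this output-size formula (assuming no padding, as in the formula above):

```python
# Helper for the convolution output-size formula o = (I - k) / s + 1
# (no padding assumed, matching the formula above).
def conv_output_size(i: int, k: int, s: int) -> int:
    return (i - k) // s + 1

# Example: a 224x224 input, 7x7 kernel, stride 2 -> 109x109 feature map.
print(conv_output_size(224, 7, 2))   # 109
```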
Pooling/Downsampling
1. Reduces the dimensions and introduces invariance to small translations of the input images.

2. If the output of the convolutional layer is denoted by c, the kernel size by k, and the stride (i.e. the step by which the kernel slides) by s, then the output size o obtained after applying max-pooling is

   o = (c − k) / s + 1

Source : CS231n, 2018
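A quick PyTorch check (assuming PyTorch is available; this is not the lecture's code) that max-pooling shrinks a feature map according to the formula above:

```python
# Sketch: verify the max-pooling output-size formula o = (c - k) / s + 1 with PyTorch.
import torch
import torch.nn as nn

c, k, s = 16, 2, 2                         # input size, kernel size, stride
x = torch.randn(1, 8, c, c)                # (batch, channels, height, width)
pool = nn.MaxPool2d(kernel_size=k, stride=s)
y = pool(x)

print(y.shape)                             # torch.Size([1, 8, 8, 8])
print((c - k) // s + 1)                    # 8, matching the formula
```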
Unpooling / Upsampling

1. Performs the reverse operation of pooling.

2. Used to reconstruct input images from their feature representations.

3. When the number of upsampling layers equals the number of downsampling layers, the output size is equal to the input size.
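A short PyTorch sketch of point 3 (an illustration, not the lecture's code): max-pooling followed by max-unpooling brings the feature map back to the input's spatial size.

```python
# Sketch: pooling followed by unpooling restores the original spatial size (PyTorch).
import torch
import torch.nn as nn

pool = nn.MaxPool2d(2, stride=2, return_indices=True)   # downsampling, keeps argmax indices
unpool = nn.MaxUnpool2d(2, stride=2)                     # reverse operation (upsampling)

x = torch.randn(1, 3, 32, 32)
y, idx = pool(x)          # y: (1, 3, 16, 16)
x_rec = unpool(y, idx)    # back to (1, 3, 32, 32); non-max positions are filled with zeros

print(y.shape, x_rec.shape)
```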
ML and DL Libraries in Python
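The original slide presumably shows library logos; as an assumed, representative set, commonly used Python libraries for ML and DL include NumPy, pandas, scikit-learn, PyTorch and TensorFlow:

```python
# Commonly used Python libraries for ML and DL (an assumed list; install with pip or conda).
import numpy as np                                     # numerical arrays
import pandas as pd                                    # tabular data handling
from sklearn.linear_model import LogisticRegression   # classical ML (scikit-learn)
import torch                                           # deep learning (PyTorch)
import tensorflow as tf                                # deep learning (TensorFlow / Keras)

print(np.__version__, pd.__version__, torch.__version__, tf.__version__)
```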
Some Real-Life Problem Analysis

Text line segmentation from warped printed and handwritten document images

1. Severe distortions such as warping, perspective distortion, multiple folds, etc. are present.

2. Handwritten documents contain touching or overlapping characters.

Arpita Dutta, Arpan Garai, Samit Biswas, and Amit Kumar Das (2021). Segmentation of text lines using multi-scale CNN from warped printed and handwritten document images. International Journal on Document Analysis and Recognition (IJDAR) 24, 299–313 (2021). https://doi.org/10.1007/s10032-021-00370-8
Pixel Level segmentation problem

Input Output
Multi-Scale CNN Model
Deep Learning Model Configuration
Ground Truth Annotation : Semi Automatic Way

Touching components, overlapping components and splitting lines are marked with red circles, green circles and blue lines, respectively. The height of the black rectangle signifies the height of the minimum bounding box of the components that do not intersect a splitting line. The height of the blue rectangle denotes Tc and the height of the brown rectangle signifies Oc.

Hc -> the mean of all these heights; Tc -> the height of a touching component, Tc >= 2 * Hc

Oc -> the height of an overlapping component, Hc < Oc < 2 * Hc

Regions of width Hc / 7 pixels are marked with red rectangular boxes.
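A hypothetical helper (not the paper's code) expressing the height-based rules above, comparing a component's bounding-box height h against the mean height Hc:

```python
# Hypothetical sketch of the height-based rules described above (not the paper's code):
# a component of height h is "touching" if h >= 2*Hc and "overlapping" if Hc < h < 2*Hc,
# where Hc is the mean height of the components that do not intersect a splitting line.
def classify_component(h: float, Hc: float) -> str:
    if h >= 2 * Hc:
        return "touching"      # Tc >= 2 * Hc
    if Hc < h < 2 * Hc:
        return "overlapping"   # Hc < Oc < 2 * Hc
    return "normal"

Hc = 20.0   # assumed mean component height in pixels
for h in (18.0, 30.0, 45.0):
    print(h, classify_component(h, Hc))
```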

Generated ground truth images


Output on English Warped Document Images
Output on Bangla Warped Document Images
Output On Bengali Handwritten Dataset
Output on English Handwritten Documents
Comic Document Image Analysis
Panel/Character extraction from Comic
Document Page Images

Bounding Box labelling Problem

Localization + Classification Problem.

Arpita Dutta and Samit Biswas (2019). CNN Based Extraction of Panels/Characters from Bengali Comic Book Page Images. 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), 2019, pp. 38-43. https://doi.org/10.1109/ICDARW.2019.00012
Input Output
Methodology
Backbone Architecture
Proposed Deep Learning Model
Post Processing

Total number of bounding boxes = (52 * 52 * 3) + (26 * 26 * 3) + (13 * 13 * 3) = 10647

Post processing to select relevant bounding boxes
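A quick arithmetic check of the figure quoted above (3 boxes per grid cell over the 52×52, 26×26 and 13×13 grids); the actual selection of relevant boxes is model-specific and not shown here:

```python
# Quick check of the total number of predicted boxes quoted above:
# 3 boxes per cell over 52x52, 26x26 and 13x13 grids.
grids, boxes_per_cell = (52, 26, 13), 3
total = sum(g * g * boxes_per_cell for g in grids)
print(total)   # 10647
```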


Result On Bengali Comic Dataset
Result on Japanese Comic Manga Dataset
Result On French Comics
Speech balloons and narrative Text box segmentation
from Comic Images

Constraints : a wide variety of outlines and structural layouts, depending on the artist's choices.

Arpita Dutta, Samit Biswas, and Amit Kumar Das (2021). CNN-based segmentation of speech balloons and narrative text boxes from comic book page images. International Journal on Document Analysis and Recognition (IJDAR) 24, 49–62 (2021). https://doi.org/10.1007/s10032-021-00366-4
Methodology
Ground Truth Generation : Semi automatic Way
Proposed Deep Learning Architecture
Attention Module (AM) output
Result On English Comic Dataset
Further Reading Materials and Resources

1. Machine Learning, Tom Mitchell, McGraw Hill, 1997.

2. Neural Networks and Learning Machines, Simon Haykin, 2008.

3. Pattern Recognition and Machine Learning, Christopher M. Bishop, 2006.

4. Deep Learning, Ian Goodfellow, 2015.


Thank You !

Contact :

https://in.linkedin.com/in/arpita-dutta-301167191

https://scholar.google.com/citations?hl=en&user=1EX1os8ly-0C

https://orcid.org/0000-0002-4220-3418

https://www.researchgate.net/profile/Arpita-Dutta-11
