Day 2
Learning: An Overview
Date : 11.07.2023
Samit Biswas
Assistant Professor
Department Of Computer Science and Technology
IIEST, Shibpur
Lecture Topics
Source : https://www.forbes.com/sites/gilpress/2021/05/28/on-thinking-machines-machine-learning-and-how-ai-took-over-statistics/?sh=397aa8792513
Source : https://towardsdatascience.com/the-inception-of-machine-learning-90b9fc3737ff
Arthur L Samuel. “Some studies in machine learning using the game of checkers”.
In: IBM Journal of research and development 3.3 (1959), pp. 210–229.
Traditional Programming vs Machine Learning
[Figure: block diagrams relating Input, Computer, Program, and Output for each approach]
[Figure: relationship between AI, Machine Learning, and Deep Learning]
Supervised Learning
• Data is labelled
• Learns from previous feedback
• Predicts future outcomes
Unsupervised Learning
• Data is unlabelled
• No previous feedback
• Analyzes hidden patterns within the data
Reinforcement Learning
• Decision making
• Maximizes reward
• Learns a series of actions
Source : Raschka and Mirjalily (2019). Python Machine Learning, 3rd Edition
Supervised Learning
[Figure: labelled inputs (Circle, Rectangle, Triangle) are used to train a model; for a new input the model outputs the label "Circle"]
Supervised Learning : Types
• Classification
• Regression
Classification vs Regression
[Figure: Classification panel (Image) showing a linear decision boundary in the f1-f2 feature plane; Regression panel with axis label Area (sq. ft)]
Regression
[Figure: regression plot; horizontal axis: f (input features / observations); vertical axis: target T (output)]
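As a minimal sketch of regression, the snippet below fits a straight line from one input feature f to a target T by least squares. The function name `fit_line`, the area values, and the use of price as the target are all hypothetical and chosen only for illustration; the slides do not prescribe a particular model.

```python
import numpy as np

def fit_line(f, t):
    """Least-squares fit of t ~ w * f + b for a single input feature."""
    f = np.asarray(f, dtype=float)
    t = np.asarray(t, dtype=float)
    # Design matrix with a column of ones for the intercept b.
    A = np.column_stack([f, np.ones_like(f)])
    (w, b), *_ = np.linalg.lstsq(A, t, rcond=None)
    return w, b

# Hypothetical data: area (sq. ft) as the feature, price as the target.
area = [500.0, 750.0, 1000.0, 1250.0]
price = [50.0, 75.0, 100.0, 125.0]

w, b = fit_line(area, price)
predicted = w * 1100.0 + b  # prediction for an unseen area
```

Because the target is a continuous value rather than a class label, the model is judged by how close `predicted` is to the true value, not by whether it falls on the right side of a decision boundary.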
Unsupervised Learning : Types
• Clustering
• Dimensionality Reduction
Clustering
1. Choose the number of clusters k.
2. Initialize the centroids.
3. Assign each data point to the nearest cluster using the Euclidean distance metric: choose the cluster whose centroid is at minimum distance from the data point.
4. Recompute each centroid as the average of all data points assigned to that cluster.
5. Repeat Steps 3 and 4 until the assignment of data points to clusters no longer changes.
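The steps above correspond to what is commonly called k-means clustering. A minimal NumPy sketch follows, assuming centroids are initialized from randomly chosen data points (the slides do not say how initialization is done); the function name `k_means`, its parameters, and the sample data are hypothetical.

```python
import numpy as np

def k_means(points, k, max_iters=100, seed=0):
    """Plain k-means following the listed steps."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 2: initialize centroids with k distinct data points (an assumption).
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    labels = None
    for _ in range(max_iters):
        # Step 3: Euclidean distance from every point to every centroid.
        distances = np.linalg.norm(
            points[:, None, :] - centroids[None, :, :], axis=2
        )
        new_labels = distances.argmin(axis=1)
        # Step 5: stop once the assignments no longer change.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Hypothetical data: two well-separated groups of three points each.
data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
        [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
labels, centroids = k_means(data, k=2)
```

Which group receives label 0 and which receives label 1 depends on the initialization, so only the grouping itself is meaningful.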
Dimensionality Reduction
1. Removes the least important variables and noise from the data
Source : https://towardsdatascience.com/dimensionality-reduction-cheatsheet-15060fee3aa
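One common way to discard low-importance directions is principal component analysis (PCA). The slides do not name a specific method, so the sketch below is only one possible illustration; `pca_reduce` and the sample matrix are hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return X_centered @ Vt[:n_components].T, explained

# Hypothetical data: the second feature is roughly twice the first,
# and the third feature is almost constant (close to pure noise).
X = np.array([[1.0, 2.0, 0.01],
              [2.0, 4.1, 0.00],
              [3.0, 5.9, 0.01],
              [4.0, 8.0, 0.00]])

Z, explained = pca_reduce(X, n_components=1)  # Z has one column per sample
```

Here `explained` gives the fraction of total variance carried by each component; when the first entry dominates, keeping a single component loses very little information.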
Supervised vs Unsupervised Learning
[Figure: side-by-side comparison of Supervised Learning and Unsupervised Learning]
[Figure: Input (I) → Deep Learning Model → Output]
Data Representation
Classification
$O: \mathbb{R}^m \rightarrow \mathbb{R}^L$, where $L$ = number of classes
Convolutional Neural Network (CNN)
Source : CS231N_2018
Concepts Applied First
1. Convolution
2. ReLU
3. Pooling / Downsampling
4. Unpooling / Upsampling
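As a rough sketch of two of the listed concepts, the snippet below shows ReLU and a simple nearest-neighbour upsampling. Nearest-neighbour repetition is only one of several upsampling schemes and is not max-unpooling with stored indices; the slides do not say which variant is meant, so treat the helper `upsample_nearest` as an assumption.

```python
import numpy as np

def relu(x):
    """ReLU: keep positive values, replace negative values with zero."""
    return np.maximum(x, 0)

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a 2-D map by an integer factor."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

feature_map = np.array([[-1.0, 2.0],
                        [3.0, -4.0]])

activated = relu(feature_map)               # negatives become 0
upsampled = upsample_nearest(activated, 2)  # 2x2 map becomes 4x4
```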
Convolution
1. Feature extractor: maps an image to a multi-dimensional feature representation.
2. Let I denote the input size at a convolutional layer, k the kernel size, and s the stride (the number of pixels the kernel slides per step). The output size o after applying the convolution is
$o = \frac{I - k}{s} + 1$
Source : cs231n_2018
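The output-size formula o = (I - k)/s + 1 can be checked with a naive single-channel convolution. This is a sketch under stated assumptions: no padding, a square kernel, and the cross-correlation form used in CNNs; the function names are hypothetical.

```python
import numpy as np

def conv_output_size(i, k, s):
    """Output size o = (i - k) / s + 1 for input i, kernel k, stride s (no padding)."""
    if (i - k) % s != 0:
        raise ValueError("kernel does not tile the input evenly for this stride")
    return (i - k) // s + 1

def conv2d_valid(image, kernel, stride=1):
    """Naive single-channel convolution without padding (cross-correlation)."""
    k = kernel.shape[0]  # assumes a square kernel
    out_h = conv_output_size(image.shape[0], k, stride)
    out_w = conv_output_size(image.shape[1], k, stride)
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            patch = image[r * stride:r * stride + k, c * stride:c * stride + k]
            out[r, c] = np.sum(patch * kernel)
    return out

image = np.arange(49, dtype=float).reshape(7, 7)  # I = 7
kernel = np.ones((3, 3))                          # k = 3
features = conv2d_valid(image, kernel, stride=2)  # s = 2, so o = (7 - 3) / 2 + 1 = 3
```

With I = 7, k = 3, and s = 2 the result is a 3 x 3 feature map, matching the formula.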
Pooling/Downsampling
1. Reduces the dimension and introduces invariance to small translations in the input images.
2. Let c denote the size of the output from the convolutional layer, k the kernel size, and s the stride (the number of pixels the kernel slides per step). The output size o after applying max-pooling is
$o = \frac{c - k}{s} + 1$
Source : Cs231n_2018
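A minimal max-pooling sketch illustrating o = (c - k)/s + 1 follows. It assumes a single-channel 2-D feature map and uses floor division when the window does not tile the map evenly; the function name `max_pool2d` and the sample values are hypothetical.

```python
import numpy as np

def max_pool2d(feature_map, k, s):
    """Max-pooling with a k x k window and stride s on a 2-D feature map."""
    out_h = (feature_map.shape[0] - k) // s + 1
    out_w = (feature_map.shape[1] - k) // s + 1
    pooled = np.empty((out_h, out_w), dtype=feature_map.dtype)
    for r in range(out_h):
        for col in range(out_w):
            window = feature_map[r * s:r * s + k, col * s:col * s + k]
            pooled[r, col] = window.max()  # keep the strongest response
    return pooled

c = np.array([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 9, 0],
              [1, 8, 3, 4]])

pooled = max_pool2d(c, k=2, s=2)  # o = (4 - 2) / 2 + 1 = 2, so the result is 2 x 2
```

Each output value is the maximum of one 2 x 2 window, so shifting a strong activation by one pixel inside its window leaves the pooled output unchanged.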
Unpooling / Upsampling
1. Severe distortions such as warping, perspective distortion, multiple folds, etc. are present
Arpita Dutta, Arpan Garai, Samit Biswas, and Amit Kumar Das (2021). Segmentation of text lines using multi-scale CNN from warped printed and handwritten document images. International Journal on Document Analysis and Recognition (IJDAR) 24, 299–313. https://doi.org/10.1007/s10032-021-00370-8
Pixel-Level Segmentation Problem
[Figure: Input and Output]
Multi-Scale CNN Model
Deep Learning Model Configuration
Ground Truth Annotation : Semi-Automatic Way
Touching components, overlapping components, and splitting lines are marked with red circles, green circles, and blue lines, respectively. The height of the black rectangle is the height of the minimum bounding box of the components that do not intersect a splitting line. The height of the blue rectangle denotes Tc, and the height of the brown rectangle denotes Oc.
Arpita Dutta, Samit Biswas, and Amit Kumar Das (2021). CNN-based segmentation of speech balloons and narrative text boxes from comic book page images. International Journal on Document Analysis and Recognition (IJDAR) 24, 49–62. https://doi.org/10.1007/s10032-021-00366-4
Methodology
Ground Truth Generation : Semi-Automatic Way
Proposed Deep Learning Architecture
Attention Module (AM) output
Result On English Comic Dataset
Further Reading Materials and Resources
Contact :
https://in.linkedin.com/in/arpita-dutta-301167191
https://scholar.google.com/citations?hl=en&user=1EX1os8ly-0C
https://orcid.org/0000-0002-4220-3418
https://www.researchgate.net/profile/Arpita-Dutta-11