Digital Image Processing

This document proposes an unsupervised deep image clustering (DIC) method for image segmentation. It consists of two parts: 1) a feature transformation subnetwork that extracts features from the image using a CNN architecture, and 2) a deep clustering subnetwork that iteratively clusters the features into segments. The method is tested on the Berkeley Segmentation Dataset and achieves state-of-the-art performance according to multiple evaluation metrics, outperforming methods like K-means, mean-shift, and normalized cuts. Visual results also show it effectively merges similar regions and separates diverse ones.

Uploaded by

Unaixa Khan

Unsupervised Image Segmentation using Deep Image Clustering

Monazza Qadeer Khan


206154
Introduction
• Object segmentation is one of the most vital operations in image
processing and is carried out prior to image analysis
• Object segmentation is a challenging problem in the field of computer
vision and has been widely applied in areas such as object recognition
and image classification
• Generally speaking, object segmentation methods can be divided into
three categories: unsupervised, semi-supervised and fully supervised
Introduction
• In fully supervised segmentation, an accurately labeled training dataset
is used
• In unsupervised segmentation, there are no ground-truth labels
• The focus of this project is unsupervised image segmentation
• It has two parts: extracting features from the given image and dividing
the image into different regions
Supervised vs. Unsupervised
Problem Statement
• Conventional clustering methods such as K-means, active contour,
normalized cut, MLSS and SAS can be used for segmentation
• These methods have two principal drawbacks: they are sensitive to
segmentation parameters such as the number of clusters, and the whole
procedure is complex and cannot be optimized easily
• So, a deep image clustering (DIC) network is designed and
implemented
• It consists of a feature transformation subnetwork and a
differentiable deep clustering subnetwork; it divides the image space
into different clusters
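The sensitivity of conventional clustering to the cluster number can be illustrated with a small sketch. This is plain K-means on scalar pixel intensities, not the DIC method; the toy pixel values, the helper name `kmeans_1d` and the evenly spaced initialisation are assumptions made for illustration.

```python
def kmeans_1d(values, k, iters=20):
    """Plain K-means on scalar pixel intensities; returns (labels, centers)."""
    lo, hi = min(values), max(values)
    # Deterministic initialisation (an assumption): spread k centers over the range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each pixel joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers


# Toy "image" flattened to a list: dark, mid and bright intensity groups.
pixels = [10, 12, 11, 13, 120, 122, 119, 121, 240, 242, 238, 241]

labels_k2, _ = kmeans_1d(pixels, k=2)  # dark and mid pixels are merged
labels_k3, _ = kmeans_1d(pixels, k=3)  # the three groups are separated
```

With k = 2 the dark and mid groups end up in one segment, while k = 3 separates all three, so the result depends directly on a parameter that has to be chosen by hand.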
Objectives
• Encouraged by neural networks’ flexibility and their ability for modelling intricate
patterns, an unsupervised segmentation framework based on a novel deep image
clustering (DIC) model is proposed
• The DIC consists of a feature transformation subnetwork (FTS) and a trainable
deep clustering subnetwork (DCS) for unsupervised image clustering
• FTS is built on a simple and capable network architecture
• DCS assigns pixels to clusters, for different numbers of clusters, by
iteratively updating cluster associations and cluster centers
Material
• Extensive experiments have been conducted on the Berkeley Segmentation
Database
• The experimental results show that DCS is more effective in aggregating features
during the clustering procedure
• DIC has also proven to be less sensitive to varying segmentation parameters and
to have lower computation cost
• DIC achieves significantly better segmentation performance than the
state-of-the-art techniques it is compared with
Material
Berkeley Segmentation Dataset (BSD)
• The dataset consists of 500 natural images, ground-truth human annotations
and benchmarking code
• The data is explicitly separated into disjoint train, validation and test
subsets
• The dataset is an extension of the BSDS300, where the original 300 images
are used for training / validation and 200 fresh images, together with human
annotations, are added for testing
• Each image was segmented by five different subjects on average
Flow Diagram
Illustration of the proposed DIC framework for unsupervised image segmentation. DIC
consists of an FTS and a DCS, and it is trained with an iterative refinement loss.
Methodology
Unsupervised image segmentation
• Includes technical details like preprocessing steps, features, how they
are extracted, their visualization, model training and testing
• The deep image clustering model consists of two modules:
1. a subnetwork for feature extraction
2. a deep clustering subnetwork
• A super-pixel guided iterative refinement loss is used
• An over-fitting training protocol optimizes the network parameters in an
end-to-end way
Methodology
1. Network architecture for Feature Transformation subnetwork (FTS)

• An autoencoder architecture with a skip connection is used to
construct the feature transformation subnetwork (FTS)
• The CNN for feature extraction is composed of a series of convolution layers
interleaved with batch normalization (BN) and ReLU activations
• FTS consists of six convolution blocks, one max-pooling operation, one
deconvolution operation and a simple convolution operation
Methodology
• We use max-pooling, which down samples the input by a factor of 2, after the
2nd convolution block to increase the receptive field
• Then the 4th convolution block outputs are up-sampled by deconvolution and
concatenated with the 2nd convolution block outputs to pass onto the 5th
convolution block
• After the 6th convolution block and the simple convolution block, feature Y
with dimension C is generated
Methodology
• We use 3×3 convolution filters with the number of output channels set to 64,
128 or 192 in each block, except the last CNN layer, which outputs C channels
• The resulting C-dimensional features Y can be taken as coarse cluster
associations
• In order to aggregate the features more effectively, Y is passed on to the
following deep clustering module, which iteratively updates the pixel-cluster
associations and cluster centers for 𝜏 iterations
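The FTS layout described on these slides can be followed by tracing feature-map shapes through it. The sketch below is only bookkeeping, not a trained network. The slides give the widths 64, 128 and 192 but not which block uses which, so the per-block tuple is an assumption, as are C = 100, size-preserving padding, a deconvolution that keeps the channel count, and the even input size.

```python
def trace_fts_shapes(height, width, c_out=100,
                     block_channels=(64, 128, 128, 192, 128, 64)):
    """Return [(layer name, (channels, height, width)), ...] for the FTS layout.

    block_channels holds hypothetical widths for convolution blocks CB1..CB6.
    """
    if height % 2 or width % 2:
        raise ValueError("this sketch assumes even height and width")
    b = block_channels
    shapes = []
    h, w = height, width
    # CB1 and CB2 run at full resolution (3x3 conv with padding keeps the size).
    shapes.append(("CB1", (b[0], h, w)))
    shapes.append(("CB2", (b[1], h, w)))
    skip_channels = b[1]  # CB2 outputs are kept for the skip connection
    # Max-pooling after the 2nd block halves the spatial size.
    h, w = h // 2, w // 2
    shapes.append(("MP", (b[1], h, w)))
    shapes.append(("CB3", (b[2], h, w)))
    shapes.append(("CB4", (b[3], h, w)))
    # Deconvolution up-samples the 4th block outputs by a factor of 2.
    h, w = h * 2, w * 2
    shapes.append(("DC", (b[3], h, w)))
    # Concatenation with the CB2 outputs along the channel axis.
    shapes.append(("concat", (b[3] + skip_channels, h, w)))
    shapes.append(("CB5", (b[4], h, w)))
    shapes.append(("CB6", (b[5], h, w)))
    # The final simple convolution produces the C-channel feature Y.
    shapes.append(("conv", (c_out, h, w)))
    return shapes


fts_shapes = trace_fts_shapes(320, 480)
for name, shape in fts_shapes:
    print(f"{name:>6}: {shape}")
```

The trace shows why the skip connection is taken before pooling: only the 2nd block outputs have the same spatial size as the up-sampled 4th block outputs, so the two can be concatenated.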
Methodology
The flowchart of the feature transformation subnetwork.

• Convolution block (CB): 3×3 convolution, batch normalization, ReLU
• Max-pooling (MP) with factor 2
• Deconvolution (DC): up-samples features by a factor of 2
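The two resampling operations in this legend can be sketched on a single-channel feature map. The slides do not state the deconvolution kernel size or stride, so the 2×2 kernel with stride 2 used here is an assumption, and the function names are hypothetical.

```python
def max_pool_2x2(x):
    """Max-pooling with factor 2 on a 2-D list (even height and width assumed)."""
    h, w = len(x), len(x[0])
    return [[max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
             for j in range(0, w - 1, 2)]
            for i in range(0, h - 1, 2)]


def deconv_2x2_stride2(x, kernel):
    """Transposed convolution with a 2x2 kernel and stride 2 (doubles the size).

    Each input value is scaled by the kernel and written to its own 2x2 output
    patch; with this stride the patches do not overlap.
    """
    h, w = len(x), len(x[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            for a in range(2):
                for b in range(2):
                    out[2 * i + a][2 * j + b] += x[i][j] * kernel[a][b]
    return out


fmap = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
pooled = max_pool_2x2(fmap)                              # 2 x 2
restored = deconv_2x2_stride2(pooled, [[1, 1], [1, 1]])  # back to 4 x 4
```

In a real network the kernel values would be learned; the all-ones kernel here just makes the up-sampling easy to follow by hand.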
Methodology
2. Deep Clustering Subnetwork

• First, the extracted feature Y is flattened to dimension N × C, where
N = H × W, H is the height of the image, W is the width of the image and C is the
channel number or super-pixel number (SPN). Then a neural-network-based
clustering procedure is designed
• The cluster centers Ω serve as the initialization for feature
clustering. They are defined as Ω = {Ω1, Ω2, Ω3, …, ΩM}, where
M is the default number of clusters and each Ωi has dimension C × 1
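The flattening step and the shape of the cluster centers can be sketched with nested lists. The slides do not say how Ω is initialised, so the evenly spaced selection in `init_centers` is a hypothetical choice, as are both function names and the toy feature values.

```python
def flatten_features(feature_map):
    """Flatten an H x W x C nested list into N rows of C values, N = H * W."""
    return [list(pixel) for row in feature_map for pixel in row]


def init_centers(features, m):
    """Hypothetical initialisation: take m evenly spaced rows as Ω1..ΩM."""
    if not 0 < m <= len(features):
        raise ValueError("m must be between 1 and the number of pixels")
    step = len(features) // m
    return [list(features[i * step]) for i in range(m)]


# Toy feature map Y with H = 2, W = 3, C = 2.
Y = [[[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]],
     [[0.7, 0.3], [0.2, 0.8], [0.0, 1.0]]]

flat = flatten_features(Y)       # N = 6 rows, each of length C = 2
omega = init_centers(flat, m=2)  # M = 2 centers, each of length C = 2
```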
Methodology
The flowchart of the deep clustering subnetwork. DCS contains two iterative steps:
calculating cluster associations H and updating cluster centers Ω
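The two alternating steps can be sketched as a soft K-means style loop. The slides do not give the formula for the associations H, so the softmax over negative squared distances is an assumption rather than the DCS as implemented; `soft_cluster` and `hard_labels` are hypothetical names.

```python
import math


def soft_cluster(features, centers, tau=3):
    """Alternate association and center updates for tau iterations.

    Returns (H, centers): H has one row per pixel and one column per cluster,
    each row sums to 1, and it comes from the last association step.
    """
    centers = [list(c) for c in centers]
    assoc = []
    for _ in range(tau):
        # Step 1: associations H via softmax of negative squared distances (assumed).
        assoc = []
        for y in features:
            dists = [sum((a - b) ** 2 for a, b in zip(y, c)) for c in centers]
            d_min = min(dists)  # subtract the minimum for numerical stability
            exps = [math.exp(-(d - d_min)) for d in dists]
            total = sum(exps)
            assoc.append([e / total for e in exps])
        # Step 2: each center becomes the association-weighted mean of the features.
        for m in range(len(centers)):
            weight = sum(h[m] for h in assoc)
            if weight > 0:
                centers[m] = [
                    sum(h[m] * y[k] for h, y in zip(assoc, features)) / weight
                    for k in range(len(centers[m]))
                ]
    return assoc, centers


def hard_labels(assoc):
    """Pick the cluster with the largest association for every pixel."""
    return [max(range(len(h)), key=lambda m: h[m]) for h in assoc]


features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
H, omega_out = soft_cluster(features, centers=[[0.0, 0.0], [5.0, 5.0]], tau=3)
labels = hard_labels(H)
```

Because both steps use only sums, products and exponentials, a loop of this form stays differentiable, which fits the slides' description of the clustering subnetwork as trainable.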
Experimentation
• Segmentation results are reported on two Berkeley Segmentation Databases, BSDS300 and
BSDS500 [35], which consist of 300 and 500 natural images respectively
• To quantitatively evaluate the segmentation results, five criteria are used:
1. Probabilistic Rand Index (PRI)
2. Variation of Information (VoI)
3. Global Consistency Error (GCE)
4. Boundary Displacement Error (BDE)
5. Segmentation Covering (SC)
• Segmentation performance is better when PRI and SC are larger and the other three
are smaller, all measured against the ground truths
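Two of these criteria can be sketched for flattened label lists. PRI is computed here as the Rand index averaged over the available ground-truth segmentations, and VoI as H(S) + H(G) − 2·I(S, G) with natural logarithms. The function names and toy labels are assumptions, and the sketch is not the benchmark code shipped with the dataset.

```python
import math
from collections import Counter


def _pairs(n):
    """Number of unordered pixel pairs among n pixels."""
    return n * (n - 1) // 2


def rand_index(seg, gt):
    """Fraction of pixel pairs on which seg and gt agree (together or apart)."""
    total = _pairs(len(seg))
    same_both = sum(_pairs(c) for c in Counter(zip(seg, gt)).values())
    same_seg = sum(_pairs(c) for c in Counter(seg).values())
    same_gt = sum(_pairs(c) for c in Counter(gt).values())
    apart_both = total - same_seg - same_gt + same_both
    return (same_both + apart_both) / total


def probabilistic_rand_index(seg, ground_truths):
    """Average Rand index of seg over several human ground truths."""
    return sum(rand_index(seg, gt) for gt in ground_truths) / len(ground_truths)


def variation_of_information(seg, gt):
    """VoI = H(seg) + H(gt) - 2 * I(seg, gt); 0 means identical partitions."""
    n = len(seg)
    count_s, count_g = Counter(seg), Counter(gt)

    def entropy(counts):
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    mutual = sum((c / n) * math.log(c * n / (count_s[s] * count_g[g]))
                 for (s, g), c in Counter(zip(seg, gt)).items())
    return entropy(count_s) + entropy(count_g) - 2 * mutual


seg = [0, 0, 1, 1]
gt_a = [1, 1, 0, 0]  # same partition, different label names
gt_b = [0, 0, 0, 1]  # a different partition

pri = probabilistic_rand_index(seg, [gt_a, gt_b])
voi = variation_of_information(seg, gt_a)
```

Both measures compare partitions rather than label names, so `seg` and `gt_a` count as a perfect match even though their labels are swapped.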
Experimentation
𝜏 is set to 3 according to the cross-validation experiments. The number of training
epochs is set to T = 100, the learning rate is set to 2 and the momentum is set to 0.9
Illustration of the iteration clustering process
Results
• To evaluate the proposed DIC method comprehensively, we compare its average
scores with those of sixteen benchmark algorithms, such as Ncut, Mean-shift,
gPb-owt-ucm, MLSS and W-Net. The optimal image scale (OIS) is selected for
segmenting images in the Berkeley Segmentation Database

• DIC works better in merging similar pixels and separating diverse regions by learning
from local image patterns adaptively
Results
Visual comparison between DIC and other state-of-the-art methods, such as
MLSS and SAS
Demo
• GitHub link: https://github.com/zmbhou/DIC
• BSD dataset link:
https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/
• Contour Detection and Image Segmentation Resources:
http://web.archive.org/web/20160306133802/http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/resources.html#bsds500
Demo
Thank You
Q&A
