Image Classification

This document outlines a lab session focused on image classification, specifically for malaria detection using machine learning techniques. It covers installation of necessary packages, image processing, feature extraction, and model evaluation, emphasizing the importance of combining global and local feature descriptors. The document also provides details on dataset organization, feature extraction methods, and training classifiers using various machine learning algorithms.

Uploaded by

semeriuss

Image classification

Learning Objectives
At the end of this session you will be able to:
● Install missing packages
● Perform image processing
● Perform feature extraction and selection
● Conduct model evaluation
Introduction
Object classification is the ability of a machine learning technique to assign an object to its
corresponding label with the help of descriptors learned from hundreds of images.
This is a supervised learning problem: the user must provide training data (a set of objects
along with their labels) to the machine learning technique so that it learns how to categorize
each object (by learning the features behind it) with respect to its class.
In this lab, you will be introduced to one such object classification problem, namely malaria
detection and classification. It is a hard problem because infected and non-infected cells look
similar. Machine learning is all about learning from past data, so a huge dataset of malaria
images is needed to perform real-time malaria detection and classification.
Without caring too much about real-time malaria classification, in this lab you will see how
to perform a simple image classification task using OpenCV and machine learning algorithms with
the help of Python.

Feature Extraction
Features are the information, expressed as a list of numbers, extracted from an image. These are
numeric values (integer, floating-point or binary). There is a wide range of feature extraction
algorithms in Computer Vision.
When deciding which features could distinguish infected from non-infected cells, Color, Texture
and Shape come to mind as the primary ones. This is an obvious choice to globally quantify and
represent an infected or non-infected cell.

But this approach is unlikely to produce good results if we choose only one feature vector,
because the two classes have many attributes in common; for example, non-infected cells can be
similar to infected cells in terms of color. So, we need to quantify the image by combining
different feature descriptors so that it describes the image more effectively.
Global Feature Descriptors
These are feature descriptors that quantify an image globally. They have no concept of interest
points and thus take in the entire image for processing. Some of the commonly used global
feature descriptors are

Color - Color Channel Statistics (Mean, Standard Deviation) and Color Histogram
Shape - Hu Moments, Zernike Moments
Texture - Haralick Texture, Local Binary Patterns (LBP)
Others - Histogram of Oriented Gradients (HOG), Threshold Adjacency Statistics (TAS)
Local Feature Descriptors
These are feature descriptors that quantify local regions of an image. Interest points are
determined in the entire image, and the image patches/regions surrounding those interest points
are considered for analysis. Some of the commonly used local feature descriptors are

SIFT (Scale Invariant Feature Transform)
SURF (Speeded Up Robust Features)
ORB (Oriented FAST and Rotated BRIEF)
BRIEF (Binary Robust Independent Elementary Features)
Combining Global Features
There are two popular ways to combine these feature vectors. For global feature vectors, we
simply concatenate each feature vector to form a single global feature vector. This is the
approach we will be using in this lab. For local feature vectors, as well as combinations of
global and local feature vectors, we need something called Bag of Visual Words (BOVW). This
approach will not be discussed in this lab, but there are lots of resources to learn this
technique. Normally, it uses a vocabulary builder, K-Means clustering, a linear SVM, and TF-IDF
vectorization.
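The concatenation of global feature vectors can be sketched as follows. This is a minimal illustration only: the three arrays are placeholders, and their lengths (512, 13 and 7) are assumptions based on an 8x8x8 color histogram, the 13 Haralick features and the 7 Hu Moments.

```python
import numpy as np

# Placeholder descriptor outputs for one image (the values are not real features).
color_hist = np.random.default_rng(0).random(512)  # e.g. an 8x8x8 color histogram
haralick = np.zeros(13)                             # Haralick texture: 13 values
hu_moments = np.zeros(7)                            # Hu Moments: 7 values

# Stack the 1-D vectors end to end into one global feature vector.
global_feature = np.hstack([color_hist, haralick, hu_moments])
print(global_feature.shape)  # (532,)
```

Every image must produce a vector of the same length, otherwise the vectors cannot be stacked into one feature matrix later.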
Dataset
Thanks to the National Institutes of Health (NIH), the malaria dataset is available for download
from https://ceb.nlm.nih.gov/proj/malaria/cell_images.zip.
The malaria dataset is composed of a total of 27,558 segmented cell images extracted from thin
blood smear slide images. The cell images are organized into two folders, parasitized and
uninfected, with 13,779 cell images in each, making this a balanced dataset. For more
information about the dataset, please follow the link above.
Organizing Dataset
The folder structure for this lab is set up as follows.
Create a dataset folder with subfolders Status_Healthy and Status_infected inside it.
Randomly select 80% of the uninfected and parasitized cell images and copy them into the
Status_Healthy and Status_infected folders, respectively.
Create another folder, testdata, with subfolders Status_Healthy and Status_infected inside it.
Copy the remaining 20% of the cell images from each category into the corresponding testdata
subfolders.
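The copying steps can be automated with a short script. The sketch below is one possible way to do it, using only the standard library; the function name split_class, the seed, and the source folder names in the commented usage are assumptions, not part of the lab material.

```python
import random
import shutil
from pathlib import Path

def split_class(src_dir, train_dir, test_dir, train_fraction=0.8, seed=42):
    """Copy a random train_fraction of the files in src_dir to train_dir
    and the remaining files to test_dir. Returns (n_train, n_test)."""
    files = sorted(p for p in Path(src_dir).iterdir() if p.is_file())
    random.Random(seed).shuffle(files)  # fixed seed makes the split repeatable
    n_train = int(len(files) * train_fraction)
    for target, subset in ((train_dir, files[:n_train]), (test_dir, files[n_train:])):
        Path(target).mkdir(parents=True, exist_ok=True)
        for f in subset:
            shutil.copy2(f, Path(target) / f.name)
    return n_train, len(files) - n_train

# Hypothetical usage (source folder names are assumed):
# split_class("cell_images/Uninfected", "dataset/Status_Healthy", "testdata/Status_Healthy")
# split_class("cell_images/Parasitized", "dataset/Status_infected", "testdata/Status_infected")
```

Because each file goes to exactly one of the two targets, no test image is ever seen during training.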
Global Feature Extraction

The three global feature descriptors are

Color Histogram, which quantifies the color of infected versus non-infected cells.
Hu Moments, which quantify the shape of infected versus non-infected cells.
Haralick Texture, which quantifies the texture of infected versus non-infected cells.
As you might know, images are matrices, so there should be an efficient way to store the feature
vectors locally. The program takes one image at a time, extracts the three global features,
concatenates them into a single global feature vector and saves it along with its label in the
HDF5 file format.
Instead of HDF5, the “.csv” file format could be used to store the features. But, as the amount
of data is very large, using the HDF5 format is worth it.
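Storing and reloading the features in HDF5 might look like the sketch below. It assumes the h5py package is installed; the function names and the dataset names "features" and "labels" are illustrative choices, not prescribed by the lab.

```python
import h5py
import numpy as np

def save_features(path, features, labels):
    """Write the feature matrix and the integer labels into one HDF5 file."""
    with h5py.File(path, "w") as f:
        f.create_dataset("features", data=np.asarray(features, dtype=np.float64))
        f.create_dataset("labels", data=np.asarray(labels, dtype=np.int64))

def load_features(path):
    """Read the feature matrix and labels back as NumPy arrays."""
    with h5py.File(path, "r") as f:
        return f["features"][:], f["labels"][:]
```

The labels are stored as integers here, so string labels would need to be encoded as numbers before saving.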
Importing Required Libraries
First, the libraries required to carry out the experiment are imported:

scikit-image and OpenCV, to read cell image files and process them as required
matplotlib and seaborn, to view cell images and plot graphs
the basic Python os and math modules
numpy and random, to manipulate arrays and generate random numbers
scikit-learn (sklearn), to carry out feature engineering, model fitting and hyperparameter searching
Installation of libraries
To install a library you can use either conda or pip. For instance, to install OpenCV, run
pip install opencv-python
in the command terminal.
To install the mahotas package, use:
conda install -c conda-forge mahotas
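Before installing anything, it can help to check which packages are actually missing. The following sketch uses only the standard library; the REQUIRED list is an assumption based on the libraries named in this lab (h5py is included because of the HDF5 storage step), and it holds import names, which can differ from the names used with pip.

```python
import importlib.util

# Import names of the packages used in this lab (assumed list).
# Note: cv2 is installed as opencv-python, sklearn as scikit-learn,
# and skimage as scikit-image.
REQUIRED = ["cv2", "mahotas", "numpy", "sklearn", "skimage",
            "h5py", "matplotlib", "seaborn"]

def missing_packages(names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

print("Missing packages:", missing_packages(REQUIRED))
```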
Functions for global feature descriptors
Hu Moments
To extract Hu Moments features from the image, the cv2.HuMoments() function provided by
OpenCV will be used. The argument to this function is the set of image moments returned by
cv2.moments(). The moments of the image are computed, passed to cv2.HuMoments(), and the result
is converted to a vector using flatten(). Before doing that, the color image should be
converted into a grayscale image, as cv2.moments() expects a single-channel image.

Haralick Textures
To extract Haralick Texture features from the image, the mahotas library will be used,
specifically the function mahotas.features.haralick(). Before doing that, the color image should
be converted into a grayscale image, as the Haralick feature descriptor expects a grayscale image.

Color Histogram
To extract Color Histogram features from the image, the cv2.calcHist() function provided by
OpenCV will be used. The arguments it expects are the images, channels, mask, histSize (bins)
and ranges for each channel (typically [0, 256)). The histogram is then normalized using
OpenCV's normalize() function, and a flattened version of this normalized matrix is returned
using flatten().

For each training label name, loop through the corresponding folder to get all the images
inside it. For each image, first resize it to a fixed size. Then extract the three global
features and concatenate them using NumPy’s np.hstack() function. Keep track of each feature
vector and its label using two lists, labels and global_features. You could even use a
dictionary here.
After the features are extracted and concatenated, the data should be saved locally. Before
saving, LabelEncoder() is used to encode the labels in a proper format, so that each label is
represented by a unique number.
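The label encoding step can be illustrated with a few example labels. The label list below is made up for the demonstration; the folder names from this lab are used as label names.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

labels = ["Status_Healthy", "Status_infected", "Status_infected", "Status_Healthy"]

encoder = LabelEncoder()
target = encoder.fit_transform(labels)  # classes are sorted, then numbered from 0

print(encoder.classes_)                 # ['Status_Healthy' 'Status_infected']
print(target)                           # [0 1 1 0]
print(encoder.inverse_transform([1]))   # ['Status_infected']
```

Keeping the fitted encoder around makes it possible to map predicted numbers back to label names with inverse_transform().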
Training classifiers
After extracting, concatenating and saving the global features and labels from the training
dataset, it’s time to train the system. To do that, the machine learning models need to be
created. The scikit-learn library will be used to create them.

Logistic Regression, Linear Discriminant Analysis, K-Nearest Neighbors, Decision Trees,
Random Forests, Gaussian Naive Bayes and Support Vector Machine will be used as the machine
learning models. To understand these algorithms, please refer to resources on the internet.
Furthermore, the train_test_split function provided by scikit-learn will be used to split the
training dataset into train_data and test_data. This way, the models are trained on train_data
and then tested on the unseen test_data. The split size is decided by the test_size parameter.

All the necessary libraries are imported and a models list is created. This list holds all the
machine learning models that will be trained with the locally stored features.
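The models list and the split could be set up as in the sketch below. The synthetic features stand in for the stored feature matrix so that the example runs on its own; the seed, the test_size of 0.2, the short model names and the hyperparameters are all assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

SEED = 9  # assumed seed, for repeatable results

# Synthetic stand-in for the stored features: 200 samples, 20 features, 2 classes.
rng = np.random.default_rng(SEED)
labels = rng.integers(0, 2, size=200)
features = rng.normal(size=(200, 20)) + labels[:, None]  # shift class 1 upward

models = [
    ("LR", LogisticRegression(max_iter=1000, random_state=SEED)),
    ("LDA", LinearDiscriminantAnalysis()),
    ("KNN", KNeighborsClassifier()),
    ("CART", DecisionTreeClassifier(random_state=SEED)),
    ("RF", RandomForestClassifier(n_estimators=100, random_state=SEED)),
    ("NB", GaussianNB()),
    ("SVM", SVC(random_state=SEED)),
]

# Hold out 20% of the samples; stratify keeps the class ratio in both parts.
train_data, test_data, train_labels, test_labels = train_test_split(
    features, labels, test_size=0.2, random_state=SEED, stratify=labels)

scores = {}
for name, model in models:
    model.fit(train_data, train_labels)
    scores[name] = model.score(test_data, test_labels)  # accuracy on held-out data
    print(f"{name}: {scores[name]:.3f}")
```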
Testing classifier
Finally, test the model you built on the images in the testdata folder, which were not used
for training.
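A minimal sketch of this testing step is given here. It assumes a random forest as the chosen model and uses synthetic arrays in place of the features extracted from the dataset and testdata folders; make_split is a hypothetical helper that only generates that stand-in data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

rng = np.random.default_rng(0)

def make_split(n):
    """Generate n synthetic samples with 20 features and a binary label."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 20)) + y[:, None]
    return X, y

train_features, train_labels = make_split(200)  # stands in for dataset/
test_features, test_labels = make_split(50)     # stands in for testdata/

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(train_features, train_labels)
predictions = clf.predict(test_features)

print("Accuracy:", accuracy_score(test_labels, predictions))
# Rows are the true classes, columns are the predicted classes.
print(confusion_matrix(test_labels, predictions))
```

The test images must go through exactly the same resizing and feature extraction as the training images, so that both feature matrices have the same columns.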
Assignment
This question uses the malaria dataset once again. Create a test set consisting of 1/2 of
the data, using the rest for training.
Prepare the features in CSV format and indicate the name of each feature.
Fit a random forest, decision tree, KNN, logistic regression, linear discriminant analysis,
quadratic discriminant analysis, and boosting model to the training data.
Predict the labels for the corresponding test data.
Compute the confusion matrix for the test data.
Compute the AUC (Area Under the Curve) for each classifier.
Plot the ROC curves as evaluated on the test data.
Out of all the classifiers used in this assignment, which would you choose as the final model
for the malaria data?
