AI_Based_Image_Processing
Abstract: On average, one engineer spends 2-3 weeks developing and validating one chart. In addition,
the UX team reviews the chart, and the QA team validates the chart and its data to ensure that the
developed chart is aligned with the given wireframe and meets the business requirements. A comparable
effort is required whenever a product manager suggests a change request or enhancement, because the
entire development life cycle must be followed again. We are proposing a mechanism to -
Problems Solved:
End to end data flow diagram:
Attributes list 1 - Input image details:
Data Source      Field               Initial weight
Image Dataset    Image Name          100%
Image Dataset    Image Path          100%
Image Dataset    Image Resolution    100%
Image Dataset    Image Format        50%
Data Creation:
The first and foremost part is the dataset to be considered for analysis and predictions. The system
uses a dataset formed by gathering images of all the existing dashboard graphs on the Tech Pulse
Portal from Veneer. The images are saved using a specific naming scheme: the name consists of the
chart name followed by the chart type, and the file is saved in a valid image format such as jpg, png
or jpeg. The dataset consists of all the images of the various graphs, stored together in one single
folder, which is used as the dataset for our prediction model.
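
The folder scan described above could look like the following minimal sketch. The helper name
collect_images and the idea of filtering by file extension are assumptions made for illustration;
only the three image formats come from the text.

```python
from pathlib import Path

# Image formats named in the text; anything else in the folder is ignored.
VALID_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def collect_images(folder):
    """Return the sorted paths of all valid image files directly inside folder."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in VALID_EXTENSIONS
    )
```

Calling collect_images on the dataset folder would return every jpg, jpeg and png file in it and skip
anything else.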
Data Cleaning and Processing:
The main challenge lies in handling a highly imbalanced dataset with a low rate of failed data points.
The raw dataset has various issues that must be addressed by pre-processing before training. The
images are therefore resized using the OpenCV Python package to address these challenges and improve
model performance. Further data processing techniques are then applied to make the prediction model
efficient to train.
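
One way to inspect the imbalance before training is to count the images per chart type. The sketch
below is illustrative only: the label list is made up, and inverse-frequency class weighting is one
common option assumed here, not a technique the text confirms.

```python
from collections import Counter


def class_weights(labels):
    """Map each class to total / (num_classes * count), so rare classes weigh more."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}


# Hypothetical labels for a small, imbalanced set of chart images.
labels = ["bar"] * 6 + ["line"] * 3 + ["pie"] * 1
weights = class_weights(labels)
```

In this made-up example the single pie chart receives the largest weight and the six bar charts the
smallest.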
Image Detection:
The Python split() method is used to split the image name string on a specified separator in order to
fetch the chart type, which is stored as the label. One-hot encoding is then applied so that the
fetched labels can be processed easily by the prediction algorithm. It maps the categorical values to
integer values and represents them as binary vectors.
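
The label extraction and encoding described above can be sketched as follows. The underscore
separator, the sample file names and both helper names are assumptions made for illustration; the
text does not state which separator is used.

```python
def chart_type_from_name(file_name, separator="_"):
    """Return the chart type: the last separator-delimited token before the extension."""
    stem = file_name.rsplit(".", 1)[0]   # drop the file extension
    return stem.split(separator)[-1]     # the chart type follows the chart name


def one_hot_encode(labels):
    """Map each label to an integer index, then to a binary vector."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    vectors = [
        [1 if index[label] == i else 0 for i in range(len(classes))]
        for label in labels
    ]
    return classes, vectors


# Hypothetical file names following the "<chart name>_<chart type>" pattern.
names = ["revenue_bar.png", "latency_line.jpg", "share_pie.jpeg", "orders_bar.png"]
labels = [chart_type_from_name(n) for n in names]
classes, vectors = one_hot_encode(labels)
```

Each vector has one position per chart type, with a single 1 marking the type of that image.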
Colour Detection:
By default, OpenCV loads an image in BGR (Blue, Green, Red) channel order. To recover the original
colours, we convert the image to RGB format using the cvtColor() function of the cv2 package.
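
With OpenCV the conversion is typically written as cv2.cvtColor(image, cv2.COLOR_BGR2RGB). The
dependency-free sketch below only shows what that conversion does to the pixel data; the function
name bgr_to_rgb and the tiny nested-list image are illustrative assumptions.

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in a rows x cols x 3 nested list."""
    return [[pixel[::-1] for pixel in row] for row in image]


# A 1 x 2 image in BGR order: one pure-blue pixel and one pure-red pixel.
bgr_image = [[[255, 0, 0], [0, 0, 255]]]
rgb_image = bgr_to_rgb(bgr_image)
```

After the conversion, the blue pixel is stored as [0, 0, 255] and the red pixel as [255, 0, 0], which
is the order most plotting and deep learning libraries expect.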
Model Selection:
Several data mining and deep learning algorithms are useful for predictions, such as Random Forest,
Neural Network and CNN. The choice of algorithm depends mainly on the type of dataset used to build
the model. Here, three models (CNN, OCR and the EAST text detection model) are used because of their
capability to predict and classify the image dataset in a multi-class setting, according to the
requirement.
Random Forest:
The random forests algorithm is a machine learning technique that is increasingly being used for image
classification and for estimating continuous variables such as percent tree cover and forest biomass.
A random forest is an ensemble model, which means that it uses the results from many different models
to calculate a response. In most cases the result from an ensemble model will be better than the
result from any one of the individual models. In the case of random forests, several decision trees
are created (grown) and the response is calculated based on the outcome of all of the decision trees.
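
The way a forest combines its trees for classification can be sketched as a majority vote. The three
hand-written "trees" and the feature names below are purely hypothetical; a real random forest learns
its trees from data.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Return the class predicted by most trees (majority vote)."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Three hypothetical decision rules standing in for trained decision trees.
trees = [
    lambda s: "bar" if s["rectangles"] > 3 else "line",
    lambda s: "bar" if s["curves"] == 0 else "line",
    lambda s: "line" if s["rectangles"] < 10 else "bar",
]
sample = {"rectangles": 5, "curves": 0}
prediction = forest_predict(trees, sample)
```

Two of the three rules vote for "bar", so the ensemble outputs "bar" even though one rule disagrees.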
However, our dataset consists of a large number of images, and a neural network is the next best
option for better accuracy in classifying the images, as it works best with large amounts of data.
The number of parameters in a neural network grows rapidly as the number of layers increases. This
can make training a model computationally heavy (and sometimes infeasible), and tuning so many
parameters can be a very large task. CNNs reduce the time needed to tune these parameters.
CNN:
Convolutional Neural Networks (CNNs) are the most popular neural network models used for image
classification problems. The practical benefit is that having fewer parameters greatly reduces the
time it takes to learn, as well as the amount of data required to train the model. Instead of a fully
connected network of weights from each pixel, a CNN has just enough weights to look at a small patch
of the image. It is like reading a book with a magnifying glass: eventually you read the whole page,
but you look at only a small patch of the page at any given time.
The beauty of a CNN is that the number of parameters is independent of the size of the original image.
You can run the same CNN on a 300 × 300 image, and the number of parameters in the convolution layer
won't change.
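
A small worked calculation illustrates why the parameter count does not depend on image size. The
layer sizes below (3 × 3 kernels, 3 input channels, 32 filters or units, and a 100 × 100 comparison
image) are assumed values chosen only for the arithmetic.

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter; the image size does not appear."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(height, width, channels, units):
    """Fully connected layer: one weight per input value per unit, plus biases."""
    return height * width * channels * units + units


conv = conv_params(3, 3, 3, 32)               # identical for any input size
dense_small = dense_params(100, 100, 3, 32)   # 100 x 100 RGB image
dense_large = dense_params(300, 300, 3, 32)   # 300 x 300 RGB image
```

Under these assumptions the convolution layer needs 896 parameters for either image, while the fully
connected layer grows from 960,032 to 8,640,032 parameters when the image grows from 100 × 100 to
300 × 300.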
Each convolutional layer of a CNN has multiple filters that scan the complete feature matrix and
carry out dimensionality reduction. This makes a CNN a very suitable network for image classification
and processing. A CNN learns its filters automatically, without them being specified explicitly.
These filters help extract the relevant features from the input data.
OCR:
Optical character recognition (OCR) refers to a set of computer vision problems that involve
converting images of digital or handwritten text into machine-readable text, in a form your computer
can process, store and edit as a text file or as part of data entry and manipulation software.
Model Accuracy:
The AUC-ROC curve is the model selection metric for the multi-class classification problem. It shows
how well the model differentiates the given classes in terms of predicted probability. The ROC is a
probability curve for the various classes, calculated after training the model and generating the
predictions. The x-axis displays the False Positive Rate (FPR) and the y-axis displays the True
Positive Rate (TPR). The threshold value is the optimal cut-off point on the curve, located where the
TPR is high and the FPR is low.
Sample Input image and outcome of model:
Image
Advantages:
• With the help of AI, detecting and recognizing objects and patterns in images can be used to
fast-track overall UI development.
• The biggest advantage is automation across the overall life cycle of end-to-end user interface (UI)
development.
• Any change request to existing code can be completed with this solution in a few minutes, which
can save months of development and validation effort.
• Fewer chances of human error in repetitive work.
• Reduction in manual effort for understanding images and developing code using React JS.
• The proposed solution will be a next-generation AI-based image processing solution and can be
used in many applications within HP.
Author:
Ravikumar Patel