AI_Based_Image_Processing

This document proposes an AI-based application that automates the development cycle for generating React JS scripts from images of charts, significantly reducing development and testing efforts. The system predicts chart attributes, generates validation scores, and facilitates easy implementation of changes, thereby streamlining the UI development process. The approach utilizes various machine learning algorithms for image classification and text detection, ultimately enhancing efficiency and accuracy in UI development tasks.

Uploaded by

sadwumble

Title: AI-based application to process images and auto-generate React JS scripts to minimise
development and testing efforts.

Abstract: On average, one engineer spends 2-3 weeks developing and validating a single chart.
In addition, the UX team reviews the chart, and the QA team validates the chart and its data to ensure
that the developed chart is aligned with the given wireframe and meets the business requirement. A
change request or enhancement suggested by the product manager requires comparable effort, because the
entire development life cycle must be followed again. We propose a mechanism to:

• Automate the overall development cycle using an AI model.
• Predict the chart type, chart title, subtitles, legends, legend colours, X-axis and Y-axis.
• Generate React JS code.
• Generate a validation score that compares the UX design with the actual result, saving UX review effort.
• Make any rework or enhancement easy to implement.

Problems Solved:
End to end data flow diagram:
Attributes list 1 - Input image details:

Data Source      Field               Initial weight
Image Dataset    Image Name          100%
Image Dataset    Image Path          100%
Image Dataset    Image Resolution    100%
Image Dataset    Image Format        50%

Derived attributes list 2:

Data Source      Field               Initial weight
Image Dataset    Image Chart Type    100%
Image Dataset    Image Colour        100%
Image Dataset    Image Text          100%

Progress Flow diagram:

Data Creation:
The first and foremost component is the dataset used for analysis and prediction. The system uses a
dataset formed by gathering images of all the existing dashboard graphs on the Tech Pulse Portal from
Veneer. The images are saved using a specific naming convention: the file name consists of the chart
name followed by the chart type, and the file is saved in a valid image format such as jpg, jpeg or
png. All the images of the various graphs are stored together in a single folder, which serves as the
dataset for our prediction model.
Data Cleaning and Processing:
The main challenge lies in handling a highly imbalanced dataset with a low rate of failed data points.
The raw dataset has various issues that must be addressed by pre-processing before training. Images
are therefore resized using the OpenCV Python package to address these challenges and improve model
performance. Further data processing techniques are then applied to make training of the prediction
model efficient.

Image Detection:
Python's split() string method is used to split the image file name on a specified separator in order
to fetch the chart type, which is stored as a label. One-hot encoding is then applied so that the
fetched labels can be easily processed by the prediction algorithm. This maps the categorical values
to integer values, which are then represented as binary vectors.
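
The labelling step above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' script: the underscore separator, the position of the chart type as the last token of the file name, and the sample file names are all assumptions made for the example.

```python
from pathlib import Path


def chart_type_from_filename(filename, separator="_"):
    """Return the chart type, assumed to be the last token of the file stem."""
    stem = Path(filename).stem          # "sales_bar.png" -> "sales_bar"
    return stem.split(separator)[-1].lower()


def one_hot_encode(labels):
    """Map each label to an integer index, then to a binary vector."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    vectors = []
    for label in labels:
        vector = [0] * len(classes)
        vector[index[label]] = 1
        vectors.append(vector)
    return classes, vectors


# Hypothetical file names following a "<chart name>_<chart type>" convention.
filenames = ["sales_bar.png", "defects_pie.jpg", "uptime_line.jpeg", "revenue_bar.png"]
labels = [chart_type_from_filename(name) for name in filenames]
classes, vectors = one_hot_encode(labels)

for name, vector in zip(filenames, vectors):
    print(name, vector)
```

Sorting the class names keeps the index assignment stable between runs, so a vector position always refers to the same chart type.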

Colour Detection:
By default, OpenCV loads an image in BGR (Blue, Green, Red) channel order. To obtain the original
colours, we convert the image to RGB format using the cvtColor() function of the cv2 package.

Text Detection and Extraction:
The prediction model used for text detection performs best at a fixed image resolution of 720p, at
which it runs at about 13 FPS. Hence, the resized image is converted to the required resolution.
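
As a rough sketch of choosing the target size, the helper below scales an image to a 720-pixel height and then snaps both dimensions down to a multiple of 32, which the EAST model as loaded through OpenCV expects for its input. The function name and the choice to round down rather than to the nearest multiple are assumptions for illustration; the actual resize would then be done with OpenCV.

```python
def east_input_size(width, height, target_height=720, multiple=32):
    """Return (new_width, new_height) scaled to target_height, snapped to `multiple`."""
    # Integer arithmetic avoids floating-point drift when scaling the width.
    scaled_width = width * target_height // height

    def snap_down(value):
        # Largest multiple not exceeding value, but never smaller than one multiple.
        return max(multiple, (value // multiple) * multiple)

    return snap_down(scaled_width), snap_down(target_height)


print(east_input_size(1920, 1080))
print(east_input_size(640, 480))
```

Note that 720 itself is not a multiple of 32, so under this rounding rule the height becomes 704.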

Model Selection:
Several data mining and deep learning algorithms are useful for prediction, such as Random Forest,
Neural Network and CNN. The choice of algorithm is driven mainly by the type of dataset used to build
the model. Here three techniques, CNN, OCR and the EAST text detection model, are used for modelling
because of their ability to predict and classify the image dataset for multi-class classification, as
required.

Image Classification and Detection:
Our input is a training dataset that consists of N images, each labelled with one of K different
classes. We use this training set to train a classifier to learn what each class looks like. Finally,
we evaluate the quality of the classifier by asking it to predict labels for a new set of images that
it has never seen before, and compare the true labels of these images with the ones predicted by the
classifier. Several algorithms can be used for image classification.

Random Forest:
The random forest algorithm is a machine learning technique that is increasingly being used for image
classification and for the creation of continuous variables such as percent tree cover and forest
biomass. A random forest is an ensemble model, which means that it uses the results from many
different models to calculate a response. In most cases the result from an ensemble model will be
better than the result from any one of the individual models. In a random forest, several decision
trees are created (grown) and the response is calculated based on the outcome of all of the decision
trees.
However, our dataset consists of a large number of images, and a neural network is the next best
option for better accuracy in classifying the images, as it works best with large amounts of data.
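
The ensemble idea described above, where the response is taken from the outcome of all trees, can be illustrated with a majority vote. This is only a sketch of the voting step: the three hand-written "trees" and the feature names are hypothetical stand-ins, whereas a real random forest grows its trees from bootstrapped training data.

```python
from collections import Counter


# Hypothetical hand-written decision stumps standing in for trained trees.
def tree_a(features):
    return "bar" if features["rectangles"] >= 3 else "line"


def tree_b(features):
    return "pie" if features["circular_area"] > 0.5 else "bar"


def tree_c(features):
    return "line" if features["long_curves"] >= 1 else "bar"


def forest_predict(trees, features):
    """Return the class predicted by the majority of the trees."""
    votes = Counter(tree(features) for tree in trees)
    return votes.most_common(1)[0][0]


trees = [tree_a, tree_b, tree_c]
bar_like = {"rectangles": 5, "circular_area": 0.1, "long_curves": 0}
line_like = {"rectangles": 0, "circular_area": 0.1, "long_curves": 2}

print(forest_predict(trees, bar_like))
print(forest_predict(trees, line_like))
```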

Artificial Neural Network:
An Artificial Neural Network (ANN) is capable of learning any nonlinear function. A single perceptron
(or neuron) can be thought of as a logistic regression, and an ANN is a group of multiple
perceptrons/neurons at each layer. An ANN is also known as a Feed-Forward Neural Network because
inputs are processed only in the forward direction. ANNs have the capacity to learn weights that map
any input to the output. An ANN consists of three kinds of layers: input, hidden and output. The input
layer accepts the inputs, the hidden layer processes the inputs, and the output layer produces the
result. Essentially, each layer tries to learn certain weights.
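
To make the perceptron analogy concrete, here is a minimal forward pass in plain Python: each neuron computes a weighted sum plus a bias and passes it through a sigmoid, exactly as logistic regression does. The weights below are arbitrary illustrative numbers, not trained values, and the sigmoid activation is an assumption (the source does not name one).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def perceptron(inputs, weights, bias):
    """One neuron: weighted sum of inputs plus bias, squashed by a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)


def forward(inputs, hidden_layer, output_neuron):
    """Input -> hidden layer -> single output, processed in the forward direction only."""
    hidden = [perceptron(inputs, weights, bias) for weights, bias in hidden_layer]
    out_weights, out_bias = output_neuron
    return perceptron(hidden, out_weights, out_bias)


# Arbitrary example weights: two inputs, two hidden neurons, one output neuron.
hidden_layer = [([0.5, -0.4], 0.1), ([-0.3, 0.8], 0.0)]
output_neuron = ([1.2, -0.7], 0.05)

print(forward([1.0, 2.0], hidden_layer, output_neuron))
```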

The number of parameters in a neural network grows rapidly with the number of layers. This can make
training a model computationally heavy (and sometimes infeasible). Tuning so many parameters can be a
very large task. CNNs reduce the time needed to tune these parameters.

CNN:
Convolutional Neural Networks (CNNs) are the most popular neural network models used for image
classification problems. The practical benefit is that having fewer parameters greatly reduces the
time it takes to learn, as well as the amount of data required to train the model. Instead of a fully
connected network of weights from each pixel, a CNN has just enough weights to look at a small patch
of the image. It is like reading a book with a magnifying glass: eventually you read the whole page,
but you look at only a small patch of the page at any given time.

The beauty of a CNN is that the number of parameters in a convolution layer is independent of the size
of the original image. You can run the same CNN on a 300 × 300 image or on a larger one, and the
number of parameters in the convolution layer won't change.
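
A short calculation makes the point. The sketch below counts the weights and biases of a convolution layer and of a fully connected layer fed directly from the pixels; the 3 × 3 kernel, 3 input channels and 32 filters/units are illustrative choices, not values taken from the proposed model.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Weights plus one bias per filter; note the image size does not appear."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels


def dense_params(height, width, channels, units):
    """Fully connected layer on raw pixels: one weight per pixel value per unit, plus biases."""
    return (height * width * channels + 1) * units


print("conv 3x3, 3->32 channels:", conv_params(3, 3, 32))
print("dense on 300x300x3, 32 units:", dense_params(300, 300, 3, 32))
print("dense on 600x600x3, 32 units:", dense_params(600, 600, 3, 32))
```

Doubling the image side quadruples the dense layer's parameter count, while the convolution layer's count stays the same.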

The convolutional layers of a CNN apply multiple filters that scan the complete feature matrix and
carry out dimensionality reduction. This makes a CNN well suited to image classification and
processing. A CNN learns the filters automatically, without them being specified explicitly. These
filters help extract the right and relevant features from the input data.

Colour Code Detection:

K-Means:
Clustering is one of the most common exploratory data analysis techniques used to get an
intuition about the structure of the data. It can be defined as the task of identifying subgroups
in the data such that data points in the same subgroup (cluster) are very similar while data
points in different clusters are very different. In other words, we try to find homogeneous
subgroups within the data such that data points in each cluster are as similar as possible
according to a similarity measure such as Euclidean-based distance or correlation-based
distance.
The K-means algorithm is an iterative algorithm that tries to partition the dataset into K
pre-defined, distinct, non-overlapping subgroups (clusters), where each data point belongs to only
one group. It tries to make the intra-cluster data points as similar as possible while also keeping
the clusters as different (far apart) as possible.
After training the model using the K-means algorithm with a specified number of clusters, we fetch
the cluster centre of each cluster and convert the extracted RGB values to hex colour codes.
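
The clustering and hex conversion described above can be sketched without any third-party packages. This is a toy illustration under stated assumptions: the six pixel values and the fixed initial centres are made up so the run is deterministic, and in practice a library K-means implementation would normally be run over all pixels of the chart image.

```python
def rgb_to_hex(rgb):
    """Convert an (R, G, B) triple to a hex colour code such as '#fc0a0a'."""
    return "#{:02x}{:02x}{:02x}".format(*(int(round(c)) for c in rgb))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centres, iterations=10):
    """Plain K-means: assign each point to its nearest centre, then recompute the centres."""
    centres = [tuple(c) for c in centres]
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for point in points:
            nearest = min(range(len(centres)),
                          key=lambda i: squared_distance(point, centres[i]))
            clusters[nearest].append(point)
        centres = [
            tuple(sum(channel) / len(cluster) for channel in zip(*cluster))
            if cluster else centres[i]          # keep the old centre if a cluster is empty
            for i, cluster in enumerate(clusters)
        ]
    return centres


# Made-up RGB pixels: three reddish and three bluish.
pixels = [(250, 10, 10), (254, 12, 8), (252, 8, 12),
          (10, 10, 250), (12, 14, 252), (8, 12, 248)]
centres = kmeans(pixels, centres=[pixels[0], pixels[3]])
hex_codes = [rgb_to_hex(centre) for centre in centres]
print(hex_codes)
```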

Text Detection and Extraction:
Text detection techniques are required to detect the text in the image and create a bounding box
around each portion of the image that contains text. The OpenCV package uses the EAST model for text
detection. The Tesseract package is used to recognize the text inside each detected bounding box.

OCR
Optical character recognition, or OCR, refers to a set of computer vision problems that involve
converting images of digital or hand-written text to machine-readable text in a form your computer
can process, store and edit as a text file or as part of data entry and manipulation software.

EAST (Efficient and Accurate Scene Text detector)
OpenCV's text detector implementation of EAST is quite robust, capable of localizing text even when
it is blurred, reflective or partially obscured. It is a deep learning method for text detection
only, and can be used in combination with any text recognition method. EAST can detect text both in
images and in video. It runs in near real time at 13 FPS on 720p images with high text detection
accuracy. We use the pre-trained EAST model and define the output layers. The pytesseract Python
package is then used for text recognition on each block created by the text detection process. It
detects the various fields of the chart image, such as the chart name, the legends inside the chart,
and the X-axis and Y-axis for the particular chart type.

Training and Testing:
We split the original training data into 80% training and 20% validation to optimize the classifier.
The test data is kept separate to finally evaluate the accuracy of the model on data it has never
seen. Comparing the two accuracies shows whether we are overfitting the training data: if validation
accuracy is higher than training accuracy, we can lower the learning rate and train for more epochs;
if training accuracy rises above validation accuracy, we stop training to avoid overfitting. We train
the CNN prediction model for 100 epochs with a batch size of 50.
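
The 80/20 split and the batch size of 50 can be sketched as follows. The helper names, the fixed shuffle seed and the count of 500 placeholder file names are assumptions for illustration; the source does not state the dataset size or how the split was implemented.

```python
import random


def train_validation_split(items, validation_fraction=0.2, seed=42):
    """Shuffle a copy of the items and return (training, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is reproducible
    n_validation = int(round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


def batches(items, batch_size=50):
    """Yield consecutive batches of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Placeholder file names standing in for the chart images.
images = ["chart_{:03d}.png".format(i) for i in range(500)]
training, validation = train_validation_split(images)
print(len(training), len(validation), len(list(batches(training))))
```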

Model Accuracy:
The AUC-ROC curve is the model selection metric for the multi-class classification problem; it depicts
how well the model differentiates the given classes in terms of predicted probability. The ROC curve
is a probability curve for each class, calculated after training the model and obtaining its
predictions. The x-axis displays the False Positive Rate (FPR) and the y-axis displays the True
Positive Rate (TPR). The threshold value is the optimal cut-off point on the curve, located where the
TPR is high and the FPR is low.
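
As a minimal sketch of how such a cut-off might be chosen, the code below computes (threshold, FPR, TPR) points for one class treated as "positive" against the rest, and picks the threshold that maximises TPR minus FPR (Youden's J statistic). The use of Youden's J is an assumption, since the source only says the TPR should be high and the FPR low, and the labels and scores are made-up example data.

```python
def roc_points(labels, scores):
    """Return (threshold, FPR, TPR) for every distinct score used as a threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def best_threshold(points):
    """Pick the threshold with the largest TPR - FPR (Youden's J)."""
    return max(points, key=lambda point: point[2] - point[1])[0]


# Made-up one-vs-rest data: 1 = image belongs to the class, 0 = it does not.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
points = roc_points(labels, scores)
print(best_threshold(points))
```

For a multi-class model, the same calculation would be repeated once per chart type.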
Sample input image and outcome of the model:
[Figure: input image]
[Figure: colour and text detection outputs]

React Code Generation:
This module takes the outputs of the above models as its input: the predicted chart type, the colour
codes, and the recognized chart fields such as the name, legends, X-axis and Y-axis. The code is
generated automatically according to the chart type. The utility produces a React code template and
populates it with the predicted values for each chart type. The React code is generated by a Python
script and written to a .js file, which can be downloaded by UI/UX developers.
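
A template-filling step of this kind might look like the sketch below. It is not the authors' generator: the `Chart` wrapper component, its import path, the configuration keys and the sample prediction values are all hypothetical, chosen only to show how predicted values could be substituted into a React template from Python.

```python
import json
from string import Template

# Hypothetical template; "./Chart" stands in for whatever chart component the project uses.
COMPONENT_TEMPLATE = Template("""\
import React from "react";
import { Chart } from "./Chart";

const config = $config;

export default function $component_name() {
  return <Chart {...config} />;
}
""")


def generate_react_component(component_name, prediction):
    """Fill the template with the predicted chart attributes, serialised as a JS object literal."""
    config = json.dumps(prediction, indent=2)
    return COMPONENT_TEMPLATE.substitute(component_name=component_name, config=config)


# Made-up model outputs for a single chart.
prediction = {
    "type": "bar",
    "title": "Monthly Sales",
    "legends": ["2022", "2023"],
    "colours": ["#fc0a0a", "#0a0cfa"],
    "xAxis": "Month",
    "yAxis": "Revenue",
}
source = generate_react_component("MonthlySalesChart", prediction)
print(source)
# To save the result: pathlib.Path("MonthlySalesChart.js").write_text(source)
```

Serialising the values with json.dumps keeps strings quoted and escaped correctly, so the generated file remains valid JavaScript whatever text the OCR step returns.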

Advantages:
• AI-based detection and recognition of objects and patterns in images can be used to fast-track
overall UI development.
• The biggest advantage is automation of the overall life cycle for end-to-end development of the
user interface (UI).
• Any change request to existing code can be completed with this solution in a few minutes, which
can save months of development and validation effort.
• Lower chance of human error in repetitive work.
• Reduced manual effort for understanding images and developing code using React JS.
• The proposed solution will be a next-generation AI-based image processing solution and can be
used in many applications within HP.

Author:
Ravikumar Patel
Senior Software Engineer
IRIS Software Ltd