Automated Image Data Preprocessing With Deep Reinforcement Learning
Tran Ngoc Minh1, Mathieu Sinn2, Hoang Thanh Lam3, Martin Wistuba4
IBM Research, Dublin, Ireland
1,4 {m.n.tran, martin.wistuba}@ibm.com, 2,3 {mathsinn, t.l.hoang}@ie.ibm.com
Abstract
Data preparation, i.e. the process of transforming raw data into a format that can
be used for training effective machine learning models, is a tedious and time-
consuming task. For image data, preprocessing typically involves a sequence
of basic transformations such as cropping, filtering, rotating or flipping images.
Currently, data scientists decide manually based on their experience which trans-
formations to apply in which particular order to a given image data set. Besides
constituting a bottleneck in real-world data science projects, manual image data
preprocessing may yield suboptimal results as data scientists need to rely on in-
tuition or trial-and-error approaches when exploring the space of possible image
transformations and thus might not be able to discover the most effective ones. To
mitigate the inefficiency and potential ineffectiveness of manual data preprocessing,
this paper proposes a deep reinforcement learning framework to automatically
discover the optimal data preprocessing steps for training an image classifier. The
framework takes as input sets of labeled images and predefined preprocessing
transformations. It jointly learns the classifier and the optimal preprocessing trans-
formations for individual images. Experimental results show that the proposed
approach not only improves the accuracy of image classifiers, but also makes them
substantially more robust to noisy inputs at test time.
1 Introduction
Data preprocessing, i.e. the process of transforming raw data into a format that can be used for training
effective machine learning models, accounts for 50-80% of the time spent on typical data science
projects [3, 11]. Besides constituting a bottleneck, manual data preprocessing is also ineffective as it
only explores a small part of the space of possible transformations and thus might not discover the
most effective ones for removing noise and/or extracting meaningful features from a given set of
raw data. Unstructured data (see Footnote 1) are particularly challenging in this regard as their preparation requires
deep expertise in fields such as Computer Vision or Natural Language Processing; moreover,
because of the high complexity of machine learning models dealing with such data, the effect of data
preprocessing is particularly difficult to understand. Hence, automating data preprocessing is highly
desirable as it increases the productivity of data scientists and may lead to better performance of the
resulting machine learning models.
Despite its high potential value, the automation of data preprocessing has been mostly overlooked
by the machine learning community, with only a few prior works on this subject [3, 13]. Recently, Bilalli
et al. [3] suggested a method for automating data preprocessing via meta-learning. However, their
approach only focuses on structured data with a limited number of relatively simple preprocessing transformations.
Footnote 1: By unstructured data we mean images, text and time series, while we use structured data to refer to data in tabular format, e.g. as in relational databases.
2 Related Work
Generalization is the main challenge of image classifiers, particularly when trained on small and/or
noisy training data sets. Therefore, numerous approaches have been proposed to improve the gener-
alization, such as adding a regularization term on the norm of weights [4], using dropout [18, 19]
or batch normalization [20]. Data augmentation is another effective approach that helps increase
the generalization of image classifiers through applying simple transformations such as rotating and
flipping input images, and adding the transformed images to the training set [10]. The full set of
transformations used in [10] includes shifting, zooming in/out, rotating, flipping, distorting, shading
and styling. Data augmentation with more complicated transformations is investigated in [9], which
evaluates three concrete preprocessing techniques, namely Zero Component Analysis, Mean Normal-
ization and Standardization, on the performance of different convolutional neural networks. While the
approaches in [9, 10] preprocess images according to a preselected chain of transformations, Paulin et
al. [13] suggest that the transformation set should be chosen in a principled way instead of resorting to
(manual) trial-and-error, which is feasible only when the number of possible transformations is small.
Their proposed approach selects a set of transformations, possibly ordered, through a greedy search
strategy. Although this approach offers a more competitive set of transformations, it still has several
limitations: Firstly, the search process is inefficient because it involves retraining the classifier on
the whole augmented data set every time a candidate transformation in the search space is evaluated.
Secondly, the same preprocessing transformations are applied to all images, which has a number of
disadvantages as discussed in Section 1. Our approach uses a reinforcement learning framework to
address precisely those shortcomings.
Footnote 2: Throughout this paper, we use the words preprocessing and transformation interchangeably to indicate an operation applied to data instances such as flipping an image.
Footnote 3: The term chain of transformations is used to indicate an ordered set of transformations.
Footnote 4: An implementation of the method can be found at https://github.com/IBM/automation-of-image-data-preprocessing.
Reinforcement learning [14] and specifically deep reinforcement learning [7] have recently drawn
substantial attention from the machine learning research community. However, there are only a few
studies [1, 2, 8] applying deep reinforcement learning to visual recognition tasks such as edge
detection, segmentation, object detection, or active object localization. None of these
works considers automating the preprocessing of images or learning transformation sets. To the best
of our knowledge, our work is the first study utilizing deep reinforcement learning to search for
effective chains of preprocessing transformations for individual images.
[Figure 1: Overview of the framework. The agent consists of a decision maker and a deep neural network: given an image from the data source (e.g. images), the network produces action values, the decision maker applies the next action, and the environment returns the partially preprocessed image and a reward, until the decision maker finishes preprocessing and the fully transformed images are produced.]
In this section we define states and actions used in our reinforcement learning framework and
introduce an important property of preprocessing techniques that can be used.
output classes. In our study, we use a variant of the Deep Q-Network (DQN) [16, 17] to model the policy
network. The policy network implemented as a DQN, shown in Figure 2(b), resembles the CNN in
Figure 2(a), except that the output layer is extended to form an action space. The DQN output layer
containing Q-values consists of two parts that correspond to two groups of actions. The first
part is a vector playing the same role as the logit vector in the CNN, i.e. it represents the unnormalized
likelihood of the k classes. We denote each slot in this part as a stop action SAction_i. If the decision
maker selects one of the stop actions SAction_i as the next action, the preprocessing of an input
image stops with a prediction of class i for that image. The second part of the DQN output
layer is a vector representing a set of n transformation actions. If one of the transformation
actions TAction_j is selected as the next action, the current image continues to
be preprocessed with transformation j. The two sets of stop and transformation actions form
an action space with a total of k + n actions in the case of discrete transformations. Note that it is
straightforward to also support continuous actions. For example, we can model a continuous rotation
by defining two slots in the second part of the DQN output: one for the Q-value of the rotation action
and one for the value of the rotation angle. Likewise, we can also adapt the first part of the DQN
output in order to apply the framework to a regression problem, e.g. when the inputs are time series
and the task is to forecast future values.
[Figure 2: Output layers of the networks considered: (a) a CNN whose last fully connected layer outputs Class 1 ... Class k; (b) a DQN whose last fully connected layer outputs SAction 1 ... SAction k and TAction 1 ... TAction n; (c) a Dueling DQN in which a state-value stream and an advantage stream are summed to form the same SAction/TAction outputs.]
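As an illustration of this action-space construction, the following is a minimal sketch of a DQN output head with k stop actions and n transformation actions. It is written in PyTorch for concreteness and is not the authors' implementation; the class name, feature extractor and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class DQNOutputHead(nn.Module):
    """Q-value head whose output concatenates k stop actions and n transformation actions."""

    def __init__(self, feature_dim: int, num_classes: int, num_transforms: int):
        super().__init__()
        self.num_classes = num_classes        # k stop actions, one per class
        self.fc = nn.Linear(feature_dim, num_classes + num_transforms)

    def forward(self, features: torch.Tensor):
        q_values = self.fc(features)                    # shape: (batch, k + n)
        q_stop = q_values[:, : self.num_classes]        # Q-values of SAction_1 ... SAction_k
        q_transform = q_values[:, self.num_classes :]   # Q-values of TAction_1 ... TAction_n
        return q_stop, q_transform
```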
The decision maker is where a reinforcement learning policy is deployed. It is responsible for
selecting the next action to be applied to the current state. The action and the state are then passed to
the environment component for further processing. In our study, we use the max policy to select an
appropriate action, given the DQN output layer. Furthermore, in order to enable exploration in
reinforcement learning, we allow the decision maker to select alternative next actions randomly with
some probability ε, which is known as the ε-greedy exploration strategy [14]. The probability ε starts at a
maximum of 1.0 and is annealed down to a minimum of 0.1 during training.
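The ε-greedy rule described above could be sketched as follows; the linear annealing schedule and the function name are assumptions, as the paper does not specify the exact decay.

```python
import random

def epsilon_greedy_action(q_values, step, total_steps, eps_max=1.0, eps_min=0.1):
    """Pick the max-Q action, or a random one with probability eps (annealed linearly)."""
    eps = max(eps_min, eps_max - (eps_max - eps_min) * step / total_steps)
    if random.random() < eps:
        return random.randrange(len(q_values))                        # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])       # exploit (max policy)
```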
Using a DQN as in Figure 2(b) is a simple starting point; performance gains can be achieved using
other variants of DQN. In our work, we implemented a variant of DQN, namely Dueling DQN
(DDQN) [21], as shown in Figure 2(c). The idea behind DDQNs is that the Q-values are a
combination of a value function and an advantage function. The value function specifies how good it
is to be in a given state while the advantage function indicates how much better selecting an action is
compared to the others. The benefit of separating the two functions is that the reinforcement learning
framework does not need to learn both value and advantage at the same time, and therefore a DDQN
is able to learn the state-value function efficiently. In order to update the deep neural network, we use
the Bellman equation Q(s, a) = r + γ max_{a'} Q(s', a'), where Q(s, a) is the DQN output value
of action a given input state s, r and s' are the reward and the next state returned by the environment
when action a is applied to state s, and γ is the discount factor. We refer the reader to
[21] for more details on DDQNs.
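To make the update concrete, the snippet below sketches the dueling aggregation and the one-step Bellman target in PyTorch. The mean-subtracted aggregation follows the standard formulation of [21]; the terminal-state masking and the default γ = 0.99 are assumptions rather than details given in the paper.

```python
import torch

def dueling_q_values(value, advantage):
    """Combine a state-value stream (batch, 1) and an advantage stream (batch, num_actions)."""
    return value + advantage - advantage.mean(dim=1, keepdim=True)

def bellman_target(reward, next_q_values, done, gamma=0.99):
    """One-step target r + gamma * max_a' Q(s', a'), with no bootstrap at terminal states."""
    return reward + gamma * next_q_values.max(dim=1).values * (1.0 - done)
```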
3.4 Environment
The environment is where the actual transformations on images are performed. It is also
responsible for calculating the rewards returned during training. Upon receiving an image and an action
from the reinforcement learning agent, the environment behaves differently depending on the type
of the action. If it is a transformation action, the environment applies that transformation to the
image only if the length of the chain of transformations already applied to that particular image is smaller
than a configurable parameter max_len. Otherwise, the image is recovered
to its original state and the reinforcement learning framework must seek another transformation chain
for it. Note that this recovery mechanism is only used during training; at test time, we simply pick
the stop action with the largest Q-value as the prediction for the image. The recovery mechanism
also solves the memory problem described in Section 3.1.3. In either case, regardless of the length of the
current transformation chain, the environment returns a zero reward to the reinforcement learning agent
when it receives a transformation action.
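The transformation branch of the environment could look like the sketch below. The parameter max_len comes from the text, while the dictionary-based state bookkeeping and the function name are assumptions made for illustration.

```python
def apply_transformation_action(state, transform, max_len=10):
    """Transformation-action branch: extend the chain if allowed, otherwise recover the original image.

    `state` is assumed to hold the original image, the current (partially preprocessed) image,
    and the chain of transformations applied so far. The reward is always zero in this branch.
    """
    if len(state["chain"]) < max_len:
        state["image"] = transform(state["image"])
        state["chain"].append(transform)
    else:
        # Recovery: revert to the original image and start searching for another chain.
        state["image"] = state["original"]
        state["chain"] = []
    return state, 0.0  # zero reward for transformation actions
```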
If the environment receives a stop action SAction_i, it does not return a new image but only a reward,
and the image is classified as class i. The strategy for computing rewards during training plays
an important role for the convergence of the training. The environment uses the ground-truth label of
the original image to determine the reward. A simple strategy is to assign a reward of +1 if the label
equals i and −1 otherwise. However, this simple strategy does not work well when the number
of classes k is larger than 2, since it leads to imbalanced rewards. Hence, we suggest a more robust
scheme, which is to assign a reward of k − 1 if the label equals i and −1 otherwise.
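The stop-action reward scheme amounts to the following small function (a sketch; the function name is ours). With this scheme, a uniformly random guess has an expected reward of zero for any k.

```python
def stop_action_reward(predicted_class, true_label, num_classes):
    """Reward for stopping with prediction `predicted_class`: k - 1 if correct, -1 otherwise."""
    return num_classes - 1 if predicted_class == true_label else -1
```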
[Figure 3: The three models compared in our experiments: (a) a plain CNN; (b) the same CNN architecture used as the policy network of the RL framework, trained by the agent interacting with the environment; (c) a CNN initialized from the network trained by the RL framework and fine-tuned on the preprocessed images.]
4 Methodology
Our methodology for setting up the experiments is illustrated in Figure 3. In order to evaluate our auto-
preprocessing framework, we train three different models, namely NN, RL and CL, shown in Figures
3(a), 3(b) and 3(c), respectively. For all of them, we use the same neural network architecture. Figure
3(a) represents a CNN model with an arbitrary architecture. This same architecture will also be used
as the policy network in the reinforcement learning framework as shown in Figure 3(b). Since both
models use the same network architecture, any performance difference between the two models in
our experiments is caused by the reinforcement learning solution. In Figure 3(c), we also have a CNN
model with the same network architecture, but we do not train the network from scratch. Rather,
we continue to fine-tune the network obtained from the reinforcement learning framework. The N
original training images are preprocessed by the framework to produce N new training images which
are used as inputs to the fine-tuning process.
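A rough sketch of the CL training procedure is given below, under the assumption that the RL framework exposes a preprocessing routine and its trained policy-network weights; all interfaces shown here are hypothetical, not the authors' API.

```python
def train_cl_model(rl_framework, classifier, train_images, train_labels, epochs=10):
    """Fine-tune a CNN on images preprocessed by the trained RL framework (hypothetical interfaces).

    The classifier is not trained from scratch: it is warm-started from the network obtained
    by the RL framework, then fine-tuned on the N preprocessed training images.
    """
    preprocessed = [rl_framework.preprocess(img) for img in train_images]   # N new training images
    classifier.load_weights(rl_framework.policy_network_weights())          # continue from the RL-trained network
    classifier.fit(preprocessed, train_labels, epochs=epochs)               # standard supervised fine-tuning
    return classifier
```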
In our experiments, we implement three different CNN architectures, namely Arch1, Arch2 and
Arch3, as shown in Figures 4(a), 4(b) and 4(c), respectively. Hence, we have a total of nine models for
comparison. The architectures are selected according to their complexity, ranging from simple in
Figure 4(a) to complex in Figure 4(c). Note that the hyperparameters and architectures of the models
in Figure 4 are not designed “optimally” (e.g. using (hyper-)parameter tuning or auto-architecture
search), but chosen such that there is some level of complexity difference between them, the effect of
which is discussed in our evaluation below.
[Figure 4 (layer specifications recovered from the diagram):
(a) Arch1: Input; convolutions 5x5(64) SAME and 3x3(96) SAME; two 2x2 SAME max-pooling layers; ReLU activations; two LRN and two BN layers; two Dropout (0.3) layers; FC (512); FC (256).
(b) Arch2: Input; convolutions 3x3(64) SAME, 3x3(128) SAME, 3x3(256) SAME and 3x3(512) SAME; two 2x2 VALID max-pooling layers; ReLU activations; five BN layers; Dropout (0.3), Dropout (0.3) and Dropout (0.5); FC (1024).
(c) Arch3: Input; convolutions 3x3(48) SAME, 3x3(48) VALID, 3x3(96) SAME, 3x3(96) VALID, 3x3(192) SAME and 3x3(192) VALID; three 2x2 VALID max-pooling layers; ReLU activations; five Dropout (0.4) layers; FC (512); FC (256).]
Figure 4: CNN architectures used in our experiments. LRN, BN, MP and FC stand for local response
normalization, batch normalization, max pooling and fully connected, respectively. All convolutional
layers use a stride of 1x1 and all max pooling layers use a stride of 2x2.
5 Experimental Results
In this section we present our experiments to validate our solution to the problem of image
preprocessing automation. We start by comparing the accuracy of
image classifiers with and without preprocessing. Then, we evaluate the robustness of the classifiers
with respect to distorted images at test time. In addition, we also provide some insights on the
behaviour of the reinforcement learning framework.
We select for our study four data sets with different levels of complexity and noise. MNIST [12]
is a very clean 10-class data set with 70K 28x28x1 images divided into 55K/5K/10K for training,
validation and testing, respectively. SVHN [15] is a 10-class data set that is noisier than MNIST with
∼864K 32x32x3 images divided into ∼598K/6K/26K. CIFAR [5] is a 10-class data set that is noisier
still than SVHN with 60K 32x32x3 images divided into 45K/5K/10K. Finally, DOGCAT [6] is the
noisiest of all four data sets; it has 2 classes with 25K 100x100x3 images divided into 20K/1K/4K.
With respect to the transformation set, we implemented two operations, namely image rotation
and flipping, for simplicity, because they trivially satisfy the symmetry property
and thus do not require a memory mechanism. Concretely, there are 11 transforma-
tions consisting of 3 flips (horizontal, vertical, and both) and 8 rotations (with angles
−1, −2, −4, −8, +8, +4, +2, +1 degrees). The parameter max_len specifying the maximum length
of a transformation chain is set to 10 in our experiments. Other general parameters include
optimizer = Adam, learning_rate = 0.0001 and regularization_coefficient = 0.001. For
each experiment, we perform 5 runs with 5 different initializations and report results as mean ± std.
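For concreteness, the transformation set could be defined as in the sketch below, using NumPy and SciPy rather than the authors' implementation. Each action's inverse is also contained in the set, which is the symmetry property exploited by the recovery mechanism.

```python
import numpy as np
from scipy.ndimage import rotate

# 3 flips and 8 small rotations; every transformation's inverse is also in the set.
TRANSFORMATIONS = {
    "flip_horizontal": lambda img: np.flip(img, axis=1),
    "flip_vertical":   lambda img: np.flip(img, axis=0),
    "flip_both":       lambda img: np.flip(img, axis=(0, 1)),
}
for angle in (-8, -4, -2, -1, 1, 2, 4, 8):
    TRANSFORMATIONS[f"rotate_{angle}"] = (
        lambda img, a=angle: rotate(img, a, reshape=False, mode="nearest")
    )
```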
5.2 Performance of Image Classifiers
Performance results in terms of accuracy are shown in Table 1. It can be seen that in most cases,
the bare convolutional neural network classifier (NN) produces the worst performance while the
reinforcement learning classifier (RL) yields higher accuracy. The accuracy is improved
further by the CNN classifier that continues learning (CL) from the trained RL classifier. We note
that the accuracy reported in Table 1 does not reach state-of-the-art performance, as the networks
used in our experiments were relatively simple and not adapted to the data sets; nevertheless,
it is worth noting that the RL framework improves the accuracy of the
baseline methods, notably without increasing the size of the training set. Moreover, it is interesting
to observe that the accuracy difference between the NN classifier and the RL classifier increases
for noisier and more complex data sets. On the one hand, for MNIST, simple preprocessing
techniques such as rotation and flipping do not improve accuracy; they can even decrease
it, as some digits change their meaning when rotated or flipped. On the other hand,
on the much noisier DOGCAT data set, the RL classifier is much more successful in increasing the
accuracy of the baseline CNN.
In order to evaluate the robustness of image classifiers, we distort each test image with 50% probability
by applying a random chain of transformations. Robustness results in terms of accuracy are shown
in Table 2. As we can see, the results are consistent in all cases in the sense that the NN classifier
is less robust (its accuracy decreases significantly on the test set with distortions), compared to the
performance on clean test data reported in Table 1. On the other hand, the RL classifier is much more
robust as its performance only slightly degrades on the distorted test data. Note that only 50% of the
test images were distorted; hence, the robustness difference between the two classifiers would be
even larger if all test images had been distorted. The robustness of the CL classifier is not as high as
that of the RL classifier, but still substantially higher than that of the NN classifier. This reflects a trade-off
between accuracy and robustness when choosing between the RL and the CL classifiers.
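The distortion protocol used in this robustness test amounts to the following sketch; it assumes the TRANSFORMATIONS dictionary defined earlier, and the bound on the random chain length is a hypothetical choice, as the paper does not state it.

```python
import random

def distort_test_image(image, transformations, max_chain_len=3, p_distort=0.5):
    """With 50% probability, apply a random chain of transformations to a test image."""
    if random.random() >= p_distort:
        return image                       # leave half of the test set untouched
    for _ in range(random.randint(1, max_chain_len)):
        image = random.choice(list(transformations.values()))(image)
    return image
```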
5.4 A Deeper Look into the Operation of the Framework
In order to visualize how the reinforcement learning framework preprocesses distorted images, we
run another experiment on MNIST with coarser distortions: rotations are performed with large angles
of ±90 degrees, and flipping operations are applied as in the previous experiment. For each
distorted image, we trace the operation of the framework and obtain the transformation chain that
the framework automatically generates for the image. An illustration for a few images is shown
in Figure 5. It is interesting that most images are either classified directly or transformed to their
original version before being classified. The exact recovery is possible thanks to the symmetry
property of transformation actions. Although the framework is able to recover distorted images, it is
not guaranteed to find the optimal chain of transformations in terms of the shortest recovery path. In
addition, there is a small number of images that confuse the framework, as shown in the
bottom row of Figure 5. These are the main source of misclassification errors of the reinforcement
learning classifier.
Figure 5: Illustration of how the reinforcement learning framework preprocesses distorted images.
6 Discussion
The key contributions of this paper are three-fold. Firstly, we developed the idea of automated data
preprocessing using a reinforcement learning framework. While we demonstrated and evaluated it
for image data, it is applicable to other types of structured and unstructured data as well. Secondly,
the proposed system is iterative and therefore it provides explainable data preprocessing, i.e. one
can inspect which transformations were applied to each data instance during the preprocessing.
Thirdly, compared with traditional data augmentation approaches, our system follows a more efficient
approach to produce a clean training data set that can be used effectively for training highly accurate
and robust machine learning models.
Despite being of high practical value, the automation of data preprocessing has so far drawn little
interest from the machine learning research community. Although we suggest in this paper a
novel approach for this problem, there is still a lot of room to extend this work. Firstly, the set of
transformations may contain more advanced preprocessing techniques such as rotations with learnable
angles, cropping/scaling with learnable ratios, image segmentation, object detection, etc. While it is
easy to integrate continuous actions with learnable parameters into the framework as described in
Section 3.1.2, more complicated actions like image segmentation and object detection may require more
effort. For example, one could select only a small number of segments or objects as the simplified
representation of an image for the next iteration after applying those actions. Secondly, one could
boost the performance of the reinforcement learning framework by replacing the current simple DQN
policy network. In addition, CNNs derived from the policy network (as described in Figure 3(c)) may
be a way to obtain better performance in terms of accuracy.
7 Conclusions
We have presented in this paper a novel approach to the problem of automating data preprocessing,
which is of high potential value for real-world data science and machine learning projects. The
approach is based on a reinforcement learning framework to find sequences of preprocessing transfor-
mations for each data instance individually. We showed in our experiments that even with simple
preprocessing actions such as rotation and flipping, image classifiers can benefit significantly with
respect to their accuracy and particularly their robustness. Thanks to the iterative nature of the
framework, our solution also provides a certain level of explainability, i.e. we can trace exactly how an
image is preprocessed via a chain of transformations. In summary, we believe that this is a promising
research approach to address the problem of automating data preprocessing. Future work should address
continuous actions and transformations that require memorization, and demonstrate the
framework on other types of data such as text or time series.
References
[1] A. Gherega, M. Radulescu, M. Udrea, “A Q-Learning Approach to Decision Problems in Image
Processing”, International Conferences on Advances in Multimedia, Pages 60-66, 2012.
[2] B. Bhanu, J. Peng, “Adaptive Integrated Image Segmentation and Object Recognition”, IEEE
Transactions on Systems, Man and Cybernetics, Pages 427-441, 2000.
[3] B. Bilalli, A. Abello, T. Aluja-Banet, R. Wrembel, “Automated Data Pre-processing via Meta-
learning”, International Conference on Model and Data Engineering, Pages 194-208, 2016.
[4] B. Wang, D. Klabjan, “Regularization for Unsupervised Deep Neural Nets”, AAAI Conference
on Artificial Intelligence, 2017.
[5] CIFAR Data Set, https://www.cs.toronto.edu/~kriz/cifar.html, 2018.
[6] DOGCAT Data Set, https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition/data, 2018.
[7] I. Goodfellow, Y. Bengio, A. Courville, “Deep Learning”, MIT Press, 2017.
[8] J.C. Caicedo, S. Lazebnik, “Active Object Localization with Deep Reinforcement Learning”,
IEEE International Conference on Computer Vision, Pages 2488-2496, 2015.
[9] K.K. Pal, K.S. Sudeep, “Preprocessing for Image Classification by Convolutional Neural
Networks”, IEEE International Conference on Recent Trends in Electronics, Information &
Communication Technology, 2016.
[10] L. Perez, J. Wang, “The Effectiveness of Data Augmentation in Image Classification using
Deep Learning”, arXiv:1712.04621, 2017.
[11] M.A. Munson, “A Study on the Importance of and Time Spent on Different Modeling Steps”,
ACM SIGKDD Explorations, Pages 65-71, 2011.
[12] MNIST Data Set, http://yann.lecun.com/exdb/mnist/, 2018.
[13] M. Paulin, J. Revaud, Z. Harchaoui, F. Perronnin, C. Schmid, “Transformation Pursuit for
Image Classification”, IEEE Conference on Computer Vision & Pattern Recognition, Pages
3646-3653, 2014.
[14] R.S. Sutton, A.G. Barto, “Reinforcement Learning: An Introduction”, MIT Press, 2017.
[15] SVHN Data Set, http://ufldl.stanford.edu/housenumbers/, 2018.
[16] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, M. Riedmiller,
“Playing Atari with Deep Reinforcement Learning”, NIPS Deep Learning Workshop, 2013.
[17] V. Mnih et al., “Human-level Control through Deep Reinforcement Learning”, Nature,
Pages 529-533, 2015.
[18] Y. Gal, Z. Ghahramani, “A Theoretically Grounded Application of Dropout in Recurrent Neural
Networks”, Advances in Neural Information Processing Systems, Pages 1019-1027, 2016.
[19] Y. Kubo, G. Tucker, S. Wiesler, “Compacting Neural Network Classifiers via Dropout Training”,
arXiv:1611.06148, 2017.
[20] Y. Ma, D. Klabjan, “Convergence Analysis of Batch Normalization for Deep Neural Nets”,
arXiv:1705.08011, 2017.
[21] Z. Wang, T. Schaul, M. Hessel, H. van Hasselt, M. Lanctot, N. de Freitas, “Dueling Network Ar-
chitectures for Deep Reinforcement Learning”, International Conference on Machine Learning,
Pages 1995-2003, 2016.