Amazon Inventory Reconciliation Using AI
Pablo Rodriguez Bertorello, Computer Science, Stanford University, Palo Alto, California, [email protected]
Sravan Sripada, Computer Science, Stanford University, Palo Alto, California, [email protected]
Nutchapol Dendumrongsup, Computational & Mathematical Engineering, Stanford University, Palo Alto, California, [email protected]
Abstract—Inventory management is critical to Amazon's success. Thus, the need arises to apply artificial intelligence to assure the correctness of deliveries. The paper evaluates Convolutional Neural Networks for inventory reconciliation. The network detects the number of items being carried, relying only on a photo of the bin. Convolutional performance is evaluated against Support Vector Machines, both non-linear and linear. Our Convolutional method performed over 3x better than random, and over 75% better than our best Support Vector classifier.
Index Terms—Convolutional Neural Network, Deep Learning, Support Vector Machine, Radial Basis Function, Amazon Bin Image Dataset

I. INTRODUCTION

Amazon Fulfillment Centers are bustling hubs of innovation that allow Amazon to deliver millions of products to over 100 countries worldwide. These products are randomly placed in bins, which are carried by robots.
Occasionally, items are misplaced while being handled, resulting in a mismatch between the recorded bin inventory and its actual content. The paper describes methods to predict the number of items in a bin, thus detecting any inventory variance. By correcting variance upon detection, Amazon will better serve its customers.
Specifically, the input to our model is a raw color photo of the products in a bin. To find the best solution to the inventory mismatch problem, Amazon published the Bin Image Dataset, which is detailed in Section II.
The output of a model is the bin's predicted quantity, the number of products in the image.
While we started with linear methods, the quest for model performance led us to non-linear algorithms, and ultimately to convolutional deep learning. Section III summarizes each algorithm applied. They include:
• Logistic Regression and Classification Trees, summarized in Section III-A and Section III-B
• Support Vector Machines: linear kernel, polynomial kernel, and radial kernel. The algorithms are summarized in Section III-C
• Convolutional Neural Network: ResNet, cross-entropy loss function, with learning rate optimizations. The algorithm is summarized in Section III-D
Section IV summarizes performance results. With Convolutional Neural Networks we were able to achieve an overall accuracy exceeding 56%. This is over 60% better than with Support Vector Machines. For the latter, we attained accuracy of over 30%. We operated on a reduced dataset, for bins that contained up to 5 products, for a random baseline probability of 16.6%.
We are among the first to publish results on Amazon's Bin Image Dataset. Prior art by Eunbyung Park of the University of North Carolina at Chapel Hill [2] applied Deep Convolutional Classification (ResNet 34). It achieved 55.67% accuracy.

II. DATASET AND FEATURES

A. Input Data Set
The data set contains 535,234 images, which contain 459,476 different product SKUs, of different shapes and sizes.

Fig. 1: Sample Image

Each image/metadata tuple corresponds to a bin with products. The metadata includes the actual count of objects in the bin, which is used as the label to train our model.
We worked with a subset of 150K images, each containing up to five products. We split this as follows: 70% training, 20% validation, and 10% test.
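As an illustration of how the labels and the split above can be produced, the following Python sketch reads each image's metadata file, keeps bins with up to five items, and performs a 70/20/10 split. The EXPECTED_QUANTITY field name and the flat directory of per-image JSON files are assumptions about the public Bin Image Dataset layout, not details stated in this paper.

```python
import json
import random
from pathlib import Path

def load_labels(metadata_dir, max_items=5):
    """Collect (image_id, item_count) pairs for bins with up to `max_items` products."""
    samples = []
    for meta_path in Path(metadata_dir).glob("*.json"):
        with open(meta_path) as f:
            meta = json.load(f)
        count = meta["EXPECTED_QUANTITY"]  # assumed metadata field for the bin's item count
        if count <= max_items:
            samples.append((meta_path.stem, count))
    return samples

def split_70_20_10(samples, seed=0):
    """Shuffle and split into 70% train, 20% validation, 10% test."""
    random.Random(seed).shuffle(samples)
    n = len(samples)
    n_train, n_val = int(0.7 * n), int(0.2 * n)
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])
```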
B. Data Engineering
Before model training, images were normalized (a sketch of these transforms follows the list):
1) Re-sized to 224x224 pixels
2) Transformed for zero mean and unit variance
3) For convolutional models, the dataset was augmented with horizontal flips of every image
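The following is a minimal sketch of the normalization steps above using torchvision transforms. The ImageNet channel statistics and the use of a random flip (rather than explicitly doubling the dataset with flipped copies) are illustrative assumptions; the paper only specifies the resize, zero mean and unit variance, and horizontal-flip augmentation.

```python
from torchvision import transforms

# Assumed normalization statistics (ImageNet means/stds, a common choice when
# fine-tuning a pretrained ResNet); the paper only states zero mean, unit variance.
NORM_MEAN = [0.485, 0.456, 0.406]
NORM_STD = [0.229, 0.224, 0.225]

# Training-time pipeline: resize, flip augmentation, normalize.
train_transform = transforms.Compose([
    transforms.Resize((224, 224)),             # step 1: 224x224 pixels
    transforms.RandomHorizontalFlip(p=0.5),    # step 3: horizontal-flip augmentation
    transforms.ToTensor(),
    transforms.Normalize(NORM_MEAN, NORM_STD), # step 2: zero mean, unit variance
])

# Validation/test pipeline: no augmentation.
eval_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(NORM_MEAN, NORM_STD),
])
```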
C. Feature Engineering: Blobs
We explored Blob feature extraction. Blobs are bright-on-dark or dark-on-bright regions in an image. All the bins in which items are placed are similar and if items are
Fig. 4: Learning Rate Finder - Loss

3) Optimization Algorithms:
a) Stochastic Gradient Descent (SGD) and Stochastic Gradient Descent with Restarts (SGDR): Training CNNs involves optimizing multimodal functions. SGDR resets the learning rate every few iterations so that gradient descent can pop out of a local optimum and find its way to the global minimum, an idea shown to work effectively in the paper by Loshchilov et al. [5]. Fig. 5 shows SGDR with cosine annealing.
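A minimal PyTorch sketch of SGDR-style training is shown below. The ResNet-34 backbone, initial learning rate, and restart period are illustrative assumptions rather than the exact configuration used in our experiments; the scheduler implements the cosine annealing with warm restarts of [5].

```python
import torch
from torch import nn, optim
from torchvision import models

def train_with_sgdr(train_loader, num_epochs=30, device="cpu"):
    """Train a ResNet-34 item counter (classes 0-5) with SGDR-style restarts."""
    model = models.resnet34(pretrained=True)
    model.fc = nn.Linear(model.fc.in_features, 6)  # 6 classes: 0 to 5 items per bin
    model.to(device)

    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    # Cosine annealing with warm restarts: the learning rate follows a cosine
    # decay and is periodically reset, helping SGD escape poor local optima [5].
    scheduler = optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2)

    for _ in range(num_epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()  # advance the cosine/restart schedule once per epoch
    return model
```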
Given the large number of columns in a raw RGB image (150,528 in total), and the work-load of compressing it to as few as 1,000 columns, it is important to identify the most effective solver. The following Principal Component Analysis solvers were tried: 'auto', 'full', 'arpack', and 'randomized'. The PCA solver resulting in the highest accuracy was 'randomized'.
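A minimal scikit-learn sketch of this solver comparison follows. Flattening a 224x224x3 image yields the 150,528 columns mentioned above; the function signature and array layout are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA

def compress_images(images: np.ndarray, n_components=1000, svd_solver="randomized"):
    """Flatten 224x224x3 images into 150,528-column rows, then reduce with PCA.

    svd_solver may be 'auto', 'full', 'arpack', or 'randomized'; 'randomized'
    gave the highest downstream accuracy in our experiments.
    """
    X = images.reshape(len(images), -1)  # shape: (n_samples, 150528)
    pca = PCA(n_components=n_components, svd_solver=svd_solver)
    return pca.fit_transform(X)
```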
With respect to Histogram of Oriented Gradients (HOG), evaluation suggests that the images in the data set do not have enough dominant gradients. Thus, identifying products in a bin is difficult. In part, this may be due to Amazon's usage of tape to cover products in a bin. For many images, the tape occludes the products, causing a significant information loss.
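For completeness, a scikit-image sketch of the HOG descriptors we evaluated is given below; the orientation, cell, and block parameters are illustrative defaults, not values reported in the paper.

```python
from skimage import color, feature

def hog_features(rgb_image):
    """Compute Histogram of Oriented Gradients descriptors for one bin image.

    On this dataset the descriptors proved weak: the tape stretched across the
    bins hides many product edges, so few dominant gradients survive.
    """
    gray = color.rgb2gray(rgb_image)
    return feature.hog(gray,
                       orientations=9,
                       pixels_per_cell=(16, 16),
                       cells_per_block=(2, 2),
                       block_norm="L2-Hys")
```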