Atlas: A Dataset and Benchmark For E-Commerce Clothing Product Categorization

Venkatesh Umaashankar1[0000−0001−5230−1209], Girish Shanmugam S2[0000−0003−2805−7503], and Aditi Prakash3[0000−0002−4839−9132]
1 Ericsson Research, Chennai, India
[email protected]
2 Ericsson Research, Chennai, India
[email protected]
3 University of Colorado, Boulder, Colorado, USA
[email protected]
arXiv:1908.08984v1 [cs.CV] 12 Aug 2019
Abstract. In E-commerce, it is a common practice to organize the product
catalog using a product taxonomy. This enables the buyer to easily
locate the item they are looking for and also to explore various items
available under a category. Product taxonomy is a tree structure with 3
or more levels of depth and several leaf nodes. Product categorization is a
large scale classification task that assigns a category path to a particular
product. Research in this area is restricted by the unavailability of good
real-world datasets and the variations in taxonomy due to the absence
of a standard across the different e-commerce stores. In this paper, we
introduce a high-quality product taxonomy dataset focusing on clothing
products, which contains 186,150 images under the clothing category with 3
levels and 52 leaf nodes in the taxonomy. We explain the methodology
used to collect and label this dataset. Further, we establish a benchmark
by comparing image classification and Attention based Sequence
models for predicting the category path. Our best benchmark model reaches
a micro f-score of 0.92 on the test set. The dataset, code and pre-trained
models are publicly available at https://github.com/vumaasha/atlas.
We invite the community to improve upon these baselines.
Keywords: Product Categorization · Attention · Seq to Seq models · Image Classification · Computer Vision
1 Introduction
With the Internet revolution, E-commerce has become a major platform for
selling products to customers. E-commerce stores host a collection of prod-
ucts ranging from electronics to fashion apparel to grocery. A well-organized
E-commerce store lets customers navigate through the website with ease and
locate the product they are looking for. Unlike a traditional retail store where
you can walk in and seek assistance, online retailers rely on their product cat-
alog or categorization to assist shoppers to find their desired product. Product
taxonomy is a tree structure with multiple top and intermediate levels, ending
in leaf nodes. Taxonomy classification is the process of assigning a category path
to a particular product in the taxonomy tree. E-commerce sites use hierarchical
taxonomies to organize products from generic to specific classes, where each
level provides more specific details about the product than the previous level.
For example, Clothing & Accessories > Men > Winterwear > Sweatshirts
& Hoodies. These classification levels are important for an E-commerce store
to perform operations such as search, catalog building, and recommendation, which
in turn strongly influence customer satisfaction and the revenue of e-commerce sites.
Currently, most of these product classification mechanisms rely on sellers to
provide correct details. Each E-commerce store has its own product taxonomy
and a seller typically sells in multiple stores. This implies that the seller has to
perform such categorizations manually multiple times. Automating this has the
potential benefits of reduced costs and better catalog quality. Our key contributions
in this paper are: (1) We developed a clean and rich clothing product taxonomy
dataset containing 186,150 images and their corresponding product titles, which
map to 52 category paths. (2) We proposed a methodology to collect a large scale
product taxonomy dataset, which can be easily extended to categories other than
Clothing. (3) We trained and compared two benchmark models (Image classification
and an Attention based Seq to Seq model) that predict the category path
from the product image. Our best model reached an f-score of 0.92 on the test
set. (4) The dataset, source code and pre-trained models are made publicly
available⁴ to encourage future research in this area.
The rest of the paper is organized as follows. Related literature is reviewed
in Section 2. In Section 3 we explain the methodology that we used to develop
the Atlas dataset. In Section 4 we build our benchmark models and explain
our model architecture. In Section 5 we provide our training setup and meta
parameters to facilitate reproducible research. In Section 6 we summarize our
results. Finally, in Section 7 we conclude and provide details about possible
directions for future work.
2 Related Work
A clean and detailed product taxonomy offers several benefits to both the
E-commerce store and its customers. However, creating, maintaining or adapting
an existing categorization standard is not an easy task. Still, most E-commerce
stores want to have flexibility in the way they organize their catalog and create
their product taxonomy.
Initially, techniques from information retrieval and machine learning were
applied to solve the problem of product categorization. GoldenBullet [4] is a
software environment targeted to automatically classify the products, based on
their original descriptions and existent classification standards (such as UN-
SPSC). It integrates different classification algorithms like Vector space model
(VSM), K-nearest neighbor and Naive-Bayes classifier algorithms and some natural
language processing techniques to pre-process data. [5] approached product
categorization as a hierarchical text classification task. They proposed two different
approaches: building separate classifiers for each level in the hierarchy, and
a flat classifier that directly predicts the leaf level assignment of a document.
They used Support Vector Machine (SVM) classifiers for evaluating both
approaches. [11] presented a simple linear classifier based approach for product
categorization using mutual information and LDA based features. In general, the
computational complexity involved in some of these traditional machine learning
techniques is well beyond linear with respect to the number of training examples,
features, or classes. The scale of E-commerce product categorization requires
algorithms capable of processing a huge volume of training data in a reasonable
time, capable of handling a large number of classes and also capable of making
fast real-time predictions [18].
4 https://github.com/vumaasha/Atlas
The remarkable progress made in the field of deep learning in recent years has
provided a better way to approach this problem. [3] has done a detailed study
of using Convolutional Neural Networks (CNN) for the product categorization
task. They used the Amazon product dataset provided by [17] and text features
such as product titles, navigational breadcrumbs, and list price. [6] used multiple
Deep Recurrent Neural Networks (RNNs) and generated features from the
text metadata. In recent times, sequential model-based approaches have been
widely used for product categorization. [8] modeled product categorization as a
Sequence to Sequence learning problem: they used product titles, which are sequences
of words, as input and predicted the category path as a sequence of category levels
in the product taxonomy.
Due to the availability of large high-quality image datasets, the field of Image
classification [12] has matured a lot in recent times. Noise and ambiguity are
common problems in textual product titles and descriptions. However, most
E-commerce products tend to have decent product images, which makes images
a natural choice for product categorization. [1], [2], [10] and
[16] applied computer vision techniques for fashion apparel categorization based
on the product images. The closest to our work is [13], which uses a Seq
to Seq model with product titles as input to predict category paths using an
LSTM Decoder and beam search for inference. We extend their work in this
paper by using product images instead of product titles as input for product
categorization. Similar to [13], we learn an Attention based Seq to Seq model.
3 Atlas Dataset
Rakuten made a product classification dataset publicly available in Rakuten
Data Challenge [15]. However, this dataset contains only the product titles and
the levels in the taxonomy are represented using numerical IDs instead of plain
text. Real world product taxonomy datasets are not publicly available. Also,
there is no widely adopted industry standard for defining product taxonomies.
In addition to these, factors like data size, category skewness, and noisy metadata
are limiting further research and practical implementation of large scale product
categorization. This motivated us to develop a real-world dataset for product
categorization.
We developed a new product categorization dataset called Atlas. An E-commerce
store typically sells products under several top-level categories such
as Electronics, Home & Kitchen, Clothing, etc. In this paper, we focused only
on clothing products. Our dataset contains 186,150 product images spanning 52
category paths, along with each product’s title, price and category path.
3.1 Taxonomy Generation
In the E-commerce world, each store has its own taxonomy. For example,
the category path for ’Jackets’ in Flipkart⁵ is Clothing > Men’s Clothing
> Winter & SeasonalWear > Jackets and in Amazon⁶ it is Clothing &
Accessories > Men > Jackets. Treating them as different category paths would
lead to a noisy taxonomy and noisy training data. We designed our taxonomy based on
the similarities in the taxonomy structures across different e-commerce retailer
websites. Our taxonomy is organized to a maximum depth of 3 levels, which
lets the consumer reach their product in no more than 3 clicks. The
process of building our taxonomy involved three steps. First, we analyzed and
listed the taxonomy structures of popular, niche, and premium clothing
products across different e-commerce stores. Next, we identified the common
category paths up to the third level across these websites. Finally, products that
had the same category path up to the third level were grouped together irrespective
of the dissimilarity in the deeper levels. For example, Women > Ethnic
Wear > Salwar Kameez > Bollywood and Women > Ethnic Wear > Salwar
Kameez > Anarkali are grouped together under the category Women
> Ethnic Wear > Salwar Kameez, as they share the same category path up to
the third level, beyond which they branch into different nodes. Our final taxonomy
tree is not an exhaustive list of all clothing categories but covers popular Western
Wear and niche Ethnic Wear, especially from the Indian Clothing collection.
Each category in the taxonomy tree has a maximum depth of 3 levels,
and the tree totals 52 category paths. Our taxonomy tree and a few sample products
from some of the categories in our dataset can be found here⁷.
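The grouping step in Section 3.1 can be illustrated with a minimal Python sketch. It assumes category paths are plain strings with levels separated by ">"; the function names and the separator handling are illustrative choices, not the authors' code.

```python
from collections import defaultdict

MAX_DEPTH = 3  # the taxonomy described above is capped at three levels


def truncate_path(path, max_depth=MAX_DEPTH, sep=">"):
    """Split a category path on `sep` and keep only the first `max_depth` levels."""
    levels = [level.strip() for level in path.split(sep)]
    return " > ".join(levels[:max_depth])


def group_by_truncated_path(paths):
    """Map each truncated path to the original store paths that collapse into it."""
    groups = defaultdict(list)
    for path in paths:
        groups[truncate_path(path)].append(path)
    return dict(groups)


# The two store paths from the Salwar Kameez example in the text.
store_paths = [
    "Women > Ethnic Wear > Salwar Kameez > Bollywood",
    "Women > Ethnic Wear > Salwar Kameez > Anarkali",
]
groups = group_by_truncated_path(store_paths)
print(groups)
```

Both store paths collapse into the single key Women > Ethnic Wear > Salwar Kameez, mirroring the example in the text.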
3.2 Data Collection
We crawled the product listings from popular Indian E-commerce stores. We
manually mapped each store’s category paths to the corresponding category
path in our taxonomy. We used the web scraping tools Scrapy and Selenium. The
crawlers extract the information from the HTML content of the product page
using CSS selectors. We extracted the product title, breadcrumb, image and price
corresponding to each product in the product listings. The attributes extracted
are stored in JSON format.
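As a rough sketch of how a manually created mapping and the JSON output could fit together, the snippet below maps a store breadcrumb to a taxonomy path and serializes one record. The field names, the target path "Men > Western Wear > Jackets", and the listing values are assumptions made for illustration; the crawling itself (Scrapy, Selenium, CSS selectors) is not shown.

```python
import json

# Hypothetical manual mapping from (store, store breadcrumb) to an Atlas category path.
# The two store breadcrumbs come from the 'Jackets' example in Section 3.1; the
# target path "Men > Western Wear > Jackets" is an assumed illustration.
STORE_TO_ATLAS = {
    ("flipkart", "Clothing > Men's Clothing > Winter & SeasonalWear > Jackets"):
        "Men > Western Wear > Jackets",
    ("amazon", "Clothing & Accessories > Men > Jackets"):
        "Men > Western Wear > Jackets",
}


def build_record(store, title, breadcrumb, image_url, price):
    """Return a JSON-serializable product record, or None if the breadcrumb is unmapped."""
    category_path = STORE_TO_ATLAS.get((store, breadcrumb))
    if category_path is None:
        return None
    return {
        "title": title,
        "breadcrumb": breadcrumb,
        "image": image_url,
        "price": price,
        "category_path": category_path,
    }


record = build_record(
    store="amazon",
    title="Example quilted jacket",  # made-up listing values
    breadcrumb="Clothing & Accessories > Men > Jackets",
    image_url="https://example.com/jacket.jpg",
    price=1999.0,
)
line = json.dumps(record, ensure_ascii=False)
print(line)
```

Returning None for unmapped breadcrumbs keeps products from categories outside the taxonomy out of the dataset rather than guessing a path for them.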
3.3 Data Cleaning
It is typical for an E-commerce store to show several images for a single
product. Not all of these images are necessarily a good representative
image of the product. Some images might display packaging, installation
instructions, etc. In the case of clothing, we found that many product listings
also included zoomed-in images that display intrinsic details such as the texture
of the fabric, brand labels, buttons, and pocket styles. Without the context of the
product listing, it would be hard even for a human to identify the corresponding
product. Including these zoomed-in images would drastically affect the quality
of the dataset. Finding and removing these noisy images manually would take
considerable time and effort. We modeled this as a binary classification task
(Zoomed Vs Normal images) and compared a Linear SVM with a simple 3 layer
CNN (Figure 2) based classification model. We prepared the training data by
visual inspection. We segregated noisy and high-quality images into two different
folders by looking at the thumbnails of hundreds of product images at a time. Our
models were trained on 6005 normal images and 1054 zoomed images, and the
performance metrics on the test set are shown in Table 1. We used computer
vision based features such as contours and histogram of gradients as input for our
LinearSVM. We automated the process of filtering out the noisy images using
the CNN model due to its superior performance compared to that of the LinearSVM.
Fig. 1. Examples of (a) Zoomed (dirty) and (b) Normal (clean) images from our Atlas
dataset. The Zoomed images show close-ups of the apparel or cropped versions of the
image that make it difficult to recognize the product, whereas the Normal images show
figures with the entire product visible.
Fig. 2. Architecture of Zoomed Vs Normal Model
5 https://www.flipkart.com/
6 https://www.amazon.in/
7 https://github.com/vumaasha/Atlas/tree/master/dataset#11-taxonomy-generation
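The per-class f-scores reported in Table 1 can be cross-checked from the reported precision and recall. The sketch below assumes the f-score is the usual F1 (harmonic mean of precision and recall), which the paper does not state explicitly.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in Table 1.
reported = {
    ("CNN", "Normal"): (0.99, 0.99),
    ("CNN", "Zoomed"): (0.95, 0.95),
    ("SVM", "Normal"): (0.91, 0.99),
    ("SVM", "Zoomed"): (0.86, 0.48),
}

for (model, label), (p, r) in reported.items():
    print(f"{model:3s} {label:6s} f-score = {f_score(p, r):.2f}")
```

Under that assumption the SVM's low recall on Zoomed images (0.48) is what pulls its f-score for that class down to 0.62, which is the gap that motivates choosing the CNN for filtering.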
Table 1. Metrics for the models used to predict Zoomed Vs Normal images
           CNN                            SVM
           precision  recall  f-score     precision  recall  f-score
Normal     0.99       0.99    0.99        0.91       0.99    0.95
Zoomed     0.95       0.95    0.95        0.86       0.48    0.62
Average    0.98       0.98    0.98        0.91       0.91    0.90
4 Benchmark models for Product Categorization

4.1 Resnet34 based Image Classification
We use the cnn_learner available in the fast.ai library to train our Image
classification model. This uses the Resnet34 architecture as the backbone of the
model, which is followed by AdaptiveConcatPool2d, Flatten and 2 blocks of
[nn.BatchNorm1d, nn.Dropout, nn.Linear, nn.ReLU] layers. The first block
has a number of inputs inferred from the backbone architecture Resnet34 (512) and
the second one has a number of outputs equal to the number of classes (52),
without the nn.ReLU activation.
4.2 Attention based Seq to Seq Model
We approach the product categorization problem as a sequence prediction
problem by leveraging the dependency between each level in the category path.
We use an Attention based Encoder-Decoder Neural Network architecture to generate
sequences. The Encoder is a 101 layered Residual Network (ResNet) trained
on the ImageNet classification task, which converts the input image to a fixed
size vector. The Decoder is a combination of a Long Short-Term Memory (LSTM)
network and an Attention Network, which combines the Encoder output and Attention
weights to predict category paths as sequences. Our architecture is similar
to the works by [21] and [22], used for Neural Machine Translation and image
captioning respectively. We extended the source code from the PyTorch tutorial to image
captioning repository by Sagar Vinodababu⁸. Figure 3 shows some of the category
paths generated by our model on test images which were not seen during
training or validation. The figure shows that our model focuses on different
parts of the image to generate each category level. To predict
the first category level, which is the gender, our model has focused on the face,
and it has focused on the actual region of the clothing products to predict the
next category levels, ’SareeBlouse’ and ’Kurta’.
Encoder and Decoder. In the Encoder, we use a Convolutional Neural Network
(CNN) to produce fixed size vectors. The images in the Atlas dataset have different
dimensions as they were collected from different sources. All these images are
resized to a uniform dimension of 150 × 150 pixels before being fed as input
to the Encoder. The input images are then represented by the 3 color channels
of RGB values. The Encoder uses a 101 layered Residual Network pre-trained
on the ImageNet classification task, which is shown in Figure 4. As we use the
Encoder only to encode images and not to classify them, we remove the last
two layers (linear and pooling layers) from the ResNet-101 model proposed by [7].
The encoding is resized to a fixed size by adding a 2D adaptive average pooling
layer, which enables the Encoder to accept images of variable sizes. The final
encoding produced by the Encoder has the dimensions (batch size, 14, 14,
2048).
Fig. 3. A sample of category paths predicted on test dataset by our model. We can
observe how the Attention focuses on different sections of the image while generating
each category level. For example, the face is being focused to predict the first category
level - gender.
8 https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning
Recurrent Neural Networks (RNNs) are popular for sequential classification
tasks as they consider both the current input and the learnings from the previously
received inputs for prediction. Plain RNNs have only short term memory, whereas
Long Short-Term Memory (LSTM) networks retain information over longer spans
by keeping it in a memory cell. We have a stacked
LSTM Network along with Attention in our Decoder, which is shown in Figure 4.
The Attention Network, shown in Figure 4, learns which part of the image
to focus on to predict the next level in the category path while performing the
sequence classification task. The Attention Network generates weights by con-
sidering the relevance between the encoded image and the previous hidden state
or previous output of the Decoder. It consists of linear layers which transform
the encoded image and the previous Decoder’s output to the same size. These
vectors are summed together and passed to another linear layer. This layer cal-
culates the values to be Softmaxed and then passes the values to a ReLU layer.
A final softmax layer calculates the weights alphas of the pixels which add up
to 1. If there are P pixels in our encoded image, then at each time step t,
$$\sum_{p}^{P} \alpha_{p,t} = 1 \tag{1}$$
We use a weighted average across all the pixels instead of a simple average so
that the important pixels are assigned greater weights.
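Equation (1) and the weighted average can be made concrete with a small pure-Python sketch. The relevance scores and the 2-dimensional pixel features below are made-up numbers, and the function names are illustrative; in the model the scores come from the Attention Network's linear layers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weighted_encoding(pixel_features, scores):
    """Weighted average of per-pixel feature vectors using softmax weights.

    pixel_features: list of P feature vectors (each a list of floats)
    scores:         list of P unnormalized relevance scores for time step t
    """
    alphas = softmax(scores)
    dim = len(pixel_features[0])
    context = [
        sum(alpha * feat[d] for alpha, feat in zip(alphas, pixel_features))
        for d in range(dim)
    ]
    return alphas, context


# Toy input: P = 3 "pixels" with 2-dimensional features (made-up numbers).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [2.0, 0.5, 0.5]
alphas, context = attention_weighted_encoding(features, scores)
print(alphas, sum(alphas), context)
```

The weights sum to 1 as in Equation (1), and the pixel with the highest score contributes most to the resulting encoding.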
The Decoder receives the encoded image from the Encoder and uses it to initialize
the hidden and cell state of the LSTM model through two linear layers.
Two virtual category levels, <start> and <end>, which denote the beginning
and end of the sequence, are added to the category path. The Decoder LSTM
uses teacher forcing, proposed by [20], for training. The Decoder uses a <start>
marker which is considered to be the zeroth category level. The <start> marker
along with the encoded image is used to generate the first (top) level of the category
path. Subsequently, all other levels are predicted using the sequence generated
so far along with the Attention weights. An <end> marker is used to
mark the end of a category path. The Decoder stops decoding the sequence
as soon as it generates the <end> marker. At each time step, the Decoder
computes the weights and the Attention weighted encoding from the Attention Network
using its previous hidden state. Another linear layer is added to create a
sigmoid-activated gate, and the Attention weighted encoding is passed through
it, concatenated with the embedding of the previously generated category
path, and fed into the LSTM Decoder to generate the new hidden state.
The next level is predicted from this hidden state using a final softmax
layer. The softmax layer transforms the
hidden state into scores, which are stored for further use in beam search
for selecting the ’k’ best levels.
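The beam search over category levels can be sketched as follows. This is a simplified stand-in, not the authors' implementation: a hand-written lookup table of next-level probabilities (made-up numbers) replaces the Decoder's softmax scores, and completed sequences are simply set aside rather than shrinking the beam.

```python
import math

START, END = "<start>", "<end>"

# Toy next-level probabilities keyed by the previous level (made-up numbers).
# In the paper the scores come from the Decoder's softmax; a lookup table
# stands in for the Decoder here.
NEXT_PROBS = {
    START: {"Men": 0.6, "Women": 0.4},
    "Men": {"Western Wear": 0.7, "Ethnic Wear": 0.3},
    "Women": {"Ethnic Wear": 0.8, "Western Wear": 0.2},
    "Western Wear": {"Jackets": 0.9, END: 0.1},
    "Ethnic Wear": {"Kurta": 0.6, "Salwar Kameez": 0.4},
    "Jackets": {END: 1.0},
    "Kurta": {END: 1.0},
    "Salwar Kameez": {END: 1.0},
}


def beam_search(next_probs, beam_width=5, max_len=5):
    """Return completed (sequence, log_prob) pairs, best first."""
    beams = [([START], 0.0)]
    completed = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for level, prob in next_probs[seq[-1]].items():
                candidates.append((seq + [level], score + math.log(prob)))
        # Keep only the `beam_width` highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            (completed if seq[-1] == END else beams).append((seq, score))
        if not beams:
            break
    completed.sort(key=lambda c: c[1], reverse=True)
    return completed


best_seq, best_score = beam_search(NEXT_PROBS, beam_width=5)[0]
print(" > ".join(best_seq[1:-1]), math.exp(best_score))
```

Summing log-probabilities instead of multiplying probabilities avoids numerical underflow on longer paths; the beam width of 5 matches the value reported in Section 5.1.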
Fig. 4. Encoder - Decoder with Attention Network
5 Training
5.1 Model hyperparameters
Zoomed Vs Normal LinearSVM Model. We trained the LinearSVM available
in Scikit-learn with C set to 0.0001 and class_weight set to ’balanced’, using hinge
loss. The optimal C value was identified using a grid search.
Zoomed Vs Normal CNN Model. We trained for 10 epochs using Binary
CrossEntropy as the loss function and the RMSProp optimizer with the learning rate set to
0.001, rho set to 0.9 and decay set to 0.0.
Resnet34 based Image Classification. We trained for 17 epochs using Categorical
CrossEntropy as the loss function and Leslie Smith’s one cycle policy [19]
for choosing the learning rate. We used early stopping to terminate the training
process when the decrease in validation loss is less than 0.001 for 3 consecutive
epochs.
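One possible reading of the early stopping rule is sketched below. It assumes the decrease is measured between consecutive epochs; a framework callback may instead compare against the best loss seen so far, so this is an interpretation rather than the exact criterion used. The loss values are made up.

```python
def early_stop_epoch(val_losses, min_delta=0.001, patience=3):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the epoch-to-epoch decrease in validation loss has been
    smaller than `min_delta` for `patience` consecutive epochs.
    """
    small_steps = 0
    for epoch in range(1, len(val_losses)):
        decrease = val_losses[epoch - 1] - val_losses[epoch]
        small_steps = small_steps + 1 if decrease < min_delta else 0
        if small_steps >= patience:
            return epoch + 1
    return None


# Made-up validation losses for illustration.
losses = [0.90, 0.60, 0.45, 0.4495, 0.4492, 0.4490, 0.4489]
print(early_stop_epoch(losses))
```

With these losses the improvement falls below 0.001 at epochs 4, 5 and 6, so training would stop after the sixth epoch.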
Attention based Seq to Seq Model. We trained our model on a GPU for 3
epochs with a batch size of 128 and a dropout rate of 0.5, after which the validation
accuracy stopped improving. We used Adam optimizers with learning rates of 1e-4
and 4e-4 for the Encoder and Decoder respectively. We picked a beam width
of 5 based on our experiments. The regularization parameter for doubly stochastic
Attention was set to 1 and gradient clipping was set to an absolute value of 5.
The pre-trained model can be downloaded from here⁹.
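For reference, assuming the regularizer follows the doubly stochastic attention penalty introduced in [22], the training loss with regularization parameter λ would take the form below. The symbol T (number of decoding steps) and the exact form of the first term are assumptions; α and P are as in Equation (1).

```latex
L = -\log P(y \mid x) + \lambda \sum_{p}^{P} \Bigl( 1 - \sum_{t}^{T} \alpha_{p,t} \Bigr)^{2}, \qquad \lambda = 1
```

The penalty encourages every pixel to receive a total attention of about 1 across the decoding steps, so that the model does not ignore parts of the image over the whole sequence.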
5.2 Hardware
(1) Nvidia GeForce GTX 1080 Ti GPU with 11 GB RAM (2) Intel® Xeon®
Processor E5-2650 v4, 30M Cache, 2.20 GHz, 12 Cores, 24 Threads (3) 250 GB
RAM (4) CentOS 7
6 Results
We evaluated the proposed models on our dataset of 186,150 clothing
images and their category paths. We split our dataset into train, validation and
test sets similar to the splits used in the work by [9]. Stratified random sampling
was carried out on our dataset, with the training set having 65% of the data (119,155
images), the validation set 5% (11,147 images) and the test set 30% (55,848
images). The Resnet34 classification model and the Seq to Seq model trained on
our Atlas dataset achieved overall micro f-scores of 0.92 and 0.90 respectively.
A comparison of the f-scores of both the benchmark models over the support size of
leaf categories is shown in Figure 5. Though we observe that the classification
model performs better than the Seq to Seq model, we believe the reason is
that we have only 52 categories at the moment. As the number of categories
increases, the structure in the taxonomy can be leveraged better using the Seq to
Seq model. In addition to predicting the category paths, the Seq to Seq model
also explains the reason behind its predictions, as shown in Figure 3.
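The micro f-score used above pools true positives, false positives and false negatives over all classes before computing precision and recall. A minimal sketch, with made-up labels and an illustrative function name, assuming each product has exactly one true and one predicted category path:

```python
from collections import Counter


def micro_f_score(y_true, y_pred):
    """Micro-averaged F-score: pool TP, FP and FN over all classes first."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    total_tp = sum(tp.values())
    if total_tp == 0:
        return 0.0
    precision = total_tp / (total_tp + sum(fp.values()))
    recall = total_tp / (total_tp + sum(fn.values()))
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration; each label is a full category path.
y_true = ["Men>Western Wear>Jackets", "Women>Ethnic Wear>Kurta",
          "Women>Ethnic Wear>Kurta", "Men>Western Wear>Shirts"]
y_pred = ["Men>Western Wear>Jackets", "Women>Ethnic Wear>Kurta",
          "Men>Western Wear>Shirts", "Men>Western Wear>Shirts"]
print(micro_f_score(y_true, y_pred))
```

In this single-label setting every error counts once as a false positive and once as a false negative, so the micro f-score coincides with plain accuracy (3 of 4 correct here).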
[14] claim that using a Seq to Seq model for product categorization helps to
identify new category paths in the taxonomy. However, in our experiments, we
have observed that not all of the new category paths generated by the Seq to
Seq model are valid. In our case, our model generated 5 new category
paths, which are shown in Table 2, out of which we found only 2 to be valid.
Therefore, a manual inspection of newly created category paths is needed to
filter out the category paths which could be used to enrich the taxonomy.
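Collecting the generated paths that need manual inspection amounts to a set difference against the known taxonomy. The sketch below uses a hypothetical two-path slice of the taxonomy; the two unseen paths are among those reported in Table 2, and the function name is illustrative.

```python
def find_new_paths(predicted_paths, known_paths):
    """Return predicted category paths absent from the taxonomy, in first-seen order."""
    known = set(known_paths)
    seen, new_paths = set(), []
    for path in predicted_paths:
        if path not in known and path not in seen:
            seen.add(path)
            new_paths.append(path)
    return new_paths


# A hypothetical slice of the taxonomy; the real one has 52 category paths.
known_paths = [
    "Men>Western Wear>Jackets",
    "Women>Western Wear>Dresses",
]
# Model outputs; the two unseen paths below are taken from Table 2.
predicted = [
    "Men>Western Wear>Jackets",
    "Women>Western Wear>Jackets",
    "Men>Western Wear>Dresses",
    "Women>Western Wear>Jackets",
]
for path in find_new_paths(predicted, known_paths):
    print("needs manual review:", path)
```

The check only detects novelty; deciding whether a new path is valid (Women>Western Wear>Jackets) or invalid (Men>Western Wear>Dresses) still requires the manual inspection described above.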
Table 2. Valid and invalid category paths created by Seq to Seq model
Valid category paths                   Invalid category paths
Women>Western Wear>Blazers&Suits       Men>Western Wear>Dresses
Women>Western Wear>Jackets             Men>Western Wear>Tanktops & Camisoles
                                       Women>Inner Wear>Shorts
Fig. 5. F-scores of our benchmark models over leaf level categories ordered by their
sample size. Note that the sample size on the x axis is in log scale.
9 https://goo.gl/forms/C1824kjmbuVo7H6H3
7 Conclusion and Future Works

This paper introduces Atlas, a fashion apparel dataset with 186,150 apparel
images along with their corresponding product titles. We have open-sourced the
code base and the procedure to build the dataset of images and their taxonomy.
We have proposed two benchmark models using classification and Attention
based Sequence approaches to predict product taxonomy.
In the future, we plan to extend our Atlas dataset by adding more categories
and products, thereby increasing the total number of category paths. We
also plan to avoid the generation of invalid category paths in our Seq to Seq model
by considering the taxonomy structure while decoding, and to explore Transformer
Networks instead of Recurrent Neural Networks (RNN).
8 Acknowledgements
The first author, Venkatesh Umaashankar, worked extensively on the problem
of Product Categorization using text attributes during his tenure at Indix. He
thanks Krishna Sangeeth, Sriram Ramachandrasekaran, Anirudh Venkataraman,
Manoj Mahalingam, Rajesh Muppalla and Sridhar Venkatesh for their help and
support.
Bibliography
[1] Bossard, L., Dantone, M., Leistner, C., Wengert, C., Quack, T., Van Gool,
L.: Apparel classification with style. In: Asian conference on computer vi-
sion. pp. 321–335. Springer (2012)
[2] Chen, H., Gallagher, A., Girod, B.: Describing clothing by semantic at-
tributes. In: European conference on computer vision. pp. 609–623. Springer
(2012)
[3] Das, P., Xia, Y., Levine, A., Di Fabbrizio, G., Datta, A.: Web-scale
language-independent cataloging of noisy product listings for e-commerce.
In: Proceedings of the 15th Conference of the European Chapter of the As-
sociation for Computational Linguistics: Volume 1, Long Papers. vol. 1, pp.
969–979 (2017)
[4] Ding, Y., Korotkiy, M., Omelayenko, B., Kartseva, V., Zykov, V., Klein, M.,
Schulten, E., Fensel, D.: Goldenbullet: Automated classification of product
data in e-commerce. In: Proceedings of the 5th international conference on
business information systems (2002)
[5] Dumais, S., Chen, H.: Hierarchical classification of web content. In: Proceed-
ings of the 23rd annual international ACM SIGIR conference on Research
and development in information retrieval. pp. 256–263. ACM (2000)
[6] Ha, J.W., Pyo, H., Kim, J.: Large-scale item categorization in e-commerce
using multiple recurrent neural networks. In: Proceedings of the 22nd ACM
SIGKDD International Conference on Knowledge Discovery and Data Min-
ing. pp. 107–115. ACM (2016)
[7] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recog-
nition. In: Proceedings of the IEEE conference on computer vision and
pattern recognition. pp. 770–778 (2016)
[8] Hiramatsu, M., Wakabayashi, K.: Encoder-decoder neural networks for tax-
onomy classification. In: eCOM@SIGIR. CEUR Workshop Proceedings,
vol. 2319. CEUR-WS.org (2018)
[9] Karpathy, A., Fei-Fei, L.: Deep visual-semantic alignments for generating
image descriptions. In: Proceedings of the IEEE conference on computer
vision and pattern recognition. pp. 3128–3137 (2015)
[10] Kiapour, M.H., Yamaguchi, K., Berg, A.C., Berg, T.L.: Hipster wars: Dis-
covering elements of fashion styles. In: European conference on computer
vision. pp. 472–488. Springer (2014)
[11] Kozareva, Z.: Everyone likes shopping! multi-class product categorization
for e-commerce. In: Proceedings of the 2015 Conference of the North Ameri-
can Chapter of the Association for Computational Linguistics: Human Lan-
guage Technologies. pp. 1329–1333 (2015)
[12] Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with
deep convolutional neural networks. In: Advances in neural information pro-
cessing systems. pp. 1097–1105 (2012)
[13] Li, M.Y., Kok, S., Kok, S.: Unconstrained product categorization with
sequence-to-sequence models. In: eCOM@SIGIR. CEUR Workshop Pro-
ceedings, vol. 2319. CEUR-WS.org (2018)
[14] Li, M.Y., Kok, S., Tan, L.: Don’t classify, translate: Multi-level e-
commerce product categorization via machine translation. arXiv preprint
arXiv:1812.05774 (2018)
[15] Lin, Y., Das, P., Datta, A.: Overview of the SIGIR 2018 ecom rakuten
data challenge. In: eCOM@SIGIR. CEUR Workshop Proceedings, vol. 2319.
CEUR-WS.org (2018)
[16] Liu, Z., Luo, P., Qiu, S., Wang, X., Tang, X.: Deepfashion: Powering robust
clothes recognition and retrieval with rich annotations. In: Proceedings of
the IEEE conference on computer vision and pattern recognition. pp. 1096–
1104 (2016)
[17] McAuley, J., Targett, C., Shi, Q., Van Den Hengel, A.: Image-based recom-
mendations on styles and substitutes. In: Proceedings of the 38th Interna-
tional ACM SIGIR Conference on Research and Development in Informa-
tion Retrieval. pp. 43–52. ACM (2015)
[18] Shen, D., Ruvini, J.D., Sarwar, B.: Large-scale item categorization for e-
commerce. In: Proceedings of the 21st ACM international conference on
Information and knowledge management. pp. 595–604. ACM (2012)
[19] Smith, L.N.: Cyclical learning rates for training neural networks. In: 2017
IEEE Winter Conference on Applications of Computer Vision (WACV). pp.
464–472. IEEE (2017)
[20] Williams, R.J., Zipser, D.: A learning algorithm for continually running
fully recurrent neural networks. Neural computation 1(2), 270–280 (1989)
[21] Wu, Y., Schuster, M., Chen, Z., Le, Q.V., Norouzi, M., Macherey, W.,
Krikun, M., Cao, Y., Gao, Q., Macherey, K., et al.: Google’s neural ma-
chine translation system: Bridging the gap between human and machine
translation. arXiv preprint arXiv:1609.08144 (2016)
[22] Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A., Salakhudinov, R., Zemel,
R., Bengio, Y.: Show, attend and tell: Neural image caption generation
with visual attention. In: International conference on machine learning. pp.
2048–2057 (2015)
