
Visual Taxonomy: Attribute Prediction for E-commerce

Samarth Shinde

Academic Report 1

1. Abstract

The Visual Taxonomy project focuses on automating the process of product
attribute prediction for e-commerce platforms using deep learning techniques.
By leveraging EfficientNet, an advanced computer vision model, the project
predicts product categories and attributes solely based on product images. The
project, developed during the Meesho Hackathon, demonstrates robust
preprocessing pipelines, high-performance modeling, and efficient large-scale
dataset handling. This solution aims to reduce manual errors, streamline product
cataloging, and enhance user experience for e-commerce platforms.

2. Introduction

E-commerce platforms face challenges in maintaining accurate product catalogs
due to discrepancies between product images and descriptions. These
inconsistencies can lead to user dissatisfaction and operational inefficiencies.
The goal of this project is to develop a machine learning model capable of
accurately predicting product attributes (e.g., category, color, pattern) directly
from product images.

The project integrates:

• EfficientNet-B7 for image-based classification.
• Preprocessing techniques for data preparation and label encoding.
• Automated training pipelines for reproducibility.

The final model achieved 92.78% accuracy with an F1 score of 0.35138, trained
over 100 epochs using high-performance GPUs (Tesla V100 32GB).

3. Methodology

3.1 Data Preprocessing

The dataset, hosted on Kaggle (Meesho Datasets), comprises raw product
images and metadata. Key preprocessing steps include:
1. Data Cleaning: Removal of incomplete or inconsistent records.
2. Train-Validation Split: Division of the dataset into training and validation
sets using prepare_data.py.
3. Label Encoding: Conversion of categorical labels into numeric format
using encode_labels.py.
4. Data Augmentation: Applied transformations to enhance dataset
diversity.
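
The cleaning, splitting, and encoding steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the report does not show the contents of prepare_data.py or encode_labels.py, so the record fields ("image", "category"), the 80/20 split ratio, the stratified sampling, and the helper names fit_label_encoder and stratified_split are all hypothetical.

```python
import random
from collections import defaultdict


def fit_label_encoder(labels):
    """Map each distinct label to an integer id, in sorted order."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


def stratified_split(records, label_key, val_fraction=0.2, seed=42):
    """Split records so each label keeps roughly the same share in both sets."""
    by_label = defaultdict(list)
    for record in records:
        by_label[record[label_key]].append(record)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        # Hold out at least one sample per label when the label has more than one.
        n_val = max(1, round(len(group) * val_fraction)) if len(group) > 1 else 0
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val


# Hypothetical records; the real dataset columns are not given in the report.
records = (
    [{"image": f"saree_{i}.jpg", "category": "Sarees"} for i in range(10)]
    + [{"image": f"kurti_{i}.jpg", "category": "Kurtis"} for i in range(5)]
    + [{"image": "unlabeled.jpg", "category": None}]
)

# Data cleaning: drop records with a missing label.
records = [r for r in records if r.get("category")]

# Label encoding: categorical label -> integer id.
encoder = fit_label_encoder(r["category"] for r in records)

# Train-validation split, then encode the training labels.
train_set, val_set = stratified_split(records, "category")
train_ids = [encoder[r["category"]] for r in train_set]
```

Splitting per label rather than over the whole dataset is a design choice made here so that rare categories still appear in the validation set; the original scripts may split differently.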

3.2 Model Architecture

EfficientNet-B7, pretrained on ImageNet, was fine-tuned for attribute prediction.
The model includes:
• Convolutional layers for feature extraction.
• Fully connected layers for attribute prediction.
• Regularization techniques to prevent overfitting.

Training involved:
• Optimizer: Adam.
• Loss Function: Categorical cross-entropy.
• Hyperparameters: Batch size of 32, learning rate of 0.001.
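
To make the loss function concrete, the following sketch computes softmax probabilities and the categorical cross-entropy for a single sample in plain Python. The hyperparameter values are the ones listed above; the TRAINING_CONFIG dictionary, the function names, and the example logits are illustrative assumptions, not the project's actual train.py code.

```python
import math

# Values taken from the report; the dictionary layout itself is hypothetical.
TRAINING_CONFIG = {
    "optimizer": "adam",
    "learning_rate": 0.001,
    "batch_size": 32,
    "epochs": 100,
}


def softmax(logits):
    """Convert raw scores to probabilities; subtract the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot_target, probabilities, eps=1e-12):
    """Loss for one sample: -sum(t_k * log(p_k)) over all classes k."""
    return -sum(
        t * math.log(max(p, eps)) for t, p in zip(one_hot_target, probabilities)
    )


# Made-up logits for a three-class attribute; class 0 is the true class.
logits = [2.0, 0.5, -1.0]
target = [1, 0, 0]

probs = softmax(logits)
loss = categorical_cross_entropy(target, probs)

# The same prediction scored against a wrong target gives a larger loss.
wrong_loss = categorical_cross_entropy([0, 0, 1], probs)
```

Because the target is one-hot, the loss reduces to the negative log of the probability assigned to the true class, so confident correct predictions are penalized least.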

3.3 Training and Inference

1. Training:
• Conducted using the train.py script.

• Automated using run_training.sh on a Linux environment.
• Completed in 3 days on Tesla V100 GPUs.
2. Inference:
• Predictions were made using the inference.py script.
• Results stored in the submission/ folder.

4. Results

4.1 Model Performance

• Accuracy: 92.78%
• F1 Score: 0.35138
• Epochs: 100
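
The gap between the accuracy and the F1 score is large, and one common cause of such a gap is class imbalance. The sketch below shows, on made-up labels, how a classifier can reach high accuracy while its macro-averaged F1 stays low. It is an assumption-laden illustration only: the report does not state how the F1 score was averaged or how the classes are distributed, so this is a possible explanation rather than a diagnosis.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, so every class counts equally."""
    scores = []
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up imbalanced data: 90 samples of class 0, 5 each of classes 1 and 2.
y_true = [0] * 90 + [1] * 5 + [2] * 5
# A degenerate model that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)  # 0.90
f1 = macro_f1(y_true, y_pred)   # about 0.316
```

Here the majority class earns an F1 of about 0.947 while both minority classes score 0, which pulls the macro average down to roughly 0.316 even though 90% of predictions are correct.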

4.2 Training Setup

• Hardware: Tesla V100 32GB (4 GPUs)
• Training Time: 3 days

4.3 Observations
• EfficientNet-B7 achieved high accuracy (92.78%) on diverse product images.

• The model’s performance was consistent across categories, but future
improvements could focus on enhancing the F1 score.

5. Project Structure

Folder Organization

visual-taxonomy/
    data/               # Dataset files
    encoders/           # Encoded labels
    models/             # Trained models and logs
    scripts/            # Python scripts for preprocessing and training
    submission/         # Submission CSV files
    README.md           # Project documentation
    requirements.txt    # Dependencies
    run_training.sh     # Shell script for training automation

Key Scripts
1. prepare_data.py: Prepares and splits datasets.
2. encode_labels.py: Encodes categorical labels.
3. category_attributes_mapping.py: Maps attributes to categories.
4. train.py: Trains the EfficientNet model.
5. inference.py: Generates predictions for the test dataset.
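
As a rough idea of what a category-to-attribute mapping might look like, the sketch below keeps only the predicted attributes that apply to a product's category. Everything in it is hypothetical: the report does not list the real categories or attribute names, and the actual category_attributes_mapping.py may be organized quite differently.

```python
# Hypothetical mapping; the real categories and attribute names are not
# given in the report.
CATEGORY_ATTRIBUTES = {
    "Sarees": ["color", "pattern", "border"],
    "Kurtis": ["color", "pattern", "sleeve_length"],
}


def attributes_for(category):
    """Return the attribute names to predict for a category (empty if unknown)."""
    return CATEGORY_ATTRIBUTES.get(category, [])


def filter_predictions(category, predictions):
    """Keep only the predicted attributes that apply to the given category."""
    allowed = set(attributes_for(category))
    return {name: value for name, value in predictions.items() if name in allowed}


# Example: a model that emits every attribute, filtered down for a saree.
raw = {"color": "red", "pattern": "floral", "sleeve_length": "short"}
cleaned = filter_predictions("Sarees", raw)
```

In this example the "sleeve_length" prediction is dropped because the hypothetical "Sarees" entry does not include it.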

6. Conclusion

The Visual Taxonomy project successfully addresses the challenges of attribute
prediction in e-commerce platforms. The use of EfficientNet-B7 demonstrates
the potential of deep learning in automating manual tasks. The model’s high
accuracy and scalable pipeline make it a viable solution for real-world
deployment. Future work will focus on:
• Improving the F1 score through advanced data augmentation
techniques.
• Exploring ensemble models for better generalization.

7. References
1. Meesho Hackathon: Dataset and competition guidelines.
2. TensorFlow Documentation: EfficientNet Implementation.
3. Kaggle Dataset: Meesho Datasets.
