Visual Taxonomy Report
commerce
Samarth Shinde
Academic Report 1

1. Abstract
2. Introduction
The final model achieved 92.78% accuracy with an F1 score of 0.35138 after training
for 100 epochs on high-performance GPUs (Tesla V100 32GB).
3. Methodology
EfficientNet-B7, pretrained on ImageNet, was fine-tuned for attribute prediction.
The model includes:
• Convolutional layers for feature extraction.
• Fully connected layers for attribute prediction.
• Regularization techniques to prevent overfitting.
Training involved:
• Optimizer: Adam.
• Loss Function: Categorical cross-entropy.
• Hyperparameters: Batch size of 32, learning rate of 0.001.
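
To make the loss function above concrete, the following is a minimal sketch of softmax followed by categorical cross-entropy for a single sample, written in plain Python. The three-class logits and the target are made-up illustration values, not outputs of the actual model, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot_target, probs, eps=1e-12):
    """Loss for one sample: -sum(t_i * log(p_i))."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot_target, probs))


# Hypothetical 3-class attribute; the true class is index 0.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
loss = categorical_cross_entropy([1, 0, 0], probs)
```

With a one-hot target the sum reduces to the negative log-probability assigned to the true class, so the loss falls as the model grows more confident in the correct attribute value.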
1. Training:
• Automated using run_training.sh in a Linux environment.
• Completed in 3 days on Tesla V100 GPUs.
2. Inference:
• Predictions were made using the inference.py script.
• Results stored in the submission/ folder.
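
As an illustration of the final step, here is a small sketch of how predictions might be written into a submission/ folder. The report does not describe the internals of inference.py, so the file name predictions.csv, the column names, and the write_submission helper are all assumptions made for this example.

```python
import csv
from pathlib import Path


def write_submission(predictions, out_dir="submission", filename="predictions.csv"):
    """Write (image_id, predicted_label) pairs to a CSV file and return its path.

    The file name and column names are hypothetical, chosen for illustration.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    target = out_path / filename
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["image_id", "predicted_label"])
        writer.writerows(predictions)
    return target
```

Calling write_submission([("img_001", "red")]) would create submission/predictions.csv relative to the current working directory.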
4. Results
4.3 Observations
• EfficientNet performed exceptionally well on diverse product images.
• The model’s performance was consistent across categories, but future
improvements could focus on enhancing the F1 score.
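
One possible reason accuracy and F1 can diverge is class imbalance. The sketch below uses toy labels, not the project's data, and assumes the F1 score is macro-averaged, which the report does not state. It shows a classifier that always predicts the majority class scoring high accuracy but a much lower macro F1.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy imbalanced data: 95 samples of class "a", 5 of class "b".
y_true = ["a"] * 95 + ["b"] * 5
y_pred = ["a"] * 100  # a model that always predicts the majority class

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

Here the rare class contributes an F1 of zero, which pulls the macro average to roughly half of the majority-class score even though 95% of predictions are correct.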
5. Project Structure
Folder Organization
visual-taxonomy/
    data/               # Dataset files
    requirements.txt    # Dependencies
Key Scripts
1. prepare_data.py: Prepares and splits datasets.
2. encode_labels.py: Encodes categorical labels.
3. category_attributes_mapping.py: Maps attributes to categories.
4. train.py: Trains the EfficientNet model.
5. inference.py: Generates predictions for the test dataset.
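
The first two steps in the list above, splitting the data and encoding categorical labels, can be sketched as follows. This is not the content of prepare_data.py or encode_labels.py, which the report does not show. The function names, the 20% validation fraction, and the seed are assumptions made for illustration.

```python
import random


def split_dataset(items, val_fraction=0.2, seed=42):
    """Shuffle deterministically and return (train, validation) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def build_label_index(labels):
    """Map each distinct label to an integer, in sorted order."""
    return {label: i for i, label in enumerate(sorted(set(labels)))}


def one_hot(label, label_index):
    """Encode a label as a one-hot vector, as categorical cross-entropy expects."""
    vector = [0] * len(label_index)
    vector[label_index[label]] = 1
    return vector


train, val = split_dataset(range(10))
index = build_label_index(["red", "blue", "red", "green"])
encoded = one_hot("green", index)
```

Sorting the labels before indexing keeps the mapping stable across runs, so the same encoding can be reused at inference time.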
6. Conclusion
7. References
1. Meesho Hackathon: Dataset and competition guidelines.
2. TensorFlow Documentation: EfficientNet Implementation.
3. Kaggle Dataset: Meesho Datasets.