Update 2021-5-26-torchvision-mobilenet-v3-implementation.md

jss367 · web-flow · commit ad36b46eaea4 · 2023-07-03T10:23:08.000-07:00
diff --git a/_posts/2021-5-26-torchvision-mobilenet-v3-implementation.md b/_posts/2021-5-26-torchvision-mobilenet-v3-implementation.md
@@ -81,7 +81,7 @@ Another important detail is that though PyTorch’s and TensorFlow’s RMSProp i
 
 **Increasing our accuracy by tuning hyperparameters & improving our training recipe**
 
-After configuring the optimizer to achieve fast and stable training, we turned into optimizing the accuracy of the model. There are a few techniques that helped us achieve this. First of all, to avoid overfitting we augmented out data using the AutoAugment algorithm, followed by RandomErasing. Additionally we tuned parameters such as the weight decay using cross validation. We also found beneficial to perform [weight averaging](https://github.com/pytorch/vision/blob/674e8140042c2a3cbb1eb9ebad1fa49501599130/references/classification/utils.py#L259) across different epoch checkpoints after the end of the training. Finally, though not used in our published training recipe, we found that using Label Smoothing, Stochastic Depth and LR noise injection improve the overall accuracy by over [1.5 points](https://rwightman.github.io/pytorch-image-models/training_hparam_examples/#mobilenetv3-large-100-75766-top-1-92542-top-5).
+After configuring the optimizer to achieve fast and stable training, we turned into optimizing the accuracy of the model. There are a few techniques that helped us achieve this. First of all, to avoid overfitting we augmented our data using the AutoAugment algorithm, followed by RandomErasing. Additionally we tuned parameters such as the weight decay using cross validation. We also found beneficial to perform [weight averaging](https://github.com/pytorch/vision/blob/674e8140042c2a3cbb1eb9ebad1fa49501599130/references/classification/utils.py#L259) across different epoch checkpoints after the end of the training. Finally, though not used in our published training recipe, we found that using Label Smoothing, Stochastic Depth and LR noise injection improve the overall accuracy by over [1.5 points](https://rwightman.github.io/pytorch-image-models/training_hparam_examples/#mobilenetv3-large-100-75766-top-1-92542-top-5).
 
 The graph and table depict a simplified summary of the most important iterations for improving the accuracy of the MobileNetV3 Large variant. Note that the actual number of iterations done while training the model was significantly larger and that the progress in accuracy was not always monotonically increasing. Also note that the Y-axis of the graph starts from 70% instead from 0% to make the difference between iterations more visible: