YOLO Versions
- Anchor boxes
- Batch Normalization
* the anchor boxes were all the same size
YOLOv5 (EfficientDet)
- The original YOLO was trained on the PASCAL VOC dataset (20 object categories). YOLOv5 was trained on
a larger and more diverse dataset called D5 (600 object categories)
- dynamic anchor boxes: a clustering algorithm groups the ground-truth
bounding boxes into clusters
- the centroids of the clusters are then used as the anchor boxes. This aligns the
anchor boxes more closely with the size and shape of the objects in the dataset
- spatial pyramid pooling (SPP), which pools the feature maps at several
scales
- YOLOv5 includes several improvements to SPP over YOLOv4
- CIoU loss (an IoU variant) to improve the model's performance on imbalanced datasets
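The anchor clustering step in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: the notes do not say which clustering algorithm or distance is used, so the sketch assumes k-means with a 1 - IoU distance between (width, height) pairs, a common choice for anchor estimation. The function names `wh_iou` and `kmeans_anchors` and the sample boxes are hypothetical.

```python
def wh_iou(a, b):
    """IoU of two boxes given as (width, height), as if they shared one center."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=50):
    """Cluster (width, height) pairs and return k centroids, sorted by area.

    Assumption: distance is 1 - IoU, so the nearest centroid is the one
    with the highest IoU. Initial centroids are picked deterministically
    from the boxes sorted by area.
    """
    assert 0 < k <= len(boxes)
    by_area = sorted(boxes, key=lambda b: b[0] * b[1])
    n = len(by_area)
    centroids = [by_area[(2 * i + 1) * n // (2 * k)] for i in range(k)]

    for _ in range(iters):
        # Assignment step: each box joins the centroid it overlaps most.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            nearest = max(range(k), key=lambda i: wh_iou(box, centroids[i]))
            clusters[nearest].append(box)

        # Update step: each centroid becomes the mean width and height
        # of its members; an empty cluster keeps its previous centroid.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                new_centroids.append(centroids[i])

        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids

    return sorted(centroids, key=lambda c: c[0] * c[1])


# Hypothetical ground-truth box sizes: three small boxes, three tall ones.
sample_boxes = [(10, 12), (12, 10), (11, 11), (100, 200), (110, 190), (90, 210)]
print(kmeans_anchors(sample_boxes, k=2))
```

The returned centroids play the role of the anchor boxes: one per cluster of similar ground-truth shapes.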
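For the CIoU loss mentioned in the list above, a minimal sketch for a single pair of boxes follows. It assumes the usual CIoU definition: 1 - IoU, plus a penalty for the distance between box centers, plus a penalty for aspect-ratio mismatch. The function name `ciou_loss`, the (x1, y1, x2, y2) box layout, and the `eps` guard are assumptions for illustration, not details taken from these notes.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """CIoU loss for one predicted box and one ground-truth box.

    Both boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt

    # Plain IoU.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    enc_w = max(x2, gx2) - min(x1, gx1)
    enc_h = max(y2, gy2) - min(y1, gy1)
    c2 = enc_w ** 2 + enc_h ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


print(ciou_loss((0, 0, 10, 10), (2, 2, 12, 12)))
```

Unlike plain 1 - IoU, this loss still changes with the center distance when the boxes do not overlap at all, which is the main reason for using it in box regression.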
YOLOv7
- uses nine anchor boxes, which lets it detect a wider range of object
shapes and sizes than previous versions and helps reduce the number of false
positives
- new loss function called “focal loss”
- processes images at a high resolution of 608 by 608 pixels, which helps detect
small objects
- high speed: processes images at a rate of 155 frames per second
* less accurate than Faster R-CNN and Mask R-CNN
* struggles in crowded scenes and with small objects that are far from the camera or appear at different scales
* inconvenient to use in real-world applications
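The focal loss mentioned in the YOLOv7 list above can be sketched for a single binary prediction as follows. This assumes the standard formulation FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t); the defaults gamma = 2 and alpha = 0.25 are the commonly cited values from the original focal loss work, not settings taken from these notes, and the function name `focal_loss` is hypothetical.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 1 (object) or 0 (background)
    """
    # p_t is the probability the model assigned to the true class.
    p_t = p if y == 1 else 1 - p
    alpha_t = alpha if y == 1 else 1 - alpha
    p_t = min(max(p_t, eps), 1.0)  # avoid log(0)

    # (1 - p_t) ** gamma shrinks the loss of well-classified examples,
    # so hard examples dominate the total.
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)  # confident, correct prediction
hard = focal_loss(0.1, 1)  # confident, wrong prediction
print(easy, hard)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which makes it easy to compare the two losses on the same predictions.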
YOLOv8
- YOLOv8 introduces a new API that makes training and inference easier on
both CPU and GPU devices; the framework also supports previous YOLO versions
- The developers are still working on releasing a scientific paper that will
include a detailed description of the model architecture and performance.