YOLO Versions

Uploaded by

Rithvik Kumar

YOLOv2 (Darknet-19 backbone, a VGG-style network)

- Anchor boxes
- Batch Normalization
* the five anchor box sizes come from k-means clustering on the training boxes;
all of them are applied on a single output grid
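
As a rough illustration of how anchor boxes are used in YOLOv2, the sketch below
decodes one raw network output into a box relative to its grid cell and anchor
prior. It follows the parameterisation described in the YOLOv2 paper (sigmoid on
the centre offsets, exponential on the size offsets); the function and argument
names are made up for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t, cell, anchor):
    """Decode raw offsets for one anchor in one grid cell.

    t      = (tx, ty, tw, th): raw network outputs
    cell   = (cx, cy): top-left corner of the grid cell, in grid units
    anchor = (pw, ph): prior width and height, in grid units
    Returns (bx, by, bw, bh) in grid units.
    """
    tx, ty, tw, th = t
    cx, cy = cell
    pw, ph = anchor
    bx = sigmoid(tx) + cx      # centre stays inside the cell
    by = sigmoid(ty) + cy
    bw = pw * math.exp(tw)     # anchor size is scaled, never negative
    bh = ph * math.exp(th)
    return bx, by, bw, bh


# All-zero offsets give a box centred in the cell with the anchor's own size.
print(decode_box((0.0, 0.0, 0.0, 0.0), cell=(3, 4), anchor=(2.0, 1.5)))
```

Because the sigmoid keeps the centre inside its cell, each anchor only has to
learn small corrections to a sensible starting shape.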

YOLOv3 (Darknet-53 backbone with ResNet-style residual connections)

- 53 convolutional layers in the backbone
- nine anchor boxes with varied scales and aspect ratios, three per output scale,
to better match the size and shape of the objects being detected
- multi-scale predictions in the style of "feature pyramid networks" (FPN)
- this improves detection performance on small objects, as the model is able to
see the objects at multiple scales
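
A small arithmetic sketch of what predicting at multiple scales means in practice.
It assumes a 416 by 416 input, output strides of 32, 16 and 8, and three anchor
boxes per scale, which are the commonly cited YOLOv3 settings; the function name
is hypothetical.

```python
def prediction_counts(input_size=416, strides=(32, 16, 8), anchors_per_scale=3):
    """Return (stride, grid size, number of predicted boxes) for each scale."""
    counts = []
    for stride in strides:
        grid = input_size // stride
        counts.append((stride, grid, grid * grid * anchors_per_scale))
    return counts


for stride, grid, boxes in prediction_counts():
    print(f"stride {stride:2d}: {grid} x {grid} grid, {boxes} boxes")
print("total boxes:", sum(boxes for _, _, boxes in prediction_counts()))
```

The finest grid (stride 8) contributes most of the boxes, which is where small
objects are picked up.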

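YOLOv4 (CSPDarknet53 backbone)

- Cross Stage Partial Network (CSPNet) connections in the backbone
- 54 convolutional layers
- uses k-means clustering to group the ground truth bounding boxes into clusters
and then uses the centroids of the clusters as the anchor boxes (as YOLOv2 and
YOLOv3 already did)
- CIoU loss for bounding box regression (GHM and focal loss, by contrast, are
classification losses designed to improve performance on imbalanced datasets)
- replaces the FPN-style neck of YOLOv3 with spatial pyramid pooling (SPP) plus a
PANet path-aggregation neck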
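
The anchor clustering step can be sketched as below. This is a minimal pure-Python
version, assuming boxes are given as (width, height) pairs and that each box is
assigned to the centroid with the highest IoU (equivalent to using 1 - IoU as the
distance, as proposed in the YOLOv2 paper). The initialisation is a deterministic
simplification chosen for this example, and all names are illustrative.

```python
def wh_iou(box, centroid):
    """IoU of two (width, height) boxes aligned at a common corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100):
    """Cluster (width, height) pairs and return k anchors sorted by area."""
    # Simplified initialisation: pick k boxes evenly spaced by area.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / k
    centroids = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda j: wh_iou(b, centroids[j]))
            clusters[best].append(b)
        new = []
        for j in range(k):
            if clusters[j]:
                w = sum(b[0] for b in clusters[j]) / len(clusters[j])
                h = sum(b[1] for b in clusters[j]) / len(clusters[j])
                new.append((w, h))
            else:
                new.append(centroids[j])  # keep an empty cluster's centroid
        if new == centroids:
            break
        centroids = new
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Toy ground-truth box sizes in pixels (made up for the example).
boxes = [(10, 12), (12, 10), (11, 11), (50, 60), (60, 50), (55, 55),
         (200, 100), (220, 110), (210, 105)]
print(kmeans_anchors(boxes, k=3))
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the
clustering.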

YOLOv5 (Ultralytics PyTorch implementation, CSP-based backbone)
- The original YOLO was trained on the PASCAL VOC dataset (20 object categories).
The released YOLOv5 weights are trained on COCO (80 object categories)
- automatic anchor selection: a clustering algorithm groups the ground truth
bounding boxes into clusters and the centroids of the clusters become the anchor
boxes. This allows the anchor boxes to be more closely aligned with the size and
shape of the objects in the training data
- spatial pyramid pooling (SPP) pools the feature map with several kernel sizes
and concatenates the results, which enlarges the receptive field
- YOLOv5 replaces the SPP block used in YOLOv4 with a faster variant (SPPF)
- CIoU loss (an IoU variant) for bounding box regression; it accounts for overlap,
centre distance and aspect ratio
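
The CIoU loss mentioned above can be written out in a few lines. The sketch below
is a plain-Python version of the published formula (IoU minus a centre-distance
penalty minus an aspect-ratio penalty), assuming boxes in (x1, y1, x2, y2) corner
format; the function name and the eps guard are choices made for this example,
not taken from any YOLO code base.

```python
import math


def ciou_loss(box_a, box_b, eps=1e-9):
    """Return 1 - CIoU for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1

    # Plain IoU.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = wa * ha + wb * hb - inter
    iou = inter / (union + eps)

    # Squared centre distance over squared diagonal of the enclosing box.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term.
    v = (4 / math.pi ** 2) * (math.atan(wb / (hb + eps))
                              - math.atan(wa / (ha + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - (iou - rho2 / c2 - alpha * v)


target = (0, 0, 10, 10)
print(ciou_loss(target, (5, 0, 15, 10)))    # overlapping prediction
print(ciou_loss(target, (20, 20, 30, 30)))  # disjoint prediction, larger loss
```

Unlike plain IoU, the loss still changes with distance when the boxes do not
overlap, so a far-off prediction receives a useful gradient.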

YOLOv6 (EfficientRep backbone)


- anchor-free detection head rather than dense anchor boxes
- aims for comparable accuracy with fewer parameters and higher computational
efficiency

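YOLOv7
- uses nine anchor boxes (three per output scale), which lets it cover a wide
range of object shapes and sizes and helps to reduce the number of false
positives
- focal loss, which down-weights easy examples so that training concentrates on
hard ones (it was introduced with RetinaNet and is not new to YOLOv7)
- processes images at a resolution of 640 by 640 pixels by default; a higher input
resolution helps to detect small objects
- high speed: the paper reports models spanning roughly 5 to 160 frames per
second, depending on model size and hardware
* two-stage detectors such as Faster R-CNN and Mask R-CNN can still be more
accurate in some settings
* struggles in crowded scenes and with small objects that are far from the camera
or appear at very different scales
* can be inconvenient to deploy in real-world applications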
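
For reference, focal loss for a single binary prediction can be sketched as
follows. It uses the standard form -alpha_t * (1 - p_t)^gamma * log(p_t) with the
defaults alpha = 0.25 and gamma = 2 from the original focal loss paper; the
function name and the clamping constant are assumptions made for this example.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Focal loss for predicted probability p and binary label y (0 or 1)."""
    p = min(max(p, eps), 1.0 - eps)         # avoid log(0)
    p_t = p if y == 1 else 1.0 - p          # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confident correct prediction contributes far less than an uncertain one.
print(focal_loss(0.9, 1))
print(focal_loss(0.6, 1))
```

With gamma set to 0 the modulating factor disappears and the expression reduces
to alpha-weighted cross-entropy.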

YOLOv8
- YOLOv8 comes with a new API that makes training and inference much easier on
both CPU and GPU devices; the framework also supports previous YOLO versions
- At the time of writing, the developers were still working on a scientific paper
with a detailed description of the model architecture and performance.
