DINOv2 Presentation

DINOv2 is a self-supervised learning model by Meta AI that excels in extracting robust visual features from unlabeled images using Vision Transformer architecture, achieving 86.5% Top-1 accuracy in image classification. It addresses challenges in self-supervised learning by improving generalization and efficiency while also ensuring demographic fairness in its datasets. The model has diverse applications in fields such as autonomous systems, medical imaging, and AR/VR, paving the way for future advancements in AI.

Presentation on DINOv2

Learning Robust Visual Features without Supervision
Presented by: Drashti Bhavsar
Company: LOTI AI, Inc.
Abstract
• DINOv2 is a state-of-the-art self-supervised learning model developed by Meta AI for computer vision. It uses a Vision Transformer (ViT) architecture to extract robust visual features from unlabeled images, enabling tasks such as image classification, depth estimation, and semantic segmentation without task-specific fine-tuning.
Literature Review
• DINOv2 achieves exceptional performance compared to prior models:
• OpenCLIP (ViT-G/14): 86.2% Top-1 accuracy with weak supervision.
• iBOT (ViT-L/16): 82.3% Top-1 accuracy with self-supervision.
• DINOv2 (ViT-g/14): 86.5% Top-1 accuracy with self-supervision.
Methodology
• DINOv2 employs the following key techniques:
• Vision Transformers (ViTs) for patch-based feature extraction.
• Self-supervised learning with curated datasets and clustering.
• Knowledge distillation to transfer features from large to smaller models.
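The patch-based feature extraction above can be sketched in a few lines. This is a minimal, illustrative NumPy version of the first ViT step (splitting an image into non-overlapping patches and linearly projecting them to token embeddings), not DINOv2's actual implementation; the function name `patchify`, the 384-dim embedding size, and the random projection are assumptions for the demo.

```python
import numpy as np

def patchify(image: np.ndarray, patch: int = 14) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping, flattened patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image size must be divisible by patch size"
    patches = image.reshape(h // patch, patch, w // patch, patch, c)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)
    return patches

rng = np.random.default_rng(0)
img = rng.standard_normal((224, 224, 3))    # toy "image"
tokens = patchify(img)                       # (256, 588): 16x16 patches of 14*14*3 values
proj = rng.standard_normal((588, 384))       # toy linear projection (learned in a real ViT)
embeddings = tokens @ proj                   # (256, 384): input tokens for the transformer
```

A ViT-*/14 model like DINOv2's backbones uses exactly this 14-pixel patch grid; at 224×224 input that yields 256 patch tokens.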
Data Processing Pipeline
• The pipeline combines curated and uncurated datasets to create a diverse training set:
• Curated images are mapped to embeddings.
• Uncurated images are deduplicated and matched with curated images.
• Clustering ensures balanced representation across groups.
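The deduplicate-then-match step can be sketched with cosine similarity on embeddings. This is a toy NumPy illustration of the idea, not the actual DINOv2 curation pipeline (which operates at web scale with approximate nearest-neighbour search); the function names and thresholds here are assumptions.

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def dedup_and_match(uncurated, curated, dup_thresh=0.95, match_thresh=0.5):
    """Drop near-duplicate uncurated embeddings, then keep only those close
    to at least one curated embedding (cosine similarity)."""
    u, c = l2_normalize(uncurated), l2_normalize(curated)
    kept = []
    for i in range(len(u)):
        # keep i only if it is not a near-duplicate of anything already kept
        if all(float(u[i] @ u[j]) < dup_thresh for j in kept):
            kept.append(i)
    sims = u[kept] @ c.T                     # cosine similarity to the curated set
    return [kept[i] for i in range(len(kept)) if sims[i].max() > match_thresh]

curated = np.array([[1.0, 0.0], [0.0, 1.0]])
uncurated = np.array([[1.0, 0.01], [1.0, 0.011], [0.3, -1.0]])
matched = dedup_and_match(uncurated, curated)
# index 1 is a near-duplicate of index 0; index 2 matches nothing curated
```

Only index 0 survives both filters, which mirrors how the pipeline retains uncurated images that resemble the curated distribution without inflating it with duplicates.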
Key Results
• DINOv2 demonstrates significant improvements across tasks:
• Image Classification: 86.5% Top-1 accuracy on ImageNet-1k.
• Depth Estimation: RMSE improved from 0.358 to 0.279.
• Semantic Segmentation: 53.1 mIoU on ADE20k.
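For reference, the three metrics quoted above can be computed as follows. This is a generic NumPy sketch of the standard definitions (Top-1 accuracy, root-mean-square error, mean intersection-over-union), not code from the DINOv2 evaluation suite.

```python
import numpy as np

def top1_accuracy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of samples where the highest-scoring class is the true label."""
    return float((logits.argmax(axis=1) == labels).mean())

def rmse(pred: np.ndarray, target: np.ndarray) -> float:
    """Root-mean-square error, as used for depth estimation."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Mean intersection-over-union across classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))
```

Lower is better for RMSE (hence 0.358 → 0.279 is an improvement); higher is better for Top-1 accuracy and mIoU.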
Visualizations
• Qualitative examples highlight DINOv2's capabilities:
• Semantic segmentation results with frozen features.
• Depth estimation with smoother predictions.
• Principal Component Analysis (PCA) of patch features.
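The PCA visualization works by projecting each patch token onto the top principal components and mapping them to colour channels. A minimal sketch of that projection, assuming NumPy and a toy feature matrix (the real visualization uses features from the frozen DINOv2 backbone):

```python
import numpy as np

def pca_project(features: np.ndarray, k: int = 3) -> np.ndarray:
    """Project patch features onto their top-k principal components via SVD."""
    centered = features - features.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T               # (num_patches, k)

rng = np.random.default_rng(0)
feats = rng.standard_normal((256, 384))      # 256 patch tokens, 384-dim features
rgb = pca_project(feats)                     # (256, 3): one principal component per colour
```

Reshaping `rgb` back to the 16×16 patch grid and normalizing to [0, 1] yields the characteristic false-colour images where the first components separate foreground objects from background.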
Applications
• DINOv2 finds applications in various domains:
• Autonomous Systems: Navigation and object detection.
• Medical Imaging: Accurate semantic segmentation.
• AR/VR: Enhanced 3D reconstruction and depth estimation.
Fairness and Bias Analysis
• DINOv2 addresses demographic fairness:
• Geographical Fairness: The evaluation dataset includes images from 54 countries.
• Bias Mitigation: Analysis across skin tones, genders, and age groups.
Challenges Addressed
• DINOv2 overcomes key limitations in self-supervised learning:
• Improved generalization across tasks.
• Efficient training at scale with optimized pipelines.
• High-quality feature learning from diverse datasets.
Conclusion
• DINOv2 represents a milestone in self-supervised learning for computer vision:
• State-of-the-art results across benchmarks.
• Scalable architecture for diverse applications.
• Paves the way for future advancements in AI.
Future Directions
• Potential areas for enhancing DINOv2:
• Extending to multimodal domains such as video, text, and audio.
• Optimizing for mobile and embedded systems.
• Hybrid models combining self-supervision with domain-specific fine-tuning.
Thank You
• Contact: [email protected]
• LinkedIn: linkedin.com/in/drashti-bhavsar
