Note) Slides and example code are available:

https://github.com/mint-lab/3dv_tutorial

Introduction to 3D Vision


Sunglok Choi, Assistant Professor, Ph.D.


Computer Science and Engineering Department, SEOULTECH
[email protected] | https://mint-lab.github.io/
An Invitation
Introduction to 3D Vision
: A Tutorial for Everyone

What is Computer Vision?

▪ Computer vision is an interdisciplinary field that deals with how computers can be made to gain high-level understanding from digital images or videos.
▪ From the perspective of engineering, it seeks to automate tasks that the human visual system can do.[1][2][3]
▪ "Computer vision is concerned with the automatic extraction, analysis and understanding of useful information from a single image or a sequence of images. It involves the development of a theoretical and algorithmic basis to achieve automatic visual understanding."[9]

Reference: Wikipedia
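What is Computer Vision?

[Figure: diagram relating the real world, the (projected) image, and information (e.g. Tower of Pisa, shape, face, location). Computer vision (image understanding) maps an image to information, as a human does; computer graphics is the inverse relationship and produces a (synthesized) image; image processing maps an image to a (transformed) image/signal.]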
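The "inverse relationship" between graphics and vision in the diagram can be illustrated with a small sketch. It assumes a simple pinhole camera model, which the slide itself does not specify; the focal length, principal point, and function names below are hypothetical values chosen only for illustration. Projection (the graphics direction) discards depth, so the inverse (the vision direction) needs extra information such as a known depth.

```python
# Hypothetical pinhole parameters (not from the slides):
# focal length and principal point, all in pixels.
F, CX, CY = 1000.0, 320.0, 240.0


def project(point, f=F, cx=CX, cy=CY):
    """Graphics direction: map a 3D point (camera frame) to a pixel."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (f * x / z + cx, f * y / z + cy)


def back_project(pixel, depth, f=F, cx=CX, cy=CY):
    """Vision direction: recover a 3D point from a pixel, given its depth."""
    u, v = pixel
    return ((u - cx) * depth / f, (v - cy) * depth / f, depth)


# Two different 3D points on the same viewing ray.
near = (1.0, 0.5, 10.0)
far = (2.0, 1.0, 20.0)

pixel_near = project(near)
pixel_far = project(far)

# Both points land on the same pixel, so the image alone cannot tell them apart.
same_pixel = pixel_near == pixel_far

# With the depth supplied, the original point can be recovered.
recovered = back_project(pixel_near, depth=10.0)

print(pixel_near, pixel_far, same_pixel, recovered)
```

Because the two points share a pixel, a single image is not enough to recover the 3D point without additional information, which is one way to read why the vision direction is harder than the graphics direction.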
What is Computer Vision?

Computer Vision

▪ What is it?
– Label (e.g. Tower of Pisa)
– Shape (e.g. [figure])
▪ Where am I?
– Place (e.g. Piazza del Duomo, Pisa, Italy)
– Location (e.g. (84, 10, 18) [m])

What is 3D Vision?

▪ Related names: Visual Geometry, Multiple View Geometry, Geometric Vision
▪ Computer Vision
– What is it? Label (e.g. Tower of Pisa), Shape (e.g. [figure])
– Where am I? Place (e.g. Piazza del Duomo, Pisa, Italy), Location (e.g. (84, 10, 18) [m])
▪ Recognition Problems vs. Reconstruction Problems
– Recognition: Stanford CS231n: CNN for Visual Recognition; example: YOLO v2 (2016)
– Reconstruction: Stanford CS231A: Computer Vision, From 3D Reconstruction to Recognition; example: ORB-SLAM2 (2016)


What is 3D Vision?

▪ 3D Vision vs. 3D Data Processing
– 3D Vision works on: image
• Perspective Camera
• Omni-directional Camera
– 3D Data Processing works on: depth image, range data, point cloud, polygon mesh, …
• RGB-D Camera (Stereo, Structured Light, ToF, Light Field)
• Range Sensor (LiDAR, RADAR)

What is 3D Vision?

▪ Reference books

What is 3D Vision?

▪ All example code is available at https://github.com/mint-lab/3dv_tutorial.

– Most examples are under 100 lines and are based on a recent OpenCV (> 3.0.0).
– Note) OpenCV (Open Source Computer Vision)

OpenCV v4.8.0 main modules:

• core. Core functionality
• imgproc. Image Processing
• imgcodecs. Image file reading and writing
• videoio. Video I/O
• highgui. High-level GUI
• video. Video Analysis
• calib3d. Camera Calibration and 3D Reconstruction
• features2d. 2D Features Framework
• objdetect. Object Detection
• dnn. Deep Neural Network module
• ml. Machine Learning
• flann. Clustering and Search in Multi-Dimensional Spaces
• photo. Computational Photography
• stitching. Images stitching
• gapi. Graph API

OpenCV v5.0.0-pre main modules:

• core. Core functionality
• imgproc. Image Processing
• imgcodecs. Image file reading and writing
• videoio. Video I/O
• highgui. High-level GUI
• video. Video Analysis
• 3d. 3d
• stereo. Stereo Correspondence
• features2d. 2D Features Framework
• calib. Camera Calibration
• objdetect. Object Detection
• dnn. Deep Neural Network module
• ml. Machine Learning
• flann. Clustering and Search in Multi-Dimensional Spaces
• photo. Computational Photography
• stitching. Images stitching
• gapi. Graph API
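The difference between the two module lists is easy to miss when reading them side by side. The sketch below simply compares the lists as they appear on this slide; it does not query an installed OpenCV, and the variable and function names are hypothetical.

```python
# Module names copied from the two lists on this slide.
OPENCV_4_8_0 = [
    "core", "imgproc", "imgcodecs", "videoio", "highgui", "video",
    "calib3d", "features2d", "objdetect", "dnn", "ml", "flann",
    "photo", "stitching", "gapi",
]
OPENCV_5_0_0_PRE = [
    "core", "imgproc", "imgcodecs", "videoio", "highgui", "video",
    "3d", "stereo", "features2d", "calib", "objdetect", "dnn", "ml",
    "flann", "photo", "stitching", "gapi",
]


def module_changes(old, new):
    """Return (removed, added) module names, preserving list order."""
    removed = [name for name in old if name not in new]
    added = [name for name in new if name not in old]
    return removed, added


removed, added = module_changes(OPENCV_4_8_0, OPENCV_5_0_0_PRE)
print("Only in v4.8.0:    ", removed)
print("Only in v5.0.0-pre:", added)
```

On these lists the only change is that `calib3d` no longer appears in v5.0.0-pre, while `3d`, `stereo`, and `calib` appear in its place; all other main modules are listed in both versions.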
Applications) Photo Browsing

▪ Photo Tourism (2006)

Reference: Snavely et al., Photo Tourism: Exploring Photo Collections in 3D, SIGGRAPH, 2006
Applications) 3D Reconstruction

▪ Building Rome in a Day (2009)

Reference: Agarwal et al., Building Rome in a Day, ICCV, 2009


Applications) Depth Estimation from Cellular Phones

▪ Structure from Small Motion (SfSM; 2015)

▪ Casual 3D Photography (2017)

Reference: Im et al., High Quality Structure from Small Motion for Rolling Shutter Cameras, ICCV, 2015
Reference: Hedman et al., Casual 3D Photography, SIGGRAPH Asia, 2017
Applications) Real-time Visual SLAM

▪ ORB-SLAM (2014)

Reference: Mur-Artal et al., ORB-SLAM: A Versatile and Accurate Monocular SLAM System, T-RO, 2015
Applications) Augmented Reality

▪ PTAM: Parallel Tracking and Mapping (2007)

Reference: Klein and Murray, Parallel Tracking and Mapping for Small AR Workspaces, ISMAR, 2007
Applications) Virtual Reality

▪ Oculus Quest (2019)

Image: TechSpot
Applications) Mixed Reality

▪ Microsoft HoloLens 2 (2019)


– Head tracking: 4 x visible light cameras

Image: SlashGear
Summary

▪ What is Computer Vision?


▪ What is 3D Vision?
– What? Recognition problem vs. Reconstruction problem
• Note) Generation problem vs. Reconstruction problem
– Why? Applications

Next Topics
▪ Single-view Geometry
▪ Two-view Geometry
▪ Solving Equations
▪ Finding Correspondence
▪ Multiple-view Geometry
▪ Bayesian Filtering
▪ Visual SLAM and Odometry