Lec01 - Intro To Computer Vision

This document provides an overview of the Spring 2016 CS543/ECE549 Computer Vision course at the University of Illinois. The course will cover early computer vision topics like image formation and processing, mid-level vision including grouping and fitting, multi-view geometry, recognition, and additional topics if time allows. The goal of computer vision is to extract meaning from pixels by interpreting various depth, shape, grouping, and other cues that reveal the structure of the visual world. While computer vision has achieved successes in areas like faces, age progression, digital puppetry, and reconstruction, it remains a challenging field due to factors such as viewpoint and illumination variation, scale changes, object deformation, occlusion, motion, and ambiguity.

Uploaded by

ikhsan

Spring 2016 CS543 / ECE549

Computer Vision

Course webpage URL: http://slazebni.cs.illinois.edu/spring16/


The goal of computer vision
• To extract “meaning” from pixels

What we see vs. what a computer sees


Source: S. Narasimhan
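As a rough illustration of "what a computer sees" (a minimal sketch; the 4x4 grid, its values, and the helper names are made up for illustration and are not taken from the slide), a grayscale image is just a grid of intensity numbers:

```python
# A tiny hypothetical 4x4 grayscale image: 0 = black, 255 = white.
image = [
    [12, 15, 200, 210],
    [10, 18, 205, 220],
    [11, 14, 198, 215],
    [13, 16, 202, 212],
]


def mean_intensity(img):
    """Average brightness over all pixels."""
    values = [v for row in img for v in row]
    return sum(values) / len(values)


def column_means(img):
    """Average brightness of each column (zip(*img) transposes the grid)."""
    return [sum(col) / len(col) for col in zip(*img)]


# Print the raw numbers, which is all the program has to work with.
for row in image:
    print(" ".join(f"{v:3d}" for v in row))
print("column means:", column_means(image))
```

A person looking at the rendered picture would see a dark region next to a bright one. The program only has the numbers, and the jump in the column means is the closest thing it has to "seeing" that boundary.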
The goal of computer vision
• To extract “meaning” from pixels

Humans are remarkably good at this…

Source: “80 million tiny images” by Torralba et al.


What kind of information can be
extracted from an image?
• Semantic information (labels from the annotated example image): tree, roof, sky, chimney, building, window, door, trashcan, car, person, ground; scene: outdoor, city, European
• Geometric information


Why study computer vision?
• Vision is useful
• Vision is interesting
• Vision is difficult
• Half of primate cerebral cortex is devoted to visual processing
• Achieving human-level image understanding is probably “AI-complete”
Successes of computer vision to date
“Simple” patterns
Faces
Face movies

I. Kemelmacher-Shlizerman, E. Shechtman, R. Garg and S. Seitz, Exploring Photobios, SIGGRAPH 2011

YouTube Video
Automatic age progression

I. Kemelmacher-Shlizerman, S. Suwajanakorn, and S. Seitz, Illumination-Aware Age Progression, CVPR 2014

YouTube Video
Digital puppetry

S. Suwajanakorn, S. Seitz, and I. Kemelmacher-Shlizerman, What Makes Tom Hanks Look Like Tom Hanks, ICCV 2015

YouTube Video
Reconstruction: 3D from photo collections

Q. Shan, R. Adams, B. Curless, Y. Furukawa, and S. Seitz, The Visual Turing Test for Scene Reconstruction, 3DV 2013

YouTube Video
Reconstruction: 4D from photo collections

R. Martin-Brualla, D. Gallup, and S. Seitz, Time-Lapse Mining from Internet Photos, SIGGRAPH 2015

YouTube Video
Reconstruction: 4D from depth cameras

R. Newcombe, D. Fox, and S. Seitz, DynamicFusion: Reconstruction and Tracking of Non-rigid Scenes in Real-Time, CVPR 2015

YouTube Video
Recognition

• Computer Eyesight Gets a Lot More Accurate, NY Times Bits blog, August 18, 2014
• Building A Deeper Understanding of Images, Google Research Blog, September 5, 2014
• Baidu caught gaming recent supercomputer performance test, Engadget, June 3, 2015
Self-driving cars

http://www.nytimes.com/2016/01/18/technology/driverless-cars-limits-include-human-nature.html
Why is computer vision difficult?
Challenges: viewpoint variation
Challenges: illumination

image credit: J. Koenderink


Challenges: scale

slide credit: Fei-Fei, Fergus & Torralba


Challenges: deformation

Xu, Beihong 1943

slide credit: Fei-Fei, Fergus & Torralba


Challenges: object intra-class variation

slide credit: Fei-Fei, Fergus & Torralba


Challenges: occlusion, clutter

Image source: National Geographic


Challenges: motion
Challenges: ambiguity

slide credit: Fei-Fei, Fergus & Torralba


Challenges: ambiguity
• Many different 3D scenes could have given rise to a particular 2D picture
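One way to make this ambiguity concrete is an idealized pinhole camera (a minimal sketch; the pinhole model, the `project` helper, and the sample points are assumptions for illustration and are not from the slides). Every 3D point along the same ray through the camera center lands on the same image location, so a small nearby object and a large distant one can be indistinguishable:

```python
def project(point, focal_length=1.0):
    """Idealized pinhole projection of a 3D point (X, Y, Z), Z > 0, to image (x, y)."""
    X, Y, Z = point
    return (focal_length * X / Z, focal_length * Y / Z)


# Hypothetical points: the second is the first scaled by 4, i.e. four times
# farther away along the same viewing ray.
near_point = (1.0, 0.5, 2.0)
far_point = tuple(4.0 * c for c in near_point)  # (4.0, 2.0, 8.0)

print(project(near_point))
print(project(far_point))
```

Both calls give the same image coordinates, because scaling (X, Y, Z) by a common factor cancels in X/Z and Y/Z. Depth is lost in a single projection.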
Challenges or opportunities?
• Images are confusing, but they also reveal the structure of the world through numerous cues
• Our job is to interpret the cues!
Depth cues: Linear perspective
Depth cues: Parallax
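Why parallax carries depth information can be sketched with two idealized pinhole cameras displaced sideways by a baseline (an assumed setup for illustration, with hypothetical helper names; the slides only name the cue). The image shift of a point between the two views is inversely proportional to its depth, so nearby points shift more than distant ones:

```python
def image_x(X, Z, camera_x=0.0, focal_length=1.0):
    """Horizontal image coordinate of a point at (X, Z) seen from a camera at camera_x."""
    return focal_length * (X - camera_x) / Z


def depth_from_disparity(disparity, baseline, focal_length=1.0):
    """Invert disparity = focal_length * baseline / Z to recover the depth Z."""
    return focal_length * baseline / disparity


baseline = 0.5  # hypothetical sideways camera displacement
for Z in (2.0, 4.0, 8.0):
    # Shift of the same scene point between the two views.
    disparity = image_x(1.0, Z) - image_x(1.0, Z, camera_x=baseline)
    print(Z, disparity, depth_from_disparity(disparity, baseline))
```

Each doubling of the depth halves the shift, which is why parallax is a strong cue for nearby structure and a weak one for distant structure.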
Shape cues: Texture gradient
Shape and lighting cues: Shading

Michelangelo 1475-1564
slide credit: Fei-Fei, Fergus & Torralba


Grouping cues: Similarity (color, texture, proximity)
Grouping cues: “Common fate”

Image credit: Arthus-Bertrand (via F. Durand)


Origins of computer vision

L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.
Origins of computer vision

Source: Fei-Fei Li
Connections to other disciplines

Computer Vision is connected to:
• Artificial Intelligence
• Robotics
• Machine Learning
• Computer Graphics
• Cognitive science
• Neuroscience
• Image Processing
The computer vision industry
• Corporate sponsors of CVPR 2015:
Course overview
I. Early vision: Image formation and processing
II. Mid-level vision: Grouping and fitting
III. Multi-view geometry
IV. Recognition
V. Additional topics
I. Early vision
• Basic image formation and processing

• Linear filtering
• Edge detection
• Cameras and sensors
• Light and color
• Feature extraction, feature tracking
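To give a flavor of linear filtering and edge detection, here is a minimal 1D sketch (the signal, the kernels, and the `correlate1d` helper are illustrative assumptions, not course code). It uses cross-correlation, which slides the kernel without flipping it; true convolution flips the kernel first:

```python
def correlate1d(signal, kernel):
    """'Valid' cross-correlation: slide the kernel along the signal, no padding."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# One hypothetical image row: a dark region followed by a bright region.
row = [10, 10, 10, 200, 200, 200]

smoothed = correlate1d(row, [1 / 3, 1 / 3, 1 / 3])  # box filter: local average
edges = correlate1d(row, [-1, 0, 1])                # central difference: responds to change

print(smoothed)
print(edges)
```

The box filter blurs the step, while the difference filter is zero in the flat regions and large where the intensity jumps. That is the basic idea behind derivative-based edge detectors.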


II. “Mid-level vision”
• Fitting and grouping

• Fitting: Least squares
• Hough transform
• RANSAC
• Alignment
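As a preview of robust fitting, the following is a minimal RANSAC-style sketch for fitting a line y = m*x + b (the data, threshold, iteration count, and function names are all assumptions for illustration; this is not the course's reference implementation). It repeatedly fits a line to two randomly chosen points and keeps the line that the most points agree with:

```python
import random


def fit_line(p, q):
    """Line y = m*x + b through two points with different x coordinates."""
    (x1, y1), (x2, y2) = p, q
    m = (y2 - y1) / (x2 - x1)
    return m, y1 - m * x1


def ransac_line(points, iterations=200, threshold=0.5, seed=0):
    """Return the two-point line with the most inliers, plus those inliers."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best_model, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)
        if p[0] == q[0]:
            continue  # a vertical line cannot be written as y = m*x + b
        m, b = fit_line(p, q)
        inliers = [(x, y) for x, y in points if abs(y - (m * x + b)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (m, b), inliers
    return best_model, best_inliers


# Hypothetical data: points on y = 2x + 1, with two gross outliers swapped in.
points = [(x, 2 * x + 1) for x in range(10)]
points[3] = (3, 20)
points[7] = (7, -5)

model, inliers = ransac_line(points)
print(model, len(inliers))
```

On this data the sketch recovers slope 2 and intercept 1 with 8 inliers, whereas a plain least-squares fit over all ten points would be pulled toward the outliers. A fuller version would usually refit the line to all inliers by least squares at the end; that step is omitted here.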
III. Multi-view geometry

• Epipolar geometry
• Stereo
• Structure from motion
• 3D Photography


IV. Recognition

• Instance recognition, large-scale alignment
• Image classification
• Object detection
• Deep learning
V. Additional Topics (time permitting)

• Segmentation
• Video
• RGBD images
• Images and text
