Fundamentals of Computer Vision

Lecture 1: Computer Vision

9/5/2018

CS 495: Fundamentals of Computer Vision

S. Asif Mahmood Gilani


• Ph. D., Digital Image Processing, 2002

• Faculty of CS in NU FAST, Lahore

• Faculty of CS & Engg. GIKI

• Govt. College Lahore

• Research interests:
– Computer/machine vision
– Digital Image Processing
– computer graphics


• Office hours:
Mon & Wed, 10:00 am to 11:00 am,
or by appointment (M-014, ground floor)

• Textbook:
Computer Imaging: Digital Image Processing and Analysis (Nov. 2010)

• Class material:
SLATE

This week

• Introduction to computer vision


• Course requirements
• Course overview (Broad)


Why study computer vision?

• Vision is useful

• Vision is interesting

• Vision is difficult
– Half of primate cerebral cortex is devoted to visual
processing
– Achieving human-level visual perception is
probably “AI-complete”

Why study computer vision…


• Images and videos are everywhere!

– Personal photo albums
– Movies, news, sports
– Surveillance and security
– Medical and scientific images


Origins of computer vision

L. G. Roberts, Machine Perception of Three Dimensional Solids,
Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

Connections to other disciplines

Computer Vision is connected to:
• Artificial Intelligence
• Machine Learning
• Robotics
• Computer Graphics
• Image Processing
• Psychology
• Neuroscience


Applications of computer vision

• Factory inspection
• Reading license plates, checks, ZIP codes
• Monitoring for safety (Poseidon)
• Surveillance
• Autonomous driving, robot navigation
• Driver assistance (collision warning, lane departure warning, rear object detection)
Sources: K. Grauman, L. Fei-Fei

Applications of computer vision

• Assistive technologies
• Entertainment (Sony EyeToy)
• Movie special effects
• Digital cameras (face detection for setting focus, exposure)
• Visual search (MSR Lincoln)
Sources: K. Grauman, L. Fei-Fei


Commercial products integrate tidbits of computer vision

• Most digital cameras now detect faces


– Canon, Sony, Fuji, …

Smile detection?

Sony Cyber-shot® T70 Digital Still Camera


Object recognition (in supermarkets)

LaneHawk by Evolution Robotics
“A smart camera is flush-mounted in the checkout lane, continuously
watching for items. When an item is detected and recognized, the cashier
verifies the quantity of items that were found under the basket, and
continues to close the transaction. The item can remain under the basket,
and with LaneHawk, you are assured to get paid for it…”

Login without a password…

• Fingerprint scanners on many new laptops, other devices
• Face recognition systems now beginning to appear more widely
http://www.sensiblevision.com/en-us/products/forhome/overview.aspx/


Object recognition (in mobile phones)

• This is becoming real:


– Microsoft Research
– Point & Find, Nokia
– SnapTell.com (now Amazon)

SnapTell
http://download.cnet.com/ios/snaptell/3260-20_4-6312649-1.html


Special effects: shape capture

The Matrix movies, ESC Entertainment, XYZRGB, NRC

Special effects: motion capture

Pirates of the Caribbean, Industrial Light and Magic


Sports

Sportvision first down line


Nice explanation on www.howstuffworks.com

Smart cars

• Mobileye
– Vision systems currently in high-end BMW, GM, Volvo models
– By 2010: 70% of car manufacturers.
– Video demo

Slide content courtesy of Amnon Shashua

Vision-based interaction (and games)

Digimask: put your face on a 3D avatar.

Nintendo Wii has camera-based IR tracking built in. See Lee’s work at
CMU on clever tricks on using it to create a multi-touch display!

“Game turns moviegoers into Human Joysticks”, CNET
Camera tracking a crowd, based on this work.


Vision in space

NASA's Mars Exploration Rover Spirit captured this westward view from atop
a low plateau where Spirit spent the closing months of 2007.

Vision systems (JPL) used for several tasks


• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
• For more, read “Computer Vision on Mars” by Matthies et al.

Medical imaging

• Image-guided surgery (Grimson et al., MIT)
• 3D imaging (MRI, CT)


Applications of computer vision


• For more information on the computer vision industry:
http://www.cs.ubc.ca/spider/lowe/vision.html

The goal of computer vision


• To perceive the “world behind the picture”
153 156 148 152 149 147 139 146 142 150 146 144 137 125 120 119 136 146 151 164 172 175 183 188 196 200 205 208 214 214 219 217
159 151 150 148 140 138 139 129 119 104 86 82 89 97 107 115 118 130 128 132 128 144 160 168 179 188 200 208 213 220 212 214
149 146 153 147 147 146 132 99 73 78 87 96 105 120 138 151 145 157 163 171 165 161 146 126 157 184 190 201 215 212 214 214
145 150 154 148 148 126 93 67 72 78 96 107 117 127 131 134 127 154 166 167 183 194 200 195 143 140 175 190 197 203 206 207
151 153 151 147 120 85 67 75 84 83 94 92 81 78 78 91 83 117 126 144 178 200 201 203 208 175 127 159 185 196 195 206
146 144 139 123 79 66 74 83 79 69 64 62 58 50 46 54 54 66 60 80 86 108 141 191 184 200 187 123 144 175 198 199
135 130 115 87 64 77 90 79 78 85 81 63 55 57 56 53 70 62 61 68 59 58 84 105 168 194 196 183 131 151 185 197
128 116 92 71 82 94 103 101 83 101 88 66 70 90 80 42 39 53 88 73 76 82 116 87 97 144 188 195 190 166 171 203
135 120 84 83 108 127 135 115 100 92 79 49 85 74 59 0 0 0 50 69 52 79 157 141 100 84 136 187 206 204 189 200
144 103 91 115 139 147 127 91 87 80 72 44 61 84 25 0 0 0 50 181 45 69 142 164 167 113 93 130 193 199 208 203
139 102 123 143 137 131 109 85 93 84 68 47 77 86 31 0 3 0 51 156 53 75 141 169 199 151 171 108 143 181 199 208
141 135 153 142 114 104 97 97 83 98 77 42 77 96 79 21 0 23 58 46 56 77 155 199 212 161 194 193 164 187 202 205
160 172 164 141 128 112 98 95 100 96 91 73 68 86 75 73 64 65 54 69 77 115 190 212 193 181 174 188 210 194 202 207
179 189 160 140 139 116 97 97 108 103 110 99 75 80 72 83 50 55 54 95 98 174 205 185 179 188 185 190 193 217 217 224
189 183 152 130 121 105 105 117 114 108 107 115 110 81 85 85 87 81 81 124 183 202 175 180 178 171 173 204 225 215 219 225
178 161 149 135 120 115 122 129 137 145 131 121 125 115 109 91 92 111 132 159 173 170 184 176 184 190 191 217 210 226 228 223
187 159 139 127 125 115 118 121 121 131 133 134 140 137 134 139 140 152 141 154 170 163 195 194 176 198 216 209 219 224 223 226
185 164 140 122 116 110 109 108 113 118 115 116 123 127 135 148 154 162 165 170 171 160 183 198 201 210 223 216 221 222 221 226
188 175 150 130 118 117 113 110 108 115 117 123 130 132 138 150 157 158 174 182 189 186 198 221 224 221 227 221 223 218 218 222
187 179 158 141 124 127 125 127 126 129 130 135 139 141 150 165 175 172 185 195 207 210 212 226 229 222 224 224 223 218 219 221
188 184 172 159 138 135 135 143 143 143 144 146 145 147 160 174 184 191 199 207 211 213 217 224 227 223 223 221 221 218 224 223
192 191 187 174 153 139 140 147 146 149 157 162 160 159 165 174 181 198 201 210 212 216 223 224 225 225 220 215 217 215 224 224
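
To a computer, the picture is only the grid of numbers above. As a minimal sketch, the Python below copies a 4x5 crop of that grid (rows 9 to 12, columns 15 to 19) and pulls out the dark blob with a simple threshold. The 8-bit grayscale interpretation (0 = black, 255 = white), the threshold of 30, and the helper names are assumptions made for illustration, not part of the lecture.

```python
# A 4x5 crop (rows 9-12, columns 15-19) of the intensity grid above.
# Assumption: values are 8-bit gray levels, 0 = black and 255 = white.
patch = [
    [59, 0, 0, 0, 50],
    [25, 0, 0, 0, 50],
    [31, 0, 3, 0, 51],
    [79, 21, 0, 23, 58],
]


def dark_mask(image, threshold):
    """Mark pixels darker than `threshold` with 1, all others with 0."""
    return [[1 if value < threshold else 0 for value in row] for row in image]


def mean_intensity(image):
    """Average gray level over all pixels of the image."""
    values = [value for row in image for value in row]
    return sum(values) / len(values)


# The threshold of 30 is an arbitrary choice for this illustration.
mask = dark_mask(patch, threshold=30)
for row in mask:
    print(row)
print("mean intensity:", mean_intensity(patch))
```

The mask says where the dark pixels are, but nothing about what they depict; getting from numbers like these to "the world behind the picture" is the actual problem.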

• What exactly does this mean?


– Vision as a source of metric 3D information
– Vision as a source of semantic information


Vision as measurement device
• Real-time stereo (NASA Mars Rover)
• Structure from motion (Pollefeys et al.)
• Multi-view stereo for community photo collections (Goesele et al.)

• Amazing success story!

…but why do Learning for Vision?

• “What if I don’t care about this wishy-washy recognition stuff?
I just want to make my robot go!”

• Small Reason:
– For measurement, other sensors are often better (in
DARPA Grand Challenge, vision was barely used!)
– For navigation, you still need to learn!

• Big Reason:
– The goals of computer vision (what + where) are
in terms of what humans care about.

Slide credit: A. Efros


Vision as a source of semantic information

slide credit: Fei-Fei, Fergus & Torralba

Object categorization

[Figure: street scene with labeled object categories: sky, building, flag, face, banner, wall, street lamp, bus, cars]
slide credit: Fei-Fei, Fergus & Torralba


Scene and context categorization


• outdoor
• city
• traffic
•…

slide credit: Fei-Fei, Fergus & Torralba

Qualitative spatial information

[Figure: scene annotated with qualitative spatial labels: slanted, vertical, horizontal, rigid moving object, non-rigid moving object]
slide credit: Fei-Fei, Fergus & Torralba


Challenges: viewpoint variation

Michelangelo 1475-1564
slide credit: Fei-Fei, Fergus & Torralba

Challenges: illumination

image credit: J. Koenderink


Challenges: scale

slide credit: Fei-Fei, Fergus & Torralba

Challenges: deformation

Xu, Beihong 1943

slide credit: Fei-Fei, Fergus & Torralba


Challenges: occlusion

Magritte, 1957
slide credit: Fei-Fei, Fergus & Torralba

Challenges: background clutter


Challenges: object intra-class variation

slide credit: Fei-Fei, Fergus & Torralba

Challenges: local ambiguity

slide credit: Fei-Fei, Fergus & Torralba


Challenges or opportunities?
• Images are confusing, but they also reveal the structure of
the world through numerous cues
• Our job is to interpret the cues! (e.g. Texture for ICR query)

Image source: J. Koenderink

Depth cues: Linear perspective


Depth cues: Aerial perspective

Shape cues: Texture gradient


Shape and lighting cues: Shading

Position and lighting cues: Cast shadows

Source: J. Koenderink


Grouping cues: Similarity (color, texture, proximity)

Grouping cues: “Common fate”

Image credit: Arthus-Bertrand (via F. Durand)


Bottom line
• Perception is an inherently ambiguous problem
– Many different 3D scenes could have given rise to a
particular 2D picture

• Possible solutions
– Bring in more constraints (more images)
– Use prior knowledge about the structure of the world
• Need both exact measurements and statistical inference!
Image source: F. Durand
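
The ambiguity, and the "more images" remedy, can be sketched in a few lines of Python. The sketch assumes an idealized pinhole camera looking down the Z axis; the function name `project`, the parameter `camera_x`, and the example points are hypothetical and chosen only for illustration.

```python
def project(point, camera_x=0.0, focal_length=1.0):
    """Idealized pinhole projection of a 3D point (X, Y, Z).

    Assumption: the camera looks down the Z axis and sits at (camera_x, 0, 0).
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * (x - camera_x) / z, focal_length * y / z)


# Two different 3D points on the same viewing ray of the first camera.
near_point = (1.0, 0.5, 2.0)
far_point = (4.0, 2.0, 8.0)

# One image: both points land on the same image location.
first_view_near = project(near_point)
first_view_far = project(far_point)
print(first_view_near == first_view_far)

# A second image from a camera shifted sideways separates them.
second_view_near = project(near_point, camera_x=1.0)
second_view_far = project(far_point, camera_x=1.0)
print(second_view_near, second_view_far)
```

A single view cannot tell the two points apart, while the shifted second view can. Prior knowledge plays the same role when only one image is available.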


I. Early vision
• Basic image formation and processing

– Cameras and sensors
– Light and color
– Linear filtering
– Edge detection
– Feature extraction: corner and blob detection
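
Linear filtering and edge detection from the list above can be previewed with a small pure-Python sketch: a kernel slides over the image and each output value is a weighted sum of the pixels under it. The function name `correlate2d_valid`, the toy step-edge image, and the choice of a Sobel kernel are assumptions for illustration; the course may use different filters and tools (for example MATLAB).

```python
def correlate2d_valid(image, kernel):
    """Slide `kernel` over `image` and return the weighted sums.

    Only positions where the kernel fits entirely inside the image are
    computed, so the output is smaller than the input. This is
    cross-correlation; the kernel is not flipped as in true convolution.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += kernel[i][j] * image[r + i][c + j]
            row.append(total)
        output.append(row)
    return output


# Toy image: a dark region (10) next to a bright region (50).
step_edge = [[10, 10, 10, 50, 50, 50] for _ in range(4)]

# Sobel kernel that responds to left-to-right intensity changes.
sobel_x = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

response = correlate2d_valid(step_edge, sobel_x)
for row in response:
    print(row)
```

The response is zero in the flat regions and large where the window straddles the jump from 10 to 50, which is the basic idea behind gradient-based edge detection.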

II. “Mid-level vision”


• Fitting and grouping

– Alignment
– Fitting: least squares, Hough transform, RANSAC
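
As a rough preview of two of the fitting methods named above, the Python sketch below fits a line y = a*x + b with ordinary least squares and then with a basic RANSAC loop. The function names, the synthetic points, the tolerance, and the iteration count are all assumptions for illustration. The Hough transform is not shown.

```python
import random


def fit_line_least_squares(points):
    """Fit y = a*x + b by minimizing the sum of squared vertical residuals."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all x values are equal; y = a*x + b is undefined")
    a = sxy / sxx
    return a, mean_y - a * mean_x


def ransac_line(points, n_iterations=100, inlier_tolerance=0.5, seed=0):
    """Repeatedly fit a line to 2 random points and keep the largest inlier set."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical candidate, not representable as y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = [(x, y) for x, y in points
                   if abs(y - (a * x + b)) <= inlier_tolerance]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no usable line hypothesis found")
    # Final refinement: least squares on the inliers only.
    return fit_line_least_squares(best_inliers), best_inliers


# Synthetic data: ten points on y = 2x + 1 plus two gross outliers.
inlier_points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
outlier_points = [(2.0, 30.0), (7.0, -20.0)]
data = inlier_points + outlier_points

a_all, b_all = fit_line_least_squares(data)
(a_ransac, b_ransac), kept = ransac_line(data)
print("least squares on all points:", a_all, b_all)
print("RANSAC + least squares:", a_ransac, b_ransac, "inliers:", len(kept))
```

Plain least squares is pulled far from the true slope by the two outliers, while RANSAC discards them before the final fit. This is why both exact measurement and robust estimation show up in the course.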


III. Multi-view geometry

– Stereo
– Epipolar geometry
– Affine structure from motion (Tomasi & Kanade, 1993)
– Projective structure from motion: here be dragons!

IV. Recognition

– Patch description and matching
– Clustering and visual vocabularies
– Bag-of-features models
– Classification
Sources: D. Lowe, L. Fei-Fei


V. Advanced Topics
• Some covered in class, others in term projects…
– Segmentation
– Face detection
– Articulated models
– Motion and tracking

Course requirements
• Philosophy: computer vision is best experienced hands-on

• Quizzes & programming assignments: 20%
– Three or four quizzes and assignments (surprise quizzes)
– Expect the first one in a couple of weeks
– Brush up on your MATLAB skills (see web/slate page for tutorial)

• Final project:
– Putting several pieces together
– List of options will be posted in the next few weeks (some great
ideas can be found on the web)
– Expect to commit to a project idea by the end of Aug/September

• Participation: ?%
– Ask questions
– Answer questions
– Give me feedback: I’m learning too!

• Mid & Final Terms: 80%


Collaboration policy

• Feel free to discuss assignments with each other, but coding must be done individually

• Feel free to incorporate code or tips you find on the Web, provided this doesn’t make the
assignment trivial and you explicitly acknowledge your sources

• Remember: I can Google too!

• Homework: MATLAB tutorial (self-study, not collected)
• Reading: cameras and image formation (WEB)
