Lecture 01 Introduction

This document provides an overview of the field of computer vision including its goals, challenges, applications, and history. Computer vision aims to make computers understand images and videos by analyzing visual content to classify and detect objects, scenes, and activities. While significant progress has been made, fully achieving human-level perception remains an ongoing challenge.

Uploaded by

Ankush Jain


Computer Vision

Introduction
What is Computer Vision?
• Make computers understand images and videos.
– What kind of scene?
– Where are the cars?
– How far is the building?
– What are they doing?
– Why is this happening?
– What is important?
– What will I see?

Computer Vision and Nearby Fields
[Diagram. Fields shown: Digital Image Processing, Computational Photography, Computer Vision, Computer Graphics. Quantities shown: Images (2D), Geometry (3D), Shape, Photometry, Appearance.]
• Machine learning: Vision = Machine learning applied to visual data
Image Processing vs Computer Vision
• Image Processing
– Mostly concerned with image-to-image transformations
– Filtering
– Enhancement
– Compression
• Computer Vision
– Concerned with how images reflect the 3D world
– Filtering for feature extraction
– Enhancement for recognition/detection
– Compression that preserves geometric information in images
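The contrast between filtering as an image-to-image transformation and filtering for feature extraction can be sketched with one small routine. This is a minimal illustration, not taken from the lecture: the helper name `correlate3x3`, the toy image, and the choice of a box kernel and a Sobel kernel are all assumptions made for the example.

```python
def correlate3x3(image, kernel):
    """Slide a 3x3 kernel over a grayscale image given as a list of rows.

    Only positions where the kernel fits entirely inside the image are
    computed, so the output has 2 fewer rows and 2 fewer columns.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    acc += kernel[dr + 1][dc + 1] * image[r + dr][c + dc]
            out_row.append(acc)
        out.append(out_row)
    return out


# Box kernel: averages a neighborhood (a typical image-processing smoothing step).
BOX = [[1 / 9] * 3 for _ in range(3)]
# Sobel kernel: responds to horizontal intensity changes (a simple edge feature).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

# Toy image (assumed for illustration): dark on the left, bright on the right.
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]

smoothed = correlate3x3(image, BOX)      # still an image, just blurrier
edges = correlate3x3(image, SOBEL_X)     # large values mark the vertical edge
```

The same sliding-window machinery serves both purposes. Only the kernel and what is done with the output differ: the smoothed result is shown to a person, while the edge response is fed to a later recognition or detection step.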
Visual data on the Internet
• Flickr
– 10+ billion photographs
– 60 million images uploaded a month
• Facebook
– 250 billion+
– 300 million a day
• Instagram
– 55 million a day
• YouTube
– 100 hours uploaded every minute
90% of net traffic will be visual! Mostly about cats.
Too big for humans

http://www.petittube.com/

• Need automatic tools to access and analyze visual data!

Vision is Really Hard
• Vision is an amazing feature of natural intelligence
• The visual cortex occupies about 50% of the macaque brain
• More of the human brain is devoted to vision than to anything else

Is that a queen or a bishop?
Why is Computer Vision Hard?
What did you see?
• Where was this picture taken?
• How many people are there?
• What are they doing?
• What object is the person on the left standing on?
• Why is this a funny picture?

Challenges: Many nuisance parameters
• Illumination
• Object pose
• Clutter
• Occlusions
• Intra-class appearance
• Viewpoint
Challenges: Intra-class variation
Handling challenges?
Computer vision is still very far from human-like perception. But we have:
• Lots and lots and lots of data.
• We can learn from humans.
• We have prior knowledge, which can provide constraints on the problem.
Computer Vision Technology Can Better Our Lives
• Safety
• Health
• Security
• Comfort
• Fun
• Access

History of Computer Vision
“In 1966, Minsky hired a first-year undergraduate student and assigned him a problem to solve over the summer: connect a camera to a computer and get the machine to describe what it sees.”
Crevier 1993, pg. 88
Marvin Minsky, MIT, Turing award, 1969
Half a century later, we're still working on it.
1960’s: interpretation of synthetic worlds
Larry Roberts, “Father of Computer Vision”
[Figure panels: input image; 2x2 gradient operator; computed 3D model rendered from new viewpoint]
Larry Roberts PhD Thesis, MIT, 1963, Machine Perception of Three-Dimensional Solids
Slide credit: Steve Seitz
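A 2x2 gradient operator of this kind is commonly known today as the Roberts cross. The sketch below is a minimal pure-Python version under that assumption. The slide does not give the operator's exact form, so the function name `roberts_cross`, the difference pattern, and the toy image are illustrative choices rather than a reproduction of the thesis.

```python
import math


def roberts_cross(image):
    """Approximate gradient magnitude using differences along the two
    diagonals of each 2x2 pixel block.

    `image` is a grayscale image given as a list of rows. The output has
    one fewer row and one fewer column than the input.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - 1):
        out_row = []
        for c in range(cols - 1):
            gx = image[r][c] - image[r + 1][c + 1]      # main diagonal
            gy = image[r][c + 1] - image[r + 1][c]      # anti-diagonal
            out_row.append(math.hypot(gx, gy))
        out.append(out_row)
    return out


# Toy image (assumed for illustration): dark top half, bright bottom half.
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [5, 5, 5, 5],
    [5, 5, 5, 5],
]

edges = roberts_cross(image)
# Only the middle output row, which straddles the dark/bright boundary,
# is nonzero.
```

Because it looks at just four pixels at a time, the operator is cheap but sensitive to noise, which is one reason later edge detectors use larger neighborhoods.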
1970’s: some progress on interpreting selected images

The representation and matching of pictorial structures

Fischler and Elschlager, 1973
1980’s: ANNs come and go; shift toward geometry and increased mathematical rigor

Image credit: Rick Szeliski

1990’s: face recognition; statistical analysis in vogue
2000’s: broader recognition; large annotated datasets available; video processing starts
2010’s: resurgence of deep learning

[AlexNet NIPS 2012] [DeepFace CVPR 2014]

[DeepPose CVPR 2014] [Show, Attend and Tell ICML 2015]

2020’s: autonomous vehicles
2030’s: robot uprising?
Examples of Computer Vision Applications
• How is computer vision used today?
Face detection

• Most digital cameras and smart phones detect faces (and more)
• Canon, Sony, Fuji, …
• For smart focus, exposure compensation, and cropping
Slide credit: Steve Seitz
Face recognition

Facebook face auto-tagging

Face Landmark Alignment – 3D Persona

What Makes Tom Hanks Look Like Tom Hanks ICCV 2015
Smile Detection

Sony Cyber-shot® T70 Digital Still Camera
Slide credit: Steve Seitz
Vision-based Biometrics

“How the Afghan Girl was Identified by Her Iris Patterns” (read the story; see also Wikipedia)

Slide credit: Steve Seitz

Optical Character Recognition (OCR)
• Technology to convert scanned docs to text
• If you have a scanner, it probably came with OCR software

Digit recognition, AT&T labs: http://www.research.att.com/~yann/
License plate readers: http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
Slide credit: Steve Seitz
Computer vision in sports
• Hawk-Eye: helping/improving referee decisions
• SportVision: improving viewer experiences
• Replay Technologies: improving viewer experiences
• Play tracking
Visual recognition for photo organization

Google Photos
Earth viewers (3D modeling)

Image from Microsoft’s Virtual Earth

(see also: Google Earth)
Slide credit: Steve Seitz
3D from thousands of images

[Furukawa et al. CVPR 2010]

3D Time-lapse from Internet Photos, ICCV 2015

Style transfer

Source image (Style) Target image (Content) Output (deepart)

A Neural Algorithm of Artistic Style [Gatys et al. 2015]

Special effects: Matting and composition

Kylie Minogue - Come Into My World

Special effects: Shape capture

The Matrix movies, ESC Entertainment, XYZRGB, NRC

Slide credit: Steve Seitz

Special effects: Motion capture

Pirates of the Caribbean, Industrial Light and Magic
Slide credit: Steve Seitz
Google cars

Google in talks with Ford, Toyota and Volkswagen to realise driverless cars

http://www.theatlantic.com/technology/archive/2014/05/all-the-world-a-track-the-trick-that-makes-googles-self-driving-cars-work/370871/
Interactive Games: Kinect
• Object Recognition: http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o
• Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg
• 3D: http://www.youtube.com/watch?v=7QrnwoO1-8A
• Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY
Vision in space

NASA's Mars Exploration Rover Spirit captured this westward view from atop
a low plateau where Spirit spent the closing months of 2007.

Vision systems (JPL) used for several tasks

• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
• For more, read “Computer Vision on Mars” by Matthies et al.
Industrial robots

Vision-guided robots position nut runners on wheels

http://www.automationworld.com/computer-vision-opportunity-or-threat
Mobile robots

NASA’s Mars Spirit Rover
http://www.robocup.org/
Saxena et al. 2008, STAIR at Stanford
http://www.youtube.com/watch?v=DF39Ygp53mQ
Medical imaging
• 3D imaging: MRI, CT
• Image guided surgery: Grimson et al., MIT
Computer vision for the masses
• Counting cells
• Predicting poverty

Current state of the art
• Many of these are less than 5 years old
• Very active and exciting research area!
• To learn more about vision applications and companies
– David Lowe maintains an excellent overview of vision companies
• http://www.cs.ubc.ca/spider/lowe/vision.html
Topics of Studies in Computer Vision
• Interpreting Intensities
– What determines the brightness and color of a pixel?
– How can we use image filters to extract meaningful information from the image?
• Correspondence and Alignment
– How can we find corresponding points in objects or scenes?
– How can we estimate the transformation between them?
• Perspective and 3D Geometry
– How can we map between the 3D world and the 2D image?
– How can we recover 3D coordinates from images or video?
• Grouping and Segmentation
– How can we group pixels into meaningful regions?
• Categorization and Object Recognition
– How can we represent images and categorize them?
– How can we recognize categories of objects?
• Advanced Topics
– Action recognition, 3D scenes and context, CNNs, …
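The question of mapping between the 3D world and the 2D image can be made concrete with the standard pinhole camera model, in which a point (X, Y, Z) in camera coordinates projects to (f·X/Z + cx, f·Y/Z + cy). The sketch below assumes that model. The function name `project_point`, the focal length, and the sample points are hypothetical values chosen for illustration and do not come from the lecture.

```python
def project_point(point_3d, focal_length, principal_point=(0.0, 0.0)):
    """Project a 3D point in camera coordinates onto the image plane
    using the ideal pinhole model (no lens distortion).
    """
    X, Y, Z = point_3d
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    cx, cy = principal_point
    u = focal_length * X / Z + cx
    v = focal_length * Y / Z + cy
    return (u, v)


# Assumed focal length of 100 pixels, principal point at the origin.
near = project_point((1.0, 0.5, 2.0), focal_length=100.0)
far = project_point((1.0, 0.5, 4.0), focal_length=100.0)

# A different 3D point on the same viewing ray as `near`.
same_ray = project_point((2.0, 1.0, 4.0), focal_length=100.0)
```

Here `near` and `same_ray` land on the same pixel even though the points are at different depths. Projection discards depth, which is why recovering 3D coordinates generally needs extra information such as a second view, motion, or prior knowledge about the scene.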
Resources
• Books
– “Computer Vision: A Modern Approach”, by D. A. Forsyth, J. Ponce.
– “Digital Image Processing: An Algorithmic Approach”, by Madhuri A. Joshi
• Videos
– https://www.youtube.com/playlist?list=PLyqSpQzTE6M_PI-rIz4O1jEgffhJU9GgG
