1 Introduction

CPS834/CPS8307

Introduction to Computer Vision

Dr. Omar Falou

Toronto Metropolitan University

Fall 2024
My Contact Info
Dr. Omar Falou
[email protected] - please put CPS834/CPS8307 in subject line
Office – VIC 717

E-mail expectations:
⚫ I will check my e-mail at least once a day.
⚫ I answer e-mails in the order they are received and aim to
respond to e-mails within 48 hours (counting “business day”
hours).
⚫ Asking questions in class, after class, or during office hours
is the fastest way to get them answered!
D2L and e-mail
⚫ You are required to maintain a TMU e-mail account. E-mails
from other non-TMU accounts may not be answered!
⚫ I will send e-mails to your TMU account with important class
info.
⚫ Check the D2L Course Shell regularly (before each class)
for important announcements.
⚫ Access from my.torontomu.ca
⚫ Everything course-related will be posted on D2L!
Course Description
This course describes foundational concepts of
computer vision. In particular, the course covers
the image formation process, image
representation, feature extraction, model fitting,
motion analysis, 3D parameter estimation and
applications.

Lecture: 3 hrs. 3:10 pm – 6:00 pm (LIB 72)

Labs: 0 hrs.
Office Hours
⚫ Tuesdays 11:00 am – 1:00 pm (VIC 717)
⚫ Office Hours are the best time to ask
questions about the course.
Textbooks
⚫ Multiple View Geometry in Computer Vision, by
Richard Hartley and Andrew Zisserman. ISBN:
9780511811685
⚫ Digital Image Processing, by Rafael C. Gonzalez,
Richard E. Woods. ISBN: 9780133356724
⚫ Computer Vision: Algorithms and Applications, by
Richard Szeliski, an electronic version of the book
can be found on the author’s website.
Topics
Evaluation
Assignments
⚫ Research Topic (15%)
⚫ Topics will be assigned weekly. Two weeks to submit/present.
⚫ Groups of two.
⚫ Requirements: 3-4 page report (all students) + in-class
presentation (graduate students)
⚫ Paper Critique (15%)
⚫ Given after study week
⚫ Groups of two.
⚫ 2-3 pages.
Projects
⚫ Groups of 4-5 students
⚫ Due at the end of the semester
⚫ Source code (Python) + report + video demo + contributions
⚫ Examples:
⚫ Medical Image Segmentation and Classification
⚫ Medical Image Synthesis
⚫ Object Detection and Tracking
⚫ Automatic Image Captioning
⚫ Emotion Detection from Facial Expressions
⚫ Autonomous Vehicle Lane Detection
⚫ 3D Object Reconstruction from 2D Images
⚫ Defect Detection in Manufacturing
⚫ AI-Powered Visual Search Engine
⚫ Plant Disease Detection
⚫ Sports Action Recognition in Video Clips
Final Exam
⚫ During the examination period.
⚫ Format: Multiple choice questions.
⚫ Duration: 2 hours.
Religious Exemption

⚫ If you have a religious observance at this time, you must
notify me within the first two weeks of class. (TMU Policy 150)
⚫ For already-scheduled course components (most of them), you
must submit a Request for Accommodation of Student
Religious, Aboriginal and Spiritual Observance AND an
Academic Consideration Request form within the first 2 weeks
of the class.
⚫ For a final examination, submit both within 2 weeks of the
posting of the examination schedule.
Remarking
⚫ Request for remarking of any work must
be made within 10 working days of the
assessment being released to students
⚫ A request should be made in writing (e-mail)
⚫ Students should be aware that as a result
of remarking, their mark may go up, down,
or stay the same.
My Work
⚫ Medical Imaging Applications
⚫ Transfer Learning for the Prediction of Breast
Cancer Treatment using Quantitative Ultrasound
Imaging
⚫ Machine Learning for the Prediction of Kidney
Fibrosis using Ultrasound and Photoacoustic
Imaging
⚫ Deep Learning for the Synthesis of Cancer Images
Falou et al. “Transfer Learning of Pre-treatment Quantitative Ultrasound Multi-Parametric
Images for the Prediction of Breast Cancer Response to Neoadjuvant Chemotherapy,” Scientific
Reports, 14(1):2340, 2024
Acknowledgments
Many of the following slides are adapted from the
outstanding lecture materials of similar courses taught by
Professors Yung-Yu Chuang, Fredo Durand, Alyosha Efros,
Bill Freeman, James Hays, Svetlana Lazebnik, Andrej
Karpathy, Fei-Fei Li, Srinivasa Narasimhan, Silvio Savarese,
Steve Seitz, Richard Szeliski, Li Zhang, Kosta Derpanis, Yan
Tong, and Noah Snavely. The instructor deeply appreciates
these researchers for sharing their notes publicly, which
have been invaluable in developing this course.
Some Context
Every image tells a story
• Goal of computer vision:
perceive the “story” behind
the picture
• Compute properties of the
world
– 3D shape
– Names of people or objects
– What happened?
The goal of computer vision
Can computers match human perception?
• Yes and no (mainly no)
– computers can be better at
“easy” things
– humans are better at “hard”
things

• But huge progress

– Accelerating in the last five
years due to deep learning
– What is considered “hard”
keeps changing
Human perception has its shortcomings

https://twitter.com/pickover/status/1460275132958662657/
But humans can tell a lot about a scene from a
little information…

Source: “80 million tiny images” by Torralba, et al.

The goal of computer vision
• Compute the 3D shape of the world

ZED 2i Camera
The goal of computer vision
• Recognize objects and people

Terminator 2, 1991
[Figure: street scene (031111_034757+8_beijing_e031124) annotated with object labels: sky, building, flag, face, banner, wall, street lamp, bus, cars]
slide credit: Fei-Fei, Fergus & Torralba

The goal of computer vision
• “Enhance” images
The goal of computer vision
• Forensics

Source: Nayar and Nishino, “Eyes for Relighting”

The goal of computer vision
• Improve photos (“Computational Photography”)

Super-resolution (source: 2d3)

Depth of field on cell phone camera

(source: Google Research Blog)
Removing objects
(Google Magic Eraser)
Low-light photography
(credit: Hasinoff et al., SIGGRAPH ASIA 2016)
April 10, 2019
Why study computer vision?
• Billions of images/videos captured per day

• Huge number of potential applications

• The next slides show the current state of the art
Optical character recognition (OCR)
• If you have a scanner, it probably came with OCR software

Digit recognition, AT&T Labs (1990s)
http://yann.lecun.com/exdb/lenet/

License plate readers
http://en.wikipedia.org/wiki/Automatic_number_plate_recognition

Sudoku grabber
http://sudokugrab.blogspot.com/

Automatic check processing

Face detection

• Nearly all cameras detect faces in real time

– (Why?)
Face analysis and recognition
Vision-based biometrics

Who is she?
“How the Afghan Girl was Identified by Her Iris Patterns”

Source: S. Seitz
Login without a password

Fingerprint scanners on many new smartphones and other devices

Face unlock on Apple iPhone X
See also http://www.sensiblevision.com/

New York Times, Jan. 18, 2020, by Kashmir Hill
Bird identification

Merlin Bird ID (based on Cornell Tech technology!)

Special effects: shape capture

The Matrix movies, ESC Entertainment, XYZRGB, NRC

Source: S. Seitz
Special effects: motion capture

Pirates of the Caribbean, Industrial Light and Magic
Source: S. Seitz

3D face tracking w/ consumer cameras

Snapchat Lenses

Face2Face system (Thies et al.)

Image synthesis

https://this-person-does-not-exist.com/
Which face is real?

https://www.whichfaceisreal.com/
Image synthesis

“An astronaut riding a horse in a photorealistic style” – DALL-E 2
“A photo of a Corgi dog riding a bike in Times Square. It is wearing sunglasses and a beach hat” – Imagen
Sports

Sportvision first down line

Explanation on www.howstuffworks.com

Source: S. Seitz
Smart cars

• Mobileye
• Tesla Autopilot
• Safety features in many cars
Self-driving cars

Waymo
Robotics

NASA’s Mars Curiosity Rover
https://en.wikipedia.org/wiki/Curiosity_(rover)

Amazon Picking Challenge
http://www.robocup2016.org/en/events/amazon-picking-challenge/

Amazon Prime Air
Amazon Scout

Medical imaging

3D imaging (MRI, CT)

Skin cancer classification with deep learning
https://cs.stanford.edu/people/esteva/nature/
Virtual & Augmented Reality

6DoF head tracking
Hand & body tracking
3D scene understanding
3D-360 video capture

Current state of the art
• You just saw many examples of current systems.
– Many of these are less than 5 years old

• Computer vision is an active research area, and rapidly changing

– Many new apps in the next 5 years
– Deep learning and generative methods powering many modern
applications

• Many startups across a dizzying array of areas

– Generative AI, robotics, autonomous vehicles, medical imaging,
construction, inspection, VR/AR, …
Why is computer vision difficult?

Viewpoint variation

Credit: Flickr user michaelpaul

Scale
Illumination
Why is computer vision difficult?

Motion (Source: S. Lazebnik)

Intra-class variation

Background clutter
Occlusion

Challenges: local ambiguity

slide credit: Fei-Fei, Fergus & Torralba

But there are lots of visual cues we can use…

Source: S. Lazebnik
Bottom line
• Perception is an inherently ambiguous problem
– Many different 3D scenes could have given rise to a given 2D
image

Artist Julian Beever with his anamorphic Coke bottle

– We often must use prior knowledge about the world’s structure

Image source: F. Durand
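
The ambiguity above can be made concrete with a small sketch. It assumes an idealized pinhole camera at the origin looking down the Z axis, which these slides do not define; the function name project and the focal_length parameter are hypothetical, chosen only for illustration. Every 3D point on the same ray through the camera centre lands on the same image location, so a single image cannot tell the points apart.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z) in camera coordinates onto the image plane.

    Assumes an ideal pinhole camera: no lens distortion, principal point at
    the image origin. The name and signature are illustrative only.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Three different 3D points that lie on one ray through the camera centre:
# each is a scaled copy of the same base point.
base = (1.0, 2.0, 4.0)
points_on_ray = [tuple(s * c for c in base) for s in (1.0, 2.0, 5.0)]

# All three project to the same 2D location, so depth is lost in the image.
images = [project(p) for p in points_on_ray]
for p, uv in zip(points_on_ray, images):
    print(p, "->", uv)
```

Because the scale factor cancels in X/Z and Y/Z, a small nearby object and a large distant one can produce identical images. This is one reason prior knowledge about the world’s structure is needed to pick a plausible interpretation.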
Good luck!
I am here to help you succeed.
