1 Introduction
Fall 2024
My Contact Info
Dr. Omar Falou
[email protected] - please put CPS834/CPS8307 in subject line
Office – VIC 717
E-mail expectations:
⚫ I will check my e-mail at least once a day.
⚫ I answer e-mails in the order they are received and aim to
respond within 48 hours (counting business-day hours only).
⚫ Asking questions in class, after class, or during office hours
is the fastest way to get them answered!
D2L and e-mail
⚫ You are required to maintain a TMU e-mail account. E-mails
from non-TMU accounts may not be answered!
⚫ I will send e-mails to your TMU account with important class
info.
⚫ Check the D2L Course Shell regularly (before each class) for
important announcements.
⚫ Access D2L from my.torontomu.ca
⚫ Everything course-related will be posted on D2L!
Course Description
This course describes foundational concepts of
computer vision. In particular, the course covers
the image formation process, image
representation, feature extraction, model fitting,
motion analysis, 3D parameter estimation and
applications.
Labs: 0 hrs.
Office Hours
⚫ Tuesdays 11:00 am – 1:00 pm (VIC 717)
⚫ Office Hours are the best time to ask
questions about the course.
Textbooks
⚫ Multiple View Geometry in Computer Vision, by
Richard Hartley and Andrew Zisserman. ISBN:
9780511811685
⚫ Digital Image Processing, by Rafael C. Gonzalez,
Richard E. Woods. ISBN: 9780133356724
⚫ Computer Vision: Algorithms and Applications, by
Richard Szeliski, an electronic version of the book
can be found on the author’s website.
Topics
Evaluation
Assignments
⚫ Research Topic (15%)
⚫ Topics will be assigned weekly. Two weeks to submit/present.
⚫ Groups of two.
⚫ Requirements: 3-4 page report (all students) + in-class
presentation (graduate students)
⚫ Paper Critique (15%)
⚫ Given after study week
⚫ Groups of two.
⚫ 2-3 pages.
Projects
⚫ Groups of 4-5 students
⚫ Due at the end of the semester
⚫ Source code (Python) + report + video demo + contributions
⚫ Examples:
⚫ Medical Image Segmentation and Classification
⚫ Medical Image Synthesis
⚫ Object Detection and Tracking
⚫ Automatic Image Captioning
⚫ Emotion Detection from Facial Expressions
⚫ Autonomous Vehicle Lane Detection
⚫ 3D Object Reconstruction from 2D Images
⚫ Defect Detection in Manufacturing
⚫ AI-Powered Visual Search Engine
⚫ Plant Disease Detection
⚫ Sports Action Recognition in Video Clips
Final Exam
⚫ During the examination period.
⚫ Format: Multiple choice questions.
⚫ Duration: 2 hours.
Religious Exemption
mail)
⚫ Students should be aware that as a result
https://twitter.com/pickover/status/1460275132958662657/
But humans can tell a lot about a scene from a
little information…
ZED 2i Camera
The goal of computer vision
• Recognize objects and people
Terminator 2, 1991
[Figure: street scene with labeled regions: sky, building, flag, face, banner, wall, street lamp, bus]
Sudoku grabber
http://sudokugrab.blogspot.com/
“How the Afghan Girl was Identified by Her Iris Patterns”
Source: S. Seitz
Login without a password
Snapchat Lenses
https://this-person-does-not-exist.com/
Which face is real?
https://www.whichfaceisreal.com/
Image synthesis
“An astronaut riding a horse in a photorealistic style” – DALL-E 2
“A photo of a Corgi dog riding a bike in Times Square. It is wearing sunglasses and a beach hat” – Imagen
Sports
Source: S. Seitz
Smart cars
• Mobileye
• Tesla Autopilot
• Safety features in many cars
Self-driving cars
Waymo
Robotics
3D imaging (MRI, CT)
Skin cancer classification with deep learning
https://cs.stanford.edu/people/esteva/nature/
Virtual & Augmented Reality
Why is computer vision difficult?
• Viewpoint variation
• Scale
• Illumination
Source: S. Lazebnik
Bottom line
• Perception is an inherently ambiguous problem
– Many different 3D scenes could have given rise to a given 2D
image
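The ambiguity above can be illustrated with a minimal sketch. It assumes an ideal pinhole camera with focal length f, where a 3D point (X, Y, Z) maps to the image point (fX/Z, fY/Z). The slide does not name a camera model, so this model is an assumption, and `project`, `near`, and `far` are illustrative names only.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z) onto the image plane of an
    ideal pinhole camera (assumed model, not from the slides)."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)


# Two different 3D points on the same viewing ray through the camera center.
near = (1.0, 2.0, 4.0)
far = tuple(2.5 * c for c in near)  # 2.5 times farther away: (2.5, 5.0, 10.0)

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```

Both points land on the same image coordinates, so depth cannot be recovered from this single image without additional cues or assumptions.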