DIP L01 Introduction


CSE-408 (Digital Image Processing)

Instructor: Dr. Muhammad Abeer Irfan


DCSE
Have you ever used image processing and computer vision?
Where? How?
Reconstruction? Recognition? (Re)organization?

Think-Pair-Share
Laptop: Biometrics auto-login (face recognition, 3D), OCR
Smartphones: QR codes, computational photography (Android Lens Blur, iPhone
Portrait Mode), panorama construction (Google Photo Spheres), face detection,
expression detection (smile), Snapchat filters (face tracking), FaceID (iPhone), Night
Sight (Pixel), iPhone 12 Pro (LiDAR)
Web: Image search, Google photos (face recognition, object recognition, scene
recognition, geolocalization from vision), Facebook (image captioning), Google maps
aerial imaging (image stitching), YouTube (content categorization)
VR/AR: Outside-in tracking (HTC VIVE), inside-out tracking (simultaneous localization
and mapping, HoloLens), object occlusion (dense depth estimation)
Motion: Kinect, full body tracking of skeleton, gesture recognition, virtual try-on
Medical imaging: CAT / MRI reconstruction, assisted diagnosis, automatic pathology,
connectomics, endoscopic surgery
Industry: Vision-based robotics (marker-based), machine-assisted router (jig),
automated post, ANPR (number plates), surveillance, drones, shopping
Transportation: Assisted driving (everything), face tracking/iris dilation for
drunkenness, drowsiness, automated distribution (all modes)
Media: Visual effects for film, TV (reconstruction), virtual sports replay
(reconstruction), semantics-based auto edits (reconstruction, recognition)
Optical character recognition (OCR)
Technology to convert images of text into text
If you have a scanner, it probably came with OCR software
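As a rough illustration of what "converting images of text into text" means, here is a toy sketch that classifies tiny binary glyphs by comparing them against stored templates. Everything in it is an assumption made for illustration: the 3x5 templates, the fixed-width segmentation, and the function names are hypothetical, and real OCR software relies on far more robust segmentation and learned classifiers.

```python
# Toy OCR sketch: nearest-template matching on 3x5 binary glyphs.
# Templates, glyph size, and function names are illustrative assumptions,
# not the method used by any real OCR product.

TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}


def hamming(a, b):
    """Count the pixels that differ between two same-sized glyphs."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize_glyph(glyph):
    """Return the label of the template with the fewest differing pixels."""
    return min(TEMPLATES, key=lambda label: hamming(glyph, TEMPLATES[label]))


def recognize_line(image, glyph_width=3, gap=1):
    """Cut a one-line image into fixed-width glyphs and classify each one."""
    step = glyph_width + gap
    labels = []
    for x in range(0, len(image[0]), step):
        glyph = [row[x:x + glyph_width] for row in image]
        labels.append(recognize_glyph(glyph))
    return "".join(labels)


# A hand-drawn "image" of three digits, one blank column between glyphs.
image = [
    "###..#..###",
    "..#.##..#.#",
    ".#...#..#.#",
    ".#...#..#.#",
    ".#..###.###",
]
result = recognize_line(image)
print(result)
```

Because the match is by smallest pixel difference rather than exact equality, a glyph with a flipped pixel or two can still be assigned to the closest template, which hints at why OCR tolerates some scanner noise.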

Live camera translation

Mail digit recognition, AT&T labs


http://www.research.att.com/~yann/

License plate readers


http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
JH
Face detection

• Almost all digital cameras detect faces


• Snapchat face filters
Facial login without a password…

Liang et al. 2014
Vision-based biometrics

“How the Afghan Girl was Identified by Her Iris Patterns”


Read the story (Wikipedia)

JH
Smile detection

Sony Cyber-shot® T70 Digital Still Camera


JH
Video call eye gaze correction
Kuster et al., SIGGRAPH Asia 2012
– https://cgl.ethz.ch/publications/papers/paperKus12.php

Apple FaceTime
Attention Correction
Object recognition (in mobile phones)
e.g., Google Lens
Object recognition (in supermarkets)
How does it work? Think-Pair-Share
Source: Vivek Ramanujan
3D from images

Building Rome in a Day: Agarwal et al. 2009


Human shape capture
Special effects: motion capture
Interactive Games
Object Recognition:
http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o
Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg
3D: http://www.youtube.com/watch?v=7QrnwoO1-8A
Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY

JH
Sports

Virtual pitch markings: Sportvision first down line
Free viewpoint video [Canon 2017]


Nice explanation on www.howstuffworks.com

JH
Medical imaging

Image guided surgery


3D imaging
Grimson et al., MIT
MRI, CT

JH
AutoCars - Uber bought CMU’s lab (2015)
Then sold it (2020)
http://www.robocup.org/
Mobile robots
Saxena et al. 2008
STAIR at Stanford

Skydio 2 drone
6x fisheye cameras for obstacle avoidance
Onboard NVIDIA GPU
Vision in space

NASA's Mars Exploration Rover Spirit captured this westward view from atop
a low plateau where Spirit spent the closing months of 2007.

Vision systems (JPL) used for several tasks


• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
• For more, read “Computer Vision on Mars” by Matthies et al.
JH
NASA Perseverance lander and rover
Landed 18th February 2021
Humanoid Robots

https://blog.bostondynamics.com/flipping-the-script-with-atlas, Boston Dynamics (2021)


Augmented Reality and Virtual Reality

MS HoloLens, Oculus, Magic Leap, ARCore / ARKit
Real-time monocular depth estimation and camera tracking
Real-time 3D hand pose estimation

Oculus (Quest)
Niantic
AI for Physical Interaction

Boston Dynamics (2017)
