Computer Vision ch1

This document provides information about the EE5811 Topics in Computer Vision course. It introduces the instructor, Dr. LI Haoliang, and the three teaching assistants. It lists the recommended textbooks and discusses technical paper reading. It outlines the course goals of understanding the physical world from images and reconstructing 3D models from 2D images. Finally, it provides a tentative schedule covering topics such as image filtering, edge detection, object detection, 3D reconstruction, and motion analysis.


EE5811

Topics in Computer Vision


Dr. LI Haoliang
Department of Electrical Engineering
EE5811 Background Info
• Instructor: LI Haoliang
– Assistant Professor @ EE department
• Office: Yeung-G6526
• Email: [email protected]
– The best way to reach me.
• Tel: 3442-6087
Teaching Assistants
• Mr. QIN Tiexin
[email protected]
• Mr. LIU Jie
[email protected]
• Mr. LIU Hui
[email protected]
Textbooks
• Milan Sonka, Vaclav Hlavac, Roger Boyle, Image
Processing, Analysis, and Machine Vision (CENGAGE
Learning, 4th edition, 2015)
• D. Forsyth and J. Ponce, Computer Vision: A Modern
Approach, 2nd edition, Prentice Hall (2011)
• R. Gonzalez and R. Woods, Digital Image Processing, 3rd
edition, Prentice Hall (2007)
• Computer Vision: Algorithms and Applications, 2nd
edition, 2021 http://szeliski.org/Book/
Technical paper reading
• Computer Vision Venue
– CVPR, ICCV, ECCV, ACCV, BMVC, etc.
• Machine Learning Venue
– ICML, NeurIPS, ICLR, UAI, AISTATS, etc.
• Other Venues
– ACL, KDD, MICCAI, etc.
Every image tells a story
• Goal of computer vision:
perceive the “story”
behind the picture
• Compute properties of
the world
– 3D shape
– Names of people or
objects
– What happened?
Can the computer match human perception?
• Yes and no (mainly no)
– Computers can be better at “easy” things
– Humans are much better at “hard” things
• But huge progress has been made
– Accelerating in the last few years due to deep learning
– What is considered “hard” keeps changing
Humans can tell a lot about a scene from a little information…
Source: “80 million tiny images” by Torralba, et al.
The goal of computer vision
• Compute and understand the physical world
• Reconstruct 3D models from crowdsourcing
– Internet photos (“Colosseum”) → reconstructed 3D cameras and points → dense 3D model
• Recognize objects and people
– (Scene-labeling example from Terminator 2, 1991; labels include sky, building, flag, face, banner, wall, street lamp, bus, cars. Slide credit: Fei-Fei, Fergus & Torralba)


Why study computer vision?
• Billions of images/videos captured per day

• Huge number of useful applications


• The next slides show the current state of the art
Computer Vision
• Low Level Vision
– Measurements
– Enhancements
– Region segmentation
– Features
• Mid Level Vision
– Reconstruction
– Depth
– Motion estimation
• High Level Vision
– Category detection
– Activity recognition
– Deep understanding
Image enhancement
• Improve photos (“Computational Photography”)
– Haze removal
– Super-resolution (source: 2d3)
– Inpainting / image completion (image credit: Hays and Efros)
Applications: 3D Scanning
Scanning Michelangelo’s “The David”
• The Digital Michelangelo Project
– http://graphics.stanford.edu/projects/mich/
• UW Prof. Brian Curless, collaborator
• 2 BILLION polygons, accuracy to 0.29mm
Google’s 3D Maps
Structure estimation from tourist photos

Apple’s 3D maps

https://www.youtube.com/watch?v=InIVv-LsgZE

Optical character recognition (OCR)
• If you have a scanner, it probably came with OCR software
– Digit recognition, AT&T labs: http://www.research.att.com/~yann/
– License plate readers: http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
– Equation recognition
– Automatic check processing
Source: S. Seitz
Face detection
• Nearly all cameras detect faces in real time

Face recognition / Micro-expression
• Who is she? (Source: S. Seitz)

Vision-based biometrics
• “How the Afghan Girl was Identified by Her Iris Patterns” (Source: S. Seitz)
Object recognition (in mobile phones)
• Google Goggles (Source: S. Seitz)

Bird Identification
• Merlin Bird ID (based on Cornell Tech technology!)
Plant Identification
• Pl@ntNet is a research and educational initiative on plant biodiversity supported by Agropolis Foundation since 2009.
Marine Mammal Recognition
• Vessel-based CWD survey
• Underwater fish counting

Amazon Picking Challenge
• http://www.robocup2016.org/en/events/amazon-picking-challenge/

Robomart
Medical imaging / Healthcare
• Food-recognition example (predictions from different models on one dish image):
– Gist: chili fish head
– Color moment: braised pork
– FC7: steamed chicken feet
– AlexNet: Kung Pao chicken
– VGG: Kung Pao chicken
– Multi-task VGG: Kung Pao chicken [chicken, chili, peanut]
– Region-based multi-task VGG: chicken (dice, stir-fry); chili (dry); peanut (roasted)
Virtual & Augmented Reality
• 6DoF head tracking
• Hand & body tracking
• 3D scene understanding
• 3D-360 video capture
Why is computer vision difficult?
• Viewpoint variation
• Scale
• Illumination
• Motion (Source: S. Lazebnik)
• Intra-class variation
• Background clutter
• Occlusion
Bottom line
• Perception is an inherently ambiguous problem
– Many different 3D scenes could have given rise to a particular 2D picture
– We often need to use prior knowledge about the structure of the world
Image source: F. Durand
“In 1966, Minsky hired a first-year undergraduate student and assigned him a problem to solve over the summer: connect a television camera to a computer and get the machine to describe what it sees.”
Crevier 1993, pg. 88

Marvin Minsky, MIT (Turing Award, 1969), and Gerald Sussman, MIT (the undergraduate)

“You’ll notice that Sussman never worked in vision again!” – Berthold Horn
A brief History of Computer Vision
• Timeline spanning the 1970s through the 2010s, including David Marr (1982)
Current state of the art
• You just saw examples of current systems
– Most of these are less than 5 years old
• This is a very active research area, and rapidly changing
– More algorithms and apps in the next 5 years??
• To learn more about vision applications and companies
– David Lowe maintains an excellent overview of vision companies
• http://www.cs.ubc.ca/spider/lowe/vision.html
Scope of Our Course
• Computer vision overlaps with many fields: robotics, machine learning, human-computer interaction, image processing, scene understanding, graphics, object recognition, medical imaging, motion analysis, computational photography, neuroscience, and multimedia
Tentative Schedule
• Week 1 (1/Sep): Introduction
• Week 2-3 (8/Sep - 15/Sep): Image filtering, color, texture, and
resampling.
• Week 4 (22/Sep): Edge detection (Quiz 1)
• Week 5-6 (29/Sep - 6/Oct): Image keypoint, descriptor,
transformation and alignment.
• Week 7 (13/Oct): Deep Neural Networks
• Week 8 (20/Oct): Object detection and Image segmentation
• Week 9 (27/Oct): Generative Model
• Week 10 (3/Nov): Camera (Quiz 2)
• Week 11 (10/Nov): 3D and Stereo
• Week 12 (17/Nov): Depth and Structure
• Week 13 (24/Nov): Motion and Wrap Up
Course information
• Prerequisites
– A good working knowledge of programming
• We will briefly go through Pillow and OpenCV later.
– Data structure and algorithm
– Some math: linear algebra, vector calculus
• We will revisit some basic math in the lecture.

• Grading
– Project assignment (30%)
– Quiz (20%): paper exam (1 – 2 hours)
– Final exam (50%)
Project Assignment
• Literature survey (10%)
– Survey of at least 10 research papers (published in the last 4 years) on a
specific problem/topic of computer vision and image processing.
– Slide presentation through video recording.
• Course project: computer vision applications (20%)
– Source Code (colab or Jupyter Notebook (ipynb))
– Project report
– Reports are limited to 4 pages, including figures and tables, in the CVPR style
(with name and student ID). Additional pages may contain only cited references.

• Grouping
– Each group can have 1~3 members. For groups with more than one member, each
member’s contribution should be listed at the end of the report (this may go in
the additional pages) for reference.
– Find your groupmates NOW!!

https://cvpr2022.thecvf.com/author-guidelines
Suggested Topics
• Detection/Tracking
• Object/face recognition
• Segmentation
• Image registration
• (Medical) image processing/enhancement/…

You need not follow these suggested topics; you are free to choose a topic of your own preference. ☺
Let’s have some fun!
Play with Colab
• Colab, or "Colaboratory", allows you to write
and execute Python in your browser, with
– Zero configuration required
– Access to GPUs free of charge
– Easy sharing
Python Programming
for Image Processing
• Pillow
– Pillow is a fork of PIL, the Python Imaging Library
• Installation:
https://pillow.readthedocs.io/en/latest/installation.html
pip install Pillow

Need to include an exclamation mark when using colab. (!pip install Pillow)

– Image Basics
• http://pillow.readthedocs.org/en/latest/handbook/concepts.html

– List of Modules
• http://pillow.readthedocs.org/en/latest/reference/index.html
Image Reading / Writing

from PIL import Image

# read image
im = Image.open("cat.jpg")
print(im.format, im.size, im.mode) # JPEG (512, 512) RGB
im.show() # use display(im) for colab

# create thumbnails
newsize = (128, 128)
im.thumbnail(newsize)

# write image
outfile = "cat_thumbnail.jpg"
im.save(outfile)

There can be many parameters in these functions, GOOGLE it if you want to explore more!
Image Cutting / Pasting / Merging

# copying a subrectangle from an image
box = (100, 100, 400, 400)
region = im.crop(box)

# processing a subrectangle, and pasting it back
region = region.transpose(Image.ROTATE_180)
im.paste(region, box)

# splitting and merging bands
r, g, b = im.split()
im = Image.merge("RGB", (b, g, r))
Geometric / Color Transformation

# geometric transforms:
out1 = im.resize((128, 128))
out2 = im.rotate(45) # degrees counter-clockwise
out3 = im.transpose(Image.FLIP_LEFT_RIGHT)
out4 = im.transpose(Image.FLIP_TOP_BOTTOM)

# color transforms:
out5 = im.convert("L") # convert to grayscale
Image Filter

from PIL import ImageFilter

# image smoothing / edge detection
out1 = im.filter(ImageFilter.BLUR)
out2 = im.filter(ImageFilter.GaussianBlur(radius=20))
out3 = im.filter(ImageFilter.CONTOUR)
out4 = im.filter(ImageFilter.FIND_EDGES)
OpenCV
• OpenCV (Open Source Computer Vision Library) is an open source
computer vision and machine learning software library
– (Latest version: OpenCV 4.6)

• OpenCV-Python is the Python API of OpenCV

• Cross-platform (Windows, Mac, Linux, Android, iOS, etc )

• Open Source and free (May have some commercial packages)

• Documentation:
http://docs.opencv.org/modules/core/doc/intro.html
Installation

• Install OpenCV-Python in Windows (site)


– Install Anaconda
• Anaconda is essentially a nicely packaged Python IDE that is shipped with
tons of useful packages, such as NumPy, Pandas, IPython Notebook, etc.
• Installation: https://docs.anaconda.com/anaconda/install/
– Install python virtual environment on Anaconda
• https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-
environments.html
– Install opencv-python:
pip install opencv-python # exclamation mark for colab

– Test:
• Open Python IDLE and type the following in a Python terminal
>>> import cv2
>>> print(cv2.__version__)
Reading, Colorspace Changing:
import cv2

# reading image:
image = cv2.imread('cat.jpg') # default
image = cv2.imread('cat.jpg', 1) # color, BGR
image = cv2.imread('cat.jpg', 0) # gray-scale
image = cv2.imread('cat.jpg', -1) # unchanged
Reading, Colorspace Changing:
import cv2

# change image colorspace:
image = cv2.imread('cat.jpg') # default, BGR
image2 = cv2.cvtColor(image, flag)

flag:
• BGR->Gray: cv2.COLOR_BGR2GRAY
• BGR->RGB: cv2.COLOR_BGR2RGB
• BGR->HSV: cv2.COLOR_BGR2HSV
Image Segmentation
• K-means Clustering
– It clusters the given data into k-clusters or parts based on the k-centroids
– The motivation behind image segmentation using k-means is that we try to assign labels to each pixel based on
the RGB (or HSV) values
• OpenCV
– cv2.kmeans(data, K, criteria, attempts, flags[, bestLabels[, centers]]) →
retval, bestLabels, centers
– Input parameters
• data: np.float32 data type, and each feature should be put in a single column
• K : Number of clusters required at end
• criteria : It is the iteration termination criteria
• attempts: Flag to specify the number of times the algorithm is executed
• flags: how initial centers are taken
– Output parameters
• retval: the sum of squared distances from each point to its corresponding center
• bestLabels: the label array, where each element is marked ‘0’, ‘1’, …
• centers: array of cluster centers
Image Segmentation
• Preprocessing
• Convert the MxNx3 image into a Kx3 matrix (K = MxN)

import numpy as np
import cv2
from matplotlib import pyplot as plt

image = cv2.imread('coins.jpg')
# reduce noise and make the image smoother
image = cv2.GaussianBlur(image, (7, 7), 0)

# each row is now a vector in the 3-D color space (BGR)
vectorized = image.reshape(-1, 3)
# convert the uint8 values to float32 (opencv requirement)
vectorized = np.float32(vectorized)
Image Segmentation
• k-means algorithm

# define number of segments
segments = 2

# OpenCV k-means function
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
ret, label, center = cv2.kmeans(vectorized, segments, None, criteria, 10,
                                cv2.KMEANS_RANDOM_CENTERS)

# assign every pixel a color based on the label map
res = center[label.flatten()]
# reshape to image size
segmented_map = res.reshape((image.shape))
result = segmented_map.astype(np.uint8)
cv2.imwrite("segmented.jpg", result)
Feature Matching
• SIFT
– With OpenCV 3 came a big push to move many of these “non-free” modules out of the default OpenCV install and into the opencv_contrib package
– To get access to the original SIFT and SURF implementations found in OpenCV 2.4.X, you need to pull down both the opencv and opencv_contrib repositories from GitHub and then compile and install OpenCV 3 from source
– (Note: the SIFT patent has since expired, and SIFT is included in the main opencv-python package from version 4.4 onward)
• ORB (Oriented FAST and Rotated BRIEF)
– An efficient alternative to SIFT or SURF
– A fusion of the FAST keypoint detector and the BRIEF descriptor
Feature Matching
• Example

import numpy as np
import cv2

img1 = cv2.imread('church.jpg', 0)      # queryImage
img2 = cv2.imread('church_part.jpg', 0) # trainImage

# initiate detector
orb = cv2.ORB_create()

# find the keypoints and descriptors
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
Feature Matching
• Brute-Force matcher
– It takes the descriptor of one feature in the first set and matches it with all other features in the second set using some distance calculation
– The closest one is returned
Face Tracking
• Object detection using Haar feature-based cascade classifiers is an effective object detection method
– Group the features into different stages of classifiers and apply them one-by-one
– If a window fails the first stage, discard it
– If it passes, apply the second stage of features and continue the process
– A window which passes all stages is a face region
• OpenCV contains many pre-trained classifiers for faces, eyes, smiles, etc.
• Those XML files are stored in the opencv/data/haarcascades/ folder
Face Tracking
• Example
• Result
– Input image → output image with detected faces outlined
Deep Learning Library
