CS4442_CS9542_Part 2_Lecture 1_Intro_Filtering

The document outlines the syllabus for Part II of the Artificial Intelligence course, taught by Dr. Yalda Mohsenzadeh, focusing on computer vision and deep learning. Key topics include image processing techniques, neural networks, and various methods for motion estimation and image segmentation. The course aims to provide students with a comprehensive understanding of computer vision and its applications in AI.

Artificial Intelligence II

Part 2: Lecture 1
Yalda Mohsenzadeh
Outline of CS4442/9542 (Part II)
• Instructor of Part II: Dr. Yalda Mohsenzadeh, who will take over the class on Feb 26th
• A faculty member in the Department of Computer Science, the Brain and Mind Institute, and the Vector Institute for AI
• Email: [email protected]
• Office hours: Wednesdays, 10:30 AM to 11:30 AM
• Part II of the class will give a brief introduction to very active and exciting areas of AI:
• computer vision & deep learning
Part 2 Syllabus
• Introduction to Computer Vision:
• Filtering, edge detection, edge types, image gradients, Canny edge detector
• Image segmentation (perceptual grouping, pixel clustering, histogram-based methods)
• Motion: motion estimation, motion field, optical flow, methods for optical flow estimation and motion tracking
• Neural Networks
• Brief history, basic formulation, optimization with gradient descent, layer types (linear, point-wise, nonlinearity), linear classification with perceptron, TensorFlow, regularizers, normalization
• Deep learning
• Batch processing, stochastic gradient descent, backpropagation
• Neural networks for images
• Convolutional neural networks (multiple channels, pooling, strides), receptive fields, unit visualization, important network architectures (AlexNet, VGGNet, ResNet, DenseNet, ...) and their tricks
• Representation learning, unsupervised/self-supervised learning with neural networks
Computer Vision
Introduction
Filtering
A simple Visual World

To discover from images what is present in the world, where things are, what actions are taking place, to predict and anticipate events in the world
What is Computer Vision?
• The ability of computers to see
• Image Understanding
• Machine Vision
• Robot Vision
• Image Analysis
• Video Understanding
Exciting Time for Computer Vision
Not long ago
Mask RCNN, He et al. 2017
Human Brain: A View of Visual System
• Vision starts with the eyes, but truly takes place in the brain

Goldstein and Brockmole, 2016
Low level processing
• Low level operations
• Filtering, edge detection
Mid level processing
• Mid level operations
• Shape formation, 3D shape reconstruction, …
High level processing
• High level operations
• Recognition of objects, people, places, events
Perception versus measurement

Edward Adelson
Perception versus measurement
Image
• 2-D array of numbers (intensity values, gray levels)
• gray level 0 (black) to 255 (white)
• Color images are three 2-D arrays of numbers:
• Red
• Green
• Blue
• Resolution (number of rows and columns)
• 128 x 128
• 256 x 256
• 640 x 480
Images as functions
• We can think of an image as a function $f$ from $\mathbb{R}^2$ to $\mathbb{R}$:
• $f(x, y)$ gives the intensity at position $(x, y)$
• $f(x, y)$ is proportional to the brightness at $(x, y)$
• Realistically, we expect the image to be defined only over a rectangle, with a finite range:
• $f: [a, b] \times [c, d] \to [0, 255]$
• The standard range for grayscale images is $\{0, 1, 2, \ldots, 255\}$
• A color image is just three functions pasted together. We can write this as a vector-valued function:
$$f(x, y) = \begin{bmatrix} r(x, y) \\ g(x, y) \\ b(x, y) \end{bmatrix}$$
A digital image
• In computer vision we usually operate on digital (discrete) images:
• Sample the 2D space on a regular grid
• Quantize each sample (round to the nearest integer)
• If our samples are $\Delta$ apart, we can write this as:
$$f[i, j] = \mathrm{Quantize}\{f(i\Delta, j\Delta)\}$$
• The image can now be represented as a matrix of integer values:
12 15 120 128 128 128 130
240 120 18 120 121 128 128
252 248 22 13 112 133 133
255 243 230 11 20 128 125
24 32 251 255 26 127 123
10 15 252 253 18 120 128
8 14 18 176 154 128 127
129 110 120 127 128 128 130
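The sampling and quantization steps above can be sketched in a few lines of Python. This is only an illustration: the function names `sample_and_quantize` and `ramp`, the choice of letting the first index follow x, and the clamping to 0..255 are assumptions made for the example, not part of the lecture.

```python
import math

def sample_and_quantize(f, rows, cols, delta):
    """Build a digital image f[i][j] = Quantize{f(i*delta, j*delta)}."""
    image = []
    for i in range(rows):
        row = []
        for j in range(cols):
            value = f(i * delta, j * delta)      # sample on the regular grid
            level = math.floor(value + 0.5)      # quantize: round to nearest integer
            row.append(max(0, min(255, level)))  # assumption: clamp to gray levels 0..255
        image.append(row)
    return image

# Hypothetical continuous image: brightness grows linearly in x and y.
def ramp(x, y):
    return 50.0 * x + 20.3 * y

digital = sample_and_quantize(ramp, rows=2, cols=3, delta=0.5)
print(digital)  # [[0, 10, 20], [25, 35, 45]]
```

A smaller `delta` gives a finer grid (higher resolution) over the same region of the continuous image.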
Image processing
• Image processing operation: defining a new image $g$ in terms of an existing image $f$
• We can transform either the domain or the range of $f$.
• Range transformation:
• $g(x, y) = t(f(x, y))$
Image processing
• Digital negative
• $g(x, y) = 255 - f(x, y)$
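A range transformation only needs the value of each pixel, so it can be written as a per-pixel map. The sketch below applies the digital negative to a small image stored as a list of rows; the helper names `apply_range_transform` and `negative` are chosen for this example.

```python
def apply_range_transform(image, t):
    """Return g with g[x][y] = t(f[x][y]) for every pixel."""
    return [[t(value) for value in row] for row in image]

def negative(value):
    """Digital negative of an 8-bit gray level."""
    return 255 - value

image = [[12, 15, 120],
         [240, 120, 18]]
negated = apply_range_transform(image, negative)
print(negated)  # [[243, 240, 135], [15, 135, 237]]
```

Applying `negative` twice returns the original image, since 255 - (255 - v) = v.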
Image processing
• Improving the contrast in the picture
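The slide does not say which contrast method is used. One common range transformation for this purpose is a linear contrast stretch, which maps the darkest pixel to 0 and the brightest to 255. The sketch below assumes that method; `stretch_contrast` is a name chosen for this example.

```python
def stretch_contrast(image):
    """Linearly rescale gray levels so they span the full range 0..255."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # A flat image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in image]
    return [[int((value - lo) * 255 / (hi - lo) + 0.5) for value in row]
            for row in image]

low_contrast = [[100, 120],
                [140, 150]]
print(stretch_contrast(low_contrast))  # [[0, 102], [204, 255]]
```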
Image processing
• Some operations preserve the range but change the domain of $f$:
• $g(x, y) = f(t_x(x, y), t_y(x, y))$
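A domain transformation moves pixels around without changing their values. In the sketch below, each output pixel looks up its source position through the coordinate functions. The helper name `apply_domain_transform`, the convention that x indexes rows and y indexes columns, and the fill value for positions that fall outside the image are assumptions made for this example.

```python
def apply_domain_transform(image, tx, ty, fill=0):
    """Return g with g[x][y] = f[tx(x, y)][ty(x, y)], using `fill` outside the image."""
    rows, cols = len(image), len(image[0])
    out = []
    for x in range(rows):
        out_row = []
        for y in range(cols):
            sx, sy = tx(x, y), ty(x, y)   # source coordinates for this output pixel
            if 0 <= sx < rows and 0 <= sy < cols:
                out_row.append(image[sx][sy])
            else:
                out_row.append(fill)
        out.append(out_row)
    return out

image = [[1, 2, 3],
         [4, 5, 6]]
cols = len(image[0])

# Mirror left-to-right, then shift one column to the right.
mirrored = apply_domain_transform(image, lambda x, y: x, lambda x, y: cols - 1 - y)
shifted = apply_domain_transform(image, lambda x, y: x, lambda x, y: y - 1)
print(mirrored)  # [[3, 2, 1], [6, 5, 4]]
print(shifted)   # [[0, 1, 2], [0, 4, 5]]
```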
Common Geometric Transformation
Image Processing
• Still other operations act on both the domain and the range of f
Image Processing: Filtering
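• Modifies pixels based on their neighborhood
• Example: pixel p with neighbors r (above), q (right), v (below), and z (left), laid out the same way in the input f and the output g:

  r
z p q
  v

$$g(p) = 2f(p) + 0.5f(r) + 0.5f(q) + 0.3f(v) + 0.3f(z)$$

• Useful for:
• Noise reduction, integrating information over constant regions, scale change, detecting changes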
Filtering Application: Noise Reduction
• Common types of noise:
• Salt and pepper noise: contains random occurrences of black and white pixels
• Impulse noise: contains random occurrences of white pixels
• Gaussian noise: variations in intensity drawn from a Gaussian (normal) distribution
• Image processing is useful for noise reduction
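To experiment with the noise types listed above, one can corrupt a clean image synthetically. The sketch below is a minimal version using Python's standard `random` module; the function names, the per-pixel corruption probability `prob`, and the clamping of Gaussian noise to 0..255 are assumptions made for this example.

```python
import random

def add_salt_and_pepper(image, prob, rng):
    """Replace each pixel, with probability `prob`, by black (0) or white (255)."""
    noisy = []
    for row in image:
        new_row = []
        for value in row:
            if rng.random() < prob:
                new_row.append(rng.choice((0, 255)))
            else:
                new_row.append(value)
        noisy.append(new_row)
    return noisy

def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma` to each pixel."""
    return [[max(0, min(255, int(round(value + rng.gauss(0.0, sigma)))))
             for value in row]
            for row in image]

rng = random.Random(0)                      # fixed seed so runs are repeatable
clean = [[128] * 8 for _ in range(8)]       # flat gray test image
salt_pepper = add_salt_and_pepper(clean, prob=0.1, rng=rng)
gaussian = add_gaussian_noise(clean, sigma=10.0, rng=rng)
```

Impulse noise as defined above corresponds to replacing `rng.choice((0, 255))` with the constant 255.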
Noise Reduction by Mean Filtering
• How can we smooth away noise in a single image?
𝑓(𝑥, 𝑦) g(𝑥, 𝑦)
Effect of mean filters
Convolution
• Assume the averaging window is $(2k+1) \times (2k+1)$:
$$g[i, j] = \frac{1}{(2k+1)^2} \sum_{u=-k}^{k} \sum_{v=-k}^{k} f[i-u, j-v]$$
• Let’s generalize the idea by allowing different weights for different neighboring pixels (the normalization is absorbed into the weights; the mean filter is the case $h[u, v] = 1/(2k+1)^2$):
$$g[i, j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} h[u, v]\, f[i-u, j-v]$$
• This is called a convolution:
$$g = h * f$$
• $h$ is called the “filter”, “kernel”, or “mask”.
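The convolution sum translates directly into nested loops. The sketch below is a minimal, unoptimized version that assumes a square kernel of odd size and computes only the output pixels whose window fits entirely inside the image; `convolve2d` is a name chosen for this example, not a library function.

```python
def convolve2d(f, h):
    """g[i][j] = sum over u, v in -k..k of h[u][v] * f[i-u][j-v] (full windows only)."""
    k = len(h) // 2                      # kernel is (2k+1) x (2k+1)
    rows, cols = len(f), len(f[0])
    g = []
    for i in range(k, rows - k):
        g_row = []
        for j in range(k, cols - k):
            total = 0
            for u in range(-k, k + 1):
                for v in range(-k, k + 1):
                    # h is stored with offset k so that h[u + k][v + k] is h[u, v]
                    total += h[u + k][v + k] * f[i - u][j - v]
            g_row.append(total)
        g.append(g_row)
    return g

f = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # h[0, 0] = 1: output equals input
shift = [[0, 0, 0], [0, 0, 0], [0, 1, 0]]      # h[1, 0] = 1: g[i][j] = f[i-1][j]
print(convolve2d(f, identity))  # [[5]]
print(convolve2d(f, shift))     # [[2]]
```

The minus signs in `f[i - u][j - v]` are what distinguish convolution from simply sliding the kernel over the image: a weight at offset (u, v) picks up the pixel at the opposite offset.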
Convolution

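Example (4x4 image $f$, 3x3 kernel $h$):
$$\begin{bmatrix} 1 & 3 & 2 & 1 \\ 2 & 9 & 1 & 1 \\ 1 & 3 & 2 & 3 \\ 5 & 6 & 1 & 2 \end{bmatrix} * \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} = \begin{bmatrix} -1 & 10 \\ 4 & 12 \end{bmatrix}$$
$$g[i, j] = (h * f)[i, j] = \sum_{u,v} h[u, v]\, f[i-u, j-v]$$
Note: the values shown are obtained by sliding the kernel over the image without flipping it (cross-correlation). Flipping the kernel as the formula requires negates every entry here, because rotating this kernel by 180 degrees gives its negative.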
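The numbers in this example can be checked by hand or with a short script. The sketch below slides the 3x3 kernel over the 4x4 image both without and with the 180-degree flip that the convolution formula implies; the function name `slide_kernel` and the `flip` parameter are chosen for this illustration.

```python
def slide_kernel(f, h, flip):
    """Slide an odd-sized square kernel h over image f, keeping only full windows."""
    k = len(h) // 2
    if flip:
        # Rotate the kernel by 180 degrees, as the convolution formula implies.
        h = [row[::-1] for row in h[::-1]]
    out = []
    for i in range(k, len(f) - k):
        out_row = []
        for j in range(k, len(f[0]) - k):
            out_row.append(sum(h[u + k][v + k] * f[i + u][j + v]
                               for u in range(-k, k + 1)
                               for v in range(-k, k + 1)))
        out.append(out_row)
    return out

f = [[1, 3, 2, 1],
     [2, 9, 1, 1],
     [1, 3, 2, 3],
     [5, 6, 1, 2]]
h = [[1, 0, -1],
     [1, 0, -1],
     [1, 0, -1]]

no_flip = slide_kernel(f, h, flip=False)
with_flip = slide_kernel(f, h, flip=True)
print(no_flip)    # [[-1, 10], [4, 12]]
print(with_flip)  # [[1, -10], [-4, -12]]
```

The unflipped result reproduces the 2x2 output in the example, so those values correspond to cross-correlation. For symmetric kernels such as the mean and Gaussian kernels the two operations give identical results.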
Mean Kernel (also called box filter)
• Kernel for a 3x3 mean filter:
$$h[u, v] = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$
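Because every weight of the box filter is the same, the mean filter reduces to averaging each window. The sketch below assumes a (2k+1) x (2k+1) window and only computes outputs where the window fits inside the image; `mean_filter` is a name chosen for this example.

```python
def mean_filter(f, k):
    """Average each (2k+1) x (2k+1) window of f (full windows only)."""
    size = (2 * k + 1) ** 2
    out = []
    for i in range(k, len(f) - k):
        out_row = []
        for j in range(k, len(f[0]) - k):
            total = sum(f[i + u][j + v]
                        for u in range(-k, k + 1)
                        for v in range(-k, k + 1))
            out_row.append(total / size)
        out.append(out_row)
    return out

# Flat image of gray level 10 with a single bright outlier in the middle.
image = [[10] * 5 for _ in range(5)]
image[2][2] = 100

smoothed = mean_filter(image, k=1)
print(smoothed)  # [[20.0, 20.0, 20.0], [20.0, 20.0, 20.0], [20.0, 20.0, 20.0]]
```

The outlier is reduced from 100 to 20, but its energy is spread over every window that contains it, which is the blurring effect of mean filtering.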
Gaussian Filtering
• A Gaussian kernel gives less weight to pixels further from the center of the window
• Discrete Gaussian kernel: $h[u, v]$ with normalization factor $\frac{1}{16}$


Gaussian Kernel
• Weight contributions of neighboring pixels by nearness

$$G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right)$$
• The constant factor in front makes the volume integrate to 1 (we should normalize the weights to sum to 1 in any case)
• What happens if you increase 𝜎?
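One way to explore the effect of σ is to sample the Gaussian on a (2k+1) x (2k+1) grid and normalize the weights to sum to 1, which makes the constant factor irrelevant. The sketch below does this; `gaussian_kernel` and the specific values of k and σ are choices made for this example.

```python
import math

def gaussian_kernel(k, sigma):
    """Sample exp(-(x^2 + y^2) / (2 sigma^2)) on a (2k+1) x (2k+1) grid, normalized to sum to 1."""
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for y in range(-k, k + 1)]
           for x in range(-k, k + 1)]
    total = sum(sum(row) for row in raw)
    return [[weight / total for weight in row] for row in raw]

narrow = gaussian_kernel(k=1, sigma=0.5)
wide = gaussian_kernel(k=1, sigma=2.0)

# Center weight: large for small sigma, close to the box-filter value 1/9 for large sigma.
print(round(narrow[1][1], 3), round(wide[1][1], 3))
```

As σ grows, the center weight shrinks and the outer weights grow, so more distant pixels contribute more and the smoothing becomes stronger. For large σ relative to the window size, the kernel approaches the mean kernel.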
Gaussian Filtering

• Each row shows smoothing with Gaussians of different width
• Each column shows different realizations of an image of Gaussian noise.
Filtering an impulse

?
Median Filter
• Median of {1, 2, 25, 3, 24, 22, 20, 21, 23}: sorted, this is {1, 2, 3, 20, 21, 22, 23, 24, 25}, so the median is 21
1 2 25       x x x
3 24 22      x 21 x
20 21 23     x x x

• The median filter selects the median intensity over a window.
• The median filter preserves sharp detail better than the mean filter; it is not so prone to over-smoothing.
• Is a median filter a kind of convolution?
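A small experiment helps with the question above. The sketch below implements a median filter over (2k+1) x (2k+1) windows and then tests additivity, a property that every convolution has because it is linear. The function name `median_filter` and the test images `a` and `b` are constructed for this example.

```python
def median_filter(f, k):
    """Replace each pixel by the median of its (2k+1) x (2k+1) window (full windows only)."""
    out = []
    for i in range(k, len(f) - k):
        out_row = []
        for j in range(k, len(f[0]) - k):
            window = sorted(f[i + u][j + v]
                            for u in range(-k, k + 1)
                            for v in range(-k, k + 1))
            out_row.append(window[len(window) // 2])   # middle of the sorted values
        out.append(out_row)
    return out

# The window from the slide: the median of the nine values is 21.
window = [[1, 2, 25],
          [3, 24, 22],
          [20, 21, 23]]
print(median_filter(window, k=1))  # [[21]]

# Additivity check: for a convolution, filter(a + b) == filter(a) + filter(b).
a = [[9, 9, 9], [9, 0, 0], [0, 0, 0]]
b = [[0, 0, 0], [0, 0, 9], [9, 9, 9]]
a_plus_b = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
print(median_filter(a, 1), median_filter(b, 1), median_filter(a_plus_b, 1))
# [[0]] [[0]] [[9]]
```

Since the median of the sum (9) differs from the sum of the medians (0), the median filter is not linear, so it cannot be written as a convolution with any kernel.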
Salt and Pepper Noise
• Comparison
Gaussian Noise
Face of faces
