
MA3K1 Mathematics of Machine Learning April 10, 2021

Solution (23) Given a vector $x \in \mathbb{R}^d$ and a sequence $g = (g_{-d+1}, \dots, g_{d-1})^T$, the convolution is defined as the vector $y = (y_1, \dots, y_d)^T$, where
$$y_k = \sum_{i=1}^{d} x_i \cdot g_{k-i}, \qquad k \in \{1, \dots, d\}.$$

In terms of matrices,
$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_d \end{pmatrix} = \begin{pmatrix} g_0 & g_{-1} & \cdots & g_{-d+1} \\ g_1 & g_0 & \cdots & g_{-d+2} \\ \vdots & \vdots & \ddots & \vdots \\ g_{d-1} & g_{d-2} & \cdots & g_0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_d \end{pmatrix}.$$
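As a quick illustration, here is a minimal NumPy sketch (ours, not from the notes; the helper names `conv_matrix` and `conv_direct` are made up) that builds this Toeplitz matrix for a small filter and checks it against the sum formula above:

```python
import numpy as np

def conv_matrix(g, d):
    """Build the d x d Toeplitz matrix with entries G[k, i] = g_{k-i}."""
    # g stores the offsets -(d-1), ..., d-1; g[offset + d - 1] holds g_offset.
    return np.array([[g[(k - i) + d - 1] for i in range(d)] for k in range(d)])

def conv_direct(x, g):
    """Compute y_k = sum_i x_i * g_{k-i} straight from the definition."""
    d = len(x)
    return np.array([sum(x[i] * g[(k - i) + d - 1] for i in range(d))
                     for k in range(d)])

d = 4
rng = np.random.default_rng(0)
x = rng.standard_normal(d)
g = rng.standard_normal(2 * d - 1)     # filter entries g_{-3}, ..., g_3

# Matrix-vector product and direct summation agree.
assert np.allclose(conv_matrix(g, d) @ x, conv_direct(x, g))
```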

The sequence $g$ is called a filter. Convolutional Neural Networks (CNNs) operate on image data, and in the context of colour image data, a vector $x$ that enters a layer of the CNN will not be seen as a single vector in some $\mathbb{R}^d$, but as a tensor with format $N \times M \times 3$. Here, the image is considered to be an $N \times M$ matrix, with each entry corresponding to a pixel, and each pixel is represented by a 3-tuple consisting of the red, green and blue components. A convolution filter will therefore operate on a subset of this tensor, typically an $n \times m \times 3$ window. Typical parameters are $N = M = 32$ and $n = m = 5$.

[Figure: a $5 \times 5$ window sliding across the $N \times M$ image matrix; each window position is mapped to a single entry of the output matrix.]

Each $5 \times 5 \times 3$ block is mapped linearly to an entry of a $28 \times 28$ matrix in the same way (that is, applying the same filter). The filter applied is called a kernel and the resulting map is interpreted as representing a feature. Specifically, instead of writing it as a vector $g$, we can write it as a matrix $K$, and then have, if we denote the input image by $X$,
$$Y = X * K, \qquad Y_{ij} = \sum_{k,\ell} X_{k\ell} \cdot K_{i-k,\,j-\ell}.$$

In a convolutional layer, several different kernels are applied to the image. This can be seen as extracting several features from the image (see the sketch below). CNNs have the advantage of being scalable: the same kernels work on images of arbitrary size.
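To make the shapes concrete, here is a minimal NumPy sketch (ours, with random data purely for illustration) of a convolutional layer acting on a $32 \times 32 \times 3$ image with several $5 \times 5 \times 3$ kernels, producing one $28 \times 28$ feature map per kernel. As in most deep-learning frameworks, the sliding-window operation below is technically cross-correlation, which is what is usually implemented under the name convolution.

```python
import numpy as np

def conv_layer(image, kernels):
    """Slide each kernel over the image; every window position gives one entry.

    image:   (N, M, 3) array
    kernels: (K, n, m, 3) array, one n x m x 3 kernel per feature
    returns: (N - n + 1, M - m + 1, K) stack of feature maps
    """
    N, M, _ = image.shape
    K, n, m, _ = kernels.shape
    out = np.zeros((N - n + 1, M - m + 1, K))
    for k in range(K):
        for i in range(N - n + 1):
            for j in range(M - m + 1):
                # Linear map of the n x m x 3 block: elementwise product and sum.
                out[i, j, k] = np.sum(image[i:i + n, j:j + m, :] * kernels[k])
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))
kernels = rng.standard_normal((6, 5, 5, 3))   # six features

features = conv_layer(image, kernels)
print(features.shape)                          # (28, 28, 6)
```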


Solution (24) Let $h : \mathbb{R}^d \to \mathcal{Y}$ be any classifier. Define the smallest perturbation that moves a data point into a different class as
$$\Delta_h(x) = \inf_r \{\|r\| : h(x + r) \neq h(x)\}. \tag{3.5}$$
Given a probability distribution on the input space, define the robustness as
$$\rho(h) = \mathbb{E}\left[\frac{\Delta_h(X)}{\|X\|}\right].$$

For a linear classifier with only two classes, where $h(x) = \mathrm{sign}(w^T x + b)$, we get the vector that moves point $x$ to the boundary by solving the optimization problem (as for SVMs)
$$r^* = \operatorname*{argmin}_r \frac{1}{2}\|r\|^2 \quad \text{subject to} \quad w^T(x + r) + b = 0.$$
The solution is the projection of $x$ onto the hyperplane,
$$r^* = -\frac{w^T x + b}{\|w\|^2}\, w, \qquad \|r^*\| = \frac{|w^T x + b|}{\|w\|}.$$
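A minimal NumPy sketch (ours; the classifier parameters are made up) of this computation: it finds the boundary perturbation for a sample point, verifies the norm formula, and Monte-Carlo estimates the robustness $\rho(h)$ as the average of $\Delta_h(X)/\|X\|$ over random inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
w, b = np.array([2.0, -1.0, 0.5]), 0.3      # hypothetical linear classifier

def boundary_perturbation(x, w, b):
    """Smallest r with w @ (x + r) + b = 0 (projection onto the hyperplane)."""
    return -(w @ x + b) / (w @ w) * w

x = np.array([1.0, 2.0, -1.0])
r = boundary_perturbation(x, w, b)
print(np.allclose(w @ (x + r) + b, 0.0))    # True: x + r lies on the boundary
print(np.linalg.norm(r), np.abs(w @ x + b) / np.linalg.norm(w))  # equal norms

# Monte Carlo estimate of rho(h) under a standard normal input distribution.
X = rng.standard_normal((10_000, 3))
delta = np.abs(X @ w + b) / np.linalg.norm(w)
print((delta / np.linalg.norm(X, axis=1)).mean())
```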
Assume that we now have $k$ linear functions $f_1, \dots, f_k$, with $f_i(x) = w_i^T x + b_i$, and a classifier $h : \mathbb{R}^d \to \{1, \dots, k\}$ that assigns to each $x$ the index $j$ of the largest value $f_j(x)$ (this corresponds to the one-versus-rest setting for multiclass classification). Let $x$ be such that $\max_j f_j(x) = f_k(x)$ and define the linear functions
$$g_i(x) := f_i(x) - f_k(x) = (w_i - w_k)^T x + (b_i - b_k), \qquad i \in \{1, \dots, k-1\}.$$
Then
$$x \in \bigcap_{1 \le i \le k-1} \{y : g_i(y) < 0\}.$$
The intersection of half-spaces is a polyhedron, and $x$ is in the interior of the polyhedron $P$ delineated by the hyperplanes $H_i = \{y : g_i(y) = 0\}$ (see Figure 6).

[Figure: the point $x$ in the interior of a polyhedron bounded by the hyperplanes $H_1, \dots, H_5$.]

Figure 6: The distance to misclassification is the radius of the largest ball around $x$ enclosed in the polyhedron $P$.


A perturbation $x + r$ ceases to be in class $k$ as soon as $g_j(x + r) > 0$ for some $j$, and the smallest length of such an $r$ equals the radius of the largest ball centred at $x$ that is enclosed in the polyhedron. Formally, noting that $w_j - w_k = \nabla g_j(x)$, we get
$$\hat{\jmath} := \operatorname*{arg\,min}_j \frac{|g_j(x)|}{\|\nabla g_j(x)\|}, \qquad r^* = \frac{|g_{\hat{\jmath}}(x)|}{\|\nabla g_{\hat{\jmath}}(x)\|^2} \cdot \nabla g_{\hat{\jmath}}(x). \tag{3.6}$$
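The following NumPy sketch (ours; the random weights are purely illustrative) implements (3.6) for a multiclass linear classifier and checks that $x + r^*$ lands exactly on the nearest decision boundary, where the top two scores tie. This is essentially the DeepFool step for affine classifiers.

```python
import numpy as np

rng = np.random.default_rng(1)
k, d = 4, 3
W = rng.standard_normal((k, d))            # hypothetical rows w_i
b = rng.standard_normal(k)

x = rng.standard_normal(d)
c = int(np.argmax(W @ x + b))              # predicted class, playing the role of k

def min_perturbation(x, W, b, c):
    """Equation (3.6): smallest r moving x onto the nearest boundary g_j = 0."""
    others = [j for j in range(len(b)) if j != c]
    g = np.array([(W[j] - W[c]) @ x + (b[j] - b[c]) for j in others])
    grads = np.array([W[j] - W[c] for j in others])
    norms = np.linalg.norm(grads, axis=1)
    jhat = int(np.argmin(np.abs(g) / norms))       # closest hyperplane
    return np.abs(g[jhat]) / norms[jhat] ** 2 * grads[jhat]

r = min_perturbation(x, W, b, c)
scores = W @ (x + r) + b
print(np.isclose(np.sort(scores)[-1], np.sort(scores)[-2]))  # tie on the boundary
```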

Solution (25) The discriminator $D$ would like to achieve, on average, a large value on $G_0(Z_0)$ and a small value on $G_1(Z_1)$. Using the logarithm, this can be expressed as the problem of maximizing
$$\mathbb{E}_{Z_0}[\log(D(G_0(Z_0)))] + \mathbb{E}_{Z_1}[\log(1 - D(G_1(Z_1)))].$$
As an integral, using the push-forward measures, we get the expression
$$\int_{\mathcal{X}} \left[\log(D(x))\,\rho_{X_0}(x) + \log(1 - D(x))\,\rho_{X_1}(x)\right] \mathrm{d}x.$$

We choose $D(x)$ so that the integrand becomes maximal. So, considering the function
$$f(y) = \log(y)\,\rho_{X_0}(x) + \log(1 - y)\,\rho_{X_1}(x)$$
and computing the first and second derivatives,
$$f'(y) = \frac{\rho_{X_0}(x)}{y} - \frac{\rho_{X_1}(x)}{1 - y}, \qquad f''(y) = -\frac{\rho_{X_0}(x)}{y^2} - \frac{\rho_{X_1}(x)}{(1 - y)^2} < 0,$$
we see that setting $f'(y) = 0$, i.e. $(1 - y)\,\rho_{X_0}(x) = y\,\rho_{X_1}(x)$, gives a maximum at
$$y = \frac{\rho_{X_0}(x)}{\rho_{X_0}(x) + \rho_{X_1}(x)},$$
which is the value of $D^*(x)$ that we want.
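As a numerical sanity check (a sketch of ours, with two made-up Gaussian densities standing in for $\rho_{X_0}$ and $\rho_{X_1}$), a grid search over $y$ recovers the closed-form optimum at a given $x$:

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

# Two illustrative densities playing the roles of rho_{X_0} and rho_{X_1}.
x = 0.7
rho0 = gauss_pdf(x, mu=0.0, sigma=1.0)
rho1 = gauss_pdf(x, mu=1.5, sigma=0.8)

# Maximize f(y) = log(y) * rho0 + log(1 - y) * rho1 over a fine grid in (0, 1).
y = np.linspace(1e-6, 1 - 1e-6, 100_000)
f = np.log(y) * rho0 + np.log(1 - y) * rho1
print(y[np.argmax(f)])            # numerical maximizer
print(rho0 / (rho0 + rho1))       # closed-form D*(x); the two agree
```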

