Deep Learning Assignment0

Uploaded by

vidhya ds

Deep Learning - Assignment 0 Your Name, Roll Number

Instructions:

• This assignment is meant to help you understand certain concepts we will use in the
course.

1. Simple Derivatives

(a) Find the derivative of the sigmoid function with respect to x, where the sigmoid
function σ(x) is given by

\[ \sigma(x) = \frac{1}{1 + e^{-x}} \]

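Solution: The derivative of the sigmoid function is as follows:

\begin{align*}
\sigma'(x) &= \frac{d\sigma(x)}{dx} \\
&= \frac{d}{dx}\left(\frac{1}{1 + e^{-x}}\right) \\
&= \frac{d}{dx}(1 + e^{-x})^{-1} \\
&= -(1 + e^{-x})^{-2}\,\frac{d}{dx}(1 + e^{-x}) \\
&= -(1 + e^{-x})^{-2}(-e^{-x})
\end{align*}

We can simplify the above answer as follows:

\begin{align*}
-(1 + e^{-x})^{-2}(-e^{-x}) &= \frac{e^{-x}}{(1 + e^{-x})^2} \\
&= \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}} \\
&= \frac{1}{1 + e^{-x}} \cdot \frac{(1 + e^{-x}) - 1}{1 + e^{-x}} \\
&= \frac{1}{1 + e^{-x}} \left(1 - \frac{1}{1 + e^{-x}}\right) \\
&= \sigma(x)(1 - \sigma(x))
\end{align*}

Therefore, the derivative of the sigmoid function is:

\[ \sigma'(x) = \sigma(x)(1 - \sigma(x)) \]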
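
The closed form above can be sanity-checked numerically. The following is a minimal sketch, assuming plain Python and a central-difference approximation; the function names and the step size h are illustrative choices, not part of the assignment.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    """Closed-form derivative sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Numerical derivative (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Compare the analytic and numerical derivatives at a few sample points.
for x in (-2.0, 0.0, 1.5):
    print(x, sigmoid_grad(x), central_difference(sigmoid, x))
```

The two printed columns should agree to several decimal places; at x = 0 the derivative is 0.25, its maximum.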

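(b) Given two Gaussian functions

\[ y = \mathcal{N}(0, 1) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}} \]

and

\[ \hat{y} = \mathcal{N}(1, 1) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(x-1)^2}{2}} \]

we define

\[ L = (y - \hat{y})^2 \]

Find $\frac{dL}{dx}$ at x = 1.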

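Solution: Given,

\begin{align*}
L &= (y - \hat{y})^2 \\
&= \frac{1}{2\pi}\left(e^{-\frac{x^2}{2}} - e^{-\frac{(x-1)^2}{2}}\right)^2
\end{align*}

The derivative of L w.r.t. x is given by $\frac{dL}{dx} = L'$, which can be found as follows:

\begin{align*}
L' &= \frac{1}{2\pi}\frac{d}{dx}\left(e^{-\frac{x^2}{2}} - e^{-\frac{(x-1)^2}{2}}\right)^2 \\
&= \frac{2}{2\pi}\left(e^{-\frac{x^2}{2}} - e^{-\frac{(x-1)^2}{2}}\right)\frac{d}{dx}\left(e^{-\frac{x^2}{2}} - e^{-\frac{(x-1)^2}{2}}\right) \\
&= \frac{1}{\pi}\left(e^{-\frac{x^2}{2}} - e^{-\frac{(x-1)^2}{2}}\right)\left(\frac{d}{dx}e^{-\frac{x^2}{2}} - \frac{d}{dx}e^{-\frac{(x-1)^2}{2}}\right) \\
&= \frac{1}{\pi}\left(e^{-\frac{x^2}{2}} - e^{-\frac{(x-1)^2}{2}}\right)\left(e^{-\frac{x^2}{2}}\frac{d}{dx}\left(-\frac{x^2}{2}\right) - e^{-\frac{(x-1)^2}{2}}\frac{d}{dx}\left(-\frac{(x-1)^2}{2}\right)\right) \\
&= \frac{1}{\pi}\left(e^{-\frac{x^2}{2}} - e^{-\frac{(x-1)^2}{2}}\right)\left(e^{-\frac{x^2}{2}}(-x) - e^{-\frac{(x-1)^2}{2}}(-(x-1))\right) \\
&= \frac{-1}{\pi}\left(e^{-\frac{x^2}{2}} - e^{-\frac{(x-1)^2}{2}}\right)\left(x\,e^{-\frac{x^2}{2}} - (x-1)\,e^{-\frac{(x-1)^2}{2}}\right)
\end{align*}

By substituting x = 1, we get:

\begin{align*}
\left.\frac{dL}{dx}\right|_{x=1} &= \frac{-1}{\pi}\left(e^{-\frac{1}{2}} - e^{-\frac{(1-1)^2}{2}}\right)\left(e^{-\frac{1}{2}} - (1-1)\,e^{-\frac{(1-1)^2}{2}}\right) \\
&= \frac{-1}{\pi}\left(e^{-\frac{1}{2}} - 1\right)e^{-\frac{1}{2}}
\end{align*}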
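
As a check on the result at x = 1, the expression can be evaluated and compared with a finite-difference estimate. This is a sketch under the definitions in the question; the helper names loss and loss_grad and the step size h are assumptions made for illustration.

```python
import math

def loss(x: float) -> float:
    """L(x) = (y - y_hat)^2 for the two unit-variance Gaussians in the question."""
    y = math.exp(-x ** 2 / 2) / math.sqrt(2 * math.pi)
    y_hat = math.exp(-(x - 1) ** 2 / 2) / math.sqrt(2 * math.pi)
    return (y - y_hat) ** 2

def loss_grad(x: float) -> float:
    """Closed-form dL/dx taken from the last line of the derivation."""
    a = math.exp(-x ** 2 / 2)
    b = math.exp(-(x - 1) ** 2 / 2)
    return -(a - b) * (x * a - (x - 1) * b) / math.pi

h = 1e-6  # finite-difference step (illustrative choice)
numeric = (loss(1 + h) - loss(1 - h)) / (2 * h)
analytic = loss_grad(1.0)
print(analytic, numeric)  # both close to 0.0760
```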

(c) Find the derivative of f(ρ) with respect to ρ, where f(ρ) is given by

\[ f(\rho) = \rho \log\frac{\rho}{\hat{\rho}} + (1 - \rho)\log\frac{1 - \rho}{1 - \hat{\rho}} \]

(Hint: You can treat $\hat{\rho}$ as a constant.)

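Solution: The derivative of f(ρ) with respect to ρ can be found as follows:

\begin{align*}
f'(\rho) &= \frac{d}{d\rho}\big(f(\rho)\big) \\
&= \frac{d}{d\rho}\left(\rho\log\frac{\rho}{\hat{\rho}} + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}}\right) \\
&= \frac{d}{d\rho}\Big(\rho\log\rho - \rho\log\hat{\rho} + (1-\rho)\log(1-\rho) - (1-\rho)\log(1-\hat{\rho})\Big) \\
&= \frac{d}{d\rho}(\rho\log\rho) - \frac{d}{d\rho}(\rho\log\hat{\rho}) + \frac{d}{d\rho}\big((1-\rho)\log(1-\rho)\big) - \frac{d}{d\rho}\big((1-\rho)\log(1-\hat{\rho})\big)
\end{align*}

Treating $\hat{\rho}$ as a constant and using the product rule of derivatives, we get

\begin{align*}
f'(\rho) &= \left(\rho\cdot\frac{1}{\rho} + \log(\rho)(1)\right) - \log(\hat{\rho})(1) + \left((1-\rho)\cdot\frac{-1}{1-\rho} + \log(1-\rho)(-1)\right) - \log(1-\hat{\rho})(-1) \\
&= 1 + \log\rho - \log\hat{\rho} - 1 - \log(1-\rho) + \log(1-\hat{\rho}) \\
&= \log\frac{\rho}{\hat{\rho}} - \log\frac{1-\rho}{1-\hat{\rho}} \\
&= \log\frac{\rho(1-\hat{\rho})}{\hat{\rho}(1-\rho)}
\end{align*}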
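
One consequence of this result is that the derivative vanishes when ρ equals $\hat{\rho}$, since the argument of the logarithm becomes 1. The sketch below illustrates this, assuming natural logarithms and a hypothetical value rho_hat = 0.2; neither is specified in the question.

```python
import math

def f(rho: float, rho_hat: float) -> float:
    """f(rho) as defined in the question (natural log assumed)."""
    return (rho * math.log(rho / rho_hat)
            + (1 - rho) * math.log((1 - rho) / (1 - rho_hat)))

def f_grad(rho: float, rho_hat: float) -> float:
    """Closed-form derivative log(rho (1 - rho_hat) / (rho_hat (1 - rho)))."""
    return math.log(rho * (1 - rho_hat) / (rho_hat * (1 - rho)))

rho_hat = 0.2  # hypothetical constant chosen for illustration
h = 1e-6       # finite-difference step
for rho in (0.05, 0.2, 0.6):
    numeric = (f(rho + h, rho_hat) - f(rho - h, rho_hat)) / (2 * h)
    print(rho, f_grad(rho, rho_hat), numeric)
```

The derivative is negative for ρ below 0.2, zero at ρ = 0.2, and positive above it, so f attains its minimum where ρ matches $\hat{\rho}$.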

2. Chain Rule
Using the chain rule of derivatives, find the derivative of f(x) with respect to x, where
(a) $f(x) = x \log(3^x)$

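Solution: Let,

\[ z = 3^x \quad\therefore\quad \frac{dz}{dx} = \frac{d}{dx}3^x = 3^x \log 3 \]

Also let,

\[ y = \log(z) \quad\therefore\quad \frac{dy}{dz} = \frac{d}{dz}\log z = \frac{1}{z} = \frac{1}{3^x} \]

Therefore, we can write f(x) in terms of y, which itself can be written in terms
of z, i.e.,

\[ f(x) = xy \]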

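The derivative of f(x) can be found as follows:

\begin{align*}
f'(x) &= \frac{d}{dx}\big(f(x)\big) \\
&= \frac{d}{dx}(xy) \\
&= x\frac{dy}{dx} + y\frac{d}{dx}x && \text{(by the product rule)} \\
&= x\frac{dy}{dz}\frac{dz}{dx} + y && \text{(by the chain rule)} \\
&= x\cdot\frac{1}{3^x}\cdot 3^x\log 3 + \log 3^x \\
&= x\log 3 + \log 3^x \\
&= \log 3^x + \log 3^x \\
&= 2\log 3^x
\end{align*}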

(b) $f(x) = \sigma(w_1\,\sigma(w_0 x + b_0) + b_1)$,

where $w_1, w_0, b_0, b_1$ are constants and σ(x) is the sigmoid function defined in Q1(a).

Solution: Using a change of variables, we can write f(x) as:

\[ f(x) = \sigma(\underbrace{w_1\,\sigma(\underbrace{w_0 x + b_0}_{=z}) + b_1}_{=y}) \]

where,

\[ z = w_0 x + b_0 \quad\therefore\quad \frac{dz}{dx} = \frac{d}{dx}(w_0 x + b_0) = w_0 \]

and

\[ y = w_1\,\sigma(z) + b_1 \quad\therefore\quad \frac{dy}{dz} = w_1\frac{d\sigma(z)}{dz} = w_1\,\sigma(z)(1 - \sigma(z)) \]

Therefore, we can write f(x) in terms of y, which itself can be written in terms
of z, i.e.,

\[ f(x) = \sigma(y) \]

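The derivative of f(x) can be found as given below. Also, recall from Q1(a) that
the derivative of σ(x) w.r.t. x is given by $\sigma'(x) = \sigma(x)(1 - \sigma(x))$.

\begin{align*}
f(x) &= \sigma(y) \\
f'(x) &= \frac{d}{dx}\sigma(y) \\
&= \frac{d}{dy}\sigma(y)\,\frac{dy}{dx} && \text{(by the chain rule)} \\
&= \sigma(y)(1-\sigma(y))\,\frac{dy}{dz}\frac{dz}{dx} && \text{(by the chain rule)} \\
&= \sigma(y)(1-\sigma(y))\,w_1\,\sigma(z)(1-\sigma(z))\,w_0
\end{align*}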
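
The chain-rule result mirrors a forward pass followed by a backward pass through a tiny two-layer network. A minimal sketch follows, assuming hypothetical values for the constants $w_0, b_0, w_1, b_1$ and for the input x, which are not given in the question.

```python
import math

def sigmoid(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))

def f(x, w0, b0, w1, b1):
    """Forward pass: z = w0 x + b0, y = w1 sigma(z) + b1, f = sigma(y)."""
    z = w0 * x + b0
    y = w1 * sigmoid(z) + b1
    return sigmoid(y)

def f_grad(x, w0, b0, w1, b1):
    """Backward pass: product of the local derivatives along the chain."""
    z = w0 * x + b0
    s_z = sigmoid(z)
    y = w1 * s_z + b1
    s_y = sigmoid(y)
    return s_y * (1 - s_y) * w1 * s_z * (1 - s_z) * w0

# Hypothetical constants and input, chosen only for illustration.
params = dict(w0=0.7, b0=-0.3, w1=1.5, b1=0.2)
x, h = 0.4, 1e-6
numeric = (f(x + h, **params) - f(x - h, **params)) / (2 * h)
analytic = f_grad(x, **params)
print(analytic, numeric)
```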

3. Taylor Series
(a) Consider $x \in \mathbb{R}$ and $f(x) \in \mathbb{R}$. Write down the Taylor series expansion of f(x).

Solution: A function f(x) can be expanded around a given point x by the Taylor series:

\[ f(x + \delta x) = f(x) + f'(x)(\delta x) + \frac{f''(x)}{2!}(\delta x)^2 + \ldots + \frac{f^{(n)}(x)}{n!}(\delta x)^n + \ldots \]

where δx is very small, $f'(x)$ is the first derivative of f(x) with respect to x and
$f^{(n)}(x)$ is the nth derivative of f(x) with respect to x.
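
To see how truncating the series behaves, the sketch below applies the expansion to the exponential function, chosen here only because every derivative of $e^x$ is $e^x$ itself. The function name, the expansion point x = 1 and the step δx = 0.1 are illustrative assumptions.

```python
import math

def taylor_exp(x: float, dx: float, order: int) -> float:
    """Truncated Taylor expansion of exp around x, keeping terms up to `order`.

    For f = exp, the n-th derivative at x is exp(x), so each term is
    exp(x) * dx**n / n!.
    """
    return sum(math.exp(x) * dx ** n / math.factorial(n) for n in range(order + 1))

x, dx = 1.0, 0.1
exact = math.exp(x + dx)
# Absolute error of the order-0, 1, 2 and 3 approximations.
errors = [abs(exact - taylor_exp(x, dx, order)) for order in range(4)]
for order, err in enumerate(errors):
    print(order, err)
```

Each additional term shrinks the error by roughly a factor of δx, which is why a first- or second-order truncation is usually adequate when δx is small.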

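(b) Consider $x \in \mathbb{R}^n$ and $f(x) \in \mathbb{R}$. Write down the Taylor series expansion of f(x).

Solution: A function f(x), where x is a vector in $\mathbb{R}^n$, can be expanded by the
Taylor series as follows:

\[ f(x + \delta x) = f(x) + \big(\nabla_x f(x)\big)^T \delta x + \frac{1}{2!}\,\delta x^T\, \nabla_x^2 f(x)\, \delta x + \ldots \]

where,

\[ \delta x = [\delta x_1, \ldots, \delta x_n]^T \]

\[ \nabla_x f(x) = \begin{bmatrix} \frac{\partial f(x)}{\partial x_1} \\ \vdots \\ \frac{\partial f(x)}{\partial x_n} \end{bmatrix} \]

\[ \nabla_x^2 f(x) = \begin{bmatrix}
\frac{\partial^2 f(x)}{\partial x_1^2} & \frac{\partial^2 f(x)}{\partial x_1 \partial x_2} & \ldots & \frac{\partial^2 f(x)}{\partial x_1 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f(x)}{\partial x_n \partial x_1} & \frac{\partial^2 f(x)}{\partial x_n \partial x_2} & \ldots & \frac{\partial^2 f(x)}{\partial x_n^2}
\end{bmatrix} \]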

4. Softmax Function
(a) How is the softmax function defined?

Solution: The softmax function squashes a K-dimensional vector v of arbitrary
real values to a K-dimensional vector softmax(v) of real values, where each
entry is in the range (0, 1) and all the entries add up to 1.
The softmax function is defined as:

\[ \mathrm{softmax}(v)_j = \frac{e^{v_j}}{\sum_{k=1}^{K} e^{v_k}}, \qquad j = 1, 2, \ldots, K \]

For example:
Let v = [2.1  4.8  3.5]; then its softmax is:

\begin{align*}
\mathrm{softmax}(v)_1 &= \frac{e^{v_1}}{\sum_{k=1}^{3} e^{v_k}} \qquad \text{(note that here } K = 3\text{)} \\
&= \frac{e^{2.1}}{e^{2.1} + e^{4.8} + e^{3.5}} = 0.0502 \\
\mathrm{softmax}(v)_2 &= \frac{e^{v_2}}{\sum_{k=1}^{3} e^{v_k}} \\
&= \frac{e^{4.8}}{e^{2.1} + e^{4.8} + e^{3.5}} = 0.7464 \\
\mathrm{softmax}(v)_3 &= \frac{e^{v_3}}{\sum_{k=1}^{3} e^{v_k}} \\
&= \frac{e^{3.5}}{e^{2.1} + e^{4.8} + e^{3.5}} = 0.2034
\end{align*}

Therefore, softmax(v) = [0.0502  0.7464  0.2034]
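
The worked example can be reproduced in a few lines. This is a sketch in plain Python; subtracting the maximum entry before exponentiating is a common numerical-stability measure added here as an assumption, and it does not change the result because the common factor cancels in the ratio.

```python
import math

def softmax(v):
    """Softmax of a list of real values.

    The maximum is subtracted before exponentiating to avoid overflow;
    mathematically the output is unchanged.
    """
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.1, 4.8, 3.5])
print([round(p, 4) for p in probs])  # [0.0502, 0.7464, 0.2034]
print(sum(probs))                    # 1.0 up to floating-point rounding
```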

(b) Can you think of any concept which is similar to what the softmax function
computes? (Hint: You probably learnt it in high school)

Solution: The output of the softmax function can be interpreted as a probability
distribution over the K components of the input vector: every entry lies in (0, 1)
and the entries sum to 1.

5. Matrix Multiplication
(a) What are the four ways of multiplying two matrices?

Solution:

1. The most common way of finding the product of two matrices A and B is
to compute the ij-th element of the resultant product matrix C using the
i-th row of A and the j-th column of B. For example, suppose matrix A is of
size m × n with elements $a_{ij}$ and matrix B is of size n × p with elements $b_{jk}$;
then multiplying matrices A and B will produce a matrix C of size m × p.
The ij-th element of this matrix is computed as

\[ c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \]

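2. The second way is to realise that the columns of C are linear combinations
of the columns of A. To get the i-th column of C, multiply the whole
matrix A with the i-th column of B. (Remember that a matrix times a column
is a column.)
Example: Let A be a 3 × 2 matrix and B be a 2 × 3 matrix. Then,

\begin{align*}
C &= AB \\
&= \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}
\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix} \\
&= \Bigg[\
\underbrace{\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}\begin{bmatrix} b_{11} \\ b_{21} \end{bmatrix}}_{\text{1st column of } C}\quad
\underbrace{\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}\begin{bmatrix} b_{12} \\ b_{22} \end{bmatrix}}_{\text{2nd column of } C}\quad
\underbrace{\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}\begin{bmatrix} b_{13} \\ b_{23} \end{bmatrix}}_{\text{3rd column of } C}
\ \Bigg] \\
&= \begin{bmatrix}
a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} & a_{11}b_{13} + a_{12}b_{23} \\
a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} & a_{21}b_{13} + a_{22}b_{23} \\
a_{31}b_{11} + a_{32}b_{21} & a_{31}b_{12} + a_{32}b_{22} & a_{31}b_{13} + a_{32}b_{23}
\end{bmatrix}
\end{align*}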

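3. The third way is to realise that the rows of C are linear combinations
of the rows of B. To get the i-th row of C, multiply the i-th row of A with the
whole matrix B. (Remember that a row times a matrix is a row.)
Example: Let A be a 3 × 2 matrix and B be a 2 × 3 matrix.

\begin{align*}
C &= AB \\
&= \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}
\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix} \\
&= \left[\begin{array}{c}
\begin{bmatrix} a_{11} & a_{12} \end{bmatrix}\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix} \\[2ex]
\begin{bmatrix} a_{21} & a_{22} \end{bmatrix}\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix} \\[2ex]
\begin{bmatrix} a_{31} & a_{32} \end{bmatrix}\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix}
\end{array}\right]
\begin{array}{l}
\leftarrow \text{1st row of } C \\[2ex]
\leftarrow \text{2nd row of } C \\[2ex]
\leftarrow \text{3rd row of } C
\end{array} \\
&= \begin{bmatrix}
a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} & a_{11}b_{13} + a_{12}b_{23} \\
a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} & a_{21}b_{13} + a_{22}b_{23} \\
a_{31}b_{11} + a_{32}b_{21} & a_{31}b_{12} + a_{32}b_{22} & a_{31}b_{13} + a_{32}b_{23}
\end{bmatrix}
\end{align*}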

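4. The fourth way is to look at the product AB as a sum of (columns of
A) times (rows of B).
Example: Let A be a 3 × 2 matrix and B be a 2 × 3 matrix. Then,

\begin{align*}
C &= AB \\
&= \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}
\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix} \\
&= \underbrace{\begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \end{bmatrix}}_{\text{1st column of } A}
\underbrace{\begin{bmatrix} b_{11} & b_{12} & b_{13} \end{bmatrix}}_{\text{1st row of } B}
+ \underbrace{\begin{bmatrix} a_{12} \\ a_{22} \\ a_{32} \end{bmatrix}}_{\text{2nd column of } A}
\underbrace{\begin{bmatrix} b_{21} & b_{22} & b_{23} \end{bmatrix}}_{\text{2nd row of } B} \\
&= \begin{bmatrix}
a_{11}b_{11} & a_{11}b_{12} & a_{11}b_{13} \\
a_{21}b_{11} & a_{21}b_{12} & a_{21}b_{13} \\
a_{31}b_{11} & a_{31}b_{12} & a_{31}b_{13}
\end{bmatrix}
+ \begin{bmatrix}
a_{12}b_{21} & a_{12}b_{22} & a_{12}b_{23} \\
a_{22}b_{21} & a_{22}b_{22} & a_{22}b_{23} \\
a_{32}b_{21} & a_{32}b_{22} & a_{32}b_{23}
\end{bmatrix} \\
&= \begin{bmatrix}
a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} & a_{11}b_{13} + a_{12}b_{23} \\
a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} & a_{21}b_{13} + a_{22}b_{23} \\
a_{31}b_{11} + a_{32}b_{21} & a_{31}b_{12} + a_{32}b_{22} & a_{31}b_{13} + a_{32}b_{23}
\end{bmatrix}
\end{align*}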
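
The four views of matrix multiplication can be written as four small functions that all return the same product. This is a sketch using nested Python lists; the function names and the numeric matrices A and B are illustrative assumptions, and a numerical library would normally be used in practice.

```python
def matmul_elementwise(A, B):
    """Way 1: c_ij is the i-th row of A dotted with the j-th column of B."""
    m, n, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

def matmul_by_columns(A, B):
    """Way 2: the j-th column of C is A times the j-th column of B."""
    m, n, p = len(A), len(B), len(B[0])
    cols = []
    for j in range(p):
        b_col = [B[k][j] for k in range(n)]
        cols.append([sum(A[i][k] * b_col[k] for k in range(n)) for i in range(m)])
    # cols[j][i] holds C[i][j]; rearrange into row-major form.
    return [[cols[j][i] for j in range(p)] for i in range(m)]

def matmul_by_rows(A, B):
    """Way 3: the i-th row of C is the i-th row of A times B."""
    n, p = len(B), len(B[0])
    return [[sum(a_row[k] * B[k][j] for k in range(n)) for j in range(p)]
            for a_row in A]

def matmul_outer_sum(A, B):
    """Way 4: C is the sum over k of (k-th column of A) times (k-th row of B)."""
    m, n, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(m)]
    for k in range(n):
        for i in range(m):
            for j in range(p):
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4], [5, 6]]   # 3 x 2
B = [[7, 8, 9], [10, 11, 12]]  # 2 x 3
results = [fn(A, B) for fn in (matmul_elementwise, matmul_by_columns,
                               matmul_by_rows, matmul_outer_sum)]
print(results[0])  # [[27, 30, 33], [61, 68, 75], [95, 106, 117]]
```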

(b) Consider a matrix A of size m × n and a vector x of size n. What is the result of
the matrix-vector multiplication Ax? Is it a vector or a matrix? What are the
dimensions of the product?

Solution: It will be a vector of size m.

(c) Consider two vectors x and y ∈ $\mathbb{R}^n$. What is $xy^T$? Is it a matrix of size n × n, a
vector of size n or a scalar?

Solution: It will be a matrix of size n × n (the outer product of x and y), whose
ij-th entry is $x_i y_j$.

6. L2-norm
(a) What is meant by the L2-norm of a vector?

Solution: The L2-norm of a vector $v = [v_1, v_2, \ldots, v_n]$ is defined as the square root
of the sum of squares of the absolute values of the vector components and is
written as

\[ \|v\|_2 = \sqrt{\sum_{i=1}^{n} |v_i|^2} \]

(b) Given a vector $v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \in \mathbb{R}^3$, find its L2-norm, i.e. $\|v\|_2$.

Solution: $\|v\|_2 = \sqrt{v_1^2 + v_2^2 + v_3^2}$

(c) Given a vector $v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \in \mathbb{R}^n$, find its L2-norm, i.e. $\|v\|_2$.

Solution: $\|v\|_2 = \sqrt{\sum_{i=1}^{n} v_i^2}$

7. Euclidean Distance
Consider two vectors x and y ∈ $\mathbb{R}^n$. How would you compute the Euclidean distance
between the two vectors?

Solution: Let $x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$ and $y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$ be the two vectors. The Euclidean distance,
d, between the two vectors can then be calculated as:

\[ d = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \ldots + (x_n - y_n)^2} \]
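
The Euclidean distance is the L2-norm of the difference vector, which the following sketch makes explicit. The function names and the sample vectors are illustrative assumptions.

```python
import math

def l2_norm(v):
    """Square root of the sum of squared absolute values of the components."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def euclidean_distance(x, y):
    """Euclidean distance computed as the L2-norm of x - y."""
    return l2_norm([xi - yi for xi, yi in zip(x, y)])

print(l2_norm([3.0, 4.0]))                                    # 5.0
print(euclidean_distance([1.0, 2.0, 3.0], [4.0, 6.0, 3.0]))   # 5.0
```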

8. Consider two vectors x and y ∈ $\mathbb{R}^n$. How do you compute the dot product between the
two vectors? Is it a matrix of size n × n, a vector of size n or a scalar?

Solution: Let $x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$ and $y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$ be the two vectors. Then, the dot product
between them is a scalar, defined as follows:

\begin{align*}
x \cdot y &= x^T y \\
&= x_1 y_1 + x_2 y_2 + \ldots + x_n y_n \\
&= \sum_{i=1}^{n} x_i y_i
\end{align*}

9. Consider two vectors x and y ∈ $\mathbb{R}^n$. How do you compute the cosine of the angle between
the two vectors?

Solution: Let $x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$ and $y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$ be the two vectors and θ be the angle
between them. Then, the cosine of the angle between the two vectors is given by:

\[ \cos\theta = \frac{x \cdot y}{\|x\|_2\,\|y\|_2} \]
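
Both quantities reduce to sums of elementwise products, as the sketch below shows. The function names and sample vectors are assumptions for illustration; the cosine is undefined when either vector is the zero vector, which this sketch does not guard against.

```python
import math

def dot(x, y):
    """Dot product: sum of elementwise products (a scalar)."""
    return sum(xi * yi for xi, yi in zip(x, y))

def cosine(x, y):
    """cos(theta) = x . y / (||x|| ||y||); both vectors must be non-zero."""
    return dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))

print(dot([1, 2, 3], [4, 5, 6]))        # 32
print(cosine([1.0, 0.0], [1.0, 1.0]))   # about 0.7071, i.e. a 45 degree angle
print(cosine([1.0, 0.0], [0.0, 2.0]))   # 0.0 for perpendicular vectors
```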

10. Basic Geometry

(a) What is the equation of a line?

Solution: The equation of a line can be written as:

\[ y = mx + b \]

Note that it can also be rewritten as:

\[ a_1 x_1 + a_2 x_2 = b \]

where $x_1 = x$, $x_2 = y$, $a_1 = -m$, $a_2 = 1$.

(b) What is the equation of a plane in 3 dimensions (assume the axes are $x_1, x_2, x_3$)?

Solution: The equation of a plane in 3 dimensions is:

\[ a_1 x_1 + a_2 x_2 + a_3 x_3 = b \]

where $x_1, x_2, x_3$ are the axes and $a_1, a_2, a_3, b$ are the coefficients.

(c) What is the equation of a plane in n dimensions?

Solution: The equation of a plane in n dimensions is:

\[ \sum_{i=1}^{n} a_i x_i = b \]

where $x_i$ are the axes and $a_i$, b are the coefficients.

11. Basis
Consider a set of vectors $S = \{v_1, v_2, \ldots, v_n\} \subset \mathbb{R}^n$. When do you say that these
vectors form a basis in $\mathbb{R}^n$?

Solution: A set of vectors $S = \{v_1, v_2, \ldots, v_n\} \subset \mathbb{R}^n$ forms a basis in $\mathbb{R}^n$ if and only
if the following conditions are satisfied:

1. $v_1, v_2, \ldots, v_n$ are linearly independent vectors.

2. S spans $\mathbb{R}^n$, i.e. every vector in $\mathbb{R}^n$ can be represented as a linear combination
of vectors in S.
For example, if $x \in \mathbb{R}^n$ then we can write

\[ x = c_1 v_1 + c_2 v_2 + \ldots + c_n v_n \]

where the $v_i \in S$ form the basis of $\mathbb{R}^n$ and the $c_i$ are coefficients, ∀i ∈ {1, 2, . . . , n}.

For example:
The unit basis vectors for $\mathbb{R}^3$ are $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$. Note that you can represent
any vector $v \in \mathbb{R}^3$ as a linear combination of these three basis vectors.

12. Orthogonal Vectors

(a) When are two vectors u and v ∈ $\mathbb{R}^n$ said to be orthogonal?

Solution: Two vectors u and v are said to be orthogonal when their
dot product is zero, i.e. $u \cdot v = u^T v = 0$.

(b) Are the following vectors orthogonal to each other?

\[ v_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

Solution: From part (a) of this question, we know that two vectors u and v are
said to be orthogonal if their dot product is zero. Therefore, to check whether
$v_1$, $v_2$ and $v_3$ are orthogonal, we have to find the dot product between them.
We do this by taking two vectors at a time.

\begin{align*}
v_1 \cdot v_2 &= v_1^T v_2 = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = 0 \\
v_2 \cdot v_3 &= v_2^T v_3 = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = 0 \\
v_1 \cdot v_3 &= v_1^T v_3 = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = 0
\end{align*}

As we can see, the dot product of every pair of the above 3 vectors is zero.
Therefore, $v_1$, $v_2$ and $v_3$ are orthogonal to each other.

13. Consider two vectors a and b ∈ $\mathbb{R}^n$. What is the vector projection of b onto a?

Solution: The vector projection of b onto a lies along the direction of vector a;
it is a scaled version of a, where the scale factor depends on the vector b.
The vector projection of b onto a is given by

\[ \frac{a \cdot b}{\|a\|_2^2}\, a = \frac{a^T b}{\|a\|_2^2}\, a \]
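
A short sketch of the projection formula, assuming a is non-zero; the function name and the sample vectors are illustrative. It also checks the defining property that what remains of b after removing the projection is orthogonal to a.

```python
def project(b, a):
    """Vector projection of b onto a: (a . b / a . a) * a. Requires a != 0."""
    scale = sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)
    return [scale * ai for ai in a]

a = [2.0, 0.0]
b = [3.0, 4.0]
p = project(b, a)
print(p)  # [3.0, 0.0]

# The residual b - p is orthogonal to a, so its dot product with a is zero.
residual = [bi - pi for bi, pi in zip(b, p)]
print(sum(ri * ai for ri, ai in zip(residual, a)))  # 0.0
```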

14. Consider a matrix A and a vector x. We say that x is an eigenvector of A if?

Solution: A non-zero vector x is an eigenvector of A if Ax = λx, where λ is a scalar
called the corresponding eigenvalue.
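
The definition can be verified directly for a candidate pair. In this sketch the matrix A, the vector x and the value lam are hypothetical examples chosen so that the arithmetic is easy to follow by hand.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[2.0, 1.0],
     [1.0, 2.0]]
x = [1.0, 1.0]   # candidate eigenvector (hypothetical example)
lam = 3.0        # candidate eigenvalue

Ax = matvec(A, x)
print(Ax, [lam * xi for xi in x])  # both are [3.0, 3.0], so Ax = lam * x holds
```

For the same matrix, x = [1, -1] satisfies Ax = 1 · x, so it is an eigenvector with eigenvalue 1.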

15. Consider a set of vectors $x_1, x_2, \ldots, x_n \in \mathbb{R}^n$. We say that $x_1, x_2, \ldots, x_n$ form an
orthonormal basis in $\mathbb{R}^n$ if?

Solution: $\{x_1, x_2, \ldots, x_n\}$ form an orthonormal basis in $\mathbb{R}^n$ if the vectors are
orthogonal to each other and each has unit length, i.e. $x_i^T x_j = 0$ for $i \neq j$ and
$\|x_i\|_2 = 1$ for all i.
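
Both conditions amount to checking that $x_i^T x_j$ equals 1 when i = j and 0 otherwise. A minimal sketch, with the function names, tolerance and sample sets chosen for illustration:

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def is_orthonormal(vectors, tol=1e-9):
    """True if every pair has dot product 0 and every vector has unit length."""
    for i, u in enumerate(vectors):
        for j, v in enumerate(vectors):
            expected = 1.0 if i == j else 0.0
            if abs(dot(u, v) - expected) > tol:
                return False
    return True

standard = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
r = 1 / math.sqrt(2)
rotated = [[r, r], [-r, r]]          # standard basis of R^2 rotated by 45 degrees
scaled = [[2.0, 0.0], [0.0, 1.0]]    # orthogonal, but the first vector is not unit length

print(is_orthonormal(standard), is_orthonormal(rotated), is_orthonormal(scaled))
# True True False
```

A set of n orthonormal vectors in $\mathbb{R}^n$ is automatically linearly independent, so passing this check with n vectors is enough for the set to be a basis.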

16. Consider a set of vectors $x_1, x_2, \ldots, x_n \in \mathbb{R}^n$. We say that $x_1, x_2, \ldots, x_n$ are linearly
independent if?

Solution: We say that $x_1, x_2, \ldots, x_n$ are linearly independent if no vector in the
set can be written as a linear combination of the remaining vectors in the set.
On the other hand, a vector $x_i$ is said to be linearly dependent on the other vectors
in the set if it can be written as a linear combination of them:

\begin{align*}
& c_1 x_1 + \ldots + c_{i-1} x_{i-1} + c_{i+1} x_{i+1} + \ldots + c_n x_n = x_i \\
\implies\ & c_1 x_1 + \ldots + c_{i-1} x_{i-1} + c_{i+1} x_{i+1} + \ldots + c_n x_n + (-1)x_i = 0 \\
\implies\ & \sum_{k=1}^{n} c_k x_k = 0, \quad \text{where } c_i = -1
\end{align*}

But for a set of linearly independent vectors, no vector in the set can be written as
a linear combination of the remaining vectors in the set. An alternate way of saying
this is that a set of vectors is linearly independent if the only solution to the equation

\[ \sum_{k=1}^{n} c_k x_k = 0 \quad \text{is} \quad c_k = 0 \ \ \forall k \in \{1, 2, \ldots, n\} \]
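
One practical way to test the condition above is to compute the rank of the matrix whose rows are the vectors: the set is linearly independent exactly when the rank equals the number of vectors. The sketch below uses Gaussian elimination with partial pivoting in plain Python; the function names, the tolerance and the sample vectors are assumptions, and a numerical library would normally be preferred.

```python
def rank(rows, tol=1e-10):
    """Rank of a matrix (list of rows) via Gaussian elimination with partial pivoting."""
    M = [[float(v) for v in row] for row in rows]
    n_rows, n_cols = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        if r == n_rows:
            break
        # Pick the row at or below r with the largest entry in column c.
        pivot = max(range(r, n_rows), key=lambda i: abs(M[i][c]))
        if abs(M[pivot][c]) < tol:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, n_rows):
            factor = M[i][c] / M[r][c]
            for j in range(c, n_cols):
                M[i][j] -= factor * M[r][j]
        r += 1
    return r

def linearly_independent(vectors):
    """True if no vector is a linear combination of the others."""
    return rank(vectors) == len(vectors)

print(linearly_independent([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))  # True
print(linearly_independent([[1, 2, 3], [2, 4, 6], [0, 1, 0]]))  # False: second = 2 * first
```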

17. Consider a vector $x \in \mathbb{R}^n$ and a matrix $A \in \mathbb{R}^{n \times n}$. The product $x^T A x$ can be written
as $\sum_{i=1}^{n} \sum_{j=1}^{n}$ ?

Solution: $\sum_{i=1}^{n} \sum_{j=1}^{n} x_i A_{ij} x_j$
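
The double sum can be compared against computing $x^T(Ax)$ in two steps. This is a sketch with hypothetical values for x and A; the function names are illustrative.

```python
def quadratic_form(x, A):
    """x^T A x written as the double sum over i and j of x_i * A_ij * x_j."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def quadratic_form_matvec(x, A):
    """Same value computed as x^T (A x)."""
    n = len(x)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(xi * axi for xi, axi in zip(x, Ax))

x = [1, 2]
A = [[1, 2],
     [3, 4]]
print(quadratic_form(x, A), quadratic_form_matvec(x, A))  # 27 27
```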

18. KL Divergence
(a) Consider a discrete random variable X which can take one of k values from the
set $\{x_1, \ldots, x_k\}$. A distribution over X defines the value of Pr(X = x) ∀x ∈
$\{x_1, \ldots, x_k\}$. Consider two such distributions P and Q. How do you compute the
KL divergence between P and Q?

Solution: The KL divergence between two distributions P and Q can be
calculated as:

\begin{align*}
D_{KL}(P\|Q) &= -\sum_{x} P(x) \log\frac{Q(x)}{P(x)} \\
&= \sum_{x} P(x) \log\frac{P(x)}{Q(x)} \\
&= \mathbb{E}_{X \sim P}\left[\log\frac{P(x)}{Q(x)}\right]
\end{align*}

For example,
Consider a discrete random variable X which can take one of 3 values from the
set $\{x_1, x_2, x_3\}$. A distribution over X defines the value of Pr(X = x) ∀x ∈
$\{x_1, x_2, x_3\}$. Consider two such distributions P and Q which are defined as
follows:

\[ P = \Big[\ \underbrace{0}_{\Pr(X = x_1)}\quad \underbrace{1}_{\Pr(X = x_2)}\quad \underbrace{0}_{\Pr(X = x_3)}\ \Big] \]

\[ Q = \Big[\ \underbrace{0.228}_{\Pr(X = x_1)}\quad \underbrace{0.619}_{\Pr(X = x_2)}\quad \underbrace{0.153}_{\Pr(X = x_3)}\ \Big] \]

Then, the KL divergence between P and Q can be calculated as follows (using base-2
logarithms and the convention 0 log 0 = 0):

\begin{align*}
D_{KL}(P\|Q) &= 0.0 \cdot \log\frac{0}{0.228} + 1.0 \cdot \log\frac{1}{0.619} + 0.0 \cdot \log\frac{0}{0.153} \\
&\approx 0.692
\end{align*}
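
The example can be reproduced with a short function. This sketch assumes base-2 logarithms (which matches the numerical value in the example) and skips terms where P(x) = 0, following the usual convention that such terms contribute nothing; it also assumes Q(x) > 0 wherever P(x) > 0. The function name is illustrative.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) in bits; terms with P(x) = 0 contribute 0 by convention."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

P = [0.0, 1.0, 0.0]
Q = [0.228, 0.619, 0.153]
print(round(kl_divergence(P, Q), 3))  # 0.692
```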

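(b) Is KL divergence symmetric?

Solution: KL divergence is not symmetric, as in general $D_{KL}(P\|Q) \neq D_{KL}(Q\|P)$, which
can be seen by writing out the reverse divergence, where the expectation is taken
under Q instead of P:

\begin{align*}
D_{KL}(Q\|P) &= -\sum_{x} Q(x) \log\frac{P(x)}{Q(x)} \\
&= \sum_{x} Q(x) \log\frac{Q(x)}{P(x)} \\
&= \mathbb{E}_{X \sim Q}\left[\log\frac{Q(x)}{P(x)}\right] \\
&\neq D_{KL}(P\|Q)
\end{align*}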
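
A concrete pair of distributions makes the asymmetry visible. In the sketch below, P and Q are hypothetical two-outcome distributions chosen for illustration, and base-2 logarithms are assumed.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) in bits; terms with P(x) = 0 contribute 0 by convention."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

P = [0.5, 0.5]  # hypothetical distributions for illustration
Q = [0.9, 0.1]
forward = kl_divergence(P, Q)   # roughly 0.737
reverse = kl_divergence(Q, P)   # roughly 0.531
print(forward, reverse)
```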

19. Cross Entropy

Given two distributions P and Q defined over a discrete random variable X, how do you
compute the cross entropy between the two distributions?

Solution: The cross entropy between two distributions P and Q is given by

\[ H(P, Q) = -\sum_{x} P(x) \log Q(x) \]

For example,
Consider a discrete random variable X which can take one of 3 values from the set
$\{x_1, x_2, x_3\}$. A distribution over X defines the value of Pr(X = x) ∀x ∈ $\{x_1, x_2, x_3\}$.
Consider two such distributions P and Q which are defined as follows:

\[ P = \Big[\ \underbrace{0}_{\Pr(X = x_1)}\quad \underbrace{1}_{\Pr(X = x_2)}\quad \underbrace{0}_{\Pr(X = x_3)}\ \Big] \]

\[ Q = \Big[\ \underbrace{0.228}_{\Pr(X = x_1)}\quad \underbrace{0.619}_{\Pr(X = x_2)}\quad \underbrace{0.153}_{\Pr(X = x_3)}\ \Big] \]

Then, the cross-entropy between P and Q can be calculated as follows (using base-2
logarithms):

\begin{align*}
H(P, Q) &= -(0.0 \cdot \log(0.228) + 1.0 \cdot \log(0.619) + 0.0 \cdot \log(0.153)) \\
&\approx 0.692
\end{align*}
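
The cross entropy here equals the KL divergence of the same P and Q because of the identity $H(P, Q) = H(P) + D_{KL}(P\|Q)$ and the fact that a one-hot P has entropy H(P) = 0. The sketch below checks this numerically, assuming base-2 logarithms; the function names are illustrative.

```python
import math

def entropy(p):
    """H(P) in bits; terms with P(x) = 0 contribute 0 by convention."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """H(P, Q) = -sum_x P(x) log2 Q(x), skipping terms where P(x) = 0."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """D_KL(P || Q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

P = [0.0, 1.0, 0.0]
Q = [0.228, 0.619, 0.153]
print(round(cross_entropy(P, Q), 3))          # 0.692
print(entropy(P) + kl_divergence(P, Q))       # same value, since H(P) = 0 here
```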
