
MFML/MFDS Comprehensive Regular - Solutions

Q Answer the following questions with justifications.


(A) Suppose A = [C1, C2, C3, C4] where Cm ∈ R^4, m = 1, 2, 3, 4 are the columns of A. It is known that rank(A) = 2, C2 = 3C1, and C4 = 2C1 + 3C3. If a particular solution of Ax = b is [1, 0, 1, 0]^T, then
(i) Find the general solution of Ax = b. [2 Marks]
(ii) Find b if RREF(A) = A. [2 Marks]

(B) Consider M = {A ∈ R^{2×2} | A = −A^T}.


(i) Prove that M is a subspace of the vector space V, where V is the set of all 2 × 2 real matrices. [1.5 Marks]
(ii) Prove or disprove that
[1 0]  [0 −1]  [0 0]
[0 0], [1  0], [0 1]
is a basis for M defined in (i). [2.5 Marks]
(iii) Prove or disprove that
[1 −1]  [5 −5]  [0 −3]
[1  0], [5  5], [3  1]
is a linearly independent set. [2 Marks]
Solution
(A) Now A = [C1, C2, C3, C4] where Cm ∈ R^4, m = 1, 2, 3, 4 are the columns of A. It is known that rank(A) = 2.

(i) C2 = 3C1 ⇒ 3C1 − C2 + 0C3 + 0C4 = 0
⇒ [C1, C2, C3, C4] [3, −1, 0, 0]^T = 0 [0.5 Marks]
C4 = 2C1 + 3C3 ⇒ 2C1 + 0C2 + 3C3 − 1C4 = 0
⇒ [C1, C2, C3, C4] [2, 0, 3, −1]^T = 0 [0.5 Marks]
Since rank(A) = 2, the null space of A has dimension 4 − 2 = 2, and the two vectors above are linearly independent, so they span the null space. Thus, the general solution of Ax = b is given by
{ [1, 0, 1, 0]^T + λ1 [3, −1, 0, 0]^T + λ2 [2, 0, 3, −1]^T | λ1, λ2 ∈ R } [0.5 Marks]

(ii) Now A = [C1, C2, C3, C4] = [C1, 3C1, C3, 2C1 + 3C3] and rank(A) = 2.
Therefore C1, C3 are the pivot columns, and RREF(A) = A
⇒ C1 = [1, 0, 0, 0]^T and C3 = [0, 1, 0, 0]^T
⇒ C2 = [3, 0, 0, 0]^T and C4 = [2, 3, 0, 0]^T

⇒ A = [1 3 0 2
       0 0 1 3
       0 0 0 0
       0 0 0 0] [1 Mark]

Therefore b = A [1, 0, 1, 0]^T = C1 + C3 = [1, 1, 0, 0]^T. [0.5 Marks]
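As a quick numerical cross-check (a sketch, not part of the marked solution), NumPy confirms b and the structure of the general solution:

import numpy as np

# A in its RREF form, as derived in (ii)
A = np.array([[1, 3, 0, 2],
              [0, 0, 1, 3],
              [0, 0, 0, 0],
              [0, 0, 0, 0]], dtype=float)

x_p = np.array([1, 0, 1, 0], dtype=float)   # particular solution
n1 = np.array([3, -1, 0, 0], dtype=float)   # null-space direction from C2 = 3C1
n2 = np.array([2, 0, 3, -1], dtype=float)   # null-space direction from C4 = 2C1 + 3C3

print(A @ x_p)                # [1. 1. 0. 0.] = b
print(A @ n1, A @ n2)         # both are zero vectors
lam1, lam2 = 2.0, -1.5        # arbitrary choice of the free parameters
print(np.allclose(A @ (x_p + lam1 * n1 + lam2 * n2), A @ x_p))   # True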

(B) (i) Clearly the zero matrix of order 2 satisfies 0 = −0^T, so 0 ∈ M ⇒ M ≠ ∅. [0.5 Marks]
Let A, B ∈ M, k ∈ R
⇒ A = −A^T, B = −B^T
⇒ A + B = −A^T − B^T = −(A + B)^T
⇒ A + B ∈ M. [0.5 Marks]
Also, kA = k(−A^T) = (−k)A^T = −(kA)^T ⇒ kA ∈ M.
So, M is a subspace. [0.5 Marks]
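A small numerical illustration of the closure properties (a sketch, not required by the proof):

import numpy as np

def is_skew(A):
    # membership test for M: A = -A^T
    return np.allclose(A, -A.T)

A = np.array([[0.0, 2.0], [-2.0, 0.0]])
B = np.array([[0.0, -7.0], [7.0, 0.0]])
print(is_skew(A + B), is_skew(3.5 * A))   # True True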
   
(ii) Clearly
[1 0]      [0 0]
[0 0]  and [0 1]
do not belong to M, since neither satisfies A = −A^T, and hence the given set cannot be a basis of M.
(Kindly award full marks for any valid reason for disproving the claim.) [2.5 Marks]
       
(iii) Consider
  [1 −1]     [5 −5]     [0 −3]   [0 0]
a [1  0] + b [5  5] + c [3  1] = [0 0] [0.5 Marks]

⇒ a + 5b = 0
  −a − 5b − 3c = 0
  a + 5b + 3c = 0
  5b + c = 0

⇒ a = b = c = 0. Hence it is a linearly independent set. [1.5 Marks]
(Kindly award full marks for any other correct method.)
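As an optional numerical cross-check (a sketch), flattening the three matrices into vectors and computing the rank confirms independence:

import numpy as np

M1 = np.array([[1, -1], [1, 0]])
M2 = np.array([[5, -5], [5, 5]])
M3 = np.array([[0, -3], [3, 1]])

# stack the flattened matrices as rows; full row rank <=> linear independence
S = np.stack([M1.ravel(), M2.ravel(), M3.ravel()])
print(np.linalg.matrix_rank(S))   # 3, so the set is linearly independent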
Q Answer the following questions with justifications.

(A) Consider the following optimization problem (A) on the data (x1, y1), (x2, y2), . . . , (xn, yn) of the following form:

max Σ_{i=1}^{n} αi − Σ_{i=1}^{n} Σ_{j=1}^{n} αi αj (xi · xj)
subject to Σ_{i=1}^{n} αi yi = 0
           αi ≥ 0, ∀i

Note that here yi is ±1, ∀i and each xi is an n × 1 vector. The variables in the problem are the αi. We form a new optimization problem (B) as follows:

max Σ_{i=1}^{n+1} αi − Σ_{i=1}^{n+1} Σ_{j=1}^{n+1} αi αj (xi · xj)
subject to Σ_{i=1}^{n+1} αi yi = 0
           αi ≥ 0, ∀i, 1 ≤ i ≤ n + 1

where x_{n+1} = (1/2)(xi + xj) for some values of i and j such that yi = yj. We set y_{n+1} = yi. Show that the maximum value of the objective function for problem (B) is greater than or equal to the maximum value of the objective function for problem (A). Justify your solution mathematically. [3 Marks]

(B) Let x and z be two n × 1 vectors, for which a kernel function is defined as K(x, z) = (x^T z)^2 + 3(x^T z + 2)^2. If possible, find a mapping ϕ from the space of n × 1 vectors to the space of (n^2 + n + 1) × 1 vectors for which the given kernel function represents the inner product. Otherwise, explain why such a mapping is not possible for the given kernel function. [5 Marks]
(C) Consider a gradient update rule given by:

a_{t+1} = γ a_t + (1 − γ) ∇_w(L)
w_{t+1} = w_t − a_{t+1}

What is the contribution of a_0 while computing the value of a_5? [2 Marks]
Solutions
(A) Let (α1*, α2*, . . . , αn*) be the optimal solution for problem (A), and let the optimal objective value be O_A. For problem (B), let us set (α1, α2, . . . , αn, α_{n+1}) = (α1*, α2*, . . . , αn*, 0). This tuple is a feasible solution for problem (B), since both constraints are satisfied: the equality constraint gains only the term α_{n+1} y_{n+1} = 0, and all the αi remain nonnegative. The value of the objective function of problem (B) for this assignment to the αi's equals O_A, since every term in the objective function of (B) that does not exist in the objective function of (A) contains α_{n+1}, which is set to zero in this feasible solution. Thus the optimal objective value of problem (B) has to be greater than or equal to O_A, since the optimal value is greater than or equal to the objective value at any feasible solution of the problem.
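A small numerical illustration of this padding argument (a sketch on made-up data; `objective` is an illustrative helper, not part of the question):

import numpy as np

def objective(alpha, X):
    # sum_i alpha_i - sum_i sum_j alpha_i alpha_j (x_i . x_j)
    G = X @ X.T
    return alpha.sum() - alpha @ G @ alpha

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))          # four sample points in R^3
alpha = np.abs(rng.normal(size=4))   # some nonnegative multipliers

# pad with x_{n+1} = (x_1 + x_2)/2 and alpha_{n+1} = 0
X_b = np.vstack([X, (X[0] + X[1]) / 2])
alpha_b = np.append(alpha, 0.0)

# the padded assignment achieves exactly the same objective value in problem (B)
print(np.isclose(objective(alpha, X), objective(alpha_b, X_b)))   # True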

(B) The given kernel function can be written as (x^T z)^2 + 3(x^T z + 2)^2 = 4(x^T z)^2 + 12 x^T z + 12. This can be seen as an inner product ϕ(x)^T ϕ(z), where ϕ = [ϕ1(x), ϕ2(x), ϕ3(x)]^T and ϕ1 is an n^2 × 1 mapping representing the term 4(x^T z)^2, derived as follows:

4(x^T z)^2 = Σ_{i=1}^{n} Σ_{j=1}^{n} 4 xi zi xj zj = Σ_{i=1}^{n} Σ_{j=1}^{n} (2 xi xj)(2 zi zj)

This leads us to the mapping

ϕ1(x) = [2x1x1, 2x1x2, . . . , 2x1xn, 2x2x1, 2x2x2, . . . , 2x2xn, . . . , 2xnx1, 2xnx2, . . . , 2xnxn]^T,

which is an n^2 × 1 mapping. ϕ2(x) is the n × 1 mapping [√12 x1, √12 x2, . . . , √12 xn]^T representing the term 12 x^T z, and ϕ3(x) = √12 represents the constant term in the kernel function. Now ϕ(x)^T ϕ(z) can be seen to equal the given kernel function, and ϕ is of dimension (n^2 + n + 1) × 1.
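A quick numerical check of this mapping (a sketch; `phi` mirrors the construction above):

import numpy as np

def K(x, z):
    return (x @ z) ** 2 + 3 * (x @ z + 2) ** 2

def phi(x):
    phi1 = 2 * np.outer(x, x).ravel()          # n^2 entries for 4(x^T z)^2
    phi2 = np.sqrt(12) * x                     # n entries for 12 x^T z
    phi3 = np.array([np.sqrt(12)])             # constant term 12
    return np.concatenate([phi1, phi2, phi3])  # (n^2 + n + 1)-dimensional

rng = np.random.default_rng(1)
x, z = rng.normal(size=5), rng.normal(size=5)
print(np.isclose(phi(x) @ phi(z), K(x, z)))    # True

(C) Unrolling the update rule: a_1 = γ a_0 + (1 − γ) ∇_w(L)^{(1)}, a_2 = γ^2 a_0 + (1 − γ)(γ ∇_w(L)^{(1)} + ∇_w(L)^{(2)}), and in general a_5 = γ^5 a_0 + (1 − γ) Σ_{k=1}^{5} γ^{5−k} ∇_w(L)^{(k)}, where ∇_w(L)^{(k)} denotes the gradient used at step k. Hence the contribution of a_0 while computing a_5 is γ^5 a_0.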
Q Answer the following questions with justifications.

(A) Consider the following design matrix, representing four sample points Xi ∈ R^2:

X = [4 1
     2 3
     5 4
     1 0]

(i) Compute the unit-length principal component directions of X, and state which principal component you would choose if you were requested to choose just one. Show your computation. [4 Marks]
(ii) Draw the principal component direction (as a line) and the projections of all four sample points onto the principal direction. [3 Marks]
(iii) Reconstruct the data using the rank-1 approximation. [1 Mark]

(B) Assume that you are given the matrix A as below:

A = [2 0 0
     0 4 0
     0 0 4]

You are asked to use an iterative technique (e.g., the power method) to obtain the eigenvector corresponding to the dominant eigenvalue. You start with an initial value of x = [0, 0.5, 1]^T for the iterative technique. To which eigenvalue and eigenvector is the algorithm likely to converge? Give reasons for your answer. [2 Marks]
Solution:
(A) (i)

x | y | x − 3 | y − 2 | (x − 3)^2 | (y − 2)^2 | (x − 3)(y − 2)
4 | 1 |  1 | −1 | 1 | 1 | −1
2 | 3 | −1 |  1 | 1 | 1 | −1
5 | 4 |  2 |  2 | 4 | 4 |  4
1 | 0 | −2 | −2 | 4 | 4 |  4
x̄ = 3, ȳ = 2; column sums: 10, 10, 6
(1.5 marks)

Covariance Matrix:

Σ = (1/N) [Cov(x, x) Cov(x, y)  = (1/4) [10  6  = [2.5 1.5
           Cov(y, x) Cov(y, y)]          6 10]    1.5 2.5]
(0.5 mark)

Eigenvalues:
det(Σ − λI) = 0 ⇒ λ^2 − 5λ + 4 = 0 ⇒ λ1 = 4, λ2 = 1
(1.0 mark)

Eigenvectors:
v1 = [1, 1]^T, v2 = [−1, 1]^T

Unit Eigenvectors:
ê1 = [0.71, 0.71]^T, ê2 = [−0.71, 0.71]^T
Since λ1 = 4 > λ2 = 1, the first principal component ê1 is the one to choose if only one is kept.
(1.0 mark)
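These values can be cross-checked with NumPy (a sketch; note the 1/N covariance convention used above):

import numpy as np

X = np.array([[4, 1], [2, 3], [5, 4], [1, 0]], dtype=float)
Xc = X - X.mean(axis=0)              # mean is (3, 2)

Sigma = (Xc.T @ Xc) / len(X)         # 1/N convention: [[2.5, 1.5], [1.5, 2.5]]
vals, vecs = np.linalg.eigh(Sigma)   # eigenvalues in ascending order
print(vals)                          # [1. 4.]
print(vecs[:, 1])                    # dominant direction, ±[0.707, 0.707]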
(ii) The one-dimensional subspace we are projecting onto is along the principal eigenvector

ê1 = [0.71, 0.71]^T,

which corresponds to the direction of maximum variance in the data. The coordinates (in principal coordinate space) of the four sample points are obtained by projecting as

Y = X ê1 = X [0.71, 0.71]^T

and are given by:

(4, 1) → 3.55, (2, 3) → 3.55, (5, 4) → 6.39, (1, 0) → 0.71
(2.0 marks)

Projection:
[Figure (Projectionpoints.pdf): the principal direction drawn as a line, with the four sample points projected onto it.]
(1 mark)
(iii) The projections of all the centered data points can be written as

Z = X_centered ê1 = [ 1 −1   [0.71    [ 0
                     −1  1  · 0.71] =   0
                      2  2              2.84
                     −2 −2]            −2.84]

X_reconstructed = Z ê1^T = [ 0  0
                             0  0
                             2  2
                            −2 −2]

which is the rank-1 approximation.

Note: in order to get back the original data scale, add the mean to X_reconstructed.
(1 mark)
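A sketch verifying the projection and the rank-1 reconstruction, using the exact 1/√2 entries rather than the rounded 0.71:

import numpy as np

X = np.array([[4, 1], [2, 3], [5, 4], [1, 0]], dtype=float)
mu = X.mean(axis=0)
e1 = np.array([1, 1]) / np.sqrt(2)   # exact form of [0.71, 0.71]

Z = (X - mu) @ e1                    # [0, 0, 2.83, -2.83], i.e. 2*sqrt(2)
X_rec = np.outer(Z, e1)              # [[0, 0], [0, 0], [2, 2], [-2, -2]]
print(X_rec + mu)                    # rank-1 approximation in original coordinates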
(B)

X^(1) = A X^(0) = A [0, 0.5, 1]^T = [0, 2, 4]^T = 4 [0, 0.5, 1]^T = λ X^(0)
(1 mark)

After normalizing, X^(1) = [0, 0.5, 1]^T, and

X^(2) = A X^(1) = [0, 2, 4]^T = 4 [0, 0.5, 1]^T = λ X^(1)
(1 mark)

Here the method converges in the second iteration. The largest eigenvalue is

λ = 4

and the corresponding eigenvector is X = [0, 0.5, 1]^T.

The converged eigenvector is the same as the given initial vector. The given matrix is symmetric positive definite with the dominant eigenvalue repeated (its eigenvalues are 2, 4, 4). Any vector in the eigenspace of the dominant eigenvalue is a valid eigenvector, and the initial vector [0, 0.5, 1]^T already lies in that eigenspace (its first component is zero), so the iteration converges immediately.
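A power-method sketch confirming this behaviour, normalizing by the largest component at each step:

import numpy as np

A = np.array([[2, 0, 0],
              [0, 4, 0],
              [0, 0, 4]], dtype=float)
x = np.array([0, 0.5, 1], dtype=float)

for _ in range(5):
    y = A @ x
    lam = np.abs(y).max()   # estimate of the dominant eigenvalue
    x = y / lam             # normalize by the largest component
print(lam, x)               # 4.0 [0. 0.5 1.] -- the iterate never changes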
Q Answer the following questions with justifications.
The block diagram below shows a system with input x, output L, and the computations.

Figure 1

Here,
x ∈ R^2
f1: p = w1^T x + b1
f2: q = 1 / (1 + e^{−p})
f3: r = w2^T x + b2
f4: s = (e^r − e^{−r}) / (e^r + e^{−r})
m = q + s
L = (1/2)(m − m_l)^2

(A) Draw the computation graph for this system which can be used to compute the gradient of L w.r.t. x using the chain rule of partial derivatives. [2 marks]
(B) Show that:
∂q/∂p = q(1 − q)
[2 marks]
(C) Show that:
∂s/∂r = 1 − s^2
[2 marks]
(D) Write the expression for each of the partial derivatives below:
(i) ∂L/∂r
(ii) ∂L/∂w1
(iii) ∂L/∂w2
(iv) ∂L/∂x
[4 marks]
Solution
(A)
[Figure 2: computation graph, with x feeding p = f1(x) and r = f3(x), p → q, r → s, then m = q + s and L = (1/2)(m − m_l)^2.]

(B)
q = 1 / (1 + e^{−p})
∂q/∂p = e^{−p} / (1 + e^{−p})^2
Since 1 − q = e^{−p} / (1 + e^{−p}), the derivative factors as
∂q/∂p = [1 / (1 + e^{−p})] · [e^{−p} / (1 + e^{−p})] = q(1 − q)
(C)
s = (e^r − e^{−r}) / (e^r + e^{−r})
∂s/∂r = [(e^r + e^{−r})(e^r + e^{−r}) − (e^r − e^{−r})(e^r − e^{−r})] / (e^r + e^{−r})^2
∂s/∂r = 1 − (e^r − e^{−r})^2 / (e^r + e^{−r})^2
∂s/∂r = 1 − s^2
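A quick finite-difference sketch checking both identities:

import numpy as np

p = 0.3
q = 1 / (1 + np.exp(-p))             # sigmoid
s = np.tanh(p)                       # (e^p - e^-p)/(e^p + e^-p)

h = 1e-6
dq = (1 / (1 + np.exp(-(p + h))) - q) / h
ds = (np.tanh(p + h) - s) / h
print(np.isclose(dq, q * (1 - q), atol=1e-5))   # True
print(np.isclose(ds, 1 - s ** 2, atol=1e-5))    # True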
(D) (i)
∂L/∂r = (∂L/∂m)(∂m/∂s)(∂s/∂r) = (m − m_l)(1 − s^2)
(ii)
∂L/∂w1 = (∂L/∂m)(∂m/∂q)(∂q/∂p)(∂p/∂w1) = (m − m_l) q(1 − q) x
(iii)
∂L/∂w2 = (∂L/∂m)(∂m/∂s)(∂s/∂r)(∂r/∂w2) = (m − m_l)(1 − s^2) x
(iv)
∂L/∂x = (∂L/∂p)(∂p/∂x) + (∂L/∂r)(∂r/∂x) = (m − m_l) q(1 − q) w1 + (m − m_l)(1 − s^2) w2
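As an end-to-end sketch (with arbitrary example values for w1, w2, b1, b2, m_l), the closed-form gradient ∂L/∂x from (iv) can be checked against finite differences:

import numpy as np

rng = np.random.default_rng(2)
w1, w2 = rng.normal(size=2), rng.normal(size=2)
b1, b2, m_l = 0.1, -0.2, 0.5         # arbitrary example values

def loss(x):
    p = w1 @ x + b1
    r = w2 @ x + b2
    m = 1 / (1 + np.exp(-p)) + np.tanh(r)
    return 0.5 * (m - m_l) ** 2

def grad_x(x):
    p, r = w1 @ x + b1, w2 @ x + b2
    q, s = 1 / (1 + np.exp(-p)), np.tanh(r)
    m = q + s
    return (m - m_l) * q * (1 - q) * w1 + (m - m_l) * (1 - s ** 2) * w2

x = rng.normal(size=2)
h = 1e-6
g_num = np.array([(loss(x + h * e) - loss(x)) / h for e in np.eye(2)])
print(np.allclose(grad_x(x), g_num, atol=1e-4))   # True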
