Assignment 1 Solutions
Professor: Norman Schürhoff
Due date: 5 November 2023
3. Determine whether the following function is injective, surjective, or bijective: $f:\mathbb{R}\to\mathbb{R}$, $f(x)=e^x$.
4. Use one of the proof methods seen in class to prove the following propositions:
(a) Let $A, B \subseteq \mathbb{R}^N$ be convex sets. Then $A \cap B$ is convex. Prove that $A \cup B$ does not have to be convex.
(b) Prove the following statement for $n \geq 1$ using mathematical induction:
$$\sum_{k=1}^{n} (2k-1)^2 = \frac{n(2n-1)(2n+1)}{3}$$
Solution
1. (d) Yes, the truth table is:

p  q  r | q∧r | p∨(q∧r) | p∨q | p∨r | (p∨q)∧(p∨r)
F  F  F |  F  |    F    |  F  |  F  |      F
F  F  T |  F  |    F    |  F  |  T  |      F
F  T  F |  F  |    F    |  T  |  F  |      F
F  T  T |  T  |    T    |  T  |  T  |      T
T  F  F |  F  |    T    |  T  |  T  |      T
T  F  T |  F  |    T    |  T  |  T  |      T
T  T  F |  F  |    T    |  T  |  T  |      T
T  T  T |  T  |    T    |  T  |  T  |      T

The columns for $p \lor (q \land r)$ and $(p \lor q) \land (p \lor r)$ agree in every row, so the two formulas are equivalent.
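The equivalence can also be verified exhaustively in Python; a minimal sketch:

```python
from itertools import product

# Enumerate all 8 truth assignments and compare both sides of the distributive law.
for p, q, r in product([False, True], repeat=3):
    lhs = p or (q and r)
    rhs = (p or q) and (p or r)
    assert lhs == rhs, (p, q, r)
print("p ∨ (q ∧ r) ≡ (p ∨ q) ∧ (p ∨ r) holds for all assignments")
```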
2. Prove by contradiction:
Assume that $h$ is injective, i.e. $\forall x_1, x_2$: $x_1 \neq x_2 \Rightarrow h(x_1) \neq h(x_2)$.
$h(x) = 0 \Rightarrow x^3 - x = 0 \Rightarrow x = 0,\ x = 1,\ x = -1$.
Taking $x_1 = 0$ and $x_2 = 1$ gives $x_1 \neq x_2$ but $h(x_1) = h(x_2) = 0$, a contradiction. Hence $h$ is not injective.
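The roots and the counterexample can be checked numerically; a small sketch, assuming $h(x) = x^3 - x$:

```python
import numpy as np

h = lambda x: x**3 - x
print(sorted(np.roots([1, 0, -1, 0]).real))   # roots of x^3 - x: [-1.0, 0.0, 1.0]
assert h(0) == h(1) == 0                      # two distinct inputs with the same image
```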
3. • Surjectivity: $e^x > 0$ for all $x \in \mathbb{R}$, so the range is $R_f = (0, \infty)$. Since $R_f \neq C_f = \mathbb{R}$, the function $f:\mathbb{R}\to\mathbb{R}$, $f(x) = e^x$ is not surjective.
• Injectivity: take $x_1 \neq x_2$, say $x_1 > x_2$; then $f(x_1) > f(x_2)$ because $f(x) = e^x$ is strictly increasing. Hence $f:\mathbb{R}\to\mathbb{R}$, $f(x) = e^x$ is injective.
• Bijectivity: $f:\mathbb{R}\to\mathbb{R}$, $f(x) = e^x$ is not surjective and thus is not bijective.
4. (a) Intersection: let $x, y \in A \cap B$ and $\lambda \in [0,1]$. Since $x$ and $y$ lie in both sets and $A$ and $B$ are convex, $\lambda x + (1-\lambda)y \in A$ and $\lambda x + (1-\lambda)y \in B$, so $\lambda x + (1-\lambda)y \in A \cap B$. Hence $A \cap B$ is convex.
Union: there exist two convex $A, B \subseteq \mathbb{R}^N$ such that $A \cup B$ is not convex; for example, in $\mathbb{R}$ the intervals $A = [0,1]$ and $B = [2,3]$ are convex, yet $\frac{1}{2} \cdot 1 + \frac{1}{2} \cdot 2 = 1.5 \notin A \cup B$. Thus, the union of two convex sets is not always convex.
(b) Proof by mathematical induction:
1. Base case, $n = 1$: $\sum_{k=1}^{1} (2k-1)^2 = 1^2 = 1 = \frac{1 \cdot 1 \cdot 3}{3}$.
2. Induction hypothesis: assume that $\sum_{k=1}^{n} (2k-1)^2 = \frac{n(2n-1)(2n+1)}{3}$ holds for some $n \geq 1$.
3. Induction step, from $n$ to $n+1$:
$$\begin{aligned}
\sum_{k=1}^{n+1} (2k-1)^2 &= \left(\sum_{k=1}^{n} (2k-1)^2\right) + (2(n+1)-1)^2 \\
&= \frac{n(2n-1)(2n+1)}{3} + (2n+1)^2 \\
&= \frac{(2n+1)\left(n(2n-1) + 3(2n+1)\right)}{3} \\
&= \frac{(2n+1)(2n^2 - n + 6n + 3)}{3} \\
&= \frac{(2n+1)(2n^2 + 5n + 3)}{3} \\
&= \frac{(2n+1)(2n^2 + 3n + 2n + 3)}{3} \\
&= \frac{(2n+1)(2n+3)(n+1)}{3} \\
&= \frac{(n+1)(2(n+1)-1)(2(n+1)+1)}{3}.
\end{aligned}$$
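The identity can also be checked numerically for small $n$; a short Python sketch:

```python
# Compare the sum of the first n odd squares with n(2n-1)(2n+1)/3.
for n in range(1, 101):
    lhs = sum((2 * k - 1) ** 2 for k in range(1, n + 1))
    rhs = n * (2 * n - 1) * (2 * n + 1) // 3   # always an integer
    assert lhs == rhs, n
print("identity verified for n = 1, ..., 100")
```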
Solution
1. (a) Determinant: expanding $|A|$ along the fourth row,
$$|A| = 1 \cdot (-1)^{4+1} \begin{vmatrix} 2 & 3 & 4 \\ 3 & 1 & 2 \\ 1 & 1 & -1 \end{vmatrix} + 0 \cdot (-1)^{4+2} \begin{vmatrix} 1 & 3 & 4 \\ 2 & 1 & 2 \\ 1 & 1 & -1 \end{vmatrix} + (-2) \cdot (-1)^{4+3} \begin{vmatrix} 1 & 2 & 4 \\ 2 & 3 & 2 \\ 1 & 1 & -1 \end{vmatrix} + (-6) \cdot (-1)^{4+4} \begin{vmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 1 & 1 \end{vmatrix} = -1.$$
(With the cofactors computed in part (b): $|A| = 1 \cdot c_{41} + 0 \cdot c_{42} + (-2) \cdot c_{43} + (-6) \cdot c_{44} = -17 - 2 + 18 = -1$.)
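A quick numerical check with NumPy (a sketch):

```python
import numpy as np

A = np.array([[1, 2, 3, 4],
              [2, 3, 1, 2],
              [1, 1, 1, -1],
              [1, 0, -2, -6]])
print(np.linalg.det(A))   # -> -1.0 up to floating-point rounding
```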
(b) Cofactors
$$c_{11} = (-1)^{1+1}\begin{vmatrix} 3 & 1 & 2 \\ 1 & 1 & -1 \\ 0 & -2 & -6 \end{vmatrix} = 3 \cdot 1 \cdot (-6) + 1 \cdot (-1) \cdot 0 + 2 \cdot 1 \cdot (-2) - 0 \cdot 1 \cdot 2 - (-2)(-1) \cdot 3 - (-6) \cdot 1 \cdot 1 = -22$$
$$c_{12} = (-1)^{1+2}\begin{vmatrix} 2 & 1 & 2 \\ 1 & 1 & -1 \\ 1 & -2 & -6 \end{vmatrix} = -\left[2 \cdot 1 \cdot (-6) + 1 \cdot (-1) \cdot 1 + 2 \cdot 1 \cdot (-2) - 1 \cdot 1 \cdot 2 - (-2)(-1) \cdot 2 - (-6) \cdot 1 \cdot 1\right] = 17$$
$$c_{13} = (-1)^{1+3}\begin{vmatrix} 2 & 3 & 2 \\ 1 & 1 & -1 \\ 1 & 0 & -6 \end{vmatrix} = 2 \cdot 1 \cdot (-6) + 3 \cdot (-1) \cdot 1 + 2 \cdot 1 \cdot 0 - 1 \cdot 1 \cdot 2 - 0 \cdot (-1) \cdot 2 - (-6) \cdot 1 \cdot 3 = 1$$
$$c_{14} = (-1)^{1+4}\begin{vmatrix} 2 & 3 & 1 \\ 1 & 1 & 1 \\ 1 & 0 & -2 \end{vmatrix} = -\left[2 \cdot 1 \cdot (-2) + 3 \cdot 1 \cdot 1 + 1 \cdot 1 \cdot 0 - 1 \cdot 1 \cdot 1 - 0 \cdot 1 \cdot 2 - (-2) \cdot 1 \cdot 3\right] = -4$$
$$c_{21} = (-1)^{2+1}\begin{vmatrix} 2 & 3 & 4 \\ 1 & 1 & -1 \\ 0 & -2 & -6 \end{vmatrix} = -\left[2 \cdot 1 \cdot (-6) + 3 \cdot (-1) \cdot 0 + 4 \cdot 1 \cdot (-2) - 0 \cdot 1 \cdot 4 - (-2)(-1) \cdot 2 - (-6) \cdot 1 \cdot 3\right] = 6$$
$$c_{22} = (-1)^{2+2}\begin{vmatrix} 1 & 3 & 4 \\ 1 & 1 & -1 \\ 1 & -2 & -6 \end{vmatrix} = 1 \cdot 1 \cdot (-6) + 3 \cdot (-1) \cdot 1 + 4 \cdot 1 \cdot (-2) - 1 \cdot 1 \cdot 4 - (-2)(-1) \cdot 1 - (-6) \cdot 1 \cdot 3 = -5$$
$$c_{23} = (-1)^{2+3}\begin{vmatrix} 1 & 2 & 4 \\ 1 & 1 & -1 \\ 1 & 0 & -6 \end{vmatrix} = -\left[1 \cdot 1 \cdot (-6) + 2 \cdot (-1) \cdot 1 + 4 \cdot 1 \cdot 0 - 1 \cdot 1 \cdot 4 - 0 \cdot (-1) \cdot 1 - (-6) \cdot 1 \cdot 2\right] = 0$$
$$c_{24} = (-1)^{2+4}\begin{vmatrix} 1 & 2 & 3 \\ 1 & 1 & 1 \\ 1 & 0 & -2 \end{vmatrix} = 1 \cdot 1 \cdot (-2) + 2 \cdot 1 \cdot 1 + 3 \cdot 1 \cdot 0 - 1 \cdot 1 \cdot 3 - 0 \cdot 1 \cdot 1 - (-2) \cdot 1 \cdot 2 = 1$$
$$c_{31} = (-1)^{3+1}\begin{vmatrix} 2 & 3 & 4 \\ 3 & 1 & 2 \\ 0 & -2 & -6 \end{vmatrix} = 2 \cdot 1 \cdot (-6) + 3 \cdot 2 \cdot 0 + 4 \cdot 3 \cdot (-2) - 0 \cdot 1 \cdot 4 - (-2) \cdot 2 \cdot 2 - (-6) \cdot 3 \cdot 3 = 26$$
$$c_{32} = (-1)^{3+2}\begin{vmatrix} 1 & 3 & 4 \\ 2 & 1 & 2 \\ 1 & -2 & -6 \end{vmatrix} = -\left[1 \cdot 1 \cdot (-6) + 3 \cdot 2 \cdot 1 + 4 \cdot 2 \cdot (-2) - 1 \cdot 1 \cdot 4 - (-2) \cdot 2 \cdot 1 - (-6) \cdot 2 \cdot 3\right] = -20$$
$$c_{33} = (-1)^{3+3}\begin{vmatrix} 1 & 2 & 4 \\ 2 & 3 & 2 \\ 1 & 0 & -6 \end{vmatrix} = 1 \cdot 3 \cdot (-6) + 2 \cdot 2 \cdot 1 + 4 \cdot 2 \cdot 0 - 1 \cdot 3 \cdot 4 - 0 \cdot 2 \cdot 1 - (-6) \cdot 2 \cdot 2 = -2$$
$$c_{34} = (-1)^{3+4}\begin{vmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 0 & -2 \end{vmatrix} = -\left[1 \cdot 3 \cdot (-2) + 2 \cdot 1 \cdot 1 + 3 \cdot 2 \cdot 0 - 1 \cdot 3 \cdot 3 - 0 \cdot 1 \cdot 1 - (-2) \cdot 2 \cdot 2\right] = 5$$
$$c_{41} = (-1)^{4+1}\begin{vmatrix} 2 & 3 & 4 \\ 3 & 1 & 2 \\ 1 & 1 & -1 \end{vmatrix} = -\left[2 \cdot 1 \cdot (-1) + 3 \cdot 2 \cdot 1 + 4 \cdot 3 \cdot 1 - 1 \cdot 1 \cdot 4 - 1 \cdot 2 \cdot 2 - (-1) \cdot 3 \cdot 3\right] = -17$$
$$c_{42} = (-1)^{4+2}\begin{vmatrix} 1 & 3 & 4 \\ 2 & 1 & 2 \\ 1 & 1 & -1 \end{vmatrix} = 1 \cdot 1 \cdot (-1) + 3 \cdot 2 \cdot 1 + 4 \cdot 2 \cdot 1 - 1 \cdot 1 \cdot 4 - 1 \cdot 2 \cdot 1 - (-1) \cdot 2 \cdot 3 = 13$$
$$c_{43} = (-1)^{4+3}\begin{vmatrix} 1 & 2 & 4 \\ 2 & 3 & 2 \\ 1 & 1 & -1 \end{vmatrix} = -\left[1 \cdot 3 \cdot (-1) + 2 \cdot 2 \cdot 1 + 4 \cdot 2 \cdot 1 - 1 \cdot 3 \cdot 4 - 1 \cdot 2 \cdot 1 - (-1) \cdot 2 \cdot 2\right] = 1$$
$$c_{44} = (-1)^{4+4}\begin{vmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 1 & 1 \end{vmatrix} = 1 \cdot 3 \cdot 1 + 2 \cdot 1 \cdot 1 + 3 \cdot 2 \cdot 1 - 1 \cdot 3 \cdot 3 - 1 \cdot 1 \cdot 1 - 1 \cdot 2 \cdot 2 = -3$$
Collecting these,
$$\mathrm{Cof}(A) = \begin{pmatrix} -22 & 17 & 1 & -4 \\ 6 & -5 & 0 & 1 \\ 26 & -20 & -2 & 5 \\ -17 & 13 & 1 & -3 \end{pmatrix}$$
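These hand computations can be cross-checked in a few lines; a NumPy sketch (the helper `cofactor_matrix` is illustrative, not part of the assignment code):

```python
import numpy as np

A = np.array([[1, 2, 3, 4],
              [2, 3, 1, 2],
              [1, 1, 1, -1],
              [1, 0, -2, -6]], dtype=float)

def cofactor_matrix(M):
    """Cofactor c_ij = (-1)^(i+j) times the determinant of the (i,j) minor."""
    n = M.shape[0]
    C = np.empty_like(M)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(M, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return np.round(C)

print(cofactor_matrix(A))   # matches Cof(A) above
```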
(c) Trace: $\operatorname{tr}(A) = 1 + 3 + 1 + (-6) = -1$.
(d) Kernel: the nullspace (or kernel) is $\mathrm{null}(A) = \{x \in \mathbb{R}^n \mid Ax = 0\}$. Performing elementary row operations to bring the system to row echelon form,
$$A = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 2 \\ 1 & 1 & 1 & -1 \\ 1 & 0 & -2 & -6 \end{pmatrix} \longrightarrow \cdots \longrightarrow \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix},$$
so $Ax = 0$ reduces to $x_1 = 0$, $x_2 = 0$, $-x_3 = 0$, $-x_4 = 0$, which implies $x_1 = x_2 = x_3 = x_4 = 0$. Hence $\mathrm{null}(A) = \{0\}$. (This also follows directly from $|A| = -1 \neq 0$: $A$ has full rank, so its kernel is trivial.)
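The trivial kernel can be confirmed numerically; a sketch using SciPy:

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1, 2, 3, 4],
              [2, 3, 1, 2],
              [1, 1, 1, -1],
              [1, 0, -2, -6]], dtype=float)

N = null_space(A)   # orthonormal basis for null(A)
print(N.shape)      # -> (4, 0): the basis is empty, i.e. null(A) = {0}
```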
3. Now consider a factor analysis of the returns. Factor analysis allows for dimension reduction by
retaining only the top eigenvalues (factors) that capture the most significant variation in the data.
This reduces the dimensionality of the problem while preserving the essential information.
(a) Decompose the variance-covariance into its eigenvalues and eigenvectors.
(b) Show that each eigenvector represents a linear combination of the original variables, and each
eigenvalue represents the variance explained by its corresponding eigenvector.
(c) Perform factor analysis using 1, 2, 3, 4, 5, and 6 factors without rotation. How many factors
are actually required to perform factor analysis? Give reasons. (Hint: Check eigenvalues. The
number of factors is the number of eigenvalues exceeding 1. This is known as the Kaiser Rule.
The logic is that a factor with eigenvalue greater than 1 explains more variance than a single
observed variable.)
(d) Factor rotation helps to make the factors more interpretable. Varimax rotation is an orthogonal rotation which assumes that there are no intercorrelations between components. In contrast, oblique rotation is more complex and can provide superior results. See https://real-statistics.com/linear-algebra-matrix-topics/varimax/. Perform factor analysis using the required number of factors with varimax rotation. Report the factor loadings. Interpret the factor loadings.
Solution
1. Refer to the Python code.
2. Now drop the 11th asset and answer the following:
(a) Refer to the Python code
(b) Refer to the Python code
3. Now consider a factor analysis of the returns. Factor analysis allows for dimension reduction by
retaining only the top eigenvalues (factors) that capture the most significant variation in the data.
This reduces the dimensionality of the problem while preserving the essential information.
(a) Refer to the Python code
(b) One way to show this is to regress each eigenvector on the original variables. A mathematical explanation can be given as follows. Start from the variance-covariance matrix $S$,
$$S = E\left[XX^T\right] - E(X)E(X)^T.$$
For a direction $\mu$ normalized so that $\mu^T \mu = 1$, the variance of the data projected onto $\mu$ is $\operatorname{Var}(\mu^T X) = \mu^T S \mu$, which is simply a number; denote it by $\lambda$, so that
$$\mu^T S \mu = \lambda.$$
Choosing $\mu$ to maximize this projected variance subject to $\mu^T \mu = 1$ yields the first-order condition
$$S\mu = \lambda\mu,$$
which means that the $\mu$ we defined in the first place is actually an eigenvector of the data covariance matrix, and its eigenvalue $\lambda$ is the variance that the data has in that direction.
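This fact is easy to illustrate numerically; a sketch on synthetic data (the mixing matrix below is illustrative, not the assignment returns):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3)) @ np.array([[2.0, 0.5, 0.0],
                                           [0.0, 1.0, 0.3],
                                           [0.0, 0.0, 0.5]])
S = np.cov(X, rowvar=False)          # sample variance-covariance matrix

eigvals, eigvecs = np.linalg.eigh(S)
mu = eigvecs[:, -1]                  # eigenvector of the largest eigenvalue
proj_var = np.var(X @ mu, ddof=1)    # variance of the data projected onto mu
print(eigvals[-1], proj_var)         # the two numbers agree up to rounding
```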
(c) Refer to the Python code
(d) Refer to the Python code
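For reference, parts (c) and (d) can be reproduced along the following lines; a sketch that assumes the third-party `factor_analyzer` package and a DataFrame of asset returns (the file name `returns.csv` is a hypothetical placeholder):

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

returns = pd.read_csv("returns.csv")   # placeholder for the assignment data

# Kaiser rule: retain the factors whose eigenvalues exceed 1.
fa = FactorAnalyzer(rotation=None)
fa.fit(returns)
eigenvalues, _ = fa.get_eigenvalues()
n_factors = int((eigenvalues > 1).sum())
print("eigenvalues:", eigenvalues, "-> factors retained:", n_factors)

# Re-fit with the retained number of factors and varimax rotation.
fa = FactorAnalyzer(n_factors=n_factors, rotation="varimax")
fa.fit(returns)
print(pd.DataFrame(fa.loadings_, index=returns.columns))   # factor loadings
```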
Section IV - (15 points)
1. Is the function $f(x, y) = \ln(e^x + e^y)$ convex? Is it strictly convex?
2. Calculate
$$\lim_{x \to 0} \frac{\sin\left(\frac{\pi}{2}x\right) + \cos\left(\frac{\pi}{2}e^x\right)}{\cos\left(\frac{3\pi}{2}\cos x\right)}$$
Solution:
1. We will show that $f(x, y)$ is convex, but not strictly convex. Convexity follows from the fact that the Hessian $H(x, y)$ is positive semidefinite for all $(x, y) \in \mathbb{R}^2$ (see p. 101, Proposition 3.19). The Hessian is
$$H(x, y) = \frac{e^{x+y}}{(e^x + e^y)^2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix},$$
with leading principal minors
$$\Delta_1 = \frac{e^{x+y}}{(e^x + e^y)^2} > 0, \qquad \Delta_2 = 0 \geq 0.$$
Thus the quadratic form $H(x, y)$ is positive semidefinite and $f(x, y)$ is convex. Note that if $H(x, y)$ is merely positive semidefinite, the function $f(x, y)$ can still be strictly convex.
However, one can quickly obtain a counterexample in which strict convexity fails for $f(x, y)$.
Take $(x_1, y_1) = (0, 0)$, $(x_2, y_2) = (2, 2)$, $\lambda = \frac{1}{2}$. If $f(x, y)$ were strictly convex we would have
$$\lambda \ln\left(e^{x_1} + e^{y_1}\right) + (1-\lambda) \ln\left(e^{x_2} + e^{y_2}\right) > \ln\left(e^{\lambda x_1 + (1-\lambda)x_2} + e^{\lambda y_1 + (1-\lambda)y_2}\right).$$
However, $\frac{1}{2}\ln\left(e^0 + e^0\right) + \frac{1}{2}\ln\left(e^2 + e^2\right) = 1 + \ln 2$, which is equal to $\ln\left(e^{\frac{1}{2}\cdot 0 + \frac{1}{2}\cdot 2} + e^{\frac{1}{2}\cdot 0 + \frac{1}{2}\cdot 2}\right) = \ln(2e) = 1 + \ln 2$. Contradiction. Therefore, $f(x, y)$ is not strictly convex. (Indeed, along the diagonal $x = y$ the function $f(x, x) = x + \ln 2$ is affine.)
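The Hessian computation can be double-checked symbolically; a SymPy sketch:

```python
import sympy as sp

x, y = sp.symbols("x y", real=True)
f = sp.log(sp.exp(x) + sp.exp(y))

H = sp.hessian(f, (x, y))
print(sp.simplify(H[0, 0]))   # equivalent to e^(x+y)/(e^x+e^y)^2 > 0
print(sp.simplify(H.det()))   # -> 0, so H is singular: PSD but not positive definite
```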
2. We are going to approximate the given functions by polynomials around $a = 0$ by applying Taylor's theorem. One may thus note that
(a) $\sin\frac{\pi}{2}x = \frac{\pi}{2}x + E_3$
(b) $\cos\left(\frac{\pi}{2}e^x\right) = \cos\left(\frac{\pi}{2}\left[1 + x + \frac{x^2}{2!} + E_3\right]\right) = \cos\left(\frac{\pi}{2} + \frac{\pi}{2}x + \frac{\pi}{2}\frac{x^2}{2!} + E_3\right) = -\sin\left(\frac{\pi}{2}x + \frac{\pi}{4}x^2 + E_3\right) = -\frac{\pi}{2}x - \frac{\pi}{4}x^2 + E_3$
(c) $\cos\left(\frac{3\pi}{2}\cos x\right) = \cos\left(\frac{3\pi}{2}\left[1 - \frac{x^2}{2!} + E_4\right]\right) = \cos\left(\frac{3\pi}{2} - \frac{3\pi}{2}\frac{x^2}{2!} + E_4\right) = -\sin\left(\frac{3\pi}{4}x^2 + E_4\right) = -\frac{3\pi}{4}x^2 + E_4$
where $E_n$ is a remainder in Peano's form. Thus we have
$$\lim_{x \to 0} \frac{\sin\frac{\pi}{2}x + \cos\frac{\pi}{2}e^x}{\cos\frac{3\pi}{2}\cos x} = \lim_{x \to 0} \frac{\frac{\pi}{2}x + E_3 - \frac{\pi}{2}x - \frac{\pi}{4}x^2 + E_3}{-\frac{3\pi}{4}x^2 + E_4} = \lim_{x \to 0} \frac{-\frac{\pi}{4}x^2 + E_3}{-\frac{3\pi}{4}x^2 + E_4} = \lim_{x \to 0} \frac{-\frac{\pi}{4} + E_1}{-\frac{3\pi}{4} + E_2} = \frac{-\frac{\pi}{4} + 0}{-\frac{3\pi}{4} + 0} = \frac{1}{3}.$$
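The value $\frac{1}{3}$ can be confirmed numerically; a short sketch:

```python
import numpy as np

def g(x):
    return (np.sin(np.pi / 2 * x) + np.cos(np.pi / 2 * np.exp(x))) / np.cos(3 * np.pi / 2 * np.cos(x))

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, g(eps))   # the values approach 1/3 as eps -> 0
```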
3. Let $\sqrt{1 + e^x} = t$, so that $x = \ln\left(t^2 - 1\right)$ and $dx = \frac{2t}{t^2 - 1}\,dt$.
Integration by parts (with $u = \ln(t \mp 1)$, $du = \frac{dt}{t \mp 1}$, $dv = dt$, $v = t$) gives
$$\int \ln(t-1)\,dt = t\ln(t-1) - \int \frac{t-1+1}{t-1}\,dt = t\ln(t-1) - t - \ln(t-1) + C = (t-1)\ln(t-1) - t + C,$$
$$\int \ln(t+1)\,dt = t\ln(t+1) - \int \frac{t+1-1}{t+1}\,dt = t\ln(t+1) - t + \ln(t+1) + C = (t+1)\ln(t+1) - t + C.$$
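The two antiderivatives can be verified symbolically; a minimal SymPy sketch:

```python
import sympy as sp

t = sp.symbols("t", positive=True)
F_minus = (t - 1) * sp.log(t - 1) - t
F_plus = (t + 1) * sp.log(t + 1) - t
print(sp.simplify(sp.diff(F_minus, t) - sp.log(t - 1)))   # -> 0
print(sp.simplify(sp.diff(F_plus, t) - sp.log(t + 1)))    # -> 0
```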
Solution: Let us put the exercise in matrix notation. Let $B_t$, $N_t$, $R_t$ denote the probabilities that the economy is in a boom, in normal growth, or in a recession today, at time $t$. The transition probabilities are:
• If today is a boom, tomorrow brings a boom ($100 - 20 - 5 = 75\%$), normal growth ($20\%$) or a recession ($5\%$).
• If today is normal growth, tomorrow brings a boom ($2\%$), normal growth ($100 - 2 - 8 = 90\%$) or a recession ($8\%$).
• If today is a recession, tomorrow brings a boom ($10\%$), normal growth ($20\%$) or a recession ($100 - 10 - 20 = 70\%$).
So, the probabilities $B_{t+1}$, $N_{t+1}$, $R_{t+1}$ that the economy tomorrow will be in a boom, normal growth, or a recession can be computed as follows:
$$\begin{pmatrix} B_{t+1} \\ N_{t+1} \\ R_{t+1} \end{pmatrix} = \begin{pmatrix} 0.75 & 0.02 & 0.1 \\ 0.2 & 0.9 & 0.2 \\ 0.05 & 0.08 & 0.7 \end{pmatrix} \begin{pmatrix} B_t \\ N_t \\ R_t \end{pmatrix} = A \begin{pmatrix} B_t \\ N_t \\ R_t \end{pmatrix}.$$
We are asked to find $(B_{t+10}, N_{t+10}, R_{t+10})^{\prime}$:
$$\begin{pmatrix} B_{t+10} \\ N_{t+10} \\ R_{t+10} \end{pmatrix} = A \begin{pmatrix} B_{t+9} \\ N_{t+9} \\ R_{t+9} \end{pmatrix} = A \cdot A \begin{pmatrix} B_{t+8} \\ N_{t+8} \\ R_{t+8} \end{pmatrix} = A^{10} \begin{pmatrix} B_t \\ N_t \\ R_t \end{pmatrix}.$$
To compute $A^{10}$, diagonalize $A = CDC^{-1}$, where $C$ is the matrix of eigenvectors, $C^{-1}$ is the inverse of $C$, and $D$ is a diagonal matrix with the eigenvalues on the principal diagonal. Then
$$A^2 = CDC^{-1}CDC^{-1} = CD^2C^{-1} \implies A^{10} = CD^{10}C^{-1}.$$
So,
$$\begin{pmatrix} B_{t+10} \\ N_{t+10} \\ R_{t+10} \end{pmatrix} = CD^{10}C^{-1} \begin{pmatrix} B_t \\ N_t \\ R_t \end{pmatrix}.$$
The eigenvalues solve the characteristic equation
$$|A - \lambda I| = \frac{1}{1000}\left(-1000\lambda^3 + 2350\lambda^2 - 1805\lambda + 455\right) = \frac{1}{1000}(\lambda - 1)\left(-1000\lambda^2 + 1350\lambda - 455\right) = 0.$$
The quadratic factor gives
$$\lambda_{1,2} = \frac{-1350 \pm \left(1350^2 - 4 \cdot 455 \cdot 1000\right)^{\frac{1}{2}}}{-2000} = \frac{-1350 \pm 50}{-2000} \implies \lambda_1 = 0.7, \quad \lambda_2 = 0.65.$$
So $|A - \lambda I| = -(\lambda - 1)(\lambda - 0.7)(\lambda - 0.65)$, and the eigenvalues are $\lambda \in \{1, 0.7, 0.65\}$. The corresponding eigenvectors are given by:
For $\lambda = 1$:
$$(A - \lambda I)x = \begin{pmatrix} -0.25 & 0.02 & 0.1 \\ 0.2 & -0.1 & 0.2 \\ 0.05 & 0.08 & -0.3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
A few elementary transformations yield
$$\begin{pmatrix} 1 & 0 & -\frac{2}{3} \\ 0 & 1 & -\frac{10}{3} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
Thus, we obtain for $\lambda = 1$ the eigenvector $X = \left(\frac{2}{3}, \frac{10}{3}, 1\right)^{\prime}$.
For $\lambda = 0.7$, an analogous computation yields the eigenvector $X = \left(-\frac{8}{3}, \frac{5}{3}, 1\right)^{\prime}$.
For $\lambda = 0.65$:
$$(A - \lambda I)x = \begin{pmatrix} 0.1 & 0.02 & 0.1 \\ 0.2 & 0.25 & 0.2 \\ 0.05 & 0.08 & 0.05 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
Again, after a few elementary transformations we get
$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
Thus, we obtain for $\lambda = 0.65$ the eigenvector $X = (-1, 0, 1)^{\prime}$.
Note: eigenvectors are unique only up to scale; here the last component is normalized to 1.
Carrying out this computation, we end up with
$$\begin{pmatrix} B_{t+10} \\ N_{t+10} \\ R_{t+10} \end{pmatrix} = \begin{pmatrix} 0.1608 & 0.1237 & 0.1473 \\ 0.6478 & 0.6761 & 0.6478 \\ 0.1914 & 0.2003 & 0.2049 \end{pmatrix}\begin{pmatrix} B_t \\ N_t \\ R_t \end{pmatrix}.$$
The columns of $A^{10}$ are already nearly identical, which shows that the chain is converging to its stationary distribution. The exercise states that we are currently in a recession, so the starting vector is $(0, 0, 1)^{\prime}$. Plugging this vector into the formula above yields the desired probabilities, $(B_{t+10}, N_{t+10}, R_{t+10})^{\prime} = (0.1473, 0.6478, 0.2049)^{\prime}$.
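The ten-step probabilities can be cross-checked directly; a NumPy sketch using the transition matrix defined above:

```python
import numpy as np

A = np.array([[0.75, 0.02, 0.10],
              [0.20, 0.90, 0.20],
              [0.05, 0.08, 0.70]])

print(np.linalg.eigvals(A))         # eigenvalues 1.0, 0.7, 0.65 (in some order)
A10 = np.linalg.matrix_power(A, 10)
start = np.array([0.0, 0.0, 1.0])   # the economy is currently in a recession
print(A10 @ start)                  # -> approx. [0.1473, 0.6478, 0.2049]
```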