
Homework 1

Quanliang Liu
9085925288

Instructions: This is a background self-test on the type of math we will encounter in class. If you find many
questions intimidating, we suggest you drop 760 and take it again in the future when you are more prepared.
You can use this latex file as a template to develop your homework. Submit your homework on time as a
single pdf file to Gradescope. There is no need to submit the latex source or any code.

1 Vectors and Matrices [6 pts]


Consider the matrix $X$ and the vectors $y$ and $z$ below:
\[
X = \begin{bmatrix} 3 & 2 \\ -7 & -5 \end{bmatrix} \qquad
y = \begin{bmatrix} 2 \\ 1 \end{bmatrix} \qquad
z = \begin{bmatrix} 1 \\ -1 \end{bmatrix}
\]

1. Compute $y^T X z$.
$y^T X z = 0$
2. Is $X$ invertible? If so, give the inverse; if not, explain why not.
Yes. Since $\det(X) = (3)(-5) - (2)(-7) = -1 \neq 0$, $X$ is invertible, and the inverse is:
\[
X^{-1} = \frac{1}{\det(X)} \begin{bmatrix} -5 & -2 \\ 7 & 3 \end{bmatrix} = \begin{bmatrix} 5 & 2 \\ -7 & -3 \end{bmatrix}
\]
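As an optional sanity check (code is not required for submission), both answers can be verified numerically. A minimal sketch, assuming numpy is available:

import numpy as np

X = np.array([[3, 2], [-7, -5]])
y = np.array([2, 1])
z = np.array([1, -1])

print(y @ X @ z)         # 0, matching the hand computation
print(np.linalg.inv(X))  # [[ 5.  2.]
                         #  [-7. -3.]]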

2 Calculus [3 pts]
1. If $y = e^{-x} + \arctan(z)\, x^{6/z} - \ln \frac{x}{x+1}$, what is the partial derivative of $y$ with respect to $x$?
\[
\frac{\partial y}{\partial x} = -e^{-x} + \frac{6 \arctan(z)}{z}\, x^{(6-z)/z} - \frac{1}{x(x+1)}
\]
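The derivative can be double-checked symbolically. A minimal sketch, assuming sympy is available (not part of the required submission):

import sympy as sp

x, z = sp.symbols('x z', positive=True)
y = sp.exp(-x) + sp.atan(z) * x**(6 / z) - sp.log(x / (x + 1))

# Should simplify to an expression equivalent to
# -exp(-x) + 6*atan(z)*x**(6/z)/(x*z) - 1/(x*(x + 1))
print(sp.simplify(sp.diff(y, x)))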

3 Probability and Statistics [10 pts]


Consider a sequence of data S = (1, 1, 1, 0, 1) created by flipping a coin x five times, where 0 denotes that
the coin turned up heads and 1 denotes that it turned up tails.
1. (2.5 pts) What is the probability of observing this data, assuming it was generated by flipping a biased
coin with p( x = 1) = 0.6?
$P(S) = 0.6^4 \times (1 - 0.6) = 0.05184$
2. (2.5 pts) Note that the probability of this data sample could be greater if the value of p( x = 1) was not
0.6, but instead some other value. What is the value that maximizes the probability of S? Please justify
your answer.
If the coin has probability $p(x = 1) = a$, then the probability of $S$ is $p(S) = a^4(1 - a)$. The goal is to find the $a$ that maximizes $p(S)$. Setting the derivative $p'(S) = 4a^3(1 - a) - a^4 = a^3(4 - 5a)$ to zero gives $a = 4/5$; since $p(S)$ vanishes at the endpoints $a = 0$ and $a = 1$ and is positive in between, $a = 0.8$ maximizes the probability of $S$.
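The maximizer can also be confirmed numerically. An optional sketch, assuming numpy: evaluate $p(S) = a^4(1-a)$ on a fine grid and take the argmax.

import numpy as np

a = np.linspace(0, 1, 100001)
likelihood = a**4 * (1 - a)      # p(S) for S = (1, 1, 1, 0, 1)
print(a[np.argmax(likelihood)])  # 0.8, agreeing with the calculus argument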
3. (5 pts) Consider the following joint probability table where both A and B are binary random variables:


A B P( A, B)
0 0 0.3
0 1 0.1
1 0 0.1
1 1 0.5

(a) What is $P(A = 0 \mid B = 1)$?
\[
P(A = 0 \mid B = 1) = \frac{P(A = 0, B = 1)}{P(B = 1)} = \frac{0.1}{0.1 + 0.5} = \frac{1}{6}
\]
(b) What is $P(A = 1 \lor B = 1)$?
\[
P(A = 1 \lor B = 1) = 1 - P(A = 0, B = 0) = 1 - 0.3 = 0.7
\]
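Both answers follow mechanically from the table. A minimal sketch in Python (plain dictionary, no external dependencies; optional, since code is not submitted):

# Joint table P(A, B), keyed by (a, b)
P = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.5}

p_b1 = P[(0, 1)] + P[(1, 1)]  # P(B = 1) = 0.6
print(P[(0, 1)] / p_b1)       # P(A = 0 | B = 1) = 0.1 / 0.6 = 1/6
print(1 - P[(0, 0)])          # P(A = 1 or B = 1) = 1 - 0.3 = 0.7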

4 Big-O Notation [6 pts]


For each pair ( f , g) of functions below, list which of the following are true: f (n) = O( g(n)), g(n) = O( f (n)),
both, or neither. Briefly justify your answers.
1. $f(n) = \ln(n)$, $g(n) = \log_2(n)$.
Since $\ln(n) = \log_2(n) \cdot \ln(2)$, the functions differ only by a constant factor. Therefore, both $f(n) = O(g(n))$ and $g(n) = O(f(n))$ are true. Conclusion: both are true.

2. $f(n) = \log_2 \log_2(n)$, $g(n) = \log_2(n)$.
Since $\log_2(\log_2(n))$ grows much more slowly than $\log_2(n)$, $f(n) = O(g(n))$ holds; however, $g(n) = O(f(n))$ is false because $\log_2(n)$ grows faster. Conclusion: only $f(n) = O(g(n))$ is true.
3. $f(n) = n!$, $g(n) = 2^n$.
Since $n!/2^n$ increases without bound (going from $n$ to $n+1$ multiplies the ratio by $(n+1)/2 > 1$ for all $n \geq 2$), $f(n) = O(g(n))$ is false, while $g(n) = O(f(n))$ is true. Conclusion: only $g(n) = O(f(n))$ is true.
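The three conclusions can be illustrated (not proved) by printing the relevant ratios for growing $n$. A sketch using only the Python standard library:

import math

for n in [4, 8, 16, 32, 64]:
    print(n,
          math.log(n) / math.log2(n),              # constant ln(2): both directions hold
          math.log2(math.log2(n)) / math.log2(n),  # tends to 0: only f = O(g)
          2**n / math.factorial(n))                # tends to 0: only g = O(f)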

5 Probability and Random Variables


5.1 Probability [12.5 pts]
State true or false. Here Ω denotes the sample space and Ac denotes the complement of the event A.
1. For any A, B ⊆ Ω, P( A| B) P( A) = P( B| A) P( B).
False
2. For any A, B ⊆ Ω, P( A ∪ B) = P( A) + P( B) − P( B ∩ A).
True
3. For any $A, B, C \subseteq \Omega$ such that $P(B \cup C) > 0$, $\frac{P(A \cup B \cup C)}{P(B \cup C)} \geq P(A \mid B \cup C)\, P(B)$.
True. Since $A \cup B \cup C \supseteq A \cap (B \cup C)$ and $P(B) \leq 1$, we have $P(A \cup B \cup C) \geq P(A \cap (B \cup C))\, P(B)$; dividing both sides by $P(B \cup C)$ gives exactly the stated inequality.
4. For any $A, B \subseteq \Omega$ such that $P(B) > 0$ and $P(A^c) > 0$, $P(B \mid A^c) + P(B \mid A) = 1$.
False
5. If A and B are independent events, then Ac and Bc are independent.
True

5.2 Discrete and Continuous Distributions [12.5 pts]


Match the distribution name to its probability density / mass function.


 
Matching: (a) Gamma → (j); (b) Multinomial → (i); (c) Laplace → (h); (d) Poisson → (l); (e) Dirichlet → (k).

(f) $f(x; \Sigma, \mu) = \frac{1}{\sqrt{(2\pi)^k \det(\Sigma)}} \exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)$

(g) $f(x; n, \alpha) = \binom{n}{x} \alpha^x (1 - \alpha)^{n - x}$ for $x \in \{0, \dots, n\}$; 0 otherwise

(h) $f(x; b, \mu) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right)$

(i) $f(x; n, \alpha) = \frac{n!}{\prod_{i=1}^k x_i!} \prod_{i=1}^k \alpha_i^{x_i}$ for $x_i \in \{0, \dots, n\}$ and $\sum_{i=1}^k x_i = n$; 0 otherwise

(j) $f(x; \alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x}$ for $x \in (0, +\infty)$; 0 otherwise

(k) $f(x; \alpha) = \frac{\Gamma(\sum_{i=1}^k \alpha_i)}{\prod_{i=1}^k \Gamma(\alpha_i)} \prod_{i=1}^k x_i^{\alpha_i - 1}$ for $x_i \in (0, 1)$ and $\sum_{i=1}^k x_i = 1$; 0 otherwise

(l) $f(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}$ for $x \in \mathbb{Z}_+$; 0 otherwise

5.3 Mean and Variance [10 pts]


1. Consider a random variable which follows a Binomial distribution: X ∼ Binomial(n, p).
(a) What is the mean of the random variable?
µ = np
(b) What is the variance of the random variable?
$\text{Var}(X) = np(1 - p)$
2. Let X be a random variable and E[ X ] = 1, Var( X ) = 1. Compute the following values:
(a) E[5X ]
5
(b) Var(5X )
25
(c) Var( X + 5)
1
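These identities ($E[aX] = aE[X]$, $\text{Var}(aX) = a^2 \text{Var}(X)$, $\text{Var}(X + c) = \text{Var}(X)$) can be checked by simulation. A sketch assuming numpy; the Gaussian choice below is arbitrary, since any $X$ with $E[X] = 1$ and $\text{Var}(X) = 1$ would do:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(1, 1, size=1_000_000)  # one choice of X with E[X] = 1, Var(X) = 1
print((5 * X).mean())  # about 5
print((5 * X).var())   # about 25
print((X + 5).var())   # about 1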

5.4 Mutual and Conditional Independence [12 pts]


1. (3 pts) If X and Y are independent random variables, show that E[ XY ] = E[ X ]E[Y ].
By definition, if two random variables X and Y are independent, their joint probability distribution
factors as:
P( X = x, Y = y) = P( X = x ) P(Y = y)
Thus, the expected value of the product of $X$ and $Y$ is:
\[
E[XY] = \sum_x \sum_y x y\, P(X = x, Y = y)
\]
Using the independence property:
\[
E[XY] = \sum_x \sum_y x y\, P(X = x) P(Y = y)
\]
This can be rewritten as:
\[
E[XY] = \left( \sum_x x P(X = x) \right) \left( \sum_y y P(Y = y) \right)
\]


Hence, we have:
E[ XY ] = E[ X ]E[Y ]

2. (3 pts) If X and Y are independent random variables, show that Var( X + Y ) = Var( X ) + Var(Y ).
Hint: Var( X + Y ) = Var( X ) + 2Cov( X, Y ) + Var(Y )
Expanding the variance of a sum of two random variables:

Var( X + Y ) = Var( X ) + Var(Y ) + 2Cov( X, Y )

Since X and Y are independent, their covariance is zero:

Cov( X, Y ) = E[ XY ] − E[ X ]E[Y ] = 0

Thus, the variance simplifies to:

Var( X + Y ) = Var( X ) + Var(Y )

3. (6 pts) If we roll two dice that behave independently of each other, will the result of the first die tell us
something about the result of the second die?
No. Since the two dice behave independently, the result of the first die tells us nothing about the result of the second die.
If, however, the first die’s result is a 1, and someone tells you about a third event — that the sum of
the two results is even — then given this information is the result of the second die independent of the
first die?
No. Given that the first die's result is 1 (an odd number), the sum can be even only if the second die is also odd, i.e., shows 1, 3, or 5. The result of the second die is therefore no longer independent of the first: conditioning on the third event restricts the second die to odd values.
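A simulation makes both claims concrete (an optional sketch, assuming numpy): unconditionally the second die is uniform regardless of the first, but conditioned on an even sum and a first roll of 1, the second die is supported only on {1, 3, 5}.

import numpy as np

rng = np.random.default_rng(0)
d1 = rng.integers(1, 7, size=1_000_000)
d2 = rng.integers(1, 7, size=1_000_000)

# Unconditionally: empirical distribution of d2 given d1 = 1 is uniform on 1..6
print(np.bincount(d2[d1 == 1], minlength=7)[1:] / (d1 == 1).sum())

# Conditioned on d1 = 1 and an even sum: d2 is uniform on {1, 3, 5} only
mask = (d1 == 1) & ((d1 + d2) % 2 == 0)
print(np.bincount(d2[mask], minlength=7)[1:] / mask.sum())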

5.5 Central Limit Theorem [3 pts]


Prove the following result.

1. Let $X_1, \dots, X_n$ be iid with $X_i \sim N(0, 1)$, and let $\bar{X} = \frac{1}{n} \sum_{i=1}^n X_i$. Then the distribution of $\bar{X}$ satisfies
\[
\sqrt{n}\, \bar{X} \xrightarrow{n \to \infty} N(0, 1)
\]

Given that Xi ∼ N (0, 1), the mean and variance of each Xi are:

\[
E[X_i] = 0 \quad \text{and} \quad \text{Var}(X_i) = 1
\]

Step 1. The expectation of the sample mean $\bar{X}$ is:
\[
E[\bar{X}] = \frac{1}{n} \sum_{i=1}^n E[X_i] = \frac{1}{n} \cdot 0 = 0
\]

Step 2. Since the $X_i$'s are independent and identically distributed (iid), the variance of $\bar{X}$ is:
\[
\text{Var}(\bar{X}) = \frac{1}{n^2} \sum_{i=1}^n \text{Var}(X_i) = \frac{1}{n^2} \cdot n = \frac{1}{n}
\]

Step 3. Consider the scaled sample mean $\sqrt{n}\, \bar{X}$. Its expectation is:


\[
E[\sqrt{n}\, \bar{X}] = \sqrt{n} \cdot E[\bar{X}] = \sqrt{n} \cdot 0 = 0
\]
The variance of $\sqrt{n}\, \bar{X}$ is:
\[
\text{Var}(\sqrt{n}\, \bar{X}) = n \cdot \text{Var}(\bar{X}) = n \cdot \frac{1}{n} = 1
\]
Moreover, $\sqrt{n}\, \bar{X}$ is a linear combination of independent Gaussians, so it is itself Gaussian; with mean 0 and variance 1, it is exactly $N(0, 1)$ for every $n$. In particular,
\[
\sqrt{n}\, \bar{X} \xrightarrow{d} N(0, 1)
\]

Thus, we have shown that:


\[
\sqrt{n}\, \bar{X} \xrightarrow{n \to \infty} N(0, 1)
\]

This completes the proof.
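The result can also be observed empirically. A minimal simulation sketch, assuming numpy (with a fixed $n$ standing in for the limit):

import numpy as np

rng = np.random.default_rng(0)
n, reps = 1000, 10_000
Xbar = rng.standard_normal((reps, n)).mean(axis=1)  # 10,000 sample means
scaled = np.sqrt(n) * Xbar
print(scaled.mean(), scaled.var())  # approximately 0 and 1, consistent with N(0, 1)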

6 Linear Algebra
6.1 Norms [5 pts]
Draw the regions corresponding to vectors $x \in \mathbb{R}^2$ with the following norms:

1. $\|x\|_1 \leq 1$ (Recall that $\|x\|_1 = \sum_i |x_i|$)

2. $\|x\|_2 \leq 1$ (Recall that $\|x\|_2 = \sqrt{\sum_i x_i^2}$)


3. $\|x\|_\infty \leq 1$ (Recall that $\|x\|_\infty = \max_i |x_i|$)
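The three regions are the diamond ($\ell_1$ ball), the unit disk ($\ell_2$ ball), and the unit square ($\ell_\infty$ ball). One way to render them (a sketch, assuming numpy and matplotlib; the drawings themselves are the required answer):

import numpy as np
import matplotlib.pyplot as plt

# Shade the subset of the square [-1.5, 1.5]^2 where each norm is <= 1
g = np.linspace(-1.5, 1.5, 601)
x1, x2 = np.meshgrid(g, g)
balls = {
    'L1 ball (diamond)': np.abs(x1) + np.abs(x2),
    'L2 ball (disk)': np.sqrt(x1**2 + x2**2),
    'Linf ball (square)': np.maximum(np.abs(x1), np.abs(x2)),
}
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, (title, norm) in zip(axes, balls.items()):
    ax.contourf(x1, x2, (norm <= 1).astype(float), levels=[0.5, 1.5])
    ax.set_title(title)
    ax.set_aspect('equal')
plt.show()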

 
For $M = \begin{bmatrix} 5 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 3 \end{bmatrix}$, calculate the following norms.
4. $\|M\|_2$ (L2 norm)
$\|M\|_2 = 7$ (the largest singular value of $M$)
5. $\|M\|_F$ (Frobenius norm)
$\|M\|_F = \sqrt{5^2 + 7^2 + 3^2} = \sqrt{25 + 49 + 9} = \sqrt{83}$
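Both matrix norms can be confirmed numerically. An optional sketch, assuming numpy:

import numpy as np

M = np.diag([5.0, 7.0, 3.0])
print(np.linalg.norm(M, 2))      # 7.0, the largest singular value
print(np.linalg.norm(M, 'fro'))  # sqrt(83), about 9.1104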

6.2 Geometry [10 pts]


Prove the following. Provide all steps.
1. The smallest Euclidean distance from the origin to some point $x$ in the hyperplane $w^T x + b = 0$ is $\frac{|b|}{\|w\|_2}$. You may assume $w \neq 0$.
Any point $x$ on the hyperplane satisfies $w^T x = -b$. By the Cauchy-Schwarz inequality,
\[
|b| = |w^T x| \leq \|w\|_2 \|x\|_2,
\]
so every point on the hyperplane has $\|x\|_2 \geq \frac{|b|}{\|w\|_2}$. This lower bound is attained: the point $x^* = -\frac{b}{\|w\|_2^2} w$ lies on the hyperplane, since $w^T x^* = -b$, and has $\|x^*\|_2 = \frac{|b|}{\|w\|_2}$. Thus, the smallest Euclidean distance from the origin to the hyperplane is
\[
d = \frac{|b|}{\|w\|_2}
\]

2. The Euclidean distance between two parallel hyperplanes $w^T x + b_1 = 0$ and $w^T x + b_2 = 0$ is $\frac{|b_1 - b_2|}{\|w\|_2}$. (Hint: you can use the result from the last question to help you prove this one.)
Since the hyperplanes are parallel, they share the same normal vector w. The distance between these
hyperplanes is the perpendicular distance between a point on one hyperplane to the other hyperplane.
Choose a point $x_1$ on the first hyperplane $w^T x + b_1 = 0$; since $x_1$ satisfies the hyperplane equation,
\[
w^T x_1 = -b_1
\]


The distance from the point $x_1$ (which lies on the first hyperplane) to the second hyperplane $w^T x + b_2 = 0$ is given by the point-to-hyperplane formula:
\[
d = \frac{|w^T x_1 + b_2|}{\|w\|_2}
\]

Substituting $w^T x_1 = -b_1$ into the formula, we get:
\[
d = \frac{|(-b_1) + b_2|}{\|w\|_2} = \frac{|b_1 - b_2|}{\|w\|_2}
\]
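A numeric spot-check of both distance formulas (a sketch, assuming numpy; the vector $w$ and offsets $b_1$, $b_2$ below are arbitrary illustrative values):

import numpy as np

w, b1, b2 = np.array([3.0, 4.0]), 2.0, -8.0  # ||w|| = 5

# Closest point on w^T x + b1 = 0 to the origin: x* = -b1 * w / ||w||^2
x_star = -b1 * w / np.dot(w, w)
print(np.linalg.norm(x_star), abs(b1) / np.linalg.norm(w))  # both 0.4

# Distance between the two parallel hyperplanes
print(abs(b1 - b2) / np.linalg.norm(w))  # 2.0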

7 Programming Skills [10 pts]


Sampling from a distribution. For each question, submit a scatter plot (you will have 2 plots in total). Make
sure the axes for all plots have the same ranges.
1. Make a scatter plot by drawing 100 items from a two-dimensional Gaussian $N((1, -1)^T, 2I)$, where $I$ is the identity matrix in $\mathbb{R}^{2 \times 2}$.

  
2. Make a scatter plot by drawing 100 items from a mixture distribution
\[
0.3\, N\!\left((5, 0)^T, \begin{bmatrix} 1 & 0.25 \\ 0.25 & 1 \end{bmatrix}\right) + 0.7\, N\!\left((-5, 0)^T, \begin{bmatrix} 1 & -0.25 \\ -0.25 & 1 \end{bmatrix}\right)
\]
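A minimal sketch for both plots, assuming numpy and matplotlib (the mixture is sampled by choosing a component per point with probability 0.3/0.7, then drawing from the chosen Gaussian):

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Plot 1: 100 draws from N((1, -1)^T, 2I)
s1 = rng.multivariate_normal([1, -1], 2 * np.eye(2), size=100)

# Plot 2: 100 draws from the two-component Gaussian mixture
C1 = np.array([[1.0, 0.25], [0.25, 1.0]])
C2 = np.array([[1.0, -0.25], [-0.25, 1.0]])
pick_first = rng.random(100) < 0.3
s2 = np.where(pick_first[:, None],
              rng.multivariate_normal([5, 0], C1, size=100),
              rng.multivariate_normal([-5, 0], C2, size=100))

for i, s in enumerate([s1, s2], start=1):
    plt.figure()
    plt.scatter(s[:, 0], s[:, 1], s=10)
    plt.xlim(-10, 10)  # same axis ranges for both plots
    plt.ylim(-10, 10)
    plt.title(f'Plot {i}')
plt.show()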
