
MIG Magazine 2024 Issue 1 - Functions


Introduction
As Term 1 quickly draws to a close, we are proud to present to you Issue 1 of MIG Magazine 2023.
For this issue, we hope that you’ll dive deep into the world of functions as you check out our 8 articles,
ranging from introductions to functions that you’ve definitely never heard of in school to the vast
applications of functions in various areas.

Have you ever wondered why polynomial graphs look a certain way? If so, check out Xavier’s
article, which explains how we can visualise the shape of polynomial graphs from their expressions. For
those who wish to explore what else you can do with the basic properties of functions you’ve learnt in
MA3132, do consider reading Ethan and Qi Huan’s article on functional equations.

For those who are feeling rather adventurous and want to explore some unorthodox functions to feed their
curious minds, we have three articles just for that. Firstly, if you love solving integrals, you should
take a look at Isaac’s article on Gamma Functions, which could help you a bit with that. Next, we
have Naveen and Hao Xuan’s article on the famous and mysterious Riemann Zeta function. Finally,
we have Junsen’s article on the Collatz Conjecture, which shows us the wonderful things that just 2
simple arithmetic operations can do.

To the many Math Olympiad enthusiasts out there, we have the perfect article for you. Check out
Tyler and Drew’s article on Inversions in Geometry for some useful problem-solving ideas.

For the many physics lovers out there, do jump straight to Prannaya and Vishal’s article on Lie
derivatives. As someone once said, “All good things come in threes”. This article does just that by covering
the application of functions in fluid dynamics, classical mechanics and quantum mechanics!

Don’t worry, CS majors, we didn’t leave you out. Do check out Pranjal’s article, which covers the
mathematical magic behind hash functions.

The new and improved MIG magazine this time round has some pleasant surprises for you. Do have
a good laugh while checking out our new “Fun Facts & Memes” section. Don’t miss out on our
exciting “Challenge Problems” section where you can rack your brains trying to solve our well-crafted
questions. It will be worth it as there are prizes to be won!

Last but not least, we would like to express our greatest appreciation to our whole MIG magazine
team for giving your all for this term’s issue. We couldn’t have done it without you!

Hope that you enjoy all the articles here and we hope this issue gets you all excited for the next three
issues to come this year!

Yours Sincerely,
The Mathematics Interest Group (MIG)

Graphs of Polynomial Functions
Xavier Ang (M24402)

1 Introduction
Polynomials are one of the many basic types of functions, and all polynomials
in x have graphs. This article investigates the graphs of polynomial functions,
and how they vary with different coefficients.

2 Linear Function $f(x) = mx + c$

For a linear function $f(x) = mx + c$, the constant term $c$ is the
y-intercept of the graph, and the coefficient $m$ is the gradient of the
graph.
If $m < 0$, the graph slopes downward. If $m > 0$, the graph slopes upward. The
larger $|m|$ is, the steeper the linear graph.
By letting $f(x) = 0$, the x-intercept is found to be $x = -\frac{c}{m}$ (for $m \neq 0$).

Figure 1: Graphs of different linear functions $f(x) = mx + c$. In figure (a), $m$ is varied, but in figure (b), $c$ is varied.

3 Quadratic Function $f(x) = ax^2 + bx + c$

For a quadratic function $f(x) = ax^2 + bx + c$, where $a \neq 0$,
we can complete the square to write the function in the
form $f(x) = a(x - h)^2 + k$, where $(h, k)$ is the turning point of the
graph.

The process of completing the square is as follows:

$$
\begin{aligned}
ax^2 + bx + c &= a\left(x^2 + \frac{b}{a}x\right) + c \\
&= a\left[\left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a^2}\right] + c \\
&= a\left(x + \frac{b}{2a}\right)^2 + \frac{4ac - b^2}{4a}
\end{aligned}
$$

We can hence conclude that the turning point of the quadratic graph is $\left(-\frac{b}{2a}, \frac{4ac - b^2}{4a}\right)$,
or simply $\left(-\frac{b}{2a}, f\left(-\frac{b}{2a}\right)\right)$.
The axis of symmetry is given by the vertical line $x = -\frac{b}{2a}$. The quadratic
graph is a parabola, and its shape is determined by the leading coefficient $a$.
If $a > 0$, the graph is concave upward. If $a < 0$, the graph is concave downward.
The constant term $c$ is also the y-intercept of the function.
The number of x-intercepts of the graph is determined by the discriminant of
the function, $\Delta = b^2 - 4ac$.

1. If $\Delta > 0$, there are 2 x-intercepts, which can be found using the quadratic
formula.
• Since $\Delta > 0$, $\sqrt{\Delta} > 0$. When substituted into the quadratic formula,
the 2 roots are hence distinct real numbers, which can be represented on the
x-axis.
2. If $\Delta = 0$, there is 1 x-intercept, which can be found using $x = -\frac{b}{2a}$.
• Since $\Delta = 0$, $\sqrt{\Delta} = 0$. When substituted into the quadratic
formula, the formula thus simplifies to $-\frac{b}{2a}$, which indicates only 1
x-intercept.
3. If $\Delta < 0$, there are no x-intercepts.
• Since $\Delta < 0$, $\sqrt{\Delta}$ is not a real number. When substituted into the quadratic formula,
the formula outputs non-real complex numbers, which cannot be represented
on the x-axis.
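
The three discriminant cases can be tried out numerically. The following is a minimal Python sketch, not part of the original article; the function name `quadratic_summary` is a hypothetical helper, and it uses exact floating-point comparison of the discriminant with 0, which is only reliable for simple coefficients.

```python
import math

def quadratic_summary(a, b, c):
    """Return the turning point (h, k) and the list of real x-intercepts
    of f(x) = a*x^2 + b*x + c, using the discriminant b^2 - 4ac."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic")
    disc = b * b - 4 * a * c            # the discriminant
    h = -b / (2 * a)                    # x-coordinate of the turning point
    k = (4 * a * c - b * b) / (4 * a)   # y-coordinate of the turning point
    if disc > 0:
        r = math.sqrt(disc)
        roots = sorted([(-b - r) / (2 * a), (-b + r) / (2 * a)])
    elif disc == 0:
        roots = [h]                     # repeated root at the turning point
    else:
        roots = []                      # no real roots, so no x-intercepts
    return (h, k), roots

print(quadratic_summary(1, -3, 2))  # two x-intercepts
print(quadratic_summary(1, 2, 1))   # one x-intercept
print(quadratic_summary(1, 0, 1))   # no x-intercepts
```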

Figure 2: A quadratic graph for the function $f(x) = ax^2 + bx + c$ where
$\Delta > 0$. $x_1$ and $x_2$ are the x-intercepts of the graph.

4 Cubic Function $f(x) = ax^3 + bx^2 + cx + d$

For a cubic function $f(x) = ax^3 + bx^2 + cx + d$, where $a \neq 0$, some properties
of its graph are as follows:
1. Cubic graphs have at least 1 x-intercept. It is possible for cubic graphs to
have 2 or 3 x-intercepts.
2. Cubic graphs can have 0, 1 or 2 critical points. Cubic graphs have exactly one
inflection point.
• Since the derivative of a cubic function is $f'(x) = 3ax^2 + 2bx + c$, the
x-coordinates of the critical points can be found by solving $f'(x) = 0$,
which, by the quadratic formula, gives $x = \frac{-b \pm \sqrt{b^2 - 3ac}}{3a}$.
• By taking $x = -\frac{b}{3a}$, we obtain the x-coordinate of the turning point
of its derivative, which is also the x-coordinate of the inflection point
of the cubic function.
3. If $a > 0$, $f(x)$ tends to infinity as $x$ increases to infinity. If $a < 0$,
$f(x)$ tends to negative infinity as $x$ increases to infinity.
4. The constant term $d$ is the y-intercept of the graph.
The number of critical points and x-intercepts of the cubic graph can be determined
as follows:
1. If $b^2 - 3ac < 0$, then the cubic graph has 0 critical points. Hence, the
graph only has 1 x-intercept.
2. If $b^2 - 3ac = 0$, then the cubic graph has only 1 critical point. The graph
still has only 1 x-intercept.
3. If $b^2 - 3ac > 0$, then the cubic graph has 2 critical points. The graph can
have 1, 2 or 3 x-intercepts, determined as follows: (let $y_{c1}$ and $y_{c2}$ be the
local maximum and minimum values of the function respectively)
(a) If $y_{c1}$ and $y_{c2}$ are both more than 0 or both less than 0, then the cubic graph
has only 1 x-intercept.
(b) If either $y_{c1} = 0$ or $y_{c2} = 0$, then the cubic graph has 2 x-intercepts.
(c) If $y_{c1} > 0$ and $y_{c2} < 0$, then the cubic graph has 3
x-intercepts.

Figure 3: A cubic graph (in red) for the function $f(x) = ax^3 + bx^2 + cx + d$
where there are 3 x-intercepts. The blue graph represents $f'(x)$.
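
The case analysis for cubic graphs can be sketched in Python as follows. This is an illustrative sketch rather than the author's method: the name `cubic_intercept_count` and the tolerance parameter `tol` are assumptions introduced here, and deciding whether a critical value is "equal to 0" in floating point is inherently approximate.

```python
import math

def cubic_intercept_count(a, b, c, d, tol=1e-12):
    """Count the x-intercepts of f(x) = a*x^3 + b*x^2 + c*x + d by looking
    at b^2 - 3ac and the signs of f at the two critical points."""
    if a == 0:
        raise ValueError("a must be non-zero for a cubic")

    def f(x):
        return ((a * x + b) * x + c) * x + d   # Horner evaluation

    disc = b * b - 3 * a * c
    if disc <= 0:
        return 1                    # 0 or 1 critical point: exactly one intercept
    r = math.sqrt(disc)
    y1 = f((-b - r) / (3 * a))      # value at one critical point
    y2 = f((-b + r) / (3 * a))      # value at the other critical point
    if abs(y1) <= tol or abs(y2) <= tol:
        return 2                    # the graph touches the x-axis at a turning point
    if y1 * y2 < 0:
        return 3                    # critical values on opposite sides of the x-axis
    return 1                        # both critical values on the same side

print(cubic_intercept_count(1, 0, -3, 0))   # x^3 - 3x
print(cubic_intercept_count(1, 0, -3, 2))   # x^3 - 3x + 2
print(cubic_intercept_count(1, 0, 1, 0))    # x^3 + x
```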

5 Polynomial function for polynomials of degree 2 and above

For a general polynomial function $f(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_2 x^2 + a_1 x + a_0$,
where $a_n \neq 0$,
the y-intercept of the graph is always $a_0$. We can show that this is always true
by substituting $x = 0$:

$$
\begin{aligned}
f(x) &= a_n x^n + a_{n-1} x^{n-1} + \dots + a_2 x^2 + a_1 x + a_0 \\
f(0) &= a_n (0)^n + a_{n-1} (0)^{n-1} + \dots + a_2 (0)^2 + a_1 (0) + a_0 \\
f(0) &= a_0
\end{aligned}
$$

The degree of the polynomial, $n$, and the leading coefficient, $a_n$, determine the
shape of the polynomial graph. For polynomials with degree $n \geq 2$,
1. If $n$ is even,
(a) If $a_n > 0$, as $x \to \infty$, $f(x) \to \infty$. As $x \to -\infty$, $f(x) \to \infty$.
(b) If $a_n < 0$, as $x \to \infty$, $f(x) \to -\infty$. As $x \to -\infty$, $f(x) \to -\infty$.
2. If $n$ is odd,
(a) If $a_n > 0$, as $x \to \infty$, $f(x) \to \infty$. As $x \to -\infty$, $f(x) \to -\infty$.
(b) If $a_n < 0$, as $x \to \infty$, $f(x) \to -\infty$. As $x \to -\infty$, $f(x) \to \infty$.

Figure 4: Figure (a) shows an example of the range of polynomials of
degree 2 ($n$ is even), while figure (b) shows an example of the range
of polynomials of degree 3 ($n$ is odd).
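
The four end-behaviour cases depend only on the parity of $n$ and the sign of $a_n$, which a few lines of Python can make explicit. This is a small sketch; the function name `end_behaviour` and the convention of returning signs as $+1$ and $-1$ are assumptions made for illustration.

```python
def end_behaviour(n, leading_coeff):
    """Return (left, right): the sign of f(x) as x -> -infinity and as
    x -> +infinity, where +1 means f(x) -> +infinity and -1 means
    f(x) -> -infinity."""
    if n < 1 or leading_coeff == 0:
        raise ValueError("need degree n >= 1 and a non-zero leading coefficient")
    right = 1 if leading_coeff > 0 else -1     # behaviour for large positive x
    left = right if n % 2 == 0 else -right     # even degree: same; odd: opposite
    return left, right

print(end_behaviour(2, 1))    # even degree, positive leading coefficient
print(end_behaviour(3, -2))   # odd degree, negative leading coefficient
```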

A polynomial of degree $n$ can have up to $n - 1$ critical points. At the critical
points, $\frac{dy}{dx} = 0$.
• The critical points can either be local maximum points, local minimum
points or neither. (The first or second derivative tests can identify if it is
a local maximum or local minimum point.)
By the Fundamental Theorem of Algebra, a polynomial equation of degree $n$
has exactly $n$ complex roots, counted with multiplicity.
Consequently, a polynomial function of degree $n$ can have up to $n$ real roots. The
number of x-intercepts of the polynomial graph is equal to the number of distinct real
solutions of $f(x) = 0$.
• The graph of a polynomial with an odd degree has at least 1 x-intercept,
since polynomials with an odd degree have at least 1 real root.

6 Conclusion
The graph of a polynomial depends on its degree and its leading coefficient.
By changing the coefficients of the terms in the polynomial, the shape of the
polynomial graph will also change. There are a few varieties of polynomial
graphs:
1. Linear graph: The graph of a linear polynomial is a straight line, with 1
x-intercept (when the gradient is non-zero).
2. Graphs for even degree polynomials: The graph is a curve, and has a range
of either $(-\infty, \alpha]$ when the leading coefficient is negative, or $[\beta, \infty)$ when
the leading coefficient is positive, where $\alpha$ and $\beta$ are real values. They
may not have x-intercepts.

3. Graphs for odd degree polynomials: The graph is a curve, and has a range
of $(-\infty, \infty)$. They have at least 1 x-intercept.

7 Problems
1. Find the number of maximum points of the polynomial function $f(x) =
(x - 100)(x - 98) \cdots (x - 2)(x)(x + 2) \cdots (x + 100)$.
2. Determine the range of $g(x) = (x - 1)^2 (x + 1)^2 - 1$.

8 Solutions to Problems
1. 50.
The function $f(x)$ has degree 101 and has 101 distinct real roots, thus it has
100 critical points (one strictly between each pair of consecutive roots). Notice that the critical points alternate between maximum
and minimum points for this function, so the number of maximum
points is 50.
2. $[-1, \infty)$
By using the second derivative test on $x = -1$, $x = 0$ and $x = 1$, we find
that $(1, -1)$ and $(-1, -1)$ are minimum points while $(0, 0)$ is a maximum
point. Since this is a polynomial of degree 4 and the leading coefficient is
positive, the range is hence $[-1, \infty)$.

9 References
1. https://math.libretexts.org/Bookshelves/Algebra/Book%3A_Advanced_Algebra/06%3A_Solving_Equations_and_Inequalities/604%3A_Quadratic_Functions_and_Their_Graphs
2. https://math.libretexts.org/Courses/Borough_of_Manhattan_Community_College/MAT_206_Precalculus/3%3A_Polynomial_and_Rational_Functions_New/3.4%3A_Graphs_of_Polynomial_Functions
3. https://amsi.org.au/ESA_Senior_Years/SeniorTopic2/2e/2e_2content_2.html#content_3

Functional Equations
Bor Qi Huan (M24405)
Goh Wei Cong Ethan (M24404)

1 Introduction
This article recaps the various types of functions (injective, surjective, bijective,
involutory), and will detail certain elementary strategies in tackling functional
equations (FEs) through worked examples. We also introduce Cauchy’s additive
functional equation and analyse how this classic FE can help solve functional
equations.

2 Revisiting Functions
For a function $f : A \to B$, we define $A$ to be the ‘domain’, and $B$ to be the
‘co-domain’ (i.e. $f$ maps values from the domain $A$ to the co-domain $B$).

Another useful group of functions is the involutions. A function satisfying
$f(f(x)) = x$ for all $x \in D_f$ is called an involution. Among other properties,
involutions are all bijective.

A function $f$ is called surjective if for each value $b \in B$, there exists some value
$a \in A$ where $f(a) = b$. Likewise, $f$ is called injective if $f(a) = f(b)$ implies
$a = b$. If $f$ is both injective and surjective, it is called bijective. These types of
functions are illustrated in Figure 1.

Recall also that a function $f(x)$ is called even when $f(-x) = f(x)$ for all
$x \in D_f$. Similarly, it is called odd when $f(-x) = -f(x)$ for all $x \in D_f$.

Some examples of even functions include $x^{2n}$ where $n$ is a positive integer, the
hyperbolic cosine, $\cosh x$, and the cosine function, $\cos x$. Graphically, even
functions are symmetrical about the y-axis. The sum of even functions is also an
even function, such as $y = \frac{1}{2}x^4 - 3x^2 - \cosh x + \cos x$.

Some examples of odd functions include $x^{2n+1}$ where $n$ is a positive integer, the
hyperbolic sine, $\sinh x$, and the sine function, $\sin x$. Graphically, odd functions
are symmetrical about the origin. Similarly, the sum of odd functions is also
odd, such as $y = \frac{1}{3}x^3 - 10x + 2\sin x - \frac{1}{10}\sinh x$.
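
Evenness and oddness can be spot-checked numerically by comparing $f(-x)$ with $f(x)$ and $-f(x)$ at a few sample points. The sketch below is only a sanity check, not a proof: the helper name `parity`, the sample points and the tolerance are all assumptions, and the two example functions are the sums quoted above with the minus signs as reconstructed there.

```python
import math

def parity(f, samples=(0.5, 1.0, 2.0, 3.5), tol=1e-9):
    """Classify f as 'even', 'odd', 'both' or 'neither' on the sample points."""
    even = all(math.isclose(f(-x), f(x), rel_tol=tol, abs_tol=tol) for x in samples)
    odd = all(math.isclose(f(-x), -f(x), rel_tol=tol, abs_tol=tol) for x in samples)
    if even and odd:
        return "both"      # only the zero function is both on all inputs
    if even:
        return "even"
    if odd:
        return "odd"
    return "neither"

def even_example(x):
    return 0.5 * x**4 - 3 * x**2 - math.cosh(x) + math.cos(x)

def odd_example(x):
    return x**3 / 3 - 10 * x + 2 * math.sin(x) - math.sinh(x) / 10

print(parity(even_example), parity(odd_example), parity(lambda x: x + 1))
```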

Figure 1: Table of injective, surjective and bijective functions

(a) An even function (b) An odd function

Figure 2: Odd and even functions

3 Strategies to Solve FEs


One common strategy for functional equations is substitution, especially for
$\mathbb{R} \to \mathbb{R}$ functions where one can substitute in 0. Substitution can yield useful
properties of the function such as evenness/oddness, injectivity/surjectivity etc.

Note: One should always remember to check that their solutions work after
solving the FE.

For example, consider the following problem (Kyrgyzstan 2012):

Example 3.1. Find all functions $f : \mathbb{R} \to \mathbb{R}$ such that

$$f(f(x)^2 + f(y)) = x f(x) + y$$

for all $x, y \in \mathbb{R}$.

Solution: Note that $f(x) = x$ clearly works. A little reflection reveals that
$f(x) = -x$ also works. Now we want to show that $f(x) = \pm x$ are the only
solutions. Let $P(x, y)$ be the assertion that $f(f(x)^2 + f(y)) = x f(x) + y$. Now

$$P(0, 0) \Rightarrow f(f(0)^2 + f(0)) = 0 \tag{1}$$

This is not very nice, but if we let $m$ be the inner term $f(0)^2 + f(0)$ (note that
$f(m) = 0$), we get this:

$$P(m, y) \Rightarrow f(f(m)^2 + f(y)) = m f(m) + y \Rightarrow f(f(y)) = y \tag{2}$$

This implies that $f$ is an involution. In particular, it is also a bijection.

We can now let $x = f(a)$, so that $f(x) = f(f(a)) = a$, to remove the $f$’s.
Substituting into the original equation and comparing with $P(a, y)$ gives:

$$f(f(x)^2 + f(y)) = a f(a) + y = f(f(a)^2 + f(y)) \tag{3}$$

Since $f(x) = a$ and $f$ is injective, we have

$$a^2 + f(y) = f(a)^2 + f(y)$$

$$f(a)^2 = a^2$$

$$f(a) = \pm a.$$

It looks like we are done, but we have to exclude the strange possibility of $f(a)$
being $a$ at certain points and $-a$ at others. However, this is easy to resolve and
is left as an exercise for the reader. (Hint: Consider the properties of $f$ that we
determined so far.)
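
The advice to check that candidate solutions actually work can be automated. The following Python sketch tests the equation of Example 3.1 on random integer inputs; it can only refute a candidate, never prove one, and the helper name `satisfies_equation`, the input range and the fixed seed are assumptions made for illustration.

```python
import random

def satisfies_equation(f, trials=1000, seed=0):
    """Spot-check f(f(x)^2 + f(y)) == x*f(x) + y on random integer pairs.
    Integers are used so that the comparison is exact."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.randint(-50, 50), rng.randint(-50, 50)
        if f(f(x) ** 2 + f(y)) != x * f(x) + y:
            return False   # a single counterexample rules the candidate out
    return True

print(satisfies_equation(lambda t: t))       # candidate f(x) = x
print(satisfies_equation(lambda t: -t))      # candidate f(x) = -x
print(satisfies_equation(lambda t: 2 * t))   # a candidate that should fail
```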

4 Cauchy’s functional equation over $\mathbb{Q}$

Cauchy’s functional equation over $\mathbb{Q}$ is $f(x + y) = f(x) + f(y)$, for all
$x, y \in \mathbb{Q}$ (the set of rational numbers). Solutions to Cauchy’s functional
equation over $\mathbb{R}$ are outside the scope of this article. One sees the trivial solution
$f(x) = kx$ for some rational constant $k$, and might conjecture there are no
other solutions. It turns out this is indeed the case, and the demonstration of
the nonexistence of other solutions is a good exercise in functional equations.

Sketch of proof: Let $P(x, y)$ be the assertion that $f(x + y) = f(x) + f(y)$, where
$x, y$ are rational.

$$P(0, 0) \Rightarrow f(0) + f(0) = f(0) \Rightarrow f(0) = 0. \tag{4}$$

Note how strategic substitution yields the useful value of $f(0)$.

$$P(k, -k) \Rightarrow f(k) + f(-k) = f(0) = 0 \Rightarrow f(-k) = -f(k) \tag{5}$$

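i.e. $f$ is an odd function. We claim that $f(ax) = a f(x)$ for all integers $a$ (*).
For $a \in \mathbb{Z}^+$ ($\mathbb{Z}^+ = \{1, 2, 3, \dots\}$), the claim follows by induction on $a$,
applying $P((a - 1)x, x)$ at each step. The case $a = 0$ is (4).
Now suppose $a < 0$. Since $-a \in \mathbb{Z}^+$, we have

$$f(ax) = f(-a \cdot (-x)) = -a f(-x) = -a \cdot (-f(x)) = a f(x) \tag{6}$$

This proves (*). If we can show now that (*) holds for any rational $q = m/n$ (in
its lowest form), where $m, n$ are integers with $n \neq 0$, the result follows. Note first that (*) gives
$f(y) = f\left(n \cdot \frac{y}{n}\right) = n f\left(\frac{y}{n}\right)$, so $f\left(\frac{1}{n} y\right) = \frac{1}{n} f(y)$. Hence

$$f(qx) = f\left(\frac{m}{n} \cdot x\right) = f\left(\frac{1}{n} \cdot mx\right) = \frac{1}{n} f(mx) = \frac{1}{n} \cdot m f(x) = q f(x) \tag{7}$$

Setting $x = 1$ gives $f(q) = q f(1)$ for every rational $q$, so $f(x) = kx$ with $k = f(1)$.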

Cauchy’s equation is very useful in situations such as solving Jensen’s FE (which
we do not have the space to go through) and others. It is important however to
note that the result needs stronger conditions when the domain of the function
is not $\mathbb{Q}$.

The following is an appetiser problem. It is rather delicious though the serving
is quite small.
Example 4.1. Find all functions $f : \mathbb{Z}^+ \to \mathbb{Z}^+$ such that

$$(f(a) + b) \mid a^2 + f(a) + f(b)$$

for all $a, b \in \mathbb{Z}^+$.
This problem is left as an exercise to the reader.

5 Recap
To end off the article, the order in which you go about solving functional
equations is (usually) as follows:
- Guess your solutions. $f(x) = ax + b$ is often the answer; the hard part is just
to prove it
- Substitute values such as 0 and ±1 and check if this reveals anything
- Substitute more complex terms which allow for cancellation of terms
- Prove injectivity or surjectivity if the opportunity presents itself
- Look for symmetry, evenness/oddness and involutions

6 Bonus problems
Find all functions $f$ such that:
1. $(z + 1) f(x + y) = f(x f(z) + y) + f(y f(z) + x)$ for all $x, y, z \in \mathbb{R}^+$
2. $f(x^2 + f(y)) = f(f(x)) + f(y^2) + 2f(xy)$ for all $x, y \in \mathbb{R}$
3. $f(x + y + f(y)) = 4044x - f(x) + f(2023y)$ for all $x, y \in \mathbb{R}^+$

Gamma Function
Isaac Tan (M24301)

1 The Factorial
Let’s take a look at a simple question.

A teacher has 7 classes of students. He wants to arrange these 7 classes of
students into 7 rooms. How many ways can he arrange the classes into the
rooms?
Firstly, let’s try to visualise this question. Shown below are the 7 rooms (shown
as boxes):

Room 1 Room 2 Room 3 Room 4 Room 5 Room 6 Room 7

Firstly, let’s look at room 1. In room 1, we can place any of the 7 possible
classes inside. Next, in room 2, since one of the classes was placed in room 1,
we can place 6 possible classes in room 2. Continuing on, we can place 5 possible
classes in room 3, 4 possible classes in room 4 etc, all until we are left with the
last class, which we can place in room 7. Hence the number of ways to arrange
the classes into rooms = 7 × 6 × 5 × 4 × 3 × 2 × 1 = 5040. We have a notation
for this: 7! = 7 × 6 × 5 × 4 × 3 × 2 × 1 = 5040.
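
The count can be confirmed by brute force: list every assignment of the 7 classes to the 7 rooms and count them. This is a short illustrative sketch; labelling the classes 1 to 7 is an assumption made here.

```python
from itertools import permutations
import math

classes = range(1, 8)   # label the 7 classes 1..7 (labels are arbitrary)

# Each permutation is one way of placing the classes into rooms 1..7 in order.
count = sum(1 for _ in permutations(classes))

print(count, math.factorial(7))   # both values are 7!
```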

Factorial Definition
(a) For a positive integer $n$, $n! = n \times (n - 1) \times (n - 2) \times (n - 3) \times \dots \times 2 \times 1$
(b) For any real number $x$ such that $x$ is not 0 or a negative integer, $x! = x \times (x - 1)!$.

Exercise 1. Prove (b) for the case where x is a positive integer n.

Now, let’s look at another way to express the factorial.

2 The Gamma Function


Let’s look at the graph of y = x!.

As you can see, the graph forms a “tooth-like” shape, with vertical
asymptotes at $x = n$ where $n$ is a negative integer.

However, we notice that we can evaluate the factorial function at non-integer
values like $e$ and even some negative values like $-\frac{\pi}{e}$. Why is this so, and how
do we calculate these values?

Let’s look at another definition of the factorial function:

Definition of the Gamma Function

$$x! = \Gamma(x + 1) = \int_0^\infty t^x e^{-t} \, dt$$

$\Gamma(x)$ is read as “Gamma of x”. We can write this as

$$(x - 1)! = \Gamma(x) = \int_0^\infty t^{x-1} e^{-t} \, dt$$

For example, $\Gamma(5) = 4! = 24$.
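
To see the integral definition at work, the integral can be approximated numerically and compared with known values. The sketch below uses composite Simpson's rule on a truncated interval; the function name `factorial_integral`, the cut-off `upper=60.0` and the step count are assumptions chosen for illustration, and the approach is only suitable for $x \geq 0$ (for negative $x$ the integrand blows up at $t = 0$).

```python
import math

def factorial_integral(x, upper=60.0, steps=200_000):
    """Approximate x! = integral from 0 to infinity of t^x e^(-t) dt
    by Simpson's rule on [0, upper]. Intended for x >= 0 only."""
    if steps % 2:
        steps += 1                       # Simpson's rule needs an even step count
    h = upper / steps

    def g(t):
        return t**x * math.exp(-t)       # the integrand

    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

print(factorial_integral(4))      # compare with 4! = 24
print(factorial_integral(0.5))    # compare with math.gamma(1.5)
```

The second call evaluates a non-integer factorial, $(0.5)!$, which the product definition cannot handle directly.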

Proof of Integral Form of Gamma Function
Let $A$ be a positive real number. Consider the integral $\int_0^\infty e^{-Ax} \, dx$.

$$\int_0^\infty e^{-Ax} \, dx = -\frac{1}{A}\left[e^{-Ax}\right]_0^\infty = -\frac{1}{A}[0 - 1] = \frac{1}{A}$$

Differentiating both sides with respect to $A$, we get:

$$\frac{d}{dA}\int_0^\infty e^{-Ax} \, dx = \frac{d}{dA}\left(\frac{1}{A}\right)$$

By Leibniz’s Integral Rule,

$$\int_0^\infty \frac{\partial}{\partial A} e^{-Ax} \, dx = -\frac{1}{A^2}$$

$$\int_0^\infty (-x) e^{-Ax} \, dx = -\frac{1}{A^2}$$

Differentiating both sides with respect to $A$ again, we get:

$$\int_0^\infty (-x)^2 e^{-Ax} \, dx = \frac{2}{A^3}$$

Differentiating both sides with respect to $A$ again, we get:

$$\int_0^\infty (-x)^3 e^{-Ax} \, dx = -\frac{6}{A^4}$$

We notice a pattern here: when we take $\frac{d^n}{dA^n}$ on both sides of the original
equation, we get

$$(-1)^n \int_0^\infty x^n e^{-Ax} \, dx = (-1)^n \frac{n!}{A^{n+1}}$$

$$\int_0^\infty x^n e^{-Ax} \, dx = \frac{n!}{A^{n+1}}$$

Letting $A = 1$, we get $n! = \int_0^\infty x^n e^{-x} \, dx$. (proven)
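
The pattern can be confirmed by induction rather than by observation alone. As a sketch in the same notation: suppose that for some integer $n \geq 0$ we already know $\int_0^\infty x^n e^{-Ax} \, dx = \frac{n!}{A^{n+1}}$ (the case $n = 0$ is the first integral evaluated in the proof). Differentiating both sides with respect to $A$, again by Leibniz’s Integral Rule, gives

$$-\int_0^\infty x^{n+1} e^{-Ax} \, dx = -\frac{(n + 1) \, n!}{A^{n+2}} \quad \Longrightarrow \quad \int_0^\infty x^{n+1} e^{-Ax} \, dx = \frac{(n + 1)!}{A^{n+2}},$$

which is the same formula with $n$ replaced by $n + 1$, so the formula holds for every non-negative integer $n$.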

Hence, we can use this integral form to evaluate the factorial of non-integers.
Exercise 2.
(a) Evaluate $\Gamma(6)$.
(b) Evaluate $\int_0^\infty x^8 e^{-x} \, dx$.
(c) Given $\int_0^\infty \frac{e^{-x}}{\sqrt{x}} \, dx = \sqrt{\pi}$, evaluate $\Gamma\left(-\frac{1}{2}\right)$.
(d) Using the Gamma Function, evaluate the Gaussian Integral: $\int_{-\infty}^{\infty} e^{-x^2} \, dx$.
(Challenging)

3 Euler’s Reflection Formula
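Euler’s Reflection Formula is the following identity:

$$\Gamma(x)\Gamma(1 - x) = \frac{\pi}{\sin(\pi x)}$$

If we let $x = \frac{1}{2}$, then we will get:

$$
\begin{aligned}
\Gamma\left(\frac{1}{2}\right)\Gamma\left(1 - \frac{1}{2}\right) &= \frac{\pi}{\sin\left(\frac{\pi}{2}\right)} \\
\left(\Gamma\left(\frac{1}{2}\right)\right)^2 &= \pi \\
\Gamma\left(\frac{1}{2}\right) &= \sqrt{\pi}
\end{aligned}
$$

(We take the positive root because the integrand defining $\Gamma\left(\frac{1}{2}\right)$ is positive.)
The value $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$ is very important. Using it, we can find values like $\Gamma\left(\frac{3}{2}\right)$
and $\Gamma\left(\frac{5}{2}\right)$. We can also use this to evaluate the Gaussian Integral (refer to
exercise 2(d)).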
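
The reflection formula is easy to check numerically with Python's standard library, whose `math.gamma` implements the Gamma function. This is a quick sanity check at a few sample points rather than a proof; the helper name `reflection_sides` is an assumption.

```python
import math

def reflection_sides(x):
    """Return both sides of Gamma(x) * Gamma(1 - x) = pi / sin(pi * x)."""
    lhs = math.gamma(x) * math.gamma(1 - x)
    rhs = math.pi / math.sin(math.pi * x)
    return lhs, rhs

for x in (0.5, 0.25, 0.3, -1.5):
    print(x, reflection_sides(x))

# Values built from Gamma(1/2) = sqrt(pi) using Gamma(x + 1) = x * Gamma(x)
print(math.gamma(0.5), math.sqrt(math.pi))
print(math.gamma(1.5), math.sqrt(math.pi) / 2)
```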

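We can also use Euler’s Reflection Formula to find out why the graph of $y = x!$
has vertical asymptotes at negative integers.

Proof that $y = x!$ has vertical asymptotes at all negative integers $x$

Letting $x = 0$ in $\Gamma(x)\Gamma(1 - x) = \frac{\pi}{\sin(\pi x)}$, we get:

$$
\begin{aligned}
\Gamma(0)\Gamma(1 - 0) &= \frac{\pi}{\sin(0\pi)} \\
(-1)! \cdot 1 &= \frac{\pi}{0} \\
(-1)! &= \frac{\pi}{0}
\end{aligned}
$$

However, $\frac{\pi}{0}$ is undefined, as we cannot divide by 0. Hence, $(-1)! = \Gamma(0)$ is undefined,
and the graph of $y = x!$ has a vertical asymptote at $x = -1$.

Recall that
$$\Gamma(x + 1) = x\Gamma(x).$$
Taking $x = -1$ gives $\Gamma(0) = (-1)\Gamma(-1)$, so $\Gamma(-1)$ is undefined as well.
Similarly, $\Gamma(-1) = (-2)\Gamma(-2)$, implying that $\Gamma(-2)$ is undefined.
We can repeat this process to prove that $\Gamma(-3)$, $\Gamma(-4)$ etc. are all undefined.
Since $x! = \Gamma(x + 1)$, the graph of $y = x!$ has vertical asymptotes at all $x = n$ where $n$ is a
negative integer.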
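
The asymptote at $x = -1$ can also be observed numerically: values of $x!$ just to the right of $-1$ are huge and positive, values just to the left are huge and negative, and at $-1$ itself Python's `math.gamma` refuses to return a value. The helper name `factorial_real` below is an assumption introduced for this sketch.

```python
import math

def factorial_real(x):
    """x! for real x, computed as Gamma(x + 1)."""
    return math.gamma(x + 1)

# Approach x = -1 from both sides: the magnitudes grow without bound.
for eps in (1e-2, 1e-4, 1e-6):
    print(eps, factorial_real(-1 + eps), factorial_real(-1 - eps))

# At x = -1 exactly, the Gamma function has a pole and math.gamma raises an error.
try:
    factorial_real(-1)
except ValueError as err:
    print("(-1)! is undefined:", err)
```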

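4 Solutions to Exercise Problems
Exercise 1

$$
\begin{aligned}
n! &= n \times (n - 1) \times (n - 2) \times \cdots \times 1 \\
&= n \times ((n - 1) \times (n - 2) \times \cdots \times 1) \\
&= n \times (n - 1)!
\end{aligned}
$$

Exercise 2(a)

$$\Gamma(6) = 5! = 120$$

Exercise 2(b)

$$\int_0^\infty x^8 e^{-x} \, dx = 8! = 40320$$

Exercise 2(c)

$$\int_0^\infty \frac{e^{-x}}{\sqrt{x}} \, dx = \left(-\frac{1}{2}\right)! = \Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$$

Since $\Gamma(x + 1) = x\Gamma(x)$,

$$
\begin{aligned}
\Gamma\left(\frac{1}{2}\right) &= -\frac{1}{2}\Gamma\left(-\frac{1}{2}\right) \\
\Gamma\left(-\frac{1}{2}\right) &= -2\sqrt{\pi}
\end{aligned}
$$

Exercise 2(d)
Let $I = \int_{-\infty}^{\infty} e^{-x^2} \, dx$ and $f(x) = e^{-x^2}$.
Since $f(-x) = e^{-(-x)^2} = e^{-x^2} = f(x)$, $f$ is an even function. Hence,

$$I = 2\int_0^\infty e^{-x^2} \, dx$$

Let $u = x^2$, then $\frac{du}{dx} = 2x \Rightarrow dx = \frac{du}{2x} = \frac{du}{2\sqrt{u}}$ (for $x > 0$).
When $x = 0$, $u = 0$. As $x \to \infty$, $u \to \infty$. So,

$$I = 2\int_0^\infty e^{-u} \frac{du}{2\sqrt{u}} = \int_0^\infty u^{-\frac{1}{2}} e^{-u} \, du = \Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$$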

5 References
1. Proof of Integral Form of Gamma Function:
https://www.youtube.com/watch?v=3x6yvmZg91E
2. Proof of Euler’s Reflection Formula:
https://www.youtube.com/watch?v=C1TMEo12DIQ
3. For further reading:
https://madasmaths.com/archive/maths_booklets/advanced_topics/gamma_and_beta_functions.pdf

Riemann’s Zeta Function
V T Naveen Mugundh (M24301)
Ng Hao Xuan (M24303)

Have you heard of the one-million-dollar bounty for the solution of the Riemann
Hypothesis? This problem has evaded many of history’s greatest
minds. But how did the function behind it even come about?

1 History of the Riemann Hypothesis


In 1796, Carl Friedrich Gauss, a German mathematician, was very interested in
finding a pattern for the occurrence of prime numbers. Gauss was so interested
in them that he manually calculated all the primes till nearly 3 million, in hopes
of finding a pattern. He then translated that data into a graph, which he titled
the Prime Counting Function¹. This function had a graph as follows:

Figure 1: The Graph of the Prime Counting Function

Gauss realized, with a start, that the pattern of prime numbers, which was thus
far undiscovered, might be closely related to the logarithmic integral function.
This was the start of the Zeta Function, and of the profound hypothesis of Gauss’s
student Bernhard Riemann.

¹ A function such that its output is the number of primes less than or equal to its input.

But before we get to Riemann, we must introduce another mathematician,
Leonhard Euler, who first discovered and experimented with the Zeta Function.
In his time (Euler was born in 1707), mathematicians were still figuring out the general
structure of an infinite series². It can be a bit counter-intuitive to add an infinite
number of terms, but it can be done. Just look at the following series:

$$\sum_{n=1}^{\infty} \frac{1}{2^n}$$

In this case, the infinite series approaches a value, 1, and this can be visualized with
a simple geometric proof:

Figure 2: The Geometric Representation of an Infinite Series

This is an example of a convergent series³. On the other hand, we have divergent
series⁴. For example, the sum of all positive integers, or 1 + 2 + 3 + 4 + · · ·, does
not approach a finite value, and instead shoots up all the way to infinity.
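
The difference between the two kinds of series shows up clearly in their partial sums. The sketch below computes them exactly using fractions; the function names `geometric_partial_sum` and `integer_partial_sum` are assumptions introduced for illustration.

```python
from fractions import Fraction

def geometric_partial_sum(terms):
    """Exact value of 1/2 + 1/4 + ... + 1/2^terms."""
    return sum(Fraction(1, 2**n) for n in range(1, terms + 1))

def integer_partial_sum(terms):
    """Value of 1 + 2 + ... + terms."""
    return sum(range(1, terms + 1))

for terms in (5, 10, 20):
    # The first partial sum creeps towards 1; the second keeps growing.
    print(terms, geometric_partial_sum(terms), integer_partial_sum(terms))
```

The gap between the geometric partial sum and 1 is exactly $\frac{1}{2^n}$ after $n$ terms, which shrinks to 0, while the partial sums of the positive integers grow without bound.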

2 Riemann’s Zeta Function


Riemann’s Zeta Function is defined as:

$$\zeta(s) = \frac{1}{1^s} + \frac{1}{2^s} + \frac{1}{3^s} + \cdots$$

For example, when $s = 2$, the Zeta function will have a value of

$$\zeta(2) = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots$$

which in fact equals $\frac{\pi^2}{6}$. (Note that the series representation of $\zeta(s)$ converges
only when $s > 1$.)
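
A quick numerical experiment illustrates both statements: the partial sums for $s = 2$ settle near $\frac{\pi^2}{6}$, while those for $s = 1$ keep growing. This is a minimal sketch for real $s$ only; the function name `zeta_partial_sum` is an assumption.

```python
import math

def zeta_partial_sum(s, terms):
    """Sum of 1/n^s for n = 1..terms (a partial sum of the Zeta series)."""
    return sum(1 / n**s for n in range(1, terms + 1))

target = math.pi**2 / 6
for terms in (10, 1_000, 100_000):
    print(terms, zeta_partial_sum(2, terms), target)

# For s = 1 the partial sums grow (slowly) without bound.
print(zeta_partial_sum(1, 1_000), zeta_partial_sum(1, 100_000))
```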
² A summation consisting of an infinite number of terms.
³ A series that approaches a finite value, defined as the limit of its partial sums as the
number of terms increases to infinity.
⁴ A series that does not approach a finite value. A series which is not convergent is divergent.

Finally, we come to Bernhard Riemann. Before we get into Riemann’s
Hypothesis, we will need to talk about complex numbers. The idea is sparked by
the fact that the domain of the real square root function is restricted to non-negative reals.

When squaring a non-negative real, the result is going to be non-negative,
and when squaring a negative real, the result is going to be positive.
Hence, no real number squares to $-1$, and it is not possible to take the square root of $-1$
within the real numbers. Mathematicians, however, worked around this problem by inventing
a new number: $i$, where $i = \sqrt{-1}$. Now, based on the new imaginary value that has just been defined,
we can find the values of the square roots of negative values. For example,
$\sqrt{-16} = 4i$.
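
Python has complex numbers built in (it writes the imaginary unit as `1j`), so these statements can be tried directly. This is a small illustration using the standard `cmath` module; the variable names are arbitrary.

```python
import cmath

i = 1j                      # Python's notation for the imaginary unit
print(i**2)                 # squaring i gives -1 (as a complex number)

root = cmath.sqrt(-16)      # principal square root of a negative real
print(root)

z = 3 + 4j                  # a "mixture of both": real part 3, imaginary part 4
print(z.real, z.imag)
```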

Now that we have defined complex numbers, we will need to introduce the
complex plane, which is the plane containing real values along its x-axis, and
imaginary values along its y–axis, and looks like this:

Figure 3: The Complex Plane

The Zeta Function takes in complex inputs, which is why we must define the
complex plane. (Note that complex values can be real, imaginary, or a mixture
of both.) It is also noteworthy that the series representation of the Zeta function
has a defined output only when the real part of our input is greater than 1. The
problem arises when the real part of our input is at most 1, as the series
evaluated at such an input doesn’t converge. Hence, mathematicians typically
use what is called an analytic continuation⁵ of the function to include such
inputs.
⁵ A technique to extend the domain of analytic functions.

3 Trivial Zeros and Non-Trivial Zeros
We will now get into the details of the Riemann Hypothesis.

The Riemann Hypothesis states that all non-trivial zeros of the Zeta Function
have real part $\frac{1}{2}$. After extending our function for almost all possible complex
inputs, we would now like to know when the Zeta Function equals zero. The
inputs at which the Zeta Function equals zero are called the zeros of the Zeta
Function, and they can be classified into two categories: trivial zeros and
non-trivial zeros.

Figure 4: Trivial Zeros

All the negative even integers are known to be zeros of the Zeta function. They are called trivial zeros,
as there is a clear pattern as to where they appear. Non-trivial zeros, however,
follow no such obvious pattern. Riemann defined the critical strip to be the region in the complex
plane such that all complex numbers in the critical strip have real part between
0 and 1, and theorized that all non-trivial zeros within that strip would lie on
the critical line, where the real part of the input is $\frac{1}{2}$.

Modern supercomputers have tested billions upon billions of values of non-trivial
Zeta zeros, and thus far, all of them lie on the critical line. It now remains a
challenge - a one-million-dollar challenge - to find out when the Zeta function equals
zero, and hundreds of theorems, and the fates of countless mathematicians, are
riding on this single, unproven statement.

References
[1] 3Blue1Brown - Visualizing the Riemann zeta function and analytic continuation
[2] Zeta Function and Prime Numbers: https://www.youtube.com/watch?v=zlm1aajH6gY

Collatz Conjecture
Zhao Junsen (M24501)

1 Introduction
Think of any positive integer, $n$. Is $n$ even? If yes, divide by 2. If no, multiply by
3 then add one. Repeat this cycle on and on, and, according to the conjecture, sooner or later you will enter the
loop: 4, 2, 1, 4, 2, 1... For example, suppose we start with the number 7. Then
the number would generate the following sequence: 7, 22, 11, 34, 17, 52, 26, 13, 40,
20, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1, ... Feeling intrigued? Well, welcome to the Collatz
Conjecture, named after the German mathematician Lothar Collatz, who came up
with it in 1937.
So now you know about the Collatz Conjecture. Well, how do you prove it? Every
number below $2^{68}$ has been checked to follow the conjecture. And after that? Well,
no one knows. In fact, Paul Erdős, a famous mathematician, once said,
“Mathematics is not yet ripe enough for such questions.” This article will explain
the Collatz Conjecture in more detail, along with various attempts to solve it.
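
The rule described in this introduction takes only a few lines of code. The sketch below generates the sequence from a starting number until it first reaches 1; the function name `collatz_sequence` is an assumption, and the loop is only known to terminate for starting values that have actually been checked, since the conjecture itself is unproven.

```python
def collatz_sequence(n):
    """Return the Collatz sequence starting at n and stopping at the first 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    sequence = [n]
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1   # halve if even, else 3n + 1
        sequence.append(n)
    return sequence

print(collatz_sequence(7))
```

Starting from 7 this reproduces the sequence listed in the text, up to its first 1.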

2 Analysis of Collatz Conjecture


2.1 Odd vs Even Numbers
Now at first glance it may seem strange that both odd and even numbers should
eventually end up at one. Even numbers are divided by 2, while odd
numbers are increased to more than 3 times their value. Therefore it seems that the sequence
should on average grow, not shrink to 1. Here is the reason why it does not. Although
each odd number is multiplied by slightly more than 3 in the first step, the result is always even, so it
is divided by 2 in the next step, which leaves about 3/2 of the initial value. Heuristically, half of
these results are even and are divided by 2 again on step 3, which leaves about 3/4 of the initial value; 1/4
of them are divided by 2 again on step 4, which leaves about 3/8 of the initial value,
and so on.
And so, if you take the geometric mean, you will find that on average, to get from
one odd number to the next, you multiply by

$$\lim_{n \to \infty} \prod_{i=1}^{n} \left(\frac{3}{2^i}\right)^{\frac{1}{2^i}} = \frac{3}{4} < 1$$

Hence statistically speaking, 3x+1 sequences are more likely to shrink than grow.
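
The value of this infinite product can be checked numerically by multiplying out the first few dozen factors. This is a small sketch; the function name `odd_step_geometric_mean` and the default of 60 factors are assumptions (later factors are indistinguishable from 1 in floating point).

```python
def odd_step_geometric_mean(terms=60):
    """Multiply the first `terms` factors (3 / 2^i)^(1 / 2^i) of the product."""
    product = 1.0
    for i in range(1, terms + 1):
        product *= (3 / 2**i) ** (1 / 2**i)
    return product

for terms in (1, 5, 20, 60):
    print(terms, odd_step_geometric_mean(terms))   # approaches 3/4
```

The limit can also be seen by hand: the exponents of 3 sum to $\sum 1/2^i = 1$ and the exponents of 2 sum to $\sum i/2^i = 2$, giving $3^1 / 2^2 = 3/4$.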

2.2 Visualisation of 3x+1
One way to visualise the paths of numbers in 3x+1 is to simply show how each
number is connected to the next number in the sequence. This is called a directed
graph.

If the conjecture is true, then it means every number is connected to this graph
and is linked to 4, 2, 1.

2.3 Two Possible Ways the Conjecture May Be False

There are two possible ways for the conjecture to be false.
The first possibility is that there could be a number that starts a sequence of
numbers that never falls back to 1 and instead stretches on to infinity. Another
possibility is that there is a set of numbers which form a closed loop, i.e. all the
numbers in this loop would not be connected to the main graph and would form a
cycle. But so far, no closed loop or sequence which shoots off to infinity has been
found. In fact, given that all numbers up to $2^{68}$ have been verified to follow the
conjecture, mathematicians have calculated that any possible closed loop must be
at least 186,000,000,000 (186 billion) numbers long.

3 Attempts to Prove Collatz Conjecture


3.1 Scatterplot
One way mathematicians have attempted to prove the conjecture is by making a
scatterplot with the seed numbers (starting numbers) on the x-axis and a number
from each seed’s sequence on the y-axis. If it is shown that at least one number
from each seed’s sequence falls below the line $y = x$, then the conjecture is proven. This is
because whatever seed you pick, if you know that at some point its sequence will reach a smaller number,
and that the sequence starting from that smaller number will in turn reach an even smaller one, then eventually you will reach
1. And so if you can prove that all odd seeds greater than 1 have a number in their sequence that
falls below the $y = x$ line, then you have proven the conjecture, as even numbers
will eventually form an odd number after dividing by their highest power of 2.
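
The quantity behind this argument, the number of steps a seed needs before its sequence first drops below the seed itself, is easy to compute for small seeds. This is an illustrative sketch; the function name `steps_to_drop_below` is an assumption, and the loop is only guaranteed to finish for seeds that have been verified, since that is exactly what the conjecture leaves open.

```python
def steps_to_drop_below(seed):
    """Count the steps until the Collatz sequence of `seed` first falls
    below `seed`, i.e. below the line y = x in the scatterplot."""
    if seed < 2:
        raise ValueError("seed must be at least 2")
    n, steps = seed, 0
    while n >= seed:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

for seed in (3, 6, 7):
    print(seed, steps_to_drop_below(seed))

# Every seed in a small range drops below itself in finitely many steps.
print(max(steps_to_drop_below(s) for s in range(2, 10_000)))
```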

3.2 Proofs by Mathematicians


In 1976, Riho Terras was able to show that almost all Collatz sequences reach a
point below their initial value. In 1979, the bound was improved to almost all the
numbers going below $x^{0.869}$, and in 1994, the bound was further improved to $x^{0.7925}$.
Here, “almost all” means that as the range of seeds considered grows to infinity,
the fraction of seeds whose sequences contain a number that is below the bound
approaches 1.
Then recently, in 2019, another mathematician, Terence Tao, established an even
stronger result. He showed that almost all numbers will end up smaller than any
arbitrary function $f(x)$, as long as $\lim_{x \to \infty} f(x) = \infty$, and the function can increase
as slowly as you like. So $f(x)$ can be equal to $\log(x)$, or $\log(\log(x))$, or even
$\log(\log(\log(x)))$. As Terence Tao said in a public talk, “This is about as close as one
can get to the Collatz conjecture without actually solving it.”

4 Is Collatz Conjecture Really Worth Proving?


For nearly a century, mathematicians have racked their brains searching for
a proof of the Collatz conjecture. However, no one has been able to find a
complete proof that covers all seeds. Why is that? Could it be that the
conjecture does not actually hold? As Paul Erdős said, “Mathematics is not yet
ripe enough for such questions.” Maybe Terence Tao's result will always be our
best answer to this simple yet complex question.

Inversion in Geometry
Tyler Ng Pek Han (M24601)

Drew Michael Terren Ramirez (M24604)

1 Inversion
1.1 What is inversion?
Consider a circle $\omega$ with centre $O$ and radius $R$. Inverting about $\omega$ sends any point $A$ to
the unique point $A^*$ on ray $OA$ such that $OA \cdot OA^* = R^2$. In addition, the centre $O$ of the
circle is sent to $P_\infty$, the point at infinity (which lies on every line but on no circle), and
$P_\infty$ is sent to $O$. This is based on the idea that $1/0 = \infty$. If that feels arbitrary, that's normal.
But let's explore what it does.

[Figure: the points $O$, $A$ and $A^*$ on ray $OA$.]

1.2 Some properties of inversion


These are some useful facts about inversion, to get you started.
1. If an inversion maps $A$ to $A^*$, then it maps $A^*$ to $A$. To prove this, we find the image
of $A^*$ under the same inversion. $(A^*)^*$ is the unique point on ray $OA^*$ (which is the
ray $OA$) such that $O(A^*)^* \cdot OA^* = R^2$. But the point $A$ satisfies these conditions, so
$(A^*)^* = A$.
2. A point $A$ is on the circle $\omega$ if and only if $A^* = A$. This is because $A$ lies on the circle
$\iff OA = R \iff OA^* = R$.
3. For points $A$ and $B$ such that $O, A, B$ are not collinear, if $A$ is mapped to $A^*$ and $B$
is mapped to $B^*$, then the quadrilateral $AA^*B^*B$ is cyclic. This is because
$OA \cdot OA^* = OB \cdot OB^* = R^2$. This also gives $\triangle OAB \sim \triangle OB^*A^*$.
4. For a point $A$ outside $\omega$, let the tangents from $A$ touch $\omega$ at $R$ and $S$. Then $A^*$ is the
midpoint of $RS$. (Refer to the diagram in section 2.1.)
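To get a feel for the definition, here is a small Python sketch of inversion in coordinates. The function name `invert` and the sample point are our own choices for illustration. It checks numerically that $OA \cdot OA^* = R^2$ and that inverting twice returns the original point (fact 1).

```python
import math


def invert(point, centre=(0.0, 0.0), radius=1.0):
    """Invert `point` about the circle with the given centre and radius."""
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    d2 = dx * dx + dy * dy          # OA^2
    if d2 == 0:
        raise ValueError("the centre is sent to the point at infinity")
    k = radius * radius / d2        # scale factor R^2 / OA^2 along ray OA
    return (centre[0] + k * dx, centre[1] + k * dy)


A = (3.0, 4.0)                      # OA = 5
A_star = invert(A, radius=2.0)      # R = 2, so OA* should be 4 / 5
OA = math.hypot(*A)
OA_star = math.hypot(*A_star)
back = invert(A_star, radius=2.0)   # (A*)* should be A again

print(OA * OA_star, back)
```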

1.3 What happens to lines?
1.3.1 Lines through O
For every point $A$ on the line $\ell$ through $O$, the image $A^*$ still lies on $\ell$. Hence, the line $\ell$
through $O$ inverts to itself.

1.3.2 Other lines


Other lines invert to circles through $O$. First, since $P_\infty$ maps to $O$, the image of the line $\ell$
passes through $O$. Let $B$ be the point on $\ell$ such that $OB \perp \ell$. It remains to show that for
every point $A$ on $\ell$, $A^*$ lies on the circle with diameter $OB^*$.
From the diagram we can see that since $AB^*A^*B$ is cyclic,
$90^\circ = \angle OBA = \angle ABB^* = \angle AA^*B^* = \angle OA^*B^*$, as desired.

[Figure: the line $\ell$ through $A$ and $B$, and its image $\ell^*$, the circle through $O$, $A^*$ and $B^*$.]

We can thus see that a line maps either to itself or to a circle through $O$, depending on whether
the line passes through $O$.
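The claim for lines that miss $O$ can be sanity-checked numerically. In this Python sketch (the line $x = 2$, the unit circle of inversion and the sample points are arbitrary choices of ours), we invert several points of the line and measure their distance from the centre of the circle with diameter $OB^*$.

```python
import math


def invert(p, radius=1.0):
    """Invert the point p about the circle of given radius centred at the origin O."""
    d2 = p[0] ** 2 + p[1] ** 2
    k = radius ** 2 / d2
    return (k * p[0], k * p[1])


# The line x = 2 does not pass through O; its closest point to O is B = (2, 0).
B_star = invert((2.0, 0.0))
circle_centre = (B_star[0] / 2, 0.0)   # midpoint of O and B*
circle_radius = B_star[0] / 2

# Distance of each inverted point from the centre of the circle with diameter OB*.
distances = [
    math.hypot(x - circle_centre[0], y - circle_centre[1])
    for (x, y) in (invert((2.0, t)) for t in (-5.0, -1.0, 0.5, 3.0, 100.0))
]
print(circle_radius, distances)
```

All the distances agree with the radius, so the sampled images lie on the circle with diameter $OB^*$, which passes through $O$.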

1.4 What happens to circles?


1.4.1 Circles through O
Since the inversion map is its own inverse, it is not hard to see that a circle through O
maps to a line, not passing through O.

1.4.2 Circles not through O


Try this for yourself! If you want a hint, take a look at the diagram below.

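[Figure: a circle through $A$, $B$, $C$ and its image through $A^*$, $B^*$, $C^*$; the points $O$, $A$, $C$, $C^*$, $A^*$ lie on one line in that order.]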

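An angle chase (finding relations between angles) is sufficient. Let $AC$ be the diameter
of the circle such that $O, A, C$ are collinear, with $OA < OC$, and let $B$ be an arbitrary
point on the circle. Since the cyclic quadrilaterals $ABB^*A^*$ and $CBB^*C^*$ give

\[ \angle BAA^* = 180^\circ - \angle BB^*A^* \quad\text{and}\quad 180^\circ - \angle BCC^* = \angle BB^*C^*, \]

we get:

\[ \angle ABC = 90^\circ \implies 90^\circ = \angle BAA^* + 180^\circ - \angle BCC^* = 180^\circ - \angle BB^*A^* + \angle BB^*C^* = 180^\circ - \angle A^*B^*C^* \]

\[ \implies \angle A^*B^*C^* = 90^\circ. \]
Hence $B^*$ lies on the circle with diameter $A^*C^*$.
(By the way, triangles $ABC$ and $A^*B^*C^*$ are not similar.)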

1.5 Other things to note


• Since inversion maps circles and lines to other circles and lines, a lot of the time the
radius of $\omega$ does not even matter! Changing the radius will just scale the inverted
diagram, independent of the original diagram. In this case, we can say that we invert
about a point $O$, instead of inverting about a circle $\omega$.

• Inversion preserves intersections. In particular, inversion preserves tangencies. It can
be especially helpful when there are many tangent circles, as we will see in
the example below.
• It is important to note that if a circle $\Gamma_1$ maps to a circle $\Gamma_2$, the centre of $\Gamma_1$ is not
mapped to the centre of $\Gamma_2$.

• When inverting to solve a problem, it is often a good idea to choose the inversion so that it
converts circles to lines, because lines are much easier to handle.
• Some angles are preserved, but only in the sense that $\angle OAB = \angle OB^*A^*$ (by the cyclic
quadrilateral). Inversion does not handle angles away from $O$ very well.
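The warning about centres is easy to see in a concrete case. The Python sketch below uses numbers chosen by us for illustration (the unit circle of inversion and a circle of radius 1 centred at $(3, 0)$). It checks that sampled points of the circle land on the circle with diameter $A^*C^*$, and compares the centre of that image circle with the inverse of the original centre.

```python
import math


def invert(p, radius=1.0):
    """Invert the point p about the circle of given radius centred at the origin O."""
    d2 = p[0] ** 2 + p[1] ** 2
    k = radius ** 2 / d2
    return (k * p[0], k * p[1])


centre, r = (3.0, 0.0), 1.0                 # a circle that does not pass through O
A_star = invert((centre[0] - r, 0.0))       # image of the near end A of the diameter on line OA
C_star = invert((centre[0] + r, 0.0))       # image of the far end C
image_centre = ((A_star[0] + C_star[0]) / 2, 0.0)
image_radius = abs(A_star[0] - C_star[0]) / 2
inverted_centre = invert(centre)            # where the old centre actually goes

# Sampled points of the original circle all land on the image circle.
on_image = all(
    abs(math.hypot(x - image_centre[0], y - image_centre[1]) - image_radius) < 1e-12
    for (x, y) in (
        invert((centre[0] + r * math.cos(a), centre[1] + r * math.sin(a)))
        for a in (0.3, 1.0, 2.0, 4.0, 5.5)
    )
)
print(on_image, image_centre, inverted_centre)
```

Here the image circle is centred at $(0.375, 0)$, while the old centre is sent to $(1/3, 0)$.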

1.6 Problems
1. Let $A_1B_1C_1D_1$, $A_1A_2B_2B_1$, $B_1B_2C_2C_1$, $C_1C_2D_2D_1$, $D_1D_2A_2A_1$ be cyclic
quadrilaterals. Prove that $A_2B_2C_2D_2$ is cyclic.
Hint: Invert about a point, not one of the given circles. The result will be a different
but much simpler problem, with 3 lines (because we inverted about a point which
three of the circles pass through!), 2 circles, and a concyclic condition to prove.
2. Suppose that a circle $\Gamma_1$ maps to the circle $\Gamma_2$. Let the centre of $\Gamma_1$ be $O_1$ and the
centre of $\Gamma_2$ be $O_2$. Is it ever possible that $(O_1)^* = O_2$?
3. Prove the distance formula for inversion: $A^*B^* = \dfrac{R^2 \cdot AB}{OA \cdot OB}$. (Hint: similar triangles)
4. (Much harder than previous problems) Let $P$ be a point inside triangle $ABC$. Let $\ell_A$
be the line through $P$ perpendicular to $AP$, and similarly for $\ell_B$ and $\ell_C$. Let $A_1$ be
the intersection of $\ell_A$ and line $BC$. Points $B_1$ and $C_1$ are defined similarly. Prove
that $A_1, B_1, C_1$ are collinear. (This problem has no circles! Inversion can't change
any circles into lines in this figure, only lines into circles. But the extra circles
actually make the problem easier!)

2 Pole and Polar


2.1 Definition
Remember the tangent construction for the image of inversion earlier? In fact, this line $RS$
does more than just give $A^*$. The line $RS$ is called the polar
of point $A$. More generally, for a point $P$, the line $\ell$ that passes through $P^*$ and is
perpendicular to $OP$ is called the polar of $P$, and $P$ is called the pole of line $\ell$.

[Figure: the points $O$, $A^*$ and $A$ on a line; this is the tangent construction referred to in section 1.2.]

2.2 La Hire’s Theorem


P lies on the polar of Q if and only if Q lies on the polar of P . This is one of the most
important facts about poles and polars, and is what makes it useful. The proof is not
difficult: it is just by angles and one of the properties of inversion.

2.2.1 Outline of Proof


Consider the points $P$, $P^*$, $Q$, $Q^*$. Show that $P$ lies on the polar of $Q$ if and only if
$\angle PQ^*Q = 90^\circ$. Can you finish from there using one of the inversion properties?¹

¹ Hint: This property is that $PP^*Q^*Q$ is a cyclic quadrilateral.

Hash Function - The Math of Encryption
Pranjal Dasghosh (M24402)

1 Introduction
In computer science, a hash function is a function that maps data of arbitrary
size to a fixed-size output. This fixed-size output is often called a hash value
or simply a hash. Hash functions are commonly used in computer programs
for various purposes, such as data indexing, data deduplication, and password
storage.

2 How Hash functions work


A hash function takes an input, which can be a string, number, or any other
data type, and returns a fixed-size output. The output of a hash function is
deterministic, which means that the same input will always produce the same
output. A cryptographic hash function also needs collision resistance, which
means that it is infeasible in practice to find two inputs that produce the same
hash value, and preimage resistance, which means that one cannot feasibly work
backwards to find an input from its hash value. So if, for example,
our hash function was $f(x) = x^2$, then it would not be collision resistant, as $x$
and $-x$ always produce the same output, so matching hash values would be
very common. It could also easily be reversed by taking the square root of
the hash value.
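These properties can be seen directly with Python's standard `hashlib` module. The sketch below is only an illustration: the sample messages and the name `toy_hash` (our stand-in for the squaring example) are our own choices.

```python
import hashlib


def toy_hash(x: int) -> int:
    """The deliberately bad example f(x) = x^2 from the text."""
    return x * x


# Deterministic: hashing the same bytes twice gives the same digest.
digest_a = hashlib.sha256(b"hello").hexdigest()
digest_b = hashlib.sha256(b"hello").hexdigest()

# Changing a single character gives a completely different digest.
digest_c = hashlib.sha256(b"hellp").hexdigest()

# Fixed size: 64 hexadecimal characters (256 bits), whatever the input length.
digest_long = hashlib.sha256(b"x" * 10_000).hexdigest()

# The toy hash collides immediately: 3 and -3 share a hash value.
collision = toy_hash(3) == toy_hash(-3)

print(digest_a == digest_b, digest_a == digest_c, len(digest_long), collision)
```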

3 SHA-256
One of the most commonly used hash functions in cryptography is SHA-256, which was
developed by the National Security Agency (NSA) in the United States. It takes
an input of arbitrary length and produces a 256-bit hash value. The SHA-256
hash function works by padding the input data, dividing it into 512-bit blocks,
and processing each block through a series of logical operations. The
output after the last block is the final hash value. The processes in the function are
padding (where the message is "padded" with a series of bits to make its length
a multiple of 512, the block size used by SHA-256), message scheduling (where each
512-bit block is split into sixteen 32-bit words, which are then expanded into
sixty-four 32-bit words), hash value initialization
and concatenation (where the hash value is initialized to a set of eight 32-bit
constants) and the compression function (which produces the hash value output).
We will be focusing on the arithmetic operations in message scheduling.
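As a rough illustration of the padding step, here is a Python sketch that follows the padding rule of the SHA-256 standard (a single 1 bit, then 0 bits, then the message length in bits as a 64-bit big-endian integer). The function name `sha256_pad` is ours, and the sketch handles whole bytes only.

```python
def sha256_pad(message: bytes) -> bytes:
    """Pad `message` so that its length is a multiple of 64 bytes (512 bits)."""
    bit_length = len(message) * 8
    padded = message + b"\x80"                            # a 1 bit followed by seven 0 bits
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)    # zeros up to 56 bytes mod 64
    padded += bit_length.to_bytes(8, "big")               # original length in bits
    return padded


block = sha256_pad(b"abc")

# Message scheduling starts by reading the 512-bit block as sixteen 32-bit words.
words = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
print(len(block), [hex(w) for w in words])
```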

4 Arithmetic functions
The primary arithmetic functions that are used for the hash function are mod-
ular arithmetic functions. Specifically, it uses the following operations:

4.1 Modular operation


Addition modulo $2^{32}$: This operation adds two 32-bit integers and takes the
result modulo $2^{32}$. This is denoted by the symbol “+”, with a subscript or note
indicating the modulus, such as “$a + b \bmod 2^{32}$”. The reason
this is used is to keep every intermediate value within 32 bits, as
otherwise a sum could be too large to fit in a 32-bit word (the largest
number that 32 bits can represent is $2^{32} - 1$, and therefore the sum is reduced mod $2^{32}$).
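In a language like Python, where integers can grow without limit, the wrap-around has to be done by hand. A minimal sketch (the names `MASK32` and `add32` are ours):

```python
MASK32 = 0xFFFFFFFF  # 2**32 - 1, the largest 32-bit value


def add32(a: int, b: int) -> int:
    """Add two 32-bit integers modulo 2**32."""
    return (a + b) & MASK32   # same as (a + b) % 2**32 for non-negative inputs


wrapped = add32(0xFFFFFFFF, 1)            # one past the maximum wraps to 0
big = add32(0xFFFFFFFF, 0xFFFFFFFF)       # the result still fits in 32 bits
print(wrapped, hex(big))
```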

4.2 Right rotate


This operation rotates a 32-bit integer to the right by a specified number of
bits. Here it is denoted by the symbol “>>”, with the number of bits to rotate
written after it, such as “a >> n” (the SHA-256 standard writes $\mathrm{ROTR}^n(a)$; note
that in many programming languages “>>” is a plain shift, which discards bits
instead of wrapping them around). In SHA-256 the rotation amounts are fixed
constants between 0 and 31: in message scheduling, words are rotated right by
7 and 18 bits in one mixing function and by 17 and 19 bits in the other.
To rotate, all the bits are moved to the right by that number of positions, and
the bits that fall off the right end wrap around and come to the
front. This step is important as it spreads each input bit across many positions
of the output, making the hash more attack-resistant.
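A right rotation can be sketched in Python as follows. The helper names `rotr` and `small_sigma0` are ours; the constants 7, 18 and 3 are those of the first message-schedule mixing function in the SHA-256 standard, which combines two rotations and one plain shift with XOR.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits


def rotr(x: int, n: int) -> int:
    """Rotate the 32-bit integer x to the right by n bits."""
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32   # bits falling off the right re-enter on the left


def small_sigma0(x: int) -> int:
    """Mixing function used in the SHA-256 message schedule."""
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


print(hex(rotr(0x12345678, 8)), hex(rotr(1, 1)), hex(small_sigma0(1)))
```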

5 Conclusion
So in conclusion, the arithmetic functions are used to create a series of “message
schedules” that are combined with the message blocks to produce the final hash
value. Hopefully this has allowed you to gain some insight into the mathematics
of hash values.

Lie Derivatives

Kannan Vishal (Alumni)


Prannaya Gupta (Alumni)

In this article, we will tell four different stories from the fields of fluid mechanics,
classical mechanics, quantum mechanics, and differential geometry, and notice a
common construction.

1 Background
In high school calculus, we learn how to compute the rate of change of a function
with respect to some parameter, which we also know geometrically to be the slope
of the function at that point. When we move on to multivariable calculus, our
functions may however become vector-valued, and they may depend on any number
of variables.
They become functions which ascribe to each point in $\mathbb{R}^n$ a vector in $\mathbb{R}^m$, so we
call them vector fields.¹ But to continue doing calculus, we must introduce more
novel forms of derivatives.

Examples.
1. To find derivatives with respect to a particular variable, we settle for a partial
derivative of a particular component of $F$, $\partial F_{x_i}/\partial x_j$. Geometrically, this is
the single-variable derivative taken from a "slice" of the original function.
2. To collect all the possible partial derivatives in one place, we define the Jacobian
matrix $DF$, where the entry in the $i$-th row and $j$-th column is $\partial F_{x_i}/\partial x_j$.
This allows us to get a limit definition of the derivative analogous to the single
variable case, i.e. $DF$ is the operator such that $F(x+h) - F(x) = DF(x)\,h + o(\|h\|)$
as $\|h\| \to 0$.
3. New questions also arise in higher dimensions. How does a scalar function
$h(x)$ vary along a particular direction $\hat{v}$? Well, we know it to be the directional
derivative, $\hat{v} \cdot \nabla h(x) = \sum_i v^i \, \partial h(x)/\partial x^i$. Again, since we are interested
in geometric intuition, we can interpret this as taking $\nabla h$, which is the vector
in the direction of "the rate of fastest ascent", and keeping only its component
along $\hat{v}$. This still works if we let $h(x)$ become a vector function: we
dot $\hat{v}$ with the rows of $\nabla h = (Dh)^T$.

¹ When they map to $\mathbb{R}$ in particular, they are called scalar fields.
Figure 1: Visualization of directional derivative of scalar function f .

4. But we may not yet be satisfied with the previous notion of the directional
derivative for vector fields. Consider the vector plot of $u = y\hat{x}$ in figure 4.
When we take the directional derivative of $u$ along $\hat{y}$, we get $(\hat{y} \cdot \nabla)(y\hat{x}) = \hat{x}$
(verify this yourself). This makes sense: if you took a vertical slice of the
vector plot, and were asked at what rate the vectors were changing as you tilt
your head from bottom to top, you would say they are increasing in magnitude
in the x direction at unit rate.
But what if you take the converse operation? What is the directional derivative
of $\hat{y}$ along $u$? Well, the directional derivative gives zero, because the unit
vector is a constant vector. But some of you may not find this satisfactory,
because to you a “directional” derivative might be more evocative of how a
vector changes when it is transported by another vector field. If you place a
unit vector at (0,1) for instance, the “tip of the vector” would intuitively travel
faster to the right than the tail of the arrow, and this would cause a net
rotation, so the vector is changing. So what is this elusive alternate idea of a
“directional derivative”?
Prodding further, we rephrase the question brought up in the last example to ask,
“How does a vector field change when it is carried along by another vector
field?”
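The two directional derivatives in the last example can be checked with finite differences. This is a rough numerical sketch in plain Python. The function name `directional_derivative`, the step size and the evaluation point (0, 1) are our own choices.

```python
def directional_derivative(field, direction, point, h=1e-6):
    """Central-difference estimate of (direction . grad) field at `point`."""
    x, y = point
    dx, dy = direction
    fwd = field(x + h * dx, y + h * dy)
    bwd = field(x - h * dx, y - h * dy)
    return tuple((f - b) / (2 * h) for f, b in zip(fwd, bwd))


def u(x, y):
    """The shear field u = y x_hat."""
    return (y, 0.0)


def y_hat(x, y):
    """The constant unit vector field y_hat."""
    return (0.0, 1.0)


point = (0.0, 1.0)
du_along_y = directional_derivative(u, (0.0, 1.0), point)      # expect about x_hat
dy_along_u = directional_derivative(y_hat, u(*point), point)   # expect zero
print(du_along_y, dy_along_u)
```

The second result is zero, which is exactly the unsatisfying answer described above: the ordinary directional derivative cannot see the shearing of the flow.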

2 Integral flows and Fluid Mechanics


Vector fields can be thought of as ascribing a vector to each point in their
domain. In two dimensions, we can plot one on a graph (Figure 2). If the vector
field is “smooth”, we might be tempted to grab a marker and naturally draw curves
passing through the field, tangent to the vectors. Just like that you have
“solved” an ordinary differential equation, and found the integral flows of the
vector field, reminiscent of water currents on a shallow pool. The vector field
would then represent the local flow velocity of the currents $u(x)$.

Figure 2: Left: A vector field, which determines an ODE. Right: The integral
flows of the vector field, which are exactly the solutions to the ODE.

We can push the analogy further, and imagine dropping a pin (imagine
it to be a magical elastic pin that can be stretched and squished by water currents)
onto the water stream, and rephrase the original question as “how does the dyed
line element evolve as it is carried by the currents?”

Figure 3: v gets transported by the vector field u.

Referring to figure 3, the line element is the vector $v(t) = r_B - r_A$, with its tail $A$ at $x$
and its tip $B$ at $x + v$. In time $dt$ it gets carried to
$v(t + dt) = \big(r_B + u(x + v)\,dt\big) - \big(r_A + u(x)\,dt\big)$,
so that the overall change is $\big(u(x + v) - u(x)\big)\,dt \approx v \cdot \nabla u \, dt$ for a short
line element. There we go.²

² The analogy to fluid flow is not just a pedagogical tool but is also relevant to the study
of fluid mechanics, where we may wish to investigate the rate of deformation of fluid parcels
due to viscous forces.

We call these quantities that get carried along by other vector fields material quantities.
We obtain from our analysis a characteristic mathematical identity for material
quantities, that is, $\frac{dv}{dt} \equiv u \cdot \nabla v = v \cdot \nabla u$,³ or

\[ u \cdot \nabla v - v \cdot \nabla u = 0. \tag{1} \]
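The pin picture can be simulated directly. In the Python sketch below we reuse the shear flow $u = y\hat{x}$ from the earlier example (our choice), whose flow map is known exactly. We carry the tail and the tip of a short line element along the flow and compare its rate of change with $v \cdot \nabla u$. The names and step sizes are illustrative assumptions.

```python
def flow(point, t):
    """Exact flow of u = y x_hat: each point slides sideways at speed y."""
    x, y = point
    return (x + y * t, y)


tail, tip = (0.0, 1.0), (0.0, 1.001)          # a short vertical line element
dt = 1e-3

v0 = (tip[0] - tail[0], tip[1] - tail[1])     # v(t)
tail1, tip1 = flow(tail, dt), flow(tip, dt)   # both ends are carried by the flow
v1 = (tip1[0] - tail1[0], tip1[1] - tail1[1]) # v(t + dt)

rate = ((v1[0] - v0[0]) / dt, (v1[1] - v0[1]) / dt)

# For u = (y, 0), the expression v . grad u works out to (v_y, 0).
predicted = (v0[1], 0.0)
print(rate, predicted)
```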

3 Classical mechanics
Let us now shift our gaze to classical mechanics, or specifically analytical mechanics,
where instead of analysing physical systems by balancing forces using Newton's laws,
we opt to analyse the system as a whole, detailing the degrees of freedom, the space of
possible configurations and the constraints. In mechanics, we want to be able to take
some system and, given a certain set of initial conditions, predict
how the system will evolve and what it will look like at time $t$. Of course, sometimes we may
settle for just knowing qualitative features.
Some of you may be familiar with the Lagrangian, for use in solving common physics
problems. For the uninitiated, here is the process for solving problems using
Lagrangian mechanics.
1. Identify the coordinates which parameterise the configuration of the system.
2. Find the total kinetic energy $T$ and the potential energy $V$ of the system
as functions of the configuration coordinates. Their difference is called the Lagrangian,
$L = T - V$.
3. Find the constraints $f^j(q)$ on the coordinates.
4. The equations $\frac{d}{dt}\frac{\partial L}{\partial \dot{q}^i} - \frac{\partial L}{\partial q^i} = \sum_j \lambda_j \frac{\partial f^j}{\partial q^i}$, where the RHS consists of Lagrange
multipliers $\lambda_j$, are exactly the equations of motion, similar to Newton's second
law. Interestingly, the Lagrangian, a scalar quantity, contains all the information
about the system.
But what we are going to look at in this article is the Hamiltonian, a cousin of the
Lagrangian. For conservative systems, the Hamiltonian $H$ is related to the energy,
$H = T + V$. But unlike the Lagrangian, which is a function of the “positional”
coordinates $q$ and their derivatives, the Hamiltonian $H = H(q, p)$ is a function of $q$
and also $p$, the momentum. Like the Lagrangian, the equations of motion can be
derived from

\[ \frac{\partial H}{\partial p_i} = \dot{q}^i, \tag{2} \]
\[ -\frac{\partial H}{\partial q^i} = \dot{p}_i, \tag{3} \]

known as Hamilton's equations. This system of equations may seem mysterious
but it will make more sense in a while.

³ $\partial v/\partial t = 0$ for steady fields.

Example.
Consider the problem of the harmonic oscillator. A mass is constrained to move
along a line passing through the origin, subject to a force $-kq$. Here $q$ denotes the
displacement of the mass, and $p$ is its momentum, $m\dot{q}$. The Hamiltonian (and
energy) of the system is
\[ p^2/2m + kq^2/2 = H, \tag{4} \]
and this is a constant of the motion. Hamilton's equations then give

\[ p/m = \dot{q}, \qquad -kq = \dot{p}. \]

The first equation is trivial in this case, but the second equation is in fact Newton's
second law, $-kq = F_{\text{net}} \equiv ma = \dot{p}$.

Figure 4: Phase space of the harmonic oscillator. The density plot represents the value of the
scalar function $H$. Arrows represent the Hamiltonian vector field $X_H = J\nabla H$.
The orbits or trajectories correspond to ellipses.

But we can go further, and note other interesting features of the problem. Again,
in the pursuit of geometric intuition, we may wish to plot equation (4) on
a graph with $p$ as the y-axis and $q$ as the x-axis (Figure 4). This is called the phase
space. Equation (4) is in fact the equation of an ellipse, and it describes the range
of possible configurations the system can be in, the orbit, for a given energy. Since
the energy of the system can be as large as what we put into it, the possible orbits
consist of concentric ellipses centered at the origin, together with a point at the origin for
$E = 0$. These ellipses span the entire phase space $\mathbb{R}^2$, so the configuration space
is also $\mathbb{R}^2$. $\square$
The earlier example taught us that we can parameterise the state of the system with
p and q, so that each possible state is represented by a point in phase space. So
when we want to learn about the transient behavior of a physical system, that’s the
same as asking, ”how do we drag our finger around the phase space graph starting
from a certain point so it lines up with the evolution of the system”.
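That's what Hamilton's equations are for. We may rewrite Hamilton's equations by
defining⁴
\[ J = \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix} \]
to give

\[ \begin{pmatrix} \dot{p} \\ \dot{q} \end{pmatrix} = J \, dH(q, p) \tag{5} \]

\[ \equiv J \begin{pmatrix} \partial H/\partial p_1 \\ \vdots \\ \partial H/\partial p_n \\ \partial H/\partial q^1 \\ \vdots \\ \partial H/\partial q^n \end{pmatrix}, \tag{6} \]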

that is, J dH defines a vector field over the phase space, and the trajectory of the
system is the trajectory tangent to the vector field at each point. That’s how we
solve systems with Hamiltonian mechanics:
1. Compute the Hamiltonian, which determines a vector field in phase space
2. Use a marker to draw the integral flows in phase space using the vector field
as a guide.
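The “marker” in step 2 can be replaced by a numerical integrator. The Python sketch below follows the Hamiltonian vector field of the harmonic oscillator using a leapfrog scheme, a standard method chosen by us (the article does not prescribe one). The values $m = k = 1$, the starting point and the step count are arbitrary illustration choices.

```python
import math


def hamiltonian(q, p, m=1.0, k=1.0):
    """H = p^2 / 2m + k q^2 / 2 for the harmonic oscillator."""
    return p * p / (2 * m) + k * q * q / 2


def leapfrog(q, p, dt, steps, m=1.0, k=1.0):
    """Follow Hamilton's equations q' = p/m, p' = -kq for `steps` steps."""
    for _ in range(steps):
        p -= 0.5 * dt * k * q     # half step in p using p' = -dH/dq
        q += dt * p / m           # full step in q using q' = dH/dp
        p -= 0.5 * dt * k * q     # second half step in p
    return q, p


q0, p0 = 1.0, 0.0
steps = 10_000
dt = 2 * math.pi / steps          # with m = k = 1 the period is 2*pi
q1, p1 = leapfrog(q0, p0, dt, steps)

print(q1, p1, hamiltonian(q0, p0), hamiltonian(q1, p1))
```

After one period the trajectory comes back (almost exactly) to its starting point, having traced one of the ellipses of constant $H$.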

4 Poisson brackets
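That is great, but we may not be interested in $p$ and $q$ specifically, but rather in a
quantity that depends on $p$ and $q$, $f(p, q)$, which we call an observable. To know
how it changes with time, we may take its time derivative, and use the chain rule
to get

\[ \frac{df}{dt} = \sum_i \left( \frac{\partial f}{\partial q^i}\frac{dq^i}{dt} + \frac{\partial f}{\partial p_i}\frac{dp_i}{dt} \right). \]

But using Hamilton's equations, (2) and (3), we get

\[ \frac{df}{dt} = \sum_i \left( \frac{\partial f}{\partial q^i}\frac{\partial H}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial H}{\partial q^i} \right). \tag{7} \]

We call the peculiar looking differential operator the Poisson bracket,

\[ \{\square, \triangle\} = \sum_i \left( \frac{\partial \square}{\partial q^i}\frac{\partial \triangle}{\partial p_i} - \frac{\partial \square}{\partial p_i}\frac{\partial \triangle}{\partial q^i} \right). \tag{8} \]

So whenever we want to find the time derivative of an observable, we can just find
$df/dt = \{f, H\}$.

⁴ Notice that in $\mathbb{R}^2$, $J$ corresponds to 2D rotation by $\pi/2$.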

Examples.
1. Letting f = p or f = q, we can even rewrite Hamilton’s equations (2,3) using
Poisson brackets.
2. Letting f = H itself will always give 0, as the Hamiltonian is a constant of
motion. In well-behaving conservative systems, this is just conservation of
energy.
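Both examples can be checked numerically for one degree of freedom. This Python sketch estimates the Poisson bracket with central differences. The function name `poisson_bracket`, the oscillator constants $m = 2$, $k = 3$ and the sample point are our own illustrative choices.

```python
def poisson_bracket(f, g, q, p, h=1e-5):
    """Estimate {f, g} = df/dq * dg/dp - df/dp * dg/dq at the point (q, p)."""
    df_dq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    df_dp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dg_dq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dg_dp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return df_dq * dg_dp - df_dp * dg_dq


m, k = 2.0, 3.0


def H(q, p):
    """Harmonic oscillator Hamiltonian."""
    return p * p / (2 * m) + k * q * q / 2


q, p = 0.3, 0.7
q_dot = poisson_bracket(lambda q, p: q, H, q, p)   # should equal p / m
p_dot = poisson_bracket(lambda q, p: p, H, q, p)   # should equal -k q
H_dot = poisson_bracket(H, H, q, p)                # should vanish
print(q_dot, p_dot, H_dot)
```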

4.1 Déjà vu
We may notice some similarities between the time derivatives of observables and
the material quantities we discussed in the earlier chapters. Analogous to the fluid
flow vector field $u(x)$, we have the Hamiltonian vector field $J\,dH(p, q)$, which we
use to find the integral flows, and then ask how quantities evolve along the base
vector field. In fact, we were answering the same question as in the first section:
how does a quantity $f$ change with time as it is carried along by the vector field
generated by $H$? After all, once we draw the phase portrait, what's so different
between the two scenarios?
Then at the end, we end up with a mysterious differential operator which may look
different, but has an underlying “structural” resemblance, such as its antisymmetry.
This hints that there is some underlying algebraic structure at play.

5 Quantum mechanics
Disclaimer: This chapter assumes an understanding of basic quantum mechanics
(actually it's just the first chapter of Shankar [1]). I believe one can still appreciate
the results if you choose to take some assertions as a black box, but this chapter
can be skipped without loss of continuity.
Let us extend our analysis to quantum mechanics, which should extend classical
mechanics. Here’s a brief overview of some definitions.
1. In quantum mechanics, we treat $q, p$ as linear operators $Q, P$, and so
functions of $Q$ and $P$ also become linear operators by direct substitution.
2. Statevectors $|\psi\rangle$ give the probability distribution of the state of the particle
you are looking at. Suppose I am working in the position basis, so I project
the statevector onto the x-basis, $\langle x|\psi\rangle \to \psi(x)$. Then, the probability of
finding the particle between $|x_0\rangle$ and $|x_0 + dx\rangle$ is $|\langle x_0|\psi\rangle|^2\,dx$.
3. These quantum operators act on statevectors, which represent the probability
distribution. The eigenvalues of these linear operators correspond to real⁵
observable values. If we have a probability distribution, we can also find the
expected value of a quantum operator $\Omega$, $\langle\Omega\rangle$, by taking the weighted sum
of the observables, i.e. its eigenvalues $\omega_i$,

\[ \langle\Omega\rangle = \sum_i P(\omega_i)\,\omega_i = \sum_i |\langle\omega_i|\psi\rangle|^2\,\omega_i, \tag{9} \]

which can be shown (with a clever use of the completeness
theorem $\sum_i |\omega_i\rangle\langle\omega_i| = I$, see Exercise 5.1) to be equivalent to

\[ \langle\Omega\rangle = \langle\psi|\Omega|\psi\rangle. \tag{10} \]

Exercise 5.1 Prove Equation (10). Hint: Use the completeness theorem
$\sum_i |\omega_i\rangle\langle\omega_i| = I$. What properties of the eigenkets allow the completeness
theorem to hold?

5.1 Ehrenfest’s theorem


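Again, our mission is the same. We want to know how observables vary with
time in a quantum system. So let's differentiate the expected value of a quantum
operator and see what happens.

\[ \frac{d\langle\Omega\rangle}{dt} = \frac{d}{dt}\langle\psi|\Omega|\psi\rangle = \langle\dot\psi|\Omega|\psi\rangle + \langle\psi|\dot\Omega|\psi\rangle + \langle\psi|\Omega|\dot\psi\rangle. \]

Assuming steady state, i.e. $\Omega$ is time independent, we may make use of Schrödinger's
equation,
\[ |\dot\psi\rangle = -\frac{i}{\hbar}\hat{H}|\psi\rangle, \tag{11} \]
to derive

\[ \frac{d\langle\Omega\rangle}{dt} = -\frac{i}{\hbar}\langle[\Omega, \hat{H}]\rangle, \tag{12} \]

where $[\triangle, \square] \equiv \triangle\square - \square\triangle$ is called the commutator bracket.

⁵ Because quantum operators are usually Hermitian, and Hermitian operators have real eigenvalues.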


Exercise 5.2 Fill in the gaps in the proof of equation (12).
This is strikingly similar to equation (7), $d\omega/dt = \{\omega, H\}$. But perhaps this is the
most perplexing variant, because on the LHS of equation (12), we have an analytic
expression, a time derivative, but on the RHS, we have the commutator bracket,
something more reminiscent of algebra.
But now, we have gathered the clues to understand what's really going on. The
commutator bracket $[\triangle, \square]$ boils down the underlying “structure” that equations (1), (7) and (12)
seem to share: “Left-Right-minus-Right-Left”. So how is non-commutativity, that
is, how badly a variable fails to commute with $X_H$, related to the time evolution of
flow-dependent quantities?

6 Punchline
What we have been doing all along is studying how integral flows induced by vector
fields transform tangent planes, one of the many topics of study in differential
geometry. Officially, differential geometry is the study of smooth spaces⁶ using
techniques of analysis, but a popular joke is that differential geometry is the “study
of things that are invariant under change of notation.” This is because often it does
not make sense to study quantities and properties that change meaning depending
on the name by which you call them, or the angle or distance from which you look
at them. If I say there is a stain on your left sleeve, you will turn to your right to
check.
Recall the first section where we considered the drift of a pin along the flow of
water currents. Thinking of the pin as a vector with scale and direction, I might ask
you if the pin is still the same pin as it drifts. You might say, “Well, the currents
did rotate it a bit to the right and stretched it a little, so now it's pointing to the
left,” but that is not what I am really asking. The pin never rotated on its own or
changed its own scale.
This idea of what should be the “real rate of change” of a vector (field) along
the flow of another vector field is called the Lie derivative, and it allows us to
properly identify tangent vectors from one point with their counterparts at another
point.

⁶ Smooth meaning continuously differentiable as many times as convenient.

Remark. Lie derivatives are commonly encountered in the study of smooth manifolds,
the study of non-Euclidean spaces that can be locally treated as Euclidean
spaces so that we can do calculus on them, where they allow us to identify the tangent
plane at one point of a smooth manifold with that at another.

7 Commutativity
That's cool and all, but what does this have to do with commutativity?
Let's suppose we have two vector fields, $U, V$, in the set⁷ of all vector fields, $\mathfrak{X}(\mathbb{R}^n)$
(with components in $C^\infty(\mathbb{R})$, the set of smooth functions that
map to $\mathbb{R}$). $U_p, V_p \in T_p\mathbb{R}^n$ are the tangent vectors ascribed by the vector fields at
$p$, and $T_p\mathbb{R}^n \cong \mathbb{R}^n$ is the tangent plane at $p$, the set of all tangent vectors at $p$.
We are usually used to defining tangent vectors using unit axis vectors, e.g. $\hat{x}, \hat{y}$.
But in the spirit of directional derivatives, we can define an operation of tangent
vectors on real-valued functions, $L_W : C^\infty(\mathbb{R}) \to C^\infty(\mathbb{R})$, defined by

\[ (L_W f)(p) \equiv \left.\frac{d}{dt}\right|_{t=0} f(W_t(p)) = \sum_i W^i(p)\,\frac{\partial f}{\partial x^i}(p), \tag{13} \]

where $W_t$ denotes the flow of $W$.
So instead of writing vector fields as $W_p = \sum_i W^i(p)\,\hat{x}_i$, we can write them as
$W_p = \sum_i W^i(p)\,\frac{\partial}{\partial x^i}$ instead. Each vector field, $U$ and $V$, determines a unique flow (by
solving an ODE), $\theta_t(p)$ and $\varphi_t(p)$ respectively, such that $\theta_0(p) = p$ and

\[ \left.\frac{d}{dt}\theta_t(p)\right|_{t=t_0} = U_{\theta_{t_0}(p)} \]

for all $t_0 \in (-\epsilon, \epsilon)$, $\epsilon \in \mathbb{R}$ sufficiently small, and likewise for $\varphi_t$ and $V$. (Picard's theorem proves existence and
uniqueness of solutions in this case.)
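If the Lie derivative of the vector $V_p \in V$ were to vanish, that is, there is no intrinsic
change to the vector $V_p$ as it gets carried along to $\theta_t(p)$ for $t \to 0$, then we would
expect that the tip of the vector of $V$ at $\theta_t(p)$, $\varphi_{t'}(\theta_t(p))$ for $t' \to 0$, should match up with where
the tip of the original vector $V_p$, $\varphi_{t'}(p)$, gets transported, $\theta_t(\varphi_{t'}(p))$. Thus, we require
$\varphi_{t'}(\theta_t(p)) = \theta_t(\varphi_{t'}(p))$ on a local neighbourhood of $t, t' = 0$, thus

\[ \theta_{-t}\,\varphi_{-t'}\,\theta_t\,\varphi_{t'}(p) = p, \tag{14} \]

as $t, t' \to 0$,
and the mixed derivatives of the two sides must agree as well,

\[ \left.\frac{\partial^2\,\varphi_{t'}\theta_t}{\partial t\,\partial t'}\right|_{t=t'=0} = \left.\frac{\partial^2\,\theta_t\varphi_{t'}}{\partial t\,\partial t'}\right|_{t=t'=0}. \]

⁷ A $C^\infty(\mathbb{R})$-module in fact.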

Since

\[ \left.\frac{\partial}{\partial t'}\right|_{t'=0} f(\varphi_{t'}\theta_t(p)) = (L_V f)(\theta_t(p)), \]

and letting $g = L_V f \in C^\infty(\mathbb{R})$,

\[ \left.\frac{\partial}{\partial t}\right|_{t=0} g(\theta_t(p)) = (L_U g)(p) = (L_U L_V f)(p). \]

Hence,

\[ \left.\frac{\partial^2}{\partial t\,\partial t'}\right|_{t=t'=0} f(\varphi_{t'}\theta_t(p)) = (L_U L_V f)(p). \]

Repeating this for the RHS, and since $f$ is arbitrary, we have the condition that

\[ [U, V] \equiv L_U L_V - L_V L_U = 0. \]

We call this differential operation $[\square, \triangle]$ the commutator bracket.
We can get $[U, V]$ into an even nicer form by using the form $L_W = \sum_i W^i \frac{\partial}{\partial x^i}$, which
you will show (Exercise 7.1) to be

\[ [U, V] = \sum_j \left( \sum_i U^i \frac{\partial V^j}{\partial x^i} - V^i \frac{\partial U^j}{\partial x^i} \right) \frac{\partial}{\partial x^j}. \tag{15} \]

But this is equivalent to

\[ [X, Y] = X \cdot \nabla Y - Y \cdot \nabla X. \]

This is exactly the expression we obtained in section 2!
If $[X, Y] \neq 0$, the value of $[X, Y]$ tells us by how much the flows have failed to commute, and
this occurs when vectors undergo some intrinsic change as they get transported,
so $[X, Y]$ is exactly the rate of this change, the Lie derivative.
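The coordinate formula $[X, Y] = X \cdot \nabla Y - Y \cdot \nabla X$ is simple to evaluate numerically in two dimensions. Below is a rough Python sketch using central differences. The function name `lie_bracket` and the test fields (a rotation field, a radial field, and two simple fields that do not commute) are our own illustrative choices.

```python
def lie_bracket(U, V, point, h=1e-5):
    """Estimate the components of [U, V] = U . grad V - V . grad U at `point`."""

    def derivative_along(F, direction):
        # Central difference of the field F along `direction` (not normalised).
        x, y = point
        dx, dy = direction
        fwd = F(x + h * dx, y + h * dy)
        bwd = F(x - h * dx, y - h * dy)
        return [(a - b) / (2 * h) for a, b in zip(fwd, bwd)]

    u, v = U(*point), V(*point)
    return tuple(a - b for a, b in zip(derivative_along(V, u), derivative_along(U, v)))


def rotation(x, y):
    return (-y, x)      # flows along circles about the origin


def radial(x, y):
    return (x, y)       # flows straight out from the origin


def east(x, y):
    return (1.0, 0.0)   # constant field x_hat


def sheared_north(x, y):
    return (0.0, x)     # the field x y_hat


p = (0.5, 2.0)
commuting = lie_bracket(rotation, radial, p)          # rotating and scaling commute
non_commuting = lie_bracket(east, sheared_north, p)   # expect y_hat
print(commuting, non_commuting)
```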

Remark. The algebraic structure seemingly arose from the algebra of flows, and
indeed, integral flows obey group properties. That is not to say there isn't more to
the algebraic properties of commutator brackets. They form an algebra known as
a Lie algebra, and are related to the beautiful theory of Lie groups, which are studied
by geometers.
Exercise 7.1 Prove equation (15). Hint: Use the equality of mixed partial derivatives
(Clairaut's theorem).

8 Review
8.1 Fluid Mechanics
In section 2, material quantities are defined by a vanishing Lie derivative $L_U V$,
$L_U V \equiv [U, V] = 0$. Objects carried by the flow are material quantities, and so
are more abstract vector fields like vorticity, since they are fully determined by the
underlying fluid flow field.
For an example of a non-material quantity, recall our earlier example of how the
unit vector $\hat{y}$ is transported by the vector field $U = y\hat{x}$. We found that
the directional derivative vanishes, although the unit vector must undergo some
instantaneous rotation due to the varying flow speed. Using our new tool, the Lie
derivative, we find that, denoting by $V = \hat{y}$ the constant vector field,
$L_U V = [U, V] = -\hat{x}$. This makes sense, as we would expect that the unit y-vector would
gain a horizontal component at unit rate, i.e. $\hat{x}t$, as it gets carried by $U$, due to
the flow speed difference between its tip and tail. But it doesn't; it's a constant
vector field after all. So there must be some “invisible hand” enacting an intrinsic
opposing change to the vector at the rate $-\hat{x}$, so that it remains constant as it gets
transported.
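This example can also be checked through the flows themselves, in the spirit of section 7. In the Python sketch below (the function names, the starting point and the small times are our own choices), `theta` is the exact flow of $U = y\hat{x}$ and `phi` is the exact flow of $V = \hat{y}$. Applying them in the two possible orders and dividing the gap by $t\,t'$ approximates $[U, V]$.

```python
def theta(point, t):
    """Exact flow of U = y x_hat: slide sideways at speed y."""
    x, y = point
    return (x + y * t, y)


def phi(point, s):
    """Exact flow of V = y_hat: move straight up at unit speed."""
    x, y = point
    return (x, y + s)


p = (0.0, 1.0)
t = s = 1e-3

along_u_then_v = phi(theta(p, t), s)   # phi_{t'} applied after theta_t
along_v_then_u = theta(phi(p, s), t)   # theta_t applied after phi_{t'}

# The mixed second difference approximates L_U L_V - L_V L_U, i.e. [U, V].
gap = (
    (along_u_then_v[0] - along_v_then_u[0]) / (t * s),
    (along_u_then_v[1] - along_v_then_u[1]) / (t * s),
)
print(gap)
```

The gap comes out close to $-\hat{x}$: the two flows fail to commute, and by exactly the amount the Lie derivative predicts.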

8.2 Classical Mechanics


As we move on to the later examples from mechanics, however, a qualitative account quickly becomes more complex, as we would have to introduce some symplectic geometry.
In section 3, the Poisson bracket $\{f, H\}$ is equivalent to the directional derivative $\mathcal{L}_{X_H} f$, as a straightforward calculation would show, but there is also a direct relation between Lie brackets and Poisson brackets. Recall the Hamiltonian equation $X_H = J\nabla H$. The Hamiltonian vector field $X_H$ is tangent to the orbits of constant $H$, which correspond to the trajectories of particles. But we can extend this to other scalars: an observable $f$ also generates its own Hamiltonian vector field, $X_f = J\nabla f$.
Proposition 8.1 Poisson brackets are equivalent to (the negation of) commutator brackets under the lift $f \mapsto X_f$, mapping $C^\infty(\mathbb{R}) \to \mathfrak{X}(\mathbb{R}^n)$:

$$X_{\{f,H\}} = -[X_f, X_H]. \tag{16}$$

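Or in other words, the following diagram commutes.

$$
\begin{array}{ccc}
C^\infty(\mathbb{R}) \times C^\infty(\mathbb{R}) & \xrightarrow{\ \{\ast,\,\triangle\}\ } & C^\infty(\mathbb{R}) \\
\Big\downarrow {\scriptstyle X_\ast \times X_\ast} & & \Big\downarrow {\scriptstyle X_\ast} \\
\mathfrak{X}(\mathbb{R}^n) \times \mathfrak{X}(\mathbb{R}^n) & \xrightarrow{\ -[\ast,\,\triangle]\ } & \mathfrak{X}(\mathbb{R}^n)
\end{array}
\tag{17}
$$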

Proof. An elementary computational proof can be found in this StackExchange response [2]. But we can still come up with a qualitative explanation for this parallel between Lie brackets and Poisson brackets. From the geometric pictures we have drawn thus far, the map from a scalar function to its Hamiltonian vector field, $f \mapsto X_f$, can be thought of as mapping a scalar function to the tangent vectors along its contours of constant value. After all, in 2D the map is exactly $f \mapsto J\nabla f$, where $J$ rotates $\nabla f$, the normal vector to the contours of $f$, by $\pi/2$, making it tangent to the contours instead.
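The sign in the proposition is easy to get wrong, so a numerical sanity check can help. The sketch below uses central finite differences on a 2D phase space $(q, p)$. The conventions $X_f = (\partial_p f, -\partial_q f)$, $\{f, g\} = f_q g_p - f_p g_q$ and $[X, Y] = (X\cdot\nabla)Y - (Y\cdot\nabla)X$ are assumptions made for this illustration, and the functions `f` and `H` are hypothetical test cases, not taken from the article.

```python
# Finite-difference check of X_{f,H} = -[X_f, X_H] on a 2D phase space (q, p).
# Assumed conventions (illustrative, not fixed by the article):
#   X_f = J grad f = (df/dp, -df/dq)
#   {f, g} = f_q g_p - f_p g_q
#   [X, Y] = (X . grad) Y - (Y . grad) X

STEP = 1e-3  # step size for central differences


def grad(func, q, p, h=STEP):
    """Central-difference gradient (d/dq, d/dp) of a scalar function."""
    dq = (func(q + h, p) - func(q - h, p)) / (2 * h)
    dp = (func(q, p + h) - func(q, p - h)) / (2 * h)
    return dq, dp


def hamiltonian_field(func):
    """Return the vector field X_func = (d func/dp, -d func/dq)."""
    def field(q, p):
        dq, dp = grad(func, q, p)
        return dp, -dq
    return field


def poisson(f, g):
    """Return the scalar function {f, g} = f_q g_p - f_p g_q."""
    def bracket(q, p):
        fq, fp = grad(f, q, p)
        gq, gp = grad(g, q, p)
        return fq * gp - fp * gq
    return bracket


def lie_bracket(X, Y, q, p, h=STEP):
    """Evaluate [X, Y] = (X . grad) Y - (Y . grad) X at the point (q, p)."""
    def directional(field, direction):
        # Derivative of `field` along `direction`, by central differences.
        a, b = direction
        plus = field(q + h * a, p + h * b)
        minus = field(q - h * a, p - h * b)
        return tuple((u - v) / (2 * h) for u, v in zip(plus, minus))
    x_dy = directional(Y, X(q, p))
    y_dx = directional(X, Y(q, p))
    return tuple(u - v for u, v in zip(x_dy, y_dx))


def f(q, p):
    """Hypothetical observable."""
    return q * q * p


def H(q, p):
    """Hypothetical Hamiltonian (quartic well)."""
    return 0.5 * p * p + 0.25 * q ** 4


def max_deviation(q, p):
    """Largest component of X_{f,H} + [X_f, X_H] at (q, p); ideally near 0."""
    lhs = hamiltonian_field(poisson(f, H))(q, p)
    rhs = lie_bracket(hamiltonian_field(f), hamiltonian_field(H), q, p)
    return max(abs(a + b) for a, b in zip(lhs, rhs))


if __name__ == "__main__":
    print(max_deviation(0.7, -0.4))
```

For these test functions the check can also be done by hand: $\{f, H\} = 2qp^2 - q^5$, so $X_{\{f,H\}} = (4qp,\ 5q^4 - 2p^2)$, which is the negative of $[X_f, X_H] = (-4qp,\ 2p^2 - 5q^4)$.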

Figure 5: Visualization of the proposition. Dashed lines represent contour lines (lines of constant value). $X_f$ is parallel to $X_H$ and thus experiences no torque, so it has a vanishing Lie derivative and its contour lines coincide with trajectories. This doesn't hold for $X_g$, which thus has a nonzero time derivative.

So to compute $df/dt = \{f, H\}$, we might as well project $f$ and the Hamiltonian $H$ to their tangent vectors, and find how the tangent vectors to the contour plot of $f$ get transported by the Hamiltonian flow $X_H$ (Figure 5). If $X_f$ is locally parallel to $X_H$, that is equivalent to saying $f$ is constant on that orbit/trajectory. But the more $X_f$ turns perpendicular to the Hamiltonian flow, the greater the "torque" it feels from the Hamiltonian flow, which results in a greater Lie derivative. This also means that the Hamiltonian flow is crossing a greater number of contour lines of $f$ per unit time, which is another way of saying $df/dt$ is getting larger.
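A concrete case may make the relation $df/dt = \{f, H\}$ more tangible. The worked example below is a sketch for the unit harmonic oscillator; the choice of $H$, the observable $f = qp$, and the conventions $\{f, H\} = f_q H_p - f_p H_q$ and $X_H = (\partial_p H, -\partial_q H)$ are assumptions made for illustration.

```latex
\begin{aligned}
H &= \tfrac{1}{2}\,(q^2 + p^2), \qquad
\dot{q} = \partial_p H = p, \quad \dot{p} = -\partial_q H = -q,\\
f &= qp: \qquad
\{f, H\} = \partial_q f\,\partial_p H - \partial_p f\,\partial_q H = p^2 - q^2,\\
\text{trajectory: } & q(t) = \cos t,\ p(t) = -\sin t
\ \Longrightarrow\ f(t) = -\tfrac{1}{2}\sin 2t,\\
\frac{df}{dt} &= -\cos 2t = \sin^2 t - \cos^2 t = p^2 - q^2 = \{f, H\}.
\end{aligned}
```

By contrast, taking $f = H$ gives $\{H, H\} = 0$: here $X_f$ is everywhere parallel to $X_H$, the flow never crosses a contour of $f$, and $f$ is conserved.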

8.3 Quantum Mechanics


In the quantum case, the commutator bracket is also intimately related to quantization: the problem of translating classical setups into quantum ones, and figuring out why we cannot do so in some cases. (For example, the lack of commutativity between the $X$ and $P$ operators gives rise to the Heisenberg uncertainty principle. Footnote: This is a result of the spectral theorem for Hermitian matrices.) I do not think I can do this topic justice myself in this article,⁸ but perhaps this might inspire the reader to learn more about the mathematical physics side of QM and QFT.
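For readers who have not seen it, the non-commutativity of $X$ and $P$ can be made explicit in one line. The sketch below assumes the standard position representation, $X\psi = x\psi$ and $P\psi = -i\hbar\,\partial_x\psi$, which is not introduced in this article, and quotes the Robertson form of the uncertainty relation without proof.

```latex
\begin{aligned}
[X, P]\,\psi &= x\,(-i\hbar\,\partial_x \psi) - (-i\hbar)\,\partial_x (x\,\psi)\\
             &= -i\hbar\,x\,\psi' + i\hbar\,\psi + i\hbar\,x\,\psi'
              = i\hbar\,\psi,\\
\Delta X\,\Delta P &\ \ge\ \tfrac{1}{2}\,\bigl|\langle [X, P] \rangle\bigr| = \tfrac{\hbar}{2}.
\end{aligned}
```

The commutator here plays the same role as the Poisson bracket $\{q, p\} = 1$ in the classical picture.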

9 Conclusion
Although I think the topic of Lie derivatives is fascinating in its own right, I wanted to write this article to share what I feel to be one of the most rewarding things about learning mathematics and physics: finding 'bijections' between ideas in different fields, cross-pollinating perspectives and techniques between domains to get the big picture. In my case, I had a click moment while learning differential geometry, when I realized there was a full geometric interpretation of Hamiltonian mechanics and quantum mechanics that textbooks only allude to. Maybe that explains some of the allure of the work being done in mathematics today, like the Langlands program, which explores connections between geometry and number theory. But don't take my word for it, because Paul Dirac, one of the pioneers of the modern theory of quantum mechanics, seems to have shared this view too:
“The first main advance I made with Quantum Theory was realizing that there is an analogy between the commutator of two dynamical variables $uv - vu$, which is not zero, with Heisenberg’s Matrix Mechanics, and the Poisson bracket of classical mechanics, I remember this idea occurred to me when I was taking a walk out in the country, and I was very excited about it.” - Paul A.M. Dirac.

References
1. Shankar, R. Principles of Quantum Mechanics, 2nd ed. ISBN: 978-0-306-44790-7 (Plenum Press, New York, 1994).
2. Pirnes, S. (https://math.stackexchange.com/users/1031660/sakari-pirnes). Relation of the Hamiltonian vector fields and the Lie bracket. Mathematics Stack Exchange. https://math.stackexchange.com/q/4394467 (version: 2022-03-02).
⁸ Cause idk.
3. Arnold, V. I. Chapter 16: Hamiltonian Mechanics. In Mathematical Methods of Classical Mechanics. ISBN: 978-1-4419-3087-3, 978-1-4757-2063-1. http://link.springer.com/10.1007/978-1-4757-2063-1 (2022) (Springer New York, New York, NY, 1989).
4. Morin, D. Introduction to Classical Mechanics: With Problems and Solutions. https://scholar.harvard.edu/files/david-morin/files/cmchap15.pdf.
5. Lee, J. M. Introduction to Smooth Manifolds. Graduate Texts in Mathematics 218. ISBN: 978-0-387-95495-0, 978-0-387-95448-6 (Springer, New York, 2003).
6. Abraham, R. & Marsden, J. E. Foundations of Mechanics, 2nd ed. First Indian edition 2011. OCLC: 908404855. ISBN: 978-0-8218-6875-1 (AMS Chelsea Publishing, Providence, R.I., 2011).
7. Childress, S. An Introduction to Theoretical Fluid Dynamics. https://math.nyu.edu/~childres/fluidsbook.pdf (Feb. 2008).

Challenge Problems
In this section, we present to you 7 fun and interesting questions on functions for you to try out.
Some of these problems can be very challenging, so don’t worry if you can’t solve many of them.
Once you have solved some or all of these questions, do fill in your answers in the Microsoft Forms
link below to stand a chance to win attractive prizes!!!

CHALLENGE SET FORM (CLICK HERE TO SUBMIT YOUR ANSWERS!!).

Link: https://forms.office.com/r/UFCDgkE5B5

Questions 1 to 4 will be multiple choice questions (MCQs), where you just need to choose the
correct final answer in the MS forms. For questions 5 to 7, do upload a photo of your full solution
to get the full points.

Question 1 (2 points)
Let $f(x)$ be a 4th degree polynomial function such that the remainders when $f$ is divided by $(x - 2019)$, $(x - 2020)$, $(x - 2021)$, $(x - 2022)$, $(x - 2023)$ are 120, 6, 2, 24 and 120 respectively. What is the value of $f(2024)$?

(A) 0
(B) 910
(C) 630
(D) 2024

Question 2 (3 points)
Let $f(x) = x^3 + 3x + 1$, where $x$ is a real number. Given that the inverse function of $f(x)$ exists and is given by

$$f^{-1}(x) = \left(\frac{x - a + \sqrt{x^2 - bx + c}}{2}\right)^{\frac{1}{3}} + \left(\frac{x - a - \sqrt{x^2 - bx + c}}{2}\right)^{\frac{1}{3}}$$

where $a$, $b$, $c$ are positive constants, what is the value of $a + 10b + 100c$?

(A) 521
(B) 215
(C) 325
(D) 746

Question 3 (5 points)
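Compute the flow $\theta_t(x, y)$ of the vector field $V = 4y\,\frac{\partial}{\partial x} + 9x\,\frac{\partial}{\partial y}$ on $\mathbb{R}^2$.

(A) $(x\cosh 6t + \frac{2}{3}y\cosh 6t,\ y\sinh 6t + \frac{3}{2}x\sinh 6t)$
(B) $(x\cosh 6t,\ \frac{3}{2}y\sinh 6t)$
(C) $(x\cosh 6t + \frac{2}{3}y\sinh 6t,\ y\cosh 6t + \frac{3}{2}x\sinh 6t)$
(D) $(\frac{2}{3}y\sinh 6t,\ \frac{3}{2}x\sinh 6t)$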

Question 4 (5 points)
Find all functions $f : \mathbb{R} \to \mathbb{R}$ such that

$$f(y - f(x)) = f(x) - 2x + f(f(y))$$

(A) $f(x) = x$, $x \in [0, 1)$
(B) $f(x) = x - 1$
(C) $f(x) = x^2$ and $f(x) = x$
(D) $f(x) = x$

Question 5 (6 points)
Find all functions $f(x)$ defined on the interval $[0, 1]$ satisfying

$$f(f(x)) = f(x) \quad\text{and}\quad f(x)\sin^2(x) + x\cos(f(x))\cos(x) = f(x)$$

Question 6 (7 points)
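Let $\Gamma_1, \Gamma_2, \Gamma_3, \Gamma_4$ be distinct circles such that $\Gamma_1, \Gamma_3$ are externally tangent at $P$, and $\Gamma_2, \Gamma_4$ are externally tangent at the same point $P$. Suppose that $\Gamma_1$ and $\Gamma_2$; $\Gamma_2$ and $\Gamma_3$; $\Gamma_3$ and $\Gamma_4$; $\Gamma_4$ and $\Gamma_1$ meet at $A, B, C, D$, respectively, and that all these points are different from $P$. Prove that

$$\frac{AB \cdot BC}{AD \cdot DC} = \frac{PB^2}{PD^2}.$$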

Question 7 (7 points)
Let $n$ be a positive integer. Find the number of pairs $P, Q$ of polynomials with real coefficients such that

$$(P(x))^2 + (Q(x))^2 = x^{2n} + 1 \quad\text{and}\quad \deg P > \deg Q$$
