CMM Power Round Full
CMM Power Round Full
1 Linear Algebra 2
1.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.2 Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 Linear Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.2 Injectivity, Surjectivity, and Isomorphisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.3 Foundational Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.1 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.2 Linear Maps and Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.3.3 Change of Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.4 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5 Inner Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2 Group Theory 20
2.1 The Triangle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 Other Polygons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3 Symmetric Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4 Some Coincidences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.5 Abstract Definitions and Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4 Representation Theory 41
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.2 Irreducible Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.3 Abelian Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5 Character Theory 48
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.2 Class Functions and Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.3 Decomposition of Reducible Characters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.4 Applications and Character Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
1
1 Linear Algebra
Over the course of this power round, the main tools we leverage in every theorem and proof are mathematical
objects. One such object you are likely familiar with is a set, a collection of any other things, for instance,
numbers. What other mathematical objects are there? Often, we obtain new objects by bestowing an additional
structure on sets. The set of rational numbers, ℚ, is one example. By giving the rational numbers the operations
of addition (+) and multiplication (·), we can start to ask more specific questions about it, since the elements
are no longer interchangeable.
The study of mathematical objects is very useful, because it helps us generalize results. If we prove some
theorem about ℚ based on the fact that it has addition and multiplication, then any other object with those
same operations would have the same theorem! So if we start with a precise definition for an “object” that has a
minimal set of properties, in one fell swoop we prove the theorem for any possible instance of this object. Then,
the results will have surprising generality and often represent vast common themes in the world of mathematics.
In this section, we introduce vector spaces, which are the currency of linear algebra. There are concrete,
intuitive examples of vector spaces that you are probably already familiar with. While we encourage you to
examine how our results manifest in these examples, try also to understand them abstractly. The abstract results
apply to anything which is a vector space!
1.1.1 Definitions
We begin by defining the notion of a field. The most common examples of fields are ℝ (the real numbers) and
ℚ (the rational numbers) - intuitively, they are the familiar sets where we can add, subtract, multiply and divide.
However, we must axiomatize all of the “familiar properties” that we usually take for granted. It may look
overwhelming, but every property here is one that you use every time you do arithmetic. First, we recall that a
binary operation on a set S is just a map S × S → S.
3. Identities: there exist additive and multiplicative identities in F, referred to as 0, 1 ∈ F, respectively, such
that for all a ∈ F, we have that 0 + a = a and 1 · a = a.
4. Additive inverse: for all a ∈ F, there exists b ∈ F such that a + b = 0. Such b is often denoted −a.
5. Multiplicative inverse: for all a ∈ F such that a ̸= 0, there b ∈ F such that a · b = 1. Such b is often
denoted a−1 .
2
Remark. If you would like to prove that a certain object is a field, also do not forget to check closure under
addition and multiplication! That is, do not forget to check that for all a, b ∈ F, we have a + b ∈ F and
a · b ∈ F. “Closure” is not listed as one of the field axioms, but really what it comes from is from the definition
of addition and multiplication as binary operations.
We have already seen some familiar examples of fields. The rational numbers ℚ, the real numbers ℝ, and
the complex numbers ℂ are fields (with the usual addition and multiplication operations). If you know some
number theory, you might also be able to see that, for any prime number p, the set of integers modulo p (usually
denoted by ℤ/pℤ) equipped with the addition and multiplication operations for modular arithmetic, is a field
(this is called a finite field, since the cardinality of ℤ/pℤ is p). Less obvious is that the following subset of the
real numbers ( )
√ √ 3 √ √ k
ℚ[ 2 + 2023] := ∑ ak ( 2 + 2023) : a0 , a1 , a2 , a3 ∈ ℚ
k=0
is also a field, again with the usual addition and multiplication operations.
Definition 1.2. A vector space V over a field F (we sometimes say: V is an F-vector space) is a set equipped
with two operations:
• Addition: + : V ×V → V, (v, w) 7→ v + w,
• Scalar Multiplication: · : F ×V → V, (a, v) 7→ a · v
addition (+) and scalar multiplication (·). Scalar multiplication takes an element a ∈ F and a vector v ∈ V and
outputs another vector w = a · v.
3. Additive identity: there is a zero vector 0 ∈ V , such that for all v ∈ V , we have 0 + v = v.
4. Additive inverse: for all v ∈ V , there exists w ∈ V such that v + w = 0. Such w is often denoted −v.
6. Scalar multiplication and field multiplication commute: for all a, b ∈ F and v ∈ V , we have that
(a · b) · v = a · (b · v).
3
Note that in the case of both vector spaces and fields, sometimes · is omitted when denoting multiplication.
You should also take care to distinguish 0 ∈ F and 0 ∈ V . Elements of fields are typically called scalars, and
elements of vector spaces are called vectors. The zero vector is different from the scalar zero. Moreover, the
same symbol · is used for field multiplication and scalar multiplication of vectors. The former is an operation
between two scalars, and the latter is an operation between a scalar and a vector. So this abuse of notation is
benign, since you should always be able to figure out which one is being used in context.
Another instance of benign abuse of notation occurs when we write 0 ∈ V , instead of using the bolded form
0. Again, when reading algebraic expressions, it should be clear, by context, whether any 0 refers to that of a
field or that of a vector space.
Example 1.1. Consider the set F n = {(a1 , . . . , an ) | ai ∈ F for i = 1, . . . , n}. We define addition on this set by
(a1 , . . . , an ) + (b1 , . . . bn ) = (a1 + b1 , . . . , an + bn ), and scalar multiplication by c · (a1 , . . . , an ) = (ca1 , . . . , can ).
Then, F n equipped with these operations is a vector space over F: in fact, every vector space axiom follows
from a corresponding axiom for the field F, as we encourage you to verify. For instance, (b + c) · (a1 , . . . , an ) =
((b + c)a1 , . . . , (b + c)an ) = (ba1 + ca1 , . . . , ban + can ) = (ba1 , . . . , ban ) + (ca1 , . . . , can ) = b · (a1 , . . . , an ) + c ·
(a1 , . . . , an ). This proof uses only field distributivity and definitions of operations to prove a distributivity axiom
of vector spaces.
This example of a vector space is likely the most familiar to you. For instance, ℝ2 is the space of all
2-dimensional vectors, with the first component being the x-coordinate and the second component the y-
coordinate. However, there are various other vector spaces, as you will see in 1.3.
Let us briefly mention some easy-to-prove basic “properties” of vector spaces. Of course, given a vector
space V , we have uniqueness of the additive identity 0 ∈ V , and uniqueness of additive inverses in V . The
method to prove these uniqueness statements is the same as in Problem 1.1 (which we urge you to review
and/or ponder about if you are still confused!). These may seem obvious, but really they do require the vector
space (and field) axioms to verify! As usual, F is a field, and V is an F-vector space. For every a ∈ F, v ∈ V :
• 0v = 0, a0 = 0.
• (−1)v = −v.
• av = 0 implies a = 0 or v = 0.
To verify the last one, you would need to use the multiplicative inverse property of a field.
4
The above Problem 1.2 is extremely important!! Pretty much everything here is as important as the
above definitions, and the concepts here will be referenced many times in the future. So, make sure to read
the problem statements before you proceed.
Problem 1.3 (3 Points). Here are some example vector spaces. You should define addition and scalar multipli-
cation appropriately. Let F be a field.
(a) (1 Point) Prove that the set of polynomials with coefficients in F is a vector space (over F). Prove that
the set of polynomials with degree at most d, for any d ∈ ℕ, is also a subspace. (Note: you should not
need to define any multiplication between polynomial.)
(b) (2 Points) Prove that if S is any set, and V is any vector space over F, then the set of all functions from S to
V is a vector space. Hint: for addition and scalar multiplication, define f + g by ( f + g)(s) = f (s) + g(s),
and (c f )(s) = c f (s).
1.1.2 Dimension
From the remainder of Section 1, we will use F to denote a field. Additionally, U, W , and V will be used
to denote any given vector space over F (potentially subspaces).
Many vector spaces are sets with an infinite number of elements. However, some of these vector spaces
seem bigger than others. For instance, ℝ seems smaller than ℝ2 . In terms of cardinality, the sizes of ℝ and
ℝ2 are the same (we will not go into this). But, in terms of vector spaces, ℝ is indeed “smaller” than ℝ2 . We
will understand this rigorously by defining a dimension of a vector space. First, an expression of the form
a1 v1 + · · · + an vn (with ai ∈ F, vi ∈ V ) is called a linear combination of v1 , . . . , vn . Now, we provide the two key
definitions:
Definition 1.3. The span of a finite set of vectors S = {v1 , . . . , vn } in V is defined as follows. If S is non-empty,
then the span of S is the subspace {a1 v1 + · · · + an vn | a1 , . . . , an ∈ F} of V . If S is empty, then the span of S is
{0}. In other words, the span of S is the smallest subspace of V that contains S. We denote the span of S by
span(S).
Example 1.2. Consider ℝ3 , the space of 3-dimensional real vectors. The span of the vector (1, 1, 1), for in-
stance, will be the line where x = y = z. The span of (1, 1, 1), (1, 0, 0) is the plane formed by the x-axis and the
line x = y = z. The span of (1, 1, 1), (1, 0, 0) is the same as the span of (0, 1, 1), (2, 0, 0).
Definition 1.4. A nonempty finite set of vectors S = {v1 , . . . , vn } in V is linearly independent if any vector
v ∈ span(S) can be written uniquely as a linear combination of S. S is called linearly dependent if it is not
linearly independent.
Remark. You may verify that S = {v1 , . . . , vn } is a linearly independent set in V is equivalent to saying that
span(v1 , . . . , vn ) = span(v1 ) ⊕ · · · ⊕ span(vn ). The definition can also be slightly simplified. Say that a vector v
could be written distinctly as a1 v1 + · · · + an vn and b1 v1 + · · · + bn vn . Then, (a1 − b1 )v1 + . . . (an − bn )vn = 0.
So, 0, a vector in span(S), could then be written as this linear combination. However, 0 has an even simpler,
different linear combination: 0v1 + · · · + 0vn = 0. We have just proved that if some vector in span(S) can be
written in two different ways, then 0 can also be written in two different ways. And the converse is certainly
true. This proof demonstrated a key technique in linear algebra: often global statements can be reinterpreted as
statements about 0. Below, we restate this equivalent definition of linear independence.
Definition 1.5. A set of vectors S = {v1 , . . . , vn } is called linearly independent if a1 v1 + · · · + an vn = 0 implies
that a1 , . . . , an = 0.
5
Now, we define “finite-dimensional” vector spaces. Much of elementary linear algebra is the theory of
vector spaces with finite dimension because it is easier to understand their structure.
Definition 1.6. A vector space V is finite-dimensional if there exists a finite set of vectors S = {v1 , . . . , vn } that
spans V . A set S of vectors is called a basis of V if it is linearly independent and spans V .
(The remainder of the Power Round will almost be always concerned with finite-dimensional vector spaces,
unless otherwise indicated.)
At this point, we should mention that the proofs of Lemma 1.7 and Theorems 1.8, 1.9, and 1.10 do rely on
the fact that F be a field (i.e., the field axioms). First, we present the Lemma, which will be crucial in proving
the following Theorems. If you are interested, you may try to prove the Lemma yourself. You are welcome to
cite the Lemma in any Problem on this Power Round that follows thereafter.
Lemma 1.7. If S is a linearly dependent set of vectors {v1 , . . . , vn } that spans V , then there exists some vi such
that vi ∈ span(v1 , . . . , vi−1 ), and S\{vi } spans V .
Theorem 1.10. Let V be a finite-dimensional vector space. Any linearly independent set v1 , . . . , vn can be
extended to a basis.
Proof. Say that S1 and S2 are two bases. Then, S1 is linearly independent and S2 is spanning, which proves that
S2 is at least as big as than S2 . The same argument proves that S1 is at least as big as S2 , proving they have the
same cardinality.
6
Now, in a very precise sense, we can see why ℝ2 is “larger” than ℝ as a vector space over ℝ. Note
that {(1, 0), (0, 1)} is a basis of ℝ2 (verify this if it is not clear!), and {1} is a basis of ℝ. So they have
dimensions 2 and 1 respectively. It is a simple exercise that F n is a vector space with dimension n, and the basis
{(1, . . . , 0), . . . , (0, . . . , 1)} is called the standard basis.
Example 1.3. Consider the fields ℝ, ℂ.√ Observe that ℂ is a 2-dimensional vector space over ℝ in the obvious
way. Indeed, the set {1, i} (where i = −1 as usual) is a basis of ℂ as an ℝ-vector space.
On the other hand, ℂ is certainly a 1-dimensional vector space over ℂ (itself). This indicates that the
concept of dimension also depends on the base field.
Problem 1.6 (4 Points). Prove that ℝ is a vector space over ℚ. Is it infinite-dimensional, or finite-dimensional?
Justify (i.e. prove) your answer. If ℝ is finite-dimensional, find the dimension. You may use the fact that π is
transcendental (there is no polynomial p with rational coefficients such that p(π) = 0).
1.2.1 Definitions
A linear map T : V → W is a function from one vector space to another with additional structure. It is said to
“respect the structure of vector spaces,” which is made precise in the following sense:
Definition 1.11. Let V,W be vector spaces over the same field F. A map T : V → W is called a linear transfor-
mation or a linear map if it obeys the following properties:
When working with linear maps, you should make sure to distinguish elements of V with elements of W ,
and especially the 0 ∈ V with the 0 ∈ W !
Why are linear maps defined in this manner? Well the theory of functions between vector spaces is only
useful if the function connects the vector space properties of one space to another. If the vector spaces have
completely different definitions of addition and multiplication that the function does not respect, then the func-
tion becomes rather weak from the perspective of linear algebra.
Example 1.4. Consider the fields ℝ, ℂ. Recall that ℂ is a 2-dimensional vector space over ℝ, and a 1-
dimensional vector space over ℂ. Then, the map T : ℂ → ℂ, z 7→ iz is a linear map, whether we view ℂ as
an ℝ-vector space or as a ℂ-vector space! Sometimes, we say that this is either an ℝ-linear map or a ℂ-linear
map to be clear on the base field!
7
For instance, to check ℂ-linearity, let z1 , z2 ∈ ℂ (as vectors) and a ∈ ℂ (as a scalar). We have T (z1 + z2 ) =
i(z1 + z2 ) = iz1 + iz2 = T (z1 ) + T (z2 ), and T (az1 ) = iaz1 = a(iz1 ) = aT (z1 ). Thus, T is ℂ-linear. The exact
same proof works for ℝ-linearity, because ℝ is contained in ℂ (ℝ is called a subfield of ℂ).
In the following two definitions and the following theorem, T : V → W is a linear map.
Definition 1.12. The set {v ∈ V | T (v) = 0} is a subset of V , called the kernel of T , or ker(T ).
Definition 1.13. The set T (V ) := {T (v) | v ∈ V } is a subset of W , called the image of T , or im(T ).
Theorem 1.14.
1. The image of T is a subspace of W .
2. The kernel of T is a subspace of V .
Proof. Let v1 , v2 ∈ V, a ∈ F. If w1 := T (v1 ) and w2 := T (v2 ) are two elements in the image of T , then aw1 −w2 =
aT (v1 ) − T (v2 ) = T (av1 − v2 ) is in the image. Moreover, since T (0) = T (0 + 0) = T (0) + T (0), T (0) = 0. So,
0 ∈ im(T ). Now by Problem 1.2, the image is a subspace.
Now, say that T (v1 ) = 0 and T (v2 ) = 0. Then, T (av1 − v2 ) = a · 0 − 0 = 0. Moreover, T (0) = 0, so
0 ∈ ker(T ). So, once again by 1.2, the kernel is a subspace.
The dimension of the image of a linear map T : V → W is called the rank of T , and the dimension of the
kernel of T is called the nullity of T .
Let T : V → W be a linear map. Recall that T is called injective if T does not send any two elements to the
same vector: i.e., if v ̸= w (v, w ∈ V ), then T (v) ̸= T (w). Also recall that T is called surjective if im(T ) = W .
The intuitive significance of injective functions (in general) is that they fit a full copy of the domain into the
codomain, without collapsing any points. For linear maps, however, we can show the following additional
result!
Proof. Note T injective implies that only 0 gets mapped to 0 by T , and therefore ker(T ) = {0} (the nullity of T
is 0). Conversely, if ker(T ) = {0}, then for v, w ∈ V , note T (v) = T (w) =⇒ T (v − w) = 0 =⇒ v − w = 0 =⇒
v = w. This is precisely injectivity.
Definition 1.16. A linear map T : V → W which is both injective and surjective is called a linear isomorphism.
Theorem 1.17. A linear map T : V → W is a linear isomorphism if and only if it has a (two-sided) inverse
linear map T −1 .
Linear isomorphisms are a key concept. If there is an isomorphism between two vector spaces, they are
called isomorphic and you should think of those spaces as being, for all intents and purposes, the same. As
vector spaces, they are effectively identical, with the linear isomorphism T just being a “translation” between
two different representations of the same object.
8
Problem 1.7 (3 Points). In this problem, you may use the fact that the composition of two linear maps is again
linear. This is straightforward to show, and you may assume this fact witihout proof. This is yet another
essential property of linear maps!
(a) (1 Point) Prove that the composition of two injective linear maps is again an injective linear map. That
is, if T1 : U → V, T2 : V → W are injective linear maps, then T2 ◦ T1 : U → W is again an injective linear
map.
(b) (1 Point) Prove that the composition of two surjective linear maps is again a surjective linear map. That
is, if T1 : U → V, T2 : V → W are surjective linear maps, then T2 ◦ T1 : U → W is again a surjective linear
map.
(c) (1 Point) Now, let T1 : U → V, T2 : V → W be linear maps, and suppose T2 ◦ T1 : U → W is a linear
isomorphism. Prove that T1 is injective and T2 is injective.
Problem 1.8 (2 Points). Let T : V → W be a linear isomorphism of finite dimensional vector spaces V,W . Prove
that dim V = dim W . You may not directly cite the Rank-Nullity Theorem.
We now prove a fundamental result about the rank and nullity of a linear map out of a finite-dimensional vector
space.
Theorem 1.18 (Rank-Nullity Theorem). Say T : V → W is a linear map, where V is a finite-dimensional vector
space. Then dim im(T ) + dim ker(T ) = dim V .
Proof. Take some basis v1 , . . . , vm of ker(T ). Since it is linearly independent in V , it can be extended to a basis
v1 , . . . , vn of V . Then, note that T (v1 ), . . . , T (vn ) span im(T ). After, all, any T (v) = T (a1 v1 + · · · + an vn ) =
a1 T (v1 ) + · · · + an T (vn ). But since T (v1 ), . . . , T (vm ) are all zero, this implies that T (vm+1 ), . . . , T (vn ) span
im(T ).
Now, assume by way of contradiction that T (vm+1 ), . . . , T (vn ) are linearly dependent in im(T ) (i.e., not
a basis). Then, we would have that 0 = am+1 T (vm+1 ) + · · · + an T (vn ) = T (am+1 vm+1 + · · · + an vn ), for some
a1 , . . . , an ∈ F with ai ̸= 0 for some i. This would imply that the nonzero vector (since one of the ai is nonzero
and v = vm+1 , . . . , vn are linearly independent) am+1 vm+1 + · · · + an vn is in the kernel of T . However, this implies
that v can be written as a nonzero linear combination of v1 , . . . , vm , and we already know it can be written as
a linear combination of vm+1 , . . . , vn —this contradicts linear independence. So, we have that the dimension of
im(T ) is n − m, completing the proof!
For an intuition for this proof, imagine that the size of V can be deduced from two quantities - how big its
image is (when it is mapped into W by T ), and how redundant the map T is. The latter is made precise by the
number of elements that “hit zero,” i.e. the dimension of the kernel.
Problem 1.9 (3 Points). Consider the function T : ℝ3 → ℝ2 defined by T ((v1 , v2 , v3 )) = (v1 , v2 ). Prove it is
a linear map, describe the kernel and image, and prove that the kernel is dimension 1 and the image dimen-
sion 2 (explicitly verifying that the rank-nullity theorem holds in this example). Describe both vector spaces
geometrically.
Another crucial result is the following: the linear maps from one space to another form a vector space.
9
Definition 1.19. Let V,W be vector spaces over the same field F. Then, the set of all linear maps from V to W
is denoted by Hom(V,W ).
Proof. In Problem 1.3, we prove that the set of functions from S, a set, to W , for any fixed vector space W , is
a vector space itself. Taking S to be V , we have that the set of functions from V to W is a vector space, where
addition of functions is pointwise addition (T1 + T2 : v 7→ T1 (v) + T2 (v)) and scalar multiplication is pointwise
scalar multiplication (aT : v 7→ aT (v)).
To prove this result, therefore, we must show that Hom(V,W ) is a subspace of such functions. Using the
condition in 1.2, say we have two linear maps T and S. Then, aT − S : v 7→ aT (v) − S(v) is a linear map. To see
this, we observe that:
(aT − S)(v + w) = aT (v + w) − S(v + w) = aT (v) − S(v) + aT (w) − S(w) = (aT − S)v + (aT − S)w)
(aT − S)(cv) = aT (cv) − S(cv) = c(aT (v) − S(v)) = c(aT − S)v
Unfortunately, we still don’t know how to understand Hom(V,W ) from a practical standpoint. This is
addressed in the following section.
1.3 Coordinates
1.3.1 Vectors
All of the discussion so far has been very abstract - but when we talk about finite-dimensional vector spaces, a
vector space can be made much more explicit.
Corollary. All vector spaces over F of the same dimension are isomorphic.
Proof. If V and W both have dimension n, There is an isomorphism from V to F n and one from F n to W . Since
the composition of two isomorphism is an isomorphism, we find that V is isomorphic to W .
The theorem allows us to write any vector v ∈ V “with respect to a basis” as a coordinate vector (a1 , . . . , an ).
In a sense, every isomorphism of a vector space to F n represents another choice of basis. The choice allows us
to express a vector in terms of scalars. But it is crucial to remember that a vector exists independent of its basis.
We use a certain basis to manipulate and express vectors in a convenient manner, but no one basis is best. That
is why this section, about coordinates, explores the choice of basis in detail.
10
1.3.2 Linear Maps and Matrices
If we can write vectors in terms of some basis, is it also possible to express linear maps in terms of a basis?
The answer to this question is yes, using the notion of matrices. For positive integers m, n, an m × n matrix with
entries inF is simply
a table with m rows and n columns, where each of the mn entries is an element in F. For
1 2 3
instance, 1 1 1 is a 2 × 3 matrix with entries in ℚ.
2 3
We denote by Mm×n (F) the vector space of m × n matrices with entries in F—as a vector space, it is
isomorphic to F mn , since its dimension is mn.
Now, given matrices A ∈ Ml×m (F), B ∈ Mm×n (F). We may form a matrix AB = C ∈ Ml×n (F) using a
formula called matrix multiplication. If
a11 a12 · · · a1m b11 b12 · · · b1n
a21 a22 · · · a2m b21 b22 · · · b2n
A= . , B = .. ,
.. .. .. .. .. ..
.. . . . . . . .
al1 al2 · · · alm bm1 bm2 · · · bmn
then C is the matrix
c11 c12 · · · c1n
c21 c22 · · · c2n
C= . .. ,
.. . .
.. . . .
cl1 cl2 · · · cln
where for i = 1, 2, . . . , l and j = 1, 2, . . . , n:
ci j = ai1 b1 j + ai2 b2 j + · · · + aim bm j .
3 1 5 6
1 2 3
Example 1.5. Matrices with entries in ℚ: A := 1 1 1 is a 2 × 3 matrix, and B := 4 2 3 1 is a 3 × 4 matrix.
2 3 5319
26 14 14 35
Then, AB is the 2 × 4 matrix 20 3 41 19 .
3 6 2
A final additional remark. In what follows, it helps!to regard vectors in F n as column vectors. That is, given
a1
a2
v = (a1 , a2 , . . . , an ) ∈ F n, it helps to visualize v as .. .
.
an
Theorem 1.22. Take a linear map T : V → W , and assume both V and W are finite-dimensional. Fix bases
v1 , . . . , vn and w1 , . . . , wm of V and W .
Denote by φV : V → F n the linear isomorphism that takes a given v ∈ V to the column vector (a1 , . . . , an ) ∈
F n, where v = a1 v1 + · · · + an vn (cf. Theorem 1.21), and define φW : W → F m the same way with respect
to the basis w1 , . . . , wm . There is a linear isomorphism φ from Hom(V,W ) to Mm×n (F), such that for any
T ∈ Hom(V,W ) and any v ∈ V , φw (T (v)) = φ (T ) · φv (v), where the operation · is matrix multiplication.
Here, we say that a1 , . . . , an are the coefficients of v with respect to the v1 , . . . , vn basis. The analogous
terminology applies for
Before we prove this theorem, what is it telling us? It says that with respect to a basis on both the input and
output spaces, linear maps can be represented as matrices—and scaling and adding the matrices is the same as
scaling and adding the linear maps. And, just as the linear maps act on vectors, matrices act on the coordinate
vectors (by matrix multiplication)! Matrix multiplication is precisely equivalent to the application of a linear
map on a vector, once bases are chosen.
11
Proof. Say we have a linear map T . Then, we know that for any vi in our basis, T (vi ) = a1i w1 + · · · + ami wm ,
for unique coefficients a ji , since w1 , . . . , wm are a basis. We define:
a11 a12 · · · a1n
a21 a22 · · · a2n
φ : Hom(V,W ) → Mm×n (F), φ (T ) = . .. .
.. ..
.. . . .
am1 am2 · · · amn
Here, the ith column are the coefficients of the output T (vi ). The map φ is linear, because if T, S ∈ Hom(V,W )
are linear maps, then the coefficients (with respect to the w1 , . . . , wn basis) of T (vi ) + S(vi ) = (T + S)(vi ) are the
sum of the coefficients of T (vi ) and of S(vi ). Similarly, for a ∈ F, the coefficients of aT (vi ) are the coefficients
of T (vi ) scaled by a.
Now we must prove that φW (T (v)) = φ (T ) · φV (v). In the following, we use the subscript notation ∗i to
mean the ith component of the column vector. Then, the ith component φW (T (v))i may be expressed as:
Remark. The sums and expansions in this proof are bound to get confusing, so it is worth going through the
proof carefully until it starts to make sense. The reason why this proof is a little laborious is the same reason why
matrix multiplication is tedious! It turns out that both linear maps and matrix multiplication are representing
the same process.
Corollary. If V and W are finite-dimensional vector spaces of dimension n and m, respectively, then Hom(V,W )
is a finite-dimensional vector space of dimension mn.
Definition 1.23. In the notation of Theorem 1.22, we say that φ (T ) is the matrix of the linear map T with
respect to the bases v1 , . . . , vn of V and w1 , . . . , wm of W .
Remark. Sometimes we consider linear maps T : V → V from the a vector space to itself. In this case, we
may obtain a matrix of T with respect to a single basis v1 , . . . , vn of V (just that this basis is applied to both the
domain V and the codomain V in the situation of Theorem 1.22).
Proof. Isomorphic vector spaces have the same dimension, and Mm×n (F) has dimension mn.
Here is another beautiful correspondence between linear maps and their matrix representations—simply
multiplying the matrices corresponds to composing the linear maps. For this reason, we often denote composi-
tion of linear maps simply as if it was multiplication.
12
Theorem 1.24. Say that u1 , . . . , un , v1 , . . . , vm , and w1 , . . . , wl are the bases for U,V,W . Let T be a linear map
from U to V and S a linear map from V to W . Let φUV be the isomorphism Hom(U,V ) → Mm×n (F) using
the above bases, and similarly let φVW be the isomorphism Hom(V,W ) → Ml×m (F). Then φVW (S) · φUV (T ) =
φUW (S ◦ T ), where the former · is matrix multiplication and the latter ◦ is the (abstract) composition of linear
maps.
Proof. By Theorem 1.22, φUV (T )φU (u) = φV (T (u)). Similarly, φVW (S)φV (T (u)) = φW ((S ◦ T )(u)) Substi-
tuting the first equation in the second, we get that φVW (S)φUV (T )φU (u) = φW ((S ◦ T )(u)). Remember that
φVW (S)φUV (T ) here is a matrix, and we’ve just shown that it acts just like φUW (S ◦ T ) should. After all, this
latter matrix also satisfies the equation φUW (S ◦ T )φU (u) = φW ((S ◦ T )(u)). Since φU is an isomorphism, every
vector in F n equals φU (u) for some u ∈ U, which means that the two matrices φVW (S)φUV (T ) and φUW (S ◦ T )
act the same on every column vector in F n . Hence, they are equal.
Problem 1.10 (2 Points). Complete the above proof by showing that if A, B ∈ Mm×n (F) and Av = Bv for all
v ∈ F n then A = B.
The above theorem basically says that if matrices act on vectors like linear maps do, then that implies that
matrices multiplying together is like function composition. After all, a matrix multiplying with a matrix is like
the left matrix multiplying n columns vectors on the right, so it makes sense they are roughly equivalent. Note
that it also proves that the inverse of a linear map with respect to a basis is the matrix inverse! Here is a formal
definition of matrix inverse:
Definition 1.25.!Let n ≥ 1 be a positive integer. The n × n identity matrix, (usually) denoted by I, is the n × n
1 ··· 0
matrix .. . . .. . Namely, every entry on the “main” diagonal is a 1, while every off-diagonal entry of I is a 0.
. ..
0 ··· 1
Let A ∈ Mn×n (F). Then, we say that A is invertible if there is some B ∈ Mn×n (F) such that AB = BA = I,
where I is the n × n identity matrix. Such B is usually denoted A−1 .
Sometimes I is used to denote the identity linear map I : V → V . This abuse of notation is generally OK,
because the matrix of the the identity map is exactly the n × n identity matrix regardless of choice of basis of V !
Using Theorems 1.22 and Theorem 1.24, it is not hard to see that A ∈ Mn×n (F) is invertible if and only if
it is the matrix of an invertible linear map T : V → V (where dim V = n) with respect to any basis of V .
Problem 1.11 (3 Points). Here we consider a special class of linear maps from ℝ2 to ℝ2 , rotations! They can
be expressed in the following form:
a cos θ − sin θ a
Rθ =
b sin θ cos θ b
Convince yourself that this is the same as saying that
if we choose the basis of the input and output space to be
cos θ − sin θ
the standard basis, then φ (Rθ ) = .
sin θ cos θ
(a) (2 Points) Prove that Rθ1 ◦ Rθ2 = Rθ1 +θ2 , and R−θ = R−1
θ , where the ◦ is composition of functions.
1
(b) (1 Point) Verify that Rθ acts as a rotation by θ on the vector , for any θ .
0
13
1.3.3 Change of Basis
Say we have some finite-dimensional vector space V and two bases v1 , . . . , vn and w1 , . . . , wn . Say we have a
vector (a1 , . . . , an ) with respect to the first basis. How could we transform it to the second? If we say that
vi = a1i w1 + · · · + ani wn , then the matrix
a11 a12 · · · a1n
a21 a22 · · · a2n
B= .
.. .. ..
.. . . .
an1 an2 · · · ann
maps the ith standard basis vector to (a1i , . . . , ani ). This precisely converts the basis vectors in the v1 , . . . , vn
basis to the w1 , . . . , wn basis! By linearity, this then extends to express any v = b1 v1 + · · · + bn vn in terms of
w1 , . . . , wn . So, B is the matrix that changes basis from v1 , . . . , vn to w1 , . . . , wn . To convert in the other direction,
you can simply use the matrix B−1 .
Problem 1.12 (3 Points). Find the change of basis matrix from {(1, 0), (0, 1)} to {(2, 3), (1, 4)} in ℝ2 . Hint:
This direction may be trickier than the reverse direction. You may then use the fact that
a b d −b ad − bc 0
= .
c d −c a 0 ad − bc
Now, say we have a linear map T : V → W , and we write it as the matrix M with input basis v1 , . . . , vn and
output basis w1 , . . . , wm . Say we would like to change the output basis to u1 , . . . , um . Using the basis change
matrix above, it is not too bad! Say that B is the matrix that transforms a vector in the w1 , . . . , wm basis to
u1 , . . . , um . Then, BM will take a vector of coefficients with respect to v basis, act on it with T , and then convert
it to the u basis. This is precisely matrix for T from the v basis to the u basis.
Now, say we would like to take our matrix M that has input basis v and output basis w. Then, let B be a basis
change matrix from some u′ basis to the v basis. Now, MB is the new matrix with respect to the u′ basis and w
basis.
Specifically in the case where T maps a vector space to itself, we see that if T is an operator on V with respect
to the v1 , . . . , vn basis, and B is the basis change matrix from v to w, then BMB−1 is the new matrix for T with
respect to the new basis!
We now focus on linear maps from a vector space to itself, sometimes called linear operators. In this context,
unless otherwise specified, when a linear map is written as a matrix, the input and output bases are the same.
There are two types of actions on vectors that we have introduced—multiplication by scalars, and acting on the
vector by a linear map. Sometimes, these two actions overlap heavily. For instance, there exist scalar linear
maps cI : v 7→ cv (here I is the identity map, which is linear), which act on vectors exactly the same as scalar
multiplication. Considering linear maps in Hom(V,V ) with respect to the some basis v1 , . . . , vn (the same basis
on the input and output space), there is a larger class of matrices that act a lot like scalars. Consider linear maps
14
S such that their matrix representations are:
λ1 0 · · · 0
0 λ2 · · · 0
φ (S) = .
.. . . ..
.. . . .
0 0 ··· λn
For such a linear map, we have that Svi = λi vi , so it acts like a scalar on every basis vector (though potentially
a different scalar on each). The matrix φ (S) being diagonal makes it much simpler to work with. For instance,
φ (Sn ) is just:
n
λ1 0 · · · 0
0 λn ··· 0
2
φ (S) = .
. .. . . ..
. . . .
0 0 ··· λnn
However, in general, exponentiating a matrix is far more tedious - you have to repeatedly do messy matrix
multiplication. This idea motivates a question: for every linear map, does there exist a basis with respect to
which it is diagonal? In a sense, this would be a key to understanding a linear map, a distinguished basis that
makes the map far easier to understand.
Definition 1.26. Let T be a linear map from V to V . A vector v ̸= 0 is called an eigenvector of T if T (v) = λ v
for some λ ∈ F. Then λ is called the corresponding eigenvalue to v.
An eigenvector, eigenvalue pair is a step in the right direction - after all, if S is diagonal with respect
to a basis, then each basis vector is an eigenvector, and the values along the diagonal are the corresponding
eigenvalues. For a given eigenvalue λ , the set of all vectors such that T (v) = λ v (including 0) is a subspace of
V , since it is closed under addition and scalar multiplication.
Proof. If v is an eigenvector of T with eigenvalue λ , then T (v) = λ v = λ Iv. This implies that (T − λ I)v =
T (v) − λ Iv = 0. Since v ̸= 0, this implies T − λ I has a nontrivial kernel.
Conversely, if T − λ I has a nontrivial kernel, then there exists a v ̸= 0 such that (T − λ I)v = 0, implying that
T (v) = λ Iv = λ v.
Theorem 1.28. Let T be a linear map from V to V . Say V1 , . . . ,Vm are the eigenspaces for distinct λ1 , . . . , λm .
Then V1 + · · · +Vm is a direct sum.
Proof. This result is vacuously true for m = 1. Assume by induction that the result is true for m − 1. Now, if
there are two ways of writing an element v ∈ V1 + · · · +Vm , then there is some vi ∈ Vi where v1 + · · · + vm = 0,
and at least one of the vi is nonzero, which we will assume is not v1 . Then we have that 0 = T (v1 + · · · + vm ) =
λ1 v1 + · · · + λm vm . Now subtracting λ1 0 = λ1 v1 + · · · + λ1 vm , we find that (λ2 − λ1 )v2 + · · · + (λm − λ1 )vm = 0.
This is another sum of values in V2 , . . . ,Vm , at least one of which is nonzero. By induction, however, this is
impossible!
It turns out that the number of distinct eigenvalues is actually at most the dimension of the vector space.
15
Theorem 1.29. Let T : V → V be a linear map, where V has dimension n. T has at most n eigenvalues.
Proof. Assume by contradiction that there are more than n distinct eigenvalues. Then, we know that the first
n eigenvalues correspond to eigenvectors v1 , . . . , vn . Since span(vi ) is a subspace of the eigenspace for λi , we
have that span(v1 , . . . , vn+1 ) = span(v1 ) ⊕ · · · ⊕ span(vn+1 ) (an exercise shows that if the sum of a collection of
vector spaces is a direct sum, then a sum of a collection of their subspaces is also a direct sum). This implies,
as mentioned above, that v1 , . . . , vn+1 are linearly independent. However, this is impossible, since it is longer
than the dimension, so clearly cannot be extended to a basis.
Problem 1.13 (3 Points). Prove that if V is a vector space, V1 , . . . ,Vm ⊆ V are subspaces, and Ui ⊆ Vi are
subspaces, then
V1 ⊕ · · · ⊕Vn =⇒ U1 ⊕ · · · ⊕Un .
Note: this notation for direct sum, while somewhat opaque, is standard. This problem, translated, reads: If the
sum of Vi is a direct sum, then the sum of Ui is a direct sum.
Problem 1.14 (2 Points). Prove that a linear map T : V → V has a basis of eigenvectors v1 , . . . , vn if and only if
T has a basis with respect to which it is diagonal.
Definition 1.30. The minimal polynomial of a linear map T is a polynomial p(x) with coefficients in F such
that p(T ) = 0, p is monic (the leading coefficient is 1) and p is the polynomial of minimum degree with this
property. Remember that multiplication of linear maps is composition.
We call p(x) “the” minimal polynomial because there can only be one such polynomial. If there were two,
then their difference would be a polynomial of smaller degree for which T evaluates to zero.
Theorem 1.31. Every linear operator T from on a finite-dimensional vector space V (that is, T : V → V ) of
dimension n has a minimal polynomial.
2
Proof. Consider the linear maps I, T, . . . , T n . They correspond to n2 + 1 vectors in the space Hom(V,V ), and
therefore they must be linearly dependent (since the dimension of Hom(V,V ) as a vector space is n2 ). So,
2
we have some a0 , . . . , an2 such that an2 T n + · · · + a0 I = 0. Then, we know that p(T ) = 0, where p(x) =
2
an2 xn + · · · + a0 . The existence of one polynomial that T satisfies shows that there must be a polynomial of
minimal degree, the minimal polynomial.
Theorem 1.32. A value λ ∈ F is a root of the minimal polynomial p(x) if and only if λ is an eigenvalue of T .
Proof. Assume by contradiction λ is a root of the polynomial p(x) and λ is not an eigenvalue. Because λ is
a root, p(x) is divisible by x − λ . So, p(T ) can be written as (T − λ I)q(T ). Since λ is not an eigenvalue,
then (T − λ I) is injective. If q(T ) ̸= 0, then q(T )v ̸= 0, which implies that (T − λ I)q(T )v ̸= 0. So, q(T ) ̸= 0
implies that p(T ) ̸= 0. We then have that since p(T ) = 0, q(T ) = 0. So, p(T ) is not the minimal polynomial, a
contradiction.
Conversely, assume that λ is an eigenvalue of T . Then, T − λ I has a nontrivial nullspace - say that (T −
λ I)v = 0 with v ̸= 0. We can, by polynomial division, write p(x) = q(x)(x − λ ) + r, where r ∈ F. Then,
p(T )v = q(T )(T − λ I)v + rv = rv. So, we must have that r = 0, implying that p(λ ) = 0.
16
Now, assume that F = ℂ, so we can leverage the following property, the fundamental theorem of algebra:
any polynomial with coefficients in ℂ has a root in ℂ. From this theorem, we can immediately factor the
minimal polynomial, and the roots of the minimal polynomial are the eigenvalues.
A polynomial is said to be separable if it has no repeated roots. Then, we have a convenient characterization of
diagonalizability. First, a lemma for the result.
Lemma 1.33. Say that λ1 , . . . , λn are distinct values. Then consider the polynomials pi (x) = (x − λ1 ) . . . (x −
λi−1 )(x − λi+1 ) . . . (x − λn ). There exists a linear combination ∑i ai pi (x) = 1.
Proof. We proceed by induction. For the base case n = 1, p1 (x) = 1, so the statement is clearly true. Now,
1
assume the statement is true for n. We prove it for n + 1. Note that λn −λ 1
(pn+1 (x) − pi (x)) = (x − λ1 )(x −
λ2 ) . . . (x − λi−1 )(x − λi+1 ) . . . (x − λn ). So these terms enumerate the pi for the n case, and we know a linear
combination of these terms yields 1. So, a linear combination of p1 through pn+1 can yield 1 as well, once the
coefficients are expanded!
Proof. First, say T is diagonalizable. Say φ is the map that takes T to its matrix in its diagonal basis. Then,
consider φ (Πi (T − λi I)), where each distinct eigenvalue is counted once. Every term along the diagonal of this
matrix is seen to be zero.
So, φ (Πi (T − λi I)) = 0, implying that for p(x) = Πi (x − λi ), p(T ) = 0. Since we proved that each x − λi
must also be a factor of the minimal polynomial, this proves that p(x), the minimal polynomial, is separable!
Say that p(x) = Πi (x − λi ) is the minimal polynomial, where each λi is distinct (so it is separable). Then we
would like to prove that if V1 , . . . ,Vn are the eigenspaces for λ1 , . . . , λn , then V1 ⊕ · · · ⊕ Vn = V . Then, by 1.5,
we would have that T is diagonalizable. To prove it, take a vector v ∈ V . Then, note that we can write as
in the above lemma 1 = ∑i ai pi (x). Then, v = Iv = ∑i ai pi (T )v, and note that since (T − λi I)pi (T ) = p(T ),
(T − λi I)(ai pi (T )v) = ai p(T )v = 0. So, since ai pi (T )v is in the null space of T − λ I, ai pi (T )v ∈ Vi , so we have
expressed an arbitrary element in V as a sum of the Vi .
Using this theorem, we can see that there exist linear maps that cannot possibly be diagonalized. Consider
the linear map T : ℂ2 → ℂ2 whose matrix with respect to the standard basis is:
1 1
0 1
The minimal polynomial of this matrix clearly cannot be degree 1, since it is not a multiple of the identity, and
we can calculate a degree 2 polynomial (which is therefore the minimal polynomial) to be p(x) = x2 − 2x + 1 =
(x − 1)2 . This polynomial is not separable, so this linear map cannot be diagonalizable.
In ℝ2 , there are many possible bases. As you can verify, {(1, 0), (0, 1)}, {(1, 0), (1, 1)}, and {(2, 1), (1, 1)} are
all bases of ℝ2 . However, when we think intuitively of ℝ2 , one basis seems special: the standard basis. Or at
least, when we draw axes for ℝ2 , we ensure that they are perpendicular. In this section, we introduce a structure
on vector spaces, making them “inner product” spaces, such that notions like perpendicular make sense. In
17
order to even understand “orthogonality,” we need our objects to have a notion of angles. In this section, we
assume that F is either ℝ or ℂ, and use a∗ to denote the complex conjugate of a.
Definition 1.35 (Inner Products). An inner product ⟨·, ·⟩ on a vector space V is an operation that takes two
vectors in V and outputs a scalar in F. It obeys the following properties:
Note that any orthogonal set of vectors can easily be made normal,
by taking eachvector vi and redefining
it as v′i = √ vi . Then, ⟨v′i , v′j ⟩ = √ 1
⟨vi , v j ⟩ = 0, ⟨v′i , v′i ⟩ = √ vi , √ vi =√ 1
2 ⟨vi , vi ⟩ = 1.
⟨vi ,vi ⟩ ⟨vi ,vi ⟩⟨v j ,v j ⟩ ⟨vi ,vi ⟩ ⟨vi ,vi ⟩ ⟨vi ,vi ⟩
Proof. Say that 0 = a1 v1 + · · · + an vn . Taking the inner product on either side with vi , we get that 0 = ⟨0, vi ⟩ =
⟨a1 v1 + · · · + an vn , vi ⟩ = ∑ j a j ⟨v j , vi ⟩ = ai . So, every ai = 0, proving that v1 , . . . , vn must be linearly independent.
Theorem 1.38 (Gram-Schmidt Procedure). For any basis v1 , . . . , vn , there exists an orthonormal w1 , . . . , wn such
that span(v1 , . . . , vi ) = span(w1 , . . . , wi ) for every 1 ≤ i ≤ n.
Proof. We iteratively define w1 , . . . , wn , such that the subsequences w1 , . . . , wi are orthonormal and the same
span as v1 , . . . , vi . Firstly, w1 = √ v1 is a normalized vector, and so is orthonormal. Now, define inductively
⟨v1 ,v1 ⟩
w′i = vi − ⟨vi , w1 ⟩w1 − · · · − ⟨vi , wi−1 ⟩wi−1 . w1 , . . . , wi−1 , vi and w1 , . . . , wi−1 , w′i have the same span since vi is in
the latter span and w′i is in the former span. This proves that v1 , . . . , vi and w1 , . . . , wi−1 , w′i .
Now, note that ⟨w′i , w j ⟩, when we expand the definition of w′i and expand by linearity, gives ⟨vi , w j ⟩−⟨vi , w j ⟩⟨w j , w j ⟩ =
w′
0. Now since, w′i is nonzero, we can define wi = √ ′i ′ to give the new orthonormal basis vector. It satisfies
⟨wi ,wi ⟩
all the conditions necessary, and completes this iterative step.
18
Problem 1.15 (7 Points).
(a) (2 Points) Let v1 , . . . , vn be an orthonormal basis for V . For v ∈ V , show that v = ∑i ⟨v, vi ⟩vi . Orthonormal
bases are very well behaved!
(b) (1 Point) Let v1 , . . . , vn be an orthonormal basis for V . If v = a1 v1 + · · · + an vn , prove that ⟨v, v⟩ = a21 +
· · · + a2n (this is the Pythagorean theorem).
(c) (4 Points) Say B is the basis change matrix from an orthonormal basis v1 , . . . , vn in V to another basis
w1 , . . . , wn in V . Prove that B’s rows are orthonormal vectors in F n and B’s columns are orthonormal
vectors in F n , where the inner product on F n is as defined in Example 1.6.
19
2 Group Theory
We will now be introducing ourselves to the world of group theory. Groups are the abstract way in which
mathematicians describe symmetries of arbitrary objects. The birth of the concept came from a French mathe-
matician named Evariste Galois who wanted
√ to understand polynomial equations. He made observations such
as the following. Suppose 3 + 2i (i = −1) is a solution to some polynomial equation f (x) = 0, where f (x) is
a polynomial with real coefficients. Then, one may deduce that 3 − 2i is another solution to f (x) (you probably
know how to prove this already from high school algebra!). Galois tried understanding this phenomenon of
transforming known solutions into new ones and exploited these properties to understand polynomials. It was
many years before this idea was molded into what is presently known as a group, but group theory has since
become one of the deepest and most prevalent parts of modern mathematics.
We will not be dealing so much with how groups deal with polynomial equations, but instead arrive at
the notion of a group through a separate route of motivation that is geometrically inspired. We will start this
journey by studying group actions, which amount to the transformations of an object or how we can change an
object without altering its main properties. We will then see that seemingly different objects actually have the
same set of transformations. This will lead us into describing a transformation abstractly, and lead us to the
definition of a group.
In this section, Problems 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.14 do not require proof. All other Problems
require proof.
We begin our exploration by studying arguably the simplest geometric object: the equilateral triangle. The
equilateral triangle is pretty “symmetric,” for instance, you can rotate an equilateral triangle by 120◦ around
its center and you get back an equilateral triangle. This doesn’t work for scalene triangles though, so there’s
something special to explore. Let’s try and quantify the different symmetries of the triangle. Let’s start by
1
labelling the vertices of the equilateral triangle 1, 2, 3 like this: △ . This is also shown by the leftmost triangle
2 3
in the diagram below. We call this the original or initial labelling of the triangle’s vertices. When we rotate
counterclockwise by 120◦ , the shape we get is still a triangle, but it’s labeled differently. Similarly, if we rotate
the original triangle counterclockwise by 240◦ , we end up with yet another labelling.
20
Are these all the symmetries of our triangle? What if we rotated clockwise instead?
As you can see, we don’t get anything new when we rotate clockwise, they’re just duplicates of our old
rotations. We will now call the labelling of our triangle rotated 120◦ counterclockwise as R120 and 240◦ coun-
terclockwise as R240 . Note that going forward, whenever we say rotation, we always mean counterclockwise,
it’s just what mathematicians have grown to prefer.
We have not yet completed the list of symmetries, however, since we can also “flip” our triangles:
These labellings are different than what we had before. We will call these new labellings FA , FB , and FC ,
where FA corresponds to flipping across the vertical perpendicular bisector, FB corresponds to the perpendicular
bisector which bisects the right side of the triangle, and FC the perpendicular bisector which bisects the left side
of the triangle.
Now, we have constructed 5 new labellings starting from our original one: R120 , R240 , FA , FB , FC . Can we
transform our triangles in other ways? Up to now, we have only applied single transformations to our triangle,
what if we do multiple in sequence? For example, if we rotate the original triangle by 120◦ , we get R120 , but if
we rotate R120 by 120◦ again, we get R240 . Of course, rotating twice by 120◦ just rotates by 240◦ . But, we can
do more interesting sequences of transformations, like what if we rotate and then flip?
21
Problem 2.3 (1 Point). Draw the resulting labelling of the triangle after applying the following sequences of
transformations:
(a) A 120◦ counterclockwise rotation followed by a flip across the vertical perpendicular bisector.
(b) A flip across the vertical perpendicular bisector followed by a 120◦ counterclockwise rotation.
(c) A 120◦ counterclockwise rotation followed by a flip across the vertical perpendicular bisector followed
by a 240◦ counterclockwise rotation.
As you can see, we never get any labellings that we haven’t gotten before. However, we do sometimes apply
multiple transformations and get back to where we started, so we should give our initial labelling a special name
as well, let’s call it I for “initial.” It’s the labelling that results when we “do nothing” to our original triangle.
Although we the process of doing multiple transformations hasn’t given us any new symmetries, it does
tell us that the symmetries interact with one another. To put it more formally, we shouldn’t think about our
labellings as just labellings, but rather as the transformations we did to get there. For instance, FA now doesn’t
represent the labelling of the triangle’s vertices after flipping, but instead the action of flipping the triangle
itself! From this point of view, it makes sense to compose such transformations. For example, performing FA
and then FB gives us the same labelling as R240 . We write this as R240 = FB ◦ FA . The reason for the seemingly
backwards notation is that it’s really function composition. We can think of FA , FB as functions, and we may
write stuff such as FA (△) to show that FA is acting on a triangle. Then, we’re saying R240 (△) = FB (FA (△)).
Problem 2.4 (1 Point). Write the following compositions of transformations as a single transformation
(a) FB ◦ FC .
(b) FC ◦ R120 ◦ FA .
(c) FA ◦ R120 ◦ I.
Now, what if we wanted to compose 5 transformations and see the result? It would be tedious to do
manually with a triangle, but luckily mathematicians have found a clever way around this by treating func-
tion composition as a sort of multiplication on the transformations. To see how this is used, we’ll start by
constructing a multiplication table, and then use that to compose many transformations. A word of caution:
as the examples above showed, this multiplication is not commutative unlike normal multiplication, in other
words, the order in which functions are composed matters. For instance, notice that FA ◦ R120 = FB , whereas
R120 ◦ FA = FC .
Problem 2.5 (3 Points). Complete the multiplication table. If the column has transformation S and the row has
transformation T , the entry in that box should be S ◦ T . No proof is needed.
I R120 R240 FA FB FC
I
R120 R240 FC
R240
FA R240
FB I
FC
22
Problem 2.6 (1 Point). Write the following composition of transformations as a single transformation It may
be helpful to use the table from the previous problem.
An interesting thing to note is that even though we physically perform a sequence of transformations
starting from the rightmost and working left, it actually doesn’t matter what order we compose in. This is
because transformations are functions, and function composition is associative: the multiplication operations
can be carried out in any order. This can simplify many things, for example, R240 ◦ (R120 ◦ FA ) = (R240 ◦ R120 ) ◦
FA = I ◦ FA = FA (you should do the calculation the other way and verify it’s still FA ).
An aside: in abstract algebra, and other branches of higher mathematics, associativity is actually a much
more natural condition than commutativity when defining mathematical structures.
We now have a more or less complete understanding of the triangle: we know all possible transformations,
and we can easily compute what will happen if we compose them in different orders.
Now we analyze the square □, which is also a very symmetric shape. There are the 90◦ , 180◦ , and 270◦
rotational symmetries given by R90 , R180 , R270 (remember, we always rotate counter-clockwise). We can also
flip the square, either vertically, horizontally, or across the diagonals. These are denoted as FA for flipping
across the vertical perpendicular bisector, FB for flipping across the horizontal perpendicular bisector, FC for
flipping across the diagonal connecting the bottom left and top right vertices, and FD for flipping across the
diagonal connecting the top left and bottom right vertices. There is also I, the “do nothing” symmetry. Let’s try
and understand how these compose. Here is a partially completed 8 × 8 multiplication table for the symmetries
of the square, with the same conventions as Problem 2.5. If you would like, you can try filling in some more
cells to this multiplication table.
We’ll take some time to extract important patterns that show up in the multiplication table of Problem 2.5 and in the one for the square above. Notice how every row and every column contains exactly one I. The interpretation of
this is that every transformation has an “opposite” or inverse transformation such that composing the two will
result in doing nothing.
Problem 2.7 (3 Points). Give a general rule for the inverse for rotations R∗ and flips F∗ .
Another important thing to notice is that the row to the right of I is the same as the input row above it, and
the column right below I is the same as the input column to its left. Moreover, this doesn’t happen for any other row or column; they all change the inputs. The element I plays a very important role: it is analogous to multiplying by the number 1, and inverse transformations are analogous to multiplicative inverses. This element I will be
called the identity element. Also notice how every transformation appears in every row exactly once, and every
transformation appears in every column exactly once.
Let’s see how the ideas we developed for the equilateral triangle and the square may be generalized to
arbitrary regular polygons. Let Pn denote the regular polygon with n sides, with n ≥ 3. Take a copy of the
regular n-gon Pn in ℝ3 such that its circumcenter lies at the origin of ℝ3 . A symmetry transformation of the
regular n-gon is a bijective1 map of Pn onto itself which is effected by a rotation of ℝ3 with respect to an axis
passing through 0 ∈ ℝ3 . This definition clearly does not depend on the “embedding” of Pn in ℝ3 .
Definition 2.1. The collection of transformations of a regular n-gon will be denoted Dn , the dihedral group
with 2n elements. Transformations which preserve orientations are called rotations, and transformations which
reverse orientations are called reflections or flips.
In particular, under this definition, the identity transformation I of the regular n-gon is a rotation.
In the following problem, the identity (do nothing transformation) is denoted I and T denotes an arbitrary
transformation in Dn :
The inverse of a transformation T will henceforth be denoted T −1 . You may have noticed that there are
some “simple” transformations which you can apply repeatedly to get more complicated transformations. Let’s
formalize this.
1 Recall that a map f : S → T is called injective if for any x, y ∈ S, f (x) = f (y) implies that x = y, is called surjective if the image of
f , i.e. the set { f (x) | x ∈ S}, is all of T , and is called bijective if it is injective and surjective.
Problem 2.10 (3 Points). Show that there are two transformations R, F ∈ Dn where R is a rotation and F is a flip
such that for any T ∈ Dn , we can write T = F^j ◦ R^i for some 0 ≤ i ≤ n − 1, j ∈ {0, 1}, where R^i = R ◦ R ◦ · · · ◦ R with i factors of R in total (and R^0 = F^0 = I). Furthermore, show that T is a flip if and only if j = 1.
Is there only one possibility for the initial choice of transformations R and F?
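The claim in Problem 2.10 can be explored numerically before proving it. Below is a hedged sketch (one possible encoding, not a proof): the vertices of the n-gon are labelled 0, …, n − 1, R is the basic rotation, F a fixed flip, and the 2n products F^j ◦ R^i are checked to be pairwise distinct for a small n.

```python
from itertools import product

n = 7                                            # any n >= 3 works for this check

R = tuple((i + 1) % n for i in range(n))         # rotation: vertex i -> i + 1 (mod n)
F = tuple((-i) % n for i in range(n))            # a flip: vertex i -> -i (mod n)

def compose(s, t):
    """s ∘ t as permutations of {0, ..., n-1}: apply t first, then s."""
    return tuple(s[t[i]] for i in range(n))

def power(p, k):
    q = tuple(range(n))                          # the identity permutation
    for _ in range(k):
        q = compose(p, q)
    return q

elements = {compose(power(F, j), power(R, i)) for j, i in product(range(2), range(n))}
assert len(elements) == 2 * n                    # the 2n products F^j ∘ R^i are all distinct
```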
We remarked earlier that multiplications (compositions) do not generally commute. Let’s try and under-
stand better when they do.
In this last part, we can see that we get some utility out of an abstract way to represent transformations. We
concluded something geometric (whether it matters what order you flip something) using algebraic manipula-
tions!
Let Cn be the subset of all rotations in Dn . We call Cn the rotation subgroup of Dn . Observe that the set Cn
is closed under the operations of composition and inverse. Now, for any flip F ∈ Dn , the subset {I, F} of Dn is
called a reflection subgroup of Dn . There are n such reflection subgroups. Each reflection subgroup is closed
under the operations of composition and inverse. We will eventually formalize the idea of a subgroup.
In the following section, we fix the following notation. For any positive integer n, set n = {1, 2, . . . , n}.
We now explore our next incarnation of transformations. Suppose you’re playing the game where someone
has 3 cups upside down with a die underneath one and you need to guess which one has the die after they’re
shuffled. The total number of cups hasn’t changed, but they have been moved around, i.e. they have been
permuted. We could try detecting this by labelling every cup with a number and seeing where each label ends
up. This motivates the following.
You can think of the expression p(a) = b as meaning the object labeled a has been moved to where b used
to be.
Definition 2.3 (Orbit). For a given a ∈ n and p ∈ Sn , the set {p^k (a) | k ∈ ℤ≥0 } is denoted Orb p (a) and called the orbit of a under p.
In other words, the orbit of a is the set of all possible places a can end up after applying p repeatedly (note that p^0 is the identity map, so that p^0 (a) = a).
This problem gives us a very nice way to represent permutations. It would be very tedious if every time we
wanted to talk about a permutation p we had to specify p(1), p(2), . . . , p(n). Instead, we can use what is called
the cycle decomposition of a permutation:
Definition 2.4. Let A = {a1 , . . . , am } be an orbit for p ∈ Sn such that p(ai ) = ai+1 for i = 1, 2, . . . , m − 1 and p(am ) = a1 . Then C := (a1 a2 · · · am ) is called the cycle corresponding to A. We say that C is an m-cycle; here m = |A| is the length of the cycle C. If A1 , . . . , Ak are the orbits for p, the cycle decomposition for p is denoted by the concatenation of the cycles C1 , . . . ,Ck corresponding to the respective orbits A1 , . . . , Ak . If the orbit Ai = {a} is a singleton set, the corresponding cycle is sometimes omitted. The cycle type of p is the list of the cardinalities |A1 |, |A2 |, . . . , |Ak |, i.e. the list of the lengths of the cycles of p. The cycle type is usually listed in non-increasing order.
A transposition (or more informally, a swap) is the simplest type of non-trivial permutation whose cycle
decomposition consists of just one cycle of length 2, and otherwise singletons. In other words, a transposition
swaps two elements and leaves all other elements fixed.
We have seen how to recover the cycle decomposition of any permutation p ∈ Sn . Conversely, let C1 , . . . ,Ck
be mutually disjoint cycles with elements in n. By mutually disjoint, we mean the following: if Ci = (a1 a2 · · · al ), C j =
(b1 b2 · · · bm ) for some 1 ≤ i < j ≤ k, then as ̸= bt for all 1 ≤ s ≤ l, 1 ≤ t ≤ m. Then, the cycle product C1 C2 . . .Ck
defines a permutation in Sn . Sometimes we use the term m-cycle to refer to a permutation in Sn whose cycle
decomposition consists of one cycle of length m and all other cycles of length 1. In particular, a 1-cycle is just
the identity map id : n → n.
Example 2.1. The permutation p = (1 3 6)(2 4) ∈ S7 is given explicitly by p(1) = 3, p(2) = 4, p(3) = 6, p(4) =
2, p(5) = 5, p(6) = 1, p(7) = 7. The cycle type of p is 3, 2, 1, 1. Note that the cycle decomposition (1 3 6)(2 4) could represent a valid element of Sn for any n ≥ 6, and (1 3 6)(2 4) ∈ Sn would fix any element i ≥ 7.
As another example, the permutation q = (2 3) ∈ S4 is a transposition, given explicitly by q(1) = 1, q(2) =
3, q(3) = 2, q(4) = 4.
Note that (2 4)(3 6 1) is also a valid cycle decomposition of the permutation p ∈ S7 defined in the above
example. So is (6 1 3)(4 2). This highlights the following property: the cycle decomposition of a permutation
p ∈ Sn is unique up to (1) permuting cycles and (2) cyclically permuting the elements of each cycle. The proof
of this fact is omitted from the power round.
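As a small computational aid, the sketch below (our own helper, in Python) computes the cycle decomposition of a permutation given as an explicit map on {1, …, n} by tracing orbits; it reproduces the decomposition of the permutation p from Example 2.1.

```python
def cycle_decomposition(p):
    """Return the cycles of the permutation p (a dict a -> p(a)), singleton cycles included."""
    cycles, seen = [], set()
    for start in p:
        if start in seen:
            continue
        cycle, a = [], start
        while a not in seen:                     # trace the orbit of `start` under p
            seen.add(a)
            cycle.append(a)
            a = p[a]
        cycles.append(tuple(cycle))
    return cycles

# The permutation p = (1 3 6)(2 4) in S7 from Example 2.1, written out explicitly.
p = {1: 3, 2: 4, 3: 6, 4: 2, 5: 5, 6: 1, 7: 7}
print(cycle_decomposition(p))                    # [(1, 3, 6), (2, 4), (5,), (7,)]
```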
We shall now consider how to compose permutations. Cycles can be multiplied, where multiplication is
just given by the usual function composition. Multiplication is typically denoted by concatenating cycles. In keeping with the usual convention for function composition, the convention is that the cycles on the right “act” first. For example, (3 5)(1 2)(2 3) denotes the composition (3 5) ◦ (1 2) ◦ (2 3) of the three transpositions
(2 3), (1 2), and (3 5) (recall that function composition is associative). In particular, note that (3 5)(1 2)(2 3)
and (1 2 5 3) are the same permutation of n (n ≥ 5). Now, let us see how permutations can be multiplied via
cycle decomposition. Suppose p ∈ Sn has cycle decomposition C1 C2 . . .Ck . Then, interpreting each cycle Ci
as a permutation of n, we have in fact p = C1 ◦ C2 ◦ · · · ◦ Ck as elements of Sn . Equality here does not depend
on the choice of cycle decomposition of p (i.e. does not depend on the order in which we write the cycles of
p). In particular, if q ∈ Sn is another permutation with cycle decomposition C1′ C2′ . . .Cl′ , then the permutation
q ◦ p ∈ Sn is the product of k + l cycles C1′ C2′ . . .Cl′ C1 C2 . . .Ck .
Example 2.2. Let p = (1 3 6)(2 4), q1 = (2 4), q2 = (1 3 6), all permutations in S7 . Then, p = q2 ◦ q1 =
q1 ◦ q2 . Thus, (1 3 6)(2 4) may either be viewed as a permutation in S7 , or the product of the two permutations
(1 3 6), (2 4) in S7 .
Another example: let p = (2 3 6)(7 1 5 8), q = (1 3 5 6 7 8)(2 4). Then, q ◦ p = (1 6 4 2 5)(3 7).
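The second computation above can be checked mechanically. In the sketch below (hedged: our own helper names), cycles are converted to explicit maps and composed with the rightmost factor acting first, recovering q ◦ p = (1 6 4 2 5)(3 7).

```python
def perm_from_cycles(cycles, n):
    """Build the map on {1, ..., n} given by a product of disjoint cycles."""
    p = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):  # each entry maps to the next; the last wraps around
            p[a] = b
    return p

def compose(q, p):
    """q ∘ p: apply p first, then q (the cycles on the right act first)."""
    return {a: q[p[a]] for a in p}

p = perm_from_cycles([(2, 3, 6), (7, 1, 5, 8)], 8)
q = perm_from_cycles([(1, 3, 5, 6, 7, 8), (2, 4)], 8)
print(compose(q, p))      # {1: 6, 2: 5, 3: 7, 4: 2, 5: 1, 6: 4, 7: 3, 8: 8}
# Tracing these values gives the cycles (1 6 4 2 5)(3 7), matching Example 2.2.
```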
Problem 2.14 (3 Points). Simplify the following permutations into their cycle decomposition:
(a) (1 point) (1 3 4)(2 3 1)
(b) (1 point) (1 2 4)(4 5 3)(5 4 3)(4 2 1)
(c) (1 point) τ ◦ σ , where σ = (6 1 3)(4 2) and τ = (1 2 5 6 3 7)
There are some very close similarities between the structure of the symmetric group Sn and the set of transformations Dn explored earlier. Composition of permutations is associative. There is a “do nothing” permutation of n, namely, the identity map id : n → n. Now we explore inverses. Recall that for any permutation p ∈ Sn there is an inverse permutation p−1 ∈ Sn .
Problem 2.15 (3 Points). Given a cycle decomposition for a permutation p, describe the cycle decomposition
for its inverse (that is, a permutation q such that p ◦ q = q ◦ p = I where I is the permutation that fixes every
element).
The symmetric groups give an example of a subgroup called the alternating group which is more subtle than
the rotation and reflection subgroups we encountered earlier. Similar to how we simplified cycles in the previous
problem, we can also write permutations as products of the simplest cycles, transpositions. Any permutation
p ∈ Sn may be written as a product of finitely many transpositions. For example, p = (1 5 2 3 8)(4 6 11 10)(7 9)
may be written as the product of 8 transpositions:
p = (1 5)(5 2)(2 3)(3 8)(4 6)(6 11)(11 10)(7 9),
while (4 3 9 7)(6 2 8) may be written as the product of 5 transpositions. Unlike cycle decomposition, this
product is not unique; for example, (2 3 1) = (1 2)(2 3) = (1 3)(2 1). However, as the facts below assert, it is unique up to the parity of the number of transpositions.
Now we state some facts without proof. Let p ∈ Sn be a permutation. If p can be written as the product
of an even number of transpositions, then any way to express the permutation as a product of transpositions
will also involve an even number of transpositions. In this case, p is called an even permutation. Any other
permutation is called an odd permutation.
Furthermore, if p, q ∈ Sn are permutations, each of which may be written as the product of an even number
of transpositions, the composition p ◦ q ∈ Sn and the inverse p−1 ∈ Sn have the same property. You are welcome
to verify these facts if you like (the proof is combinatorial in nature).
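To experiment with parity, the sketch below (Python, for a permutation already written in cycle form) expands each cycle into transpositions exactly as in the factorization displayed above and reports the parity of the count.

```python
def transpositions(cycles):
    """Expand each cycle (a1 a2 ... am) into the m - 1 transpositions (a1 a2)(a2 a3)...(a_{m-1} a_m)."""
    return [(c[i], c[i + 1]) for c in cycles for i in range(len(c) - 1)]

def parity(cycles):
    """'even' or 'odd', read off from the number of transpositions in this particular expansion."""
    return 'even' if len(transpositions(cycles)) % 2 == 0 else 'odd'

p_cycles = [(1, 5, 2, 3, 8), (4, 6, 11, 10), (7, 9)]
print(len(transpositions(p_cycles)), parity(p_cycles))     # 8 even, matching the text
print(parity([(4, 3, 9, 7), (6, 2, 8)]))                   # 'odd' (5 transpositions)
```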
The collection of all permutations in Sn which may be written as the product of an even number of trans-
positions is called the alternating group An and is a subgroup of Sn , a concept that will be formalized later. For
now, think of it as being analogous to the subgroup of rotations Cn of Dn in the sense that it is a “subset of the
transformations that satisfy some additional property.”
Here, we also have a concept “dual” to a subgroup, namely a quotient group. By the facts above, every element of Sn can be written as a product of either an even number or an odd number of transpositions, but not both. We now explore this
idea further:
The punchline is that we can create a “multiplication table” with the elements being the sets An (the even permutations) and Bn (the odd permutations, so that Bn = Sn \ An ) as follows:

        |  An    Bn
  An    |  An    Bn
  Bn    |  Bn    An
The “multiplication of sets” is as follows: the product of two sets X,Y (where X,Y are each either An or Bn )
is found by taking an element of each set, i.e. x ∈ X, y ∈ Y , and identifying which of the two sets An , Bn the
product x ◦ y lies in. The previous problem shows that the above multiplication table is indeed the correct
one. Of course, this multiplication makes sense only if the choices of elements x, y do not matter. That is, if we
picked some other x′ ∈ X, y′ ∈ Y , then the product x′ ◦ y′ lies in An if and only if x ◦ y lies in An . This construction
is an example of a quotient group, a concept which will be formalized later.
To motivate the abstract definition to come, we will see that some of the groups we have encountered in two
different contexts are really the same. Namely, there are bijections which “preserve composition laws”.
Problem 2.17 (4 Points). Show that there is a bijection ϕ : D3 → S3 such that ϕ(a ◦ b) = ϕ(a) ◦ ϕ(b).
Similar to how we defined Dn , we can also define transformation groups of three-dimensional regular
polyhedra as being the transformations which preserve the shape, but may change the labelling of the vertices.
Rigorously speaking: take a copy of any regular polyhedron P in ℝ3 such that its circumcenter lies at the
origin of ℝ3 . A (rigid) symmetry transformation of P is a bijective map of P onto itself which is effected by a
rotation of ℝ3 with respect to an axis passing through 0 ∈ ℝ3 . As usual, this definition does not depend on the
“embedding” of P in ℝ3 . Any symmetry transformation of P determines a permutation of the vertices of P (or
the edges of P, or the faces of P). Furthermore, if S, T are any two symmetry transformations of P, then S ◦ T
is again a symmetry transformation of P.
The transformations which can be obtained as a rotation (as opposed to a flip) are called orientation pre-
serving.
Problem 2.18 (4 Points). Let G be the set of symmetry transformations of a cube. Show that there is a bijection
ϕ : G → S4 such that ϕ(a ◦ b) = ϕ(a) ◦ ϕ(b) for all symmetry transformations a, b ∈ G.
Problem 2.19 (5 Points). Let G be the group of symmetry transformations of a dodecahedron. Show that there
is a bijection ϕ : G → A5 such that ϕ(a ◦ b) = ϕ(a) ◦ ϕ(b) for all a, b ∈ G. Hint: consider the following figure:
As the above problems show, the collections of symmetries of two seemingly different objects can actually turn out to be the same. This is known as an isomorphism, which will be formalized later. In modern mathematical
fashion, we now abstract out the important aspects of what symmetries are. We use this abstract notion to derive
powerful statements about the transformations. The important aspects are: the existence of a “do nothing”
transformation, a composition law, and the existence of inverses. Using these aspects, we are finally ready to
give the “official” formal definition of a group.
Definition 2.5 (Group). A group is a nonempty set G with a distinguished identity element, often denoted e ∈ G, and a binary operation · : G × G → G, (g, h) ↦ g · h called the multiplication map (i.e., g · h is the product of g and h in that order), satisfying the following axioms:
• Associativity: for all x, y, z ∈ G, we have (x · y) · z = x · (y · z).
• Identity: for all x ∈ G, we have x · e = e · x = x.
• Inverse: for all x ∈ G, there exists y ∈ G such that x · y = y · x = e. Such a y is usually denoted x−1 .
The data of a group will be written as (G, ·, e), or simply G for brevity.
If in addition we have
• Commutativity: for all x, y ∈ G, x · y = y · x,
then the group G is called commutative or abelian. Otherwise, the group is called non-commutative or non-abelian.
The cardinality of the set G, denoted by |G|, is called the order of the group G. We say that G is a finite
group if |G| is finite.
Notes on notation:
• The identity element may also be written as I, 1, 0, e depending on the context.
• The multiplication may also be written as x × y, x ◦ y, x + y or simply as xy.
• Usually 0 and + are used in conjunction when working with an abelian group, and in this case the “multiplication map” is usually called addition.
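To see the axioms in action, here is a minimal sketch (our own helper, in Python) that checks associativity, identity, and inverses for a finite group presented by its multiplication table, using ℤ/4ℤ under addition as a test case.

```python
from itertools import product

def is_group(elements, mul, e):
    """Check closure and the group axioms for a finite set with multiplication table `mul`."""
    closed = all(mul[(x, y)] in elements for x, y in product(elements, repeat=2))
    assoc  = all(mul[(mul[(x, y)], z)] == mul[(x, mul[(y, z)])]
                 for x, y, z in product(elements, repeat=3))
    ident  = all(mul[(e, x)] == x == mul[(x, e)] for x in elements)
    inv    = all(any(mul[(x, y)] == e == mul[(y, x)] for y in elements) for x in elements)
    return closed and assoc and ident and inv

els = set(range(4))                                   # Z/4Z under addition mod 4
table = {(x, y): (x + y) % 4 for x, y in product(els, repeat=2)}
print(is_group(els, table, 0))                        # True
```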
To explain the definition, a group can be thought of as a collection of symmetries with the multiplication map
acting as composition. The identity element is the “do nothing” symmetry while the inverse is performing the
opposite symmetry. Associativity says that it doesn’t matter how we group a sequence of compositions; we will end up with the same result either way. The groups that we have encountered so far have not been abelian, but we
will later encounter abelian groups.
The advantage of making this abstract definition is that it allows us to talk about a group as an independent
object separate from any set that it is acting on. Hence, we can make overarching statements that apply to many, sometimes even all, groups and get a more holistic understanding of them. The remainder of this section will
deal with abstract properties of groups, and later sections will deal with understanding how they act on different
objects.
Now let’s explore some basic properties:
Example 2.3. Any singleton set may be viewed as a group, called the trivial group.
The integers under addition (ℤ, +, 0) is an abelian group. The identity element is 0 ∈ ℤ, and the inverse of
any n ∈ ℤ is just −n ∈ ℤ.
Let ℚ× be the set of non-zero rational numbers. Then, the non-zero rational numbers under multiplication
(ℚ× , ·, 1) is an abelian group. The identity element is 1 ∈ ℚ× , and the inverse of a rational number p/q ∈ ℚ× , where p, q are coprime integers with p, q ̸= 0, is q/p ∈ ℚ× .
The set Dn of symmetry transformations of the regular polygon with n sides under the composition op-
eration ◦ is a group, called the dihedral group with 2n elements (cf. Section 2.2). Its identity element is I,
the “do nothing” transformation, or identity transformation. You have already seen in Problem 2.10 that every
transformation T ∈ Dn has an inverse. The group Dn is non-abelian.
Similarly, we have seen that the set Sn of all permutations of n = {1, 2, . . . , n} under the function composi-
tion operation ◦ is a non-abelian group, called the symmetric group on n elements (cf. Section 2.3).
Definition 2.6. A nonempty subset H of a group (G, ·, e) is a subgroup of G if it is closed under multiplication
and inverse. That is, for all x, y ∈ H, we have x · y, x−1 ∈ H (thus, we may speak of the group (H, ·, e)). We use
the notation H ≤ G to say that H is a subgroup of G. We say that H is a proper subgroup of G if H is a proper
subset of G (as sets).
A left coset of H in G is the set gH = {gh | h ∈ H} where g ∈ G is fixed. Similarly, a right coset of H in G
is Hg = {hg | h ∈ H} where g ∈ G is fixed. In both cases, g is called a representative for the coset.
Example 2.4. For any group (G, ·, e), the subset {e} is always a subgroup of G, called the trivial subgroup.
The subset Cn of all rotations in Dn is a proper subgroup of Dn . For any flip F ∈ Dn , the subset {I, F} of
Dn is a proper subgroup of Dn . We call Cn the rotation subgroup of Dn , and call {I, F} a reflection subgroup of
Dn (cf. Section 2.2).
The set An of all permutations in Sn which may be written as the product of an even number of transpo-
sitions is a proper subgroup of Sn . The subgroup An is called the alternating group on n elements (cf. Section
2.3). The set Bn := Sn \ An is a coset of An by Problem 2.16: for any fixed σ ∈ Bn , we can write Bn = σ ◦ An .
Definition 2.7 (Index of a subgroup). Let G/H be the set of left cosets of H in G (thus, G/H is a set of subsets
of G). The index of H in G, denoted by [G : H], is defined to be the cardinality of the set G/H.
Hence, Lagrange’s theorem implies that when H ≤ G is a subgroup and G is finite, then [G : H] = |G|/|H|. Lagrange’s theorem is a fundamental fact that we will use repeatedly.
Example 2.5. For any n ≥ 3, the index of Cn in Dn is 2.
Remark. In the above definition, we could have replaced the word “left” with “right” and obtained the same def-
inition of index. It is somewhat tricky to prove that these two definitions coincide, especially since Lagrange’s
theorem only addresses the case that G is finite!
The group ℤ/mℤ (sometimes simply denoted Zm ) is called the integers under addition modulo m. It is also
called the cyclic group of order m. You have probably already seen this structure in elementary number theory.
The group ℤ/mℤ is of order m, equal to the index [ℤ : mℤ], and it is an example of a quotient group, which
we now explore.
Definition 2.8. Given two groups (G, ·G , eG ), (G′ , ·G′ , eG′ ), a group homomorphism (or homomorphism for
short) is a map ϕ : G → G′ such that ϕ(g ·G h) = ϕ(g) ·G′ ϕ(h) for all g, h ∈ G. A homomorphism is an
isomorphism if it is a bijection as a map of sets. The kernel of a homomorphism, denoted by ker(ϕ), is the set
{g ∈ G | ϕ(g) = eG′ }.
Intuitively, a homomorphism ϕ is a map between groups that preserves the group structure. You can either
do group multiplication before or after you apply the map ϕ, and you’ll get the same answer in the end! Two
groups are isomorphic if they are basically the “same object.”
Problem 2.24 (3 Points). Suppose (G, ·G , eG ) and (G′ , ·G′ , eG′ ) are groups and ϕ : G → G′ is a homomorphism.
(a) (1 Point) Show that ϕ(g−1 ) = (ϕ(g))−1 and ϕ(eG ) = eG′ .
(b) (1 Point) Show that for any subgroup H ≤ G, the set ϕ(H) := {ϕ(g) | g ∈ H} is a subgroup of G′ (ϕ(H)
is called the image of H under ϕ).
(c) (1 Point) Show that for any subgroup H ′ ≤ G′ , the set ϕ −1 (H ′ ) := {g ∈ G | ϕ(g) ∈ H ′ } is a subgroup of
G (ϕ −1 (H ′ ) is called the pre-image of H ′ under ϕ). In particular, ker(ϕ) is a subgroup of G.
Problem 2.25 (2 Points). Let m > 1 be a positive integer. Explain why there is no nontrivial group homomorphism ϕ : ℤ/mℤ → ℤ.
Definition 2.9. A subgroup H ≤ G is normal, denoted by H ⊴ G, if for all g ∈ G, gH = Hg (the left and
right cosets coincide). Equivalently, for all h ∈ H and g ∈ G, we have ghg−1 ∈ H. Given a normal subgroup H ⊴ G, the quotient
group G/H is a group whose set of elements is G/H = {gH | g ∈ G}, the set of left cosets of H in G, with
multiplication gH ·G/H g′ H = (g ·G g′ )H and identity eH (where e ∈ G is the identity).
We have already seen examples of the quotient group: Sn /An and ℤ/mℤ. However, it turns out that it
is not possible to quotient out by an arbitrary subgroup when G is not abelian. Let’s see what happens if we
try and brute force multiply elements of cosets and call the product the coset of the resulting element. For
example, consider G = S3 , H = {1, (12)}. The cosets are eH = {1, (12)}, (123)H = {(123), (13)}, (132)H =
{(132), (23)}. Let’s multiply elements of the second and third cosets: (123)(23) = (12) but (13)(23) = (132).
The first product lands in the first coset, but the second product lands in the third coset. So there isn’t a well
defined way to multiply these (their answers are different). If left cosets are equal to right cosets however,
gHg′ H = g(Hg′ )H = g(g′ H)H = gg′ HH = gg′ H so multiplication is well defined.
If H is normal, we have a surjective group homomorphism from G to G/H which sends each group element to the coset of H containing it. Conversely, it can be checked that the kernel of any
homomorphism is normal: if ϕ : G → G′ is a group homomorphism and H = ker(ϕ), for any g ∈ G, h ∈ H,
ϕ(ghg−1 ) = ϕ(g)ϕ(h)ϕ(g)−1 = e, so ghg−1 ∈ H and H is normal. Thus being normal is the right condition to
be able to construct a quotient group. It is easily checked that any subgroup of an abelian group is normal, and
thus can be quotiented by.
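The failure described above can be confirmed by brute force. In the hedged sketch below (permutations of {0, 1, 2} as tuples; the 0-indexed encoding is our choice), H = {1, (1 2)} has unequal left and right cosets in S3, and multiplying two cosets elementwise does not land inside a single coset.

```python
from itertools import permutations

n = 3
def compose(s, t):                           # s ∘ t on {0, ..., n-1}: t acts first
    return tuple(s[t[i]] for i in range(n))

S3 = list(permutations(range(n)))
e, swap01 = tuple(range(n)), (1, 0, 2)       # identity and the transposition (1 2) (0-indexed)
H = {e, swap01}

g = (1, 2, 0)                                # the 3-cycle (1 2 3)
left  = {compose(g, h) for h in H}           # gH
right = {compose(h, g) for h in H}           # Hg
print(left == right)                         # False: H is not normal in S3

g2 = (2, 0, 1)                               # the other 3-cycle (1 3 2)
products = {compose(x, y) for x in left for y in {compose(g2, h) for h in H}}
cosets = [{compose(k, h) for h in H} for k in S3]
print(any(products <= c for c in cosets))    # False: the elementwise product is not one coset
```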
Example 2.7. The map sgn : Sn → {−1, +1}, defined by sgn(σ ) = +1 if and only if σ is an even permutation, is a surjective group homomorphism (for n ≥ 2). The kernel of this homomorphism is An ≤ Sn . Hence, An is
a normal subgroup of Sn , and we may form the quotient group Sn /An .
Definition 2.10. For g ∈ G, the order of g, denoted by |g|, is the smallest positive integer n such that g^n = e, where g^n = g · g · · · · · g with n factors. If there is no such n, we say the order of g is infinite. The cyclic subgroup of G generated by g is the set ⟨g⟩ = {g^n | n ∈ ℤ}, where we define g^{−n} = (g^{−1} )^n .
In light of the above definition: let m be a positive integer, and let Zm be the cyclic group of order m. We
say that any element g ∈ Zm of order m is a generator of Zm .
Notice that for any element g ∈ G of a group, the order of g is the same as the order of ⟨g⟩. Thus, by
Lagrange’s theorem, the order of g divides the order of G.
Example 2.8. The order of any generator of the rotation subgroup Cn ≤ Dn is exactly n. For instance, the 90◦
rotation in D4 is of order 4. The order of any flip in Dn is 2.
Let p be a permutation in Sn with cycle decomposition C1 C2 . . .Ck . Let li be the length of cycle Ci . Then, it is
not too difficult to verify that the order of p is exactly lcm(l1 , l2 , . . . , lk ). For instance, the order of (1 6 4 2)(9 3 5)
is 12.
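A quick numerical check of this fact (a sketch with our own helper): compute the order of (1 6 4 2)(9 3 5) by repeated composition and compare it with the lcm of the cycle lengths.

```python
from math import lcm

# The permutation (1 6 4 2)(9 3 5) in S9, written out as an explicit map.
g = {1: 6, 6: 4, 4: 2, 2: 1, 9: 3, 3: 5, 5: 9, 7: 7, 8: 8}

def order(p):
    """Smallest k >= 1 with p^k equal to the identity, found by repeated composition."""
    identity = {a: a for a in p}
    q, k = dict(p), 1
    while q != identity:
        q = {a: p[q[a]] for a in p}          # q := p ∘ q
        k += 1
    return k

print(order(g), lcm(4, 3))                   # both are 12
```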
Problem 2.27 (5 Points). Given groups G1 , . . . , Gn , their direct product is a group, whose set is the Cartesian
product
G1 × · · · × Gn = {(g1 , . . . , gn ) | gi ∈ Gi },
and group multiplication is defined componentwise. Namely, for (g1 , . . . , gn ), (h1 , . . . , hn ) ∈ G1 × · · · × Gn , the
product (g1 , . . . , gn ) · (h1 , . . . , hn ) in G1 × · · · × Gn is the ordered tuple (g1 h1 , . . . , gn hn ). Here, gi hi is the product
of gi and hi in Gi .
Let m > 1 be a positive integer. Suppose the prime factorization of m is p_1^{a_1} · · · p_n^{a_n} , with distinct primes p_1 , . . . , p_n and a_1 , . . . , a_n ≥ 1. Prove the following isomorphisms:
(a) (2 Points) ℤ/mℤ ≅ ℤ/p_1^{a_1} ℤ × · · · × ℤ/p_n^{a_n} ℤ. You may recognize this as the Chinese Remainder Theorem.
(b) (3 Points) (ℤ/mℤ)^× ≅ (ℤ/p_1^{a_1} ℤ)^× × · · · × (ℤ/p_n^{a_n} ℤ)^× .
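As a sanity check of part (a) (an illustration, not a proof), the sketch below verifies for m = 360 = 2^3 · 3^2 · 5 that the reduction map x ↦ (x mod 8, x mod 9, x mod 5) is a bijection onto the product and respects addition.

```python
m, factors = 360, (8, 9, 5)                  # 360 = 2^3 * 3^2 * 5

def crt_map(x):
    return tuple(x % q for q in factors)

print(len({crt_map(x) for x in range(m)}) == m)        # True: the map is a bijection

# Addition mod m corresponds to componentwise addition (spot-checked on a subset of pairs).
ok = all(crt_map((x + y) % m) ==
         tuple((a + b) % q for a, b, q in zip(crt_map(x), crt_map(y), factors))
         for x in range(m) for y in range(0, m, 37))
print(ok)                                              # True
```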
3 More Group Theory
Earlier, we saw that a subgroup H of G is a normal subgroup precisely when gHg−1 = H for each g ∈ G,
suggesting that conjugation is an important tool for studying the structure of groups and that we should try to
harness its power. We will make this formal by introducing the notion of group actions (of which conjugation
will then be a special case, acting on subsets of the group G itself!).
Definition 3.1. Let G be a group with identity element e ∈ G and S a set. Then, a (left) group action of G on S
is a function f : G × S → S such that f (e, s) = s and f (g, f (h, s)) = f (gh, s) for all s ∈ S and g, h ∈ G.
Typically, f (g, s) is just written as g · s (so, the second axiom says g · (h · s) = (gh) · s for all s ∈ S and
g, h ∈ G).
We emphasize that the · above corresponds to the group action, not the group operation!!
Example 3.1. Let Sn be the symmetric group on n elements. Then, Sn acts on n = {1, 2, . . . , n} in the obvious
way: σ · i = σ (i) for all σ ∈ Sn , i ∈ n. This is called the permutation action.
We see some examples of group actions in the case that the set S is the group G itself.
Example 3.2 (Left Regular Action). Let G be a group. Then, the map f : G × G → G defined by f (g, h) = gh
is a group action. Here, gh is the product of group elements g and h. Thus, any group G acts on itself by left
multiplication. This is called the regular group action of G.
Example 3.3 (Conjugation). For any group G, we see that G acts on itself by conjugation: x · g = xgx−1 for
x, g ∈ G. In particular for x, y, g ∈ G, notice (xy) · g = xyg(xy)−1 = xygy−1 x−1 = x · (y · g).
We will now show how group actions induce group homomorphisms. Given a (left) group action f acting
on a finite set S, if g · s1 = g · s2 , then multiplying by g−1 implies that s1 = s2 , which means that the map
fg : S → S given by s ↦ g · s is injective. As S is finite, this implies that fg is surjective. Hence fg is a bijective map S → S, making it a permutation of the set S. Hence we get an induced map φ : G → Perm(S) given by
g → fg . To check this is a group homomorphism, note that φ (g1 g2 )(s) = fg1 g2 (s) = (g1 g2 ) · s = g1 · ( fg2 (s)) =
fg1 ( fg2 (s)) = ( fg1 ◦ fg2 )(s) = (φ (g1 ) ◦ φ (g2 ))(s) for each s ∈ S, which implies that φ (g1 g2 ) = φ (g1 ) ◦ φ (g2 ), as
desired. Note that we can also recover the group action from the induced homomorphism because it contains
the data of g · s for each g ∈ G and s ∈ S.
Example 3.4 (An Induced Homomorphism). Consider the permutation action (Example 3.1). The resulting homomorphism just identifies an element σ of Sn with its usual definition as a permutation of {1, · · · , n}.
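To make the induced homomorphism concrete, the hedged sketch below (permutations of {0, 1, 2} as tuples) realizes the conjugation action of S3 on itself and checks that φ(g1 g2) = φ(g1) ∘ φ(g2) for all pairs.

```python
from itertools import permutations, product

n = 3
S3 = list(permutations(range(n)))

def compose(s, t):                            # s ∘ t: t acts first
    return tuple(s[t[i]] for i in range(n))

def inverse(s):
    inv = [0] * n
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

def act(x, g):                                # conjugation action: x · g = x g x^{-1}
    return compose(compose(x, g), inverse(x))

def phi(x):                                   # the induced permutation f_x of the set S = S3
    return {g: act(x, g) for g in S3}

ok = all(phi(compose(x, y)) == {g: phi(x)[phi(y)[g]] for g in S3}
         for x, y in product(S3, repeat=2))
print(ok)                                     # True: phi is a homomorphism into Perm(S3)
```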
Definition 3.2 (Orbits and Stabilizers). Let G be a group acting on a set S. Given s ∈ S, the orbit of s ∈ S is the
subset of S given by
Orb(s) := {t ∈ S | g · s = t for some g ∈ G}.
The stabilizer of s ∈ S is the subset of G given by
Stab(s) := {g ∈ G | g · s = s}.
The above definition of orbit is consistent with the previous definition (Definition 2.3), if we take G =
Sn , S = n, and the permutation action.
In general, the distinct orbits of a group G acting on a set S define a partition of the set S. Namely, every
s ∈ S lies in exactly one distinct orbit of the group action!
Definition 3.3 (Conjugacy classes). Let G be a group, acting on itself by conjugation. The orbits of this group
action are called the conjugacy classes of G. Namely, for any g ∈ G, the conjugacy class of g is the subset
{xgx−1 | x ∈ G}.
Example 3.5 (Centralizer). Let G be a group and g ∈ G an element. The centralizer of g, denoted by CG (g), is the subgroup of elements of G that commute with g, i.e. CG (g) = {x ∈ G | gx = xg}. Note that CG (g) is precisely the stabilizer
of g under the conjugation action since CG (g) consists of the x ∈ G with xgx−1 = g!
Theorem 3.4 (The Orbit-Stabilizer Theorem). Given a finite group G acting on a set S and an element s ∈ S, we have that |Orb(s)| · |Stab(s)| = |G|.
Using this result, we can prove the class equation for any finite group G.
The class equation says that given any finite group G with conjugacy classes C1 , · · · ,Cn and a choice of elements ci ∈ Ci for each i, we have that
|G| = ∑_{i=1}^{n} [G : CG (ci )].
Proof. We return to the conjugation action of G on itself from Example 3.3. Two elements g1 and g2 lie in the same orbit of this action if and only if they are in the same conjugacy class, so the orbits under this action are just the conjugacy classes of G. These orbits partition G, and so we conclude that |G| = ∑_{i=1}^{n} |Ci |. Thus the result comes down to showing that |Ci | = [G : CG (ci )]. We can restrict the group action of G to the set Ci and then choose ci ∈ Ci arbitrarily as our s in the Orbit-Stabilizer Theorem. The stabilizer of ci is the set of g ∈ G with gci g−1 = ci , which is just the centralizer of ci by definition. Then by the Orbit-Stabilizer Theorem, we have that |Orb(ci )| · |Stab(ci )| = |G|, and Orb(ci ) is just Ci , allowing us to conclude that |Ci | = |G|/|CG (ci )|. Finally, we have that |G|/|CG (ci )| = [G : CG (ci )], finishing the proof of the class equation.
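The class equation can also be verified numerically. The sketch below (our encoding of S4 as tuples) computes the conjugacy classes of S4 by brute force, checks that they partition the group, and checks |Ci| = |G|/|CG(ci)| for a representative of each class.

```python
from itertools import permutations

n = 4
G = list(permutations(range(n)))

def compose(s, t):
    return tuple(s[t[i]] for i in range(n))

def inverse(s):
    inv = [0] * n
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

def conjugacy_class(g):
    return {compose(compose(x, g), inverse(x)) for x in G}

classes, seen = [], set()
for g in G:
    if g not in seen:
        c = conjugacy_class(g)
        classes.append(c)
        seen |= c

print(sum(len(c) for c in classes) == len(G))          # True: the classes partition G
for c in classes:
    rep = next(iter(c))
    centralizer = [x for x in G if compose(x, rep) == compose(rep, x)]
    print(len(c) == len(G) // len(centralizer))        # True for every class (Orbit-Stabilizer)
```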
Recall that Sn is the group consisting of all permutations of 1, · · · , n. We also introduced the sign homomorphism
sgn : Sn → {−1, +1} in Section 2. That is, for σ ∈ Sn , we have sgn(σ ) = +1 if and only if σ may be written
as the product of an even number of transpositions, and sgn(σ ) = −1 if and only if σ may be written as the
product of an odd number of transpositions.
The kernel of this homomorphism is the group An , the alternating group on the set {1, · · · , n}, consisting
of the σ with sgn(σ ) = +1, which are known as the even permutations. Similarly, those with sgn(σ ) = −1 are
known as odd permutations. We can determine the size of the subgroup An .
Example 3.6. (Even permutation in S5 ) Consider the permutation (1 2 3 4 5). It turns out that an n-cycle is
an even permutation when n is odd and an odd permutation when n is even, a result which will be proven in
the next subsection. This may seem a bit counterintuitive, but recall that a transposition is a 2-cycle, so one
can already see the parity imbalance with this simple example. We will discuss the power of cycles much more
later in this section.
Example 3.7. (Odd permutation in S5 ) Consider the permutation (1 2)(3 4 5). If we start with 1, we see that it
is mapped to 2, which is then mapped back to 1, giving a cycle of length 2. On the other hand, starting at 3, we
see it is mapped to 4, which is mapped to 5, which is mapped to 3, giving a second cycle of length 3. From the
above result on cycles, 2-cycles are odd while 3-cycles are even, and this permutation is the product of the two cycles (1 2) and (3 4 5), and thus by the homomorphism property of sgn, it is odd.
It turns out that An is a normal subgroup of Sn (in fact, given a group G, any subgroup of index 2 is
automatically normal). For n ≥ 5, the alternating groups An have a special property: they are simple groups, which means that they have no nontrivial proper normal subgroups (their only normal subgroups are {e} and An ). We
will assume this fact without proof.
Problem 3.3 (4 Points). Use the simplicity of An , n ≥ 5 to show that for any n ≥ 5, besides {e} and Sn itself,
An is the only normal subgroup of Sn .
Although this just gives information about the normal subgroups of Sn , it soon becomes apparent that
understanding the normal subgroups of Sn goes a long way in understanding general subgroups of Sn .
Problem 3.4 (5 Points). Show that for any n ≥ 5, if H is a proper subgroup of Sn and H ̸= An , then H has index
at least n in Sn .
This bound is tight because we can consider the subgroup of Sn consisting of the permutations σ with σ (n) = n, which is a subgroup of
index n. To see why this latter fact is true, note that there are (n − 1)! permutations with σ (n) = n, and thus
they form a subgroup of index n!/(n − 1)! = n. This subgroup turns out to be isomorphic to Sn−1 , which can be
seen by mapping σ to the permutation of {1, · · · , n − 1} it induces.
Recall from Section 2 that any element σ of Sn has a cycle decomposition, consisting of cycles of the form
{i, σ (i), σ^2 (i), . . . , σ^{k−1} (i)}, where k is the size of the orbit of i ∈ {1, 2, . . . , n} under the subgroup ⟨σ ⟩ ⊆ Sn
acting on {1, · · · , n}. Furthermore, σ has a unique cycle type.
Problem 3.5 (2 Points). Show that if the cycle type of σ is b1 , · · · , br (i.e., the cycle lengths of σ ), then the order of σ is lcm(b1 , ..., br ).
Now we will see the connection between conjugacy classes in Sn and cycle type, which you can recall from
Section 2.
Problem 3.6 (2 Points). Let (a1 · · · ak ) be a cycle. Show that τ(a1 · · · ak )τ −1 = (τ(a1 ) · · · τ(ak )).
Thus writing an element σ as σ = ∏_{i=1}^{r} (a_{i1} · · · a_{ic_i}) (a disjoint cycle decomposition), we have that τστ^{−1} = ∏_{i=1}^{r} (τ(a_{i1}) · · · τ(a_{ic_i})), and so any element conjugate to σ has the same cycle type. Now choose any other σ′ with the same cycle type, say ∏_{i=1}^{r} (b_{i1} · · · b_{ic_i}). Choosing τ so that τ(a_{ij}) = b_{ij} for each i, j pair with 1 ≤ j ≤ c_i, we see that any element of Sn with the same cycle type as σ is actually conjugate to σ, and thus the conjugacy classes are precisely the sets of elements with a given cycle type. A cycle type is precisely a multiset of positive integers summing to n, i.e. a partition of n.
Problem 3.7 (3 Points). Let C be the conjugacy class consisting of the σ ∈ Sn corresponding to a partition P of
n. Given such a partition P, let ai be the number of appearances of i in the partition for each 1 ≤ i ≤ n. Show
that
|C| = n! / ∏_{i=1}^{n} (ai ! · i^{ai }).
Since we know that the conjugacy classes partition Sn , we can swiftly prove a tricky combinatorial lemma
using the machinery developed thus far.
Example 3.8 (A Combinatorial Lemma). Recall that for a finite group G with conjugacy classes C1 , · · · ,Cn , we
had that |G| = ∑_{i=1}^{n} |Ci |. Let T be the set of all partitions of n. Using the result of Problem 3.7 for the group G = Sn , and summing over the elements of T , we conclude that
n! = |Sn | = ∑_{P∈T} n! / ∏_{i=1}^{n} (ai ! · i^{ai }).
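The identity can be spot-checked by machine. The sketch below (with a small recursive partition generator of our own) sums the class-size formula over all partitions of n = 6 and recovers 6! = 720.

```python
from math import factorial
from collections import Counter

def partitions(n, max_part=None):
    """Yield all partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield (k,) + rest

def class_size(partition, n):
    """Size of the S_n conjugacy class with this cycle type (the formula of Problem 3.7)."""
    counts = Counter(partition)                # counts[i] = a_i, the number of parts equal to i
    denom = 1
    for i, ai in counts.items():
        denom *= factorial(ai) * i ** ai
    return factorial(n) // denom

n = 6
print(sum(class_size(p, n) for p in partitions(n)) == factorial(n))   # True
```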
The cycle type of a permutation also determines its sign. As before, we first do the calculation for cycles.
Problem 3.8 (1 Point). Let (a1 · · · ak ) be a cycle. Then show that sgn((a1 · · · ak )) = (−1)k−1 .
Now let σ be a permutation with cycle lengths k1 , · · · , kr . We have that k1 + · · · + kr = n, and thus that sgn(σ ) = ∏_{i=1}^{r} (−1)^{ki −1} = (−1)^{n−r} . Hence the sign of σ depends only on the parity of n − r. In particular, given any partition a1 , · · · , ar of n with n − r odd, the conjugacy class corresponding to that partition consists of only odd permutations and is thus disjoint from An . Now we investigate the case when n − r is even. Any element corresponding to such a partition will be an element of An .
In the above, we saw that a conjugacy class of Sn is either entirely in An or disjoint from An . This embodies
the fact that An is a normal subgroup of Sn in light of the result that given a group G and a subgroup H, H is
normal in G iff H is the union of conjugacy classes of G (why?). Furthermore, an individual conjugacy class
of G contained in H will itself be a union of conjugacy classes of H because if two elements are conjugate in
H, then they are necessarily conjugate in G, and so we may repeatedly remove H-conjugacy classes of some
G-conjugacy class until none remain, writing the G-conjugacy class as a disjoint union of H-conjugacy classes.
If K is a conjugacy class of G such that K is the (disjoint) union of k conjugacy classes in H, we say that K
splits into k conjugacy classes of H.
Problem 3.9 (8 Points).
(a) (5 Points) Let G be a finite group and H a normal subgroup. Show that if K is a conjugacy class of G
splitting into k conjugacy classes of H, then each of the k conjugacy classes has the same size.
(b) (3 Points) Now let G = Sn and H = An . Show that k = 1 or k = 2 and that k = 2 iff no element of K
commutes with an odd permutation.
The latter condition can be smoothly translated into a condition on cycle lengths:
Problem 3.10 (5 Points). Let g ∈ An be an even permutation. Then show that g commutes with no odd permu-
tation iff its cycle lengths are distinct odd positive integers.
From this, we deduce that the conjugacy classes of Sn that split are the ones corresponding to partitions
of n into distinct odd positive integers. Note that these satisfy the parity condition on conjugacy classes of An
automatically since n = k1 + · · · + kr ≡ r mod 2 (since each ki is odd).
Example 3.9. We demonstrate the theory developed above on a conjugacy class of S5 . Take the conjugacy class of 5-cycles, corresponding to the partition {5} of 5. Since this partition consists of distinct odd parts, the class consists only of even permutations, and by the previous two problems, it will split into exactly two conjugacy classes in A5 . These two conjugacy classes are given, for example, by the A5 -conjugacy classes of the elements (12345) and (21345).
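This splitting can be confirmed by brute force, as in the hedged sketch below (0-indexed tuples; sign computed by counting inversions): the 5-cycles form a single S5-class but exactly two A5-classes of size 12, and the two displayed elements land in different A5-classes.

```python
from itertools import permutations

n = 5
def compose(s, t):
    return tuple(s[t[i]] for i in range(n))

def inverse(s):
    inv = [0] * n
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

def sign(p):                                   # parity via the number of inversions
    return (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))

S5 = list(permutations(range(n)))
A5 = [p for p in S5 if sign(p) == 1]

def cls(g, group):                             # conjugacy class of g inside `group`
    return frozenset(compose(compose(x, g), inverse(x)) for x in group)

g1 = (1, 2, 3, 4, 0)                           # the 5-cycle (1 2 3 4 5), 0-indexed
tau = (1, 0, 2, 3, 4)                          # the transposition (1 2)
g2 = compose(compose(tau, g1), tau)            # conjugate of g1 by an odd permutation

print(cls(g1, S5) == cls(g2, S5))              # True: a single S5-class of 5-cycles
print(cls(g1, A5) == cls(g2, A5))              # False: the class splits in A5
print(len(cls(g1, A5)), len(cls(g2, A5)))      # 12 12
```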
We finish this section by applying the theory developed so far to prove a nice result about abelian subgroups of Sn . Let G be a finite abelian group. The Structure Theorem for Finite Abelian Groups tells one that there exists a unique list of prime powers (q1 , · · · , qn ) (up to ordering) so that G ≅ ∏_{i=1}^{n} ℤ/qi ℤ. This gives a well-defined invariant of G, namely ∑_{i=1}^{n} qi , which we know is determined by the uniqueness of the list q1 , · · · , qn in the result. Denote this quantity by f (G). Then we have that G is isomorphic to a subgroup of Sn iff f (G) ≤ n.
For one direction, if f (G) ≤ n, we may choose disjoint cycles of Sn of lengths qi for each 1 ≤ i ≤ n, which
together generate a subgroup with the desired isomorphism type.
Conversely, suppose that G is isomorphic to a subgroup of Sn , which we call H. We will define a group
action involving H. Let our set S just be {1, · · · , n} and consider the action of H on S just by permuting the
elements of S according to H. We consider the orbits of H acting on S, which are the equivalence classes defined by
the relation i ∼ j iff for some h ∈ H, h(i) = j. Let the orbits of H acting on S be S1 , · · · , Sr . For each 1 ≤ i ≤ r, let
Hi be the subgroup of S|Si | (the subgroup of permutations of the elements of Si ) attained by restricting H to the
set Si . Then it follows that G injects into ∏_{i=1}^{r} Hi upon composing the isomorphism G ≅ H with the map that restricts the action of H to each of the Si , because if some element of H were to be the identity permutation on each of the
Si s, this would mean it would fix all of S, implying it is the identity permutation of S.
Now we claim that |Hi | = |Si | for each i. By definition of the equivalence relation, H and thus Hi acts transitively on each individual Si (i.e. given s1 , s2 in some Si , we can choose h so that hs1 = s2 ). In particular, the orbit of any element s ∈ Si under Hi is all of Si . Thus by the Orbit-Stabilizer Theorem, we have that |Hi | = |Si | · |Stab(s)|. Thus we just need to show that Stab(s) is trivial. For this, choose h ∈ Stab(s) and note that for any s′ ∈ Si , since we may write s′ = h′ s for some h′ ∈ Hi , we have that hs′ = hh′ s = h′ (hs) = h′ s = s′ , which shows that each element of Stab(s) fixes all of Si and thus is the identity element of Hi . This shows that Stab(s) is trivial and so |Hi | = |Si |. Thus since the Si ’s partition S, we conclude that n = ∑_{i=1}^{r} |Hi |. Next we claim that ∑_{i=1}^{r} |Hi | ≥ f (G). Since G is isomorphic to a subgroup of ∏_{i=1}^{r} Hi , we have that f (∏_{i=1}^{r} Hi ) ≥ f (G). Moreover, f is additive on direct products, so f (∏_{i=1}^{r} Hi ) = ∑_{i=1}^{r} f (Hi ). Finally, we have the bound that f (Hi ) ≤ |Hi |, since given positive integers a1 , · · · , an ≥ 2, we have ∑_{i=1}^{n} ai ≤ ∏_{i=1}^{n} ai . Thus we conclude that f (G) ≤ ∑_{i=1}^{r} |Hi | = n, as
desired.
In this section, unless stated otherwise, G is any group. Given elements g, h ∈ G, we can define [g, h] = ghg−1 h−1
to be their commutator. The commutator subgroup G′ is the subgroup of G generated by all the commutators.
The commutator subgroup has many useful properties.
Problem 3.11 (1 Point). Let G be an abelian group. Then show that G′ is trivial.
In some sense, this result is telling one that there is no obstruction to being abelian for abelian groups. We
will refine this idea in the following problems to show how one can use the commutator subgroup to obtain an
abelian quotient of a group G.
Thus G/G′ = Gab is a group, known as the abelianization of G. In some sense, it is the maximal abelian
quotient of G. To make this more precise:
Problem 3.13 (5 Points). Let H be a normal subgroup of G such that G/H is abelian.
(a) (2 Points) Show that G′ ⊂ H.
(b) (3 Points) Show that G/H is isomorphic to a quotient of Gab .
Example 3.10. Let us compute the commutator subgroup of Sn . For n = 1, 2, it is trivial since Sn is abelian
in these cases. For each n > 2, it is nontrivial because Sn itself is nonabelian, meaning that xyx−1 y−1 =
xy(yx)−1 = e is not always true. Furthermore, it is also always a subgroup of An because sgn(xyx−1 y−1 ) =
sgn(x)sgn(y)sgn(x−1 )sgn(y−1 ) = sgn(xx−1 )sgn(yy−1 ) = 1, and so since the generators are even, multiplicativ-
ity implies every element in the whole subgroup is even. This implies that for n = 3, it is A3 since A3 has only
two subgroups and the commutator subgroup can’t be the trivial one. Now for n ≥ 5, Problem 3.3 implies that
the commutator subgroup of Sn must be An because it is the only nontrivial normal subgroup of Sn contained
in An . Thus it remains to handle the case of n = 4. For this, first note that there are 8 3-cycles in S4 , each of
which are elements of A4 , so it follows that the 3-cycles necessarily generate all of A4 by Lagrange’s Theorem
(the subgroup generated by them has size at least 9 while the group itself has order 12, and so since Lagrange
implies the order of the subgroup they generate will divide the size of the group, 12, it follows they generate the
whole group). Then given a 3-cycle (abc), we have that [(ac), (bc)] = (ac)(bc)(ac)(bc) = (acb)(acb) = (abc),
proving that each 3-cycle is in fact a commutator. Thus we conclude that A4 is the commutator subgroup in this
case as well. Hence the commutator subgroup of Sn is always An for any n > 1.
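The n = 4 case can be double-checked by machine: the sketch below generates the subgroup of S4 generated by all commutators and compares it with A4 (again using our tuple encoding of permutations).

```python
from itertools import permutations, product

n = 4
def compose(s, t):
    return tuple(s[t[i]] for i in range(n))

def inverse(s):
    inv = [0] * n
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))

S4 = list(permutations(range(n)))
commutators = {compose(compose(g, h), compose(inverse(g), inverse(h)))
               for g, h in product(S4, repeat=2)}

subgroup, grew = set(commutators), True        # close the commutators under multiplication
while grew:
    new = {compose(a, b) for a in subgroup for b in subgroup} - subgroup
    grew = bool(new)
    subgroup |= new

A4 = {p for p in S4 if sign(p) == 1}
print(subgroup == A4)                          # True: the commutator subgroup of S4 is A4
```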
Problem 3.14 (6 Points). Let H be an abelian group and φ : G → H be a homomorphism. Let p be the projection
map G → Gab (given by g → gG′ ). Show there exists a unique map h : Gab → H such that φ = hp.
Universal properties are quite useful because they ensure the existence of something up to (unique) isomorphism, which gives a very clean construction in a variety of cases. In this case, Gab is the universal object in the category of groups for maps from the group G into abelian groups: by the above exercise, any such map factors through Gab (“goes through it”) on its way to H.
4 Representation Theory
Representation theory studies the action of (finite) groups on vector spaces over a field. While representation
theory can be conducted over any field in general, for simplicity, we fix our field to be ℂ, the complex numbers.
Throughout this section, we take G to be a finite group. Unless otherwise stated, all vector spaces are
finite dimensional over ℂ.
4.1 Introduction
Before we dive into representation theory, we first introduce two important examples of groups related to finite
dimensional vector spaces.
Let V be a finite dimensional vector space over ℂ. Recall from Section 1 that given any two linear isomorphisms T, T ′ : V → V , their composition T ◦ T ′ : V → V is again a linear isomorphism. The composition
of linear maps is associative, and any linear isomorphism has an inverse which is also a linear isomorphism.
Taking the identity map idV as the identity element, the set of all linear isomorphisms V → V forms a group
under composition. This group is called the general linear group of V and denoted GL(V ).
Recall that under a fixed choice of ordered basis, any linear transformation T : V → V corresponds to a
matrix MT ∈ Matn×n (ℂ). Fix a basis of V and consider the map GL(V ) → Matn×n (ℂ) defined by taking a linear
transformation T : V → V to its corresponding matrix MT with respect to the given basis. You should convince
yourself that T is a linear isomorphism if and only if MT is an invertible matrix, hence the image is precisely the set of invertible n × n matrices over ℂ. The group structure of GL(V ) also carries over with the identity matrix
I as the identity element and matrix multiplication as the group operation. As a group, we denote the set of
invertible n × n matrices over ℂ by GLn (ℂ). Observe that the mapping T 7→ MT then gives a group isomorphism
GL(V ) → GLn (ℂ).
Definition 4.1. Let V be a finite dimensional ℂ-vector space. A representation of G on V is a group homomor-
phism ϕ : G → GL(V ). We also call V together with ϕ, written as a pair (V, ϕ), a G-representation or simply a
representation. We will sometimes omit ϕ for brevity when it is clear that we are referring to V as a represen-
tation and not just a vector space. The degree or dimension of the representation is the dimension of V over ℂ.
We say a representation (V, ϕ) is faithful if ϕ is injective. In other words, (V, ϕ) is a faithful representation if
and only if the only element of G that gets mapped to the identity map on V is the identity element.
The converse to Problem 4.1 is also true, that is any group action of G on V such that (4.1) holds determines
a representation of G on V . Let us see why this is. Let n be the dimension of V over ℂ and {e1 , . . . , en } a basis.
For any element g ∈ G, define a linear map Mg : V → V by setting Mg (ei ) := g · ei for 1 ≤ i ≤ n. By (4.1),
Mg (v) = g · v for all v ∈ V . As g · (g−1 · v) = v = g−1 · (g · v) for all v ∈ V , we see that Mg is invertible, hence
Mg ∈ GL(V ). Define ϕ : G → GL(V ) by ϕ(g) = Mg . As (gh) · v = g · (h · v) for any g, h ∈ G, Mgh = Mg Mh ,
so ϕ is a group homomorphism, hence a representation. We thus see that giving a representation of G on V is
equivalent to giving a group action of G on V such that (4.1) holds. Throughout the rest of this section, we
will make use of this equivalent definition for a G-representation.
We now give some standard examples of representations. Recall that for any set S, ℂ[S] is the complex
vector space with basis given by the elements of S. In particular, we will denote by ℂ[G] the vector space over
ℂ on the underlying set of G.
Example 4.1 (Trivial representation). Consider the trivial homomorphism ϕ : G → GL(ℂ) given by ϕ(g) = idℂ
for all g ∈ G. This is a degree 1 representation called the trivial representation of G. When |G| > 1, the trivial
representation is not faithful.
Example 4.2 (Regular representation). Take V = ℂ[G]. We define an action of G on V by setting g · h = gh for
all g, h ∈ G and extending by linearity using (4.1). By the above, this gives a representation ϕ : G → GL(ℂ[G]),
which is called the regular representation of G. The regular representation is of degree |G| and is faithful as
any nonidentity element of G corresponds to a permutation of the basis of V which is not the identity.
Example 4.3 (Representation from a group action). Let S be a finite set equipped with a G action. Consider the
vector space V = ℂ[S]. We may then define an action of G on V by extending the action of G on S using (4.1).
This endows V with a G-representation structure. In particular, if the action of G on S is faithful, the resulting
representation will also be faithful.
Recall that we may identify GL(V ) with GLn (ℂ), the group of invertible n × n matrices over ℂ. A matrix
representation of G of dimension n is then a group homomorphism ϕ : G → GLn (ℂ). Note that (up to a choice of basis) we have a one-to-one correspondence between representations of G on V and matrix representations of G by post-composing with the appropriate isomorphism between GL(V ) and GLn (ℂ). Thus, giving a matrix representation of G is equivalent to giving a representation of G on V , up to a choice of basis for V . Oftentimes when computing specific representations of a group, it is easier to work with matrix representations.
Example 4.4. Let G = S3 . We may define a 2-dimensional matrix representation ϕ : S3 → GL2 (ℂ) by
ϕ((1 2)) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad ϕ((1 3)) = \begin{pmatrix} 1 & -1 \\ 0 & -1 \end{pmatrix}, \quad ϕ((2 3)) = \begin{pmatrix} -1 & 0 \\ -1 & 1 \end{pmatrix},
ϕ((1 2 3)) = \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix}, \quad ϕ((3 2 1)) = \begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix}.
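As a quick sanity check of this example (a sketch using numpy; the checks below are structural and do not depend on any composition convention), one can verify that every matrix is invertible, that its multiplicative order matches the order of the permutation it is labelled by, and that the six matrices, including the identity matrix for the identity permutation, are closed under multiplication.

```python
import numpy as np

rho = {
    'e':       np.eye(2, dtype=int),
    '(1 2)':   np.array([[0, 1], [1, 0]]),
    '(1 3)':   np.array([[1, -1], [0, -1]]),
    '(2 3)':   np.array([[-1, 0], [-1, 1]]),
    '(1 2 3)': np.array([[0, -1], [1, -1]]),
    '(3 2 1)': np.array([[-1, 1], [-1, 0]]),
}
orders = {'e': 1, '(1 2)': 2, '(1 3)': 2, '(2 3)': 2, '(1 2 3)': 3, '(3 2 1)': 3}

def matrix_order(M):
    P, k = M.copy(), 1
    while not np.array_equal(P, np.eye(2, dtype=int)):
        P, k = P @ M, k + 1
    return k

print(all(round(np.linalg.det(M)) != 0 for M in rho.values()))            # True: all invertible
print(all(matrix_order(M) == orders[name] for name, M in rho.items()))    # True: orders match
mats = list(rho.values())
print(all(any(np.array_equal(A @ B, C) for C in mats) for A in mats for B in mats))  # True: closed
```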
Example 4.5 (Representations of dimension 1). When V is a vector space of dimension 1, all linear transfor-
mations of V to itself are given by scalar multiplication by some complex number. The invertible elements are
thus the elements of ℂ× = ℂ \ {0}, the multiplicative group of ℂ, hence we may identify GL(V ) = ℂ× . In par-
ticular, the 1 × 1 matrix of an invertible linear map f : V → V is the same complex number α ∈ ℂ× for any choice of basis of V ! Thus, the 1-dimensional representations of a finite group G are “equivalent” to group homomorphisms
ϕ : G → ℂ× .
Example 4.6 (Permutation representation). Let V be an n-dimensional vector space over ℂ and fix a basis
{e1 , . . . , en }. We define an action of Sn on V by setting σ · ei = eσ (i) for all σ ∈ Sn , 1 ≤ i ≤ n and extending by
linearity. The corresponding representation ϕ : Sn → GL(V ) is called a permutation representation. Note that
this is a specialization of Example 4.3 (obtaining a G-representation from an action of G on a set S). All
permutation representations are faithful.
Using the same ordered basis {e1 , . . . , en }, the representation ϕ corresponds to a matrix representation ψ : Sn → GLn (ℂ). Observe that any element of Sn then maps to a matrix that has exactly one 1 in every row and
column and 0’s elsewhere. Such matrices are called permutation matrices as they permute the basis vectors.
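A short numpy sketch of the construction just described (our helper name): the matrix of σ has a 1 in row σ(i) of column i, so that it sends the basis vector e_i to e_{σ(i)}.

```python
import numpy as np

def permutation_matrix(sigma, n):
    """Matrix of sigma in the permutation representation: column i is e_{sigma(i)} (1-indexed)."""
    M = np.zeros((n, n), dtype=int)
    for i in range(1, n + 1):
        M[sigma[i] - 1, i - 1] = 1               # sigma · e_i = e_{sigma(i)}
    return M

sigma = {1: 2, 2: 3, 3: 1}                       # the 3-cycle (1 2 3) in S3
print(permutation_matrix(sigma, 3))
# [[0 0 1]
#  [1 0 0]
#  [0 1 0]]
```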
Given any group G and a finite set S equipped with a G-action, we may put Examples 4.3 and 4.6 together to obtain a group homomorphism f : G → Sn , where n = |S|.
Problem 4.2 (3 Points). Let G be a group and S a finite set equipped with a G-action. Fix an ordering of the
elements of S = {s1 , . . . , sn }.
(a) (1 Point) Construct a matrix representation ϕ : G → GLn (ℂ) with respect to the ordered basis s1 , . . . , sn
corresponding to the representation ℂ[S] (see Example 4.3).
(b) (1 Point) Let ψ : Sn → GLn (ℂ) be the matrix representation for the permutation representation from Example 4.6. Show that im(ϕ) is a subgroup of im(ψ).
(c) (1 Point) Recall that ψ is injective, hence we may define a map f : G → Sn by f (g) = ψ −1 (ϕ(g)). Show
that f is a group homomorphism.
In previous sections, we discussed linear maps between vector spaces and group homomorphisms between
groups. We may combine the two to define G-linear maps between representations. Intuitively speaking, a
G-linear map is a linear map of vector spaces that preserves the representation structure.
Definition 4.2. Let V,W be two finite-dimensional complex vector spaces and ϕ : G → GL(V ), ψ : G → GL(W )
representations on V and W respectively. A G-linear map is a linear map T : V → W satisfying
T ◦ ϕ(g) = ψ(g) ◦ T (4.2)
for all g ∈ G, i.e. a linear map that preserves the G-action given by the representations. We say that the two
representations ϕ, ψ are equivalent or isomorphic if there exists an invertible G-linear map between them, i.e.
an invertible linear map T : V → W such that
T ◦ ϕ(g) ◦ T −1 = ψ(g) (4.3)
for all g ∈ G. The map T is also called a G-linear isomorphism. We denote by HomG (V,W ) the set of G-linear
maps V → W .
Problem 4.3 (3 Points). Let V,W, Z be G-representations. Verify the following properties of G-linear maps.
(a) (1 Point) The identity idV : V → V is a G-linear map.
(b) (1 Point) If T : V → W, S : W → Z are two G-linear maps, then S ◦ T : V → Z is also a G-linear map.
(c) (1 Point) If T : V → W is an invertible G-linear map, then so is T −1 : W → V , the linear transformation
inverse.
Remark. As a brief aside, the previous problem tells us that we may define the category of G-representations,
and equivalent G-representations are precisely those that are isomorphic as objects of the category.
Recall that for two vector spaces V,W , Hom(V,W ) is the set of all linear maps V → W and carries a
vector space structure. The following problem shows that Hom(V,W ) may be endowed with a G-representation
structure when V,W are both G-representations.
Problem 4.5 (5 Points). Let (V, ϕ), (W, ψ) be two G-representations.
(a) (1 Point) Check that g · T = ψ(g) ◦ T ◦ ϕ(g)−1 defines a group action of G on Hom(V,W ).
(b) (1 Point) Show that the action of G on Hom(V,W ) given by part (a) makes Hom(V,W ) into a G-
representation.
(c) (1 Point) Show that HomG (V,W ) is a vector subspace of Hom(V,W ).
(d) (2 Points) Define Hom(V,W )G to be the subset of linear maps in Hom(V,W ) that are fixed by the action
defined in (a). Check that Hom(V,W )G = HomG (V,W ). We say that Hom(V,W )G is a G-invariant
subspace of Hom(V,W ).
The main focus of this section is on irreducible G-representations (defined below), which can be thought of
as the building blocks for all G-representations. In particular, every G-representation admits a direct sum
decomposition into irreducible representations. In the later section on character theory, we shall see that this
decomposition is unique up to equivalence, i.e. two G-representations are equivalent if and only if they have
the same irreducible representation decomposition (up to reordering and equivalence of the summands).
Definition 4.3. Let (V, ϕ) be a G-representation. A vector subspace W of V is called G-stable if ϕ(g)v ∈ W for all v ∈ W and all g ∈ G. We may then define ϕW : G → GL(W ) by ϕW (g)v = ϕ(g)v for all v ∈ W . (W, ϕW ) is then said
to be a subrepresentation of (V, ϕ). As the G-representation structure of W is completely determined by that of
V , we will often take ϕ, ϕW to be implicit and simply define a subrepresentation of a G-representation V to be
a G-stable vector subspace of V . In this case, one just says that W is a subrepresentation of V .
Example 4.7. In the context of problem 4.5, HomG (V,W ) is the G-invariant subrepresentation of Hom(V,W ).
Problem 4.7 (3 Points). Let (V, ϕ), (W, ψ) be two G-representations and T : V → W a G-linear map.
(a) (1 Point) Show that ker(T ) is a subrepresentation of V .
(b) (2 Points) Show that im(T ) is a subrepresentation of W .
Definition 4.4. A G-representation V is irreducible if it is nontrivial as a vector space (i.e. has dimension greater
than 0) and does not have any nontrivial, proper (as a subspace) subrepresentations. We say V is reducible if it
is a nontrivial vector space that is not irreducible.
Example 4.8. All dimension 1 G-representations are irreducible as they have no nontrivial, proper subspaces.
The converse is true when G is abelian, as we will see in the next section.
Problem 4.8 (4 Points). Verify (without using character theory) that the matrix representation of S3 given in
Example 4.4 is an irreducible representation.
Recall that a vector space V is a direct sum of two subspaces U,W if U ∩W is the trivial subspace and every
vector v ∈ V admits a decomposition v = u + w with u ∈ U, w ∈ W . We denote the direct sum decomposition
by V = U ⊕W . Similarly, a G-representation V is a direct sum of two subrepresentations U,W if V = U ⊕W
as vector spaces.
Example 4.9. Let U,W be subspaces of a finite-dimensional vector space V such that V = U ⊕ W . Let ϕ1 :
G → GL(U), ϕ2 : G → GL(W ) be representations on U,W respectively.
Fix bases {u1 , . . . , un } of U and {w1 , . . . , wm } of W . Let ψ1 : G → GLn (ℂ) and ψ2 : G → GLm (ℂ) be the
corresponding matrix representations of U,W respectively.
We then define a matrix representation ψ : G → GLn+m (ℂ) by the block diagonal matrix
ψ(g) = \begin{pmatrix} ψ1 (g) & 0 \\ 0 & ψ2 (g) \end{pmatrix}.
Here the notation means that, writing the entry of a matrix M in the ith row and jth column as M_{i, j} , we have
ψ(g)_{i, j} = \begin{cases} ψ1 (g)_{i, j} & \text{if } 1 ≤ i, j ≤ n \\ ψ2 (g)_{(i−n),( j−n)} & \text{if } n + 1 ≤ i, j ≤ n + m \\ 0 & \text{otherwise.} \end{cases}
More generally, a block diagonal matrix is a matrix that may be written in terms of submatrices as above with
the nontrivial blocks on the diagonal.
Observe that {u1 , . . . , un , w1 , . . . , wm } is a basis of V . With respect to this basis, ψ is the matrix representation of
a G-representation on V , and U,W (with their respective representations ϕ1 , ϕ2 ) are subrepresentations of it.
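For a minimal concrete instance, one could take ψ1 to be the trivial one-dimensional matrix representation of S3 (so ψ1 (g) = (1) for all g) and ψ2 the two-dimensional matrix representation of S3 written out in Example 5.3 below; then, for example,
ψ((1 2)) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},
with the 1 × 1 block ψ1 ((1 2)) = (1) and the 2 × 2 block ψ2 ((1 2)) on the diagonal.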
Recall that given a finite-dimensional vector space V and a subspace U, we may always find another sub-
space W such that V = U ⊕W . The analogous statement is true for G-representations and is called Maschke’s
Theorem. Maschke’s Theorem allows us to decompose representations into a direct sum of irreducible sub-
representations. This result is the foundation of character theory and is key to determining whether or not two
representations are equivalent.
Theorem 4.5. (Maschke’s Theorem) Let (V, ϕ) be a G-representation. If (U, ϕU ) is any subrepresentation of
V , then V has a subrepresentation (W, ϕW ) such that V = U ⊕W .
Problem 4.9 (10 Points). Prove Maschke’s Theorem:
(a) (1 Point) Show that there exists a (not necessarily G-invariant) subspace W0 of V such that V = U ⊕W0 .
(b) (2 Points) Define π0 : V → U by π0 (u + w) = u for u ∈ U, w ∈ W0 . For each g ∈ G, define
πg := ϕ(g) ◦ π0 ◦ ϕ(g)−1 ∈ Hom(V,V ), and set π := (1/|G|) ∑_{g∈G} πg . Show that π is a G-linear projection of V
onto U, and deduce that W := ker(π) is a subrepresentation of V with V = U ⊕W .
Problem 4.10 (3 Points). Using Maschke’s theorem, show that any G-representation V may be decomposed
into a direct sum of irreducible subrepresentations.
We will see in the following section using character theory that this decomposition into irreducible subrep-
resentations is unique up to reordering and equivalence of the summands. A key result to studying irreducible
representations is Schur’s Lemma, which gives restrictions on the allowed G-linear maps between any two
irreducible representations.
Theorem 4.6. (Schur’s Lemma) Let (V, ϕ), (W, ψ) be two irreducible G-representations.
(i) If T : V → W is a nontrivial G-linear map (i.e. there exists v ∈ V such that T (v) ̸= 0), then T is a G-linear
isomorphism.
(ii) If T : V → V is a nontrivial G-linear map, then T = λ I for some λ ∈ ℂ where I is the identity map.
Remark. As a consequence of Schur’s Lemma, given any two irreducible G-representations V,W , either V and
W are equivalent or HomG (V,W ) = {0}, where 0 denotes the zero map.
Problem 4.11 (6 Points). Prove Schur’s Lemma:
(a) (2 Points) Prove part (i) of Schur’s Lemma. Hint: use Problem 4.7.
(b) (4 Points) Prove part (ii) of Schur’s Lemma. Hint: use something you learned from Section 1.
A natural question to ask when given a finite group G is how to compute all the irreducible G-representations
up to equivalence. In the next section, we will describe general methods and compute explicit examples using
character theory. As a precursor, in this subsection, we discuss some tools for doing so in relation to finite
abelian groups. The representation theory of finite abelian groups is far simpler than that of arbitrary finite
groups. We first illustrate this with the following problem, which is an application of Schur’s Lemma.
Problem 4.12 (3 Points). Let G be a finite abelian group. Prove that if (V, ϕ) is an irreducible G-representation,
then V must be of dimension 1.
Now, let G be an arbitrary finite group. Let V be a 1-dimensional representation of G. Recall from Example
4.5 that we may identify GL(V ) with ℂ× , hence any representation ϕ : G → GL(V ) induces a group homomorphism
ϕ ′ : G → ℂ× . Conversely, any homomorphism ϕ : G → ℂ× gives a representation of G on V using the same
identification GL(V ) ≅ ℂ× . We leave it as an exercise to check that any two distinct homomorphisms G → ℂ×
give rise to inequivalent representations. We thus see that the inequivalent 1-dimensional G-representations are
in one-to-one correspondence with group homomorphisms G → ℂ× .
Recall that we may associate to G the abelian group G/G′ , where G′ is the commutator subgroup. By
the universal property of abelianization, any homomorphism ϕ : G → ℂ× factors through the projection map
π : G → G/G′ , i.e. there exists a unique group homomorphism ψ : G/G′ → ℂ× such that ψ ◦ π = ϕ. We thus
may associate to any 1-dimensional representation of G a unique corresponding 1-dimensional representation
of G/G′ . Conversely, given a 1-dimensional representation ψ : G/G′ → ℂ× , we may take the composition
ψ ◦ π : G → ℂ× to get a 1-dimensional representation of G. Altogether we have the following result.
Proposition 4.7. Let G be an arbitrary finite group. Then the inequivalent one-dimensional representations of
G are in one-to-one correspondence with the inequivalent one-dimensional representations of the abelianization
G/G′ .
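For instance (anticipating Example 5.7 below), the commutator subgroup of S3 is A3 , so the abelianization S3 /A3 has order 2, and Proposition 4.7 says that S3 has exactly two inequivalent one-dimensional representations: the trivial representation and the sign representation, which sends each permutation to its sign ±1 ∈ ℂ× .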
We will see examples of how Proposition 4.7 may be used to compute the 1-dimensional irreducible rep-
resentations of a group in the following section. We conclude this section with a result that allows us to restrict
the dimensions of the irreducible representations of a finite group G based on its abelian subgroups.
Problem 4.13 (7 Points). Let G be an arbitrary finite group and H an abelian subgroup of G. Show that if
(V, ϕ) is a finite irreducible G-representation, then dimℂ (V ) ≤ [G : H]. Note that this generalizes the result of
Problem 4.12.
Example 4.10. The symmetric group S3 has a cyclic subgroup of order three given by H = {(123), (132), (1)}.
As [S3 : H] = 2, all irreducible representations of S3 must be of either dimension 1 or dimension 2.
Example 4.11. Any dihedral group Dn (where |Dn | = 2n) has a cyclic subgroup of index 2 given by the rota-
tions, so all irreducible representations of Dn are of dimension at most 2.
5 Character Theory
In this section, all groups denoted by G are finite, and all vector spaces and representations are over ℂ
and finite dimensional, unless otherwise stated.
We have seen that it is in general difficult to find or classify the irreducible representations of a (finite)
group. We will now introduce the basics of character theory. The character of a representation is a numerical
invariant that encodes the representation up to equivalence. In this way, it is possible to give a
description of all the irreducible representations of a finite group without explicitly writing them out.
5.1 Introduction
The character of a G-representation (V, ϕ) is the function χ : G → ℂ defined by χ(g) = tr(ϕ(g)), the trace of
the linear map ϕ(g). Since the trace does not depend on a choice of basis, χ(g) may be computed from the
matrix of ϕ(g) in any basis of V .
Example 5.3. The two-dimensional matrix representation of S3 (denoted ϕ : S3 → GL2 (ℂ)) determined by
ϕ((1 2)) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad ϕ((1 2 3)) = \begin{pmatrix} 0 & −1 \\ 1 & −1 \end{pmatrix}
has character χ given by
χ(1) = 2, χ((1 2)) = χ((2 3)) = χ((1 3)) = 0, χ((1 2 3)) = χ((1 3 2)) = −1.
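For instance, these values are simply traces: χ((1 2)) = 0 + 0 = 0 and χ((1 2 3)) = 0 + (−1) = −1 are the sums of the diagonal entries of the matrices above, while χ(1) = tr(I2 ) = 2 since ϕ(1) is the 2 × 2 identity matrix.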
Example 5.4. Let S := {1, 2, . . . n} be a finite set, and G a finite group acting on S. Let V := ℂ[S] be the ℂ-vector
space on S. In the previous section, we saw that ℂ[S] is a representation of G of dimension |S|. The “obvious”
basis of V , of course, is the one indexed by the elements of S. Denote by ϕ the corresponding homomorphism
of this representation.
Suppose that g · j = i for some i, j ∈ {1, 2, . . . , n}, g ∈ G. With respect to the above basis of V , the matrix
of ϕ(g) has a 1 in row i, column j, and a 0 in every other row of column j. One deduces the character χ of V
is given by
χ(g) = {# of i ∈ {1, 2, . . . , n} such that g · i = i}, g ∈ G,
i.e. the number of fixed points of g acting on S. In the special case where G = Sn with the natural permutation
action on {1, 2, . . . , n}, the character χ of V is given by χ(σ ) = #{i ∈ {1, 2, . . . , n} : σ (i) = i}, the number of
fixed points of the permutation σ .
Problem 5.2 (1 Point). Let χ be the character of the regular representation of G. Verify that for any g ∈ G,
χ(g) = \begin{cases} |G| & \text{if } g = 1 \\ 0 & \text{if } g ≠ 1. \end{cases}
Now, it is not immediately apparent why characters might be so useful, or why they are closely connected to
class functions. We start by recording some basic properties: χ(1) = dimℂ (V ) for any representation V with
character χ; equivalent representations have the same character; and χ(hgh−1 ) = χ(g) for all g, h ∈ G, i.e. every
character is a class function.
One of the main results of this section is the converse of the second property above: two representations with
the same character are equivalent. Since we have seen that characters are class functions, this suggests we
might want to study characters of representations by developing the theory of class functions. Namely, we shall
introduce an inner product on the space of all class functions, and the First Orthogonality Relation will describe
how the irreducible characters behave with respect to this inner product.
The precise setup is as follows. Let G be a finite group, and recall that ℂclass (G) denotes the set of all class
functions G → ℂ. Notice that for any χ, ψ ∈ ℂclass (G), and any scalars α, β ∈ ℂ, the “linear combination”
α χ + β ψ defined by (α χ + β ψ)(g) = α χ(g) + β ψ(g) is again a class function G → ℂ. Thus, ℂclass (G) is a
vector space over ℂ, and its dimension is finite, equal to the number of conjugacy classes of G. Now, one puts
a complex inner product on the set of class functions G → ℂ. Let χ, ψ : G → ℂ be class functions. Define a
function ⟨·, ·⟩G : ℂclass (G) × ℂclass (G) → ℂ by:
⟨χ, ψ⟩G := ⟨χ, ψ⟩ := (1/|G|) ∑_{g∈G} χ(g)\overline{ψ(g)}.
Problem 5.4. (2 Points) Verify that ⟨·, ·⟩G as defined above is a complex inner product on the vector space
ℂclass (G) of class functions G → ℂ.
Example 5.5. Let χ be a character of a one-dimensional representation of a finite group G, i.e. χ is a group
homomorphism G → ℂ× . Since, for every g ∈ G, χ(g) is a root of unity, we have that χ(g)\overline{χ(g)} = |χ(g)|² = 1.
Therefore,
⟨χ, χ⟩ = (1/|G|) ∑_{g∈G} χ(g)\overline{χ(g)} = (1/|G|) ∑_{g∈G} 1 = 1.
We are ready to state the First Orthogonality Relation, the powerful result which states that the irreducible
characters of G form a set of orthonormal vectors in ℂclass (G) with respect to the above inner product. Actually,
it is a generalization of the above example!
Theorem 5.3 (First Orthogonality Relation). Let χ, ψ be two irreducible characters of a finite group G. Then,
χ = ψ if and only if they correspond to equivalent representations of G. Furthermore,
⟨χ, ψ⟩ = \begin{cases} 1 & \text{if } χ = ψ \\ 0 & \text{if } χ ≠ ψ. \end{cases}
Note in particular that the First Orthogonality Relation verifies that the distinct irreducible characters of
G are in one-to-one correspondence with (a complete set of) the inequivalent irreducible representations of G.
Furthermore, recall that any set of nonzero, pairwise orthogonal vectors in an inner product space is linearly
independent (see Section 1). Hence,
Corollary. The set of irreducible characters of a finite group G is linearly independent in ℂclass (G). The
number of irreducible representations of G is finite and at most the number of conjugacy classes of G, which
equals dimℂ ℂclass (G).
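As a quick numerical illustration of what the relation asserts, let χtriv denote the character of the trivial one-dimensional representation of S3 (so χtriv ≡ 1) and let χ be the character from Example 5.3. These are characters of inequivalent irreducible representations of S3 , and indeed
⟨χtriv , χ⟩ = (1/6)(1 · 2 + 3 · (1 · 0) + 2 · (1 · (−1))) = (1/6)(2 − 2) = 0.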
Over the course of Problems 5.5, 5.6, 5.7, we will work through a proof of the First Orthogonality Relation.
Problem 5.5 (6 Points). Let G be a finite group, let (V, ϕ) be a representation of G, and let χ be the character
of V . Recall that
V G := {v ∈ V | ϕ(g)(v) = v for all g ∈ G}
is a subrepresentation of V , called the G-invariant subspace of V .
(a) (4 Points) Consider the linear map
P := (1/|G|) ∑_{g∈G} ϕ(g) ∈ Hom(V,V ).
Prove that P = P² and im(P) = V G . In other words, prove that P is a projection of V onto V G .
(b) (2 Points) Deduce that
dim V G = (1/|G|) ∑_{g∈G} χ(g).
Problem 5.6 (5 Points). Let G be a finite group, and let (V, ϕ1 ), (W, ϕ2 ) be representations of G. Recall that in
Problem 4.5 we defined a representation ϕ : G → GL(Hom(V,W )) as follows: for g ∈ G, ϕ(g) is the linear
isomorphism which takes a linear map f : V → W to the linear map ϕ2 (g) ◦ f ◦ ϕ1 (g)−1 : V → W .
We have also seen that HomG (V,W ) = Hom(V,W )G , so that HomG (V,W ) is a subrepresentation of Hom(V,W ).
Now, let χV , χW be the characters of V,W , respectively, and let χ be the character of (Hom(V,W ), ϕ). Prove
that
χ(g) = χV (g−1 )χW (g)
for all g ∈ G. Deduce that
dim HomG (V,W ) = (1/|G|) ∑_{g∈G} χV (g−1 )χW (g).
Take notation as in Problem 5.6, now assuming V,W are irreducible and inequivalent. By Schur’s Lemma,
we have
(1/|G|) ∑_{g∈G} χV (g−1 )χW (g) = dim HomG (V,W ) = 0,   (1/|G|) ∑_{g∈G} χV (g−1 )χV (g) = dim HomG (V,V ) = 1
(namely, HomG (V,V ) ≅ ℂ as vector spaces, since every G-linear map V → V is a scalar multiple of the identity).
In particular, we cannot have χV = χW , for then the two sums above would be equal. Therefore, the
proof of the First Orthogonality Relation is complete once we have verified the following fact:
Problem 5.7 (5 Points). Let (V, ϕ) be a representation of a finite group G, and χ its character. Then, for any
g ∈ G, prove that χ(g) is a sum of complex roots of unity, and that χ(g−1 ) = \overline{χ(g)}, the complex conjugate of χ(g).
Having completed the proof of the First Orthogonality Relation and shown that the irreducible characters of G
form an orthonormal (hence linearly independent) set in ℂclass (G), we shall now discuss its relation to direct
sum decompositions of representations into irreducible components. In particular, we shall prove the promised
assertion that two representations with the same character are equivalent. As usual, let G be a finite group and
(V, ϕ) a finite dimensional representation of G. Suppose that we have a direct sum decomposition
V = V1 ⊕V2 ⊕ · · · ⊕Vs ,
where V1 ,V2 , . . . ,Vs are irreducible subrepresentations of V (such a decomposition exists by Problem 4.10).
Since the matrix of ϕ(g) is block diagonal with respect to a basis adapted to this decomposition (as in Example
4.9), the character of V is the sum of the characters of the Vi .
Problem 5.8 (4 Points). Let (V, ϕ1 ) be a representation of a finite group G, and χ its character. Let U1 , . . . ,Ur
be a complete set of inequivalent irreducible representations of G, and let χ1 , . . . , χr be their respective characters.
(a) (2 Points) Prove that for any direct sum decomposition V = V1 ⊕ V2 ⊕ · · · ⊕ Vs , where V1 ,V2 , . . . ,Vs are
irreducible subrepresentations of V , the number of V j ’s equivalent to Ui (for any 1 ≤ i ≤ r) is equal to
⟨χ, χi ⟩. Deduce that the character of V is simply
χ = ∑_{i=1}^{r} ⟨χ, χi ⟩ χi .
(b) (2 Points) Now, let (W, ϕ2 ) be another representation of G, and ψ its character. Prove that if χ = ψ, then
V,W are equivalent representations.
Therefore, given any representation V of G, and a complete set of inequivalent irreducible representations
U1 , . . . ,Ur of G, we may write
V = U_1^{⊕n_1} ⊕ U_2^{⊕n_2} ⊕ · · · ⊕ U_r^{⊕n_r}
for some integers n1 , . . . , nr ≥ 0. This notation is to say that V decomposes as n1 + · · · + nr irreducible com-
ponents, where exactly ni components are equivalent to Ui . We say that ni is the multiplicity of Ui in V . Of
course, we have already verified using character theory that the integer ni is uniquely determined, i.e. ni does
not depend on the choice of direct sum decomposition.
One of the immediate consequences of the character theory of reducible representations is as follows. Let
χ be a character of a group G. Write χ = a1 χ1 + · · · + ar χr as a linear combination of the irreducible characters
χ1 , . . . , χr of G (so each ai is the multiplicity of the corresponding irreducible representation, a nonnegative
integer). Then, expanding by linearity and using the First Orthogonality Relation, ⟨χ, χ⟩ = a1² + · · · + ar² . In
particular, ⟨χ, χ⟩ = 1 if and only if χ is irreducible. You might also observe that ⟨χ, χ⟩ = 2 implies that χ is
the sum of two distinct irreducible characters of G. This method allows us to study reducible representations
of G without even knowing what the irreducible representations/characters are! Let us consider an example in
which we exhibit the power of many of these tools at once.
Example 5.6. Let W be the (usual) permutation representation of S3 . This means that we are letting S3 act
naturally on three points {1, 2, 3}; letting W be the complex vector space on {1, 2, 3}, this yields a representation
ϕ : S3 → GL(W ). Let χ be the character of (W, ϕ). Thus, χ(σ ) is the number of fixed points of σ on {1, 2, 3}
by Example 5.4. In particular,
⟨χ, χ⟩ = (1/6)(3² + 3 · 1² + 2 · 0²) = 2,
so, as remarked above, χ = χ1 + χ2 , where χ1 , χ2 are distinct irreducible characters of S3 . Now, let ψ be the
character of the matrix representation of S3 in Example 5.3. Observe that
⟨ψ, ψ⟩ = (1/6)(2² + 3 · 0² + 2 · (−1)²) = 1,   ⟨χ, ψ⟩ = (1/6)(3 · 2 + 3 · (1 · 0) + 2 · (0 · (−1))) = 1,
so the matrix representation in Example 5.3 is irreducible and appears with multiplicity 1 in W . In particular,
χ = χ1 + ψ for some irreducible character χ1 . Observe that χ1 = χ − ψ ≡ 1, i.e. χ1 is the trivial character of
S3 . This verifies that W decomposes as the direct sum of the trivial representation of S3 and an irreducible
representation of S3 of dimension 2.
We have already seen, via the First Orthogonality Relation, that the irreducible characters of G form an
orthonormal (hence linearly independent) set of vectors in ℂclass (G). We shall now strengthen this corollary to
Theorem 5.4. Let χ1 , . . . , χr be the irreducible characters of a finite group G. Then, χ1 , . . . , χr form an or-
thonormal basis of ℂclass (G). In particular, the number of inequivalent irreducible representations of G equals
the number of conjugacy classes of G.
Problem 5.9 (8 Points). Prove Theorem 5.4. Hint: the proof that we have in mind starts as follows. Let
α : G → ℂ be a class function. For any representation (V, ϕ) of G, consider the associated linear map
ρα,V := ∑_{g∈G} α(g)ϕ(g).
Recall that for any finite abelian group G, a (finite dimensional) representation V of G is irreducible if and
only if it is one-dimensional. Moreover, recall that the one-dimensional representations of any arbitrary finite
group G are in one-to-one correspondence with those of G/G′ , where G′ ≤ G is the commutator subgroup.
Then, we obtain the following consequence of Theorem 5.4:
Corollary. The number of inequivalent one-dimensional representations of any finite group G equals the index
[G : G′ ]. In particular, any finite abelian group G has |G| inequivalent one-dimensional representations.
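As an illustration, we will see in Example 5.9 below that the commutator subgroup of the dihedral group D4 is D′4 = {1, r²}, which has index 4 in D4 ; the corollary therefore predicts exactly four one-dimensional representations of D4 , matching the four one-dimensional characters appearing in its character table.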
We are now equipped with a wealth of tools from the theory of representations and characters, especially
those developed in the previous sections. Example 5.6 was only one simple application of this theory. Now we
are ready to apply this theory to prove interesting facts about group representations and study various explicit
examples! One of the main tasks you will have in this section is the problem of classifying all irreducible
representations/characters of a finite group G. Even with the First Orthogonality Relation in hand, this task is,
in general, quite challenging (in some cases perhaps impossible without even more theory), so be prepared to
use everything you know in the following series of problems!
First, we shall decompose the character of the regular representation of G using the First Orthogonality
Relation:
Problem 5.10 (4 Points). Let G be a finite group, and let χ be the character of the regular representation of G.
(a) (2 Points) Let χ1 , . . . , χr be the complete set of distinct irreducible characters of G. Prove that
χ = ∑_{i=1}^{r} (dim χi ) χi .
In particular, every irreducible representation of G has positive multiplicity in the regular representation
of G. Deduce the formula
|G| = ∑_{i=1}^{r} (dim χi )².
The result of Problem 5.10 is standard in the study of representations of finite groups. In particular, we have
that (dim χ)² ≤ |G| for any irreducible character χ of the finite group G. This result may be strengthened,
however, as the following problem states. You will likely need character theory to solve it.
Problem 5.11 (5 Points). Let V be any irreducible representation of a finite group G. Show that (dim V )2 ≤
[G : Z(G)]. Here, recall that Z(G) ≤ G is the center of G.
The classification of the irreducible representations of the symmetric group is just one of many bridges
between the fields of algebra and combinatorics. We shall only begin to explore this beautiful theory in this
Power Round. This is precisely the topic of the next problem, where we exhibit a well-known family of
irreducible representations of Sn , known as the standard representation of Sn . Note that we have already seen
what the standard representation looks like in the small case n = 3 (see Examples 5.3 and 5.6).
We shall now move on to the problem of classifying all irreducible characters of a finite group. To this end,
we introduce the character table of G, a bookkeeping device that records the data of all the irreducible characters
of G. Informally speaking, the character table is a table with r rows and r columns (r = the number of conjugacy
classes of G) which records the values taken by each irreducible character of G on each of the distinct conjugacy
classes of G. The formal definition is:
Definition 5.5. Given a finite group G, let χ1 , . . . , χr be the distinct irreducible characters of G, and let g1 , . . . , gr ∈
G be a set of representatives of the conjugacy classes of G. Then, the character table of G is the r × r square
matrix
(χi (g j ))1≤i, j≤r .
Since the character table of G is an r × r matrix with values in ℂ, we may study it using techniques from
linear algebra. View the rows of this matrix as vectors in ℂr . Despite the First Orthogonality Relation, these
vectors are in general not orthogonal under the usual Euclidean inner product of ℂr (for instance, in the table
for S3 computed below, the rows (1, 1, 1) and (2, 0, −1) have Euclidean inner product 1 ≠ 0). However, the
formula for the inner product on ℂclass (G) is not too different from the Euclidean one: it simply weights each
conjugacy class by its size. To this end, we recall what orthogonality/orthonormality means for the Euclidean
inner product. Given any n × n square matrix P, let P† be the conjugate transpose matrix of P. We have already
seen that the rows of P form an orthonormal basis of ℂn if and only if P is invertible with P−1 = P† , if and only
if the columns of P form an orthonormal basis of ℂn . The intuition here may be applied to the similar but
somewhat different situation of character tables. As the First Orthogonality Relation concerns the rows of a
character table, we might believe that there is also a column orthogonality condition on the character table of
G. This is indeed the case; the exact statement is given by the following problem:
Problem 5.13 (6 Points). Let χ1 , . . . , χr be the distinct irreducible characters of a finite group G. Prove the
Second Orthogonality Relation: for any x, y ∈ G, we have
∑_{i=1}^{r} χi (x) \overline{χi (y)} = \begin{cases} 0 & \text{if } x, y \text{ are not conjugate in } G \\ |CG (x)| & \text{if } x, y \text{ are conjugate in } G, \end{cases}
where CG (x) denotes the centralizer of x in G.
Hence, the columns of any character table are a set of r orthogonal (hence linearly independent) vectors in
ℂr with the standard Euclidean inner product, so the character table of any finite group is an invertible matrix!
In particular,
Corollary. Elements x, y ∈ G of a finite group are conjugate if and only if χ(x) = χ(y) for all irreducible
characters χ of G.
Now, let us illustrate the practical usefulness of the character table. In the character table of G, the rows are
usually labelled with the irreducible characters of G, and the columns are usually labelled with the conjugacy
classes of G along with their respective sizes. It is particularly helpful to track the sizes of the conjugacy
classes to aid with the computation of the inner products of characters. We shall now present some example
computations of character tables (a complete character table of G is a complete classification of the irreducible
characters of G). In general, the step-by-step process is as follows (for a finite group G):
(1) Compute the conjugacy classes of G, labelling the columns of the character table of G (in particular, this
determines the size of the character table).
(2) Compute the one-dimensional (irreducible) representations of G (i.e., by studying the abelian quotient
G/G′ ).
(3) Compute the remaining irreducible characters of G using the First Orthogonality Relation and other facts
we know.
Example 5.7. We compute the character table of S3 . The conjugacy classes of S3 are indexed by the distinct
partitions of 3: they are precisely {1}, {(1 2), (2 3), (1 3)}, {(1 2 3), (1 3 2)} (in particular, recall two elements
of S3 are conjugate if and only if they have the same cycle type). Since S3′ = A3 (why?), there are 2 = [S3 : A3 ]
one-dimensional representations of S3 , given by the two distinct homomorphisms S3 → {±1} ≤ ℂ× ; the non-
trivial one is called the sign representation. We have already seen a two-dimensional irreducible representation
of S3 , the standard representation. Hence, the character table of S3 is
classes:   1     (1 2)   (1 2 3)
sizes:     1       3        2
χ1         1       1        1
χ2         1      −1        1
χ3         2       0       −1
where χ1 , χ2 are the characters of S3 corresponding to the trivial and the sign representation, respectively, and
χ3 is the character of the two-dimensional standard representation.
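As a quick sanity check of the orthogonality relations on this table (remembering to weight each column by its class size when computing inner products), we have, for example,
⟨χ3 , χ3 ⟩ = (1/6)(1 · 2² + 3 · 0² + 2 · (−1)²) = 1 and ⟨χ2 , χ3 ⟩ = (1/6)(1 · (1 · 2) + 3 · ((−1) · 0) + 2 · (1 · (−1))) = 0,
while for the columns of 1 and (1 2) we get 1 · 1 + 1 · (−1) + 2 · 0 = 0, and the column of (1 2) paired with itself gives 1² + (−1)² + 0² = 2 = |CS3 ((1 2))|.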
Example 5.8. We compute the character table of ℤ4 , the cyclic group of order 4. This group is abelian, so each
element is its own conjugacy class; there are thus four conjugacy classes and four inequivalent irreducible
representations of ℤ4 , all one-dimensional. We see that there are four distinct homomorphisms ℤ4 → {±1, ±i} ≤ ℂ× ,
each determined uniquely by the choice of where the generator 1 ∈ ℤ4 maps to. This means we have found all
irreducible characters, so the character table of ℤ4 is
classes:   0     1     2     3
sizes:     1     1     1     1
χ1         1     1     1     1
χ2         1     i    −1    −i
χ3         1    −1     1    −1
χ4         1    −i    −1     i
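The same pattern holds more generally (we will not need this, but it is a useful sanity check): for the cyclic group ℤn of order n, the n irreducible characters are given by χk ( j) = ζ^{ jk} for k = 0, 1, . . . , n − 1, where ζ = e^{2πi/n} is a fixed primitive nth root of unity; the table above is the case n = 4 with ζ = i.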
Example 5.9. We compute the character table of D4 , the dihedral group of order 8. To write down the elements
of D4 , let r ∈ D4 be the 90◦ rotation, and let s ∈ D4 be any flip. Thus, r⁴ = s² = 1, and rs = sr−1 , and as a
set, D4 = {1, r, r², r³, s, sr, sr², sr³}. The conjugacy classes of D4 are {1}, {r²}, {r, r³}, {s, sr²}, {sr, sr³}, and the
commutator subgroup is D′4 = {1, r²}. Note D4 /D′4 is isomorphic to ℤ/2ℤ × ℤ/2ℤ, the direct product of the
cyclic group of order 2 with itself. In particular, we may define a surjective homomorphism D4 → ℤ/2ℤ × ℤ/2ℤ
by mapping r to (1, 0) and s to (0, 1); this is valid because the images of r and s satisfy the defining relations of
D4 . The kernel of this homomorphism is exactly D′4 , hence the requested isomorphism. Now, D4 /D′4 ≅ ℤ/2ℤ × ℤ/2ℤ
has order 4, and so there are exactly four distinct homomorphisms D4 → ℂ× . This yields four distinct one-
dimensional characters of D4 , as listed in the following incomplete character table of D4 :
classes:   1     r²     s     r     sr
sizes:     1     1      2     2     2
χ1         1     1      1     1     1
χ2         1     1     −1     1    −1
χ3         1     1      1    −1    −1
χ4         1     1     −1    −1     1
χ5
We just need to determine the remaining character, χ5 . We can do so without even finding the corresponding
irreducible representation, simply by recalling results from Problem 5.10. We have ∑_{i=1}^{5} (dim χi )² = 8, so
dim χ5 = 2. Now, if χ is the character of the regular representation of D4 , then the fact that
χ = χ1 + χ2 + χ3 + χ4 + 2χ5
determines the value of χ5 on every conjugacy class of D4 . This completes our computation of the character
table of D4 :
classes:   1     r²     s     r     sr
sizes:     1     1      2     2     2
χ1         1     1      1     1     1
χ2         1     1     −1     1    −1
χ3         1     1      1    −1    −1
χ4         1     1     −1    −1     1
χ5         2    −2      0     0     0
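Concretely, since the regular character χ takes the value |D4 | = 8 at the identity and 0 elsewhere (Problem 5.2), we can solve for χ5 class by class: for example, χ5 (r²) = (χ(r²) − χ1 (r²) − χ2 (r²) − χ3 (r²) − χ4 (r²))/2 = (0 − 1 − 1 − 1 − 1)/2 = −2, and similarly χ5 (s) = χ5 (r) = χ5 (sr) = (0 − 0)/2 = 0, while χ5 (1) = (8 − 4)/2 = 2.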
One can check by hand that for every one of these character tables we have computed so far, the rows satisfy
the First Orthogonality Relation, while the columns satisfy the Second Orthogonality Relation.
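For readers who want to double-check such tables by machine, here is a small, optional Python sketch (not part of the Power Round, and assuming the numpy library is available) that verifies both orthogonality relations for the D4 table above. The character values and class sizes are copied directly from the table; the variable names are our own.

```python
import numpy as np

# Conjugacy class representatives of D4: 1, r^2, s, r, sr, with class sizes 1, 1, 2, 2, 2.
sizes = np.array([1, 1, 2, 2, 2])
order = sizes.sum()  # |D4| = 8

# Rows are the irreducible characters chi_1, ..., chi_5 from the completed table above.
X = np.array([
    [1,  1,  1,  1,  1],
    [1,  1, -1,  1, -1],
    [1,  1,  1, -1, -1],
    [1,  1, -1, -1,  1],
    [2, -2,  0,  0,  0],
], dtype=complex)

# First Orthogonality: <chi_i, chi_j> = (1/|G|) * sum over classes of |C| * chi_i(C) * conj(chi_j(C)),
# which should give the identity matrix as (i, j) varies.
row_gram = (X * sizes) @ X.conj().T / order
assert np.allclose(row_gram, np.eye(5))

# Second Orthogonality: sum_i chi_i(x) * conj(chi_i(y)) is |C_G(x)| = |G| / |class of x| when x ~ y, else 0.
col_gram = X.T @ X.conj()
assert np.allclose(col_gram, np.diag(order / sizes))

print("Both orthogonality relations hold for the D4 character table.")
```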
In the last example, we have shown that it is possible to compute the character table of G without full
knowledge of the irreducible representations of G. Moreover, we have seen that finding irreducible representa-
tions of G explicitly is certainly helpful (crucial, even) for computing the character table of G. However, given
the character table of G, it is in general a far more challenging problem to find an irreducible representation
affording each irreducible character.
We conclude this section with some exercises on computing the character tables of some familiar groups
of small order. The challenge problem here is part (d).
Problem 5.14 (22 Points). Write down the character tables of the following groups:
(a) (2 Points) Q8 .
(b) (3 Points) D5 .
(c) (5 Points) S4 and A4 .
(d) (12 Points) A5 .
No explanations are needed for parts (a), (b), (c). A proof is required for part (d).
6 Real Character Tables of An and Sn
In this section, we seek to understand when the character tables of An and Sn consist of only real entries.
Throughout, G denotes a finite group with distinct irreducible characters χ1 , . . . , χr . We start by characterizing
when χi (g) is real for each 1 ≤ i ≤ r.
Problem 6.1 (4 Points). Let g ∈ G be an element. Show that χi (g) is real for each 1 ≤ i ≤ r iff g and g−1 are in
the same conjugacy class.
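For instance, the two-dimensional irreducible character of S3 computed in Example 5.7 takes only the real values 2, 0, −1, consistent with the fact that every element of S3 is conjugate to its own inverse (for example, (1 2)−1 = (1 2), and (1 2 3)−1 = (1 3 2) is conjugate to (1 2 3)).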
It turns out that the character table of Sn always consists of real entries:
Problem 6.2 (2 Points). Let n be a positive integer. Show that each entry of the character table of Sn is real.
The case of An is more subtle, but we can now use the machinery developed to handle it. For An , by Problem
6.1, the problem reduces to determining when g and g−1 are in the same conjugacy class for all g ∈ An . Firstly,
we only need to consider certain cycle types:
Problem 6.3 (2 Points). Suppose that g ∈ An has a cycle type not consisting of distinct odd integers. Then show
that g and g−1 are conjugate in An .
In particular, we only need to consider the g with cycle types consisting of distinct odd integers.
Problem 6.4 (6 Points). Let g ∈ An be any element with cycle type consisting of k distinct odd integers. Show
that g and g−1 are conjugate in An iff n ≡ k mod 4.
With this, we now have all the information we need to complete the investigation of when An has real
character table:
Problem 6.5 (3 Points). Let n > 1 be a positive integer. Show that An has real character table iff n =
2, 5, 6, 10, 14.
Given a conjugacy class C of a group G, note that the inverses of the elements of C also form a conjugacy
class of G. We call this conjugacy class C−1 and say that a conjugacy class C is self-inverse if C = C−1 . Let r
be the number of conjugacy classes of a finite group G. From Problem 6.1, we know that a finite group G has
r real irreducible characters iff it has real character table iff g and g−1 are conjugate for each g ∈ G iff G has r
self-inverse conjugacy classes. We can prove a stronger result of this form:
Problem 6.6 (10 Points). Show that for any finite group G, the number of real irreducible characters of G is
equal to the number of self-inverse conjugacy classes of G.