1 Span and Bases: Lecture 2: September 30, 2021
Recall the definition of the span of a set, discussed in the previous lecture: for a set S ⊆ V, Span(S) is the set of all finite linear combinations c1 · v1 + · · · + ck · vk with v1, . . . , vk ∈ S and c1, . . . , ck ∈ F. Note that we only include finite linear combinations. Also, since linear combinations of vectors in V are still in V, we have Span(S) ⊆ V. In fact, you can check that Span(S) is also a vector space. Such a subset of V, which is also a vector space, is called a subspace of V. For example, in R^2, Span({(1, 0)}) = {(c, 0) | c ∈ R} is a subspace.
Remark 1.2 Note that the definition above, as well as the previous definitions of linear dependence and independence, involves only finite linear combinations of the elements. Infinite sums cannot be
said to be equal to a given element of the vector space without a notion of convergence or distance,
which is not necessarily present in an abstract vector space.
Definition 1.3 A set B is said to be a basis for the vector space V if B is linearly independent and Span(B) = V.
We will say that a set B ⊆ V is a maximal linearly independent set if B is linearly indepen-
dent and for all v ∈ V \ B, B ∪ {v} is linearly dependent. We also discussed that B ⊆ V is
a basis for V if and only if B is a maximal linearly independent set.
We will now discuss a tool that’ll be very helpful in arguing about bases of vector spaces.
Proposition 1.4 (Steinitz exchange principle) Let {v1, . . . , vk} be linearly independent and {v1, . . . , vk} ⊆ Span({w1, . . . , wn}). Then ∀i ∈ [k] ∃j ∈ [n] such that wj ∉ {v1, . . . , vk} \ {vi} and ({v1, . . . , vk} \ {vi}) ∪ {wj} is linearly independent.
Proof: Assume not. Then, there exists i ∈ [k] such that for all wj ∉ {v1, . . . , vk} \ {vi}, the set ({v1, . . . , vk} \ {vi}) ∪ {wj} is linearly dependent. Note that this means we cannot have vi ∈ {w1, . . . , wn} (why?)
The above gives that for all j ∈ [n], wj ∈ Span({v1, . . . , vk} \ {vi}): this is immediate when wj ∈ {v1, . . . , vk} \ {vi}, and otherwise follows from the linear dependence above together with the linear independence of {v1, . . . , vk} \ {vi}. However, this implies
{v1, . . . , vk} ⊆ Span({w1, . . . , wn}) ⊆ Span({v1, . . . , vk} \ {vi}),
so in particular vi ∈ Span({v1, . . . , vk} \ {vi}), which contradicts the linear independence of {v1, . . . , vk}. □
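To see the exchange in action, here is a small numerical sketch (not part of the notes; the vectors are chosen purely for illustration), using matrix rank as the test for linear independence:

import numpy as np

# A concrete instance in R^3: {v1, v2} is linearly independent
# and contained in Span({w1, w2}).
v = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
w = [np.array([1.0, 1.0, 0.0]), np.array([1.0, -1.0, 0.0])]

def independent(vectors):
    """Vectors are linearly independent iff the matrix having them as
    rows has rank equal to the number of vectors."""
    return np.linalg.matrix_rank(np.vstack(vectors)) == len(vectors)

# For each i, Proposition 1.4 guarantees some j for which removing v_i
# and adding w_j preserves linear independence.
for i in range(len(v)):
    rest = [v[k] for k in range(len(v)) if k != i]
    j = next(j for j, wj in enumerate(w) if independent(rest + [wj]))
    print(f"v_{i+1} can be exchanged for w_{j+1}")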
Corollary 1.5 Let B1 = {v1 , . . . , vk } and B2 = {w1 , . . . , wn } be two bases of a vector space V.
Then, they must have the same size, i.e., k = n.
Note that we are already assuming in the above statement that the bases are finite, which
may not necessarily be the case for all vector spaces. The above just says that if there happen to be two finite bases, then they must be of equal size.
Proof Sketch: Use the exchange principle to successively replace elements from B1 by
those from B2 . Since we need to replace k elements and no element of B2 can be used twice
(why?), we must have k ≤ n. By symmetry, we must also have n ≤ k.
Note that the above argument just needs that we can remove elements from B1 and replace
them by new elements from B2 . While it is true that the intermediate sets we will construct
will also be bases, we don’t need this to argue k ≤ n. □
A vector space V is said to be finitely generated if there exists a finite set T such that
Span ( T ) = V. Note that Corollary 1.5 proves that all bases of a finitely generated vector
space (if they exist!) have the same size. It is easy to see that a similar argument can also
be used to prove that a basis must always exist.
Exercise 1.6 Prove that a finitely generated vector space with a generating set T has a basis which is a subset of T.
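One way to think about Exercise 1.6, at least over R^n, is as a greedy algorithm. The following sketch (an illustration only, with numerical rank standing in for exact linear independence) extracts a basis from a finite generating set:

import numpy as np

def extract_basis(T):
    """Greedily keep each vector of T that is not in the span of the
    vectors kept so far (i.e., that increases the rank). The kept
    vectors form a basis of Span(T) contained in T."""
    basis, rank = [], 0
    for t in T:
        if np.linalg.matrix_rank(np.vstack(basis + [t])) > rank:
            basis, rank = basis + [t], rank + 1
    return basis

# Four vectors spanning a 2-dimensional subspace of R^3.
T = [np.array([1.0, 0.0, 0.0]),
     np.array([2.0, 0.0, 0.0]),    # in the span of the first
     np.array([0.0, 1.0, 0.0]),
     np.array([1.0, 1.0, 0.0])]    # in the span of the first and third
print(len(extract_basis(T)))       # prints 2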
Exercise 1.7 Let V be a finitely generated vector space and let S ⊆ V be any linearly independent
set. Then S can be “extended” to a basis of V, i.e., there exists a basis B such that S ⊆ B.
The common size of all bases of a finitely generated vector space is called the dimension of the vector space, denoted as dim(V). Using the above arguments, it is also easy to check that any linearly independent set of the right size must be a basis.
Exercise 1.8 Let V be a finitely generated vector space and let S be a linearly independent set with
|S| = dim(V ). Prove that S must be a basis of V.
Of course, it need not always be the case that the vector space we are dealing with is finitely generated. For example, the vector space R[x] of polynomials has no finite generating set (since the maximum degree of a polynomial in any finite generating set T would be an upper bound on the degree of polynomials in Span(T)). However, it is still the case that every vector
space has a (possibly infinite) basis, such that all elements can be expressed as finite linear
combinations of the basis elements. Such a basis is known as a Hamel basis. We will
present the argument for existence of a Hamel basis below, in case you are interested.
However, this will not be included in tests or homeworks for the class.
To prove the existence of a basis for every vector space, we will need Zorn’s Lemma (which
is equivalent to the axiom of choice). We first define the concepts needed to state and apply
the lemma.
Definition 1.9 Let X be a non-empty set. A relation ⪯ between elements of X is called a partial order if:
- x ⪯ x for all x ∈ X.
- x ⪯ y, y ⪯ x ⇒ x = y.
- x ⪯ y, y ⪯ z ⇒ x ⪯ z.
The relation is called a partial order since not all the elements of X may be related. A subset Z ⊆ X
is called totally ordered if for every x, y ∈ Z we have x ⪯ y or y ⪯ x. A set Z ⊆ X is called
bounded if there exists x0 ∈ X such that z ⪯ x0 for all z ∈ Z. An element x0 ∈ X is maximal if there does not exist any y ∈ X \ {x0} such that x0 ⪯ y.
Proposition 1.10 (Zorn’s Lemma) Let X be a partially ordered set such that every totally ordered
subset of X is bounded. Then X contains a maximal element.
We can in fact use Zorn’s Lemma to prove a stronger statement than the existence of a basis
(which we already saw for finitely generated vector spaces).
Proposition 1.11 Let V be a vector space over a field F and let S be a linearly independent subset.
Then there exists a basis B of V containing the set S.
Proof: Let X be the set of all linearly independent subsets of V that contain S. For T1 , T2 ∈
X, we say that T1 ⪯ T2 if T1 ⊆ T2 . Let Z be a totally ordered subset of X. Define T ∗ as
T∗ := ⋃_{T ∈ Z} T = {v ∈ V | ∃ T ∈ Z such that v ∈ T} .
Then we claim that T ∗ is linearly independent and is hence in X. It is clear that T ⪯ T ∗ for
all T ∈ Z, and this will prove that Z is bounded by T∗. By Zorn’s Lemma, X then contains a maximal element, say B. Since adding any v ∈ V \ B to B yields a linearly dependent set (by maximality), B is a maximal linearly independent set, and hence a basis containing S.
To show that T ∗ is linearly independent, note that we only need to show that no finite
subset of T∗ is linearly dependent. Indeed, let {v1, . . . , vn} be a finite subset of T∗. By the definition of T∗, each vi lies in some Ti ∈ Z, and since Z is totally ordered, the largest of T1, . . . , Tn, say T, contains all of them, so {v1, . . . , vn} ⊆ T. Since T ∈ X is linearly independent, {v1, . . . , vn} must be linearly independent. This proves the claim. □
The polynomials f1, . . . , fn are n linearly independent polynomials in R≤n−1[x]. Thus, they must form a basis for R≤n−1[x], and we can write the required polynomial, say p, as
p = ∑_{i=1}^{n} ci · fi
for some c1, . . . , cn ∈ R.
Exercise 1.12 Check that the above argument can be used to find a polynomial of degree at most
n − 1 in the space F[ x ] for any field F such that |F| ≥ n.
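The interpolation argument can also be carried out explicitly. Below is a sketch in Python over F_p (the prime p = 13 and the points are arbitrary illustrative choices); it builds basis polynomials fi satisfying fi(xi) = 1 and fi(xj) = 0 for j ≠ i, one standard choice which need not match the construction used in lecture, and sums them to obtain the interpolant:

# Lagrange interpolation over F_p (illustrative sketch).
p = 13

def mul_linear(f, xj, inv):
    """Multiply the polynomial f (coefficients, lowest degree first)
    by inv * (x - xj), working modulo p."""
    g = [0] * (len(f) + 1)
    for k, c in enumerate(f):
        g[k] = (g[k] - xj * c * inv) % p
        g[k + 1] = (g[k + 1] + c * inv) % p
    return g

def interpolate(points):
    """Coefficients of the unique polynomial of degree <= n-1 passing
    through the given n points (x_i, y_i) over F_p."""
    coeffs = [0] * len(points)
    for i, (xi, yi) in enumerate(points):
        fi = [1]                    # f_i = prod_{j != i} (x - x_j)/(x_i - x_j)
        for j, (xj, _) in enumerate(points):
            if j != i:
                fi = mul_linear(fi, xj, pow(xi - xj, -1, p))
        for k, c in enumerate(fi):  # p(x) = sum_i y_i * f_i(x)
            coeffs[k] = (coeffs[k] + yi * c) % p
    return coeffs

def evaluate(coeffs, x):
    return sum(c * pow(x, k, p) for k, c in enumerate(coeffs)) % p

points = [(1, 5), (2, 7), (3, 11)]
q = interpolate(points)
print([evaluate(q, x) for x, _ in points])  # [5, 7, 11]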
1.4 Secret Sharing
Consider the problem of sharing a secret s, which is an integer in a known range [0, M], with a group of n people, such that if any d of them get together, they are able to learn the secret message. However, if fewer than d of them are together, they do not get any information about the secret. We can then proceed as follows:
- Choose a prime p > max{M, n} and view the secret s as an element of the field F_p.
- Pick a polynomial Q over F_p of degree at most d − 1, uniformly at random conditioned on Q(0) = s.
- Give the i-th person the value Q(i) as their share, for each i ∈ [n].
Note that if any group of d or more people get together, they can uniquely determine the
polynomial Q by Lagrange interpolation. They can then recover the secret by evaluating
Q at 0. However, if d − 1 of them gather, then there is always a polynomial consistent with
the values they hold, and any possible value at 0. To precisely say that they learn nothing
about the secret, we use the fact that there is exactly one polynomial consistent with the values they hold and any given value at 0. Since for any given secret s there are exactly p^{d−1} polynomials with Q(0) = s, and we chose the polynomial at random conditioned on the secret, this means that any two secrets have the same probability of producing the
observed (d − 1)-tuple of shares. We will talk in more depth about arguments like this
when we discuss probability in the second half of the course.
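To make this concrete, here is a sketch of the scheme in Python (the prime and the parameters below are illustrative choices, not prescribed by the notes): the shares are evaluations of a random polynomial Q with Q(0) = s, and any d people recover the secret by Lagrange interpolation at 0.

import random

p, n, d = 101, 5, 3         # prime p > max(M, n); n people; threshold d
s = 42                      # the secret

# A uniformly random polynomial Q of degree at most d-1 with Q(0) = s,
# stored as coefficients, lowest degree first.
Q = [s] + [random.randrange(p) for _ in range(d - 1)]

def evaluate(coeffs, x):
    return sum(c * pow(x, k, p) for k, c in enumerate(coeffs)) % p

shares = {i: evaluate(Q, i) for i in range(1, n + 1)}  # person i holds Q(i)

def recover(subset):
    """Any d people recover s = Q(0) by Lagrange interpolation at 0."""
    total = 0
    for i in subset:
        li = 1                # l_i(0) = prod_{j != i} (0 - j)/(i - j) mod p
        for j in subset:
            if j != i:
                li = li * (-j) * pow(i - j, -1, p) % p
        total = (total + shares[i] * li) % p
    return total

print(recover([1, 3, 5]))    # prints 42 for any subset of size d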
2 Linear Transformations
Definition 2.1 Let V and W be vector spaces over the same field F. A map φ : V → W is called a
linear transformation if
- φ ( v1 + v2 ) = φ ( v1 ) + φ ( v2 ) ∀v1 , v2 ∈ V.
- φ(c · v) = c · φ(v) ∀c ∈ F, v ∈ V.
Example 2.2 An example of a linear transformation:
- φ : C([0, 1], R) → C([0, 1], R) defined by φ(f)(x) = f(x²).
Proposition 2.3 Let V, W be vector spaces over F and let B be a basis for V. Let α : B → W
be an arbitrary map. Then there exists a unique linear transformation φ : V → W satisfying
φ(v) = α(v) ∀v ∈ B.
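In coordinates, Proposition 2.3 is the familiar statement that a linear map is determined by where it sends a basis. Here is a small sketch over V = R^2 and W = R^3, with B the standard basis and α an assignment chosen arbitrarily for illustration:

import numpy as np

# Images of the basis vectors under alpha (arbitrary choices).
alpha_e1 = np.array([1.0, 2.0, 3.0])
alpha_e2 = np.array([0.0, 1.0, 0.0])

# The unique linear extension phi sends v to A v, where the columns
# of A are the images of the basis vectors.
A = np.column_stack([alpha_e1, alpha_e2])

v = np.array([2.0, -1.0])    # v = 2 e1 - e2
print(A @ v)                 # = 2 alpha(e1) - alpha(e2) = [2. 3. 6.]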
Definition 2.4 Let φ : V → W be a linear transformation. We define its kernel and image as:
- ker( φ) := {v ∈ V | φ(v) = 0W }.
- im(φ) := {φ(v) | v ∈ V}.
dim(im( φ)) is called the rank and dim(ker( φ)) is called the nullity of φ.
Example 2.7 Consider the matrix A which defines a linear transformation from F_2^7 to F_2^3:
A = [ 0 0 0 1 1 1 1
      0 1 1 0 0 1 1
      1 0 1 0 1 0 1 ] .
- dim(im( φ)) = 3.
- dim(ker( φ)) = 4.
- Check that ker( φ) is a code which can recover from one bit of error.
- Check that this is also true for the k × (2^k − 1) matrix A_k where the ith column is the number i written in binary (with the most significant bit at the top).
This code is known as the Hamming Code and the matrix A is called the parity-check matrix of the
code.
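The error-recovery procedure hinted at in the exercises can be sketched as follows (illustrative Python; the codeword below was chosen by hand): if y differs from a codeword in exactly one position, the syndrome Ay (mod 2) equals the corresponding column of A, which spells out the error position in binary.

import numpy as np

# Parity-check matrix of Example 2.7: column i (1-indexed) is i in binary.
A = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

c = np.array([1, 1, 0, 0, 1, 1, 0])   # a codeword: A c = 0 (mod 2)
assert not (A @ c % 2).any()

y = c.copy()
y[4] ^= 1                             # flip position 5 (1-indexed)

# The syndrome A y (mod 2) equals the 5th column of A, i.e. 5 in binary.
syndrome = A @ y % 2
pos = int("".join(map(str, syndrome)), 2)
y[pos - 1] ^= 1                       # correct the error
print((y == c).all())                 # True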