8 Non-Abelian Gauge Theory: Perturbative Quantization
We’re now ready to consider the quantum theory of Yang–Mills. In the first few sec-
tions, we’ll treat the path integral formally as an integral over infinite dimensional spaces,
without worrying about imposing a regularization. We’ll turn to questions about using
renormalization to make sense of these formal integrals in section 8.3.
To specify Yang–Mills theory, we had to pick a principal $G$-bundle $P \to M$ together with a connection $\nabla$ on $P$. So our first thought might be to try to define the Yang–Mills partition function as
\[ Z_{\rm YM}[(M,g), g_{\rm YM}] \overset{?}{=} \int_{\mathcal{A}} DA\; e^{-S_{\rm YM}[\nabla]/\hbar} \tag{8.1} \]
where
\[ S_{\rm YM}[\nabla] = -\frac{1}{2g_{\rm YM}^2}\int_M \mathrm{tr}(F_\nabla \wedge *F_\nabla) \tag{8.2} \]
as before, and $\mathcal{A}$ is the space of all connections on $P$.
To understand what this integral might mean, first note that, given any two connections $\nabla$ and $\nabla'$, the 1-parameter family
\[ \nabla_\tau = \tau\nabla + (1-\tau)\nabla' \tag{8.3} \]
is also a connection for all $\tau \in [0,1]$. For example, you can check that the rhs has the behaviour expected of a connection under any gauge transformation. Thus we can find a path in $\mathcal{A}$ between any two connections. Since $\nabla' - \nabla \in \Omega^1_M(\mathfrak{g})$, we conclude that $\mathcal{A}$ is an infinite dimensional affine space whose tangent space at any point is $\Omega^1_M(\mathfrak{g})$, the infinite dimensional space of all $\mathfrak{g}$-valued covectors on $M$. In fact, it's easy to write down a flat ($L^2$-)metric on $\mathcal{A}$ using the metric on $M$:
\[ ds^2_{\mathcal{A}} = -\int_M \mathrm{tr}(\delta A \wedge *\delta A) = \frac{1}{2}\int_M g^{\mu\nu}\,\delta A^a_\mu\,\delta A^a_\nu\,\sqrt{g}\,d^dx . \tag{8.4} \]
In other words, given any two tangent vectors $(a_1, a_2) \in \Omega^1_M(\mathfrak{g})$ at the point $\nabla \in \mathcal{A}$,
\[ ds^2_{\mathcal{A}}(a_1, a_2) = -\int_M \mathrm{tr}(a_1 \wedge *a_2) , \tag{8.5} \]
independent of where in $\mathcal{A}$ we are. This is encouraging: $\mathcal{A}$ just looks like an infinite dimensional version of $\mathbb{R}^n$, with no preferred origin since there is no preferred connection on $P$.
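The check that the interpolating family is again a connection can be sketched as follows. The convention used here is an assumption, not something fixed in this passage: a gauge transformation $g$ acts on $\nabla = d + A$ by $\nabla \mapsto g\nabla g^{-1}$, i.e. $A \mapsto gAg^{-1} - (dg)g^{-1}$, and likewise for $\nabla' = d + A'$.

```latex
\begin{aligned}
\tau A + (1-\tau)A' \;\mapsto\;& \tau\big(gAg^{-1} - (dg)g^{-1}\big) + (1-\tau)\big(gA'g^{-1} - (dg)g^{-1}\big) \\
=\;& g\big(\tau A + (1-\tau)A'\big)g^{-1} - (dg)\,g^{-1} ,
\qquad\qquad
A' - A \;\mapsto\; g\,(A'-A)\,g^{-1} .
\end{aligned}
```

The inhomogeneous term enters with total weight $\tau + (1-\tau) = 1$, so the family transforms exactly as a connection should, while the difference of two connections transforms homogeneously in the adjoint, as expected of an element of $\Omega^1_M(\mathfrak{g})$.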
We might now hope that the path integral (8.1) means formally that we should pick an arbitrary base-point $\nabla_0 \in \mathcal{A}$, then write any other connection as $\nabla = \nabla_0 + A$, with the measure $DA$ indicating that we integrate over all $A \in \Omega^1_M(\mathfrak{g})$ using the translationally invariant measure on $\mathcal{A}$ associated to the flat metric (8.4). (Such an infinite dimensional flat measure does not exist; we're delaying this worry for now.) For a connection $\nabla = \nabla_0 + A$, the action becomes
\[
\begin{aligned}
S_{\rm YM}[\nabla] &= -\frac{1}{2g_{\rm YM}^2}\int_M \mathrm{tr}(F_\nabla \wedge *F_\nabla) \\
&= -\frac{1}{2g_{\rm YM}^2}\int_M \mathrm{tr}(F_{\nabla_0} \wedge *F_{\nabla_0}) - \frac{1}{2g_{\rm YM}^2}\int_M \mathrm{tr}\big((\nabla_0 A + A^2) \wedge *(\nabla_0 A + A^2)\big) .
\end{aligned}
\tag{8.6}
\]
For example, on a topologically trivial bundle a standard choice would be to pick the trivial connection $\nabla_0 = d$ as base-point. Then $F_{\nabla_0} = 0$ and the action takes the familiar form
\[ S_{\rm YM}[d + A] = -\frac{1}{2g_{\rm YM}^2}\int_M \mathrm{tr}\big((dA + A^2) \wedge *(dA + A^2)\big) \tag{8.7} \]
as above. The path integral (8.1) would be interpreted as an integral over all gauge fields $A$. However, in some circumstances we'll meet later (even when $P$ is topologically trivial), it will be useful to choose a different base-point $\nabla_0$ for which $F_{\nabla_0} \neq 0$, known as a background field. In this case, the first term on the rhs of (8.6) is the action for the background field and comes out of the path integral as an overall factor, while the remaining action for $A$ involves the covariant derivative with respect to the background field.
Of course, there’s a problem. By construction, the Yang–Mills action was invariant
under gauge transformations, so the integrand in (8.1) is degenerate along gauge orbits.
Consequently, the integral will inevitably diverge because we’re vastly overcounting. You
met this problem already in the case of QED during the Michaelmas QFT course. There,
as here, the right thing to do is to integrate just over physically inequivalent connections —
those that are not related by a gauge transform. In other words, the correct path integral
for Yang–Mills should be of the form
\[ Z_{\rm YM}[(M,g), g_{\rm YM}] = \int_{\mathcal{A}/\mathcal{G}} D\mu\; e^{-S_{\rm YM}[\nabla]/\hbar} \tag{8.8} \]
where $\mathcal{G}$ is the space of all gauge transformations, so that $\mathcal{A}/\mathcal{G}$ denotes the space of all gauge equivalence classes of connections: we do not count as different two connections that are related by a gauge transformation. Note that this definition means gauge 'symmetry' does
not exist in Nature! We’ve taken the quotient by gauge transformations in constructing the
path integral, so the resulting object has no knowledge of any sort of gauge transformations.
They were simply a redundancy in our construction. The same conclusion holds if we
compute correlation functions of any gauge invariant quantities, whether they be local
operators built from gauge invariant combinations of matter fields, or Wilson loops running
around some curves in space.
However, we're not out of the woods. Whilst $\mathcal{A}$ itself was just an affine space, the space $\mathcal{A}/\mathcal{G}$ is much more complicated. For example, it has highly non-trivial topology, investigated by Atiyah & Jones, and by Singer. Certainly $\mathcal{A}/\mathcal{G}$ is not affine, so we don't yet have any understanding of what the right measure $D\mu$ on this space is, even formally. In the case of electrodynamics, you were able to avoid this problem (at least in perturbation theory on $\mathbb{R}^4$) by picking a gauge, defining the photon propagator and just getting on with it. The non-linear structure of the non-Abelian theory means we'll have to consider this step in more detail.
Suppose we have a function $S : \mathbb{R}^2 \to \mathbb{R}$ defined at any point on the $(x,y)$-plane, and suppose further that this function is invariant under rotations of the plane around the origin. We think of $S(x,y)$ as playing the role of our 'action' for 'fields' $(x,y)$, while rotations represent 'gauge transformations' leaving this action invariant. Of course, rotational invariance implies that $S(x,y) = h(r)$ in this example, where $h(r)$ is some function of the radius. We easily compute
\[ \int_{\mathbb{R}^2} dx\,dy\; e^{-S(x,y)/\hbar} = 2\pi \int_0^\infty dr\; r\, e^{-h(r)/\hbar} \tag{8.9} \]
which will make sense for sufficiently well-behaved $h(r)$. The factor of $2\pi = \mathrm{vol}(SO(2))$ appears here because the original integral was rotationally symmetric: it represents the redundancy in the expression on the left of (8.9).
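As a quick numerical sanity check of (8.9), here is a minimal sketch with $\hbar = 1$. The radial function `h` below is a hypothetical choice made only for illustration; any $h(r)$ that grows fast enough would do. The script compares a brute-force sum over the plane with $2\pi$ times the one-dimensional radial integral.

```python
import math

def h(r):
    """Hypothetical radial 'action', chosen only for illustration."""
    return 0.5 * r**2 + 0.25 * r**4

def cartesian_integral(half_width=5.0, n=401):
    """Riemann sum of exp(-S(x, y)) over the plane, with S(x, y) = h(r)."""
    step = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        x = -half_width + i * step
        for j in range(n):
            y = -half_width + j * step
            total += math.exp(-h(math.hypot(x, y)))
    return total * step * step

def radial_integral(r_max=5.0, n=2001):
    """Trapezoidal estimate of 2*pi * int_0^r_max r exp(-h(r)) dr."""
    step = r_max / (n - 1)
    values = [(k * step) * math.exp(-h(k * step)) for k in range(n)]
    return 2.0 * math.pi * step * (sum(values) - 0.5 * (values[0] + values[-1]))

lhs = cartesian_integral()
rhs = radial_integral()
print(f"Cartesian: {lhs:.6f}   2*pi * radial: {rhs:.6f}")
```

The two numbers agree to within the discretisation error, with the factor $2\pi$ accounting for the redundant integration around each orbit.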
In the case of Yang–Mills, if we integrated over the space $\mathcal{A}$ of all connections rather than over $\mathcal{A}/\mathcal{G}$, the redundancy would be infinite: while the volume $\mathrm{vol}(G)$ computed using the Haar measure on $G$ is finite if the structure group $G$ is compact, the volume $\mathrm{vol}(\mathcal{G})$ of the space of all gauge transformations is infinite; heuristically, you can think of it as a copy of $\mathrm{vol}(G)$ at each point of $M$. What we'd like to do is understand how to keep the analogue of $\int_0^\infty dr\, r\, e^{-h(r)}$ in the gauge theory case, without the redundancy factor. However, neither the right set of gauge invariant variables (analogous to $r$) nor the right measure on $\mathcal{A}/\mathcal{G}$ (generalizing $r\,dr$) is obvious in the infinite dimensional case.
Returning to (8.9), suppose $C$ is any curve traveling out from the origin that intersects every circle of constant radius exactly once. More specifically, let $f(x)$ be some function such that i) for any point $x \in \mathbb{R}^2$ there exists a rotation $R \in SO(2)$ so that $f(Rx) = 0$, and ii) $f$ is non-degenerate on the orbits; i.e., $f(Rx) = f(x)$ iff $R$ is the identity$^{76}$ in $SO(2)$. The curve
\[ C = \{x \in \mathbb{R}^2 : f(x) = 0\} \]
then intersects every orbit of the rotation group exactly once, and we can think of $C \subset \mathbb{R}^2$ as a way to embed the orbit space $(\mathbb{R}^2 \setminus \{0\})/SO(2) \cong \mathbb{R}_{>0}$ in the plane (see figure 11). This non-degeneracy property means that $f(x)$ itself is certainly not rotationally invariant. In anticipation of the application to Yang–Mills theory, we call $f(x)$ the gauge fixing function and the curve $C$ it defines the gauge slice.
Now consider the integral
\[ \int_{\mathbb{R}^2} dx\,dy\; \delta(f(x))\, e^{-S(x,y)/\hbar} \tag{8.10} \]
over all of $\mathbb{R}^2$. Clearly, the $\delta$-function restricts this integral to the gauge slice. However, the actual value we get depends on our choice of specific function $f(x)$; for example, even replacing $f \to cf$ for some constant $c$ (an operation which preserves the curve $C$) rescales the integral by a factor of $1/|c|$. Thus we cannot regard (8.10) as an integral over the moduli space $(\mathbb{R}^2 \setminus \{0\})/SO(2)$: it also depends on exactly how we embedded this moduli space inside $\mathbb{R}^2$.
76
Technically, we should restrict to $\mathbb{R}^2 \setminus \{0\}$ to ensure this condition holds. For smooth functions $S(x,y)$ this subtlety won't affect our results and I'll ignore it henceforth.
Figure 11: The gauge slice $C \subset \mathbb{R}^2$ should be chosen to be transverse to the gauge orbits, which for $SO(2)$ are the circles shown in grey.

The problem arose because the $\delta$-function changes as we change $f(x)$. To account for this, define
\[ \Delta_f(x) = \left.\frac{\partial}{\partial\theta} f(R_\theta x)\right|_{\theta=0} \tag{8.11} \]
where the right hand side means we compute the rate of change of $f$ with respect to a rotation $R_\theta$ through angle $\theta$, evaluated at the identity $\theta = 0$. Notice that we only need to know how an infinitesimal rotation acts in order to compute this. It's clear that the new integral
\[ \int_{\mathbb{R}^2} dx\,dy\; |\Delta_f(x)|\, \delta(f(x))\, e^{-S(x,y)/\hbar} \tag{8.12} \]
involving the modulus of $\Delta_f$ doesn't change if we rescale $f$ by a constant factor as above. Nor does it change if we rescale $f$ by a non-zero, $r$-dependent factor $c(r)$, which means that (8.12) is completely independent of the choice of function used to define the gauge slice $C$. In fact, I claim that (8.12) is actually independent of the particular gauge slice itself. To see this, let $f_1$ and $f_2$ be any two different gauge-fixing functions. Since the curves $C_{1,2}$ they define each intersect every orbit of $SO(2)$ uniquely, we can always rotate $C_1$ into $C_2$, provided we allow ourselves to rotate by different amounts at different values of the radius $r$. Thus we must have
\[ f_2(x) \propto f_1(R_{12}(r)\,x) \tag{8.13} \]
for some $r$-dependent rotation $R_{12}(r)$ and where the proportionality factor depends at most on the radius. By rescaling invariance,
\[ |\Delta_{f_2}(x)|\, \delta(f_2(x)) = |\Delta_{f_1}(x')|\, \delta(f_1(x')) \tag{8.14} \]
where we've defined $x' := R_{12}\,x$ for any point $x \in \mathbb{R}^2$, whether it lies on our curves or not. Now, the statement that the action $S(x)$ is rotationally invariant means that it takes the same value all around every circle of constant radius, so $S(x) = S(x')$. Similarly,
\[ dx'\,dy' = dx\,dy \tag{8.15} \]
because again this measure is rotationally invariant at every value of $r$ $^{77}$. Putting all this together, the integral in (8.12) is independent of the choice of gauge slice $\mathbb{R}_{>0} \hookrightarrow \mathbb{R}^2$, as we wished to show.
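The invariance of the measure under a radius-dependent rotation, (8.15), can be checked numerically. The sketch below estimates the Jacobian determinant of $x \mapsto R_{\alpha(r)}\,x$ by central finite differences; the profile `alpha` is a hypothetical choice used only for illustration.

```python
import math

def alpha(r):
    """Hypothetical radius-dependent rotation angle, for illustration only."""
    return r**2 + math.sin(3.0 * r)

def twist(x, y, angle_profile):
    """Rotate (x, y) anticlockwise by an angle that depends on the radius."""
    a = angle_profile(math.hypot(x, y))
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def jacobian_det(x, y, angle_profile, eps=1e-6):
    """Central-difference estimate of det d(x', y') / d(x, y)."""
    xp, yp = twist(x + eps, y, angle_profile)
    xm, ym = twist(x - eps, y, angle_profile)
    dxdx, dydx = (xp - xm) / (2 * eps), (yp - ym) / (2 * eps)
    xp, yp = twist(x, y + eps, angle_profile)
    xm, ym = twist(x, y - eps, angle_profile)
    dxdy, dydy = (xp - xm) / (2 * eps), (yp - ym) / (2 * eps)
    return dxdx * dydy - dxdy * dydx

for point in [(0.7, -1.3), (2.0, 0.5), (-0.4, 0.9)]:
    print(point, jacobian_det(point[0], point[1], alpha))
```

In polar coordinates the map is $(r, \theta) \mapsto (r, \theta + \alpha(r))$, whose Jacobian matrix is lower triangular with unit diagonal, so the determinant is 1 even though the rotation angle varies with the radius.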
As a concrete example, suppose we choose $C$ to be the $x$-axis, defined by the zero-set of $f(x) = y$. With this choice, $f(R_\theta x) = y\cos\theta + x\sin\theta$ where $R_\theta$ represents anti-clockwise rotation through $\theta$. Thus
\[ \Delta_f(x) = \left.\frac{\partial}{\partial\theta}\,(y\cos\theta + x\sin\theta)\right|_{\theta=0} = x \tag{8.16} \]
and the integral (8.12) becomes
\[ \int_{\mathbb{R}^2} dx\,dy\; |x|\,\delta(y)\, e^{-h(r)/\hbar} = \int_{-\infty}^{\infty} dx\; |x|\, e^{-h(|x|)/\hbar} = 2\int_0^\infty dr\; r\, e^{-h(r)/\hbar} , \]
which disagrees with the radial part of our original integral by a factor of 2. What's gone wrong is that circles of constant $r$ intersect the $x$-axis twice, once when $x > 0$ and once when $x < 0$, and our gauge fixing condition $y = 0$ failed to account for this; in other words, it slightly failed the non-degeneracy property. We'll see below that this glitch is actually a model of something that also happens in the case of Yang–Mills theory.
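Both the rescaling problem of (8.10) and its cure in (8.12) can be seen numerically. The sketch below makes several assumptions for illustration: $\hbar = 1$, $S = (x^2+y^2)/2$, the gauge-fixing function $f = c\,y$ (so that $\Delta_f = c\,x$), and a narrow Gaussian of width `sigma` standing in for the Dirac $\delta$-function.

```python
import math

def nascent_delta(u, sigma):
    """Narrow normalised Gaussian standing in for the Dirac delta."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def slice_integral(c, with_fp_factor, sigma=0.05, nx=401, ny=201, x_max=8.0):
    """Riemann sum of [|Delta_f|] delta(f) exp(-S) with f = c*y, S = (x^2+y^2)/2."""
    y_max = 8.0 * sigma / abs(c)          # the smeared delta is negligible beyond this
    dx = 2.0 * x_max / (nx - 1)
    dy = 2.0 * y_max / (ny - 1)
    total = 0.0
    for i in range(nx):
        x = -x_max + i * dx
        for j in range(ny):
            y = -y_max + j * dy
            f = c * y                     # gauge-fixing function
            delta_f = c * x               # d/dtheta of f(R_theta x) at theta = 0
            weight = abs(delta_f) if with_fp_factor else 1.0
            total += weight * nascent_delta(f, sigma) * math.exp(-0.5 * (x * x + y * y))
    return total * dx * dy

for c in (1.0, 2.0, -3.0):
    print(c, slice_integral(c, False), slice_integral(c, True))
```

Without the factor $|\Delta_f|$ the result scales like $1/|c|$. With it, the result stays close to $2 = 2\int_0^\infty dr\, r\, e^{-r^2/2}$ for every $c$, the 2 reflecting the fact that the $x$-axis meets each circle twice.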
To recap, what we've achieved with all this is that, for any non-degenerate gauge-fixing function $f$, we can write the desired integral over the space of orbits $(\mathbb{R}^2 \setminus \{0\})/SO(2) \cong \mathbb{R}_{>0}$ as
\[ \int_{\mathbb{R}_{>0}} dr\; r\, e^{-h(r)/\hbar} = \frac{1}{|Z_{SO(2)}|} \int_{\mathbb{R}^2} dx\,dy\; |\Delta_f(x)|\, \delta(f(x))\, e^{-S(x,y)/\hbar} . \tag{8.19} \]
where $|Z_{SO(2)}| = 2$ is the number of elements of the centre of $SO(2)$. The point is that the expression on the rhs refers only to functions and coordinates on the affine space $\mathbb{R}^2$, and uses only the standard measure $dx\,dy$ on $\mathbb{R}^2$. When the gauge orbits have dimension $> 1$ we must impose several gauge fixing conditions $f^a$, one for each transformation parameter $\theta^a$. Then we take the integral to include a factor
\[ |\Delta_f(x)| \prod_a \delta(f^a(x)) \tag{8.20} \]
77
Again, it's a good idea to check you're comfortable with this assertion by writing
\[ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\alpha(r) & -\sin\alpha(r) \\ \sin\alpha(r) & \cos\alpha(r) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \]
and explicitly working out the transformation of the measure, allowing for the fact that the angle $\alpha(r) = \alpha(\sqrt{x^2 + y^2})$ depends on the radius. You'll find the measure is nonetheless invariant.
where now $\Delta_f$ is the Faddeev–Popov determinant
\[ \Delta_f(x) = \det\left(\frac{\partial f^a(R_\theta x)}{\partial\theta^b}\right) \tag{8.21} \]
for a generic set of variables $x \in \mathbb{R}^n$ where the action is invariant under some transformation $x \to R_\theta x$ (not necessarily a rotation) depending on parameters $\theta^a$. Again, this will allow us to write an integral over the space of orbits of these transformations in terms of an integral over the affine space $\mathbb{R}^n$. These are things we have access to in the gauge theory case$^{78}$ where the affine space in question is the space $\mathcal{A}$ of all gauge fields, and the transformation group is the space $\mathcal{G}$ of all gauge transformations. Armed with these ideas, we now turn to the case of gauge theory.
In Yang–Mills theory, we can fix the gauge redundancy by picking a particular connection in each gauge equivalence class; in other words, by picking an embedding $\mathcal{A}/\mathcal{G} \hookrightarrow \mathcal{A}$ specified by some gauge-fixing functional $f[A]$. The most common choices of gauge-fixing functional are local, in the sense that $f[A]$ depends on the value of the gauge field just at a single point $x \in M$. Heuristically, we then restrict to $f[A(x)] = 0$ at every point $x \in M$ by inserting "$\delta[f] = \prod_{x \in M} \delta(f[A(x)])$" in the path integral. We'll consider how to interpret this infinite-dimensional $\delta$-function below. The Faddeev–Popov determinant is then
\[ \Delta_f = \det \frac{\delta f^a[A^\lambda(x)]}{\delta\lambda^b(y)} , \tag{8.22} \]
the determinant of an infinite dimensional matrix; we'll consider what this determinant means momentarily. With these ingredients, our Yang–Mills path integral can be written as
\[ \int_{\mathcal{A}/\mathcal{G}} D\mu\; e^{-S_{\rm YM}[\nabla]/\hbar} = \int_{\mathcal{A}} DA\; |\Delta_f|\, \delta[f]\, e^{-S_{\rm YM}[\nabla]/\hbar} , \tag{8.23} \]
where the factor of $|\Delta_f|\,\delta[f]$ restricts us to an arbitrary gauge slice, but leaves no dependence on any particular choice of slice, as above. Again, the advantage of the rhs is that it refers only to the naïve integral measure over all connections.
For some purposes, especially in perturbation theory, it's useful to rewrite the $\delta[f]$ and $\Delta_f$ factors in a way that makes them amenable to treatment via Feynman diagrams. Taking our lead from Fourier analysis, we introduce a new bosonic scalar field $h$ (sometimes called a Nakanishi–Lautrup field) and write
\[ \delta[f] = \int Dh\; e^{-S_{\rm gf}[h,\nabla]/\hbar} , \tag{8.24} \]
where
\[ S_{\rm gf}[h,\nabla] = -i\int_M \mathrm{tr}(h * f[A]) = \frac{i}{2}\int_M h^a f^a[A]\, \sqrt{g}\, d^dx \tag{8.25} \]
78
Modulo, as always, the problem that there is no Lebesgue measure on A: this is what we’ll treat with
renormalization.
is the gauge-fixing action. The idea is that $h$ is a Lagrange multiplier: performing the path integral over $h$ imposes $f[A(x)] = 0$ throughout $M$. Notice that since we needed one gauge-fixing condition for every gauge parameter, we take $h$ to lie in the adjoint representation, $h \in \Omega^0_M(\mathfrak{g})$. This does not imply that (8.24) is gauge invariant: indeed it cannot be if it is to fix a gauge! For the Faddeev–Popov determinant $\Delta_f$, recall that if $M$ is an $n \times n$ matrix and $(c_i, \bar{c}_j)$ are $n$-component Grassmann variables, then $\det(M) = \int d^nc\, d^n\bar{c}\; \exp(\bar{c}_j M_{ji} c_i)$. Applying the same idea here, we have$^{79}$
\[ \det \frac{\delta f^a[A^\lambda(x)]}{\delta\lambda^b(y)} = \int Dc\, D\bar{c}\; e^{-S_{\rm gh}[\bar{c},c,\nabla]/\hbar} \tag{8.26} \]
where
\[ S_{\rm gh}[\bar{c},c,\nabla] = -\int_{M \times M} \bar{c}_a(x)\, \frac{\delta f^a[A^\lambda(x)]}{\delta\lambda^b(y)}\, c^b(y)\; d^dy\, d^dx \tag{8.27} \]
and the fields $(c^a, \bar{c}_a)$ are fermionic scalars, again valued in the adjoint representation of $G$. They're known as ghosts ($c$) and antighosts ($\bar{c}$). Putting everything together, our Yang–Mills path integral can finally be written as
\[ \int_{\mathcal{A}/\mathcal{G}} D\mu\; e^{-S_{\rm YM}[\nabla]/\hbar} = \int DA\, Dc\, D\bar{c}\, Dh\; \exp\left(-\frac{1}{\hbar}\big(S_{\rm YM}[\nabla] + S_{\rm gh}[\bar{c},c,\nabla] + S_{\rm gf}[h,\nabla]\big)\right) \tag{8.28} \]
where the integral on the rhs is formally to be taken over the space of all fields $(\nabla, \bar{c}, c, h)$. Everything on the right is now written in terms of an integral over the naïve, affine space of all connections, together with the space of ghost, antighost and Nakanishi–Lautrup fields, weighted by some action $S[A, c, \bar{c}, h]$. Thus we can hope to compute it perturbatively using Feynman rules.
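The finite-dimensional Grassmann identity behind the ghost representation, $\det(M) = \int d^nc\, d^n\bar{c}\, \exp(\bar{c}_j M_{ji} c_i)$, can be verified directly for small matrices. The sketch below implements a tiny Grassmann algebra from scratch. The encoding of generators and the normalisation of the Berezin integral (the monomial $\bar{c}_1 c_1 \cdots \bar{c}_n c_n$ integrates to 1) are conventions assumed here; a different ordering of the measure changes the overall sign.

```python
def sort_sign(indices):
    """Sort generator indices, returning (sign, sorted tuple); sign 0 if any index repeats."""
    idx = list(indices)
    if len(set(idx)) != len(idx):
        return 0, ()
    sign = 1
    for i in range(len(idx)):                 # bubble sort, counting transpositions
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def multiply(a, b):
    """Product of two Grassmann polynomials stored as {sorted index tuple: coefficient}."""
    out = {}
    for mono_a, coeff_a in a.items():
        for mono_b, coeff_b in b.items():
            sign, mono = sort_sign(mono_a + mono_b)
            if sign:
                out[mono] = out.get(mono, 0.0) + sign * coeff_a * coeff_b
    return out

def grassmann_exp(x, max_power):
    """exp(x) for an even, nilpotent element x; the series stops at max_power."""
    result = {(): 1.0}
    term = {(): 1.0}
    for k in range(1, max_power + 1):
        term = {mono: coeff / k for mono, coeff in multiply(term, x).items()}
        for mono, coeff in term.items():
            result[mono] = result.get(mono, 0.0) + coeff
    return result

def berezin_det(matrix):
    """Coefficient of cbar_1 c_1 ... cbar_n c_n in exp(cbar_j M_ji c_i)."""
    n = len(matrix)
    action = {}
    for j in range(n):
        for i in range(n):
            sign, mono = sort_sign((2 * j, 2 * i + 1))   # cbar_j -> 2j, c_i -> 2i+1
            action[mono] = action.get(mono, 0.0) + sign * matrix[j][i]
    return grassmann_exp(action, n).get(tuple(range(2 * n)), 0.0)

print(berezin_det([[1.0, 2.0], [3.0, 4.0]]))   # compare with 1*4 - 2*3
```

Only the top-degree term of the exponential survives Berezin integration, and the sign bookkeeping of the anticommuting generators is what turns the sum over permutations into a determinant rather than a permanent.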
Let's now make all this more concrete by seeing how it works in an example. An important, frequently occurring choice of gauge is Lorenz$^{80}$ gauge: we pick the trivial connection $\partial$ as a base-point by writing $\nabla = \partial + A = \partial - iA^a t_a$, and impose that the $A^a$ obey
\[ f^a[A] = \partial^\mu A^a_\mu(x) = 0 \tag{8.29} \]
for all $x \in M$ and for all $a = 1, \ldots, \dim(G)$. An obvious reason to want to work in Lorenz gauge is that in the important case $(M, g) = (\mathbb{R}^d, \delta)$, it respects the $SO(d)$ invariance of the flat Euclidean metric. Under an infinitesimal gauge transformation with parameters $\lambda^a$, the gauge field transforms as
\[ A^a_\mu \mapsto A^a_\mu + \partial_\mu\lambda^a + f^a_{cd}\, A^c_\mu \lambda^d \tag{8.30} \]
so the gauge variation of the Lorenz gauge-fixing condition is
\[
\begin{aligned}
f^a[A^\lambda] &= \partial^\mu\big(A^a_\mu + \partial_\mu\lambda^a + f^a_{cd}\, A^c_\mu \lambda^d\big) \\
&= \partial^\mu A^a_\mu + \partial^\mu\big(\partial_\mu\lambda^a + f^a_{cd}\, A^c_\mu \lambda^d\big) .
\end{aligned}
\tag{8.31}
\]
79
One can show that the determinant is positive–definite, at least in a neighbourhood of the trivial
connection. Thus, for the purposes of perturbation theory around the trivial background, we can drop the
modulus sign. Non–perturbatively we must be more careful.
80
Poor Ludvig Lorenz. Eternally outshone by Hendrik Lorentz to the point of having his work misat-
tributed.
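Consequently, the matrix appearing in the Faddeev–Popov determinant is
\[
\begin{aligned}
\frac{\delta f^a[A^\lambda(x)]}{\delta\lambda^b(y)} &= \partial^\mu\big(\partial_\mu\,\delta^a_{\ b} + f^a_{cb}\, A^c_\mu\big)\, \delta^d(x - y) \\
&= (\partial^\mu\nabla_\mu)^a_{\ b}\; \delta^d(x - y)
\end{aligned}
\tag{8.32}
\]
where the differential operator $\partial^\mu\nabla_\mu$ acts formally on the variable $x$ appearing in the $\delta$-functions$^{81}$. These $\delta$-functions arose because our gauge-fixing condition was local: the object $\partial^\mu(\nabla_\mu\lambda)$ lives at one point $x$, so we get nothing if we vary it with respect to changes in $\lambda$ at some other point. Using this result in the ghost action yields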
\[
\begin{aligned}
S_{\rm gh}[\bar{c},c,\nabla] &= -\int_{M \times M} \bar{c}^a(x)\, (\partial^\mu\nabla_\mu)^{ab}\, \delta^d(x - y)\, c^b(y)\; d^dy\, d^dx \\
&= -\int_M \bar{c}^a\, (\partial^\mu\nabla_\mu c)^a\; d^dx = \int_M (\partial^\mu\bar{c}^a)(\nabla_\mu c)^a\; d^dx \\
&= -2\int_M \mathrm{tr}(d\bar{c} \wedge *\nabla c)
\end{aligned}
\tag{8.33}
\]
where in the first step we integrated out $y$ using the $\delta$-function, recalling that the differential operators only care about $x$. Altogether, in Lorenz gauge we have the action
\[ S[\nabla, c, \bar{c}, h] = -\frac{1}{2g_{\rm YM}^2}\int_M \mathrm{tr}(F \wedge *F) - 2\int_M \mathrm{tr}(d\bar{c} \wedge *\nabla c) - i\int_M \mathrm{tr}(h\, d{*}A) \tag{8.34} \]
Except for the strange spin/statistics of the ghost fields and the mixture of covariant and normal derivatives, this is now a perfectly respectable, local action for scalar fields coupled to the gauge field. Notice that in the Abelian case where the adjoint representation is trivial, the Lorenz gauge ghost action would be $\int_M d\bar{c} \wedge *dc$ and in particular would be independent of the gauge field $A$. Thus the ghosts would have completely decoupled, which is why you didn't meet them last term.
$u$ is orthogonal to the kernel of the adjoint of the Laplacian in the $L^2$ norm on $(M, g)$. However, the Laplacian is self-adjoint ($\int_M \phi * \Delta\psi = \int_M (\Delta\phi) * \psi$) and $\ker\Delta$ consists of constant functions, since if $u \in \ker\Delta$ then
\[ 0 = \int_M u * \Delta u = -\int_M du \wedge *du = -\|du\|^2 \tag{8.36} \]
whenever $u$ has compact support. Hence $du = 0$ so $u$ is constant. Thus, for any generic electromagnetic potential $A_\mu$, we can find a gauge transform that puts it in Lorenz gauge. In the non-Abelian case, things are more complicated because the gauge transform of a connection is non-linear: $A \to A + \nabla_A\lambda$, whose value depends on which $A$ we start with. It turns out that we can always solve (8.29), but the proof is considerably more complicated: one was found by Karen Uhlenbeck in 1982 (at least for some common choices of $M$), and an alternative proof was later found by Simon Donaldson.
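The Abelian statement can be made concrete on a periodic one-dimensional lattice. This is a toy discretisation assumed here for illustration, not taken from the text: the gauge field lives on links, $A_i \to A_i + \lambda_{i+1} - \lambda_i$ is a gauge transformation, and the lattice Lorenz condition is $A_i - A_{i-1} = 0$. All function names are hypothetical.

```python
def lorenz_gauge_parameter(A):
    """Gauge parameter lambda on a periodic 1D lattice that removes the divergence of A."""
    n = len(A)
    mean = sum(A) / n                      # constant mode, untouched by gauge transformations
    lam = [0.0]
    for i in range(n - 1):
        lam.append(lam[-1] + mean - A[i])  # solves lam[i+1] - lam[i] = mean - A[i]
    return lam

def gauge_transform(A, lam):
    """A_i -> A_i + lam_{i+1} - lam_i with periodic identification of sites."""
    n = len(A)
    return [A[i] + lam[(i + 1) % n] - lam[i] for i in range(n)]

def divergence(A):
    """Backward-difference lattice divergence (A[-1] wraps around periodically)."""
    return [A[i] - A[i - 1] for i in range(len(A))]

A = [0.3, -1.2, 0.7, 2.0, -0.5]
A_fixed = gauge_transform(A, lorenz_gauge_parameter(A))
print(divergence(A_fixed))
print(sum(A), sum(A_fixed))
```

The divergence of the transformed field vanishes up to rounding, while the sum around the circle (the lattice analogue of the holonomy) is unchanged. The recursion closes consistently only because the source `mean - A[i]` sums to zero, which is the lattice version of being orthogonal to the constant functions in $\ker\Delta$.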
We must still check that (8.29) singles out a unique representative, so that we count each gauge equivalence class only once. Encouragement comes from the fact that connections obeying (8.29) are orthogonal to connections that are pure gauge, with respect to the $L^2$-metric (8.4) on $\mathcal{A}$. For if $a_1$ is a tangent vector at the point $\nabla \in \mathcal{A}$ that obeys $\nabla * a_1 = 0$, while $a_2 = \nabla\lambda$ is also a tangent vector at $\nabla$ that points in the direction of an infinitesimal gauge transform, then
\[ ds^2_{\mathcal{A}}(a_1, a_2) = -\int_M \mathrm{tr}(a_1 \wedge *a_2) = -\int_M \mathrm{tr}(a_1 \wedge *\nabla\lambda) = \int_M \mathrm{tr}\big((\nabla * a_1)\,\lambda\big) = 0 \tag{8.37} \]
using the Lorenz condition. Thus changing our connection in a way that obeys Lorenz gauge takes us in a direction that is orthogonal to the orbits of the gauge group. This certainly shows that starting from any base-point and integrating over all gauge fields along the slice incorporates only gauge inequivalent connections while we're near our base-point.
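The orthogonality in (8.37) has a simple Abelian lattice analogue, sketched below on a periodic $n \times n$ lattice. The discretisation is an assumption made for illustration: forward differences for the pure-gauge direction $d\lambda$ and backward differences for the divergence, so that summation by parts holds exactly. All names are hypothetical.

```python
import random

def divergence_free_field(psi):
    """Lattice 'curl' of a scalar psi; its backward-difference divergence vanishes."""
    n = len(psi)
    ax = [[psi[i][j] - psi[i][j - 1] for j in range(n)] for i in range(n)]
    ay = [[-(psi[i][j] - psi[i - 1][j]) for j in range(n)] for i in range(n)]
    return ax, ay

def pure_gauge_field(lam):
    """Forward-difference gradient of the gauge parameter lam."""
    n = len(lam)
    ax = [[lam[(i + 1) % n][j] - lam[i][j] for j in range(n)] for i in range(n)]
    ay = [[lam[i][(j + 1) % n] - lam[i][j] for j in range(n)] for i in range(n)]
    return ax, ay

def divergence(ax, ay):
    """Backward-difference divergence (index -1 wraps around periodically)."""
    n = len(ax)
    return [[ax[i][j] - ax[i - 1][j] + ay[i][j] - ay[i][j - 1]
             for j in range(n)] for i in range(n)]

def inner_product(a, b):
    """Lattice L2 inner product of two vector fields given as (x, y) component pairs."""
    (ax, ay), (bx, by) = a, b
    n = len(ax)
    return sum(ax[i][j] * bx[i][j] + ay[i][j] * by[i][j]
               for i in range(n) for j in range(n))

rng = random.Random(0)
n = 6
psi = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
lam = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
a1 = divergence_free_field(psi)   # obeys the lattice Lorenz condition
a2 = pure_gauge_field(lam)        # tangent to the gauge orbit
print(inner_product(a1, a2))
```

The inner product vanishes up to rounding for any choice of `psi` and `lam`, mirroring the integration by parts in (8.37). As in the continuum, this is only a local statement about tangent directions at a point of the slice.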
However, as in the finite dimensional example where the line $y = 0$ intersected each circle of constant radius twice, it doesn't guarantee that some other connection, far away along the gauge slice, isn't secretly gauge equivalent to one we've already accounted for. This troubling possibility is known as the Gribov ambiguity, after Vladimir Gribov who first pointed it out and showed it actually occurs in the case of Coulomb gauge $\partial^i A_i = 0$ (the indices just run over $\mathbb{R}^3 \subset \mathbb{R}^{3,1}$). Somewhat later, Iz Singer showed that the Gribov ambiguity is in fact inevitable: no matter which gauge condition you pick, the gauge orbit always intersects the gauge slice more than once (at least for most reasonable $M$). To show this, Singer noted that $\mathcal{A}$ is itself an infinite dimensional principal bundle over the space $\mathcal{B} := \mathcal{A}/\mathcal{G}$ where the group $\mathcal{G}$ of all gauge transformations plays the role of the structure group. A gauge slice amounts to a global section of this bundle, i.e., the choice of a unique point in $\mathcal{A}$ for each point in $\mathcal{B}$. A result I won't prove states that a principal bundle only admits a global section when it's topologically trivial, so the existence of a global gauge slice would imply
\[ \mathcal{A} \overset{?}{\cong} \mathcal{B} \times \mathcal{G} . \tag{8.38} \]
Since $\mathcal{A}$ is an affine space, clearly $\pi_k(\mathcal{A}) = 0$ for all $k > 0$ (i.e. $\mathcal{A}$ itself is topologically trivial and has no non-contractible cycles). However, Singer computed that $\pi_k(\mathcal{G}) \neq 0$ for at least some $k > 0$, which says that there are some non-contractible cycles in the space on the rhs of (8.38). Thus $\mathcal{A} \not\cong \mathcal{B} \times \mathcal{G}$, so $\mathcal{A}$ is non-trivial as a principal bundle over $\mathcal{B}$, and no global gauge choice exists. In practice, we'll work perturbatively, meaning we never venture far enough from our chosen base-point connection to meet any Gribov copies. Non-perturbatively, we'd have to cover $\mathcal{A}/\mathcal{G}$ with different coordinate patches, pick different gauges in each one and then piece them together at the end. I'm not aware of anyone actually trying to do this.
where $\epsilon$ is a constant, Grassmann (i.e., fermionic) parameter. Note that $[c, c] = i f^a_{bc}\, c^b c^c\, t_a$ is in fact symmetric in the two ghost fields because exchanging them produces two minus signs: one from the Lie bracket and one from the Grassmann statistics. Letting $\Phi^i$ denote any of the fields $\{A_\mu, c, \bar{c}, h\}$, we'll often write these transformations as $\delta\Phi^i = \epsilon\, Q\Phi^i$ so that $Q\Phi^i$ represents the rhs of (8.39) with the anticommuting parameter $\epsilon$ stripped away. $Q\Phi^i$ thus has opposite statistics to $\Phi^i$ itself.

The expression for $\delta A$ shows that, as far as the gauge field itself is concerned, these BRST transformations act just like a gauge transformation $A_\mu \to A_\mu + \nabla_\mu\lambda$ with infinitesimal gauge parameter $\lambda(x) = \epsilon c$ given in terms of the ghost field. It follows that any gauge-invariant function of the connection alone, such as the original Yang–Mills action $S_{\rm YM}[\nabla]$, is invariant under the transformations (8.39). To see that the rest of the action is also invariant under (8.39), we'll first show that $[\delta_1, \delta_2] = 0$, where $\delta_{1,2}$ are transformations with parameters $\epsilon_{1,2}$, so that the BRST transformations form an Abelian group. In fact, we can write this as
\[ [\delta_1, \delta_2]\Phi^i = \delta_1(\epsilon_2\, Q\Phi^i) - \delta_2(\epsilon_1\, Q\Phi^i) = -(\epsilon_1\epsilon_2 - \epsilon_2\epsilon_1)\, Q^2\Phi^i = -2\epsilon_1\epsilon_2\, Q^2\Phi^i \tag{8.40} \]
using the fact that the parameters are fermionic and so anticommute with $Q$. Therefore the statement $[\delta_1, \delta_2] = 0$ amounts to the statement that the transformation $\Phi \mapsto Q\Phi$ is nilpotent. For the Nakanishi–Lautrup field $h$, this assertion is trivial. Similarly, for the antighost $\bar{c}$ we have
\[ Q^2\bar{c} = i(Qh) = 0 \tag{8.41} \]
since $h$ itself is invariant. For the gauge field,
\[ Q^2 A_\mu = Q(\nabla_\mu c) = [\nabla_\mu c, c] - \frac{1}{2}\nabla_\mu([c, c]) \tag{8.42} \]
where the first term is the variation of the connection acting on $c$ in the adjoint (i.e. the variation of $A_\mu$ in the $[A_\mu, c]$ term in $\nabla_\mu c$), and the second is the variation of the ghost we produced the first time $Q$ acted. This vanishes since $\nabla_\mu[c, c] = [\nabla_\mu c, c] + [c, \nabla_\mu c] = 2[\nabla_\mu c, c]$ using the symmetry of this bracket, as explained above. Thus
\[ Q^2 A_\mu = 0 . \tag{8.43} \]
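For completeness, here is a sketch of the corresponding check on the ghost itself. It assumes the ghost transformation $Qc = -\tfrac{1}{2}[c, c]$, which is the form implied by the second term of (8.42), and uses that $Q$ acts as a graded derivation, so that $Q[c, c] = [Qc, c] - [c, Qc] = 2[Qc, c]$ because $c$ is fermionic and $Qc$ is bosonic.

```latex
\begin{aligned}
Q^2 c &= -\tfrac{1}{2}\, Q[c, c] = -[Qc, c] = \tfrac{1}{2}\,[[c, c], c] , \\
[[c, c], c] &= c^a c^b c^c\, [[t_a, t_b], t_c]
 = \tfrac{1}{3}\, c^a c^b c^c \Big( [[t_a, t_b], t_c] + [[t_b, t_c], t_a] + [[t_c, t_a], t_b] \Big) = 0 .
\end{aligned}
```

The second line uses that $c^a c^b c^c$ is invariant under cyclic permutations of its indices (an even permutation of three anticommuting factors), so the Jacobi identity forces the triple bracket to vanish.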
Now let's show that the BRST transformation is nilpotent even when acting on an arbitrary functional $O(A, c, \bar{c}, h)$ of the fields. We compute
\[ Q^2 O = Q\left((Q\Phi^i)\, \frac{\delta O}{\delta\Phi^i}\right) = (Q^2\Phi^i)\, \frac{\delta O}{\delta\Phi^i} - Q\Phi^i\, Q\Phi^j\, \frac{\delta^2 O}{\delta\Phi^j\, \delta\Phi^i} . \tag{8.46} \]
The first term vanishes by our calculations above. To see that the second term also vanishes, split the sums over all fields (labelled by $i, j$) into separate sums over bosonic fields $\Phi^i \in \{A_\mu, h\}$ and fermionic fields $\Phi^i \in \{c, \bar{c}\}$. In the case that $i$ and $j$ both refer either to bosonic or to fermionic fields, the term cancels because $Q\Phi^i$ has opposite statistics to $\Phi^i$ itself, so that the pre-factor is symmetric if the second derivatives are antisymmetric, and vice-versa. The mixed terms cancel among themselves.
We're now in position to see why the full, gauge-fixed action is BRST invariant. Firstly, as mentioned above the original Yang–Mills action is BRST invariant because it is a gauge invariant function of $A_\mu$ alone. For the remaining terms, we note
\[ -Q\int \mathrm{tr}(\bar{c}\, f[A])\; d^dx = -i\int \mathrm{tr}(h f[A])\; d^dx + \int \mathrm{tr}\left(\bar{c}\, \frac{\delta f[A^\lambda]}{\delta\lambda}\, c\right) d^dx = S_{\rm gf}[h, A] + S_{\rm gh}[A, c, \bar{c}] , \tag{8.47} \]
so the gauge-fixing and ghost terms in the action are the BRST transformation of $-\int \mathrm{tr}(\bar{c} f[A])\, d^dx$. Since BRST transformations are nilpotent, it follows that these terms are BRST invariant for any gauge-fixing functional $f[A]$. Combined with the gauge invariance of the original Yang–Mills action, this shows that BRST transformations preserve the full Yang–Mills gauge-fixed action. Provided we regularize the path integral measure in a way that preserves this (as will be true perturbatively in dimensional regularization), BRST symmetry will be a symmetry of the quantum theory, and all new terms that are generated by RG flow will also be constrained by BRST invariance$^{83}$. In particular, terms that depend only on the original gauge field will be constrained to be gauge invariant, preventing the appearance of a mass term $\sim A^2$ even at the quantum level.
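To see concretely why a mass term is excluded, here is a short sketch of its variation under an infinitesimal gauge transformation, assuming the form $\delta A_\mu = \nabla_\mu\lambda = \partial_\mu\lambda + [A_\mu, \lambda]$ used for the gauge field above.

```latex
\delta\, \mathrm{tr}(A_\mu A^\mu)
 = 2\, \mathrm{tr}(A^\mu \nabla_\mu\lambda)
 = 2\, \mathrm{tr}(A^\mu \partial_\mu\lambda) + 2\, \mathrm{tr}\big(A^\mu [A_\mu, \lambda]\big)
 = 2\, \mathrm{tr}(A^\mu \partial_\mu\lambda) .
```

The commutator term drops out by cyclicity of the trace, but the inhomogeneous piece survives for generic $\lambda$, so $\mathrm{tr}(A_\mu A^\mu)$ is not gauge invariant and cannot appear among the terms built from the gauge field alone.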
83
Because the BRST transformations do not act linearly on the fields, the form of these BRST transformations in the quantum effective action can be different to the classical form (8.39). See section ??, or Weinberg, The Quantum Theory of Fields, vol. 2, chapter 17 for further details.
84
Here we’re assuming that all the operators are bosonic (though they may be built up using fermionic
fields), otherwise bringing the operator that is to be varied to the left may introduce some signs.
where $(QO_i)$ is the variation of the $i$th operator under the BRST transformations (8.39). In particular, if the operators $O_2, \ldots, O_n$ do happen to be BRST invariant, then
\[ 0 = \Big\langle (QO_1) \prod_{j=2}^{n} O_j \Big\rangle \tag{8.49} \]
which says that the correlation function of a BRST exact operator $QO$ with any number of BRST invariant operators vanishes.
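Furthermore, we can now see that correlation functions of BRST invariant operators will be independent of our choice of gauge-fixing condition $f[A]$, just as for the partition function. This is because $f[A]$ appears only in the BRST exact term $Q\,\mathrm{tr}(\bar{c} f[A])$ in the action. In particular, the difference between the gauge-fixed actions with two different choices of gauge-fixing condition, $f_1[A]$ and $f_2[A]$, is
\[ S_1 - S_2 = -Q\int \mathrm{tr}\big(\bar{c}\,(f_1[A] - f_2[A])\big)\, d^dx = QV_{12} , \tag{8.50} \]
so that, for BRST invariant operators $O_i$,
\[ \Big\langle \prod_i O_i \Big\rangle_1 = \Big\langle e^{-QV_{12}/\hbar} \prod_i O_i \Big\rangle_2 = \Big\langle \prod_i O_i \Big\rangle_2 \tag{8.51} \]
where $\langle \cdots \rangle_{1,2}$ refers to the correlation function computed using the actions $S_{1,2}$, respectively. In deriving this equation, we used the fact that
\[ e^{-QV_{12}/\hbar} = 1 + QR_{12} \quad\text{where}\quad R_{12} = -\frac{1}{\hbar} V_{12} + \frac{1}{2\hbar^2} V_{12}\,(QV_{12}) - \cdots \tag{8.52} \]
and the Ward identity (8.49) with $O_1 = R_{12}$.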
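The statement that the exponential of a BRST exact term is itself the identity plus a BRST exact term rests on a one-line identity, sketched here assuming only that $Q$ is a nilpotent graded derivation and that $V_{12}$ is fermionic.

```latex
\begin{aligned}
Q\big(V_{12}\,(QV_{12})^{n-1}\big) &= (QV_{12})^{n} - V_{12}\; Q\big((QV_{12})^{n-1}\big) = (QV_{12})^{n} , \\
e^{-QV_{12}/\hbar} &= 1 + Q\left( \sum_{n \ge 1} \frac{(-1)^n}{n!\, \hbar^n}\; V_{12}\,(QV_{12})^{n-1} \right) .
\end{aligned}
```

The second term in the first line vanishes because $Q(QV_{12}) = 0$ and $Q$ obeys the Leibniz rule. The first two terms of the sum reproduce the expansion of $R_{12}$ quoted in the text.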
The BRST transformations (8.39) correspond to a global symmetry of the gauge-fixed action, unlike the gauge redundancy of the unfixed action. Hence, by Noether's theorem, we can derive an expression for the conserved BRST charge. A short calculation shows that this is given by$^{85}$
\[ Q_{\rm BRST} = \int_N n^\mu\, \mathrm{tr}\left( \frac{1}{g_{\rm YM}^2} F_{\mu\nu} \nabla^\nu c + h \nabla_\mu c + \frac{1}{2}\, \partial_\mu\bar{c}\, [c, c] \right) \sqrt{g}\; d^{d-1}x \tag{8.53} \]
in the case of the Lorenz gauge action (8.34), where $n^\mu$ is the unit normal to the hypersurface $N$ (e.g., $N$ could be a constant time slice). There is also a ghost number charge
\[ Q_{\rm gh} = \int_N n^\mu\, \mathrm{tr}\left( \bar{c}\, \nabla_\mu c - \partial_\mu\bar{c}\; c \right) d^{d-1}x \tag{8.54} \]
85
In a more sophisticated treatment, I'd point out that under canonical quantization this charge is really the Chevalley–Eilenberg differential of the infinite-dimensional Lie group $\mathcal{G}$ of all gauge transformations, acting on the space of fields.
corresponding to the $U(1)$ symmetry transformations
\[ c \mapsto e^{i\theta} c , \qquad \bar{c} \mapsto e^{-i\theta}\, \bar{c} \tag{8.55} \]
with all other fields invariant, where $\theta$ is a constant (bosonic) parameter. The gauge-fixed action is real if the fields obey
whereupon $Q^\dagger_{\rm BRST} = Q_{\rm BRST}$ and $Q^\dagger_{\rm gh} = Q_{\rm gh}$. As with any Noether charge, in the quantum theory these charges act on states. The BRST charge generates BRST transformations, in the sense that
\[ QO = [Q_{\rm BRST}, \hat{O}] \tag{8.56} \]
in the canonical picture. In particular, $Q^2_{\rm BRST} = 0$ and $Q_{\rm BRST}|\Omega\rangle = 0$, saying that the vacuum $|\Omega\rangle$ is BRST invariant.
and we (formally) take the Hilbert space to be $\mathcal{H} = L^2(\mathcal{A}_N/\mathcal{G}_N)$. Note that there is no momentum conjugate to $A_0$; this is the problem of gauge redundancy in the Hamiltonian framework. The Hamiltonian of the gauge theory is represented by
\[ H = \frac{1}{2g_{\rm YM}^2} \int_N \mathrm{tr}(E \cdot E + B \cdot B)\; d^{d-1}x = \frac{1}{2} \int_N \mathrm{tr}\left( g_{\rm YM}^2\, \frac{\delta^2}{\delta A_i^2} + \frac{1}{g_{\rm YM}^2}\, B \cdot B \right) d^{d-1}x \tag{8.60} \]
We must now understand how to recover this space from the gauge-fixed Yang–Mills action including ghosts. In fact, this is simplest to see using axial gauge where we set one component of the gauge field to zero. We achieve this using the gauge-fixing functional $f[A] = n^\mu A_\mu$, so the ghost and gauge-fixing parts of the action become
\[ -Q\int_M \mathrm{tr}(\bar{c}\, n^\mu A_\mu)\, \sqrt{g}\; d^dx = -i\int_M \mathrm{tr}(h\, n^\mu A_\mu)\, \sqrt{g}\; d^dx + \int_M \mathrm{tr}(\bar{c}\, n^\mu \nabla_\mu c)\, \sqrt{g}\; d^dx . \tag{8.61} \]
Upon integrating out the Nakanishi–Lautrup field and imposing the gauge condition $n^\mu A_\mu = 0$, we obtain the ghost action
\[ S_{\rm gh}[c, \bar{c}] = \int_M \mathrm{tr}(\bar{c}\, n^\mu \partial_\mu c)\, \sqrt{g}\; d^dx \tag{8.62} \]
The fact that the ghosts and gauge fields have decoupled is a special feature of axial gauges. Varying the action $S_{\rm YM}[\nabla] + S_{\rm gh}[c, \bar{c}]$ with this gauge condition, we obtain the boundary term
\[ \Theta = \int_{\partial M} \left( \frac{1}{g_{\rm YM}^2}\, n^\mu\, \mathrm{tr}(F_{\mu\nu}\, \delta A^\nu) + \mathrm{tr}(\bar{c}\, \delta c) \right) \sqrt{g}\; d^{d-1}x \tag{8.63} \]
among the fields. The gauge field commutation relations are exactly the same as before in (8.59), while the anticommutation relations appropriate for the fermionic ghost fields show that the antighost plays the rôle of momentum conjugate to the ghost. Since the ghost action is purely its kinetic term, the ghost Hamiltonian vanishes and we have
\[ H = \frac{1}{2} \int_N \mathrm{tr}\left( g_{\rm YM}^2\, \frac{\delta^2}{\delta A_i^2} + \frac{1}{g_{\rm YM}^2}\, B \cdot B \right) d^{d-1}x \tag{8.65} \]
as before. However, in the 'position representation' we should build our QFT wavefunctions from the boundary values of the remaining components $A_i$ of the gauge field, together with the boundary values of the ghost $c$. The remaining degrees of freedom $E_i = n^\mu \partial_\mu A_i$ and $\bar{c}$ are treated as differential operators on the space of fields$^{87}$. Thus the space of states may be identified as $\tilde{\mathcal{H}} \cong \mathcal{O}(\mathcal{A}_N) \otimes \wedge\tilde{\mathfrak{g}}^*$, where $\mathcal{A}_N$ is the space of connections on $N$, $\mathcal{O}(\mathcal{A}_N)$ is the space of functions on $\mathcal{A}_N$, $\tilde{\mathfrak{g}}$ is the (infinite dimensional) Lie algebra of gauge transformations on $N$ and $\wedge\tilde{\mathfrak{g}}^*$ is the space of functions of the ghosts. This is because a generic $\Psi[A_i, c] \in \tilde{\mathcal{H}}$ may be expanded in powers of the fermionic ghosts as
\[ \Psi[A_i, c] = \sum_{n=0}^{\infty} \int_{\otimes^n N} c^{a_1}(x_1)\, c^{a_2}(x_2) \cdots c^{a_n}(x_n)\; \psi^{(n)}_{a_1 a_2 \ldots a_n}[A_i; x_1, \cdots, x_n]\; \prod_i d^{d-1}x_i , \]
87
Here, we are taking $n^\mu$ to be both the unit normal to the quantization hypersurface $N$, and the vector that picks out our choice of axial gauge. In the case of Lorentzian signature, where $N$ is a constant time hypersurface in Minkowski space, this is often called temporal gauge.
with coefficients $\psi^{(n)}$ that are $\mathcal{O}(\mathcal{A}_N)$-valued elements of $\wedge^n\tilde{\mathfrak{g}}^*$, the dual space to $n$ powers of the fermionic ghosts. We say the space $\tilde{\mathcal{H}}$ is graded by ghost number and write $\tilde{\mathcal{H}} = \bigoplus_n \tilde{\mathcal{H}}^{(n)}$, where $\tilde{\mathcal{H}}^{(n)}$ is the space of terms in the above expansion that are $n$th order monomials in the ghosts.

The full space $\tilde{\mathcal{H}}$ is not a Hilbert space, because the anticommutation relations imposed on the fermionic scalar ghosts violate the spin–statistics theorem. (Recall from the Michaelmas QFT course that fermionic fields had to have odd half-integer spin in any unitary theory.) However, elements of the ghost number zero part $\tilde{\mathcal{H}}^{(0)}$ are just functions of the gauge field $A|_N$, so BRST transformations act on them as gauge transformations on $N$. In particular, BRST invariant functions that are independent of ghosts correspond to elements of the 'standard' Hilbert space $L^2(\mathcal{A}_N/\mathcal{G}_N)$. As a corollary, suppose we wish to compute the correlation function of a set of BRST-invariant local operator insertions on a manifold $M$ with boundary $\partial M$. If $\partial M = \emptyset$ then we have shown above that these correlators are independent of the choice of gauge-fixing condition. This remains true in the presence of a non-trivial boundary only if the states associated with each boundary are themselves BRST invariant.
The above discussion was somewhat abstract, so it may be useful to see the significance
of BRST cohomology in a down-to-earth example. Let us consider Maxwell theory in 3+1
dimensional Minkowski space, with gauge condition $f[A] = \partial^\mu A_\mu$. In this Abelian case the
adjoint representation is trivial, and BRST transformations become
where we have
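We expand our fields in terms of momentum space as
$$ \begin{aligned}
A_\mu(x) &= \frac{1}{(2\pi)^{3/2}}\int\frac{d^3p}{(2p^0)^{1/2}}\left[ a_\mu(p)\, e^{ip\cdot x} + a^\dagger_\mu(p)\, e^{-ip\cdot x}\right] \\
c(x) &= \frac{1}{(2\pi)^{3/2}}\int\frac{d^3p}{(2p^0)^{1/2}}\left[ c(p)\, e^{ip\cdot x} + c^\dagger(p)\, e^{-ip\cdot x}\right] \\
\bar c(x) &= \frac{1}{(2\pi)^{3/2}}\int\frac{d^3p}{(2p^0)^{1/2}}\left[ \bar c(p)\, e^{ip\cdot x} + \bar c^\dagger(p)\, e^{-ip\cdot x}\right] .
\end{aligned} \tag{8.67} $$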
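Using the operator $Q_{\rm BRST}$ to apply the transformations (8.66) on the field modes, and matching coefficients of $e^{ip\cdot x}$ on both sides, we obtain
$$ \begin{aligned}
{}[Q_{\rm BRST}, a_\mu(p)] &= p_\mu\, c(p) , & [Q_{\rm BRST}, a^\dagger_\mu(p)] &= p_\mu\, c^\dagger(p) , \\
[Q_{\rm BRST}, c(p)] &= 0 , & [Q_{\rm BRST}, c^\dagger(p)] &= 0 , \\
[Q_{\rm BRST}, \bar c(p)] &= p^\mu a_\mu(p)/\xi , & [Q_{\rm BRST}, \bar c^\dagger(p)] &= p^\mu a^\dagger_\mu(p)/\xi .
\end{aligned} \tag{8.68} $$
Suppose we have a state $|\Psi\rangle$ that is BRST invariant, so $Q_{\rm BRST}|\Psi\rangle = 0$. Then the state $|\epsilon,\Psi\rangle = \epsilon^\mu a^\dagger_\mu(p)|\Psi\rangle$ containing one extra photon will also be BRST invariant provided $\epsilon\cdot p = 0$, so that the photon is transverse. On the other hand, if $\epsilon_\mu \propto p_\mu$ then we have $\epsilon^\mu a^\dagger_\mu(p)|\Psi\rangle \propto Q_{\rm BRST}\,\bar c^\dagger(p)|\Psi\rangle$, and hence this state is BRST exact. Thus the state
$$ |\epsilon + \alpha p,\Psi\rangle = |\epsilon,\Psi\rangle + \alpha\, p^\mu a^\dagger_\mu(p)|\Psi\rangle = |\epsilon,\Psi\rangle + Q_{\rm BRST}\left(\alpha\,\xi\,\bar c^\dagger(p)|\Psi\rangle\right) \tag{8.69} $$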
is physically equivalent to the state $|\epsilon,\Psi\rangle$. This is the usual notion of gauge equivalence classes for photons. Similarly, the state $|c,\Psi\rangle = c^\dagger(p)|\Psi\rangle$ with one extra ghost obeys
$$ |c,\Psi\rangle = Q_{\rm BRST}\left(\frac{\epsilon^\mu a^\dagger_\mu(p)|\Psi\rangle}{\epsilon\cdot p}\right) \tag{8.70} $$
and so is inevitably BRST exact. So the physical Hilbert space of Maxwell theory is gauge invariant and free from ghosts. You’ll explore BRST cohomology further in the final problem set.
88
More correctly, to ensure the action remains finite we should impose boundary conditions that the curvature vanishes as $|x|\to\infty$, so that the connection is pure gauge on a large $S^{d-1}$. It is then possible to have non–trivial bundles, even though the space $\mathbb{R}^d$ itself is contractible, classified by the homotopy group $\pi_{d-1}(G)$. For example, all semi–simple Lie groups $G$ have non–trivial $\pi_3(G)$, the case that is relevant to instantons on $\mathbb{R}^4$.
[Figure: the three-gluon vertex, with legs $A^a_\mu$, $A^b$ and $A^c$ carrying momenta $k$, $p$ and $q$.]
that involves the derivative of the gauge field. This vertex is cyclically symmetric in the three participating gluons and is given by
$$ \Gamma^{abc}_{\mu\nu\rho}(k,p,q) = \frac{g_{\rm YM}}{\hbar}\, f^{abc}\left[ (k-p)_\rho\,\delta_{\mu\nu} + (p-q)_\mu\,\delta_{\nu\rho} + (q-k)_\nu\,\delta_{\rho\mu} \right] \tag{8.72} $$
in momentum space. The four–gluon vertex
[Figure: the four-gluon vertex, with legs $A^a_\mu$, $A^b$, $A^c$ and $A^d$.]
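together with the constraint that $h = \partial^\mu A_\mu$. (We used this constraint in analysing the content of BRST transformations in Maxwell theory at the end of the last section.) Thus the effect of the path integral over the Nakanishi–Lautrup field is simply to modify the kinetic term of the gauge field. Combining this with the kinetic part of the original Yang–Mills action, the gluon kinetic terms are
$$ \begin{aligned}
&\frac{1}{4}\int (\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)(\partial^\mu A^{\nu a} - \partial^\nu A^{\mu a})\, d^dx + \frac{1}{2\xi}\int \partial^\mu A^a_\mu\,\partial^\nu A^a_\nu\, d^dx \\
&\qquad= \frac{1}{2}\int \left[ \partial_\mu A^a_\nu\,\partial^\mu A^{\nu a} - \partial_\mu A^a_\nu\,\partial^\nu A^{\mu a} + \frac{1}{\xi}\,\partial^\mu A^a_\mu\,\partial^\nu A^a_\nu \right] d^dx \\
&\qquad= -\frac{1}{2}\int A^a_\nu\left( \delta^\nu_\mu\,\partial^2 - \left(1 - \frac{1}{\xi}\right)\partial_\mu\partial^\nu \right) A^{\mu a}\, d^dx .
\end{aligned} \tag{8.76} $$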
Inverting this kinetic operator, one finds the gluon propagator is
$$ D^{ab}_{\mu\nu}(p) = \frac{\hbar\,\delta^{ab}}{p^2}\left( \delta_{\mu\nu} - (1-\xi)\,\frac{p_\mu p_\nu}{p^2} \right) \tag{8.77} $$
in momentum space. This propagator is often said to be in $R_\xi$ gauge. Since $\xi$ originally appeared in front of a BRST exact term, its value can be chosen freely; common choices are $\xi = 0$ (Landau’s choice, which recovers the original Lorenz gauge as for electromagnetism) and $\xi = 1$ (Feynman and ’t Hooft’s choice). We’ll usually take $\xi = 1$ below, so that
$$ D^{ab}_{\mu\nu}(p) = \delta^{ab}\,\delta_{\mu\nu}\,\frac{\hbar}{p^2} , \tag{8.78} $$
an especially simple form for the gluon propagator that is diagonal in both space-time and colour indices.
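As a quick numerical sanity check of the $R_\xi$ propagator, the sketch below multiplies the momentum-space kinetic operator $K_{\mu\nu} = p^2\delta_{\mu\nu} - (1 - 1/\xi)p_\mu p_\nu$ by $D_{\mu\nu} = (\delta_{\mu\nu} - (1-\xi)p_\mu p_\nu/p^2)/p^2$ and measures how far the product is from the identity. It assumes Euclidean signature, sets $\hbar = 1$ and drops the trivial colour factor $\delta^{ab}$; the function names are illustrative only.

```python
def kinetic_operator(p, xi):
    """K_{mu nu}(p) = p^2 delta_{mu nu} - (1 - 1/xi) p_mu p_nu (Euclidean, assumed form)."""
    d = len(p)
    p2 = sum(c * c for c in p)
    return [[(p2 if m == n else 0.0) - (1.0 - 1.0 / xi) * p[m] * p[n]
             for n in range(d)] for m in range(d)]


def propagator(p, xi):
    """D_{mu nu}(p) = (delta_{mu nu} - (1 - xi) p_mu p_nu / p^2) / p^2, with hbar = 1."""
    d = len(p)
    p2 = sum(c * c for c in p)
    return [[((1.0 if m == n else 0.0) - (1.0 - xi) * p[m] * p[n] / p2) / p2
             for n in range(d)] for m in range(d)]


def max_deviation_from_identity(p, xi):
    """Largest entry of |K D - 1|; should vanish up to rounding error."""
    k, dmat, d = kinetic_operator(p, xi), propagator(p, xi), len(p)
    worst = 0.0
    for m in range(d):
        for n in range(d):
            entry = sum(k[m][r] * dmat[r][n] for r in range(d))
            worst = max(worst, abs(entry - (1.0 if m == n else 0.0)))
    return worst


if __name__ == "__main__":
    momentum = [0.3, -1.2, 0.7, 2.0]  # arbitrary test momentum in d = 4
    for xi in (0.5, 1.0, 3.0):
        print(xi, max_deviation_from_identity(momentum, xi))
```

The check excludes $\xi = 0$, where the kinetic operator as written is singular and Landau gauge has to be understood as a limit.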
We must also consider ghost fields which can run around loops even if they do not appear externally. The action
$$ S_{\rm gh}[c,\bar c, A] = \int \left( \partial^\mu\bar c^a\,\partial_\mu c^a + g_{\rm YM}\, f^a_{bc}\;\partial^\mu\bar c^a\, A^b_\mu\, c^c \right) d^dx \tag{8.79} $$
Feynman diagrams will quickly run into a huge proliferation of terms. In fact, counting each term in the vertices (8.72) & (8.73) separately, even a fairly simple process like $2\to3$ gluon scattering receives contributions from $\sim$ 10,000 terms, already just at tree level! In theories with charged matter, such as QCD and the Standard Model, there are further interactions coming from the gluons in the covariant derivatives, and at loop level there are further contributions from the ghosts.
On the one hand, perhaps this is just the way it is. After all, Yang–Mills theory is a
complicated, non–linear theory. If you come along and prod it in a more or less arbitrary
way (i.e. do perturbation theory), you should expect that the consequences will indeed
be messy and complicated. But another possible response to the above is a slight feeling
of nausea. The whole point of our treatment of Yang–Mills theory in terms of bundles,
connections and curvature was to show how tremendously natural this theory is from a
geometric perspective. Yet this naturality is badly violated by our splitting of the Yang–Mills action into $(dA)^2$, $A^2 dA$ and $A^4$ pieces, none of which separately has any geometric meaning. Surely there must be a different way to treat this theory, one that is less ugly, and treats the underlying geometric structure with more respect?
Many physicists sympathize with this view (me included). In fact, over the years various different ways to think about Yang–Mills theory have been proposed, ranging from viewing Yang–Mills theory as a type of string theory, to writing it in twistor space instead of space-time, to putting it on a computer. Some of these approaches have been highly successful, others only partially so. For now though, we must soldier on and do our best to understand the theory perturbatively in the neighbourhood of the trivial connection. To do otherwise would be somewhat akin to trying to understand differential geometry without first knowing what a vector is.
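[Figure: $\Pi^{ab}_{\mu\nu}(k)_{\text{1 loop}}$ as the sum of three one-loop diagrams with external legs $A^a_\mu$ and $A^b_\nu$ carrying momentum $k$: the loop built from the four-gluon vertex, the loop built from two three-gluon vertices, and the ghost loop.]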
where, as always, the external legs are amputated in computing the 1PI contribution. Note that each of these graphs contributes an amount $\propto g^2(\mu)$, just reflecting the fact that in the original action, $g^2_{\rm YM}$ was indistinguishable from $\hbar$.
From our experience with $\phi^4$ theory, we do not expect the first of these diagrams to play any role in renormalizing the kinetic term, because the external momentum does not flow around the loop. Indeed, the only term it can contribute to would be a mass term for the gauge field. Likewise, there is a $k$-independent contribution to the second diagram involving the 3-pt gluon interaction that again must contribute to the mass of the gauge field in the quantum effective action. Neither of these diagrams, nor their sum, is compatible with BRST invariance, which we recall acts on functions of the gauge field alone (such as the quadratic term in the effective action) just like a gauge transformation. However, we’ll see that this would-be mass contribution is precisely cancelled by the $k$-independent part of the diagram involving the ghosts. The sum of all three diagrams indeed has a tensor structure compatible with $F^{\mu\nu}F_{\mu\nu}$.
Using the Feynman choice $\xi = 1$ of our gauge smearing parameter, the first diagram corresponds to the (dimensionless) momentum space expression
$$ \frac{g^2(\mu)\,\mu^{2-d}}{2}\int\frac{d^dp}{(2\pi)^d}\;\Gamma^{abcd}_{\mu\nu\rho\sigma}\,\frac{\delta^{\rho\sigma}\delta^{cd}}{p^2} = -\,g^2(\mu)\,\mu^{2-d}\,C_2(G)\,\delta^{ab}\,(d-1)\int\frac{d^dp}{(2\pi)^d}\,\frac{\delta_{\mu\nu}}{p^2} \tag{8.82} $$
built from the Feynman gauge propagator and the four–point vertex (8.73). Here, $C_2(G)$ is the quadratic Casimir of the gauge group $G$ in the adjoint representation, defined by $\mathrm{tr}_{\rm adj}(t^at^b) = \delta^{ab}\,C_2(G)$. It arises from accounting for the different possible ‘colours’ of the gauge field running around the loop (i.e., from considering the possible structure of the Lie algebra indices) and in the case $G = SU(N)$, we have $C_2(G) = N$. Of course, this integral diverges for any $d\in\mathbb{N}$, but to understand the full effect we should combine it with the remaining two diagrams before aiming to cancel the divergence with counterterms. To do this, for ease of comparison to the remaining diagrams, it will be helpful to write the integral in (8.82) as
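$$ \begin{aligned}
\int\frac{d^dp}{(2\pi)^d}\,\frac{\delta_{\mu\nu}}{p^2} &= \int\frac{d^dp}{(2\pi)^d}\,\frac{\delta_{\mu\nu}\,(p+k)^2}{p^2\,(p+k)^2} = \int_0^1 dx\int\frac{d^dP}{(2\pi)^d}\,\frac{\delta_{\mu\nu}}{(P^2+\Delta)^2}\left[P^2 + (1-x)^2k^2\right] \\
&= \frac{\delta_{\mu\nu}\,k^2}{(4\pi)^{d/2}}\left[ -\frac{d}{2}\,\Gamma(1-d/2)\int_0^1\frac{x(x-1)}{\Delta^{2-d/2}}\,dx + \Gamma(2-d/2)\int_0^1\frac{(1-x)^2}{\Delta^{2-d/2}}\,dx \right]
\end{aligned} \tag{8.83} $$
where we have introduced a factor of $1 = (p+k)^2/(p+k)^2$, combined the two denominators using a Feynman parameter $x$ and defined $P = p + xk$ and $\Delta = x(1-x)k^2$.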
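We now turn to the second diagram, involving two copies of the three-gluon vertex (8.72). This diagram gives a contribution
$$ \begin{aligned}
&\frac{g^2(\mu)\,\mu^{2-d}}{2}\int\frac{d^dp}{(2\pi)^d}\; D^{dd'}_{\sigma\sigma'}(p+k)\, D^{cc'}_{\rho\rho'}(p)\;\Gamma^{acd}_{\mu\rho\sigma}(k, p, -p-k)\;\Gamma^{bd'c'}_{\nu\sigma'\rho'}(-k, p+k, -p) \\
&\qquad= -\frac{g^2(\mu)\,\mu^{2-d}}{2}\,C_2(G)\,\delta^{ab}\int_0^1 dx\int\frac{d^dP}{(2\pi)^d}\,\frac{N_{\mu\nu}}{(P^2+\Delta)^2} ,
\end{aligned} \tag{8.84} $$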
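where again we have combined the two propagators using a Feynman parameter, with $P = p + xk$ and $\Delta = x(1-x)k^2$ as before. The tensor structure of this integral is the rather unappealing expression
$$ \begin{aligned}
N_{\mu\nu} &= \left[ \delta_\mu^{\ \rho}\,(k-p)^\sigma + \delta^{\rho\sigma}\,(2p+k)_\mu - \delta^\sigma_{\ \mu}\,(p+2k)^\rho \right]\left[ \delta_{\nu\rho}\,(p-k)_\sigma - \delta_{\rho\sigma}\,(2p+k)_\nu + \delta_{\nu\sigma}\,(p+2k)_\rho \right] \\
&= -\delta_{\mu\nu}\left[ (2k+p)^2 + (p-k)^2 \right] - d\,(k+2p)_\mu (k+2p)_\nu \\
&\qquad + (2k+p)_{(\mu}(k+2p)_{\nu)} + (k-p)_{(\mu}(2k+p)_{\nu)} - (k+2p)_{(\mu}(k-p)_{\nu)} ,
\end{aligned} \tag{8.85} $$
where $a_{(\mu}b_{\nu)} = a_\mu b_\nu + a_\nu b_\mu$. Writing this in terms of $P = p + xk$, discarding linear terms in $P$ and replacing $P_\mu P_\nu \to \delta_{\mu\nu}P^2/d$, we obtain the simpler form
$$ N_{\mu\nu} = -\delta_{\mu\nu}\,6P^2\left(1 - \frac{1}{d}\right) - \delta_{\mu\nu}\,k^2\left[ (2-x)^2 + (1+x)^2 \right] + k_\mu k_\nu\left[ (2-d)(1-2x)^2 + 2(1+x)(2-x) \right] \tag{8.86} $$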
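which holds under the integral sign. Altogether, this second pure gauge field diagram yields a contribution
$$ \begin{aligned}
\frac{g^2(\mu)\,\mu^{2-d}}{(4\pi)^{d/2}}\,C_2(G)\,\delta^{ab}\Bigg[ &\frac{3(d-1)}{2}\,\Gamma(1-d/2)\,k^2\delta_{\mu\nu}\int_0^1\frac{x(1-x)}{\Delta^{2-d/2}}\,dx \\
&+ \Gamma(2-d/2)\Bigg( \frac{\delta_{\mu\nu}\,k^2}{2}\int_0^1\frac{(2-x)^2 + (1+x)^2}{\Delta^{2-d/2}}\,dx - k_\mu k_\nu\int_0^1\frac{(1-d/2)(1-2x)^2 + (1+x)(2-x)}{\Delta^{2-d/2}}\,dx \Bigg)\Bigg] .
\end{aligned} \tag{8.87} $$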
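The index contraction behind the tensor $N_{\mu\nu}$ is easy to get wrong by hand, so here is a small brute-force check. It is only a sketch under stated assumptions: Euclidean signature (all indices contracted with $\delta$), the two brackets read as $[\delta_{\mu\rho}(k-p)_\sigma + \delta_{\rho\sigma}(2p+k)_\mu - \delta_{\sigma\mu}(p+2k)_\rho]$ and $[\delta_{\nu\rho}(p-k)_\sigma - \delta_{\rho\sigma}(2p+k)_\nu + \delta_{\nu\sigma}(p+2k)_\rho]$, and the expanded form taken as $-\delta_{\mu\nu}[(2k+p)^2 + (p-k)^2] - d\,(k+2p)_\mu(k+2p)_\nu$ plus the three symmetrized terms. Integer momenta keep the comparison exact.

```python
def _vectors(k, p):
    """Return A = k - p, B = 2p + k, C = p + 2k as lists."""
    a = [ki - pi for ki, pi in zip(k, p)]
    b = [2 * pi + ki for ki, pi in zip(k, p)]
    c = [pi + 2 * ki for ki, pi in zip(k, p)]
    return a, b, c


def n_contracted(k, p, m, n):
    """N_{mn} from the explicit sum over the internal indices rho, sigma."""
    a, b, c = _vectors(k, p)
    d = len(k)
    delta = lambda i, j: 1 if i == j else 0
    total = 0
    for r in range(d):
        for s in range(d):
            first = delta(m, r) * a[s] + delta(r, s) * b[m] - delta(s, m) * c[r]
            second = -delta(n, r) * a[s] - delta(r, s) * b[n] + delta(n, s) * c[r]
            total += first * second
    return total


def n_expanded(k, p, m, n):
    """N_{mn} from the expanded expression with symmetrized products."""
    a, b, c = _vectors(k, p)
    d = len(k)
    sq = lambda u: sum(x * x for x in u)
    sym = lambda u, v: u[m] * v[n] + u[n] * v[m]
    return (-(1 if m == n else 0) * (sq(c) + sq(a)) - d * b[m] * b[n]
            + sym(c, b) + sym(a, c) - sym(b, a))


if __name__ == "__main__":
    k, p = [1, -2, 3, 0], [2, 1, -1, 4]  # arbitrary integer test momenta
    print(all(n_contracted(k, p, m, n) == n_expanded(k, p, m, n)
              for m in range(4) for n in range(4)))
```

Agreement for generic momenta checks the algebra of the expansion only; it says nothing about the overall sign or normalization of the diagram.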
Like the previous integral from the 4-pt vertex, this contribution just looks like pure
garbage, as does the sum of the two. Indeed, this is correct – these two diagrams are
completely meaningless, because their value depends on the choice of gauge we used to
write down the propagator for Aµ . This is to be expected. They come from considering
just the pure Yang–Mills part of the full action, together with the $(\partial^\mu A_\mu)^2$ term we obtained from integrating out the Nakanishi–Lautrup field in the gauge–fixing term. Just as in the zero–dimensional toy case (8.10), this part of the path integral is indeed gauge dependent, and knows about all the details of our choice of gauge slice.
To cancel this dependence and obtain some meaningful, gauge–invariant quantity, we must also include the Faddeev–Popov determinant, represented at $O(\hbar)$ by the 1-loop graph
[Figure: ghost loop with external legs $A^a_\mu$ and $A^b_\nu$ carrying momentum $k$.]
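involving the ghost fields. This diagram yields
$$ \begin{aligned}
&g^2(\mu)\,\mu^{2-d}\int\frac{d^dp}{(2\pi)^d}\; f^{dac}\,(p+k)_\mu\, f^{cbd}\, p_\nu\;\frac{1}{p^2}\,\frac{1}{(p+k)^2} \\
&\qquad= \frac{g^2(\mu)\,\mu^{2-d}}{(4\pi)^{d/2}}\,C_2(G)\,\delta^{ab}\int_0^1\left[ -\frac{1}{2}\,\Gamma(1-d/2)\,\delta_{\mu\nu}\,k^2 + \Gamma(2-d/2)\,k_\mu k_\nu \right]\frac{x(1-x)}{\Delta^{2-d/2}}\,dx
\end{aligned} \tag{8.88} $$
where we recall that, as with the electron loop in our vacuum polarization calculation of QED, we obtain a minus sign from the fact that the fields running around the loop are fermionic.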
We’re now ready to combine these diagrams. We’ll begin by looking at the coefficient of $\Gamma(1-d/2)$. This $\Gamma$-function has a pole at $d = 2$. The only way our integrals could diverge in as low a dimension as $d = 2$ is if they behave in the UV as $\sim\int d^dp/p^2$, with only one factor of $1/p^2$. Such terms come from the original diagram (involving the 4-pt vertex), as well as the $k$-independent parts of the remaining two diagrams. Therefore, unless it is cancelled by the coefficient, this pole would correspond to the generation of a mass term in the quantum effective action. Combining the diagrams, the coefficient contains a factor
$$ \frac{3d - 3 - d^2 + d - 1}{2} = (1 - d/2)(d - 2) $$
and since $(1-d/2)\,\Gamma(1-d/2) = \Gamma(2-d/2)$ we see that indeed the pole cancels! BRST symmetry is working: the ghosts have ensured that the diagrams computed from the Yang–Mills action alone (together with a choice of gauge in which to write the propagator) do not generate a gauge–violating mass term in the effective action. Even better, we now see that all the terms are in fact proportional to $\Gamma(2-d/2)$ (with a pole at $d = 4$, but not at $d = 2$) and that they combine to give^{89}
$$ \Pi^{ab}_{\mu\nu}(k) = \delta^{ab}\,k^2\left( \delta_{\mu\nu} - \frac{k_\mu k_\nu}{k^2} \right)\pi(k^2) \tag{8.89} $$
where
$$ \pi(k^2) = \frac{g^2(\mu)}{(4\pi)^{d/2}}\,C_2(G)\,\Gamma(2-d/2)\int_0^1\frac{\mu^{4-d}}{\Delta^{2-d/2}}\left[ (1-d/2)(1-2x)^2 + 2 \right] dx + O(\hbar) \tag{8.90} $$
and I remind you that $\Delta = x(1-x)k^2$. This may be compared to the expressions (5.74a)-(5.74b) for vacuum polarization in QED. The crucial point is that this expression is proportional to the tensor structure $\delta^{ab}(\delta_{\mu\nu} - k_\mu k_\nu/k^2)$, corresponding to a quantum correction to the coefficient of the kinetic term $\partial^{[\mu}A^{\nu]}\partial_{[\mu}A_{\nu]}$ in the effective action, together with higher derivative terms coming from the $k$ dependence in $\Delta$.
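The cancellation of the $d = 2$ pole can be checked with exact rational arithmetic. The sketch below assumes the coefficients of $\Gamma(1-d/2)\,k^2\delta_{\mu\nu}\,x(1-x)/\Delta^{2-d/2}$ in the three diagrams are $3(d-1)/2$ (two three-gluon vertices), $-d(d-1)/2$ (four-gluon vertex) and $-1/2$ (ghost loop), which is the standard assignment (compare Peskin & Schroeder, section 16.5); the overall sign convention is an assumption and does not affect the check.

```python
from fractions import Fraction


def gamma1_coefficient(d):
    """Sum of the Gamma(1 - d/2) coefficients from the three one-loop diagrams."""
    d = Fraction(d)
    three_gluon = Fraction(3, 2) * (d - 1)   # loop with two three-gluon vertices
    four_gluon = -d * (d - 1) / 2            # loop built from the four-gluon vertex
    ghost = Fraction(-1, 2)                  # Faddeev-Popov ghost loop
    return three_gluon + four_gluon + ghost


def factored_form(d):
    """The factorized form (1 - d/2)(d - 2) quoted in the text."""
    d = Fraction(d)
    return (1 - d / 2) * (d - 2)


if __name__ == "__main__":
    for d in range(0, 9):
        print(d, gamma1_coefficient(d), factored_form(d))
```

The explicit factor of $(1 - d/2)$ is what turns $\Gamma(1-d/2)$ into $\Gamma(2-d/2)$, so the would-be mass term disappears in every dimension, not just at $d = 2$.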
89
It’s still (!) not entirely trivial to obtain this result. Among other things, one needs to use the fact that the integral over the Feynman parameter $x$ is invariant under $x\mapsto(1-x)$ to judiciously replace $x \to \tfrac12 x + \tfrac12(1-x) = \tfrac12$ in terms in the numerator of the integral that are linear in $x$. See e.g. Peskin & Schroeder, An Introduction to Quantum Field Theory, section 16.5 for more details.
The factor of $\Gamma(2-d/2)$ shows there is still a pole at $d = 4$, and setting $d = 4-\epsilon$ leads to the asymptotic expression
$$ \pi(k^2) \sim \frac{g^2(\mu)\,C_2(G)}{16\pi^2}\left[ \frac{5}{3}\left( \frac{2}{\epsilon} - \gamma + \ln\frac{4\pi\mu^2}{k^2} \right) + \frac{31}{9} \right] + O(\epsilon) \tag{8.91} $$
in four dimensional momentum space. In particular, this shows that the gauge boson propagator (in Feynman gauge) is now
$$ D^{ab}_{\mu\nu} = \frac{\delta^{ab}\,\delta_{\mu\nu}}{k^2}\;\frac{1}{1 + \dfrac{5\hbar}{48\pi^2}\,g^2\,C_2(G)\,\ln\dfrac{\mu^2}{k^2}} \tag{8.94} $$
to this order. This gives us another interpretation of the scale $\mu$: it is the energy of the gauge boson at which the propagator is just $1/k^2$.
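The numbers $5/3$ and $31/9$ in the $d = 4-\epsilon$ expansion can be reproduced numerically. Writing $(1-d/2) = -1 + \epsilon/2$ and expanding $\Gamma(\epsilon/2)\,(4\pi\mu^2/\Delta)^{\epsilon/2}$ to $O(\epsilon^0)$, the coefficient of the pole is $\int_0^1[2 - (1-2x)^2]\,dx$ and the leftover constant is $\tfrac13 - \int_0^1[2 - (1-2x)^2]\ln(x(1-x))\,dx$. The sketch below evaluates both with a plain midpoint rule; the helper names and the number of sample points are arbitrary choices, not part of the notes.

```python
import math


def midpoint_rule(f, n=200_000):
    """Midpoint approximation to the integral of f over [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


def bracket(x):
    """The d = 4 value of (1 - d/2)(1 - 2x)^2 + 2."""
    return 2.0 - (1.0 - 2.0 * x) ** 2


def leading_coefficient():
    """Coefficient multiplying 2/epsilon - gamma + ln(4 pi mu^2 / k^2)."""
    return midpoint_rule(bracket)


def finite_constant():
    """Constant left over after the logarithm of k^2 has been pulled out."""
    log_part = midpoint_rule(lambda x: bracket(x) * math.log(x * (1.0 - x)))
    return 1.0 / 3.0 - log_part


if __name__ == "__main__":
    print(leading_coefficient(), 5 / 3)
    print(finite_constant(), 31 / 9)
```

The midpoint rule never samples the endpoints, so the integrable logarithmic singularities at $x = 0, 1$ cause no trouble beyond slightly slower convergence.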
Unlike in QED, the quadratic part of the effective action is not gauge (or BRST) invariant. Indeed, whilst in any gauge the self-energy is transverse and free from quadratic divergences, its coefficient may depend on the BRST trivial parameter $\xi$. For general values of the gauge smearing parameter, it turns out we should replace
$$ \frac{5}{3} \to \frac{13}{6} - \frac{\xi}{2} $$
in expressions such as (8.91) and (8.92), showing that the effective gluon propagator depends on $\xi$. This result is not surprising: the gluon propagator depended on $\xi$ even at the classical level. However, it shows that our expression still contains BRST non-invariant pieces. We do obtain a BRST invariant answer for any physical observable, such as the $2\to2$ gluon scattering amplitude, but if we compute this to 1-loop accuracy, then as well as the 1-loop propagator we must include vertex corrections and the corresponding diagrams involving ghosts running around the loop. These diagrams also individually depend on $\xi$ in a way that intricately cancels out in the sum of all diagrams
contributing to a given physical process. Only in the sum of all such diagrams do we obtain
a BRST invariant answer.
Rather than compute all these terms, in the next section we’ll examine a simpler way
to extract physically meaningful results, such as the running of the Yang-Mills coupling
constant.
on the background and fluctuations. Notice that $a$ transforms in the adjoint here; this is as expected, since it is the difference $a = \nabla - \nabla_0$ between the true connection and our choice of background connection $\nabla_0$. However, we obtain the same transformation of $\nabla$ if we instead declare
$$ \nabla_0 \mapsto \nabla_0 \qquad\text{and}\qquad a \mapsto h^{-1}\nabla_0 h + h^{-1}a\,h , \tag{8.96b} $$
assigning the whole transformation to $a$ and leaving the background connection invariant.
Thus, if we choose the gauge fixing functional to be
we break invariance under the gauge transformations (8.96b) whilst preserving invariance
under the background gauge transformations (8.96a). This is significant because the results of our loop calculations will give a function of $\nabla_0$ that is manifestly invariant under $\nabla_0 \mapsto h^{-1}\nabla_0 h$, which is a powerful constraint on the form these loop corrections can take.
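With the choice (8.97) of $f(a)$ we have the ghost action
$$ S_{\rm gh} = -\int_M \mathrm{tr}\left( \bar c\;\nabla_0\star(\nabla_0 c + [a, c]) \right) , \tag{8.98} $$
while, in the presence of an $R_\xi$-type gauge smearing term, the path integral over the Nakanishi–Lautrup field gives
$$ \int Dh\;\exp\left( -\int_M \left[\, i\,\mathrm{tr}(h\,\nabla_0\star a) + \frac{\xi}{4}\star\mathrm{tr}(h^2) \right] \right) = \exp\left( -\frac{1}{\xi}\int_M \mathrm{tr}(\nabla_0^\mu a_\mu\,\nabla_0^\nu a_\nu)\; d^dx \right) . \tag{8.99} $$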
This is essentially the same as before, except that the derivative is covariant with respect
to background gauge transformations.
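Integrating out the Nakanishi–Lautrup field is a Gaussian integral, and its one-variable analogue is easy to test numerically: $\int dh\, e^{-\xi h^2/4 - ihf} = \sqrt{4\pi/\xi}\; e^{-f^2/\xi}$, where the single number $f$ stands in for the gauge-fixing function. The sketch below is a toy version only; the integration window and step are assumptions chosen so that the Gaussian tail is negligible.

```python
import math


def h_integral(f, xi, half_width=60.0, n=120_000):
    """Midpoint estimate of the real part of int dh exp(-xi h^2 / 4 - i h f).

    The imaginary part vanishes because the integrand's sine term is odd in h.
    """
    step = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        h = -half_width + (i + 0.5) * step
        total += math.exp(-xi * h * h / 4.0) * math.cos(h * f)
    return total * step


def suppression_factor(f, xi):
    """Ratio of the integral at gauge-fixing value f to its value at f = 0."""
    return h_integral(f, xi) / h_integral(0.0, xi)


if __name__ == "__main__":
    for f, xi in ((1.5, 1.0), (0.7, 2.0)):
        print(suppression_factor(f, xi), math.exp(-f * f / xi))
```

The ratio reproduces $e^{-f^2/\xi}$, the finite-dimensional counterpart of the $\tfrac{1}{\xi}\,\mathrm{tr}(\nabla_0^\mu a_\mu\nabla_0^\nu a_\nu)$ term in the exponent; the $f$-independent prefactor drops out of normalized correlators.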
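We’d like to obtain the effective action for $\nabla_0$ when the fluctuations $a$ are integrated out. As always, to one–loop accuracy it’s sufficient to restrict our attention to the quadratic terms in the fluctuations, though of course the higher order terms would be important at higher loops. Consider first the fluctuations in the gauge field. Keeping just the quadratic terms in the gauge smeared action with $\xi = 1$, we have
$$ \begin{aligned}
&S_{\rm YM}[\nabla_0 + g_{\rm YM}\,a] - S_{\rm YM}[\nabla_0]\,\Big|_{O(a^2)} \\
&\qquad= \int \mathrm{tr}\left( (\nabla_0^\mu a^\nu)(\nabla^0_\mu a_\nu - \nabla^0_\nu a_\mu) + F^{\mu\nu}_{\nabla_0}[a_\mu, a_\nu] + \nabla_0^\mu a_\mu\,\nabla_0^\nu a_\nu \right) d^dx \\
&\qquad= \int \mathrm{tr}\left( a_\nu\left( -\delta^{\mu\nu}\nabla_0^2 + \nabla_0^\mu\nabla_0^\nu - \nabla_0^\nu\nabla_0^\mu \right) a_\mu + F^{\mu\nu}_{\nabla_0}[a_\mu, a_\nu] \right) d^dx \\
&\qquad= \int \mathrm{tr}\left( -a_\nu\nabla_0^2 a^\nu + 2a_\nu[F^{\mu\nu}_{\nabla_0}, a_\mu] \right) d^dx .
\end{aligned} \tag{8.100} $$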
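Here, in going to the third equality we have used the fact that, since $a$ transforms in the adjoint under background gauge transformations, $(\nabla_0^\mu\nabla_0^\nu - \nabla_0^\nu\nabla_0^\mu)a_\mu = [F^{\mu\nu}_{\nabla_0}, a_\mu]$ and also $\mathrm{tr}\left(F^{\mu\nu}_{\nabla_0}[a_\mu, a_\nu]\right) = \mathrm{tr}\left(a_\nu[F^{\mu\nu}_{\nabla_0}, a_\mu]\right)$ using cyclicity of the trace. The ghosts can only appear in loops, so to 1-loop we can ignore the $\bar c\,\nabla_0(\star[a, c])$ vertex, treating the ghost action simply as
$$ S^{\rm quad}_{\rm gh}[\bar c, c, a] = -\int \mathrm{tr}\left( \bar c\,\nabla_0^2\, c \right) d^dx . \tag{8.101} $$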
Thus, integrating out the fluctuations, to one loop accuracy we have the effective action
$$ S^{\rm eff}_{\rm YM}[\nabla_0] = S_{\rm YM}[\nabla_0] + \frac{1}{2}\ln\det(\triangle) - \ln\det(-\nabla_0^2) , \tag{8.102} $$
where the first term is the classical contribution, while the determinants come from integrating out the bosonic fluctuations in the gauge field and the fermionic ghosts, respectively. In the first determinant, $\triangle$ is the matrix of operators
$$ (\triangle_{\mu\nu})^a_{\ c} = -\delta^a_{\ c}\,\delta_{\mu\nu}\,\nabla_0^2 - 2f^a_{\ bc}\,(F_{\nabla_0})^b_{\mu\nu} \tag{8.103} $$
acting on fluctuations $a_\mu = i\,a^c_\mu t_c$ in the adjoint representation for background transformations. The first term here is a Laplacian, covariant with respect to the background field, whilst the second term is the magnetic moment coupling between the fluctuation and the background field strength.
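Let’s compute these determinants, beginning with the ghost contribution. We have
$$ \begin{aligned}
-\nabla_0^2 &= -\partial^2 + i(\partial^\mu A^a_\mu + A^a_\mu\partial^\mu)\,t_a + A^{\mu a}A^b_\mu\, t_a t_b \\
&= -\partial^2 + \Delta^{(1)} + \Delta^{(2)}
\end{aligned} \tag{8.104} $$
where $\Delta^{(i)}$ is the term involving $i$ powers of the background gauge field $A_\mu$. Thus
$$ \begin{aligned}
\ln\det(-\nabla_0^2) &= \ln\det(-\partial^2) + \ln\det\left(1 + (-\partial^2)^{-1}(\Delta^{(1)} + \Delta^{(2)})\right) \\
&= \ln\det(-\partial^2) + \mathrm{tr}\ln\left(1 + (-\partial^2)^{-1}(\Delta^{(1)} + \Delta^{(2)})\right) \\
&= \ln\det(-\partial^2) + \mathrm{tr}\left(\frac{1}{-\partial^2}\,\Delta^{(2)}\right) - \frac{1}{2}\,\mathrm{tr}\left(\frac{1}{-\partial^2}\,\Delta^{(1)}\,\frac{1}{-\partial^2}\,\Delta^{(1)}\right) + O(A^3) .
\end{aligned} \tag{8.105} $$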
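The expansion used here is the operator version of $\ln\det(1 + X) = \mathrm{tr}\,X - \tfrac12\mathrm{tr}\,X^2 + O(X^3)$, with $X$ playing the role of $(-\partial^2)^{-1}(\Delta^{(1)} + \Delta^{(2)})$. As a finite-dimensional illustration only, the sketch below compares the exact $\ln\det$ of a $2\times2$ matrix with the second-order truncation; the test matrix is arbitrary and the error should shrink like the cube of its overall scale.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def trace2(m):
    return m[0][0] + m[1][1]


def matmul2(a, b):
    return [[sum(a[i][r] * b[r][j] for r in range(2)) for j in range(2)]
            for i in range(2)]


def lndet_exact(x):
    """ln det(1 + X) computed directly."""
    one_plus_x = [[1.0 + x[0][0], x[0][1]], [x[1][0], 1.0 + x[1][1]]]
    return math.log(det2(one_plus_x))


def lndet_second_order(x):
    """tr X - (1/2) tr X^2, the truncation kept at one loop."""
    return trace2(x) - 0.5 * trace2(matmul2(x, x))


def truncation_error(scale):
    """Difference between exact and truncated ln det for X = scale * X0."""
    x0 = [[1.0, 2.0], [0.5, -0.3]]  # arbitrary test matrix
    x = [[scale * entry for entry in row] for row in x0]
    return abs(lndet_exact(x) - lndet_second_order(x))


if __name__ == "__main__":
    for scale in (1e-1, 1e-2, 1e-3):
        print(scale, truncation_error(scale))
```

In the field theory the role of the small scale is played by the number of powers of the background field, which is why terms of $O(A^3)$ can be dropped when only the quadratic part of the effective action is needed.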
In going to the last line we’ve used the fact that, for a semi-simple gauge group, $\mathrm{tr}\,t_a = 0$ so the term $\mathrm{tr}\left((-\partial^2)^{-1}\Delta^{(1)}\right)$ that is linear in $A_\mu$ vanishes. Note that we only need keep track of terms up to second order in $A$; background gauge invariance ensures that these quadratic terms will be completed to full non–Abelian curvatures. The two remaining field–dependent terms may be represented by the Feynman diagrams
[Figure: two one-loop diagrams with external background-field legs.]
where the external field is the background gauge field $A$. These diagrams respectively correspond to the momentum space expressions
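$$ \mathrm{tr}\left(\frac{1}{-\partial^2}\,\Delta^{(2)}\right) = \mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\;\tilde A^a_\mu(-k)\,\tilde A^b_\nu(k)\int\frac{d^dp}{(2\pi)^d}\;\mathrm{tr}\left(\delta^{\mu\nu}\,t_a t_b\right)\frac{1}{p^2} \tag{8.106} $$
and
$$ \frac{1}{2}\,\mathrm{tr}\left(\frac{1}{-\partial^2}\,\Delta^{(1)}\,\frac{1}{-\partial^2}\,\Delta^{(1)}\right) = \frac{\mu^{4-d}}{2}\int\frac{d^dk}{(2\pi)^d}\;\tilde A^a_\mu(-k)\,\tilde A^b_\nu(k)\int\frac{d^dp}{(2\pi)^d}\;\mathrm{tr}\left(\frac{(2p+k)^\mu\, t_a\,(2p+k)^\nu\, t_b}{p^2\,(p+k)^2}\right) . \tag{8.107} $$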
Individually, these expressions are not meaningful, but they combine (using the same manipulations as in the previous section) to give
$$ \ln\det(-\nabla_0^2) = \frac{C_2(G)}{3(4\pi)^{d/2}}\,\Gamma(2-d/2)\;\frac{1}{2}\int\frac{d^dk}{(2\pi)^d}\;\tilde A^a_\mu(-k)\,\tilde A^a_\nu(k)\,(k^2\delta^{\mu\nu} - k^\mu k^\nu)\left(\frac{k^2}{\mu^2}\right)^{d/2-2} + O(A^3) \tag{8.108} $$
which is the contribution to the background field effective action from integrating out the ghosts to one loop.
We now turn to the determinant $\ln\det(\triangle)$ coming from integrating out fluctuations in the gauge field. Again, we write
exactly the same as in the ghost calculation, except that here they act on all $d$ space-time components of the fluctuating field $a_\mu$, since each of these components can run around the loop. Thus these terms give $d\times$(8.108) so, bearing in mind the weighting of the ghost and gauge determinants in (8.102), they combine with the ghost contribution to give a term
$$ \frac{d-2}{2}\;\frac{C_2(G)}{6(4\pi)^{d/2}}\,\Gamma(2-d/2)\int\frac{d^dk}{(2\pi)^d}\;\tilde A^a_\mu(-k)\,\tilde A^a_\nu(k)\,(k^2\delta^{\mu\nu} - k^\mu k^\nu)\left(\frac{k^2}{\mu^2}\right)^{d/2-2} \tag{8.110} $$
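in the effective action.
The remaining contribution is the magnetic moment term. As above, it suffices to work to quadratic order in the background field, so we can treat the coupling $F_{\nabla_0}$ as its Abelian part $dA$. We obtain
$$ \begin{aligned}
\mathrm{tr}\left(\frac{1}{-\partial^2}\,\Delta^{(F)}\,\frac{1}{-\partial^2}\,\Delta^{(F)}\right) &= 4C_2(G)\,\mu^{4-d}\int\frac{d^dk}{(2\pi)^d}\int\frac{d^dp}{(2\pi)^d}\;\mathrm{tr}\left(\tilde A_\mu(k)\,\tilde A_\nu(-k)\right)\frac{k^2\delta^{\mu\nu} - k^\mu k^\nu}{p^2\,(p+k)^2} \\
&= \frac{4C_2(G)}{(4\pi)^{d/2}}\,\Gamma(2-d/2)\int\frac{d^dk}{(2\pi)^d}\,(k^2\delta^{\mu\nu} - k^\mu k^\nu)\left(\frac{k^2}{\mu^2}\right)^{d/2-2}\mathrm{tr}\left(\tilde A_\mu(k)\,\tilde A_\nu(-k)\right)
\end{aligned} \tag{8.111} $$
(For further details of this calculation, see e.g. section 16.6 of Peskin & Schroeder.)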
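Combining all the pieces, we obtain the effective action for the background field as
$$ \begin{aligned}
S^{\rm eff}_{\rm YM}[\nabla_0] &= S_{\rm YM}[\nabla_0] - \frac{\Gamma(2-d/2)}{(4\pi)^{d/2}}\,\frac{11C_2(G)}{3}\;\frac{1}{2}\int d^dx\,\left(\frac{-\partial^2}{\mu^2}\right)^{d/2-2}\mathrm{tr}(F_{\nabla_0}\wedge\star F_{\nabla_0}) \\
&= \frac{1}{2}\int d^dx\,\left[ \frac{1}{g^2(\mu)} - \frac{\Gamma(2-d/2)}{(4\pi)^{d/2}}\,\frac{11C_2(G)}{3}\left(\frac{-\partial^2}{\mu^2}\right)^{d/2-2} \right]\mathrm{tr}(F_{\nabla_0}\wedge\star F_{\nabla_0})
\end{aligned} \tag{8.112} $$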
where the first term is the classical piece and the remaining terms come from expanding the 1-loop determinants. Note that we only computed the quadratic terms in $A$ above: in momentum space these involved the characteristic factor $k^2\delta_{\mu\nu} - k_\mu k_\nu$ corresponding to $(\partial_\mu A_\nu - \partial_\nu A_\mu)^2$ in position space, and we used background gauge invariance to conclude that these terms must be completed to the full, non-Abelian curvature $F_{\nabla_0\,\mu\nu}F^{\mu\nu}_{\nabla_0}$ involving cubic and quartic powers of the background field that we neglected above. As always, the full quantum effective action will also involve an infinite series of higher powers of $F_{\nabla_0}$ and its derivatives, though these will be irrelevant in $d = 4$.
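The $\Gamma$-function has a pole at $d = 4$ which we can remove using the $\overline{\rm MS}$ counterterm
$$ \delta Z_3 = \frac{11}{3}\,\frac{C_2(G)}{(4\pi)^2}\left( \frac{2}{\epsilon} - \gamma + \ln 4\pi \right) \tag{8.113} $$
leaving us with the 1-loop effective action
$$ S^{\rm eff}_{\rm YM}[\nabla_0] = \frac{1}{2}\int d^4x\,\left[ \frac{1}{g^2(\mu)} + \frac{1}{(4\pi)^2}\,\frac{11C_2(G)}{3}\,\ln\left(\frac{-\partial^2}{\mu^2}\right) \right]\mathrm{tr}(F_{\nabla_0}\wedge\star F_{\nabla_0}) \tag{8.114} $$
for the background Yang-Mills field in $d = 4$.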
corresponding to the running coupling
$$ g^2(\mu') = \frac{g^2(\mu)}{1 + \dfrac{g^2(\mu)}{16\pi^2}\,\dfrac{11}{3}\,C_2(G)\,\ln(\mu'^2/\mu^2)} . \tag{8.117} $$
The $\beta$-function is negative, and we see that if $\mu' > \mu$ then $g(\mu') < g(\mu)$. Thus the Yang–Mills coupling is marginally relevant: Yang–Mills theory approaches a free theory in the ultraviolet, and becomes strongly coupled in the infra-red. This, finally, is our paradigmatic example of a continuum QFT. Yang–Mills theory begins arbitrarily close to a Gaussian fixed point and moves out along the renormalized trajectory in a (marginally) relevant direction as we probe at lower and lower energies. The fact that it is strongly coupled at large distances means perturbation theory around the classical action is a poor guide to the low–energy physics, which is how it avoids Pauli’s initial criticism that we don’t observe any long range forces beyond gravity and electromagnetism.
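To get a feel for the one-loop running, the sketch below evaluates $g^2(\mu') = g^2(\mu)\big/\big[1 + \tfrac{g^2(\mu)}{16\pi^2}\tfrac{11}{3}C_2(G)\ln(\mu'^2/\mu^2)\big]$ for pure Yang–Mills theory. The function name and the sample values are illustrative assumptions; only the formula is taken from the text.

```python
import math


def running_g2(g2_mu, mu, mu_prime, c2_adjoint):
    """One-loop pure Yang-Mills coupling g^2(mu') given g^2(mu).

    c2_adjoint is the quadratic Casimir C_2(G), e.g. N for SU(N).
    """
    b = (11.0 / 3.0) * c2_adjoint / (16.0 * math.pi ** 2)
    denominator = 1.0 + g2_mu * b * math.log(mu_prime ** 2 / mu ** 2)
    if denominator <= 0.0:
        # Below this scale the one-loop formula has left its regime of validity.
        raise ValueError("one-loop running breaks down: coupling has blown up")
    return g2_mu / denominator


if __name__ == "__main__":
    g2 = 1.0  # assumed value of g^2 at the reference scale mu = 1
    for mu_prime in (0.1, 1.0, 10.0, 100.0):
        print(mu_prime, running_g2(g2, 1.0, mu_prime, 3.0))
```

Because $1/g^2$ is linear in $\ln\mu$, running from $\mu$ to $\mu'$ and then on to $\mu''$ gives the same answer as running directly from $\mu$ to $\mu''$. Pushing $\mu'$ far enough into the infra-red makes the denominator vanish, which is the perturbative signal of strong coupling rather than a statement about the true low-energy physics.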
More generally, as you’ll explore in the final problem sheet, if we couple scalar fields in representations $r_i$ of $G$ and Dirac spinors in representations $r_j$, then the $\beta$-function is modified to
$$ \beta(g) = -\frac{g^3}{16\pi^2}\left[ \frac{11}{3}\,C_2(G) - \frac{1}{3}\sum_{i\in\text{scalars}} C(r_i) - \frac{4}{3}\sum_{j\in\text{fermions}} C(r_j) \right] \tag{8.118} $$
where $\mathrm{tr}_r(t^at^b) = C(r)\,\delta^{ab}$. For example, if $G = SU(N)$ then $C_2(G) = N$ and $C(\text{fund}) = 1/2$. Provided we do not have too much matter, the $\beta$-function of Yang–Mills theory is negative, so this theory approaches a free theory in the UV, with
$$ g^2(\mu') = \frac{g^2(\mu)}{1 + \dfrac{g^2(\mu)}{16\pi^2}\left( \dfrac{11}{3}\,C_2(G) - \dfrac{1}{3}\sum_i C(r_i) - \dfrac{4}{3}\sum_j C(r_j) \right)\ln(\mu'^2/\mu^2)} \tag{8.119} $$
as the running coupling. In fact, a theorem of Gross and Coleman states that non–Abelian gauge theories are the only possible non–trivial, asymptotically free QFTs in four dimensions.
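The phrase "not too much matter" can be made quantitative with a few lines of exact arithmetic. The sketch below evaluates the bracket $b_0 = \tfrac{11}{3}C_2(G) - \tfrac13\sum_i C(r_i) - \tfrac43\sum_j C(r_j)$, so that $\beta(g) = -g^3 b_0/16\pi^2$, and applies it to $SU(3)$ with $n_f$ Dirac fermions in the fundamental representation as an illustration. The function names are hypothetical; the group theory inputs $C_2(SU(N)) = N$ and $C(\text{fund}) = 1/2$ are those quoted in the text.

```python
from fractions import Fraction


def one_loop_b0(c2_adjoint, scalar_indices=(), fermion_indices=()):
    """Bracket of the one-loop beta function: beta(g) = -g^3 b0 / (16 pi^2).

    scalar_indices and fermion_indices list C(r) for each scalar field and
    each Dirac fermion coupled to the gauge field.
    """
    gauge = Fraction(11, 3) * c2_adjoint
    scalars = Fraction(1, 3) * sum(scalar_indices)
    fermions = Fraction(4, 3) * sum(fermion_indices)
    return gauge - scalars - fermions


def is_asymptotically_free(c2_adjoint, scalar_indices=(), fermion_indices=()):
    """True when the one-loop beta function is negative."""
    return one_loop_b0(c2_adjoint, scalar_indices, fermion_indices) > 0


if __name__ == "__main__":
    fund = Fraction(1, 2)  # C(fund) for SU(N)
    for n_f in (0, 6, 16, 17):
        b0 = one_loop_b0(3, fermion_indices=[fund] * n_f)
        print(n_f, b0, is_asymptotically_free(3, fermion_indices=[fund] * n_f))
```

For $SU(3)$ this reproduces the familiar $b_0 = 11 - \tfrac23 n_f$, so asymptotic freedom at one loop survives up to sixteen fundamental Dirac flavours.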
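$\nabla\to\nabla + A$, the difference^{90}
$$ \mathrm{tr}(F_{(\nabla+A)}\wedge F_{(\nabla+A)}) - \mathrm{tr}(F_\nabla\wedge F_\nabla) = d\;\mathrm{tr}\left( 2A\wedge F_\nabla + A\wedge\nabla A + \frac{2}{3}\,A\wedge A\wedge A \right) \tag{8.121} $$
is a total derivative. Thus the integral over this term vanishes if $\partial M = \emptyset$, or more generally if we require $A|_{\partial M} = 0$. Consequently, classical physics is completely unaffected if we include this term in the action; the Euler–Lagrange equations of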
since $F_\partial = 0$ obviously. Our argument that the value of $S_{\rm top}[\nabla]$ is the same for all connections on a fixed topological type of bundle now shows that $S_{\rm top}[\nabla] = 0$ for any connection on the trivial bundle $P = M\times G$. However, we can’t use this argument to conclude that $S_{\rm top}[\nabla]$ always vanishes because for a topologically non–trivial bundle $P$ (like a higher dimensional version of our example of the Möbius strip), it is not possible to express the connection $\nabla$ as $\partial + A$ for any $A$ defined globally over $M$ and indeed for topologically non–trivial bundles, $S_{\rm top}[\nabla]\neq0$. Though I won’t prove it here, the factor of $1/8\pi^2$ in (8.120) ensures that when the gauge group $G = SU(N)$, in fact
$$ \frac{1}{8\pi^2}\int_M \mathrm{tr}(F\wedge F) \in \mathbb{Z} \tag{8.124} $$
so that $S_{\rm top}[\nabla]$ is just $i\theta$ times an integer. The particular integer we get is known as the instanton number in physics, or the second Chern class in maths, and gives us information about how ‘topologically twisted’ our bundle is.
If we also include the topological term (8.120) in the action and sum over all bundle topologies (allowing for instantons) then the partition function becomes
$$ Z_{\rm YM}[g^2_{\rm YM}, \theta, g] = \sum_{\substack{\text{topologically}\\ \text{distinct } P}}\left( \int_{\mathcal{A}/\mathcal{G}} DA\; e^{-S[\nabla]} \right) = \sum_{n\in\mathbb{Z}} e^{i\theta n}\int DA\; e^{-S_{\rm YM}[\nabla]} \tag{8.125} $$
where in the second equality I’ve used without proof the fact that, when $G$ is semi–simple, the instanton number $n$ gives a complete specification of the topological type of the bundle $P\to S^4$. We see that using the action (8.122) weights topologically distinct bundles by the factor $e^{i\theta n}$, where $n$ is the instanton number. The value of the coupling $\theta$ allows us to vary the relative importance of different instanton contributions.
90
It’s a good exercise to derive this result for yourselves, particularly if you’re taking the Applications of Differential Geometry to Physics course. If not, just treat what I’m writing here as a schematic way to keep track of all the derivatives and gauge fields.
The presence of the $\epsilon^{\mu\nu\rho\sigma}$ in the definition of $S_{\rm top}$ implies that this term violates CP symmetry, with the size of CP violation being related to $\theta$ (at least classically). Physically, such CP violation would lead to effects including the generation of an electric dipole moment for the neutron. Thus, the observed absence of any such dipole moment provides a constraint on the size of $\theta$. In fact, experiments can test for neutron dipole moments very sensitively, and they tell us that in QCD, $|\theta| \ll 10^{-9}$. The small size of $\theta$ is known as the strong CP problem and calls for an explanation. The favoured theoretical solution (proposed by Roberto Peccei and Helen Quinn in 1977) involves promoting the coupling constant $\theta$ to a new field $\theta(x)$. While there’s as yet no experimental evidence for this field, quite remarkably it turns out that the theoretical properties $\theta(x)$ must have to enable it to solve the strong CP problem also automatically make it a good candidate to be an important constituent of dark matter. This is just one of many striking examples where the apparently different arenas of High Energy Theory and Cosmology come together.