Foundations of Debugging for Golang
MATTHEW BOYLE
byteSizeGo
Copyright © 2024 by byteSizeGo
No part of this book may be reproduced in any form or by any electronic or mechanical
means, including information storage and retrieval systems, without written
permission from the author, except for the use of brief quotations in a book review.
Every effort has been made in the preparation of this book to ensure the accuracy of the
information presented. However, the information contained in this book is sold
without warranty, either express or implied. Neither the author, nor byteSizeGo or its
dealers and distributors, will be held liable for any damages caused or alleged to have
been caused directly or indirectly by this book.
CONTRIBUTORS
Matt has been writing Go for production since 2018 and often shares
blog posts and fun trivia about Go over on Twitter (@MattJamesBoyle).
ABOUT THE TECHNICAL REVIEWERS
Ansar Smagulov is a full-stack engineering lead specializing in Go
and TypeScript.
Ansar has led teams and projects at Central Asia's largest cloud
provider, PS Cloud Services. He also develops Zen, a free and efficient
privacy guard for desktop operating systems, among other open
source projects.
You can find more about Ansar's open source work at GitHub
(@anfragment).
To learn more about him, you can visit his website https://swastik.is-a.dev/
Michael Bang is a pragmatic developer who loves to spend his time
building maintainable and well-tested software.
He fell in love with Go more than a decade ago and has used it happily
ever since. He loves helping the community in small ways from the
shadows; Foundations of Debugging for Golang is the fourth book
he has contributed to as a technical reviewer. If you need help with a
technical book, you're welcome to reach out to @micvbang.
CONTENTS
1. WELCOME!
METHODS OF DEBUGGING
1. DEBUGGING BY EYE
A Simple Exercise
Strategies for Effective Code Inspection
Interfaces
Concurrency
Styleguides
Another Exercise
Wrapping Up
2. PAIR PROGRAMMING
What is Pair Programming?
Switching Roles
Pairing Remotely
3. LOGGING
Logging Locally with the fmt package
Slog
Creating a Logging Strategy
An Exercise
Wrapping Up
4. THE DEBUGGER
Setting up the Debugger in VSCode
Breakpoints
Stepping Over
Stepping Into
Conditional Breakpoints
Debugging goroutines
DEBUGGING IN PRODUCTION
1. METRICS
What are Metrics?
Categories of Metrics
Exposing Metrics
Viewing Metrics
Introduction to PromQL
Alerting
Exercise
2. DISTRIBUTED TRACING
Open Telemetry
Exercise
Notes
WELCOME!
Welcome to this book, Foundations of Debugging for Golang.
This book started life as a course and was turned into a book by
popular request. If you are a junior or mid-level Go engineer, I think
this book will be useful for you.
I have written this book from the ground up, but the content still
overlaps with the course somewhat. If you are more of a visual learner
like me, and as a thank you for buying this book, you can get 20% off
the course from https://bytesizego.com using the code BOOK20.
If you don’t know what any of those things are, do not worry, as by
the end of this book you will.
I hope you enjoy this book and I look forward to hearing what you
think! The best way to reach me is over on Twitter (or X): @mattjamesboyle.
- Matt Boyle
WHAT IS DEBUGGING AND WHY DO WE DO IT?
This can be tricky, as some only appear in specific situations like high
load.
This ensures the bug stays fixed and the new code maintains integrity.
Just like Grace Hopper debugging the Harvard Mark II, we too can
methodically squash the bugs in our code!
METHODS OF DEBUGGING
In this section we will start by learning some high-level techniques for
getting better at debugging (such as how to get better at spotting bugs
by eye or pairing with a buddy). After that, we will dive into logging
and discuss strategies for ensuring you don’t log too much, a
common problem I see.
We’ll finish the section by learning about the debugger, ensuring
you can get it set up, and then learning some more advanced techniques
for using it.
DEBUGGING BY EYE
As a Go developer, one of the most fundamental skills you'll
need is the ability to debug by eye. This involves developing
an instinct to spot issues just by looking at code.
Whilst tooling has come a long way, and Go tooling really is excellent,
it's important to build this fundamental skill too.
cially when starting out, try and stick with an issue for a while; it is
how you will grow as a developer.
Debugging by eye may sound simple, but it's a really powerful tool.
Over time, you'll develop a sixth sense for spotting anomalies and
potential bugs just by reading your code. It's not just about finding
errors; it's about understanding the flow and logic of your program.
A SIMPLE EXERCISE
Let's start with a simple exercise to train your eye to catch a real issue.
Take a moment to look at the following piece of Go code and see if
you can spot the bug:
package main

import "fmt"

func main() {
	numbers := []int{1, 2, 3, 4, 5}
	fmt.Println("Sum:", sum(numbers))
}
Can you spot the issue? It's a really small change. The change is
underlined on the next page.
package main

import "fmt"

func main() {
	numbers := []int{1, 2, 3, 4, 5}
	fmt.Println("Sum:", sum(numbers))
}
With the original code, we're looping over a slice of integers by index. Our
loop condition says to keep looping while the index is less than or equal to
the length of the slice. In Go, slices are zero-indexed, so the last valid
index is the length minus one, and we would receive an index out-of-range
error when we run this code. This would not be a compile-time error; it would
be a runtime panic, meaning our program will crash when we run it.
This may not seem like a big deal in this situation, but if this piece of
code did not have unit tests and was deployed to production, it could
lead to a customer-facing issue.
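The listings above do not show the sum function itself. Below is a minimal sketch of what the buggy loop and one possible fix might look like. The names sumBuggy and sumFixed are illustrative and do not come from the original listing.

```go
package main

import "fmt"

// sumBuggy is a hypothetical reconstruction of the bug described above.
// The condition i <= len(numbers) lets i reach len(numbers), which is one
// past the last valid index, so the final iteration panics.
func sumBuggy(numbers []int) int {
	total := 0
	for i := 0; i <= len(numbers); i++ {
		total += numbers[i]
	}
	return total
}

// sumFixed ranges over the slice, so the index can never go out of bounds.
func sumFixed(numbers []int) int {
	total := 0
	for _, n := range numbers {
		total += n
	}
	return total
}

func main() {
	numbers := []int{1, 2, 3, 4, 5}
	fmt.Println("Sum:", sumFixed(numbers)) // Sum: 15
	// Calling sumBuggy(numbers) here instead would panic at runtime with an
	// index out of range error.
}
```

Using range rather than a hand-written index condition removes this whole class of off-by-one mistakes.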
• Pair programming.
ERROR HANDLING
Unlike in many other languages, in Go, errors are not exceptions but
values that can (and should!) be propagated and handled explicitly.
Understanding the idiomatic way of handling errors in Go is crucial
to writing robust and reliable code. Here’s a simple example:
We can then return this error to the calling function, which can handle it
explicitly.
INTERFACES
Interfaces in Go are a way of defining behavior, and they play a
central role in achieving abstraction and code reusability. Being able
to recognize and work with interfaces is a fundamental skill for any
Go developer.
Learning interfaces is beyond the scope of this book. If you feel you
have not quite grasped them yet, I recommend the gobyexample interface
tutorial1.
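As a quick refresher, here is a minimal sketch of an interface defining behavior that two unrelated types satisfy. Notifier, EmailNotifier, SMSNotifier and alert are hypothetical names chosen only for this illustration.

```go
package main

import "fmt"

// Notifier describes behavior only: anything with a Notify method satisfies it.
type Notifier interface {
	Notify(msg string) string
}

// EmailNotifier and SMSNotifier are two unrelated types. Neither declares
// that it implements Notifier; having the right method is enough.
type EmailNotifier struct{ Address string }

func (e EmailNotifier) Notify(msg string) string {
	return "email to " + e.Address + ": " + msg
}

type SMSNotifier struct{ Number string }

func (s SMSNotifier) Notify(msg string) string {
	return "sms to " + s.Number + ": " + msg
}

// alert depends on the behavior, not on a concrete type.
func alert(n Notifier, msg string) string {
	return n.Notify(msg)
}

func main() {
	notifiers := []Notifier{
		EmailNotifier{Address: "ops@example.com"},
		SMSNotifier{Number: "555-0100"},
	}
	for _, n := range notifiers {
		fmt.Println(alert(n, "disk almost full"))
	}
}
```

When reading code by eye, it helps to ask which concrete type sits behind an interface value at a given call site, since that decides which method actually runs.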
CONCURRENCY
Concurrency is another core feature of Go, so I recommend getting
familiar with goroutines and channels. These patterns can be tricky to
master, but once you do, they'll open up new possibilities for writing
efficient and scalable code. The official Go site has a nice interactive
introduction to concurrency2.
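For orientation, here is a small sketch of goroutines and a channel working together. The function name sumOfSquares is hypothetical; the point is only the shape of the pattern.

```go
package main

import (
	"fmt"
	"sync"
)

// sumOfSquares squares each input in its own goroutine and collects the
// results over a buffered channel.
func sumOfSquares(inputs []int) int {
	results := make(chan int, len(inputs))
	var wg sync.WaitGroup

	for _, n := range inputs {
		wg.Add(1)
		// n is passed as an argument so each goroutine gets its own copy.
		go func(n int) {
			defer wg.Done()
			results <- n * n
		}(n)
	}

	// Wait for every goroutine to finish, then close the channel so the
	// range loop below terminates.
	wg.Wait()
	close(results)

	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumOfSquares([]int{1, 2, 3, 4})) // 30
}
```

The results arrive in no particular order, which is fine here because addition does not care about order. Code that does depend on ordering is where concurrency bugs tend to hide.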
STYLEGUIDES
Reading and understanding style guides can be a great way to get a
sense of what idiomatic Go code looks like. The Go team3 and organizations
like Uber4 have published their style guides, which can serve
as excellent resources for learning best practices and avoiding
common pitfalls.
In addition to style guides, there are tools like Golangci-lint5 that can
help you catch common code smells and potential issues. These tools
can be invaluable for catching subtle bugs and improving the overall
quality of your code.
As you gain more experience with Go, you'll start to develop an eye
for spotting potential issues in code. This skill, often referred to as
"debugging by eye," can save you a lot of time and effort in the
long run.
ANOTHER EXERCISE
Below is another exercise to practice debugging by eye. It’s followed
immediately by the solution so try not to skip ahead! See if you can
spot the issue, simply by reading the code carefully.
package main

import (
	"fmt"
	"os"
	"strings"
)

func main() {
	content, _ := os.ReadFile("config.txt")
	if strings.Contains(string(content), "enable_feature") {
		fmt.Println("Feature Enabled")
	} else {
		fmt.Println("Feature Disabled")
	}
}
Solution
WRAPPING UP
This has been a gentle introduction to the concept of debugging. It
may seem simple, but this fundamental skill - debugging by eye - can
save your bacon in many cases.
If you want further practice, try to identify and fix the bugs in the
following code snippets without running the code. This exercise will
help you apply what you've learned and sharpen your debugging
abilities.
Take a look at the code below and try and fix all the issues. By my
count, there are four.
package main

import (
	"fmt"
	"strconv"
)

func main() {
	a, _ := strconv.Atoi("thirty three")
	t := Test{}
	if t.str == string(a) {
		fmt.Println("should log")
	}
}
Can you spot them all? If not, grab a buddy and head on over to the
next chapter on pair programming to learn how you can work
together to debug. If you don’t have anybody to pair with right now,
don’t worry - you can watch me tackle the solution here7.
PAIR PROGRAMMING
When I used to work 100% in the office, we used to pair often and I
credit it with much of my technical growth. I found moving to remote
working has made pairing much harder, but I found a few ways to
make it work (although I think the experience is still inferior). I’ll
share these with you at the end of the chapter.
SWITCHING ROLES
To get the full benefits of pair programming, it's important to switch
roles regularly; I recommend every 45 minutes or so (and that you
pair for at least an hour and a half). This role reversal ensures that
both programmers have the opportunity to experience the different
perspectives and challenges associated with each role, ultimately
leading to a more well-rounded understanding of the codebase and
the problem at hand.
When you've spent hours staring at a problem and can't seem to spot
the issue, a fresh set of eyes can do wonders.
that the driver might not be aware of. This not only helps
improve the current code but also contributes to the driver’s
professional growth.
7. Stay Patient and Respectful: Be patient, especially if the
driver is less experienced or takes longer to understand
certain concepts. Respect their pace and learning process.
Remember that the goal is to support and help them grow.
8. Balance Guidance and Silence: Know when to speak up and
when to stay silent. Too much input can be overwhelming,
while too little can leave the driver feeling unsupported.
Strike a balance by providing guidance at key moments and
allowing the driver space to think and work independently.
9. Encourage Experimentation: Support the driver in
exploring different approaches and solutions. Encourage
them to try out new ideas, even if they might not work out.
This fosters a learning environment and can lead to
innovative solutions.
10. Maintain a Positive Attitude: Keep the mood positive. Pair
programming can be intense, and maintaining a positive
attitude helps keep the session productive and enjoyable.
Celebrate successes and view mistakes as learning
opportunities.
PAIRING REMOTELY
Pairing in person is fairly easy; you can pass the keyboard back and
forth and take breaks to grab a coffee and catch up about Love Island.
Doing it remotely takes discipline and patience. There are going to be
times when your partner’s internet is flaky, they have to answer the
door or the cat decides to sit on the keyboard; this is normal and
should be embraced and used as an opportunity to have a quick break
or non-code based chat. Using pairing as an opportunity to build
friendship and rapport is encouraged!
There is a whole host of tools out there to “help” you pair remotely;
some of them are free and some of them are quite expensive. In most
cases, I still just share my screen via Zoom or Google Hangouts and
this is sufficient. I have tried some of the tools created for remote
pairing such as Code With Me1 from JetBrains and, whilst very technically
impressive, I found it can actually detract from the process as
each participant gets their own cursor and view of the code - it is
therefore very easy to get distracted and go and look at a different file
or jump ahead.
My advice here: keep it simple to start with. If you pair with someone
a lot and want to explore some of the fancier tools then give it a try.
Let me know if you come across a great one!
package main

import (
	"fmt"

	"golang.org/x/sync/errgroup"
)

func main() {
	var g errgroup.Group
	values := []int{1, 2, 3}
	for v := range values {
		g.Go(func() error {
			if err := process(v); err != nil {
				return err
			}
			return nil
		})
	}
Whilst modern tools have their place, there's no substitute for good
old-fashioned code comprehension and collaboration. And who
knows? The solution to your next production puzzle might just lie in
the simple act of "debugging by eye."
LOGGING
In this chapter, we'll dive into the world of logging and logging strategies.
We’re going to start by talking about logging generally before
moving on to structured logs and how they differ from the unstructured
logs you might be used to.
Structured logs are more than just plain text; they're organized,
machine-readable data structures that can be easily parsed and
Let’s go through some of the most common functions you might want
to use.
fmt.Print
fmt.Print("hello")
Even for complex data types like structs, fmt.Print handles the
formatting in an easy to consume format. Consider the following
code:
u := user{
name: "Matt",
age: 3,
}
fmt.Print(u)
fmt.Println
u := user{
	name: "Matt",
	age:  3,
}
fmt.Println(u)
fmt.Println(u)
Would print:
{Matt 3}
{Matt 3}
fmt.Printf
This function allows you to format the output using placeholders. It's
one of the most commonly used printing functions, offering flexibility
and readability.
u := user{
name: "Matt",
age: 3,
}
fmt.Printf("The user's name is %s", u.name)
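Beyond %s, a few other verbs are handy when debugging structs. The sketch below assumes the user type from the earlier examples has a string name field and an int age field; the describe helper is hypothetical.

```go
package main

import "fmt"

// user mirrors the struct used in the examples above (field types assumed).
type user struct {
	name string
	age  int
}

// describe formats u with the verbs most useful for debugging.
func describe(u user) []string {
	return []string{
		fmt.Sprintf("%v", u),  // values only: {Matt 3}
		fmt.Sprintf("%+v", u), // with field names: {name:Matt age:3}
		fmt.Sprintf("%#v", u), // Go syntax, including the package-qualified type name
		fmt.Sprintf("%T", u),  // the type only
	}
}

func main() {
	u := user{name: "Matt", age: 3}
	for _, line := range describe(u) {
		fmt.Println(line)
	}
}
```

%+v is usually the best default for debug output, because field names make it obvious which value is which.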
The fmt package will take you a surprisingly long way. Whilst you’re
just starting out, you should embrace it. However, it is missing a
couple of features that can make our logs a little more useful. Let’s
take a look at another standard library package that can make our logs
a little more awesome.
The log package defines a Logger type with methods for formatting
output, as well as helper functions for common logging tasks. It also
includes additional functions like Fatal and Panic, which can be
used to handle critical errors and exceptional situations.
log.Print("hello")
Outputs (the default logger prefixes each line with the date and time):
2024/03/31 12:00:00 hello
The API for the log package is very similar to that of fmt. All of the
below will work:
u := user{
	name: "Matt",
	age:  3,
}
log.Print("Hello")
log.Println("Hello")
log.Printf("The user's name is %s", u.name)
log.Fatal("oops")
log.Print("Hello")
The above code will print everything up to and including oops, but not the
final Hello. This is because log.Fatal logs a message and then calls
os.Exit(1), terminating the program. This function should be used sparingly,
and typically only in the main function, when the program encounters a
critical error and cannot continue executing.
Whilst the log package is a great tool for local debugging, it may not
be the best choice for production environments, for the reasons we
outlined above. The slog package offers advanced features like structured
logging, which can provide better organization and filtering
capabilities for your logs. We’ll look at that in the next section.
SLOG
In Go 1.21, slog was introduced to the standard library2. Before this,
structured logging had to be either built manually or provided by a
third-party library (I typically used Zap3 from Uber).
As you can see, we have gained the INFO keyword and the ability to
add attributes that are printed in a JSON format. This is structured,
leveled logging and is how I recommend you log for production.
Structured Logging
The problem with logs generally is they are unstructured. This means
they do not adhere to a consistent, predefined format, making them
hard to query and sort, which are two traits that are pretty critical if
we intend for our logs to be ingested into another system so we can
use them for debugging.
logger := slog.New(
	slog.NewJSONHandler(
		os.Stdout,
		nil,
	),
)
logger.Info("hello, world",
	"user", "Matt",
)

{"time":"2023-08-04T16:58:02.939245411-04:00","level":"INFO","msg":"hello, world","user":"Matt"}
slog.LogAttrs(
	context.Background(),
	slog.LevelInfo,
	"hello, world",
	slog.String("user", "Matt"),
)
As you can see, LogAttrs takes a context. This means a handler can
access this and pull out values such as the trace ID (we talk about
tracing later in this book).
	slog.LogAttrs(
		ctx,
		slog.LevelInfo,
		"Processing request",
		slog.String("userID", userID),
	)
	slog.LogAttrs(
		ctx,
		slog.LevelInfo,
		"Request processed successfully",
		slog.String("userID", userID),
	)
}
There is much more to slog, but beyond log levels, you now know
near enough everything you need to know to start producing structured
logs!
Log Levels
You can set the minimum log level using the Level field of
slog.HandlerOptions. For example, to set the log level to Error and above,
you can use the following code:
opts := &slog.HandlerOptions{
	Level: slog.LevelError,
}
logger := slog.New(slog.NewJSONHandler(os.Stdout, opts))
err := errors.New("some-error")
logger.Info(
	"Hello World",
	slog.String("meta_info", "something else"),
	slog.Int("account_id", 35464),
	slog.Any("err", err),
)
logger.Error(
	"Hello World",
	slog.String("meta_info", "something else"),
	slog.Int("account_id", 35464),
	slog.Any("err", err),
)

{"time":"2009-11-10T23:00:00Z","level":"ERROR","msg":"Hello World","meta_info":"something else","account_id":35464,"err":"some-error"}
The info log would be completely ignored because the log level has
been set to error, which is a higher severity than info. For completeness,
here are all the log levels provided by the slog package, in order of
increasing severity:
• Debug
• Info
• Warn
• Error
By default, your log level should be set to Error, but you can enable
Info by switching an environment variable. This stops your logging
systems from becoming overwhelmed with noise generally, but does
enable you to “turn it up” for periods of time if you need to, without
having to make code changes and redeploy your application.
“wrap” errors further down your stack to bubble them up to the layer
in which we can log them. Here’s an example:
func main() {
	logger := slog.New(
		slog.NewJSONHandler(
			os.Stdout,
			nil,
		),
	)
	err := initApp()
	if err != nil {
		logger.Error(
			"App startup failed",
			slog.Any("err", err),
		)
		os.Exit(1)
	}
	logger.Info("App started successfully")
}
The above also showcases two different log levels. slog has no Fatal
method, so we log at the Error level and then call os.Exit(1) to terminate
our application. We would likely always be interested in seeing these
errors in standard out, or pushing them to our logging tool. However, the
info log might be something we enable if we are debugging, but not
something we need to see always.
The simplest thing you can do is output logs to a file. You then have
the ability to review these during or after an event to piece together
what is happening. Reviewing raw files by hand is challenging, however,
and does not scale very well.
A common next step is to push your logs into something like ElasticSearch4
and view them with Kibana5. In the exercise below I have
included a docker-compose file and a means to export your logs and
view them in Kibana, so check it out and give it a go!
Once your logs are in something like Kibana, you have the ability to
run queries against them, build dashboards and even set up alerts if
you see an increase in the number of certain logs within a time period.
AN EXERCISE
This exercise is longer than the previous ones, but as part of the Ultimate
Guide to Debugging with Go course 6 I teach, there is an exercise
which has you submit logs to a Kibana instance we set up using
If you do it a few times, you’ll see various successes and failures, as
I did below.
You should see some logs from our service there. It will look something
like this:
Have a play around here. Make some more requests and experiment
with the filters. Most places I have worked use Kibana somewhere in
their stack, so experience with it is a great skill to have.
/book?author=$authorName
Using the code in the service. To do this you'll need to add some code
to transporthttp.go and dbadaptor.go, adding logs as you
go. Once this is done, review your endpoint and ensure you are happy
with the logs.
Final part. A customer has got in touch and said they keep receiving
an error "author not supported", even though they are making a
request for "rachel barnes" which the library says should be a
supported author. Why might this be? Can you debug it using only
logs in Kibana?
WRAPPING UP
I did not expect to write so much about logs at any point in my life,
and I bet you never expected to read so much either, yet here we
both are!
THE DEBUGGER
^r the debugger.
In this chapter, we will ensure that you have the debugger set up in
either VSCode or Goland, and then learn how to use everything from
the beginner to the advanced features. All the setup instructions
below are for Mac, but I don’t think they should differ too much by
This chapter will probably not be that useful to you unless you
actively have an IDE in front of you, so I recommend coming back to
this chapter later if you’re currently on the Northern Line to
Edgware.
Next, open a Go project. Any will do. Creating a new one is fine also.
You can open an existing project by selecting "Open Folder" from the
welcome screen or by going to File > Open Folder.
https://marketplace.visualstudio.com/items?itemName=golang.go
Once you have done this, you might notice a banner in the bottom
right corner of VSCode, with a message saying some other Go dependencies
or tools are missing. If you click on it, it'll tell you that you need
to install gopls and some other things to make the debugger work. Just
click on it and hit install.
BREAKPOINTS
Breakpoints can be thought of as a pause in your program.
[Screenshot: main.go open in VSCode with a breakpoint set on line 22, esURL := os.Getenv("ENV_ES_URL"). Right-clicking the breakpoint shows the options Remove Breakpoint, Edit Breakpoint..., Disable Breakpoint and Copy vscode.dev Link.]
In Goland, you can simply left click the breakpoint again to remove it.
You can also right click it to suspend it or add conditional logic:
In VSCode, you can click the “Run And Debug” button on the left
hand side (bottom icon in the screenshot provided). The first time you
do it you will see that you need to create a launch.json file.
Click “create a launch.json file”. Here is a basic one which should help
you get started:
{
	// Use IntelliSense to learn about possible attributes.
	// Hover to view descriptions of existing attributes.
	// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
	"version": "0.2.0",
	"configurations": [
		{
			"name": "Launch Package",
			"type": "go",
			"request": "launch",
			"mode": "auto",
			"program": "${fileDirname}"
		}
	]
}
Once you have copied this, you should have the option to click
“Launch Package” in the Run and Debug menu:
If you click this, you should see your application begin running, and
then pause at your breakpoint as follows:
[Screenshot: VSCode paused at the breakpoint on line 22 of main.go. The Call Stack panel shows goroutine 1 paused on the breakpoint in main.main alongside several other paused goroutines, with the Breakpoints panel below it.]
You can see it has highlighted where it was paused and we have some
In Goland, you don’t need to do any setup here. Simply hit the bug
icon in the top right:
[Screenshot: Goland with main.go open and the Debug button highlighted in the top right. A breakpoint is set on the line slog.DebugContext(ctx, "loaded .env file").]
Once pressed, your program should hit the breakpoint and show
similar highlighting and panels as VSCode.
package main

import "fmt"

func main() {
	numbers := []int{1, 2, 3, 4}
	fmt.Println(numbers[4])
}
When we run this code, a panic occurs because our index is out of
bounds. Go prints out a trace showing the sequence of function calls
that led to the crash.
This example is simple, but you can always read the trace from
bottom to top to follow the path of execution. The last function call
before the panic shows where the problem originated. It will look
something like this:
goroutine 1 [running]:
main.main()
/home/user/project/main.go:7 +0x165
Although this trace is very short, we can learn a lot from it about what
went wrong. Let’s go through it line by line.
In our IDE, we can also click on the main.go:7 line and it will take us
to the exact place the panic occurred in our code. This means we can
set a breakpoint before the panic occurs. I set one on line 6:
[Screenshot: main.go with a breakpoint set on line 6, numbers := []int{1, 2, 3, 4}.]
If we run it in debug mode, we’ll then see execution pause before the
panic.
STEPPING OVER
Once execution pauses, we can use the Step Over button to continue
execution. In Goland that button looks like this (the slightly bent
arrow you can see me hovering over in the image below):
And in VSCode like this (the arched arrow with a dot underneath it).
Once we step over, we’ll be able to see our slice of numbers show up
in the Evaluation console.
At this point if you step over again, the program will still panic, so
hopefully the ability to inspect values has helped you figure out your
issue!
STEPPING INTO
Let’s consider a slightly different Go program:
func main() {
	x := 3
	y := 4
	sum := add(x, y)
	fmt.Printf(
		"The sum of %d and %d is %d\n",
		x, y, sum)
}
Let’s say we suspect there is a bug with our add function and set a
breakpoint on the line y := 4. If we keep pressing the step over
button, we will simply skip over the add function’s logic and go
straight to the fmt.Printf. Instead, what we want to do is use the
“Step Into” button when we reach the line sum := add(x, y).
In Goland, that button looks like this (arrow pointing directly down):
And in VSCode it looks like this (the arrow pointing down with a dot
underneath).
When you press this, you’ll jump into the add function and you can
step through the logic line by line.
CONDITIONAL BREAKPOINTS
Let’s adapt the example above slightly:
func main() {
	for i := 0; i < 5; i++ {
		x := rand.Intn(10) + 1
		y := rand.Intn(10) + 1
		sum := add(x, y)
		fmt.Printf("The sum of %d and %d is %d\n", x, y, sum)
	}
}
The code is similar, but now we add a for loop and an element of
randomness to the sums being calculated. We hear reports that for
some reason, we see issues with our code when the value of x is 3. We
could modify our program so that x = 3 to debug it, but that risks
accidentally “fixing” the bug and the issue not being truly recreated.
In Goland, you can right-click in the gutter and select add conditional
breakpoint:
[Screenshot: Goland's gutter context menu, with the options Add Breakpoint, Add Conditional Breakpoint..., Add Bookmark and Add Mnemonic Bookmark....]
Once selected, you’ll see a menu like this where you can enter a condition.
Conditions should be boolean, such as x==3.
[Screenshot: Goland's breakpoint dialog with Enabled and Suspend execution ticked.]
[Screenshot: VSCode's breakpoint context menu (Remove Breakpoint, Edit Breakpoint..., Disable Breakpoint) and the Expression field set to x==3.]
When you debug your program now, it will only pause execution on
this line if this condition is true!
DEBUGGING GOROUTINES
Goroutines are one of the most powerful features of the Go programming
language, but they can also be challenging to debug due to their
concurrent nature. Even with the same inputs, programs that use the
go keyword may not yield the same outputs every time they are run;
the order of execution cannot be guaranteed, making the debugging
process more complex.
func main() {
go printNumbers("A")
go printNumbers("B")
func worker(
	id int,
	tasks <-chan Task,
	wg *sync.WaitGroup,
) {
	for task := range tasks {
		fmt.Printf(
			"Worker %d started task %d\n",
			id,
			task.ID,
		)
		processTime := time.Duration(rand.Intn(5)) * time.Second
		time.Sleep(processTime) // simulate the task taking some time
		fmt.Printf(
			"Worker %d completed task %d in %v\n",
			id,
			task.ID,
			processTime,
		)
		wg.Done()
	}
}
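func main() {
	var wg sync.WaitGroup
	numWorkers := 5
	numTasks := 10
	tasks := make(chan Task, numTasks)

	// Start workers
	for i := 1; i <= numWorkers; i++ {
		go worker(i, tasks, &wg)
	}

	// Queue tasks; each worker calls wg.Done once per task
	for i := 1; i <= numTasks; i++ {
		wg.Add(1)
		tasks <- Task{ID: i}
	}

	wg.Wait()
	close(tasks)
	fmt.Println("All tasks completed")
}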
Running this program will output the tasks being processed by each
worker, showcasing how different workers may handle more or fewer
tasks than others.
To make our life easier here, we can attach some metadata to the
workers here using the runtime/pprof package. Once imported,
we can modify our main function as follows:
func main() {
	var wg sync.WaitGroup
	numWorkers := 5
	numTasks := 10
	tasks := make(chan Task, numTasks)

	// Start workers
	for i := 1; i <= numWorkers; i++ {
		labels := pprof.Labels("worker", strconv.Itoa(i))
		pprof.Do(
			context.Background(),
			labels,
			func(_ context.Context) {
				go worker(i, tasks, &wg)
			},
		)
	}

	// Queue tasks; each worker calls wg.Done once per task
	for i := 1; i <= numTasks; i++ {
		wg.Add(1)
		tasks <- Task{ID: i}
	}

	wg.Wait()
	close(tasks)
	fmt.Println("All tasks completed")
}
The key change here is to add labels to our workers using
pprof.Labels. By doing this, you'll be able to easily identify and
locate the specific goroutine you're interested in debugging.
Now, when you debug your program, you'll notice that your
goroutines are neatly annotated with labels, making it much easier to
find and inspect the one you're looking for.
In fact, in almost all situations where you use the debugger, I would
advise ensuring you add a test to cover whatever it was you were
debugging.
Certainty
With a suite of tests, you can confidently make changes to your code
base, knowing that any regressions or unintended consequences will
be detected before they reach production: ideally locally, but if not,
in your CI pipeline.
Collaboration
Debugging Entrypoint
By using the debugger in conjunction with writing tests, you can gain
a deeper understanding of how various components interact and how
data flows through your application. This knowledge not only assists
you in debugging but also empowers you to write more robust and
maintainable code.
In the repo below, you'll be given the code for the HTTP server with
some issues. Your task is to:
You can get the starter code for this exercise from here:
https://github.com/MatthewJamesBoyle/ultimate-debugging-course-debug-module.
The code to run is called "exercise". Try to solve all the
challenges using just the debugger.
Your product manager has created a ticket with the following feedback
from your customer.
"I have to be honest, the new TODO app sucks. It seems to crash all the
time. Also when I add a new to do, the IDs don't seem to be working
quite right. When I go and check my TODOs, it doesn't work at all and
on the off chance it does, sometimes I get back an empty response. Can
you fix this please?"
After an initial investigation, it seems there are four major issues with
the exercise application. Can you fix them all?
Good luck! If you need help or want to validate your answer, there is
a video of me walking through the solution here8
DEBUGGING IN
PRODUCTION
In this section we are going to look at other tools and techniques you
can use to debug your Go application as you start to deploy it into the
wild and let real users interact with it.
METRICS
In this chapter, we'll talk about metrics, why they matter,
and tips on what to measure.
We'll learn how we can add metrics to our Go applications and I’ll
share some tips on how you can create dashboards for you to see and
monitor them. If your company doesn't use metrics yet, you'll know
how to start using them by the end of this chapter.
We'll finish the chapter by having an exercise for you to learn more
and get hands-on experience.
Metrics are not a Go specific thing. What you learn in this chapter can
be applied to any programming language.
CATEGORIES OF METRICS
There are a few different categories of metrics we should consider
and monitor.
Application Metrics
Business Metrics
• Login failures.
• Successful payments.
• Response times.
Infrastructure Metrics
This seems an awful lot of things to think about! Whilst this is true,
thankfully, at least in Go, measuring a lot of these things is very easy
due to the Prometheus1 library. Prometheus is an open-source,
real-time monitoring and alerting toolkit that is well-suited for
cloud-based environments. Prometheus stores metrics as time-series
data, with each data point including a timestamp and optional labels.
This allows for powerful querying using PromQL, a query language that
lets you select and aggregate data based on labels.
We'll learn how to use PromQL a little bit later to build Grafana
dashboards and visualize our metrics.
Counters
These metrics represent a single number that can only increase over
time. An example would be tracking successful transactions. Each
time a transaction is successful, the counter goes up.
Gauges
Histograms
These metrics are handy for tracking observations like request
durations or response sizes. They count the observations in different
buckets, providing a summary of all the observed values. For example,
you could use a histogram to monitor how quickly your application
responds to requests.
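To make the idea of buckets concrete, here is a standard-library-only sketch of how observations are counted. It is not the Prometheus client API; the function name bucketCounts and the bucket boundaries are assumptions chosen for illustration. Prometheus histogram buckets are cumulative, so each bucket counts every observation less than or equal to its upper bound.

```go
package main

import "fmt"

// bucketCounts returns, for each upper bound, how many observations are
// less than or equal to it. The counts are cumulative, so an observation
// that fits in a small bucket is also counted in every larger one.
func bucketCounts(bounds, observations []float64) []int {
	counts := make([]int, len(bounds))
	for _, o := range observations {
		for i, b := range bounds {
			if o <= b {
				counts[i]++
			}
		}
	}
	return counts
}

func main() {
	bounds := []float64{0.1, 0.5, 1, 5} // upper bounds in seconds (illustrative)
	durations := []float64{0.05, 0.3, 0.4, 0.9, 2.5}
	for i, c := range bucketCounts(bounds, durations) {
		fmt.Printf("le=%v: %d\n", bounds[i], c)
	}
}
```

With these five request durations, the buckets end up holding 1, 3, 4 and 5 observations respectively, which is the kind of data a percentile query works from.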
Summaries
Metrics can be even more informative when you add labels to them.
Labels act like dimensions, allowing you to split your metrics based
on different criteria. For instance, you could label transaction success
metrics by payment method (Visa, Amex, MasterCard, etc.).
Notice how we give our metric a name and a help description. These
details make it easier to understand what the metric represents.
Registering Metrics
err := prometheus.Register(transactionSuccessCounter)
if err != nil {
// Handle registration error
}
Once our metrics are registered, we can start using them in our
application's code. Let's say we have a function that handles
successful transactions, such as:
func handleSuccessfulTransaction() {
// Logic for handling successful transaction
transactionSuccessCounter.Inc()
}
Gauges
Gauges are useful for tracking values that can fluctuate, like blood
sugar levels. Here's an example of how to define and use a gauge:
Histograms
duration := time.Since(start).Seconds()
requestDurationHistogram.Observe(duration)
}
Summaries
EXPOSING METRICS
Now we know how to register and increment metrics, we need to
ensure we expose them so that they are available for us to view and
make decisions on. The easiest way to do that is to expose them via
HTTP. Thankfully, the Prometheus library makes this really easy to
do in Go!
import (
    "log"
    "net/http"

    "github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
    // Register metrics (as shown earlier)
    http.Handle(
        "/metrics",
        promhttp.Handler(),
    )
    log.Fatal(
        http.ListenAndServe(":8080", nil),
    )
}
VIEWING METRICS
We have talked about the /metrics endpoint a lot, so let’s actually
see one! If you have been following along and adding some metrics
to a Go application, go ahead and run it and browse to
http://localhost:8080/metrics in your browser. You can also
clone the repo I prepared here9.
Once you navigate to /metrics, you should see something like this
(I trimmed it for succinctness).
At the very top, you'll see some general metrics about our program,
like how many goroutines are running and which version of the Go
programming language we're using. The really cool thing about this is
we get all of this information for free, just by calling
promhttp.Handler(). You’ll also notice that transaction_success_total,
our custom metric, appears too.
We can now see some metrics on an endpoint, but the single values it
provides are not all that useful. Prometheus stores the data from this
endpoint in its database and that means we can do some time-based
queries. Let’s take a look at how to do that and then discuss how that
will help us debug production.
INTRODUCTION TO PROMQL
PromQL is the query language used by Prometheus.
http_requests_total
For more targeted data retrieval, you can use braces {} to filter by
specific labels.
Here’s an example:
http_requests_total{service="user-service",
method="GET"}
This query will return the total count of GET requests up to the
current point in time for a service called user-service.
sum(rate(http_responses_total{status=~"5.."}[5m]))
There’s a lot going on here that shows the value of Prometheus, so let’s
talk through it bit by bit:
http_responses_total
This is the name of the metric being queried. In this case,
http_responses_total likely records the total number of HTTP responses
generated by your application.
{status=~"5.."}
[5m]
Anything within [] is a range vector. This specifies the time range over
which to evaluate the metric. [5m] means "the last 5 minutes." When
applied to a metric, this creates a range vector, which includes data
points for each instance of the metric within the last 5 minutes.
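To build some intuition for what happens to the samples inside a range vector, here is a simplified sketch of the arithmetic behind a per-second rate. The type sample and the function perSecondRate are names I made up for this illustration. The real rate() function in Prometheus also handles counter resets and extrapolates to the edges of the window, which this sketch deliberately ignores.

```go
package main

import "fmt"

// sample is one scraped value of a counter at a point in time.
type sample struct {
	unixSeconds float64
	value       float64
}

// perSecondRate approximates a per-second rate for a counter: the increase
// between the first and last sample in the window, divided by the elapsed
// seconds between them.
func perSecondRate(window []sample) float64 {
	if len(window) < 2 {
		return 0
	}
	first, last := window[0], window[len(window)-1]
	elapsed := last.unixSeconds - first.unixSeconds
	if elapsed <= 0 {
		return 0
	}
	return (last.value - first.value) / elapsed
}

func main() {
	// Six scrapes, one per minute, covering a 5 minute window.
	window := []sample{
		{0, 100}, {60, 130}, {120, 190},
		{180, 220}, {240, 250}, {300, 400},
	}
	fmt.Printf("%.2f responses per second\n", perSecondRate(window))
}
```

Here the counter grew by 300 over 300 seconds, so the rate works out to one response per second.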
rate()
sum()
PromQL can be a little challenging to get started with, but as you can
see it is very powerful. Typically when you deploy Prometheus or use
a managed service, you will get access to the Prometheus UI which
can be used for entering queries such as those discussed above:
Here are some application queries that will help you get started with
tracking your Go application.
Goroutine Count
go_goroutines
Heap Usage
go_memstats_heap_alloc_bytes
histogram_quantile(0.9, rate(go_gc_duration_seconds_bucket[5m]))
CPU Usage
Monitor average CPU usage per second over the last 5 minutes.
rate(process_cpu_seconds_total[5m])
Calculate the rate of HTTP requests per method over the last 5
minutes.
sum(rate(http_requests_total[5m])) by (method)
sum(rate(http_responses_total{status=~"5.."}[5m])) / sum(rate(http_responses_total[5m]))
Measure the 95th percentile of HTTP request latency over the last 5
minutes.
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
Thread Count
process_threads
Track the number of open file descriptors, which should not approach
the limit set by your system.
process_open_fds
ALERTING
Grafana11 and Prometheus12 not only provide visualization
capabilities but also offer alerting and automation features. You can
define alert rules based on specific metric conditions, such as
transaction success rates dropping below a certain threshold or HTTP
request durations exceeding a predefined limit.
EXERCISE
Using this repo13, the goal is to build a dashboard in Grafana to track
the number of goroutines currently in use in the application. The
README gives more instructions on how to get started.
Once you’re done, you can watch me walk through a solution here14.
Good luck!
System Overload:
More metrics mean needing more resources for storage and
processing, which costs more. It also makes managing your monitoring
system more complicated.
With so many metrics, it's tough to spot the really important
information. This can slow down how quickly you respond to serious
issues in your system.
It’s not all doom and gloom though. Here are some tips to help make
sure you do measure the right things.
Only track metrics that are really useful for understanding your
system.
This can reduce the amount of data you need to store and
process.
Make sure your monitoring tool isn’t using too many resources.
You are now a metrics expert! I hope this chapter has given you some
concrete actions you can take away that will give you valuable insight
into your production systems.
In the next chapter we are going to discuss distributed tracing. I’ll see
you there!
DISTRIBUTED TRACING
OPEN TELEMETRY
OpenTelemetry (often referred to as OTel) is an open-source project
designed to make observability simple. It allows developers to collect,
analyze, and export telemetry data such as metrics, logs, and, the focus
of this chapter, traces. The goal is to help you monitor your software
and debug issues more effectively.
Code Emitters
These are the components within your application that emit traces.
Collectors
• Datadog
• Honeycomb.io
• AWS offerings
Firstly, let’s get Jaeger started on our machine. One simple way to do
this is to use this Docker Compose file3:
version: '3'
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686" # UI
      - "14268:14268" # Collector
      - "14250:14250" # gRPC
      - "9411:9411" # Zipkin
• Collector
• Query
• Agent
Once you've saved the Docker Compose file, simply run
docker-compose up in your terminal, and you'll have a fully functional
OpenTelemetry stack running locally. You should be able to access the
Jaeger UI by navigating to http://localhost:16686 in your
web browser.
package main

import (
    "log"
    "net/http"
)

func main() {
    http.HandleFunc("/", SimpleHandler)
    err := http.ListenAndServe(":8080", nil)
    if err != nil {
        log.Fatal(err)
    }
}
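package main

import (
    "log"
    "net/http"

    "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/exporters/jaeger"
    "go.opentelemetry.io/otel/sdk/resource"
    "go.opentelemetry.io/otel/sdk/trace"
    semconv "go.opentelemetry.io/otel/semconv/v1.7.0"
    trace2 "go.opentelemetry.io/otel/trace"
)

// tracer is shared with the handlers in this package.
var tracer trace2.Tracer

func main() {
    exporter, err := jaeger.New(
        jaeger.WithCollectorEndpoint(),
    )
    if err != nil {
        log.Fatal(err)
    }
    // Assumed wiring: batch spans to the exporter and tag them with the service name.
    tp := trace.NewTracerProvider(
        trace.WithBatcher(exporter),
        trace.WithResource(
            resource.NewWithAttributes(
                semconv.SchemaURL,
                semconv.ServiceNameKey.String("app-one"),
            ),
        ),
    )
    otel.SetTracerProvider(tp)
    tracer = tp.Tracer("app-one")
    http.Handle(
        "/",
        otelhttp.NewHandler(
            http.HandlerFunc(SimpleHandler),
            "Hello",
        ),
    )
    err = http.ListenAndServe(":8080", nil)
    if err != nil {
        log.Fatal(err)
    }
}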
We've added Jaeger as the tracing backend and created a new HTTP
handler with tracing middleware using OpenTelemetry's
otelhttp.NewHandler function. This middleware will trace incoming
requests to our endpoint. If you are not familiar with middleware,
it acts as a layer that processes requests and responses in a web
application, allowing for various operations such as logging,
authentication, or, in this case, tracing. Middleware functions can be
composed to handle different aspects of request processing, providing
a modular and reusable approach to managing cross-cutting concerns
in your application.
You can run this code and access the "/" endpoint. Jaeger will start
collecting traces, and you can visualize them in the Jaeger UI. If
everything runs as expected, when you make a request to
http://localhost:8080/ in your browser, you should also see a trace
appear in the Jaeger UI. If you used my docker-compose file,
it should be available at http://localhost:16686/.
Even with this very basic example, you can see some of the promise
that distributed tracing can bring. If you click on one of the traces in
the UI and expand it, you’ll see the following:
func SimpleHandler(w http.ResponseWriter, r *http.Request) {
    // imagine this is a slow DB query.
    time.Sleep(time.Second * 3)
    _, _ = w.Write([]byte("Hello, World!"))
}
If we make a request again, we can see that the span is now 3 seconds
long, but it doesn’t really help us debug why it is 3 seconds long.
[Screenshot: Jaeger trace timeline for app-one, showing a single span roughly 3 seconds long.]
This is really powerful. We can now see app-one has a span within it
called db-query that is causing almost all of the processing time. If we
put an artificial delay of a second before the db-query span as follows:
    span.End()
    _, _ = w.Write([]byte("Hello, World!"))
}
CAPTURING ERRORS
Although you could do something like the below, there is a better way.
span.SetAttributes(attribute.String("error", err.Error()))
Let’s update the code in our SimpleHandler one more time to
represent a database error being returned. It now looks like this:
Now in the Jaeger UI, we can see error traces at a glance, which is
incredibly helpful!
In the Jaeger UI, you'll be able to see spans from different services
seamlessly connected, providing end-to-end visibility into the request
flow. This becomes particularly useful when diagnosing issues that
span multiple components or services. Let’s see an example of how to
do that.
package main

import (
    "fmt"
    "log"
    "net/http"

    "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/codes"
    "go.opentelemetry.io/otel/exporters/jaeger"
    "go.opentelemetry.io/otel/propagation"
    "go.opentelemetry.io/otel/sdk/resource"
    "go.opentelemetry.io/otel/sdk/trace"
    semconv "go.opentelemetry.io/otel/semconv/v1.7.0"
    trace2 "go.opentelemetry.io/otel/trace"
)

// tracer is shared with the handlers in this package.
var tracer trace2.Tracer

func main() {
    exporter, err := jaeger.New(
        jaeger.WithCollectorEndpoint(),
    )
    if err != nil {
        log.Fatal(err)
    }
    // Assumed wiring: batch spans to the exporter and tag them with the service name.
    tp := trace.NewTracerProvider(
        trace.WithBatcher(exporter),
        trace.WithResource(
            resource.NewWithAttributes(
                semconv.SchemaURL,
                semconv.ServiceNameKey.String("app-one"),
            ),
        ),
    )
    otel.SetTracerProvider(tp)
    otel.SetTextMapPropagator(
        propagation.TraceContext{},
    )
    tracer = tp.Tracer("app-one")
    http.Handle(
        "/",
        otelhttp.NewHandler(
            http.HandlerFunc(SimpleHandler),
            "Hello",
        ),
    )
    if err := http.ListenAndServe(":8080", nil); err != nil {
        log.Fatal(err)
    }
}
    req, err := http.NewRequestWithContext(
        ctx,
        http.MethodGet,
        "http://localhost:8081/",
        nil,
    )
    if err != nil {
        span.SetStatus(
            codes.Error,
            err.Error(),
        )
        w.WriteHeader(http.StatusInternalServerError)
        return
    }
    client := http.Client{
        Transport: otelhttp.NewTransport(http.DefaultTransport),
    }
    res, err := client.Do(req)
    if err != nil {
        span.SetStatus(
            codes.Error,
            err.Error(),
        )
        w.WriteHeader(http.StatusInternalServerError)
        return
    }
    // Only defer the close once we know the request succeeded; res is nil on error.
    defer res.Body.Close()
    if res.StatusCode != http.StatusOK {
        span.SetAttributes(
            attribute.Int("upstream_status_code", res.StatusCode),
        )
        w.WriteHeader(http.StatusInternalServerError)
        return
    }
    fmt.Fprintf(w, "Hello from app-one!")
}
otel.SetTextMapPropagator(propagation.TraceContext{})
This line tells OpenTelemetry to propagate our contexts over the
wire (in practice, this means it adds HTTP headers to the outgoing
request). If you find that your traces are not propagating between
services correctly, this one line missing might be the reason.
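If you are curious what those headers look like, propagation.TraceContext follows the W3C Trace Context format, which carries the trace in a header named traceparent. The sketch below uses only the standard library to pull such a header apart. The type traceParent and the function parseTraceParent are hypothetical helpers written for this illustration, and the validation is deliberately minimal (it checks field lengths only).

```go
package main

import (
	"fmt"
	"strings"
)

// traceParent holds the four dash-separated fields of a W3C traceparent header.
type traceParent struct {
	Version  string // 2 hex characters
	TraceID  string // 32 hex characters, shared by every span in the trace
	ParentID string // 16 hex characters, the ID of the calling span
	Flags    string // 2 hex characters, for example 01 when the trace is sampled
}

// parseTraceParent splits a traceparent header value into its fields,
// checking only the length of each field.
func parseTraceParent(h string) (traceParent, error) {
	parts := strings.Split(h, "-")
	if len(parts) != 4 || len(parts[0]) != 2 || len(parts[1]) != 32 ||
		len(parts[2]) != 16 || len(parts[3]) != 2 {
		return traceParent{}, fmt.Errorf("malformed traceparent %q", h)
	}
	return traceParent{
		Version:  parts[0],
		TraceID:  parts[1],
		ParentID: parts[2],
		Flags:    parts[3],
	}, nil
}

func main() {
	// Example value taken from the W3C Trace Context specification.
	tp, err := parseTraceParent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("trace ID %s, parent span ID %s, flags %s\n", tp.TraceID, tp.ParentID, tp.Flags)
}
```

When debugging propagation problems, logging the incoming traceparent header in the downstream service is a quick way to check whether the trace ID actually made it across.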
client := http.Client{
    Transport: otelhttp.NewTransport(http.DefaultTransport),
}
res, err := client.Do(req)
if err != nil {
    span.SetStatus(
        codes.Error,
        err.Error(),
    )
    w.WriteHeader(http.StatusInternalServerError)
    return
}
defer res.Body.Close()
Here we are creating a new HTTP request using the context that we
started a couple of lines above. Again, if we do not do this, our trace
will not propagate. It’s also really important that we use an HTTP
client whose transport is wrapped with otelhttp.NewTransport or, you
guessed it, it won’t work!
package main

import (
    "fmt"
    "log"
    "net/http"
    "time"

    "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/exporters/jaeger"
    "go.opentelemetry.io/otel/propagation"
    "go.opentelemetry.io/otel/sdk/resource"
    "go.opentelemetry.io/otel/sdk/trace"
    semconv "go.opentelemetry.io/otel/semconv/v1.7.0"
    trace2 "go.opentelemetry.io/otel/trace"
)

// tracer is shared with the handlers in this package.
var tracer trace2.Tracer

func main() {
    // Initialize Jaeger Exporter
    exporter, err := jaeger.New(
        jaeger.WithCollectorEndpoint(),
    )
    if err != nil {
        log.Fatal(err)
    }
    // Assumed wiring: batch spans to the exporter and tag them with the service name.
    tp := trace.NewTracerProvider(
        trace.WithBatcher(exporter),
        trace.WithResource(
            resource.NewWithAttributes(
                semconv.SchemaURL,
                semconv.ServiceNameKey.String("app-two"),
            ),
        ),
    )
    otel.SetTracerProvider(tp)
    otel.SetTextMapPropagator(
        propagation.TraceContext{},
    )
    tracer = tp.Tracer("app-two")
    http.Handle(
        "/",
        otelhttp.NewHandler(
            http.HandlerFunc(SimpleHandler),
            "Index",
        ),
    )
    log.Fatal(
        http.ListenAndServe(":8081", nil),
    )
}
Look how cool that is! We have made one request, but Jaeger can see
that it spans two services (app-one and app-two). If we click on a
trace, it is even more interesting:
app-one: Hello
processing_request
By doing this, we also unlocked another cool new feature, which is the
System Architecture tab in Jaeger.
You’ll see:
This might not look very impressive, but take a step back and think
about what just happened. Simply by making http calls between
applications and propagating traces, we now have a real-time view of
which services are talking to which.
• Database calls
• gRPC communications
By creating spans around critical sections of your code, you can gain
valuable insights into the performance and behavior of your entire
system, enabling you to make informed decisions and optimize your
applications for better user experiences.
YOUR TURN!
Take a look at the exercise package in this repo. You can find it in
cmd/exercise.4
Firstly, run docker compose up and ensure you can get to the
Jaeger UI on port 16686.
• Figure out which of the function calls is slower. How much slower
is it?
Once you have given it your best, you can see me step through the
solution here5
Performance Impact:
Tracing every request can add overhead to your system. This can slow
down your application, especially under high load, potentially
affecting user experience and system reliability.
Data Overload
Increased Costs
More data means higher storage and processing costs. Storing every
trace, especially in large-scale systems can become prohibitively
expensive very quickly.
Resource Saturation
Your tracing backend (in our example, Jaeger) and any associated
databases will face increased load. This can lead to performance
degradation, increased latency in trace retrieval, and even system
outages in extreme cases.
Start with a sampling rate that makes sense for your system’s load and
the criticality of the data. Adaptive sampling, where the rate changes
based on system load, can be particularly effective.
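To illustrate what a fixed sampling rate means in practice, here is a standard-library-only sketch of ratio-based sampling. It is not the OpenTelemetry implementation (the Go SDK ships its own ratio-based sampler); the function shouldSample and the trace IDs are assumptions for illustration. The decision is derived from the trace ID itself, so every service that sees the same ID reaches the same decision. It assumes trace IDs are 32 random hex characters.

```go
package main

import (
	"fmt"
	"strconv"
)

// shouldSample reports whether a trace should be kept for a sampling ratio
// between 0 and 1. It interprets the last 16 hex characters of the trace ID
// as a number and keeps the trace when that number falls below the ratio's
// share of the possible range.
func shouldSample(traceID string, ratio float64) bool {
	if ratio <= 0 {
		return false
	}
	if ratio >= 1 {
		return true
	}
	if len(traceID) != 32 {
		return false
	}
	low, err := strconv.ParseUint(traceID[16:], 16, 64)
	if err != nil {
		return false
	}
	// Drop one bit so the value fits in the range [0, 2^63).
	return float64(low>>1) < ratio*float64(1<<63)
}

func main() {
	ids := []string{
		"4bf92f3577b34da60000000000000001",
		"4bf92f3577b34da6ffffffffffffffff",
	}
	for _, id := range ids {
		fmt.Printf("%s sampled at 10%%: %v\n", id, shouldSample(id, 0.1))
	}
}
```

An adaptive sampler would adjust the ratio argument at runtime based on load, rather than hard-coding it.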
PROFILING & PPROF
Like other things in this book, profiling isn’t a Go-specific thing, but
Go does have amazing first class support for it.
WHY PROFILE?
Profiling offers benefits for both debugging and for general
optimization.
Identify Bottlenecks
Optimize Performance
CPU Profile
Memory Profile
Concurrency Profile
Now we know at a high level what profiling is. Let’s go ahead and add
the profiler to a Go application.
import _ "net/http/pprof"
Once you've added the net/http/pprof package, you can run the
profiler using the go tool pprof command. For example, if your
application is running on localhost:8080, you can run a command such as:
go tool pprof http://localhost:8080/debug/pprof/profile
This command will capture a CPU profile and open the pprof tool,
allowing you to analyze the collected data.
What if your application doesn't have an HTTP server? pprof still has
you covered. You can start a dedicated HTTP server solely for
profiling purposes. Here's an example:
package main

import (
    "log"
    "net/http"
    _ "net/http/pprof"
)

func main() {
    log.Println(
        http.ListenAndServe(
            "localhost:6060",
            nil,
        ),
    )
}
More likely than not, your application will do more than just start a
pprof server. You probably want to have your own business logic,
which may include another HTTP server. This is a good way to do that:
package main

import (
    "context"
    "log"
    "net/http"
    _ "net/http/pprof"

    "golang.org/x/sync/errgroup"
)

func main() {
    ctx := context.Background()
    g, ctx := errgroup.WithContext(ctx)
    g.Go(func() error {
        log.Println("Starting pprof server on localhost:6060")
        return http.ListenAndServe(
            "localhost:6060",
            nil,
        )
    })
    g.Go(func() error {
        log.Println("Starting main application logic...")
        // Your main application logic here
        return nil
    })
    // Wait for both goroutines and report the first error, if any.
    if err := g.Wait(); err != nil {
        log.Fatal(err)
    }
}
_ "net/http/pprof"
For example, importing a package solely for its side effects might
make sense in certain situations (such as registering database drivers),
but overuse or misuse can result in code that is harder to understand
and maintain. It is generally better to import the package explicitly
and use its exported functions to make your intentions clear.
package main

import (
    "log"
    "net/http"
    "net/http/pprof"
)

func main() {
    mux := http.NewServeMux()
    mux.HandleFunc(
        "/debug/pprof/",
        pprof.Index,
    )
    mux.HandleFunc(
        "/debug/pprof/cmdline",
        pprof.Cmdline,
    )
    mux.HandleFunc(
        "/debug/pprof/profile",
        pprof.Profile,
    )
    mux.HandleFunc(
        "/debug/pprof/symbol",
        pprof.Symbol,
    )
    mux.HandleFunc(
        "/debug/pprof/trace",
        pprof.Trace,
    )
    log.Fatal(http.ListenAndServe(
        ":8080",
        mux,
    ))
}
This means you can also customize the path of the endpoints used.
Signals that you may need to optimize your memory usage include:
A Basic Example
package main

import (
    "fmt"
    "net/http"
    _ "net/http/pprof"
    "time"
)

func main() {
    go func() {
        fmt.Println(http.ListenAndServe(":8082", nil))
    }()
    time.Sleep(time.Second * 20)
}
While this may seem like a silly example, it's not uncommon to
encounter situations where data structures are created, used, and
discarded within loops, leading to unnecessary memory allocations
and potential performance issues.
If we run this program, it will hang for a while, as you’d expect. Whilst
it’s still running, we can open another terminal and run the following
command:
Type: inuse_space
Time: Feb 8, 2024 at 5:12am (GMT)
(pprof)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
512.44kB
You can use this to look at varying paths through your application.
This example is very simple, but they can get really big, which is a
little overwhelming when just starting out.
There has been a challenge going around the internet, which has been
a fun exploration of how far modern Java can be pushed for aggregating
one billion rows from a text file. However, we can also do it in Go!
As output, for each unique station you must find the minimum,
average and maximum temperature recorded, and emit the final result
on STDOUT, sorted alphabetically by station name, in the format
{<station name>:<min>/<average>/<max>;<station
name>:<min>/<average>/<max>}.
In the blog, the author Shraddha starts with a naive solution which
takes ~6 minutes, and through profile-guided optimization3 manages
to get her solution down to 14 seconds.
Once you have cloned the repo, you can run go tool pprof mem-
This has a lot more going on than our simple example. You’ll see
malg is taking up 30.73%; this is the runtime function that allocates
a new goroutine and its stack. You’ll also see on the left
StartCPUProfile, which is using 35.26%. This demonstrates a very
useful fact: profiling does have an overhead, and we therefore do not
want to leave it on all the time, especially in performance-critical
systems.
Back to Basics
package main

import (
    "fmt"
    "net/http"
    _ "net/http/pprof"
    "time"
)

func main() {
    go func() {
        fmt.Println(http.ListenAndServe(":8082", nil))
    }()
    time.Sleep(time.Second * 20)
}
This function finds the nth Fibonacci number. It calls itself with the
two previous numbers. As the number n gets bigger, the time to finish
grows quickly.
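For reference, a naive recursive implementation typically looks like the sketch below. It assumes the conventional definition where fib(0) is 0 and fib(1) is 1; the listing in the accompanying repository may differ in its details.

```go
package main

import "fmt"

// fib returns the nth Fibonacci number using naive recursion.
// Each call spawns two more calls, so the work roughly doubles
// every time n increases by one.
func fib(n int) int {
	if n <= 1 {
		return n
	}
	return fib(n-1) + fib(n-2)
}

func main() {
	fmt.Println(fib(10))
}
```

Because fib(n-1) and fib(n-2) recompute the same smaller values over and over, the number of calls grows exponentially with n.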
Let’s run this code a few times with different inputs. For example, for
fib(10) it would look as follows.
func main() {
    start := time.Now()
    fib(10)
    elapsed := time.Since(start)
    fmt.Printf("took %s", elapsed)
}
If we run the Fibonacci function with different inputs, we see that the
runtime increases exponentially:
As you can see, even a simple algorithm like the Fibonacci sequence
can quickly become a performance bottleneck as the input size
increases.
which can help you pinpoint the areas of your code that are
consuming the most CPU resources.
Let’s add pprof and the http server back to our application as before.
Your imports and main function should now look as follows:
package main

import (
    "fmt"
    "net/http"
    _ "net/http/pprof"
    "time"
)
func main() {
start := time.Now()
go func() {
http.ListenAndServe(":8082", nil)
}()
fib(60)
elapsed := time.Since(start)
fmt.Printf("took %s", elapsed)
}
curl "http://localhost:8082/debug/pprof/profile?seconds=30" > cpu.prof
Once the profile has been captured, we can analyze it using the pprof
tool by running go tool pprof cpu.prof. As before, you
should see something similar to:
Type: cpu
Time: Feb 9, 2024 at 6:51pm (GMT)
Duration: 30.13s, Total samples = 27.23s (90.37%)
Entering interactive mode (type "help" for commands, "o" for options)
It gives us some useful metadata here, and confirms the type of the
profile, which is always useful! As before, we can type top and you’ll
see the following:
(pprof) top
Showing nodes accounting for 27.14s, 99.67% of 27.23s total
Dropped 25 nodes (cum <= 0.14s)
      flat  flat%   sum%        cum   cum%
    27.14s 99.67% 99.67%     27.17s 99.78%  main.fib
         0     0% 99.67%     27.17s 99.78%  main.main
         0     0% 99.67%     27.17s 99.78%  runtime.main
Now we are a little bit more familiar with the tool, let’s go through
the output line by line.
Overall, this output shows that the main.fib function is the primary
consumer of CPU resources and is likely the first place to look when
optimizing the program for CPU usage. This is unsurprising, since we
intentionally called our slow fib function from main(). What might be
more interesting is where exactly in the function that time is being
spent. It’s time to introduce a new command.
pprof list
At the (pprof) prompt, enter list main.fib. You should see the
following:
Total: 27.23s
ROUTINE ======================== main.fib in /Users/matthewboyle/Dev/ultimate-debugging-with-go-profiling/cmd/main.go
    27.14s     39.98s (flat, cum) 146.82% of Total
     9.46s      9.46s     22:func fib(n int) int {
(pprof)
There’s a lot we can learn from these lines, so let’s go through them
line by line again.
• Total: 27.23s: This indicates the total CPU time that was profiled.
• 39.98s under cum is the cumulative CPU time including all calls
made by main.fib.
The lines:
show the amount of time spent executing each particular line within
the fib function.
• The dots . represent lines in the source code that didn't consume a
measurable amount of CPU time.
From this we can learn that we are spending a lot of time on the
recursive calculation. Let’s look at a strategy called memoization to
see if we can make our code spend less time calculating and therefore
be faster overall.
Memoization
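The idea behind memoization is to cache each result the first time it is computed and reuse it afterwards. Here is a minimal sketch; the name fibMemo and the choice of a map as the cache are my assumptions for illustration, and other cache structures such as a slice would work just as well.

```go
package main

import "fmt"

// fibMemo returns the nth Fibonacci number, storing each result in memo
// so that every value is computed only once.
func fibMemo(n int, memo map[int]int) int {
	if n <= 1 {
		return n
	}
	if v, ok := memo[n]; ok {
		return v // already computed, no recursion needed
	}
	memo[n] = fibMemo(n-1, memo) + fibMemo(n-2, memo)
	return memo[n]
}

func main() {
	// Even n = 60 returns almost instantly, because only about 60 values are ever computed.
	fmt.Println(fibMemo(60, make(map[int]int)))
}
```

The cost is the cache itself: the map holds one entry per value of n, so we are trading memory for CPU time.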
func main() {
start := time.Now()
fib(10)
elapsed := time.Since(start)
fmt.Printf("took %s", elapsed)
}
quite large for larger input values, potentially leading to high memory
usage. As with any optimization, it's crucial to strike a balance
between CPU and memory use based on your application's specific
requirements. For example, consider a scenario where your
application is running in a resource-constrained environment, such as a
server with limited memory or a containerized environment with
strict resource limits. In such cases, optimizing for CPU performance
alone may not be enough; you'll also need to consider the memory
footprint of your application. However, for our application, I
think we can agree this trade-off is more than worth it.
More importantly, for the purpose of this book, using lots of
goroutines can make an application harder to debug. Earlier in this
book we annotated our goroutines in an attempt to make this a little
better. Thankfully, there is another tool which makes this even easier.
package main

import (
    "fmt"
    "net/http"
    _ "net/http/pprof"
    "sync"
    "time"
)
func main() {
go func() {
fmt.Println(http.ListenAndServe(":8082", nil))
}()
startTime := time.Now()
var wg sync.WaitGroup
numberOfTasks := 1000
wg.Wait()
fmt.Printf("All tasks completed in %s\n", time.Since(startTime))
}
Before we do some profiling, to make our life easier let’s increase the
wait time from 10 milliseconds to 10 seconds. The program now
looks like this:
package main

import (
	"fmt"
	"net/http"
	_ "net/http/pprof"
	"sync"
	"time"
)
func main() {
	go func() {
		fmt.Println(http.ListenAndServe(":8082", nil))
	}()
	startTime := time.Now()
	var wg sync.WaitGroup
	numberOfTasks := 1000
	wg.Wait()
	fmt.Printf("All tasks completed in %s\n", time.Since(startTime))
}
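With the program running, point go tool pprof at the goroutine profile,
which net/http/pprof serves at /debug/pprof/goroutine on port 8082.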
This will open the familiar pprof terminal. As before, we can use the
top command. You should see output such as:
(pprof)
This shows us we are using 1003 goroutines in total, and 1000 of them
are being used by performTask. We can also use the list command,
which shows us that at the time of profiling, our goroutines are
spending most of their time in time.Sleep(), which makes
sense. Finally, we can run web as before, which shows us:
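Separately from the pprof terminal, the goroutine total reported above can be cross-checked from inside the process. The sketch below uses pprof.Lookup from the standard runtime/pprof package to read the goroutine profile's count while a batch of goroutines is still blocked. The blockedGoroutines helper and the channel used to hold the goroutines open are assumptions for illustration; they stand in for the sleeping performTask goroutines.

```go
package main

import (
	"fmt"
	"runtime/pprof"
	"sync"
)

// blockedGoroutines starts n goroutines that wait on a channel and
// returns the goroutine profile's count while they are all still alive.
// Hypothetical helper for this sketch.
func blockedGoroutines(n int) int {
	release := make(chan struct{})
	var started, done sync.WaitGroup
	for i := 0; i < n; i++ {
		started.Add(1)
		done.Add(1)
		go func() {
			defer done.Done()
			started.Done()
			<-release // stand-in for the sleep in performTask
		}()
	}
	started.Wait() // every goroutine is now running or blocked

	count := pprof.Lookup("goroutine").Count()

	close(release) // let the goroutines finish
	done.Wait()
	return count
}

func main() {
	fmt.Println("goroutines while tasks are blocked:", blockedGoroutines(1000))
}
```

The count includes the calling goroutine and anything the runtime or other packages have started, which is why a profile of a program with 1000 tasks can report slightly more than 1000 goroutines.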
For the simple program we have here, perhaps the performance is OK.
However, as your application gets more complex and performance
becomes more critical, I have found that experimenting with the
number of goroutines you use can have a meaningful impact on
performance.
Let’s see this in action. Here is a refactor of the program that uses
only 10 goroutines. I also lowered the wait time to 10ms:
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	startTime := time.Now()
	var wg sync.WaitGroup
	numberOfTasks := 1000
	numberOfWorkers := 10
	tasks := make(chan int, numberOfTasks)
	wg.Wait()
	close(tasks)
	fmt.Printf("All tasks completed in %s\n", time.Since(startTime))
}
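The listing above does not show how the workers are started or how tasks reach them. One way to fill in those steps, consistent with calling wg.Wait() before close(tasks), is to have the WaitGroup count tasks rather than workers. The sketch below makes that assumption; the worker function and the runPool helper are hypothetical names, not the original code.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// worker takes tasks from the channel until it is closed and marks
// each one as done on wg. The sleep stands in for real work.
func worker(tasks <-chan int, d time.Duration, wg *sync.WaitGroup) {
	for range tasks {
		time.Sleep(d)
		wg.Done()
	}
}

// runPool processes numberOfTasks tasks using numberOfWorkers
// goroutines and returns the elapsed wall-clock time.
func runPool(numberOfTasks, numberOfWorkers int, d time.Duration) time.Duration {
	startTime := time.Now()
	var wg sync.WaitGroup
	tasks := make(chan int, numberOfTasks)

	for i := 0; i < numberOfWorkers; i++ {
		go worker(tasks, d, &wg)
	}
	for i := 0; i < numberOfTasks; i++ {
		wg.Add(1) // the WaitGroup counts tasks, not workers
		tasks <- i
	}

	wg.Wait()    // every task has been processed
	close(tasks) // lets the workers' range loops exit
	return time.Since(startTime)
}

func main() {
	fmt.Printf("All tasks completed in %s\n", runPool(1000, 10, 10*time.Millisecond))
}
```

With 10 workers each handling tasks one after another, the total run time is longer than with one goroutine per task, but the goroutine profile stays small and much easier to read. If the WaitGroup counted workers instead, close(tasks) would need to come before wg.Wait(), otherwise the workers would never leave their loops.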
EXERCISE
Take a look at the code here6. The README contains instructions on
how it works and how to get it running.
Here are the Go program arguments to run it if you are using an IDE:
-i "wordlists/lmillion_passwords.txt" -1 2 -o "out.txt"
Try to do this without looking at the code first; our goal is to look at
the profiles and take action based on what we see there.
Good luck!
Thank you to Nino Stephen 7 on Twitter for sharing this repo with
me and approving it for this purpose.
Once you have given it your best, you can watch me give a solution to
this here8.
THANK YOU
If you have read this far, thank you so much for reading this
book. I hope you learnt a lot. If you did, please tweet
about it on X or give me your thoughts directly by emailing
hello@bytesizego.com.
any ideas for content you’d like to see, or even would like to make a
course yourself!
- Matt Boyle
NOTES
CONTRIBUTORS
1. https://x.com/MattJamesBoyle
2. https://x.com/anfragment
3. https://www.linkedin.com/in/tonilovejoy
4. https://vlang.io/
5. http://twitter.com/micvbang
1. WELCOME
1. https://twitter.com/mattjamesboyle
1. DEBUGGING BY EYE
1. https://gobyexample.com/interfaces
2. https://go.dev/tour/concurrency/11
3. https://go.dev/doc/effective_go
4. https://github.com/uber-go/guide/blob/master/style.md
5. https://golangci-lint.run/
6. https://goplay.tools/snippet/X2XUK2wnRPH
7. https://www.bytesizego.com/view/courses/the-ultimate-guide-to-debugging-with-go/2357585-methods-of-debugging-debugging-by-eye/7530834-exercise-solution
2. PAIR PROGRAMMING
1. https://www.jetbrains.com/help/go/code-with-me.html
3. LOGGING
1. https://pkg.go.dev/fmt#hdr-Printing
2. https://go.dev/blog/slog
3. https://github.com/uber-go/zap
4. https://www.elastic.co/
5. https://www.elastic.co/kibana
6. http://bytesizego.com/the-ultimate-guide-to-debugging-with-go
7. https://docs.docker.com/get-docker/
8. https://www.bytesizego.com/view/courses/the-ultimate-guide-to-debugging-with-go/2359216-methods-of-debugging-logging/7571287-exercise-solution
9. https://twitter.com/MattJamesBoyle/status/1746213913757196710
4. THE DEBUGGER
1. https://www.jetbrains.com/go/
2. https://code.visualstudio.com/
3. https://marketplace.visualstudio.com/items?itemName=golang.go
4. https://go.dev/doc/faq#no_goroutine_id
5. https://quii.gitbook.io/learn-go-with-tests
6. https://www.microsoft.com/en-us/research/wp-content/uploads/2009/10/Realizing-Quality-Improvement-Through-Test-Driven-Development-Results-and-Experiences-of-Four-Industrial-Teams-nagappan_tdd.pdf
7. https://github.com/MatthewJamesBoyle/ultimate-debugging-course-debug-module
8. https://www.bytesizego.com/view/courses/the-ultimate-guide-to-debugging-with-go/2359222-methods-of-debugging-the-debugger/7608312-exercise-solution
1. METRICS
1. http://prometheus.io
2. https://www.bytesizego.com/blog/keeping-alive-with-go
3. https://prometheus.io/docs/guides/go-application/
4. https://www.bytesizego.com/blog/keeping-alive-with-go
5. https://stackoverflow.com/questions/51146578/what-use-cases-really-make-prometheuss-summary-metrics-type-necessary-unique
6. https://aws.amazon.com/prometheus/
7. https://cloud.google.com/stackdriver/docs/managed-prometheus
8. https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/prometheus-metrics-overview
9. https://github.com/MatthewJamesBoyle/ultimate-debugging-with-go-metrics/
10. https://promlabs.com/promql-cheat-sheet/
11. https://grafana.com/docs/grafana/latest/alerting/
12. https://prometheus.io/docs/alerting/latest/overview/
13. https://github.com/MatthewJamesBoyle/ultimate-debugging-with-go-metrics
14. https://www.bytesizego.com/view/courses/the-ultimate-guide-to-debugging-with-go/2359231-debugging-in-production-metrics/7659150-exercise-solution
2. DISTRIBUTED TRACING
1. https://research.google/pubs/dapper-a-large-scale-distributed-systems-tracing-infrastructure/
2. https://github.com/MatthewJamesBoyle/ultimate-debugging-with-go-tracing
3. https://github.com/MatthewJamesBoyle/ultimate-debugging-with-go-tracing/blob/main/docker-compose.yaml
4. https://github.com/MatthewJamesBoyle/ultimate-debugging-with-go-tracing
5. https://www.bytesizego.com/view/courses/the-ultimate-guide-to-debugging-with-go/2359239-debugging-in-production-tracing/7683889-exercise-solution