
Commit 344698f

slog-handler-guide: speed section
Change-Id: If06f51747e6e4926d95e88840ab5fdce5f9a0189
Reviewed-on: https://go-review.googlesource.com/c/example/+/513457
Reviewed-by: Ian Cottrell <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Run-TryBot: Jonathan Amsterdam <[email protected]>
1 parent 17e2836 commit 344698f

5 files changed (+578 −5 lines)

slog-handler-guide/README.md (+118 −2)
@@ -894,6 +894,122 @@ error.
## Speed

Most programs don't need fast logging.
Before making your handler fast, gather data&mdash;preferably production data,
not benchmark comparisons&mdash;that demonstrates that it needs to be fast.
Avoid premature optimization.

If you need a fast handler, start with pre-formatting. It may provide dramatic
speed-ups in cases where a single call to `Logger.With` is followed by many
calls to the resulting logger.
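
To make the pattern concrete, here is a hypothetical sketch. It assumes the
`NewIndentHandler` constructor developed earlier in this guide, and the request
attributes are invented for the example:

```
logger := slog.New(NewIndentHandler(os.Stdout, nil))

// One call to Logger.With: a pre-formatting handler formats these
// attributes once, in WithAttrs.
reqLogger := logger.With("method", "GET", "path", "/items")

// Many calls to the resulting logger: each one reuses the pre-formatted
// attribute text instead of re-formatting it on every log call.
for i := 0; i < 3; i++ {
	reqLogger.Info("processing item", "index", i)
}
```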

If log output is the bottleneck, consider making your handler asynchronous.
Do the minimal amount of processing in the handler, then send the record and
other information over a channel. Another goroutine can collect the incoming log
entries and write them in bulk and in the background.
You might want to preserve the option to log synchronously
so you can see all the log output to debug a crash.
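
As an illustration, here is a minimal sketch of one way to structure this; it
is not part of this guide's `IndentHandler`. The wrapper type `AsyncHandler`
and its constructor are invented for the example. For brevity it writes entries
one at a time rather than in bulk, silently drops errors, and omits the
synchronous fallback and flush-on-shutdown logic a production handler would
need. It assumes imports of `context` and `log/slog`.

```
// An entry pairs a record with the handler that should write it, so that
// handlers derived with WithAttrs or WithGroup can share one channel.
type entry struct {
	h slog.Handler
	r slog.Record
}

// AsyncHandler queues records on a channel; a background goroutine
// drains the channel and writes via the wrapped synchronous handler.
type AsyncHandler struct {
	ch chan entry
	h  slog.Handler
}

func NewAsyncHandler(h slog.Handler) *AsyncHandler {
	ch := make(chan entry, 1024)
	go func() {
		for e := range ch {
			// Errors are dropped in this sketch; a real handler
			// would need a way to report them.
			_ = e.h.Handle(context.Background(), e.r)
		}
	}()
	return &AsyncHandler{ch: ch, h: h}
}

func (a *AsyncHandler) Enabled(ctx context.Context, level slog.Level) bool {
	return a.h.Enabled(ctx, level)
}

func (a *AsyncHandler) Handle(_ context.Context, r slog.Record) error {
	// Clone the record in case it shares state with the caller, which
	// moves on before the background goroutine reads it. Note that a
	// full channel blocks here; a real handler must decide whether to
	// block or to drop the entry.
	a.ch <- entry{a.h, r.Clone()}
	return nil
}

func (a *AsyncHandler) WithAttrs(attrs []slog.Attr) slog.Handler {
	return &AsyncHandler{ch: a.ch, h: a.h.WithAttrs(attrs)}
}

func (a *AsyncHandler) WithGroup(name string) slog.Handler {
	return &AsyncHandler{ch: a.ch, h: a.h.WithGroup(name)}
}
```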

Allocation is often a major cause of a slow system.
The `slog` package already works hard at minimizing allocations.
If your handler does its own allocation, and profiling shows it to be
a problem, then see if you can minimize it.

One simple change you can make is to replace calls to `fmt.Sprintf` or `fmt.Appendf`
with direct appends to the buffer. For example, our `IndentHandler` appends string
attributes to the buffer like so:

    buf = fmt.Appendf(buf, "%s: %q\n", a.Key, a.Value.String())

As of Go 1.21, that results in two allocations, one for each argument passed to
an `any` parameter. We can get that down to zero by using `append` directly:

    buf = append(buf, a.Key...)
    buf = append(buf, ": "...)
    buf = strconv.AppendQuote(buf, a.Value.String())
    buf = append(buf, '\n')
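
If you want to check allocation counts like these yourself,
`testing.AllocsPerRun` makes for a quick experiment. This sketch is not from
the guide; the key and value strings are arbitrary placeholders:

```
package main

import (
	"fmt"
	"strconv"
	"testing"
)

func main() {
	key, val := "status", "ok"
	var buf []byte

	// AllocsPerRun calls the function once as a warm-up, so the buffer
	// reaches its final capacity before measurement begins.
	viaFmt := testing.AllocsPerRun(100, func() {
		buf = fmt.Appendf(buf[:0], "%s: %q\n", key, val)
	})
	viaAppend := testing.AllocsPerRun(100, func() {
		buf = append(buf[:0], key...)
		buf = append(buf, ": "...)
		buf = strconv.AppendQuote(buf, val)
		buf = append(buf, '\n')
	})
	fmt.Println("fmt.Appendf:", viaFmt, "allocs; append:", viaAppend, "allocs")
}
```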

Another worthwhile change is to use a `sync.Pool` to manage the one chunk of
memory that most handlers need:
the `[]byte` buffer holding the formatted output.

Our example `Handle` method began with this line:

    buf := make([]byte, 0, 1024)

As we said above, providing a large initial capacity avoids repeated copying and
re-allocation of the slice as it grows, reducing the number of allocations to
one.
But we can get it down to zero in the steady state by keeping a global pool of buffers.
Initially, the pool will be empty and new buffers will be allocated.
But eventually, assuming the number of concurrent log calls reaches a steady
maximum, there will be enough buffers in the pool to share among all the
ongoing `Handle` calls. As long as no log entry grows past a buffer's capacity,
there will be no allocations from the garbage collector's point of view.

We will hide our pool behind a pair of functions, `allocBuf` and `freeBuf`.
The single line to get a buffer at the top of `Handle` becomes two lines:

    bufp := allocBuf()
    defer freeBuf(bufp)

One of the subtleties involved in making a `sync.Pool` of slices
is suggested by the variable name `bufp`: your pool must deal in
_pointers_ to slices, not the slices themselves.
If the pooled values aren't pointers, then the `any` arguments
and return values of the `sync.Pool` methods will themselves cause allocations,
defeating the purpose of pooling.

There are two ways to proceed with our slice pointer: we can replace `buf`
with `*bufp` throughout our function, or we can dereference it and remember to
re-assign it before freeing:

    bufp := allocBuf()
    buf := *bufp
    defer func() {
        *bufp = buf
        freeBuf(bufp)
    }()

Here is our pool and its associated functions:

```
var bufPool = sync.Pool{
	New: func() any {
		b := make([]byte, 0, 1024)
		return &b
	},
}

func allocBuf() *[]byte {
	return bufPool.Get().(*[]byte)
}

func freeBuf(b *[]byte) {
	// To reduce peak allocation, return only smaller buffers to the pool.
	const maxBufferSize = 16 << 10
	if cap(*b) <= maxBufferSize {
		*b = (*b)[:0]
		bufPool.Put(b)
	}
}
```

The pool's `New` function does the same thing as the original code:
create a byte slice with 0 length and plenty of capacity.
The `allocBuf` function just type-asserts the result of the pool's
`Get` method.

The `freeBuf` function truncates the buffer before putting it back
in the pool, so that `allocBuf` always returns zero-length slices.
It also implements an important optimization: it doesn't return
large buffers to the pool.
To see why this is important, consider what would happen if there were a single,
unusually large log entry&mdash;say one that was a megabyte when formatted.
If that megabyte-sized buffer were put in the pool, it could remain
there indefinitely, constantly being reused, but with most of its capacity
wasted.
The extra memory might never be used again by the handler, and since it was in
the handler's pool, it might never be garbage-collected for reuse elsewhere.
We can avoid that situation by keeping large buffers out of the pool.

slog-handler-guide/guide.md (+99 −3)
@@ -612,6 +612,102 @@ error.
## Speed

Most programs don't need fast logging.
Before making your handler fast, gather data&mdash;preferably production data,
not benchmark comparisons&mdash;that demonstrates that it needs to be fast.
Avoid premature optimization.

If you need a fast handler, start with pre-formatting. It may provide dramatic
speed-ups in cases where a single call to `Logger.With` is followed by many
calls to the resulting logger.

If log output is the bottleneck, consider making your handler asynchronous.
Do the minimal amount of processing in the handler, then send the record and
other information over a channel. Another goroutine can collect the incoming log
entries and write them in bulk and in the background.
You might want to preserve the option to log synchronously
so you can see all the log output to debug a crash.

Allocation is often a major cause of a slow system.
The `slog` package already works hard at minimizing allocations.
If your handler does its own allocation, and profiling shows it to be
a problem, then see if you can minimize it.

One simple change you can make is to replace calls to `fmt.Sprintf` or `fmt.Appendf`
with direct appends to the buffer. For example, our `IndentHandler` appends string
attributes to the buffer like so:

    buf = fmt.Appendf(buf, "%s: %q\n", a.Key, a.Value.String())

As of Go 1.21, that results in two allocations, one for each argument passed to
an `any` parameter. We can get that down to zero by using `append` directly:

    buf = append(buf, a.Key...)
    buf = append(buf, ": "...)
    buf = strconv.AppendQuote(buf, a.Value.String())
    buf = append(buf, '\n')

Another worthwhile change is to use a `sync.Pool` to manage the one chunk of
memory that most handlers need:
the `[]byte` buffer holding the formatted output.

Our example `Handle` method began with this line:

    buf := make([]byte, 0, 1024)

As we said above, providing a large initial capacity avoids repeated copying and
re-allocation of the slice as it grows, reducing the number of allocations to
one.
But we can get it down to zero in the steady state by keeping a global pool of buffers.
Initially, the pool will be empty and new buffers will be allocated.
But eventually, assuming the number of concurrent log calls reaches a steady
maximum, there will be enough buffers in the pool to share among all the
ongoing `Handle` calls. As long as no log entry grows past a buffer's capacity,
there will be no allocations from the garbage collector's point of view.

We will hide our pool behind a pair of functions, `allocBuf` and `freeBuf`.
The single line to get a buffer at the top of `Handle` becomes two lines:

    bufp := allocBuf()
    defer freeBuf(bufp)

One of the subtleties involved in making a `sync.Pool` of slices
is suggested by the variable name `bufp`: your pool must deal in
_pointers_ to slices, not the slices themselves.
If the pooled values aren't pointers, then the `any` arguments
and return values of the `sync.Pool` methods will themselves cause allocations,
defeating the purpose of pooling.

There are two ways to proceed with our slice pointer: we can replace `buf`
with `*bufp` throughout our function, or we can dereference it and remember to
re-assign it before freeing:

    bufp := allocBuf()
    buf := *bufp
    defer func() {
        *bufp = buf
        freeBuf(bufp)
    }()

Here is our pool and its associated functions:

%include indenthandler4/indent_handler.go pool -

The pool's `New` function does the same thing as the original code:
create a byte slice with 0 length and plenty of capacity.
The `allocBuf` function just type-asserts the result of the pool's
`Get` method.

The `freeBuf` function truncates the buffer before putting it back
in the pool, so that `allocBuf` always returns zero-length slices.
It also implements an important optimization: it doesn't return
large buffers to the pool.
To see why this is important, consider what would happen if there were a single,
unusually large log entry&mdash;say one that was a megabyte when formatted.
If that megabyte-sized buffer were put in the pool, it could remain
there indefinitely, constantly being reused, but with most of its capacity
wasted.
The extra memory might never be used again by the handler, and since it was in
the handler's pool, it might never be garbage-collected for reuse elsewhere.
We can avoid that situation by keeping large buffers out of the pool.
