Lisa13flamegraphs 131107112122 Phpapp01
with
Flame Graphs
Brendan Gregg
An Interactive Visualization for Stack Traces
My Previous Visualizations Include
• Latency Heat Maps (and other heat map types), including:
Example: CPU Profiling
Profiling command (DTrace):
# dtrace -x ustackframes=100 -n 'profile-997 /execname == "mysqld"/ {
    @[ustack()] = count(); } tick-60s { exit(0); }'
dtrace: description 'profile-997 ' matched 2 probes
CPU     ID                    FUNCTION:NAME
  1  75195                        :tick-60s
[...]
libc.so.1`__priocntlset+0xa
libc.so.1`getparam+0x83
libc.so.1`pthread_getschedparam+0x3c
libc.so.1`pthread_setschedprio+0x1f
mysqld`_Z16dispatch_command19enum_server_commandP3THDPcj+0x9ab
mysqld`_Z10do_commandP3THD+0x198
mysqld`handle_one_connection+0x1a6
libc.so.1`_thrp_setup+0x8d
libc.so.1`_lwp_start
4884

mysqld`_Z13add_to_statusP17system_status_varS0_+0x47
mysqld`_Z22calc_sum_of_all_statusP17system_status_var+0x67
mysqld`_Z16dispatch_command19enum_server_commandP3THDPcj+0x1222
mysqld`_Z10do_commandP3THD+0x198
mysqld`handle_one_connection+0x1a6
libc.so.1`_thrp_setup+0x8d
libc.so.1`_lwp_start
5530
Each stack trace is followed by its number of occurrences (here, 5530).
Example: Profile Data
• Over 500,000 lines were elided from that output (“[...]”)
• Full output looks like this...
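Example: Profile Data
[Figure: the full output, zoomed out to fit one slide. It covers 60 seconds of on-CPU MySQL and contains 27,053 unique stacks; callouts mark the first stack, the last stack, and the size of one stack.]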
Example: Profile Data
• The most frequent stack, printed last, shows CPU usage in
add_to_status(), which is from the “show status” command.
Is that to blame?
• Hard to tell – it only accounts for < 2% of the samples
• I wanted a way to quickly understand stack trace profile data,
without browsing 500,000+ lines of output
Example: Visualizations
• To understand this profile data quickly, I created a visualization
that worked very well, named “Flame Graph” for its
resemblance to fire (also as it was showing a “hot” CPU issue)
[Figure: the profile data is converted to a Flame Graph by some Perl.]
Example: Flame Graph
Same profile data
[Figure: the same Flame Graph, annotated with callouts: where CPU is really consumed; one stack sample; the "show status" stack.]
Background: Stack Trace
• One full stack:
libc.so.1`mutex_trylock_adaptive+0x112
libc.so.1`mutex_lock_impl+0x165
libc.so.1`mutex_lock+0xc
mysqld`key_cache_read+0x741
mysqld`_mi_fetch_keypage+0x48
mysqld`w_search+0x84
mysqld`_mi_ck_write_btree+0xa5
mysqld`mi_write+0x344
mysqld`ha_myisam::write_row+0x43
mysqld`handler::ha_write_row+0x8d
mysqld`end_write+0x1a3
mysqld`evaluate_join_record+0x11e
mysqld`sub_select+0x86
mysqld`do_select+0xd9
mysqld`JOIN::exec+0x482
mysqld`mysql_select+0x30e
mysqld`handle_select+0x17d
mysqld`execute_sqlcom_select+0xa6
mysqld`mysql_execute_command+0x124b
mysqld`mysql_parse+0x3e1
mysqld`dispatch_command+0x1619
mysqld`do_handle_one_connection+0x1e5
mysqld`handle_one_connection+0x4c
libc.so.1`_thrp_setup+0xbc
libc.so.1`_lwp_start
Background: Stack Trace
• Read top-down or bottom-up, and look for key functions
libc.so.1`mutex_trylock_adaptive+0x112
libc.so.1`mutex_lock_impl+0x165
libc.so.1`mutex_lock+0xc
mysqld`key_cache_read+0x741          (reading down: ancestry)
mysqld`_mi_fetch_keypage+0x48
mysqld`w_search+0x84
mysqld`_mi_ck_write_btree+0xa5
mysqld`mi_write+0x344
mysqld`ha_myisam::write_row+0x43
mysqld`handler::ha_write_row+0x8d
mysqld`end_write+0x1a3
mysqld`evaluate_join_record+0x11e
mysqld`sub_select+0x86
mysqld`do_select+0xd9
mysqld`JOIN::exec+0x482
mysqld`mysql_select+0x30e
mysqld`handle_select+0x17d
mysqld`execute_sqlcom_select+0xa6
mysqld`mysql_execute_command+0x124b
mysqld`mysql_parse+0x3e1
mysqld`dispatch_command+0x1619
mysqld`do_handle_one_connection+0x1e5
mysqld`handle_one_connection+0x4c
libc.so.1`_thrp_setup+0xbc           (reading up: code path)
libc.so.1`_lwp_start
Background: Stack Modes
• Two types of stacks can be profiled:
• user-level for applications (user mode)
• kernel-level for the kernel (kernel mode)
• During a system call, an application may have both
Background: Software Internals
• You don’t need to be a programmer to understand stacks.
• Some function names are self-explanatory; others require
source code browsing (if available). Not as bad as it sounds:
• MySQL has ~15,000 functions in > 0.5 million lines of code
• The earlier stack has 20 MySQL functions. To understand
them, you may need to browse only 0.13%
(20 / 15000) of the code. Might take hours, but it is doable.
[Figure: stacks drawn as columns of boxes, one box per frame, with function names truncated to fit (e.g. "libc.so.1`mutex_lock_imp..."); x-axis: time (seconds); y-axis: stack depth.]
Background: Time Series Stacks
• Time series ordering allows time-based pattern identification
• However, stacks can change thousands of times per second
[Figure: time-series stacks zoomed out, with one stack sample called out; x-axis: time (seconds); y-axis: stack depth.]
Background: Frame Merging
• When zoomed out, stacks appear as narrow stripes
• Adjacent identical functions can be merged to improve
readability, eg:
[Figure: adjacent narrow frames ("mu...", "mu...", "ge...") merged into wider boxes ("mutex_tryl...", "ge..."); x-axis: alphabet; y-axis: stack depth.]
Flame Graphs: Definition
• Each box represents a function (a merged stack frame)
• y-axis shows stack depth
• top function led directly to the profiling event
• everything beneath it is ancestry (explains why)
• x-axis spans the sample population, sorted alphabetically
• Box width is proportional to the total time a function was
profiled directly or its children were profiled
• All threads can be shown in the same Flame Graph (the
default), or as separate per-thread Flame Graphs
• Flame Graphs can be interactive: mouse over for details
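The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the flamegraph.pl implementation: the input is assumed to be a dict of semicolon-separated stacks (root first) mapped to sample counts, and `build_tree` and `layout` are hypothetical helper names. Identical frames are merged into one box, siblings are sorted alphabetically, and each box is as wide as the samples for the function plus its children.

```python
def build_tree(folded):
    """Merge folded stacks into a tree; each node holds its total sample count."""
    root = {"name": "all", "value": 0, "children": {}}
    for stack, count in folded.items():
        root["value"] += count
        node = root
        for name in stack.split(";"):
            child = node["children"].setdefault(
                name, {"name": name, "value": 0, "children": {}})
            child["value"] += count   # direct samples plus those of descendants
            node = child
    return root


def layout(node, depth=0, x=0, boxes=None):
    """Return (name, depth, x_offset, width) boxes, measured in samples."""
    if boxes is None:
        boxes = []
    boxes.append((node["name"], depth, x, node["value"]))
    # The x-axis is sorted alphabetically; it is not the passage of time.
    for name in sorted(node["children"]):
        child = node["children"][name]
        layout(child, depth + 1, x, boxes)
        x += child["value"]
    return boxes


# Hypothetical profile: 10 samples in total.
folded = {"a;b;c": 7, "a;b;d": 2, "a;g": 1}
for name, depth, x, width in layout(build_tree(folded)):
    print("%-4s depth=%d x=%d width=%d" % (name, depth, x, width))
```

With this input, b() spans 9 of 10 samples (90%) and g() spans 1 (10%); a renderer would scale the offsets and widths to pixels.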
Flame Graphs: Variations
• Profile data can be anything: CPU, I/O, memory, ...
• Naming suggestion: [event] [units] Flame Graph
• Eg: "FS Latency Flame Graph"
• By default, Flame Graphs == CPU Sample Flame Graphs
• Colors can be used for another dimension
• by default, random colors are used to differentiate boxes
• --hash for colors based on a hash of the function name
• Distributed applications can be shown in the same Flame
Graph (merge samples from multiple systems)
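Hash-based coloring means the same function gets the same color in every Flame Graph, which helps when comparing them side by side. A minimal sketch of the idea follows; the CRC32 hash, the warm palette ranges, and the `color_for` name are assumptions for illustration, not what flamegraph.pl's --hash option actually computes.

```python
import zlib


def color_for(name):
    """Map a function name to a warm RGB color that is stable across runs."""
    # zlib.crc32 is deterministic, unlike Python's built-in hash() for
    # strings, which is randomized per process by default.
    h = zlib.crc32(name.encode("utf-8"))
    r = 205 + (h & 0xFF) % 50        # 205..254
    g = ((h >> 8) & 0xFF) % 230      # 0..229
    b = ((h >> 16) & 0xFF) % 55      # 0..54
    return "rgb(%d,%d,%d)" % (r, g, b)


print(color_for("mysqld`do_command"))
```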
Flame Graphs: A Simple Example
• A CPU Sample Flame Graph:
[Figure: a simple CPU Sample Flame Graph. Rows from bottom to top: a(); b() and g(); c() and h(); d() and e(); f(). The status line or tool tip shows that b() is 90% and g() is 10%.]
[Figure: two Flame Graphs compared, labeled Linux and SmartOS, with a callout: "Extra Function: UnzipDocid()".]
http://dtrace.org/blogs/dap/2012/01/05/where-does-your-node-program-spend-its-time/
Generation: stackcollapse.pl
• Converts profile data into single-line records
• Variants exist for DTrace, perf, SystemTap, Instruments, Xperf
• Eg, DTrace:
unix`i86_mwait+0xd
unix`cpu_idle_mwait+0xf1
unix`idle+0x114
unix`thread_start+0x8
19486
unix`thread_start;unix`idle;unix`cpu_idle_mwait;unix`i86_mwait 19486
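The conversion shown above can be sketched in Python. This is not the stackcollapse.pl source; it assumes the input contains only stacks in the DTrace aggregation layout shown here (one frame per line, leaf first, ending in a bare count), with header lines already removed. The `collapse_dtrace` name is hypothetical.

```python
import re
from collections import Counter


def collapse_dtrace(text):
    """Fold DTrace stack output into 'root;...;leaf' -> count records."""
    folded = Counter()
    frames = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.isdigit():
            # A bare number ends the current stack: it is the sample count.
            if frames:
                # DTrace prints the leaf first; the folded form is root first.
                folded[";".join(reversed(frames))] += int(line)
            frames = []
        else:
            # Drop the "+0x..." offset so identical functions merge.
            frames.append(re.sub(r"\+0x[0-9a-fA-F]+$", "", line))
    return folded


sample = """
unix`i86_mwait+0xd
unix`cpu_idle_mwait+0xf1
unix`idle+0x114
unix`thread_start+0x8
19486
"""
for stack, count in collapse_dtrace(sample).items():
    print(stack, count)
# prints: unix`thread_start;unix`idle;unix`cpu_idle_mwait;unix`i86_mwait 19486
```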
Generation: flamegraph.pl
• Converts the single-line records into a Flame Graph
• Options:
--titletext change the title text (default is “Flame Graph”)
--width width of image (default is 1200)
--height height of each frame (default is 16)
--minwidth omit functions smaller than this width (default is 0.1 pixels)
--fonttype font type (default “Verdana”)
--fontsize font size (default 12)
--countname count type label (default “samples”)
--nametype name type label (default “Function:”)
--colors color palette: "hot", "mem", "io"
--hash colors are keyed by function name hash
Types
• CPU
• Memory
• Off-CPU
• More
CPU
• Measure code paths that consume CPU
• Helps us understand and optimize CPU usage, improving
performance and scalability
• Commonly performed by sampling CPU stack traces at a
timed interval (eg, 100 Hertz for every 10 ms), on all CPUs
• DTrace/perf/SystemTap examples shown earlier
• Can also be performed by tracing function execution
CPU: Sampling
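CPU stack sampling:
[Figure: timeline of A() calling B(), which makes a syscall and blocks in the kernel until an interrupt. Samples taken at a fixed interval show A, A, A, then B (called from A) while on-CPU; nothing ("-") while off-CPU; then B (called from A), A, A, A.]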
CPU: Tracing
CPU function tracing:
[Figure: the same timeline, traced rather than sampled: events fire at A() entry "A(", B() entry "B(", B() return "B)", and A() return "A)".]
CPU: Profiling
• Sampling:
• Coarse but usually effective
• Can also be low overhead, depending on the stack type
and sample rate, which is fixed (eg, 100 Hz x CPU count)
• Tracing:
• Overheads can be too high, distorting results and hurting
the target (eg, millions of trace events per second)
• Also good for learning kernel internals: browse the active code
CPU: Recognition
• Once you start profiling a target, you begin to recognize the
common stacks and patterns
• Linux getdents() ext4 path:
• The next slides show similar example kernel-mode CPU Sample Flame Graphs
CPU: Recognition: illumos localhost TCP
• From a TCP localhost latency issue (illumos kernel):
[Figure: Flame Graph with towers labeled "illumos fused-TCP send" and "illumos fused-TCP receive".]
CPU: Recognition: illumos IP DCE issue
[Figure: Flame Graph with three towers labeled "DCE lookup".]
CPU: Recognition: Linux TCP send
• Profiled from a KVM guest:
[Figure: Flame Graph with a tower labeled "Linux TCP sendmsg".]
CPU: Recognition: Syscall Towers
[Figure: Flame Graph with syscall towers labeled lstat(), open(), close(), read(), stat(), stat64(), sendfile(), writev(), pollsys(), and write(), plus "bnx xmit", "bnx recv", and "ip fanout receive".]
CPU: Both Stacks
• Apart from showing either user- or kernel-level stacks, both
can be included by stacking kernel on top of user
• Linux perf does this by default
• DTrace can by aggregating @[stack(), ustack()]
• The different stacks can be highlighted in different ways:
• different colors or hues
• separator: flamegraph.pl will color gray any functions
called "-", which can be inserted as stack separators
[Figure: Flame Graph with callouts: user only; kernel stack; user stack.]
Advanced Flame Graphs
Other Targets
• Apart from CPU samples, stack traces can be collected for
any event; eg:
• disk, network, or FS I/O
• CPU events, including cache misses
• lock contention and holds
• memory allocation
• Other values, instead of sample counts, can also be used:
• latency
• bytes
• The next sections demonstrate memory allocation, I/O tracing,
and then all blocking types via off-CPU tracing
Memory
• Analyze memory growth or leaks by tracing one of the
following memory events:
• 1. Allocator functions: malloc(), free()
• 2. brk() syscall
• 3. mmap() syscall
• 4. Page faults
• Instead of stacks and sample counts, measure stacks with byte counts
• Merging shows total bytes by code path
Memory: Four Targets
Memory: Allocator
• Trace malloc(), free(), realloc(), calloc(), ...
• These operate on virtual memory
• *alloc() stacks show why memory was first allocated (as
opposed to populated): Memory Allocation Flame Graphs
• With free()/realloc()/..., suspected memory leaks during tracing
can be identified: Memory Leak Flame Graphs!
• Down side: allocator functions are frequent, so tracing can
slow the target somewhat (eg, 25%)
• For comparison: Valgrind memcheck is more thorough, but its
CPU simulation can slow the target 20 - 30x
Memory: Allocator: malloc()
• As a simple example, just tracing malloc() calls with user-level
stacks and bytes requested, using DTrace:
# dtrace -x ustackframes=100 -n 'pid$target::malloc:entry {
@[ustack()] = sum(arg0); } tick-60s { exit(0); }' -p 529 -o out.malloc
[Figure: I/O stack, top to bottom: Application, FS, Block Device Interface, Disks. Physical I/O: measure at the block device interface for kernel stacks and disk I/O latency.]
I/O: Logical I/O Latency
• For example, ZFS call latency using DTrace (zfsustack.d):
#!/usr/sbin/dtrace -s

/* Timestamp from function start (entry) ... */
fbt::zfs_read:entry, fbt::zfs_write:entry,
fbt::zfs_readdir:entry, fbt::zfs_getattr:entry,
fbt::zfs_setattr:entry
{
	self->start = timestamp;
}

/* ... to function end (return) */
fbt::zfs_read:return, fbt::zfs_write:return,
fbt::zfs_readdir:return, fbt::zfs_getattr:return,
fbt::zfs_setattr:return
/self->start/
{
	this->time = timestamp - self->start;
	@[ustack(), execname] = sum(this->time);
	self->start = 0;
}

dtrace:::END
{
	printa("%k%s\n%@d\n", @);
}
I/O: Logical I/O Latency
• Making an I/O Time Flame Graph:
# ./zfsustacks.d -n 'tick-10s { exit(0); }' -o out.iostacks
[Figure: timeline of A() entering the kernel through a syscall, blocking (off-CPU), and returning on-CPU after an interrupt.]
Off-CPU: Performance Analysis
• Generic approach for all blocking events, including I/O
• An advanced performance analysis methodology:
• http://dtrace.org/blogs/brendan/2011/07/08/off-cpu-performance-analysis/
bash`waitchld+0x87
bash`wait_for+0x2ce
bash`execute_command_internal+0x1758
bash`execute_command+0x45
bash`reader_loop+0x240
bash`main+0xaff
bash`_start+0x83
1193160644
libc.so.1`__read+0x15
bash`rl_getc+0x2b
bash`rl_read_key+0x22d
bash`readline_internal_char+0x113
bash`readline+0x49
bash`yy_readline_get+0x52
bash`shell_getc+0xe1
bash`read_token+0x6f
bash`yyparse+0x4b9
bash`parse_command+0x67
bash`read_command+0x52
bash`reader_loop+0xa5
bash`main+0xaff
bash`_start+0x83
12588900307
Off-CPU: MySQL Idle
[Figure: Off-CPU Time Flame Graph for idle MySQL, with columns labeled by mysqld thread: buf_flush_page_cleaner_thread, dict_stats_thread, fts_optimize_thread, io_handler_thread, lock_wait_timeout_thread, mysqld_main, pfs_spawn_thread, srv_error_monitor_thread, srv_master_thread, srv_monitor_thread.]
Off-CPU: MySQL Idle
• Some thread columns are wider than the measurement time:
evidence of multiple threads
• This can be shown a number of ways. Eg, adding process
name, PID, and TID to the top of each user stack:
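#!/usr/sbin/dtrace -s

/*
 * Assumed clause (not shown in this excerpt): record when a mysqld
 * thread leaves CPU, so that self->ts is set for the clause below.
 */
sched:::off-cpu
/execname == "mysqld"/
{
	self->ts = timestamp;
}

sched:::on-cpu
/self->ts/
{
	/* curlwpsinfo->pr_lwpid is the thread ID (TID) */
	@[execname, pid, curlwpsinfo->pr_lwpid, ustack()] =
	    sum(timestamp - self->ts);
	self->ts = 0;
}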
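To make the accounting concrete, here is a small Python simulation of the same pairing logic, run over a made-up event list rather than real scheduler data. The event tuple layout and the `off_cpu_time` name are assumptions for illustration. Time is summed per thread and stack, which is why the total across threads can exceed the measurement time.

```python
from collections import defaultdict


def off_cpu_time(events):
    """Sum time off-CPU, keyed by (thread id, stack at switch-out).

    events: iterable of (timestamp_ns, thread_id, kind, stack), where kind
    is "off-cpu" or "on-cpu" (a hypothetical record format).
    """
    left_cpu = {}                      # thread id -> (timestamp, stack)
    totals = defaultdict(int)
    for ts, tid, kind, stack in sorted(events, key=lambda e: e[0]):
        if kind == "off-cpu":
            left_cpu[tid] = (ts, stack)
        elif kind == "on-cpu" and tid in left_cpu:
            start, blocked_stack = left_cpu.pop(tid)
            totals[(tid, blocked_stack)] += ts - start
    return dict(totals)


events = [
    (100, 1, "off-cpu", "main;read"),
    (150, 2, "off-cpu", "main;cond_wait"),
    (400, 1, "on-cpu", None),
    (500, 1, "off-cpu", "main;read"),
    (600, 1, "on-cpu", None),
    (1150, 2, "on-cpu", None),
]
totals = off_cpu_time(events)
print(totals)
print("total off-CPU:", sum(totals.values()), "elapsed:", 1150 - 100)
```

Here the two threads account for 1400 ns of off-CPU time within 1050 ns of elapsed time.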
Off-CPU: Challenges
• Including multiple threads in one Flame Graph might still be
confusing. Separate Flame Graphs for each can be created
• Off-CPU stacks often don't explain themselves:
[Figure callout: random narrow stacks during work, with no reason to sleep?]
Off-CPU: MySQL Busy
• Those were user-level stacks only. The kernel-level stack,
which can be included, will usually explain what happened
• eg, involuntary context switch due to time slice expired
• Those paths are likely hot in the CPU Sample Flame Graph
Hot/Cold
Hot/Cold: Profiling
• On-CPU Profiling
• Off-CPU Profiling (everything else)
[Figure: Hot/Cold Flame Graph with callouts "On-CPU (!?)" and "Off-CPU".]
Hot/Cold: Challenges
• Sadly, this often doesn't work well for two reasons:
• 1. On-CPU time columns get compressed by off-CPU time
• Previous example dominated by the idle path – waiting for
a new connection – which is not very interesting!
[Figure: timeline of thread A() blocking in the kernel and sleeping off-CPU until thread B() performs the wakeup.]
Tracing Wakeups
• The system knows who woke up whom
• Tracing who performed the wakeup – and their stack – can
show the real reason for waiting
• Wakeup Latency Flame Graph
• Advanced activity
• Consider overheads – might trace too much
• Eg, consider ssh, starting with the Off CPU Time Flame Graph
Off-CPU Time Flame Graph: ssh
[Figure callout: "... woke up these objects"]
Tracing Wakeup, Example (DTrace)
#!/usr/sbin/dtrace -s

/*
 * This example targets sshd. (The previous example also matched vmstat,
 * after discovering that sshd was blocked on vmstat, which it was: "vmstat 1".)
 */
#pragma D option quiet
#pragma D option ustackframes=100
#pragma D option stackframes=100
int related[uint64_t];

sched:::sleep
/execname == "sshd"/
{
	ts[curlwpsinfo->pr_addr] = timestamp;
}

/* Time from sleep to wakeup */
sched:::wakeup
/ts[args[0]->pr_addr]/
{
	this->d = timestamp - ts[args[0]->pr_addr];
	/* Stack traces of who is doing the waking */
	@[args[1]->pr_fname, args[1]->pr_pid, args[0]->pr_lwpid, args[0]->pr_wchan,
	    stack(), ustack(), execname, pid, curlwpsinfo->pr_lwpid] = sum(this->d);
	ts[args[0]->pr_addr] = 0;
}

/* Aggregate if possible instead of dumping, to minimize overheads */
dtrace:::END
{
	printa("\n%s-%d/%d-%x%k-%k%s-%d/%d\n%@d\n", @);
}
Following Stack Chains
• 1st level of wakeups often not enough
• Would like to programmatically follow multiple chains of
wakeup stacks, and visualize them
• I've discussed this with others before – it's a hard problem
• The following is in development!: Chain Graph
Chain Graph
[Figure: Chain Graph, bottom to top: Off-CPU Stacks (why I blocked); Wakeup Stacks (why I waited); Wakeup Thread 1 ("I wokeup"); Wakeup Thread 2 ("I wokeup"); ...]
Chain Graph Visualization
• New, experimental; check for later improvements
• Stacks associated based on sleeping object address
• Retains the property that relative width equals latency
• Wakeup stack frames can be listed in reverse (may be less
confusing when following towers bottom-up)
• Towers can get very tall, tracing wakeups through different
software threads, back to metal
Following Wakeup Chains, Example (DTrace)
#!/usr/sbin/dtrace -s

sched:::sleep
/execname == "sshd" || related[curlwpsinfo->pr_addr]/
{
	ts[curlwpsinfo->pr_addr] = timestamp;
}

sched:::wakeup
/ts[args[0]->pr_addr]/
{
	this->d = timestamp - ts[args[0]->pr_addr];
	@[args[1]->pr_fname, args[1]->pr_pid, args[0]->pr_lwpid, args[0]->pr_wchan,
	    stack(), ustack(), execname, pid, curlwpsinfo->pr_lwpid] = sum(this->d);
	ts[args[0]->pr_addr] = 0;
	/* Also follow who wakes up the waker */
	related[curlwpsinfo->pr_addr] = 1;
}

dtrace:::END
{
	printa("\n%s-%d/%d-%x%k-%k%s-%d/%d\n%@d\n", @);
}
Developments
• There have been many other great developments in the world
of Flame Graphs. The following is a short tour.
node.js Flame Graphs
• Dave Pacheco developed the DTrace ustack helper for v8,
and created Flame Graphs with node.js functions
http://dtrace.org/blogs/dap/2012/01/05/where-does-your-node-program-spend-its-time/
OS X Instruments Flame Graphs
• Mark Probst developed a way to produce Flame Graphs from Instruments:
1. Use the Time Profile instrument
2. Instrument -> Export Track
3. stackcollapse-instruments.pl
4. flamegraph.pl
http://schani.wordpress.com/2012/11/16/flame-graphs-for-instruments/
Ruby Flame Graphs
• Sam Saffron developed Flame Graphs with the Ruby
MiniProfiler
• These stacks are very deep (many frames), so the function names
have been dropped and only the rectangles are drawn
• This preserves the value of seeing the big picture at first glance!
http://samsaffron.com/archive/2013/03/19/flame-graphs-in-ruby-miniprofiler
Windows Xperf Flame Graphs
• Bruce Dawson developed Flame Graphs from Xperf data, and
an xperf_to_collapsedstacks.py script
http://randomascii.wordpress.com/2013/03/26/summarizing-xperf-cpu-usage-with-flame-graphs/
WebKit Web Inspector Flame Charts
• Available in Google Chrome developer tools, these show
JavaScript CPU stacks as colored rectangles
• Inspired by Flame Graphs but not the same: they show the
passage of time on the x-axis!
• This generally works here because of:
• single-threaded targets, often with repetitive code paths
• the ability to zoom
• Can a "Flame Graph" mode be provided for the same data?
https://bugs.webkit.org/show_bug.cgi?id=111162
Perl Devel::NYTProf Flame Graphs
• Tim Bunce has been adding Flame Graph features, and
included them in the Perl profiler: Devel::NYTProf
http://blog.timbunce.org/2013/04/08/nytprof-v5-flaming-precision/
Leak and Off-CPU Time Flame Graphs
• Yichun Zhang (agentzh) has created Memory Leak and Off-
CPU Time Flame Graphs, and has given good talks to explain
how Flame Graphs work
http://agentzh.org/#Presentations
http://agentzh.org/misc/slides/yapc-na-2013-flame-graphs.pdf
http://www.youtube.com/watch?v=rxn7HoNrv9A
http://agentzh.org/misc/slides/off-cpu-flame-graphs.pdf
http://agentzh.org/misc/flamegraph/nginx-leaks-2013-10-08.svg
https://github.com/agentzh/nginx-systemtap-toolkit
• ... these also provide examples of using SystemTap on Linux
Color Schemes
• Colors can be used to convey data, instead of the default
random color scheme. This example from Dave Pacheco
colors each function by its degree of direct on-CPU execution
• A Flame Graph tool could let you select different color schemes
• Another can be: color by a hash on the function name, so colors
are consistent
https://npmjs.org/package/stackvis
Zoomable Flame Graphs
• Dave Pacheco has also used d3 to provide click to zoom!
https://npmjs.org/package/stackvis
Flame Graph Differentials
• Robert Mustacchi has been experimenting with showing the
difference between two Flame Graphs, as a Flame Graph.
Great potential for non-regression testing, and comparisons!
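One simple way to think about a differential is as a per-stack subtraction of two folded profiles. The sketch below is an assumption for illustration, not Robert Mustacchi's method: it takes two dicts of folded stacks (as produced by a stackcollapse step) and reports the change in count for each code path. The `diff_folded` name and sample data are hypothetical.

```python
def diff_folded(before, after):
    """Return {stack: after_count - before_count} for every stack in either profile."""
    stacks = set(before) | set(after)
    return {s: after.get(s, 0) - before.get(s, 0) for s in sorted(stacks)}


# Hypothetical folded profiles from two runs of the same workload.
before = {"a;b;c": 70, "a;b;d": 20, "a;g": 10}
after = {"a;b;c": 40, "a;b;d": 20, "a;g": 10, "a;b;e": 30}

for stack, delta in diff_folded(before, after).items():
    print("%-8s %+d" % (stack, delta))
```

A renderer could then color each box by the sign and size of its delta. If the two profiles have different total sample counts, the counts would need normalizing first.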
Flame Graphs as a Service
• Pedro Teixeira has a project for node.js Flame Graphs as a
service: automatically generated for each github push!
http://www.youtube.com/watch?v=sMohaWP5YqA
References & Acknowledgements
• Neelakanth Nadgir (realneel): developed SVGs using Ruby
and JavaScript of time-series function trace data with stack
levels, inspired by Roch's work
• Roch Bourbonnais: developed Call Stack Analyzer, which
produced similar time-series visualizations
• Edward Tufte: inspired me to explore visualizations that show
all the data at once, as Flame Graphs do
• Thanks to all who have developed Flame Graphs further!
[Figure: realneel's function_call_graph.rb visualization]
Thank you!
• Questions?
• Homepage: http://www.brendangregg.com (links to everything)
• Resources and further reading:
• http://dtrace.org/blogs/brendan/2011/12/16/flame-graphs/: see "Updates"
• http://dtrace.org/blogs/brendan/2012/03/17/linux-kernel-performance-flame-graphs/
• http://dtrace.org/blogs/brendan/2013/08/16/memory-leak-growth-flame-graphs/
• http://dtrace.org/blogs/brendan/2011/07/08/off-cpu-performance-analysis/
• http://dtrace.org/blogs/dap/2012/01/05/where-does-your-node-program-spend-its-time/