Linux Performance Tools

Brendan Gregg
Senior Performance Architect

bgregg@netflix.com
@brendangregg

August, 2014
A quick tour of many tools...

Massive AWS EC2 Linux cloud
Tens of thousands of instances
Autoscale by ~3k each day
CentOS and Ubuntu
FreeBSD for content delivery
Approx 33% of US Internet traffic at night
Performance is critical
Customer satisfaction: >50M subscribers
$$$ price/performance
Develop tools for cloud-wide analysis;
use server tools as needed
Brendan Gregg
Senior Performance Architect, Netflix
Linux and FreeBSD performance
Performance Engineering team (@coburnw)
Recent work:
Linux perf-tools, using ftrace & perf_events
Systems Performance, Prentice Hall, 2014
Previous work includes:
USE Method, flame graphs, latency &
utilization heat maps, DTraceToolkit,
iosnoop and others on OS X, ZFS L2ARC
Twitter @brendangregg (these slides)
Agenda
Methodologies & Tools
Tool Types:
Observability
Benchmarking
Tuning
Tracing
Methodologies & Tools
There are dozens of performance tools for Linux
Packages: sysstat, procps, coreutils, ...
Commercial products
Methodologies can provide guidance for
choosing and using tools effectively
Anti-Methodologies
The lack of a deliberate methodology...
Street Light Anti-Method:
1. Pick observability tools that are:
Familiar
Found on the Internet, or at random
2. Run tools
3. Look for obvious issues
Drunk Man Anti-Method:
Tune things at random until the problem goes away
Methodologies
For example, the USE Method:
For every resource, check:
Utilization
Saturation
Errors
Other methods include:
Workload characterization, drill-down analysis, event
tracing, baseline stats, static performance tuning, ...
Start with the questions, then find the tools; a sketch
mapping the USE Method onto standard tools follows below.
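As a hedged illustration (not from the original deck), the USE Method checklist can be driven with the standard tools covered below; the tool choices and thresholds are assumptions, so adapt them to your environment:

# CPU: utilization, saturation (run queue), errors
$ mpstat -P ALL 1        # per-CPU %usr/%sys/%idle (utilization)
$ vmstat 1               # "r" greater than CPU count suggests saturation
# Memory: utilization and saturation (swapping)
$ free -m                # used vs free, buffers/cache
$ vmstat 1               # "si"/"so" columns show swapping (saturation)
# Disk: utilization, saturation (queueing), latency
$ iostat -xmdz 1         # %util, avgqu-sz, await
# Network: utilization and errors
$ sar -n DEV,EDEV 1      # per-interface throughput, errors and drops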
Command Line Tools
Useful to study even if you never use them:
GUIs and commercial products often use the
same interfaces
$ vmstat 1
procs -----------memory---------- ---swap--
r b swpd free buff cache si so
9 0 0 29549320 29252 9299060 0
2 0 0 29547876 29252 9299332 0
4 0 0 29548124 29252 9299460 0
5 0 0 29548840 29252 9299592 0
(diagram: the command line tools read statistics from the kernel via /proc, /sys, ...)
Tool Types
Type            Characteristic
Observability   Watch. Safe, usually, depending on resource overhead.
Benchmarking    Load test. Caution: production tests can cause issues due to contention.
Tuning          Change. Danger: changes could hurt performance, now or later with load.
Observability Tools
How do you measure these?
Observability Tools: Basic
uptime
top (or htop)
ps
vmstat
iostat
mpstat
free
uptime
One way to print load averages:

A measure of resource demand: CPUs + disks
Other OSes only show CPUs: easier to interpret
Exponentially-damped moving averages with
time constants of 1, 5, and 15 minutes
Historic trend without the line graph
Load > # of CPUs may mean CPU saturation
Don't spend more than 5 seconds studying these

$ uptime
07:42:06 up 8:16, 1 user, load average: 2.27, 2.84, 2.91
top (or htop)
System and per-process interval summary:
%CPU is summed across all CPUs
Can miss short-lived processes (atop won't)
Can consume noticeable CPU to read /proc
$ top - 18:50:26 up 7:43, 1 user, load average: 4.11, 4.91, 5.22
Tasks: 209 total, 1 running, 206 sleeping, 0 stopped, 2 zombie
Cpu(s): 47.1%us, 4.0%sy, 0.0%ni, 48.4%id, 0.0%wa, 0.0%hi, 0.3%si, 0.2%st
Mem: 70197156k total, 44831072k used, 25366084k free, 36360k buffers
Swap: 0k total, 0k used, 0k free, 11873356k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5738 apiprod 20 0 62.6g 29g 352m S 417 44.2 2144:15 java
1386 apiprod 20 0 17452 1388 964 R 0 0.0 0:00.02 top
1 root 20 0 24340 2272 1340 S 0 0.0 0:01.51 init
2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
[]
htop
ps
Process status listing (eg, ASCII art forest):
Custom fields:
$ ps -ef f
UID PID PPID C STIME TTY STAT TIME CMD
[]
root 4546 1 0 11:08 ? Ss 0:00 /usr/sbin/sshd -D
root 28261 4546 0 17:24 ? Ss 0:00 \_ sshd: prod [priv]
prod 28287 28261 0 17:24 ? S 0:00 \_ sshd: prod@pts/0
prod 28288 28287 0 17:24 pts/0 Ss 0:00 \_ -bash
prod 3156 28288 0 19:15 pts/0 R+ 0:00 \_ ps -ef f
root 4965 1 0 11:08 ? Ss 0:00 /bin/sh /usr/bin/svscanboot
root 4969 4965 0 11:08 ? S 0:00 \_ svscan /etc/service
[]
$ ps -eo user,sz,rss,minflt,majflt,pcpu,args
USER SZ RSS MINFLT MAJFLT %CPU COMMAND
root 6085 2272 11928 24 0.0 /sbin/init
[]
vmstat
Virtual memory statistics and more:
USAGE: vmstat [interval [count]]
First output line has some summary-since-boot
values (should be all; the partial mix is confusing)
High level CPU summary. "r" is runnable tasks.
$ vmstat -Sm 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
8 0 0 1620 149 552 0 0 1 179 77 12 25 34 0 0
7 0 0 1598 149 552 0 0 0 0 205 186 46 13 0 0
8 0 0 1617 149 552 0 0 0 8 210 435 39 21 0 0
8 0 0 1589 149 552 0 0 0 0 218 219 42 17 0 0
[]
iostat
Block I/O (disk) stats. 1st output is since boot.
Very useful set of stats:
$ iostat -xmdz 1

Linux 3.13.0-29 (db001-eb883efa) 08/18/2014 _x86_64_ (16 CPU)

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s \ ...
xvda 0.00 0.00 0.00 0.00 0.00 0.00 / ...
xvdb 213.00 0.00 15299.00 0.00 338.17 0.00 \ ...
xvdc 129.00 0.00 15271.00 3.00 336.65 0.01 / ...
md0 0.00 0.00 31082.00 3.00 678.45 0.01 \ ...
... \ avgqu-sz await r_await w_await svctm %util
... / 0.00 0.00 0.00 0.00 0.00 0.00
... \ 126.09 8.22 8.22 0.00 0.06 86.40
... / 99.31 6.47 6.47 0.00 0.06 86.00
... \ 0.00 0.00 0.00 0.00 0.00 0.00
(in the output above, the left columns describe the workload; the right columns show the resulting performance)
mpstat
Multi-processor statistics, per-CPU:
Look for unbalanced workloads, hot CPUs.
$ mpstat -P ALL 1
[]
08:06:43 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
08:06:44 PM all 53.45 0.00 3.77 0.00 0.00 0.39 0.13 0.00 42.26
08:06:44 PM 0 49.49 0.00 3.03 0.00 0.00 1.01 1.01 0.00 45.45
08:06:44 PM 1 51.61 0.00 4.30 0.00 0.00 2.15 0.00 0.00 41.94
08:06:44 PM 2 58.16 0.00 7.14 0.00 0.00 0.00 1.02 0.00 33.67
08:06:44 PM 3 54.55 0.00 5.05 0.00 0.00 0.00 0.00 0.00 40.40
08:06:44 PM 4 47.42 0.00 3.09 0.00 0.00 0.00 0.00 0.00 49.48
08:06:44 PM 5 65.66 0.00 3.03 0.00 0.00 0.00 0.00 0.00 31.31
08:06:44 PM 6 50.00 0.00 2.08 0.00 0.00 0.00 0.00 0.00 47.92
[]
free
Main memory usage:

buffers: block device I/O cache
cached: virtual page cache
$ free -m
total used free shared buffers cached
Mem: 3750 1111 2639 0 147 527
-/+ buffers/cache: 436 3313
Swap: 0 0 0
Observability Tools: Basic
Observability Tools: Intermediate
strace
tcpdump
netstat
nicstat
pidstat
swapon
sar (and collectl, dstat, etc.)
strace
System call tracer:
Eg, -ttt: time (us) since epoch; -T: syscall time (s)
Translates syscall args
Very helpful for solving system usage issues
Currently has massive overhead (ptrace based)
Can slow the target by > 100x. Use extreme caution.
$ strace -tttT -p 313
1408393285.779746 getgroups(0, NULL) = 1 <0.000016>
1408393285.779873 getgroups(1, [0]) = 1 <0.000015>
1408393285.780797 close(3) = 0 <0.000016>
1408393285.781338 write(1, "LinuxCon 2014!\n", 15LinuxCon 2014!
) = 15 <0.000048>
tcpdump
Sniff network packets for post analysis:
Study packet sequences with timestamps (us)
CPU overhead optimized (socket ring buffers),
but can still be significant. Use caution.
$ tcpdump -i eth0 -w /tmp/out.tcpdump
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
^C7985 packets captured
8996 packets received by filter
1010 packets dropped by kernel
# tcpdump -nr /tmp/out.tcpdump | head
reading from file /tmp/out.tcpdump, link-type EN10MB (Ethernet)
20:41:05.038437 IP 10.44.107.151.22 > 10.53.237.72.46425: Flags [P.], seq 18...
20:41:05.038533 IP 10.44.107.151.22 > 10.53.237.72.46425: Flags [P.], seq 48...
20:41:05.038584 IP 10.44.107.151.22 > 10.53.237.72.46425: Flags [P.], seq 96...
[]
netstat
Various network protocol statistics using -s:
A multi-tool:
-i: interface stats
-r: route table
default: list conns
netstat -p: shows
process details!
Per-second interval
with -c
$ netstat -s
[]
Tcp:
736455 active connections openings
176887 passive connection openings
33 failed connection attempts
1466 connection resets received
3311 connections established
91975192 segments received
180415763 segments send out
223685 segments retransmited
2 bad segments received.
39481 resets sent
[]
TcpExt:
12377 invalid SYN cookies received
2982 delayed acks sent
[]
nicstat
Network interface stats, iostat-like output:
Check network throughput and interface %util
I wrote this years ago; Tim Cook ported it to Linux
$ ./nicstat 1
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
21:21:43 lo 823.0 823.0 171.5 171.5 4915.4 4915.4 0.00 0.00
21:21:43 eth0 5.53 1.74 15.11 12.72 374.5 139.8 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
21:21:44 lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
21:21:44 eth0 20.42 3394.1 355.8 85.94 58.76 40441.3 0.00 0.00
Time Int rKB/s wKB/s rPk/s wPk/s rAvs wAvs %Util Sat
21:21:45 lo 1409.1 1409.1 327.9 327.9 4400.8 4400.8 0.00 0.00
21:21:45 eth0 75.12 4402.3 1398.9 1513.2 54.99 2979.1 0.00 0.00
[]
pidstat
Very useful process stats. Eg, by-thread, disk I/O:
I usually prefer this over top(1)
$ pidstat -t 1
Linux 3.2.0-54 (db002-91befe03) 08/18/2014 _x86_64_ (8 CPU)

08:57:52 PM TGID TID %usr %system %guest %CPU CPU Command
08:57:54 PM 5738 - 484.75 39.83 0.00 524.58 1 java
08:57:54 PM - 5817 0.85 0.00 0.00 0.85 2 |__java
08:57:54 PM - 5931 1.69 1.69 0.00 3.39 4 |__java
08:57:54 PM - 5981 0.85 0.00 0.00 0.85 7 |__java
08:57:54 PM - 5990 0.85 0.00 0.00 0.85 4 |__java
[]
$ pidstat -d 1
[]
08:58:27 PM PID kB_rd/s kB_wr/s kB_ccwr/s Command
08:58:28 PM 5738 0.00 815.69 0.00 java
[]
swapon
Show swap device usage:

If you have swap enabled...
$ swapon -s
Filename Type Size Used Priority
/dev/sda3 partition 5245212 284 -1
sar
System Activity Reporter. Many stats, eg:
Archive or live mode: (interval [count])
Well designed. Header naming convention,
logical groups: TCP, ETCP, DEV, EDEV, ...
$ sar -n TCP,ETCP,DEV 1
Linux 3.2.55 (test-e4f1a80b) 08/18/2014 _x86_64_ (8 CPU)

09:10:43 PM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
09:10:44 PM lo 14.00 14.00 1.34 1.34 0.00 0.00 0.00
09:10:44 PM eth0 4114.00 4186.00 4537.46 28513.24 0.00 0.00 0.00

09:10:43 PM active/s passive/s iseg/s oseg/s
09:10:44 PM 21.00 4.00 4107.00 22511.00

09:10:43 PM atmptf/s estres/s retrans/s isegerr/s orsts/s
09:10:44 PM 0.00 0.00 36.00 0.00 1.00
[]
Observability: sar
collectl
sar-like multitool
Supports distributed environments;
designed for HPC
One ~6k line Perl program (hackable)
Exposes /proc/PID/io syscr & syscw stats, so gets
a dotted line to syscalls (an example invocation follows below)
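A minimal invocation sketch for collectl (not from the original deck; the flags below are based on collectl's documented -s subsystem and -i interval options, so check collectl(1) on your system):

$ collectl -scdn -i1     # cpu, disk, and network summaries every second
$ collectl -sD -i1       # per-disk detail, iostat-like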
Observability: collectl
Other Tools
With these measure-everything tools, the point isn't to
use sar or collectl (or dstat or whatever); it's to
have a way to measure everything you want
In cloud environments, you are probably using a
monitoring product, developed in-house or
commercial. The same method applies.
How does your monitoring tool
measure these?
Observability Tools: Intermediate
Advanced Observability Tools
Misc:
ltrace, ss, iptraf, ethtool, snmpget, lldptool, iotop,
blktrace, slabtop, /proc
CPU Performance Counters:
perf_events, tiptop
Advanced Tracers:
perf_events, ftrace, eBPF, SystemTap, ktap, LTTng,
dtrace4linux, sysdig
Some selected demos...
ss
More socket statistics:
$ ss -mop
State Recv-Q Send-Q Local Address:Port Peer Address:Port
CLOSE-WAIT 1 0 127.0.0.1:42295 127.0.0.1:28527
users:(("apacheLogParser",2702,3))
mem:(r1280,w0,f2816,t0)
ESTAB 0 0 127.0.0.1:5433 127.0.0.1:41312
timer:(keepalive,36min,0) users:(("postgres",2333,7))
mem:(r0,w0,f0,t0)
[]
$ ss -i
State Recv-Q Send-Q Local Address:Port Peer Address:Port
CLOSE-WAIT 1 0 127.0.0.1:42295 127.0.0.1:28527
cubic wscale:6,6 rto:208 rtt:9/6 ato:40 cwnd:10 send 145.6Mbps rcv_space:32792
ESTAB 0 0 10.144.107.101:ssh 10.53.237.72:4532
cubic wscale:4,6 rto:268 rtt:71.5/3 ato:40 cwnd:10 send 1.5Mbps rcv_rtt:72
rcv_space:14480
[]
iptraf
iotop
Block device I/O (disk) by process:
Needs kernel support enabled:
CONFIG_TASK_IO_ACCOUNTING
$ iotop
Total DISK READ: 50.47 M/s | Total DISK WRITE: 59.21 M/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
959 be/4 root 0.00 B/s 0.00 B/s 0.00 % 99.99 % [flush-202:1]
6641 be/4 root 50.47 M/s 82.60 M/s 0.00 % 32.51 % java Dnop X
1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init
2 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kthreadd]
3 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/0]
4 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/0:0]
5 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/u:0]
6 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/0]
[]
slabtop
Kernel slab allocator memory usage:
$ slabtop
Active / Total Objects (% used) : 4692768 / 4751161 (98.8%)
Active / Total Slabs (% used) : 129083 / 129083 (100.0%)
Active / Total Caches (% used) : 71 / 109 (65.1%)
Active / Total Size (% used) : 729966.22K / 738277.47K (98.9%)
Minimum / Average / Maximum Object : 0.01K / 0.16K / 8.00K

OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
3565575 3565575 100% 0.10K 91425 39 365700K buffer_head
314916 314066 99% 0.19K 14996 21 59984K dentry
184192 183751 99% 0.06K 2878 64 11512K kmalloc-64
138618 138618 100% 0.94K 4077 34 130464K xfs_inode
138602 138602 100% 0.21K 3746 37 29968K xfs_ili
102116 99012 96% 0.55K 3647 28 58352K radix_tree_node
97482 49093 50% 0.09K 2321 42 9284K kmalloc-96
22695 20777 91% 0.05K 267 85 1068K shared_policy_node
21312 21312 100% 0.86K 576 37 18432K ext4_inode_cache
16288 14601 89% 0.25K 509 32 4072K kmalloc-256
[]
perf_events (counters)
"perf" command. CPU perf counters (listing):
Identify CPU cycle breakdowns, esp. stall types
Sadly, can't access these in most clouds (yet)
Can be time-consuming to use (CPU manuals)
$ perf list | grep -i hardware
cpu-cycles OR cycles [Hardware event]
stalled-cycles-frontend OR idle-cycles-frontend [Hardware event]
stalled-cycles-backend OR idle-cycles-backend [Hardware event]
instructions [Hardware event]
[]
branch-misses [Hardware event]
bus-cycles [Hardware event]
L1-dcache-loads [Hardware cache event]
L1-dcache-load-misses [Hardware cache event]
[]
rNNN (see 'perf list --help' on how to encode it) [Raw hardware event
mem:<addr>[:access] [Hardware breakpoint]
tiptop
IPC by process? %MISS? %BUS? Awesome!
Needs some love. Still can't use it yet (cloud)
More Advanced Tools...
Some others worth mentioning (quick invocations are sketched below):
Tool        Description
ltrace      Library call tracer
ethtool     Mostly interface tuning; some stats
snmpget     SNMP network host statistics
lldptool    Can get LLDP broadcast stats
blktrace    Block I/O event tracer
/proc       Many raw kernel counters
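As a hedged illustration (not in the original deck), typical invocations for a few of these; the PID, interface, and device names are placeholders:

$ ltrace -c -p PID            # count library calls for a process (high overhead)
$ ethtool -S eth0             # NIC driver statistics
$ ethtool -i eth0             # driver and firmware details
# blktrace -d /dev/sda -o - | blkparse -i -    # raw block I/O events (root, debugfs mounted)
$ cat /proc/meminfo           # raw kernel memory counters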
Advanced Tracers
Many options on Linux:
perf_events, ftrace, eBPF, SystemTap, ktap, LTTng,
dtrace4linux, sysdig
Most can do static and dynamic tracing
Static: pre-defined events (tracepoints)
Dynamic: instrument any software (kprobes,
uprobes). Custom metrics on-demand. Catch all.
Many are in-development.
I'll summarize their state later.
Linux Observability Tools
Benchmarking Tools
Multi:
UnixBench, lmbench, sysbench, perf bench
FS/disk:
dd, hdparm, fio
App/lib:
ab, wrk, jmeter, openssl
Networking:
ping, hping3, iperf, ttcp, traceroute, mtr, pchar
Active Benchmarking
Most benchmarks are misleading or wrong
You benchmark A, but actually measure B, and
conclude that you measured C
Active Benchmarking:
1. Run the benchmark for hours
2. While running, analyze and confirm the
performance limiter using observability tools
We just covered those tools; use them!
(A sketch of this workflow follows below.)
lmbench
CPU, memory, and kernel micro-benchmarks
Eg, memory latency by stride size:
$ lat_mem_rd 100m 128 > out.latencies
some R processing
(plot of out.latencies: the latency steps mark the L1, L2, and L3 caches, then main memory)
fio
FS or disk I/O micro-benchmarks:
$ fio --name=seqwrite --rw=write --bs=128k --size=122374m
[]
seqwrite: (groupid=0, jobs=1): err= 0: pid=22321
write: io=122374MB, bw=840951KB/s, iops=6569 , runt=149011msec
clat (usec): min=41 , max=133186 , avg=148.26, stdev=1287.17
lat (usec): min=44 , max=133188 , avg=151.11, stdev=1287.21
bw (KB/s) : min=10746, max=1983488, per=100.18%, avg=842503.94,
stdev=262774.35
cpu : usr=2.67%, sys=43.46%, ctx=14284, majf=1, minf=24
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued r/w/d: total=0/978992/0, short=0/0/0
lat (usec): 50=0.02%, 100=98.30%, 250=1.06%, 500=0.01%, 750=0.01%
lat (usec): 1000=0.01%
lat (msec): 2=0.01%, 4=0.01%, 10=0.25%, 20=0.29%, 50=0.06%
lat (msec): 100=0.01%, 250=0.01%
Results include basic latency distribution
pchar
Traceroute with bandwidth per hop!
$ pchar 10.71.83.1
[]
4: 10.110.80.1 (10.110.80.1)
Partial loss: 0 / 5 (0%)
Partial char: rtt = 9.351109 ms, (b = 0.004961 ms/B), r2 = 0.184105
stddev rtt = 4.967992, stddev b = 0.006029
Partial queueing: avg = 0.000000 ms (0 bytes)
Hop char: rtt = --.--- ms, bw = 1268.975773 Kbps
Hop queueing: avg = 0.000000 ms (0 bytes)
5: 10.193.43.181 (10.193.43.181)
Partial loss: 0 / 5 (0%)
Partial char: rtt = 25.461597 ms, (b = 0.011934 ms/B), r2 = 0.228707
stddev rtt = 10.426112, stddev b = 0.012653
Partial queueing: avg = 0.000000 ms (0 bytes)
Hop char: rtt = 16.110487 ms, bw = 1147.210397 Kbps
Hop queueing: avg = 0.000000 ms (0 bytes)
[]
Needs love. Based on pathchar (Linux 2.0.30).
Benchmarking Tools
Tuning Tools
Generic interfaces:
sysctl, /sys
Many areas have custom tuning tools (examples sketched below):
Applications: their own config
CPU/scheduler: nice, renice, taskset, ulimit, chcpu
Storage I/O: tune2fs, ionice, hdparm, blockdev, ...
Network: ethtool, tc, ip, route
Dynamic patching: stap, kpatch
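A hedged sketch of the generic interfaces; the parameter, PID, and device names are examples only, and what is worth tuning depends entirely on the workload:

$ sysctl net.ipv4.tcp_congestion_control             # read a kernel tunable
# sysctl -w net.ipv4.tcp_congestion_control=cubic    # set it (root; not persistent across reboot)
$ cat /sys/block/sda/queue/scheduler                 # read a /sys tunable
# renice -n 10 -p PID                                # lower a process's CPU priority
# ionice -c 3 -p PID                                 # move a process to the idle I/O class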
Tuning Methods
Scientific Method:
1. Question
2. Hypothesis
3. Prediction
4. Test
5. Analysis
Any observational or benchmarking tests you can
try before tuning?
Consider risks, and see previous tools
Tuning Tools
Tracing
Tracing Frameworks: Tracepoints
Statically placed at logical places in the kernel
Provides key event details as a "format" string
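As a hedged illustration (not part of the original slides), tracepoints and their format strings can be browsed via the tracefs interface used later in this deck; the block tracepoint is just an example:

# ls /sys/kernel/debug/tracing/events                               # tracepoint categories
# grep block: /sys/kernel/debug/tracing/available_events            # block I/O tracepoints
# cat /sys/kernel/debug/tracing/events/block/block_rq_issue/format  # the "format" string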
Tracing Frameworks: + probes
kprobes: dynamic kernel tracing
function calls, returns, line numbers
uprobes: dynamic user-level tracing
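A minimal sketch of the raw kprobes interface, which is what the kprobe front-end demoed later drives under the hood; the probe name and the %si fetch-arg are assumptions for x86-64:

# echo 'p:myopen do_sys_open filename=+0(%si):string' >> /sys/kernel/debug/tracing/kprobe_events
# echo 1 > /sys/kernel/debug/tracing/events/kprobes/myopen/enable
# cat /sys/kernel/debug/tracing/trace_pipe    # watch the events stream; Ctrl-C to stop
# echo 0 > /sys/kernel/debug/tracing/events/kprobes/myopen/enable
# echo '-:myopen' >> /sys/kernel/debug/tracing/kprobe_events    # remove the probe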
Options:
ftrace
perf_events
eBPF
SystemTap
ktap
LTTng
dtrace4linux
sysdig
Too many choices, and many still in-development
Tracing Tools
Imagine Linux with Tracing
With a programmable tracer, high level tools can
be written, such as:
iosnoop
iolatency
opensnoop
...
iosnoop
Block I/O (disk) events with latency:
# ./iosnoop -ts
Tracing block I/O. Ctrl-C to end.
STARTs ENDs COMM PID TYPE DEV BLOCK BYTES LATms
5982800.302061 5982800.302679 supervise 1809 W 202,1 17039600 4096 0.62
5982800.302423 5982800.302842 supervise 1809 W 202,1 17039608 4096 0.42
5982800.304962 5982800.305446 supervise 1801 W 202,1 17039616 4096 0.48
5982800.305250 5982800.305676 supervise 1801 W 202,1 17039624 4096 0.43
[]
# ./iosnoop -h
USAGE: iosnoop [-hQst] [-d device] [-i iotype] [-p PID] [-n name] [duration]
-d device # device string (eg, "202,1")
-i iotype # match type (eg, '*R*' for all reads)
-n name # process name to match on I/O issue
-p PID # PID to match on I/O issue
-Q # include queueing time in LATms
-s # include start time of I/O (s)
-t # include completion time of I/O (s)
-h # this usage message
duration # duration seconds, and use buffers
[]
iolatency
Block I/O (disk) latency distributions:
# ./iolatency
Tracing block I/O. Output every 1 seconds. Ctrl-C to end.

>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 2104 |######################################|
1 -> 2 : 280 |###### |
2 -> 4 : 2 |# |
4 -> 8 : 0 | |
8 -> 16 : 202 |#### |

>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 1144 |######################################|
1 -> 2 : 267 |######### |
2 -> 4 : 10 |# |
4 -> 8 : 5 |# |
8 -> 16 : 248 |######### |
16 -> 32 : 601 |#################### |
32 -> 64 : 117 |#### |
[]
opensnoop
Trace open() syscalls showing filenames:
# ./opensnoop -t
Tracing open()s. Ctrl-C to end.
TIMEs COMM PID FD FILE
4345768.332626 postgres 23886 0x8 /proc/self/oom_adj
4345768.333923 postgres 23886 0x5 global/pg_filenode.map
4345768.333971 postgres 23886 0x5 global/pg_internal.init
4345768.334813 postgres 23886 0x5 base/16384/PG_VERSION
4345768.334877 postgres 23886 0x5 base/16384/pg_filenode.map
4345768.334891 postgres 23886 0x5 base/16384/pg_internal.init
4345768.335821 postgres 23886 0x5 base/16384/11725
4345768.347911 svstat 24649 0x4 supervise/ok
4345768.347921 svstat 24649 0x4 supervise/status
4345768.350340 stat 24651 0x3 /etc/ld.so.cache
4345768.350372 stat 24651 0x3 /lib/x86_64-linux-gnu/libselinux
4345768.350460 stat 24651 0x3 /lib/x86_64-linux-gnu/libc.so.6
4345768.350526 stat 24651 0x3 /lib/x86_64-linux-gnu/libdl.so.2
4345768.350981 stat 24651 0x3 /proc/filesystems
4345768.351182 stat 24651 0x3 /etc/nsswitch.conf
[]
funcgraph
Trace a graph of kernel code flow:
# ./funcgraph -Htp 5363 vfs_read
Tracing "vfs_read" for PID 5363... Ctrl-C to end.
# tracer: function_graph
#
# TIME CPU DURATION FUNCTION CALLS
# | | | | | | | |
4346366.073832 | 0) | vfs_read() {
4346366.073834 | 0) | rw_verify_area() {
4346366.073834 | 0) | security_file_permission() {
4346366.073834 | 0) | apparmor_file_permission() {
4346366.073835 | 0) 0.153 us | common_file_perm();
4346366.073836 | 0) 0.947 us | }
4346366.073836 | 0) 0.066 us | __fsnotify_parent();
4346366.073836 | 0) 0.080 us | fsnotify();
4346366.073837 | 0) 2.174 us | }
4346366.073837 | 0) 2.656 us | }
4346366.073837 | 0) | tty_read() {
4346366.073837 | 0) 0.060 us | tty_paranoia_check();
[]
kprobe
Dynamically trace a kernel function call or return,
with variables, and in-kernel filtering:
# ./kprobe 'p:open do_sys_open filename=+0(%si):string' 'filename ~ "*stat"'
Tracing kprobe myopen. Ctrl-C to end.
postgres-1172 [000] d... 6594028.787166: open: (do_sys_open
+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
postgres-1172 [001] d... 6594028.797410: open: (do_sys_open
+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
postgres-1172 [001] d... 6594028.797467: open: (do_sys_open
+0x0/0x220) filename="pg_stat_tmp/pgstat.stat
^C
Ending tracing...
Add -s for stack traces; -p for a PID filter in-kernel.
Quickly confirm kernel behavior, eg: did a
tunable take effect?
Imagine Linux with Tracing
These tools aren't using dtrace4linux, SystemTap,
ktap, or any other add-on tracer
These tools use existing Linux capabilities
No extra kernel bits, not even kernel debuginfo
Just Linux's built-in ftrace profiler
Demoed on Linux 3.2
Solving real issues now
ftrace
Added by Steven Rostedt and others since 2.6.27
Already enabled on our servers (3.2+)
CONFIG_FTRACE, CONFIG_FUNCTION_PROFILER, ...
Use directly via /sys/kernel/debug/tracing
My front-end tools to aid usage (a quick start is sketched below):
https://github.com/brendangregg/perf-tools
Unsupported hacks: see WARNINGs
Also see the trace-cmd front-end, as well as perf
lwn.net today: "Ftrace: The Hidden Light Switch"
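A hedged quick start for the perf-tools repository above; the 10-second duration and option combination are illustrative, and the tools need root plus a kernel with ftrace enabled:

$ git clone https://github.com/brendangregg/perf-tools
$ cd perf-tools
# ./iosnoop 10       # trace block I/O with latency for 10 seconds (usage shown earlier)
# ./iosnoop -Qts     # also include queueing time and start/end timestamps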
My perf-tools (so far...)
Tracing Summary
ftrace
perf_events
eBPF
SystemTap
ktap
LTTng
dtrace4linux
sysdig
perf_events
aka the "perf" command
In Linux. Add from linux-tools-common.
Powerful multi-tool and profiler:
interval sampling, CPU performance counter events
user and kernel dynamic tracing
kernel line tracing and local variables (debuginfo)
kernel filtering, and in-kernel counts (perf stat)
Not very programmable, yet:
limited kernel summaries. May improve with eBPF.
(A basic profiling sketch follows below.)
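As a hedged illustration (not one of the original slides), the most common perf_events use is interval sampling for a CPU profile; the 99 Hertz rate and 30-second window are conventional choices, not requirements:

# perf record -F 99 -a -g -- sleep 30   # sample all CPUs at 99 Hertz, with stacks, for 30s
# perf report                           # interactive summary of the recorded samples
# perf stat -p PID sleep 10             # counter deltas (cycles, instructions, ...) for one process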
perf_events Example
# perf record -e skb:consume_skb -ag
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.065 MB perf.data (~2851 samples) ]
# perf report
[...]
74.42% swapper [kernel.kallsyms] [k] consume_skb
|
--- consume_skb
arp_process
arp_rcv
__netif_receive_skb_core
__netif_receive_skb
netif_receive_skb
virtnet_poll
net_rx_action
__do_softirq
irq_exit
do_IRQ
ret_from_intr
default_idle
cpu_idle
start_secondary
[]
Summarizing stack
traces for a tracepoint

perf_events can do
many things; it's hard to
pick just one example
eBPF
Extended BPF: programs on tracepoints
High performance filtering: JIT
In-kernel summaries: maps
Linux in 3.18? Enhance perf_events/ftrace/...?
# ./bitesize 1
writing bpf-5 -> /sys/kernel/debug/tracing/events/block/block_rq_complete/filter

I/O sizes:
Kbytes : Count
4 -> 7 : 131
8 -> 15 : 32
16 -> 31 : 1
32 -> 63 : 46
64 -> 127 : 0
128 -> 255 : 15
[]
In-kernel summary
SystemTap
Fully programmable, fully featured
Compiles tracing programs into kernel modules
Needs a compiler, and takes time
"Works great on Red Hat"
I keep trying on other distros and have hit trouble in
the past; make sure you are on the latest version.
I'm liking it a bit more after finding ways to use it
without kernel debuginfo (a difficult requirement in
our environment). Work in progress.
Ever be mainline?
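A hedged one-liner sketch, adapted from the SystemTap documentation rather than these slides; probe and variable names vary by SystemTap and kernel version (newer kernels may need syscall.openat):

# stap -v -e 'probe syscall.open { printf("%s(%d) open(%s)\n", execname(), pid(), argstr) }'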
ktap
Sampling, static & dynamic tracing
Lightweight, simple. Uses bytecode.
Suited for embedded devices
Development appears suspended after
suggestions to integrate with eBPF (which itself is
in development)
ktap + eBPF would be awesome: easy,
lightweight, fast. Likely?
sysdig
sysdig: innovative new tracer. Simple expressions:
Replacement for strace? (or "perf trace" will)
Programmable "chisels". Eg, one of mine:

Currently syscalls and user-level processing only. It is
optimized, but I'm not sure it can be fast enough for kernel tracing
# sysdig -c fileslower 1
TIME PROCESS TYPE LAT(ms) FILE
2014-04-13 20:40:43.973 cksum read 2 /mnt/partial.0.0
2014-04-13 20:40:44.187 cksum read 1 /mnt/partial.0.0
2014-04-13 20:40:44.689 cksum read 2 /mnt/partial.0.0
[]
sysdig fd.type=file and evt.failed=true
sysdig evt.type=open and fd.name contains /etc
sysdig -p"%proc.name %fd.name" "evt.type=accept and proc.name!=httpd"
Present & Future
Present:
ftrace can serve many needs today
perf_events some more, esp. with debuginfo
ad hoc SystemTap, ktap, ... as needed
Future:
ftrace/perf_events/ktap with eBPF, for a fully
featured and mainline tracer?
One of the other tracers going mainline?
The Tracing Landscape, Aug 2014
(chart: the tracers placed by Scope & Capability and by Ease of use, from "brutal"
to "less brutal", against Stage of Development (my opinion), from alpha to mature;
plotted: sysdig, perf, ftrace, eBPF, ktap, stap, dtrace)
In Summary...
Plus diagrams for benchmarking, tuning, tracing
Try to start with the questions (methodology), to
help guide your use of the tools
I hopefully turned some unknown unknowns into
known unknowns
References & Links
Systems Performance: Enterprise and the Cloud, Prentice Hall, 2014
http://www.brendangregg.com/linuxperf.html
nicstat: http://sourceforge.net/projects/nicstat/
tiptop: http://tiptop.gforge.inria.fr/
Tiptop: Hardware Performance Counters for the Masses, Erven
Rohou, Inria Research Report, Nov 2011.
ftrace & perf-tools:
https://github.com/brendangregg/perf-tools
http://lwn.net/Articles/604/
eBPF: http://lwn.net/Articles/6033/
ktap: http://www.ktap.org/
SystemTap: https://sourceware.org/systemtap/
sysdig: http://www.sysdig.org/
http://www.slideshare.net/brendangregg/linux-performance-analysis-and-tools
Tux by Larry Ewing; Linux is the registered trademark of Linus Torvalds
in the U.S. and other countries.
Thanks
Questions?
http://slideshare.net/brendangregg
http://www.brendangregg.com
bgregg@netflix.com
@brendangregg
