Query Analysis With Mk-Query-Digest: March 27, 2010
Easy to use
- Consistent options and behavior amongst tools
- Single file per tool, self-contained
- Output is command-line and human friendly
- No need to install, just download
- wget http://www.maatkit.org/get/mk-<toolname>
What is mk-query-digest?
General-purpose “query event processor”
- Extract query events from some input
- Pass the events through a pipeline
- Apply transformations, filtering, aggregation...
- Perform many different types of actions with the events
- Produce some output when done
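To make the pipeline idea concrete, here is a minimal sketch in Python (the tool itself is written in Perl). The tab-separated input format and the stage names `extract`, `keep_selects`, and `aggregate` are made up for illustration; they are not mk-query-digest internals.

```python
from collections import defaultdict

def extract(lines):
    """Input stage: turn raw lines into events (plain dicts).
    The 'seconds<TAB>query' line format is hypothetical."""
    for line in lines:
        seconds, query = line.rstrip("\n").split("\t", 1)
        yield {"Query_time": float(seconds), "arg": query}

def keep_selects(events):
    """Filtering stage: pass through only SELECT statements."""
    for event in events:
        if event["arg"].lstrip().upper().startswith("SELECT"):
            yield event

def aggregate(events):
    """Aggregation stage: call count and total time per query text."""
    totals = defaultdict(lambda: {"calls": 0, "time": 0.0})
    for event in events:
        entry = totals[event["arg"]]
        entry["calls"] += 1
        entry["time"] += event["Query_time"]
    return dict(totals)

raw = [
    "0.010\tSELECT * FROM users",
    "0.002\tINSERT INTO activity VALUES (1)",
    "0.030\tSELECT * FROM users",
]

# Output stage: chain the stages and print a tiny report.
report = aggregate(keep_selects(extract(raw)))
for query, stats in report.items():
    print(f"{stats['calls']} calls, {stats['time']:.3f}s total: {query}")
```

Each stage only consumes and produces events, so stages can be added, removed, or reordered without touching the others.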
Query Events
A data structure that represents an occurrence of a query
- Internally, it’s a Perl hash table
- It’s just a key-value data structure
- In Maatkit-world, we refer to the data as attributes
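As a rough picture of what one event might look like, here is a Python dict standing in for the Perl hash. The attribute names follow those that appear elsewhere in these slides (`db`, `user`, `host`, `ts`, `bytes`); the values, and the assumption that `bytes` is simply the length of the query text, are illustrative only.

```python
# One query event: attribute name -> value (all values are made up).
event = {
    "arg": "SELECT * FROM users WHERE id = 42",  # the query text
    "db": "mydb",
    "user": "app",
    "host": "10.0.0.5",
    "ts": "2010-03-27 12:00:00",
    "Query_time": 0.0031,  # seconds
}

# A derived attribute can be added by a later pipeline stage.
# Assumption: 'bytes' is the byte length of the query text.
event["bytes"] = len(event["arg"].encode("utf-8"))

for name in sorted(event):
    print(f"{name:12} {event[name]}")
```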
# 860ms user time, 50ms system time, 14.89M rss, 20.06M vsz
# Overall: 884 total, 69 unique, 0 QPS, 0x concurrency ___________________
#                    total     min     max     avg     95%  stddev  median
# Exec time             3s   147us    72ms     3ms    10ms     8ms     1ms
# bytes            170.18k      39     569  197.14  563.87  186.59   92.72
Default report for one fingerprint
A report on a single fingerprint’s queries
# Profile
# Rank Query ID           Response time    Calls R/Call   Item
# ==== ================== ================ ===== ======== ===============
#    1 0x8FFEBD609B778EB2     0.6307 25.2%    67   0.0094 INSERT activity
#    2 0x1C55D6804083DB4C     0.3908 15.6%     7   0.0558 SELECT stats_cv
#    3 0x62EC2BC35CD62D85     0.3519 14.1%     7   0.0503 SELECT stats_cv
#    4 0x32AF9886FDBBAE30     0.1513  6.0%   144   0.0011 SELECT frs_file
#    5 0x1929E67B76DC55E7     0.1330  5.3%     3   0.0443 SELECT frs_dlst
#    6 0x60D6962E42C08882     0.1204  4.8%    67   0.0018 SELECT plugins
#    7 0xF2AF6DF05892D03E     0.1128  4.5%     5   0.0226 SELECT doc_data
#    8 0x64F8E6F000640AF8     0.0672  2.7%     3   0.0224 SELECT users
#    9 0x4636BFC0875521C9     0.0664  2.7%    93   0.0007 SELECT supporte
#   10 0x02CC64324ED7CA2C     0.0651  2.6%    12   0.0054 INSERT frs_dlst
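The R/Call column in the profile is the class's total response time divided by its number of calls. A small Python check against three rows copied from the profile above (the helper name `r_per_call` is made up for this sketch):

```python
def r_per_call(response_time, calls):
    """Average response time per call, in seconds."""
    return response_time / calls

# (query ID, total response time in seconds, calls), taken from the profile
rows = [
    ("0x8FFEBD609B778EB2", 0.6307, 67),
    ("0x1C55D6804083DB4C", 0.3908, 7),
    ("0x32AF9886FDBBAE30", 0.1513, 144),
]

for query_id, response_time, calls in rows:
    print(f"{query_id} {r_per_call(response_time, calls):.4f}")
```

This is why rank 4 (144 calls at about 1ms each) sits below rank 2 (7 calls at about 56ms each): the ranking is by total time, not by time per call.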
Sources of input
mk-query-digest understands many types of input
You can also use the --processlist option to get queries from MySQL’s SHOW FULL PROCESSLIST
Important and useful options
mk-query-digest has 50+ command-line options!
- --filter, --since, and --until
- --explain
- --group-by and --order-by
- --print
- --review and --review-history
The --filter option
- You can write arbitrary Perl code to filter and transform events
- Be sure it returns a true value so the pipeline continues
- Example: --filter '$event->{db} eq "mydb"'
- The documentation has more examples
- The --until and --since options are just filters
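The filter idea, and the point that --since and --until are just filters, can be sketched with predicates in Python. This is an analogue, not the tool's Perl code: the helper names are hypothetical, and treating the upper time bound as exclusive is an arbitrary choice made here.

```python
def in_mydb(event):
    """Analogue of: --filter '$event->{db} eq "mydb"'.
    .get() tolerates events that carry no db attribute."""
    return event.get("db") == "mydb"

def time_window(since, until):
    """Build a predicate for since <= ts < until.
    Timestamps are 'YYYY-MM-DD HH:MM:SS' strings, which sort lexically."""
    def predicate(event):
        return since <= event["ts"] < until
    return predicate

def apply_filters(events, predicates):
    """An event continues down the pipeline only if every filter is true."""
    return [e for e in events if all(p(e) for p in predicates)]

events = [
    {"db": "mydb", "ts": "2010-03-27 09:00:00", "arg": "SELECT 1"},
    {"db": "other", "ts": "2010-03-27 09:05:00", "arg": "SELECT 2"},
    {"db": "mydb", "ts": "2010-03-28 09:00:00", "arg": "SELECT 3"},
]

kept = apply_filters(
    events,
    [in_mydb, time_window("2010-03-27 00:00:00", "2010-03-28 00:00:00")],
)
print([e["arg"] for e in kept])
```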
The --explain option
- You can get the database to EXPLAIN your queries
- See the results right in the report
The --group-by option
- Defines how queries are aggregated into classes
- Not quite like SQL’s GROUP BY
- Default is fingerprint
- Special values: distill, tables
- Multiple reports: --group-by fingerprint,tables,distill
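Grouping by fingerprint means queries that differ only in their literal values land in the same class. The sketch below uses a deliberately simplified fingerprint (lowercase, literals replaced by `?`, whitespace collapsed); the real fingerprinting rules are more extensive, so treat this as an approximation of the idea.

```python
import re
from collections import defaultdict

def fingerprint(query):
    """Simplified fingerprint: abstract away literal values."""
    q = query.lower()
    q = re.sub(r"'[^']*'|\"[^\"]*\"", "?", q)  # quoted strings -> ?
    q = re.sub(r"\b\d+\b", "?", q)             # bare numbers -> ?
    return re.sub(r"\s+", " ", q).strip()      # collapse whitespace

def group_by(events, key):
    """Aggregate events into classes keyed by key(event)."""
    classes = defaultdict(list)
    for event in events:
        classes[key(event)].append(event)
    return dict(classes)

events = [
    {"arg": "SELECT * FROM users WHERE id = 42", "Query_time": 0.001},
    {"arg": "SELECT * FROM users  WHERE id = 7", "Query_time": 0.002},
    {"arg": "SELECT * FROM users WHERE name = 'bob'", "Query_time": 0.004},
]

classes = group_by(events, lambda e: fingerprint(e["arg"]))
for fp, members in classes.items():
    print(len(members), fp)
```

Passing a different key function to `group_by` is the analogue of choosing a different --group-by value.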
The --order-by option
- Defines what is “worst”, and how to sort the report
- The sample query is “worst in class” by this criterion
- Default is Query_time:sum
- You could also sort by other pseudo-attributes such as :max
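The choice of aggregate changes which class counts as worst. In this Python sketch the class names and timings are made up: sorting by the sum of query time puts the frequently called INSERT first, while sorting by the maximum puts the single slow SELECT first.

```python
# Query_time values (seconds) per query class; all numbers are made up.
classes = {
    "select * from users where id = ?": [0.001, 0.002, 0.001, 0.002],
    "select * from stats_cv where day = ?": [0.050],
    "insert into activity values (?)": [0.010] * 6,
}

def order_by(classes, aggregate):
    """Sort class names, worst first, by aggregate(Query_time values)."""
    return sorted(classes, key=lambda name: aggregate(classes[name]), reverse=True)

by_sum = order_by(classes, sum)  # analogue of Query_time:sum
by_max = order_by(classes, max)  # analogue of Query_time:max

print("worst by sum:", by_sum[0])
print("worst by max:", by_max[0])
```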
The --print option
- Prints out events in MySQL’s “slow query log” format
- Maatkit uses this format as its lingua franca
- Other Maatkit tools can accept this input (example: mk-query-advisor)
- You probably want to use --no-report with this
The --review option
- Tired of reviewing the same queries every day?
- Store them into a table, with arbitrary meta-data
- See only “new” queries in the report!
- With --review-history, store report stats too
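The review workflow can be pictured as a lookup table keyed by query class. The Python sketch below uses a dict in place of the database table; the column names `reviewed_by` and `comments`, and the rule that a class is hidden once `reviewed_by` is set, are assumptions made for illustration.

```python
def run_review(todays_classes, review_table):
    """Record every class seen and return those not yet marked reviewed."""
    to_show = []
    for fp in todays_classes:
        # New classes are stored with empty review meta-data.
        row = review_table.setdefault(fp, {"reviewed_by": None, "comments": None})
        if row["reviewed_by"] is None:
            to_show.append(fp)
    return to_show

# Stand-in for the review table: one class was already reviewed.
review_table = {
    "select * from users where id = ?": {
        "reviewed_by": "alice",
        "comments": "primary key lookup, fine",
    },
}

todays_classes = [
    "select * from users where id = ?",
    "select * from frs_file where name = ?",
]

print(run_review(todays_classes, review_table))
```

Running it again the next day with the same classes would still show the second query until someone fills in `reviewed_by`.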
Using mk-query-digest with PostgreSQL
Overview
- How to configure logging
- Demonstrations
- Features that aren’t done yet, future work
Choosing a log destination
Configuring postgresql.conf
- You can use either syslog or stderr
- Syslog has some benefits
- For syslog logging:
  log_destination=syslog
  syslog_facility='LOCAL0'
  syslog_ident='postgres'
u => 'user',
d => 'db',
r => 'host', # With port
h => 'host',
p => 'Process_id',
t => 'ts',
m => 'ts', # With milliseconds
i => 'Query_type',
c => 'Session_id',
l => 'Line_no',
s => 'Session_id',
v => 'Vrt_trx_id',
x => 'Trx_id',
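A mapping like the one above, from log_line_prefix escape letters to event attributes, could be used to turn a prefix setting into a parser. The Python sketch below handles only four escapes and is not the tool's implementation; the prefix `%u@%d [%p] ` and the sample log line are made up.

```python
import re

# Subset of the escape-letter -> attribute mapping shown above.
ATTRIBUTE_FOR = {"u": "user", "d": "db", "h": "host", "p": "Process_id"}
# Assumed value patterns for each escape (simplified).
PATTERN_FOR = {"u": r"\S+", "d": r"\S+", "h": r"\S+", "p": r"\d+"}

def prefix_to_regex(prefix):
    """Translate a log_line_prefix string into a regex with named groups."""
    parts = []
    pos = 0
    for match in re.finditer(r"%([a-z])", prefix):
        parts.append(re.escape(prefix[pos:match.start()]))  # literal text
        code = match.group(1)
        parts.append(f"(?P<{ATTRIBUTE_FOR[code]}>{PATTERN_FOR[code]})")
        pos = match.end()
    parts.append(re.escape(prefix[pos:]))
    return re.compile("".join(parts))

regex = prefix_to_regex("%u@%d [%p] ")
line = "app@mydb [4242] LOG:  statement: SELECT 1"
attributes = regex.match(line).groupdict()
print(attributes)
```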
Future Features
Deeper Analysis
- Time-series analysis
- Features for capacity planning (Erlang C math...)
- Session-based analysis
- Drill-down analysis
- R -vs- X analysis
Future
Current Limitations
- CSV log format not supported yet
- Assumes English locale, just like pgfouine
- The helpful hints (SHOW CREATE TABLE) are MySQL-centric