
Query Analysis With Mk-Query-Digest: March 27, 2010

mk-query-digest is a tool from Maatkit that analyzes query events from various sources like MySQL logs. It processes events through a pipeline, applying filters and transformations before producing a report that aggregates events and identifies the most resource-intensive queries. The default report groups queries by fingerprint, sorts them by execution time, and provides statistics to help analyze query performance issues.

Uploaded by

sinleon
Copyright
© Attribution Non-Commercial (BY-NC)

Query analysis with mk-query-digest

March 27, 2010


What is Maatkit?
Command-line Perl utilities, originally for MySQL
- Checksum replicas -vs- their master
- Synchronize data
- Analyze queries
- And dozens more tasks...

Today, Maatkit is useful for many systems


- MySQL
- PostgreSQL
- Web servers
- Memcached
What’s nice about Maatkit
High quality
- Thousands of unit and integration tests
- Complete and accurate documentation

Easy to use
- Consistent options and behavior amongst tools
- Single file per tool, self-contained
- Output is command-line and human friendly
- No need to install, just download
    wget http://www.maatkit.org/get/mk-<toolname>
What is mk-query-digest?
General-purpose “query event processor”
- Extract query events from some input
- Pass the events through a pipeline
- Apply transformations, filtering, aggregation...
- Perform many different types of actions with the events
- Produce some output when done
Query Events
A data structure that represents an occurrence of a query
- Internally, it’s a Perl hash table
- It’s just a key-value data structure
- In Maatkit-world, we refer to the data as attributes

Common query event attributes

Attribute Name   Meaning
arg              The text of the query
cmd              The type of the event
ts               The timestamp of the event
db               Database
Query_time       Execution time / response time
pos_in_log       The event’s byte offset in the input
The event pipeline
How the pipeline works
1. Get events from the source
2. Pre-filter (for options such as --since and --until)
3. Apply some default transformations, such as adding a
fingerprint to the event
4. Apply user-defined filters/functions to the events
5. Special-purpose filters: execute the event against a database,
add a delay into the pipeline, do a “query review”. . .
6. Aggregate events together
7. Compute and display a report
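The pipeline idea can be sketched in a few lines of Python. This is only an illustration of the concept, not Maatkit's Perl implementation: the names `run_pipeline`, `only_db`, and `add_fingerprint` are invented for this example, and the fingerprint step is a trivial stand-in.

```python
from typing import Callable, Iterable, Iterator, Optional, Sequence

Event = dict  # an event is just a key-value structure of attributes

Stage = Callable[[Event], Optional[Event]]


def run_pipeline(events: Iterable[Event], stages: Sequence[Stage]) -> Iterator[Event]:
    """Pass each event through every stage; a stage returning None drops it."""
    for event in events:
        for stage in stages:
            event = stage(event)
            if event is None:
                break  # filtered out, stop processing this event
        else:
            yield event  # survived every stage


def only_db(name: str) -> Stage:
    """Hypothetical filter stage: keep events for one database."""
    return lambda e: e if e.get("db") == name else None


def add_fingerprint(e: Event) -> Event:
    """Stand-in transformation: lowercase and collapse whitespace only."""
    return {**e, "fingerprint": " ".join(e["arg"].lower().split())}


source = [
    {"arg": "SELECT  1", "db": "mydb", "Query_time": 0.002},
    {"arg": "SELECT 2", "db": "other", "Query_time": 0.010},
]
survivors = list(run_pipeline(source, [only_db("mydb"), add_fingerprint]))
print(survivors)
```

Each stage either returns the (possibly modified) event or drops it, which is why filters and transformations can share one interface.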
Fingerprints
What is a fingerprint?
- A fingerprint is a normalized/abstracted form of a query
- Whitespace collapsed, comments removed, literals replaced by ?
- Many other transformations are applied (collapsing IN lists...)
- Example:
    SELECT *
    FROM users
    WHERE id IN('eeyore','pooh'); -- where's piglet?
- Result:
    select * from users where id in(?);
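A rough sketch of this kind of normalization, in Python. It covers only the transformations named on this slide and is far simpler than the rules mk-query-digest actually applies; the function name and the regular expressions are assumptions made for illustration.

```python
import re


def fingerprint(query: str) -> str:
    """Very simplified query normalization (not Maatkit's actual rules)."""
    q = re.sub(r"--[^\n]*", "", query)            # drop "-- ..." comments
    q = re.sub(r"/\*.*?\*/", "", q, flags=re.S)   # drop /* ... */ comments
    q = q.lower()
    q = re.sub(r"'(?:[^'\\]|\\.)*'", "?", q)      # quoted strings -> ?
    q = re.sub(r"\b\d+(?:\.\d+)?\b", "?", q)      # numeric literals -> ?
    # collapse "in (?, ?, ?)" into a single placeholder
    q = re.sub(r"\bin\s*\(\s*\?(?:\s*,\s*\?)*\s*\)", "in(?)", q)
    return re.sub(r"\s+", " ", q).strip()         # collapse whitespace


example = """SELECT *
FROM users
WHERE id IN('eeyore','pooh'); -- where's piglet?"""
print(fingerprint(example))  # select * from users where id in(?);
```

Comments are stripped before string literals here so that the apostrophe in the comment is not mistaken for the start of a string; a real implementation has to handle many more edge cases.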
The default report
1. Fingerprint every event
2. Auto-detect attributes, and run stats on numeric attributes
3. Aggregate events by fingerprint; place them into classes
4. Sort the classes by total Query_time
5. Keep the “short head” and queries that perform very badly
6. Print out a report on each noteworthy class of queries
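Steps 3 to 5 can be sketched as follows, assuming the events already carry a `fingerprint` attribute. The function name `worst_classes` and the sample events are made up for illustration, and the sketch keeps only the top `limit` classes; it does not model the extra rule for queries that perform very badly.

```python
from collections import defaultdict


def worst_classes(events, limit=2):
    """Group events by fingerprint and rank classes by total Query_time."""
    classes = defaultdict(list)
    for event in events:
        classes[event["fingerprint"]].append(event["Query_time"])
    ranked = sorted(classes.items(), key=lambda item: sum(item[1]), reverse=True)
    # (fingerprint, number of events, total Query_time) for the top classes
    return [(fp, len(times), sum(times)) for fp, times in ranked[:limit]]


events = [
    {"fingerprint": "select * from users where id=?", "Query_time": 0.004},
    {"fingerprint": "insert into activity_log values(?)", "Query_time": 0.072},
    {"fingerprint": "select * from users where id=?", "Query_time": 0.008},
    {"fingerprint": "select ? from dual", "Query_time": 0.001},
]
report = worst_classes(events, limit=2)
for fp, count, total in report:
    print(f"{total:.3f}s over {count} events: {fp}")
```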
Demo: the default report
If you want to follow along, try this:
wget http://www.maatkit.org/get/mk-query-digest
wget http://maatkit.googlecode.com/svn/trunk/common/t/samples/pg-sample2
perl mk-query-digest --type pglog pg-sample2
Default report header
Info about the whole report
- This file has 884 query events, and 69 different fingerprints
- There’s a table of stats for the numeric attributes

# 860ms user time, 50ms system time, 14.89M rss, 20.06M vsz
# Overall: 884 total, 69 unique, 0 QPS, 0x concurrency ___________________
#              total     min     max     avg     95%  stddev  median
# Exec time       3s   147us    72ms     3ms    10ms     8ms     1ms
# bytes      170.18k      39     569  197.14  563.87  186.59   92.72
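To make the column meanings concrete, here is a small Python sketch that computes the same seven statistics for a list of values. It uses exact calculations, a nearest-rank 95th percentile, and the population standard deviation; these choices are assumptions, and the tool's own figures may be computed differently, so small discrepancies are to be expected.

```python
import math
import statistics


def summarize(values):
    """Return the seven report columns for one numeric attribute."""
    ordered = sorted(values)
    rank = max(1, math.ceil(0.95 * len(ordered)))  # nearest-rank percentile
    return {
        "total": sum(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "avg": statistics.fmean(ordered),
        "95%": ordered[rank - 1],
        "stddev": statistics.pstdev(ordered),
        "median": statistics.median(ordered),
    }


stats = summarize([1, 2, 3, 4])
print(stats)
```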
Default report for one fingerprint
A report on a single fingerprint’s queries

# Query 1: 0 QPS, 0x concurrency, ID 0x8FFEBD609B778EB2 at byte 97807 ____
# This item is included in the report because it matches --limit.
#              pct   total     min     max     avg     95%  stddev  median
# Count          7      67
# Exec time     22   631ms     4ms    72ms     9ms    14ms     8ms     8ms
# bytes          6  11.73k     155     335  179.31  202.40   31.22  166.51
# Query_time distribution
#   1us
#  10us
# 100us
#   1ms  ################################################################
#  10ms  #############################
# 100ms
#    1s
#  10s+
# Tables
#    SHOW TABLE STATUS LIKE 'activity_log'\G
#    SHOW CREATE TABLE `activity_log`\G
INSERT INTO activity_log <.... rest of query ....>
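The Query_time distribution is a histogram over power-of-ten buckets. A minimal Python sketch of that bucketing follows, assuming each label marks the lower bound of its bucket (so "1ms" covers 1ms up to, but not including, 10ms). That reading is consistent with the sample above, where the minimum is 4ms and the maximum 72ms, but it is an assumption; the function names are invented for this example.

```python
from collections import Counter

LABELS = ["1us", "10us", "100us", "1ms", "10ms", "100ms", "1s", "10s+"]


def bucket_label(microseconds: int) -> str:
    """Map a duration in whole microseconds to its power-of-ten bucket."""
    index, upper_bound = 0, 10
    while microseconds >= upper_bound and index < len(LABELS) - 1:
        index += 1
        upper_bound *= 10
    return LABELS[index]


def distribution(durations_us):
    """Count how many durations fall into each bucket."""
    return Counter(bucket_label(d) for d in durations_us)


counts = distribution([4_000, 8_000, 9_000, 14_000, 72_000])
for label in LABELS:
    print(f"{label:>6} {'#' * counts[label]}")
```

Working in integer microseconds avoids floating-point surprises at the bucket boundaries.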
Default report profile
A profile of the queries, sorted by total response time (R), descending

# Profile
# Rank Query ID           Response time    Calls R/Call   Item
# ==== ================== ================ ===== ======== ===============
#    1 0x8FFEBD609B778EB2     0.6307 25.2%    67   0.0094 INSERT activity
#    2 0x1C55D6804083DB4C     0.3908 15.6%     7   0.0558 SELECT stats_cv
#    3 0x62EC2BC35CD62D85     0.3519 14.1%     7   0.0503 SELECT stats_cv
#    4 0x32AF9886FDBBAE30     0.1513  6.0%   144   0.0011 SELECT frs_file
#    5 0x1929E67B76DC55E7     0.1330  5.3%     3   0.0443 SELECT frs_dlst
#    6 0x60D6962E42C08882     0.1204  4.8%    67   0.0018 SELECT plugins
#    7 0xF2AF6DF05892D03E     0.1128  4.5%     5   0.0226 SELECT doc_data
#    8 0x64F8E6F000640AF8     0.0672  2.7%     3   0.0224 SELECT users
#    9 0x4636BFC0875521C9     0.0664  2.7%    93   0.0007 SELECT supporte
#   10 0x02CC64324ED7CA2C     0.0651  2.6%    12   0.0054 INSERT frs_dlst
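The R/Call column is the class's total response time divided by its number of calls. As a quick check, this Python snippet recomputes it for the first three rows of the profile above (the tuple layout is just for this example):

```python
# (query ID, total response time in seconds, calls), copied from the profile
rows = [
    ("0x8FFEBD609B778EB2", 0.6307, 67),
    ("0x1C55D6804083DB4C", 0.3908, 7),
    ("0x62EC2BC35CD62D85", 0.3519, 7),
]

r_per_call = {query_id: round(total / calls, 4) for query_id, total, calls in rows}
for query_id, value in r_per_call.items():
    print(query_id, f"{value:.4f}")
```

Query 1 ranks first on total time because it runs often, even though each call is cheap; queries 2 and 3 are called rarely but cost about six times more per call.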
Sources of input
mk-query-digest understands many types of input

Argument to --type [1]   Meaning
slowlog (default)        MySQL’s “slow query log”
binlog                   Output of MySQL’s mysqlbinlog program
genlog                   MySQL’s “general log” (log of all queries)
http                     HTTP TCP/IP traffic from tcpdump
pglog                    PostgreSQL logs (stderr or syslog)
tcpdump                  MySQL TCP/IP traffic from tcpdump
memcached                memcached TCP/IP traffic from tcpdump

[1] You can also use the --processlist option to get queries from MySQL’s SHOW FULL PROCESSLIST
Important and useful options
mk-query-digest has 50+ command-line options!
- --filter, --since, and --until
- --explain
- --group-by and --order-by
- --print
- --review and --review-history
The --filter option
- You can write arbitrary Perl code to filter and transform events
- Be sure it returns a true value so the pipeline continues
- Example: --filter '$event->{db} eq "mydb"'
- The documentation has more examples
- The --until and --since options are just filters
The --explain option
- You can get the database to EXPLAIN your queries
- See the results right in the report
The --group-by option
- Defines how queries are aggregated into classes
- Not quite like SQL’s GROUP BY
- Default is fingerprint
- Special values: distill, tables
- Multiple reports: --group-by fingerprint,tables,distill
The --order-by option
- Defines what is “worst”, and how to sort the report
- The sample query is “worst in class” by this criterion
- Default is Query_time:sum
- You could also sort by other pseudo-attributes such as :max
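The effect of an attribute:aggregate ordering can be sketched in Python. This is a conceptual illustration only: the function name, the data layout, and the set of aggregate names in `AGGREGATES` are assumptions for this example rather than a description of the tool's internals.

```python
# Hypothetical aggregate names for this sketch
AGGREGATES = {"sum": sum, "max": max, "min": min, "cnt": len}


def order_classes(classes, spec="Query_time:sum"):
    """Sort class names, worst first, by an "attribute:aggregate" spec."""
    attribute, _, aggregate = spec.partition(":")
    fn = AGGREGATES[aggregate or "sum"]
    return sorted(classes, key=lambda name: fn(classes[name][attribute]), reverse=True)


classes = {
    "frequent but fast": {"Query_time": [0.1, 0.1, 0.1]},
    "rare but slow": {"Query_time": [0.25]},
}
by_sum = order_classes(classes, "Query_time:sum")
by_max = order_classes(classes, "Query_time:max")
print(by_sum)
print(by_max)
```

Changing the aggregate changes which class counts as "worst": the frequent query leads by total time, while the rare one leads by maximum time.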
The --print option
- Prints out events in MySQL’s “slow query log” format
- Maatkit uses this format as its lingua franca
- Other Maatkit tools can accept this input (example: mk-query-advisor)
- You probably want to use --no-report with this
The --review option
- Tired of reviewing the same queries every day?
- Store them into a table, with arbitrary meta-data
- See only “new” queries in the report!
- With --review-history, store report stats too
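The review workflow can be illustrated with a small Python and SQLite sketch. The table name, its three columns, and the helper functions are hypothetical and much simpler than whatever schema the tool really uses; the point is only to show how storing reviewed queries lets a later report skip them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified review table for this sketch
conn.execute(
    "CREATE TABLE query_review ("
    " fingerprint TEXT PRIMARY KEY, reviewed_by TEXT, comments TEXT)"
)


def record(conn, fingerprints):
    """Remember every fingerprint seen; already-known rows are left untouched."""
    conn.executemany(
        "INSERT OR IGNORE INTO query_review (fingerprint) VALUES (?)",
        [(fp,) for fp in fingerprints],
    )


def mark_reviewed(conn, fingerprint, who, note):
    """Attach review meta-data to one stored query."""
    conn.execute(
        "UPDATE query_review SET reviewed_by = ?, comments = ? WHERE fingerprint = ?",
        (who, note, fingerprint),
    )


def unreviewed(conn, fingerprints):
    """Return only the fingerprints nobody has reviewed yet."""
    rows = conn.execute("SELECT fingerprint FROM query_review WHERE reviewed_by IS NULL")
    pending = {row[0] for row in rows}
    return [fp for fp in fingerprints if fp in pending]


day1 = ["select * from users where id=?", "select ? from dual"]
record(conn, day1)
mark_reviewed(conn, "select ? from dual", "dba", "harmless health check")

day2 = day1 + ["delete from sessions where ts<?"]
record(conn, day2)
to_show = unreviewed(conn, day2)
print(to_show)
```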
Using mk-query-digest with PostgreSQL
Overview
- How to configure logging
- Demonstrations
- Features that aren’t done yet, future work
Choosing a log destination
Configuring postgresql.conf
- You can use either syslog or stderr
- Syslog has some benefits
- For syslog logging:
    log_destination=syslog
    syslog_facility='LOCAL0'
    syslog_ident='postgres'
- Send to a separate log file:
    # in /etc/rsyslog.conf
    local0.* -/var/log/pgsql
Query event attributes
What attributes can mk-query-digest extract?
- The query, of course!
- Error message
- Query time (response time)
- Byte offset in the log file
- Timestamp of the event
- Everything in the log line prefix
Configuring logging parameters
Log as much detail as you can
- More is better for performance analysis!
- For syslog logging:
    log_min_duration_statement = 0
    log_connections = on
    log_disconnections = on
- Set the following to prevent duplicate output
    log_statement = 'none' # none, ddl, mod, all
Configuring log line prefix
- Very important way to get event attributes
- Suggested setting:
    log_line_prefix = '%m c=%c,u=%u,D=%d '
- Compatible with pgfouine’s settings, but more is better!


Configuring log line prefix
mk-query-digest recognizes every possible attribute

u => 'user',
d => 'db',
r => 'host', # With port
h => 'host',
p => 'Process_id',
t => 'ts',
m => 'ts', # With milliseconds
i => 'Query_type',
c => 'Session_id',
l => 'Line_no',
s => 'Session_id',
v => 'Vrt_trx_id',
x => 'Trx_id',
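To show how a prefix such as '%m c=%c,u=%u,D=%d ' can turn into event attributes, here is a Python sketch that parses one log line. The sample line is made up, the regular expression handles only a plain "duration ... statement" entry, and matching the one-letter keys case-insensitively is an assumption; the real parser is written in Perl and handles far more cases.

```python
import re

# Subset of the mapping above, copied for this sketch
ATTRIB_NAME_FOR = {"u": "user", "d": "db", "c": "Session_id", "p": "Process_id"}

LINE = re.compile(
    r"^(?P<ts>\d{4}-\d\d-\d\d \d\d:\d\d:\d\d\.\d+ \S+) "   # %m (with time zone)
    r"(?P<meta>\S+) LOG:\s+"                               # c=...,u=...,D=...
    r"duration: (?P<ms>[\d.]+) ms\s+statement: (?P<arg>.*)$"
)


def parse_line(line):
    """Turn one log line into an event dict, or None if it does not match."""
    match = LINE.match(line)
    if match is None:
        return None
    event = {
        "ts": match["ts"],
        "arg": match["arg"],
        "Query_time": float(match["ms"]) / 1000,  # milliseconds -> seconds
    }
    for key, value in re.findall(r"(\w)=([^,\s]+)", match["meta"]):
        name = ATTRIB_NAME_FOR.get(key.lower())
        if name:
            event[name] = value
    return event


# Hypothetical log line in the suggested prefix format
sample = ("2010-02-10 09:03:26.918 EST c=4b72bcae.d01,u=fred,D=sales "
          "LOG:  duration: 0.335 ms  statement: SELECT 1")
event = parse_line(sample)
print(event)
```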
Future Features
Deeper Analysis
- Time-series analysis
- Features for capacity planning (Erlang C math...)
- Session-based analysis
- Drill-down analysis
- R -vs- X analysis
Future
Current Limitations
- CSV log format not supported yet
- Assumes English locale, just like pgfouine
- The helpful hints (SHOW CREATE TABLE) are MySQL-centric

What’s next for PostgreSQL?


- I need to learn about Postgres 9.0’s logging enhancements
- Support for TCP protocol would be great
- Niceties, improvements to fingerprinting
- What do you want?
More Demonstrations
- Demo 0: Look at the report from pg-log-001.txt
- Demo 1: Try out --report-format=profile
- Demo 2: difference between pg-sample2
