0179 Advanced Mysql Performance Optimization
Advanced MySQL
Performance Optimization
© MySQL AB 2005
MySQL Users Conference 2005 , April 18-21 | Advanced MySQL Optimization Tutorial | © MySQL AB 2005 | www.mysql.com
Introductions
• Peter Zaitsev, MySQL Inc
– Senior Performance Engineer
– MySQL Performance Group Manager
– MySQL Performance consulting and partner relationships
• Tobias Asplund
– MySQL Training Class Instructor
– MySQL Performance Tuning Consultant
Table of Contents
• A bit of Performance/Benchmarking theory
• Application Architecture issues
• Schema design and query optimization
• Server Settings Optimizations
• Storage Engine Optimizations
• Replication
• Clustering
• Hardware and OS optimizations
• Real world application problems
Question Policy
• Interrupt us if something is unclear
• Keep long generic questions to the end
• Approach us during the conference
• Write us: [email protected], [email protected]
Defining Performance
• Simple word but many meanings
• Main objective:
– Users (direct or indirect) should be satisfied
• Most typical performance metrics
– Throughput
– latency/response time
– Scalability
– Combined metrics
Throughput
• Metric: Transactions per time (second/min/hour)
– Only some transactions from the mix can be counted
• Example: TPC-C
• When to use
– Interactive multi user applications
• Problems:
– “starvation” - some users can be waiting too long
– single user may rather need his request served fast
Response Time/Latency
• Metric: Time (milliseconds, seconds, minutes)
– derived: average/min/max response time
– derived: 90th percentile response time
• Example: sql-bench, SetQuery
• When to use
– Batch jobs
– together with throughput in interactive applications
• Problems:
– Counts wall clock time, does not take into account what else
is happening
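The derived metrics above can be computed directly from raw timings. This is a minimal sketch, assuming response times are collected in milliseconds and using the nearest-rank definition of a percentile (one of several conventions; the slides do not say which is meant). The function name and the sample data are illustrative only.

```python
import math

def summarize(times_ms):
    """Return min, max, average and nearest-rank 90th percentile."""
    ordered = sorted(times_ms)
    # Nearest-rank: smallest value with at least 90% of samples at or below it.
    rank = math.ceil(len(ordered) * 90 / 100)
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "avg": sum(ordered) / len(ordered),
        "p90": ordered[rank - 1],
    }

# Illustrative sample: nine fast requests and one slow outlier.
sample = [12, 15, 11, 14, 13, 12, 16, 15, 14, 900]
stats = summarize(sample)
print(stats)
```

In this sample the single outlier dominates the average, while the 90th percentile stays close to what most requests experienced. This is one reason to report both.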
Scalability
• Metric: Ability to maintain performance with changing
– load (incoming requests)
– database size
– concurrent connections
– hardware
• Different performance metric
• “maintain performance” typically defined as response time
• When to use
– Capacity planning
Queuing Theory
• Multi User applications
• Request waits in queue before being processed
• User response time = queueing delay + service time
– Non high tech example – support call center.
• “Hockey Stick” - queuing delay grows rapidly as the system
gets close to saturation
• Need to improve queueing delay or service time to
improve performance
• Improving service time reduces queuing delay
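The "hockey stick" can be illustrated with a textbook single-server queue. This sketch assumes an M/M/1 model (random arrivals, exponential service times), which the slides do not name; under that assumption mean response time is R = S / (1 - utilization). The 10 ms service time is made up for illustration.

```python
def response_time(service_time, utilization):
    """Mean response time of an M/M/1 queue: R = S / (1 - rho)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)

service = 0.010  # assumed 10 ms service time
for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
    r = response_time(service, rho)
    print(f"utilization {rho:.2f}: response {r * 1000:.0f} ms, "
          f"queueing delay {(r - service) * 1000:.0f} ms")
```

At a fixed arrival rate, a shorter service time also lowers utilization, so the queueing delay shrinks as well. This is the effect the last bullet refers to.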
Benchmarks
• Great tool to:
– Quantify application performance
– Measure performance effect of the changes
– Validate Scalability
– Plan deployment
• But
– Can be very misleading if done wrong
Application Architecture
• Designing Scalable application Architecture
• Role of Caching
• Replication/Partition/Clustering
• Architectural notes for C/Perl/PHP/Java/.Net
• Application level performance analysis
Architecture Design
• Try to localize database operations
– “to change this we need to fix 15000 queries”
• Write code in “black boxes”
– control side effects
– be able to do local re-architecting
• Think a bit ahead
– 1 hour of work today may save a week of work in a year
• Do not trust claims or your gut
– run benchmarks early to check you're on the right track
• Scale Out
– One 32-CPU box vs. twenty 2-CPU boxes
– The “Google Way”
Magic of Caching
• Most applications benefit from some form of caching
• For many, caching is the only optimization needed
• Many forms of caching
– HTTP Server side proxy cache
– Pre-parsed template cache
– Object cache in the application
– Network distributed cache
– Cache on file system
– Query cache in MySQL
– HEAP/MyISAM tables as cache
– Database buffers cache
Proxy Cache
• External request-response cache
• Useful when data does not change
• Must have for static, semi-static web sites
• Can be just overhead for dynamic only
• Problems with cache invalidation
– Protocol level control may not suit the application
• Too high level
– Can't cache even if difference minimal
• Security issues
– storing sensitive data on the disk
– disclosing data to wrong user
Object/Functional cache
• Cache results of functions or objects
– for example user profile
• Will work for different templates and data presentations
– Post in LiveJournal appears in a lot of “friend” pages
• Caching in application – simple
– address space limit on 32bit systems
– Limited to memory on single system
– Multiple copies of same object
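A minimal sketch of an in-application object cache, using Python's standard `functools.lru_cache`. The function `load_profile_from_db` is a hypothetical stand-in for a real database query, and the size limit of 1024 entries is an arbitrary assumption.

```python
from functools import lru_cache

calls = {"count": 0}

def load_profile_from_db(user_id):
    """Hypothetical stand-in for a real database query."""
    calls["count"] += 1
    return {"id": user_id, "name": f"user{user_id}"}

@lru_cache(maxsize=1024)  # bounded, since memory on one system is limited
def get_profile(user_id):
    return load_profile_from_db(user_id)

get_profile(7)
get_profile(7)  # served from the in-process cache, no second "query"
print(calls["count"])
```

Each process keeps its own copy of the cached objects, which is the "multiple copies of same object" cost listed above. Invalidation on update (for example `get_profile.cache_clear()`) is left to the application.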
Database/OS Buffers
• Data and indexes cached in Database or OS buffers
• Provided automatically, usually present
– Controlled by MySQL server and OS settings
• Fully transparent
• Very important to take into account
– Access to data in memory is up to thousands of times faster than on
disk.
• Working set should fit in memory
– Meaning load should be CPU bound
– Often great way to ensure performance
– Not always possible
Number of Connections
• Many Established connections take resources
• Frequent connection creation takes resources
– not as much as people tend to think
• Peak performance is reached at a small number of running
queries
– CPU cache, disk thrashing, available resources per thread
– Limit concurrency for complex queries
• SELECT GET_LOCK('search',10)
• Use connection pool of limited size
• Limit the number of connections that can be established at a time
– FastCGI, Server side proxy for web world
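A connection pool of limited size can be sketched with a standard blocking queue. This is an illustration only: `BoundedPool` is a hypothetical class, and `object` stands in for a real `connect()` call from a MySQL driver.

```python
import queue

class BoundedPool:
    """Fixed-size pool; callers wait instead of opening more connections."""

    def __init__(self, connect, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    def acquire(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        self._idle.put(conn)

# `object` stands in for a real connect() call from a MySQL driver.
pool = BoundedPool(connect=object, size=2)
a = pool.acquire()
b = pool.acquire()
try:
    pool.acquire(timeout=0.01)
    exhausted = False
except queue.Empty:
    exhausted = True
pool.release(a)
pool.release(b)
print("third caller had to wait:", exhausted)
```

Because callers block when the pool is empty, the pool size is also an upper bound on the number of queries running on the server at once.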
Replication
• Broad sense – getting multiple copies of your data
• Very powerful tool, especially for read mostly applications
• MySQL Replication (Will discuss later)
• Manual replication
– more control, tricky to code, can be synchronous
• Replication from other RDBMS
– GoldenGate, used e.g. at Sabre
• Just copy MyISAM tables
– Great for processed data which needs to be distributed
• Many copies: Good for HA, Waste of resources, expensive
to update
Partitioning
• Local partitioning: MERGE Tables
– Logs, each day in its own table.
• Remote partitioning – several hosts
– example: by hash on user name
– very application dependent
• Manual partitioning across many tables
– Easy to grow to remote partitioning
– Easy to manage (e.g. OPTIMIZE TABLE)
– Fight MyISAM table locks.
• May need copies of data partitioned a different way
• No waste of resources. Efficient caching
• Can be mixed with replication for HA
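Remote partitioning "by hash on user name" can be sketched as follows. The host names are hypothetical, and CRC-32 with a plain modulo is just one possible choice of hash and mapping, not something the slides prescribe.

```python
import zlib

HOSTS = ["db0.example.com", "db1.example.com", "db2.example.com"]  # hypothetical

def host_for_user(user_name, hosts=HOSTS):
    """Pick a host by a stable hash of the user name."""
    # zlib.crc32 is stable across processes, unlike Python's built-in hash().
    bucket = zlib.crc32(user_name.encode("utf-8")) % len(hosts)
    return hosts[bucket]

print(host_for_user("alice"), host_for_user("bob"))
```

With a plain modulo, changing the number of hosts moves most users to a different host, so growth needs a migration plan or a lookup table instead.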
Clustering
• Clustering – something automatic to get me HA,
performance
• Manual clustering with MySQL Replication (more later)
• Clustering with shared/replicated disk/file system
– Products from Veritas/Sun/Novell
– Build your own using Heartbeat
– Innodb, Read-only MyISAM
– Does not save from data corruption
– Active-Passive – waste of resources
– Share Standby box to reduce overhead
– Switch time can be significant
– ACID guarantees – no transaction loss
Clustering II
• MySQL Cluster (Storage Engine)
– Available in MySQL 4.1-max binaries
• MySQL 5.0 will have a much improved version
– Shared nothing architecture
• Replication + automatic hash partition
– Many MySQL servers using many storage nodes
– Synchronous replication, row level
– Requires fast system network for good performance
– Very much into providing uptime
• including online software update
– In memory only at this point. With disk backup.
– Fixed cluster setup – can't add more nodes as you grow
Clustering III
• Third party solutions – EMIC Application Cluster
– Nice convenient tools, easy to use
– Commercial,
– Patched MySQL version required
– Synchronous replication, Statement level
– Full data copy on each node
– Limited scalability for writes, good for reads
– Very transparent. Only need to reconnect
– No multi statement transactions support
– Some minor features are not supported
• e.g. server variables
– Developing quickly; check with EMIC Networks
C/C++ considerations
• Native C interface is the fastest interface
– “reference” interface which Java and .NET reimplement
– Most tested. Used in main test suite, Perl DBI, PHP, etc.
– Very simple. You may want a fancier wrapper around it
– Make sure to use threaded library version if using threads
– Only one thread can use a connection at a time
• use proper locking
• connection pool shared by threads is a good solution
– Better to use a client library with the same major version as the server
– Prepared statements can be faster and safer
• ODBC – great for compatibility
– performance overhead
– harder to use
Perl
• Use the latest DBD driver; it supports prepared statements
• When using with an HTTP server, use mod_perl or FastCGI
• Do not forget to free result, statement resources
– This is very frequently forgotten in scripting languages
• Beware of large result sets.
– Set mysql_use_result=1 for these, so rows are streamed instead of buffered on the client
• Pure Perl DBD driver for MySQL exists
– For platforms where you can't get DBI/DBD compiled
– Has lower performance
• Special presentation on Perl topic by Patrick
PHP
• Standard MySQL Interface
– compatibility
• mysqli interface in PHP 5
– Object mode
– prepared statements like interface
• safer
– Support for prepared statements
– Faster
• PEAR DB
– Slower
– Compatibility, support multiple databases
– Object interface, prepared statements like interface
– PHP5 Presentation
Java
• Centralize code that deals with the database
– Change persistence strategies without rewrite
• Keep SQL out of your code
– Makes changes/tuning possible without recompiling
• Use connection pooling, do not set pool size too large
• Do not use “autoReconnect=true”, catch exceptions
– it can lead to hard to catch problems
• Use Connector/J's 'logSlowQueries'
– It shows slow queries from client perspective
• Try to use Prepared Statements exclusively
– Normally faster
– More Secure (harder to do SQL Injection)
.NET
• Try to use prepared statements as much as possible.
• Close all connections you open
– Simple but very typical problem
• Use ExecuteReader for all queries where you are just
iterating over the rows
– DataSets are slow and should only be used when you really
need access to all of the rows on the client
• Handle Disconnect and other exceptions
– No auto-reconnect support so less room for error
Schema design
• Optimal schema depends on queries you will run
• Data size and cardinality matter
• Storing data outside of database or in serialized form
– XML, Images etc
• Main aspects of schema design:
– Normalization
– Data types
– Indexing
Normalization
• Normalized in simple terms
– all “objects” in their own tables, no redundancy
– Simple to generate from ER diagram
– Compact, single update to modify object property
– Joins are expensive
– Limited optimizer choices for selection, sorting
• select * from customer, orders where
customer_id=order_id and order_date='2004-01-01' and
customer_name='John Smith'
– Generally good for OLTP with simple queries
Non-Normalized
• Non-Normalized
– Store all customer data with each order
– Huge size overhead
– Data updates are complex
• To change customer name may need to update many rows.
– Careful with data loss
• delete the last order and there is no data about the customer any more
– No join overhead, more optimizer choices
• select * from orders where order_date='2004-01-01' and
customer_name='John Smith'
Normalization: Mixed
• Using normalized for OLTP and non-normalized for DSS
• Materialized Views
– No direct support in MySQL but can create MyISAM table
• Caching some static data in the table
– both “city” and “city_id” columns
• Keep some data non-normalized and pay for updates
• Use value as key for simple objects
– IP Address, State
• Reference by PRIMARY/UNIQUE KEY
– MySQL can optimize these by pre-reading constant values
• select city_name from city,state where state_id=state.id
and state.code='CA' converted to select city_name from
city where state_id=12
Data Types
• Use appropriate data type – do not store number as string
– “09” and “9” are same number but different strings
• Use appropriate length.
– tinyint is perhaps enough for person age
• Use NOT NULL if you do not plan to store NULLs
• Use appropriate char length. VARCHAR(64) for name
– some buffers are fixed size in memory
– sorting files, temporary tables are fixed length
• Check automatically converted schemas
– DECIMAL can be placed instead of INT etc
Indexing
• Index helps to speed up retrieval but expensive to maintain
• MySQL can only use prefix of index
– key (a,b) .... where b=5 will not use index.
• Index should be selective to be helpful
– index on gender is not a good idea
• Define UNIQUE indexes as UNIQUE
• Make sure to avoid dead indexes
– never used by any query
• Order of columns in BTREE index matters
• Avoid duplicates - two indexes on the same column(s)
• An index that is a prefix of another index is rarely a good idea
– remove index on (a) if you have index on (a,b)
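The prefix rule can be modelled in a few lines. This is a deliberately simplified sketch that only considers "=" conditions and ignores ranges, sorting and any other optimizer behaviour; the function name is illustrative.

```python
def usable_prefix_length(index_columns, equality_columns):
    """Count leading index columns constrained by '=' in the WHERE clause."""
    used = 0
    for column in index_columns:
        if column not in equality_columns:
            break  # a gap ends the usable prefix
        used += 1
    return used

print(usable_prefix_length(("a", "b"), {"b"}))       # key (a,b), where b=5
print(usable_prefix_length(("a", "b"), {"a"}))       # key (a,b), where a=5
print(usable_prefix_length(("a", "b"), {"a", "b"}))  # key (a,b), where a=5 and b=5
```

The same model shows why an index on (a) is redundant next to (a,b): every lookup the shorter index can serve is also a prefix of the longer one.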
Indexing
• Covering index – save data read, faster scans with long
rows
– select name from person where name like '%et%'
• Prefix index for data selective by first few chars
– key(name(8))
• Short keys are better, Integer best
• Close key values are better than random
– access locality is much better
– auto_increment better than uuid()
• OPTIMIZE TABLE – compact and sort indexes
• ANALYZE TABLE - update statistics
Index Types
• BTREE
– default key type for all but HEAP
– helps “=” lookups as well as ranges, sorting
– supported by all storage engines
• HASH
– Fast, smaller footprint
– only exists for HEAP storage engine
• slow with many non-unique values
– Only helpful for full “=” lookups (no prefix)
– can be “emulated” by CRC32() in other storage engines
• select * from log where url='http://www.mysql.com' and
url_crc=crc32('http://www.mysql.com');
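The checksum for the emulated hash lookup can also be computed in the application before the query is sent. This sketch assumes that MySQL's CRC32() returns the same standard CRC-32 value as zlib; verify this against your server before mixing values computed on both sides. The `%s` placeholder style depends on the driver and is an assumption here.

```python
import zlib

def url_crc(url):
    """32-bit unsigned CRC of the URL, to store in an indexed integer column."""
    return zlib.crc32(url.encode("utf-8")) & 0xFFFFFFFF

url = "http://www.mysql.com"
# Compare the full URL as well: different URLs can share a CRC value.
query = "SELECT * FROM log WHERE url_crc = %s AND url = %s"
params = (url_crc(url), url)
print(query, params)
```

The index goes on the short integer column `url_crc`; the comparison on `url` only filters out the rare collisions.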
Designing queries
• General notes
• Reading EXPLAIN output
• Understanding how optimizer works
• What exactly happens when query is executed
• Finding problematic queries
• Checking up against server performance data
General notes
• Know how your queries are executed
– On the real data, not on a 10-row pet table.
• Watch for query plan changes with upgrades, data change
• Do not assume a query that executes fast on other
databases will do so on MySQL.
• Use proper types in text mode queries
– int_col=123 and char_col='123'
• Use temporary table for caching
• Sometimes many queries work better than one
– and are easier to debug than a 70K query joining 25 tables
Reading EXPLAIN
• type – how table is accessed (most frequent)
– “ALL” - full table scan
– “eq_ref” - “=” reference by primary or unique key (1 row)
– “ref” - “=” by non-unique key (multiple rows)
– “range” - reference by “>”, “<” or complex ranges
• possible_keys - indexes MySQL could use for this table
– check their list matches what you expect
• key – index MySQL selected to use
– only one index per table in MySQL 4.1 (fixed in 5.0)
– Make sure it is correct one(s)
• key_length - Used key length in bytes
– Check expected length is used for multiple column indexes
Reading EXPLAIN
• “ref” - The column or constant this key is matched against
• “rows” - How many rows will be looked up in this table
– Multiply number of rows for tables in single select to estimate
complexity
• “extra” - Extra Information
– “Using Temporary” - temporary table will be used
– “Using Filesort” - external sort is used
– “Using where” - some where clause will be resolved with this
table read
mysql> explain select * from t1,t2 where t1.i=t2.i order by t1.i+t2.i;
+----+-------------+-------+------+---------------+------+---------+------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra                           |
+----+-------------+-------+------+---------------+------+---------+------+-------+---------------------------------+
|  1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 36864 | Using temporary; Using filesort |
|  1 | SIMPLE      | t2    | ALL  | NULL          | NULL | NULL    | NULL | 36864 | Using where                     |
+----+-------------+-------+------+---------------+------+---------+------+-------+---------------------------------+
2 rows in set (0.00 sec)
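The "multiply the rows" rule applied to the EXPLAIN output above, as a small sketch. The function name is illustrative, and the result is only a rough complexity estimate since "rows" values are themselves estimates.

```python
from math import prod

def estimated_row_combinations(rows_per_table):
    """Multiply the EXPLAIN 'rows' estimates of all tables in one SELECT."""
    return prod(rows_per_table)

# 'rows' values for t1 and t2 from the EXPLAIN output above.
estimate = estimated_row_combinations([36864, 36864])
print(estimate)
```

For this unindexed join the estimate is about 1.36 billion row combinations, which is why a full scan on both tables is a warning sign.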
Simple Example
• SELECT City.Population/Country.Population FROM
City,Country WHERE CountryCode=Code;
• MySQL needs to select table order
– Scanning City and checking Country for each
– Scanning Country and checking all Cities for it
• For each table order, different keys can be used
• Search set too large – not all possibilities tested
• Next Step: Optimize order by/group by if present
– Should use index to perform sort ? Filesort ?
– Should use temporary table or sort for group by ?
General Settings
• --character-set
– use simple character set (e.g. latin1) if single language
• --join_buffer_size
– buffer used for executing joins with no keys. Avoid these
• --binlog_cache_size
– when --log-bin enabled. Should fit most transactions
• --memlock
– lock MySQL in memory to avoid swapping
• --max_allowed_packet
– should be large enough to fit the largest query
• --max_connections
– number of connections server will allow. May run out of
memory if too high
MySQL Status
• Aborted_clients - are you closing your connections ?
– if that is not the cause, check network and max_allowed_packet
• Aborted_connects – should be zero
– Network problems, wrong host, password, invalid database
• Binlog_cache_disk_use (1), Binlog_cache_use (2)
– If the ratio (1)/(2) is large, increase binlog_cache_size
• Bytes_received/Bytes_sent - Traffic to/from server
– Can the network handle it? Is it expected?
• Com_* - Different commands server is executing
– Com_select – number of selects, excluding served from
query cache
– Shows load information per query type
– Are all of them expected? e.g. Com_rollback
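The binlog cache check above is a simple ratio of two counters. A minimal sketch, assuming the counters have already been read into a dictionary; the sample values are made up and the function name is illustrative.

```python
def binlog_cache_disk_ratio(status):
    """Fraction of transactions that overflowed the binlog cache to disk."""
    used = status["Binlog_cache_use"]
    if used == 0:
        return 0.0  # no transactions used the cache yet
    return status["Binlog_cache_disk_use"] / used

# Made-up counter values, as SHOW STATUS might report them.
sample_status = {"Binlog_cache_disk_use": 150, "Binlog_cache_use": 1000}
print(binlog_cache_disk_ratio(sample_status))
```

What counts as "large" is a judgment call; the slides give no threshold.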
MySQL Status
• Connections – number of new connections established
– a very high number may call for connection pooling.
• Created_tmp_tables - internal temporary tables created
for some query executions.
– sometimes can be avoided with proper indexes
• Created_tmp_disk_tables – table taking more than
tmp_table_size will be converted to MyISAM disk table
– if BLOB/TEXT is selected, a disk based table is used from the start
– look at increasing tmp_table_size if value is large
• Created_tmp_files – temporary files used for sort and
other needs.
Status
• Max_used_connections – maximum number of
connections used
– check if it matches max_connections
• if so, max_connections is too low or it is a sign of overload.
• Open_files - number of files opened, watch for the limits
– Storage engines (e.g. Innodb) may have more
• Open_tables - number of currently open tables
– single table opened two times is counted as two
– check it against table_cache, it should be large enough
• Opened_tables – number of times table was opened
(table_cache miss)
– check how many opens per second are happening, increase
table_cache if many
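Counters such as Opened_tables are cumulative, so "opens per second" comes from two samples taken some time apart. A minimal sketch with made-up sample values; the function name and the 60-second interval are illustrative assumptions.

```python
def per_second(before, after, interval_seconds, counter="Opened_tables"):
    """Rate of change of a cumulative status counter between two samples."""
    return (after[counter] - before[counter]) / interval_seconds

# Made-up samples taken 60 seconds apart.
first = {"Opened_tables": 12000}
second = {"Opened_tables": 12600}
rate = per_second(first, second, 60)
print(rate)
```

The same calculation works for any cumulative counter, for example Questions or Connections.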
Server Status
• Questions – number of questions server got
– all of them including malformed queries
– good rough load indicator for stable load mix
• Select_full_join – number of joins without indexes
– should be zero, these are real performance killers
• Select_full_range_join - number of joins with range lookup
on referenced table
– potentially slow. Good optimization candidates
• Select_range – number of joins with range lookup on first
table
– typically fine
Server Status
• Select_range_check – joins when key selection is to be
performed for each row
– large overhead, check query plan
• Select_scan – joins with full table scan on first table
– check if it can be indexed
• Slow_launch_threads – threads that took more than
slow_launch_time to create
– connection delay
• Slow_queries – queries considered to be slow
– logged in slow_query_log if it is enabled
– taking more than long_query_time seconds to run
– doing full table scan, if log_queries_not_using_indexes is
specified
– check query plans
Storage Engines
• MyISAM specific Optimizations
• Innodb specific Optimizations
• Heap Specific Optimizations
• Power of multiple Storage Engines
• Designing your own storage engine
MyISAM
• MyISAM Properties
– no transactions, will be corrupted on power down
– small disk and memory footprint
– packed indexes, works without indexes, FULLTEXT, RTREE
– table locks, concurrent inserts
– read-only packed version
– only index is cached by MySQL, data by OS
• Typical MyISAM usages:
– Logging applications
– Read only/read mostly applications
– Full table scan reporting
– Bulk data loads, data crunching
– Read/write with no transactions, low concurrency
– Buffer Pool
• Buffer pool size 24576, Free buffers 0, Database pages 23467, Modified db pages 0
– Log activity
• 5530215 log i/o's done, 0.00 log i/o's/second
– Row activity
• 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 242.44 reads/s
MySQL Replication
• MySQL Replication Architecture
• Setting up MySQL Replication
• Replication concepts for your application
• Bidirectional, Circular replication issues
• Fallback/Recovery in MySQL Replication
Replication Options
• --log-slave-updates – log updates from slave thread
– useful for chain replication, using slave for backup
• --read-only - do not allow updates to the slave server
– useful as protection from application errors.
• --replicate-do-table,--replicate-wild-do-table – specify
tables, databases to replicate
– avoid using --replicate-do-db
• --slave_compressed_protocol=1 Use compressed
protocol
– useful for replication over slow networks
• --slave-skip-errors - continue replication despite the specified errors
• --sync_binlog=1 - Sync binlog on each commit
– if you want to continue after master restart from crash
Replication concepts
• Master -> Slave
– Most simple one, gives some HA and performance
• Master <-> Master
– write to both nodes, simple fall back, update conflict problem
• Master -> Slave1...SlaveN
– Great for mostly read applications, easy slave recovery
– More complex fall back, resource waste – many copies
– Write load does not scale well.
• Master1 -> Slave1, Master2 -> Slave2 ...
– Replication together with data partition.
– Can be used in bi-directional mode too
– Limited resource waste, good write load scalability
– Can have several slaves in each case
Bi-Directional Replication
• Master1 <-> Master2
– Writing to both nodes – update conflicts, no detection
• Due to asynchronous replication
• auto increment values collide
– MySQL 5.0 --auto-increment-offset=N
• updates can be lost
– Make sure no conflicting updates if both Masters writable
• Check if queries can be executed in any order
– UPDATE TBL SET val=val+1 WHERE id=5
• Partition by tables/ objects
– Master1 works with even IDs Master2 with odd
– Writing to one of them at a time
• Other protected by --read-only
• Easy to fall back – no need to reconfigure
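The even/odd ID split can be simulated to show why auto increment values stop colliding. This sketch assumes the offset is combined with the companion option --auto-increment-increment (step size), which the slide does not name, and it models only the arithmetic, not the server's actual behaviour.

```python
def generated_ids(offset, increment, count):
    """IDs a master would hand out: offset, offset + increment, ..."""
    return [offset + i * increment for i in range(count)]

# Two masters sharing a step of 2: Master1 gets even IDs, Master2 odd IDs.
master1 = generated_ids(offset=2, increment=2, count=5)
master2 = generated_ids(offset=1, increment=2, count=5)
print(master1, master2)
```

Disjoint ID ranges only prevent key collisions on insert. Conflicting updates to the same existing row can still be lost, as noted above.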
Chain,Circular Replication
• Chain Replication
– Slave1->Slave2->Slave3
– Can be used as “tree” replication if there are too many slaves
– HA – if middle node fails, all below it stop getting updates
– Complex rule to find proper position for each on recovery
• Circular Replication
– Slave1->Slave2->Slave3->Slave1
– Same problems as in Bi-Directional replication
– Same HA issues as Chain Replication
Hardware,OS, Deployment
• Hardware selection for MySQL
• Hardware Configuration
• OS Selection
• OS Configuration
• Physical Deployment
Hardware Selection
• CPU: Consider 64bit CPUs
– EM64T/Opteron are best price/performance at this point
• CPU Cache – Larger, better
– CPU Cache benefit depends on workload
• 1MB->2MB seen to give from 0 to 30% extra
• Large number of threads benefit from increased size
• Memory Bandwidth – Frequent bottleneck for CPU bound
workloads
– Fast memory, dual channel memory, dedicated bus in SMP
• Number of CPUs: Single query uses single CPU
– multiple queries scale well for multiple CPUs
• consider locks the storage engine is setting for you
• HyperThreading – gives improvement in most cases
Hardware Selection II
• System Bus - can be overloaded on high load
– different buses of IO, Network may make sense
• Video Card, Mouse, Keyboard
– MySQL Server does not care :)
• Network card
– Watch for latency, 1Gb Ethernet is good
– CPU offloading (Checksum generation etc)
• check for driver support
• Extension possibilities
– Can you add more memory ? More disks ?
Disk IO Subsystem
• Need RAID to ensure data security
– Slaves could go with RAID0 for improved performance
• RAID10 – best choice for many devices
– RAID1 if you have only two disks
• RAID5 – very slow for random writes, slow rebuild
– cheaper drives in RAID10 usually work better
• Battery backed up write cache
– truly ACID transactions with small performance hit
• Multiple channels good with many devices
• Software RAID1/RAID10 typically good as well
– random IO does not eat much of CPU time
• Use large RAID chunk (256K-1MB)
Disk IO Subsystem
• Compute your IO needs – drive can do (150-250 IO/sec)
• Test whether your RAID gives you the performance it should
– SysBench http://sourceforge.net/projects/sysbench
• Test if Hardware/OS really syncs data to disk
– Or bad corruption may happen, especially with Innodb
• SAN – easy to manage but slower than direct disks
• NAS, NFS – Test very carefully
– works for logs, binary logs, read only MyISAM
– a lot of reported problems with Innodb
• Place Innodb logs on dedicated RAID1 if a lot of devices
– otherwise sharing works well
– OS could use the same drive
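The first bullet on this slide, computing IO needs, comes down to a division rounded up. A rough sketch using the per-drive range from the slide and an assumed workload of 1000 random IO/sec; it ignores RAID write overhead and caching, so treat the result as a lower bound.

```python
import math

def drives_needed(required_io_per_sec, io_per_sec_per_drive):
    """Minimum number of drives to serve a random IO rate."""
    return math.ceil(required_io_per_sec / io_per_sec_per_drive)

# Assumed workload of 1000 random IO/sec; per-drive range from the slide.
slow_drives = drives_needed(1000, 150)
fast_drives = drives_needed(1000, 250)
print(slow_drives, fast_drives)
```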
Hardware configuration
• Mainly make sure it works as it should
– sometimes bad drivers are guilty
• Does your IO system deliver proper throughput ?
– check both random and sequential read/writes
– Cache set to proper mode ?
• good to benchmark settings, e.g. read-ahead
• Is your network set to the proper mode (e.g. 1Gb/full duplex) ?
– CPU offloading works ? Any errors ?
– What about the interrupt rate ?
• Some drivers seem to have problems with buffering, taking an
interrupt for each packet
• Test memory with memtest86 if unsure
– broken memory frequent source of MySQL “bugs”
OS Selection
• MySQL supports a wide range of platforms
– Linux, Windows, Solaris are most frequently used
• all three work well
– Better to use an OS that MySQL delivers packages for
– RedHat, Fedora, SuSE, Debian, Gentoo – most frequent
• Any decent distribution works
• Get MySQL server from http://www.mysql.com
– Ensure vendor can help you – we can't fix some OS bugs
• Watch for good thread support
– Kernel-level threads library for SMP support
– Older FreeBSD, NetBSD had some issues
• Make sure your memory is addressable by the OS
• Make sure all your hardware is well supported by the OS
OS Configuration
• Allow large process sizes
– MySQL Server is a single process
• Allow a decent number of open files, especially for MyISAM
• If possible lock MySQL in memory (e.g. --memlock)
• Make sure VM is tuned well to avoid swapping
– and size MySQL buffers well
• Tune read-ahead. Too large a read-ahead limits random IO performance
• Set proper IO scheduling configuration (elevator=deadline for Linux 2.6)
• Use large pages for the MySQL process if the OS allows, e.g.
– --large-pages option in 5.0 for Linux
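On Linux 2.6 the active IO scheduler and read-ahead for a block device can be read back from sysfs, which helps confirm that a boot option such as elevator=deadline actually took effect. The sketch below assumes the standard `/sys/block/<device>/queue/` layout; the function names and the device name `sda` are illustrative, not part of the original slides.

```python
def active_scheduler(scheduler_line):
    """Extract the active scheduler from a sysfs scheduler line.

    The kernel lists all schedulers and marks the active one with
    brackets, e.g. "noop anticipatory [deadline] cfq".
    """
    for token in scheduler_line.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # No brackets found: return the line as-is rather than guess.
    return scheduler_line.strip()


def read_queue_setting(device, name, sysfs_root="/sys/block"):
    """Read one queue setting (e.g. "scheduler", "read_ahead_kb")."""
    with open(f"{sysfs_root}/{device}/queue/{name}") as f:
        return f.read().strip()


# Example on a Linux host (device name "sda" is an assumption):
#   active_scheduler(read_queue_setting("sda", "scheduler"))
#   read_queue_setting("sda", "read_ahead_kb")
```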
OS Configuration
• Use Direct IO if using InnoDB for data
– logs and MyISAM are better with buffered IO
– O_DIRECT in Linux, “forcedirectio” in Solaris
• Set number of active commands for SCSI device
– default is often too low
• Make sure scheduler is not switching threads too often
– with large number of CPUs, CPU binding could help
• Use large file system block/extent size
– tables are typically large
– use “notail” for reiserfs
Deployment Guidelines
• Automate things, especially when dealing with many systems
• Have load statistics gathering and monitoring
• Use separate database and web (application) servers
– different configuration, quality requirements, scaling
• Do not put MySQL servers on an external network
– Web servers with 2 network cards are good
• Have a regular backup schedule
– RAID does not solve all the problems
• Use the binary log so you can do point-in-time recovery
• Have the slow log enabled to catch slow queries
MySQL Workloads
• MySQL in OLTP Workloads
• MySQL in DSS/Data warehouse Workloads
• Batch jobs
• Loading data
• Backup and recovery
OLTP Workloads
• Online Transaction Processing
– Small transactions, queries touching few rows, random access
– Data size may range from small to huge; access is not uniform
• Make sure your schema is optimized for such queries
• If you can fit your working set in memory – great
• Watch for locks (table locks, row locks etc)
• For large databases – check how much random IO your disks can handle
• Configure MySQL for your number of connections
– Large global buffers (key_buffer, innodb_buffer_pool)
– Smaller per thread buffers - sort_buffer, read_rnd_buffer
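The reason for keeping per-thread buffers small shows up in a quick worst-case estimate: global buffers are allocated once, while per-thread buffers can be allocated by every connection. The sketch below is a rough upper bound only, since per-thread buffers are normally allocated when a query needs them. The function name and all sizes are made-up illustrative values, not recommendations.

```python
def worst_case_memory_mb(global_buffers_mb, per_thread_buffers_mb, max_connections):
    """Upper bound on buffer memory: globals once, per-thread per connection."""
    global_total = sum(global_buffers_mb.values())
    per_thread_total = sum(per_thread_buffers_mb.values())
    return global_total + max_connections * per_thread_total


# Hypothetical sizes in MB.
global_buffers = {"key_buffer": 256, "innodb_buffer_pool": 1024}
per_thread_buffers = {"sort_buffer": 1, "read_rnd_buffer": 0.5}

print(worst_case_memory_mb(global_buffers, per_thread_buffers, 200))
```

Doubling the per-thread buffers in this example adds another 300 MB at 200 connections, while the same memory given to a global buffer is spent only once.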
Logging
• Logs in the database are cool – easy reporting using SQL
– SELECT AVG(rtime) FROM log WHERE request='search'
• MyISAM table with no indexes – fast logging and scans
– “Archive” storage engine has smaller footprint
• Use “INSERT DELAYED” so live reporting is possible
– if the table has no holes, concurrent inserts should work as well
– or write them to a file and use a separate “feeder”
• Limit indexes – these are most expensive to update
– with an index, keep tables small so the index tree fits in memory
• Create multiple tables for easy, fast data purging:
– INSERT INTO log20050101 (...) VALUES (...)
– if error, CREATE TABLE log20050101 LIKE base_table
– retry insert
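The insert, create-on-error, retry pattern for per-day log tables can be sketched on the application side as follows. This is a minimal sketch under assumptions: `execute` stands for whatever function your MySQL driver provides to run a statement, the `%s` placeholder style and the `rtime` and `request` columns are illustrative, and `table_missing` should be narrowed to the driver's "table doesn't exist" error (server error 1146) rather than the catch-all default used here.

```python
import datetime


def log_table_name(day):
    """Per-day table name, e.g. log20050101."""
    return "log" + day.strftime("%Y%m%d")


def insert_log_row(execute, day, rtime, request,
                   base_table="base_table", table_missing=Exception):
    """Insert into the day's table, creating it from base_table if needed."""
    table = log_table_name(day)
    insert_sql = f"INSERT INTO {table} (rtime, request) VALUES (%s, %s)"
    try:
        execute(insert_sql, (rtime, request))
    except table_missing:
        # First insert of the day: clone the empty template table, then retry.
        # IF NOT EXISTS covers another client creating it at the same time.
        execute(f"CREATE TABLE IF NOT EXISTS {table} LIKE {base_table}", ())
        execute(insert_sql, (rtime, request))
    return table
```

Purging old data then becomes a DROP TABLE on the oldest day's table instead of a large DELETE.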
Listing navigation
• Common problem – directories, forums, blogs etc
– “show everything from offset 2000 to 2010”
– SELECT * FROM tbl ORDER BY add_time LIMIT 2000,10 works but is slow
• 2000 rows have to be scanned and thrown away
• Precompute position
– SELECT * FROM tbl WHERE pos BETWEEN 2000 AND 2010 is fast
• hard to do live, may use delayed publishing
• “new” entries can be shown out of order until position is computed
• Cache – pull first 1000 entries and precompute positions
– only a few people will go further than that
• Specific applications may have more solutions
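The precomputed-position idea can be tried out end to end with a small script. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; that substitution, the `id` column, the index name, and the 0-based `pos` column are all assumptions for illustration. With a 0-based position, the ten rows at offset 2000 are pos 2000 through 2009.

```python
import sqlite3


def build_demo(conn, n_rows):
    """Create tbl, fill it, and precompute pos in add_time order."""
    conn.execute(
        "CREATE TABLE tbl (id INTEGER PRIMARY KEY, add_time INTEGER NOT NULL, pos INTEGER)"
    )
    # Rows are inserted so that id order is the reverse of add_time order.
    conn.executemany(
        "INSERT INTO tbl (id, add_time) VALUES (?, ?)",
        [(i, n_rows - i) for i in range(n_rows)],
    )
    ordered = conn.execute("SELECT id FROM tbl ORDER BY add_time").fetchall()
    conn.executemany(
        "UPDATE tbl SET pos = ? WHERE id = ?",
        [(pos, row_id) for pos, (row_id,) in enumerate(ordered)],
    )
    conn.execute("CREATE INDEX tbl_pos ON tbl (pos)")


def page_by_offset(conn, offset, count):
    """Slow variant: the server must walk past `offset` rows first."""
    return conn.execute(
        "SELECT id FROM tbl ORDER BY add_time LIMIT ? OFFSET ?", (count, offset)
    ).fetchall()


def page_by_pos(conn, offset, count):
    """Fast variant: an index range lookup on the precomputed position."""
    return conn.execute(
        "SELECT id FROM tbl WHERE pos >= ? AND pos < ? ORDER BY pos",
        (offset, offset + count),
    ).fetchall()
```

Both functions return the same page; the difference is that the second one only touches the rows it returns, at the cost of keeping pos up to date.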
Resources
• MySQL Online Manual – great source of information
– http://dev.mysql.com/doc/mysql/en/index.html
• SysBench - Benchmark and Stress Test tool
– http://sourceforge.net/projects/sysbench
• Full-text search systems
– Mnogosearch: http://www.mnogosearch.org
– Sphinx: http://www.shodan.ru/projects/sphinx
• MySQL Benchmarks mailing list
– [email protected]
• Write to us with any questions you forgot to ask
– [email protected] [email protected]
– Feel free to grab us at the conference to discuss your problems