Student Guide - SQL-4402 MySQL Performance Tuning
D62424
Edition 1.0
D61820GC10
December 2010
Student Guide
Copyright © 2009, 2010, Oracle and/or its affiliates. All rights reserved.
Disclaimer
This document contains proprietary information, is provided under a license agreement containing restrictions on use and
disclosure, and is protected by copyright and other intellectual property laws. You may copy and print this document solely for
your own use in an Oracle training course. The document may not be modified or altered in any way. Except as expressly
permitted in your license agreement or allowed by law, you may not use, share, download, upload, copy, print, display,
perform, reproduce, publish, license, post, transmit, or distribute this document in whole or in part without the express
authorization of Oracle.
The information contained in this document is subject to change without notice. If you find any problems in the document,
please report them in writing to: Oracle University, 500 Oracle Parkway, Redwood Shores, California 94065 USA. This
document is not warranted to be error-free.
If this documentation is delivered to the U.S. Government or anyone using the documentation on behalf of the U.S.
Government, the following notice is applicable:
Trademark Notice
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective
owners.
THESE eKIT MATERIALS ARE FOR YOUR USE IN THIS CLASSROOM ONLY. COPYING eKIT MATERIALS FROM THIS COMPUTER IS STRICTLY PROHIBITED
The “shell” is your command interpreter. On Linux, this is typically a program such as sh, csh, or bash. On
Windows, the equivalent program is command.com or cmd.exe, typically run in a console window. When you enter
a command or statement shown in an example, do not type the prompt shown in the example.
Database, table, and column names must often be substituted into statements. To indicate that such substitution is
necessary, this manual uses db_name, tbl_name, and col_name. For example, you might see a statement like
this:
mysql> SELECT col_name FROM db_name.tbl_name;
This means that if you were to enter a similar statement, you would supply your own database, table, and column
names for the placeholders db_name, tbl_name, and col_name, perhaps like this:
mysql> SELECT author_name FROM biblio_db.author_list;
In syntax descriptions, square brackets ([ and ]) indicate optional words or clauses. For example, in the following
statement, IF EXISTS is optional:
When a syntax element consists of a number of alternatives, the alternatives are separated by vertical bars (pipe, |).
When one member from a set of choices may be chosen, the alternatives are listed within square brackets ([ and ]):
When one member from a set of choices must be chosen, the alternatives are listed within braces ({ and }):
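These bracket and brace conventions can be illustrated with standard MySQL statements (the examples below are drawn from common MySQL syntax, not from this guide):

```sql
-- Square brackets: the IF EXISTS clause is optional
DROP TABLE [IF EXISTS] tbl_name

-- Alternatives in square brackets: one member MAY be chosen
SHOW [GLOBAL | SESSION] STATUS

-- Alternatives in braces: one member MUST be chosen
{DESCRIBE | DESC} tbl_name
```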
An ellipsis (...) indicates the omission of a section of a statement, typically to provide a shorter version of more
complex syntax. For example, INSERT ... SELECT is shorthand for the form of INSERT statement that is
followed by a SELECT statement.
An ellipsis can also indicate that the preceding syntax element of a statement may be repeated. In the following
example, multiple reset_option values may be given, each after the first preceded by a comma:
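For example (standard MySQL syntax, shown here for illustration):

```sql
RESET reset_option [, reset_option] ...

-- A concrete use, naming two options:
RESET QUERY CACHE, MASTER;
```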
If you are using csh or tcsh, you must issue commands somewhat differently:
and
shell> ./configure
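The guide's original example is not shown here; the usual difference is in how environment variables are set before running a command such as ./configure. A minimal sketch (the variable name CC and the value gcc are illustrative):

```shell
# Bourne-compatible shells (sh, bash): assign, then export
CC=gcc
export CC
echo "CC is set to $CC"

# csh/tcsh equivalent (shown as comments so this file remains valid sh):
#   setenv CC gcc
#   ./configure
```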
Inline Lab – Throughout the course, the instructor will conduct labs in line with the instruction, designed to help you understand the “nuts and bolts” (inner workings) of the topic.
Further Practice Lab – This image indicates that there is a final exercise to complete before the end of the chapter.
Student notes – This image identifies an area on a page designated for students to write notes associated with the class.
123 Slide number box – Indicates the number of the slide that corresponds to the text.
Acknowledgments
Sun Microsystems would like to thank the many individuals who played a part in bringing this training material to
the numerous students who will benefit from the knowledge and effort each contributor put into it. Although a
large number of Sun Microsystems employees contributed, the following individuals played a vital role in
developing this material and ensuring that its contents were accurate, timely, and, above all, presented in a way
that benefits those using it to improve their skills with MySQL.
Max Mether, Course Development Manager
2 Course Objectives
The MySQL Performance Tuning course is designed for Database Administrators and others who wish to monitor
and tune MySQL. It equips each student with the skills needed to use tools for monitoring, evaluating, and tuning.
Students will evaluate the architecture, learn to use the tools, configure the database for performance, tune
application and SQL code, tune the server, examine the storage engines, assess the application architecture, and
learn general tuning concepts.
● Develop a tuning strategy.
Table of Contents
1 Introduction............................................................................................................................... 1-1
1.1 Learning Objectives........................................................................................................... 1-1
1.2 MySQL Overview..............................................................................................................1-2
1.2.1 Sun Acquisition.......................................................................................................... 1-2
1.2.2 MySQL Partners.........................................................................................................1-3
7
1 INTRODUCTION
"We have used MySQL far more than anyone expected. We went from experimental to mission-critical in a
couple of months."
-- Jeremy Zawodny, MySQL Database Expert, Yahoo! Finance
NOTE: MySQL Support will continue to answer questions and provide assistance related to MySQL GUI
Tools Bundle, as well as assist our customers in upgrading from those tools to MySQL Workbench 5.2 until
June 30, 2010.
The features from the above graphical tools are being added to the existing MySQL Workbench starting with the 5.2
revision. Users should plan to upgrade to MySQL Workbench 5.2 GA.
14 MySQL Workbench
The MySQL Workbench GUI Tool (as of the 5.2 revision) will give DBAs and developers an integrated tools
environment for:
○ MySQL Connector/J
A JDBC (Java Database Connectivity) 4.0 driver for Java 1.4 and higher. Provides a native Java
implementation of the MySQL client/server protocol.
○ MySQL Connector/NET
A fully managed ADO.NET provider for the .NET Framework (versions 1.1 and 2.0). Provides a .NET
implementation of the MySQL client/server protocol.
○ MySQL Connector/C
A C client library for client-server communication. It is a standalone replacement for the MySQL Client
Library shipped with the MySQL Server.
● Developed by Community
○ PHP - mysqli, ext/mysqli, PDO_MYSQLND, PHP_MYSQLND
Provides MySQL connectivity for PHP programs. Currently there are two MySQL-specific PHP extensions
available that use libmysql: the mysql and mysqli extensions. There is also MySQL support for the
generic PHP Data Objects (PDO) extension. In addition, there is the PHP native driver, mysqlnd,
which can replace libmysql in the mysqli extension.
○ Perl - DBD::mysql
○ Python - MySQLdb
○ Ruby – DBD::MySQL, ruby-mysql
○ C++ Wrapper - for MySQL C API (MySQL++)
NOTE: The above connectors (and their documentation) can be downloaded from our MySQL Connectors web
page: http://mysql.com/products/connector/ .
● Purchased support
○ Enterprise subscription
○ Support for MySQL Cluster
○ Support for MySQL Embedded (OEM/ISV)
○ Online Knowledge Base
○ You have the option of adding a Technical Account Manager (Platinum level only) to be your liaison within
MySQL. Your TAM will be your single point of contact, providing a custom review of your systems, regular
phone calls, and on-site visits to guarantee that you get the most out of MySQL Support Services.
● Online Knowledge Base
○ For quick self-help knowledge, you will have access to a comprehensive and easily searchable knowledge
base library with hundreds of technical articles regarding difficult problems on popular database topics such
as performance, replication, configuration, and security.
NOTE: For more detailed information regarding Enterprise support, please see our Enterprise web page:
http://www.mysql.com/products/enterprise/support.html .
This list is continually being updated. For the most current information, please check our website:
http://www.mysql.com. Source code and special builds are also available.
Exam Administration
All exams are administered through several testing centers available worldwide. Visit the MySQL certification web
page, online forum, or email [email protected] for more information.
Developer Path
● Introductory Courses
○ MySQL and PHP: Developing Dynamic Web Applications – This course provides the tools needed for the
[Course-path diagram: Introductory courses – MySQL and PHP (4 days) and MySQL for Beginners (4 days), leading toward Certification: CMA; Intermediate/Advanced courses – MySQL Advanced for the Developer and DBA paths, leading toward Certification: CMCDBA.]
Virtual Classroom
26 MySQL also offers virtual (online) courses covering various topics related to the MySQL suite of tools. These
classes are instructor-led and delivered synchronously via the web. Information regarding the available courses can
be found on the MySQL training web page.
[Diagram: introductory virtual courses for the Developer and DBA paths.]
29 Information such as the following can be found on the Developer Zone web page:
● Current product and service promotions
● Get Started with MySQL
○ Installation information page for MySQL beginners
● Developing with:
○ Links to specific information on using MySQL with PHP, Perl, Python, Ruby, Java/JDBC, and
.NET/C#/Visual Basic
● MySQL Librarian
Action: Click on the Products tab located at the top of the MySQL home page. Scroll down the list to review the
various product information provided.
Effect: A list of currently available products will appear, with links for further information and downloads.
Step 4. MySQL Enterprise
Action: Review the details of the new MySQL Enterprise program by clicking on the Learn More >> link.
Effect: MySQL® Enterprise™ provides a comprehensive set of enterprise-grade software, support and services
directly from the developers of MySQL to ensure the highest levels of reliability, security and uptime.
Step 5. Services
Action: Click on the Services tab located near the top of the MySQL home page.
Effect: A list of currently available services will appear, with links for further information and downloads.
Step 6. MySQL Training and Certification
Action: Review the details of the MySQL Training & Certification program by clicking on the Learn More >> link.
Effect: Featured information on this page will be updated periodically. For specific Training sub-topics select one of
the links in the sub-menu in the upper-left corner of the page.
Step 7. Certification
Action: From the MySQL Training & Certification web page, select the Certification link in the sub-menu and
review the contents.
Effect: Featured information on this page will be updated periodically. For specific Certification sub-topics select
one of the links in the sub-menu in the upper-left corner of the page.
Action: Click on the Services tab located at the top of the MySQL home page.
Effect: Returns to the top level Services page.
Step 9. Support Services
Action: Review the details of MySQL Support programs by clicking on the Learn More >> link. (Continued on next page.)
Effect: Shows the various support programs available.
Step 10. MySQL Community
Action: Return to the MySQL home page using the MySQL.com tab at the top of the page, then click on the
News & Events tab located near the top of the MySQL home page.
Effect: Shows the latest MySQL happenings, from news stories to upcoming seminars.
Action: Log in as root using the following command (the password for root is oracle):
su -
Action: Change into the /usr/local directory using the following command:
cd /usr/local
Effect: This will be the directory where the MySQL folder will be located.
Action: Create a symbolic link from the MySQL folder in the /stage directory to a /usr/local/mysql
directory by issuing the following command:
ln -s /stage/mysql-5.1.44-linux-i686-icc-glibc23 mysql
Action: Execute the mysql_install_db script to initialize the MySQL data directory and create the system
tables using the following commands:
cd mysql
scripts/mysql_install_db
Effect: The MySQL system tables are installed and the help files are filled.
Step 2. Set up the MySQL server to execute as the mysql user
groupadd mysql
Effect: The group mysql is created and will be used in the next few steps.
Action: Execute the following command to create a user called mysql assigned to the mysql group:
useradd -g mysql mysql
Effect: The user mysql is created and assigned to the group mysql.
Action: Change the owner of the /usr/local/mysql/data directory to mysql using the following command:
chown -R mysql /usr/local/mysql/data
Effect: The data directory that houses the mysql data is now owned by the user mysql instead of root.
Step 3. Start the MySQL server
Action: Enter the following command to copy the MySQL server startup script to the /etc/init.d directory:
cp support-files/mysql.server /etc/init.d/mysql
Effect: This allows the MySQL server to be started as a system service every time the machine is started/restarted.
Action: Start the server using the following command:
/etc/init.d/mysql start
Effect: This starts the MySQL server script that was just copied to the /etc/init.d directory. With the script
registered at this location (and the service enabled), the MySQL server will start on machine start/restart.
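Copying the script into /etc/init.d does not by itself enable it at boot on every distribution. On Red Hat-style systems like the lab machine, the service is typically registered as follows (a sketch; requires root):

```shell
# Register and enable the init script (Red Hat-style systems; on
# Debian-style systems the equivalent is: update-rc.d mysql defaults)
chkconfig --add mysql
chkconfig mysql on
```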
Action: Edit the /etc/bashrc file and add the path to the MySQL bin directory at the end of the file using the
following commands:
• vi /etc/bashrc
• Press the capital letter G to go to the end of the file.
• Press the capital letter A to append text at the end of the line.
• Press <Enter> to go down to the next line.
• Type PATH=${PATH}:/usr/local/mysql/bin
• Press <Esc>, then type :wq and press <Enter> to save and exit.
Action: Reload the file in the current shell using the following command:
source /etc/bashrc
Effect: This refreshes the PATH in the current shell without having to log out and back in.
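The same edit can be sketched non-interactively; here RC stands in for /etc/bashrc, since modifying the real file requires root:

```shell
# Append the MySQL bin directory to PATH in a bashrc-style file, then
# reload it in the current shell (RC substitutes for /etc/bashrc):
RC=$(mktemp)
echo 'PATH=${PATH}:/usr/local/mysql/bin' >> "$RC"
. "$RC"
echo "$PATH"
rm -f "$RC"
```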
NOTE: For full installation from the MySQL website on the Windows and/or Linux Operating systems, see
Appendix A at the end of this training guide.
Action: Open the Firefox web browser (icon located on the desktop) and go to the following website to
download the world example database:
http://dev.mysql.com/doc/index-other.html
Effect: This will open up the Other MySQL Documentation page on the MySQL developers website.
Action: Select the Gzip link to the right of the world database text located under the Example Databases
heading.
Effect: This will open up a pop-up window asking you what you would like to do with the file you are about to
download. Choose the Save File option and press OK. The file is automatically downloaded to the
/home/oracle/Desktop directory.
Action: Open up a terminal window (or use the terminal window from the last lab) and enter the following
commands:
cd /home/oracle/Desktop
gunzip world.sql.gz
Effect: This will unzip the world.sql.gz file leaving just the world.sql file to be used in the next few
steps.
Effect: A message will appear confirming that the connection to your version of the server has been launched;
Type 'help;' or '\h' for help. Type '\c' to clear the current input
statement.
Action: Type:
USE world;
Effect: Instructs the client to use the newly created world database. Returns the following message:
Database changed
Action: Type:*
SOURCE /home/oracle/Desktop/world.sql
Effect: * The instructor will give you the file path if it is different than shown here. Several 'Query OK' messages
will scroll past while tables and data are being loaded for the world database.
34
2 MYSQL ARCHITECTURE
[Diagram: multiple client programs (Client Program, Client Program, and so on) connecting to the MySQL server.]
Because MySQL uses a thread-based server architecture, there is the potential for problems to occur
when two threads attempt to use the same common resource. Mutex (mutual exclusion) algorithms or
semaphores are able to minimize the potential for conflicts (one thread canceling out the other's work)
in concurrent programming; however, even this mechanism is not 100% reliable. As a result, there are
times in MySQL when certain workarounds have to be implemented to prevent potential conflicts. Some
of these workarounds will be discussed during this course.
As with many of the other subsystems discussed in this overview, the query cache is a key discussion in
performance tuning and will be discussed in greater detail during the remainder of this course.
Query parsing
This is the process of deconstructing the SQL statement into a parse tree. It is an extremely complex process that has
no user-configurable variables.
Optimization
This subsystem is responsible for finding the "optimal" execution plan for each query.
NOTE: Optimization is a key discussion in performance tuning and will be discussed in greater detail throughout
the remainder of this course.
Execution
This process, called the statement execution unit, is responsible for executing the optimized plan for the SQL
command passed through the parser and optimizer. It is ultimately responsible for ensuring that the results of the
executed query reach the originating client.
Most of the MySQL server operates in the same way no matter what storage engine is used: all the usual SQL
commands are independent of the storage engine. Naturally, the optimizer may need to make different choices
depending on the storage engine, but this is all handled through a standardized interface (API) which each storage
engine supports.
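Engine choice is made per table through that same SQL interface. For example (the table name is illustrative):

```sql
-- Pick a storage engine at creation time:
CREATE TABLE t1 (id INT PRIMARY KEY) ENGINE=InnoDB;

-- Convert an existing table; the server copies the data across engines:
ALTER TABLE t1 ENGINE=MyISAM;
```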
● Changes ignored – A ROLLBACK statement can be used to discard all the changes made since the start of the
transaction, before they are committed.
● Failed updates protected - If an update fails, all the changes are reverted from the start of the transaction.
(With non-transactional tables, all changes that have taken place are permanent.)
A limitation to transactional storage engines is the performance overhead associated with managing the data integrity.
The following is a description of the ACID compliancy test for storage engines that claim to be transactional:
● Atomicity – Does the transactional storage engine ensure that all the tasks in a transaction are done, or none
of them? The transaction must be completed, or else it must be undone (rolled back).
● Consistency – Does every transaction preserve the integrity constraints -- the declared consistency rules -- of
the database?
● Isolation – Are concurrent transactions kept from seeing each other's intermediate (uncommitted) changes?
● Durability – Once a transaction has been committed, do its changes survive a server crash or restart?
49
InnoDB
The InnoDB engine provides support for most of the database functionality associated with the MyISAM engine
(full-text and GIS indexes are not supported), along with fully ACID-compliant transaction capabilities. The key to
the InnoDB system is a database, caching, and indexing structure in which both indexes and data are cached in
memory as well as stored on disk. This enables very fast recovery, and works even on very large data sets. With
row-level locking, data can be added to an InnoDB table without the engine locking the table on each modification,
thus speeding up both the recovery and storage of information in the database.
NDBCluster
The NDBCluster storage engine runs in server memory and supports transactions and synchronous replication.
“Synchronous replication" is within nodes in the cluster only and is automatic. The engine spreads data redundantly
across many nodes in the cluster, so the data is distributed across many possible servers. The storage engine allows one
of the nodes to go offline and does not interrupt data availability. The storage engine uses row-level locking. All reads
are non-locking by default. READ-COMMITTED is the only supported isolation level. As with clusters in general,
the NDB Cluster storage engine ensures individual node crashes do not stop the cluster. This storage engine provides
for automatic synchronization of data nodes at restart, as well as recovery from checkpoints and logs after a cluster crash.
The NDB Cluster storage engine can perform many maintenance tasks online, including online backup and online
software upgrades. In addition, the storage engine supports unique hash indexes and T-tree ordered indexes.
MEMORY
The MEMORY storage engine (previously known as the HEAP storage engine) stores all data in memory and has no
footprint on disk; consequently, once the MySQL server has been shut down, any information stored in a MEMORY
table is lost. However, the definition of each table is kept, enabling the creation of temporary tables that can be used
to store information for quick access without having to recreate the tables each time the database server is started.
Other uses for the MEMORY storage engine include session handling, simulations, and summary tables.
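A session-handling table of the kind mentioned above might be declared like this (table and column names are illustrative):

```sql
-- All rows live in memory; the table definition survives a restart,
-- the data does not:
CREATE TABLE session_data (
  session_id VARCHAR(64) NOT NULL PRIMARY KEY,
  payload    VARCHAR(255)
) ENGINE=MEMORY;
```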
ARCHIVE
The ARCHIVE storage engine acts as a long-term storage device for data that will not be modified in any way once it
enters the database. Because the engine only needs to manage INSERT and SELECT statements, overhead is low and
the information is stored very efficiently, being compressed and non-modifiable.
storing and retrieving log data or information that is no longer in active use. Complex searches against ARCHIVE tables
should be minimized due to the storage engine having to uncompress and read the entire table.
CSV
The CSV storage engine stores data in the form of a CSV (Comma Separated Values) file, not in a binary format. This is
not an efficient method for storing large volumes of data, or larger data types like BLOB, although such types are
supported. Since the data is stored in the CSV format, the files created are exceedingly portable and thus easy to import
51
FEDERATED
The FEDERATED storage engine enables access to data from remote MySQL database tables as if they were local tables.
This storage engine in essence turns the MySQL server into a proxy for a remote server, using the MySQL client access
library to connect to the remote host, execute queries and then reformat the data into the localized format. This allows the
server, not a client, to access a remote database and can be used to combine data from multiple hosts or for copying
specific data from remote databases into local tables without the use of data exports and imports. The use of the
FEDERATED storage engine has an effect on performance, especially when executing complex queries. Data storage,
indexes, and locking are all dependent on the remote table's defined storage engine.
BLACKHOLE
The BLACKHOLE engine does not actually store any data. Even though the storage engine is truly a "Blackhole" for
data, tables and indexes can be created and all SQL statements that would add or update information to the database
can be executed without actually writing any data. In addition, any locking statements issued against the storage
engine are ignored. With the BLACKHOLE storage engine, the database structure is retained thus allowing indexes
on the (non-existent) information. This provides an excellent testing ground for database structures without actually
creating any data against those structures. All SQL statements against a BLACKHOLE table are written to the binary
log, which in turn can then be replicated to slave servers.
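The behavior described above can be observed directly (the table name is illustrative):

```sql
CREATE TABLE bh_test (id INT) ENGINE=BLACKHOLE;
INSERT INTO bh_test VALUES (1), (2);  -- accepted, but nothing is stored
SELECT * FROM bh_test;                -- always returns an empty result set
```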
+------------+---------+-----------------------------+--------------+-----+------------+
| Engine | Support | Comment | Transactions | XA | Savepoints |
+------------+---------+-----------------------------+--------------+-----+------------+
| ndbcluster | YES | Clustered, fault-toler ... | YES | NO | NO |
| MRG_MYISAM | YES | Collection of identical ... | NO | NO | NO |
| BLACKHOLE | YES | /dev/null storage ... | NO | NO | NO |
| CSV | YES | CSV storage engine | NO | NO | NO |
| MEMORY | YES | Hash based, stored in ... | NO | NO | NO |
| FEDERATED | YES | Federated MySQL ... | YES | NO | NO |
| ARCHIVE | YES | Archive storage engine | NO | NO | NO |
| InnoDB | YES | Supports transaction ... | YES | YES | YES |
| MyISAM | DEFAULT | Default engine as of ... | NO | NO | NO |
+------------+---------+-----------------------------+--------------+-----+------------+
For the purpose of performance tuning, this list describes which storage engines can be utilized in the current server
configuration and which storage engine is the default for all new tables created within the server.
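A listing like the one above comes from the following statements (the output shown in this guide is abbreviated; actual columns vary by server version):

```sql
-- List available storage engines and their capabilities:
SHOW ENGINES;

-- Report the default engine for new tables (MySQL 5.1):
SHOW VARIABLES LIKE 'storage_engine';
```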
NOTE: Installing necessary storage engines
If there are storage engines not supported by the server that would assist performance, it is best to ensure
that the server has those storage engines installed prior to going live with an application. If applications are
already running on the server and there is a need to utilize a storage engine that is not available, it is best to
perform the operation in off hours, when a complete backup of the data can be taken and the MySQL server
can be rebuilt with the storage engines required. It is best to build the MySQL server with only the storage
engines required, to minimize the size of the build and the amount of memory the additional storage engines
would utilize.
Action: In the world database, create a table based on the City table by issuing the following command:
Effect: The storage engine should be MyISAM. If it is not, alter the table to use the MyISAM storage engine by
typing the following command:
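Both steps can be sketched as follows (the table name City_Test matches the later steps in this lab):

```sql
-- Create an empty copy of City with the same structure
CREATE TABLE City_Test LIKE City;

-- Ensure the new table uses the MyISAM storage engine
ALTER TABLE City_Test ENGINE=MyISAM;
```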
Action: Now the new table needs to be loaded with data; however, we wish to load lots of data. There are multiple
ways to accomplish this, but for class purposes, and to save time, we will create a stored procedure to perform the
task. Issue the following commands in the mysql client to create a stored procedure called Load_Data:
NOTE: Because there is no way to edit a specific portion of a stored procedure in the mysql client, it
is best to create the stored procedure code in an external application (such as a text editor) for editing purposes.
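A procedure along these lines would perform the load (a sketch; the column names are those of the standard world.City table, and each pass copies every City row into City_Test):

```sql
DELIMITER //
CREATE PROCEDURE Load_Data(IN reps INT)
BEGIN
  DECLARE counter INT DEFAULT 0;
  WHILE counter < reps DO
    -- Each iteration copies the full City table (~4,000 rows)
    INSERT INTO City_Test (Name, CountryCode, District, Population)
      SELECT Name, CountryCode, District, Population FROM City;
    SET counter = counter + 1;
  END WHILE;
END//
DELIMITER ;
```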
Action: Execute the Load_Data script with 1000 repetitions by typing the following command in the mysql
client:
CALL Load_Data(1000);
Effect: This executes the script just created and loads approximately 4 Million rows of data into the
City_Test table. How long did it take to complete the script?
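The status review described next can be performed with a command of this form (a sketch):

```sql
SHOW TABLE STATUS LIKE 'City_Test'\G
```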
Effect: This displays valuable information about the City_Test table, including the engine type, the number
of rows, and the length of the data. The Data_length field provides an educated guess of the size of the data
that is being stored.
NOTE: The output of SHOW TABLE STATUS is an approximation of the table information, and multiple runs
of the same command may produce slightly different output.
Action: Remove the data from the City_Test table by executing the following command in the mysql client:
TRUNCATE City_Test;
Effect: The TRUNCATE command, when executed against a table, is equivalent to dropping the table and recreating
it (which is faster than a row-by-row delete). For InnoDB tables, the drop and recreation of the table is not
performed if there are foreign key constraints referencing the table.
Step 4. Test loading data into ARCHIVE
Action: Change the storage engine for the City_Test table to ARCHIVE by typing the following command in the
mysql client:
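A statement of this shape performs the engine change (a sketch, using the City_Test table from the earlier steps):

```sql
ALTER TABLE City_Test ENGINE=ARCHIVE;
```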
There are many fundamental differences between the storage engines, each with its own advantages and
disadvantages. The next few steps recreate the steps previously run on the City_Test table, this time with the
ARCHIVE storage engine.
Action: Execute the Load_Data script with 1000 repetitions against the City_Test table again by typing the following
command in the mysql client:
CALL Load_Data(1000);
Effect: This executes the script created earlier and loads approximately 4 Million rows of data into the
City_Test table. How long did it take to complete the script now that the City_Test table is using the
ARCHIVE storage engine?
Is there a difference in the Data_length field now that the ARCHIVE storage engine has been used?
Action: Attempt to remove all data from the City_Test table by executing the following command in the mysql
client:
TRUNCATE City_Test;
Effect: The ARCHIVE storage engine does not allow any deletion of the data (either by TRUNCATE or DELETE).
Action: Change the storage engine for the City_Test table to BLACKHOLE by typing the following
command in the mysql client:
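A statement of this shape performs the engine change (a sketch):

```sql
ALTER TABLE City_Test ENGINE=BLACKHOLE;
```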
The data in the City_Test table could not be deleted while it was in the ARCHIVE storage engine; however,
the storage engine for the table could be altered. In this case, the storage engine chosen is the BLACKHOLE
storage engine.
Action: Review the status of the table along with other important information by executing the following
command in the mysql client:
Is there a difference in the Data_length field now that the BLACKHOLE storage engine has been used?
Action: Execute the Load_Data script with 1000 repetitions against the City_Test table again by typing the
following command in the mysql client:
CALL Load_Data(1000);
Effect: This executes the script created earlier and processes approximately 4 Million rows of data against the
City_Test table. How long did it take to complete the script now that the City_Test table is using the
BLACKHOLE storage engine?
Action: Change the storage engine for the City_Test table to INNODB by typing the following command in
the mysql client:
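A statement of this shape performs the engine change (a sketch):

```sql
ALTER TABLE City_Test ENGINE=InnoDB;
```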
Effect: The City_Test table is now using the InnoDB storage engine. The change to the new storage engine
should not have taken long because there was no data in the table to process.
Action: In the mysql client type the following:
START TRANSACTION;
Effect: This tells the server that we are starting a transaction; subsequent statements are not made permanent until a COMMIT is issued.
Action: Execute the Load_Data script against the City_Test table again by typing the following command in the
mysql client:
CALL Load_Data(1000);
Effect: This executes the script created earlier and processes approximately 4 Million rows of data against the
City_Test table. How long did it take to complete the script now that the City_Test table is using the InnoDB
storage engine?
Action: Commit the transaction by typing the following command in the mysql client:
COMMIT;
Effect: This commits all the INSERT statements that were just issued.
Action: Review the status of the table along with other important information by executing the following command
in the mysql client:
Is there a difference in the Data_length field now that the InnoDB storage engine has been used?
In addition, the Data_free column (shown as InnoDB Free: #### kB) identifies how much capacity is
available in the InnoDB tablespace; this value changes as more data is added to tables that use the InnoDB
storage engine.
Action: Because the InnoDB storage engine is much slower on inserts, it is necessary to work with a smaller
data set. At this point, clean up the table and start fresh by typing the following in the
mysql client:
TRUNCATE City_Test;
The InnoDB storage engine processes the insertion of data differently because it is ACID compliant. This affects
the time it takes to perform inserts; however, the advantage of ACID compliance is definitely worth it for many
applications. Steps to improve the insertion of data into tables that use the InnoDB storage engine
are discussed later in the course.
Step 7. Investigate Data and Index sizes
Action: Execute the Load_Data procedure (along with the transaction commands) against the City_Test again
but this time with only 250 repetitions by typing the following commands in the mysql client:
START TRANSACTION;
CALL Load_Data(250);
COMMIT;
Effect: This executes the script that was just modified and loads approximately 1 Million rows of data into the
City_Test table.
Action: Review the status of the table along with other important information by executing the following
command in the mysql client:
What is the size of the Data_length field and how many rows of data does the table contain?
______________________________________________
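The index described below can be created with a statement of this form (the index name name_idx is illustrative):

```sql
ALTER TABLE City_Test ADD INDEX name_idx (Name(2));
```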
This index is based on the first two characters of the data in the Name column. Only the first two characters
are chosen for the index because of the amount of time it takes to index more characters of the
data. Even so, this process can still take a few minutes.
NOTE: For this lab, there may not be a large difference between indexing the whole column over the first two
characters; however, with larger data sets and more unique data this could make a big difference.
Action: Review the status of the City_Test table again to see how the index has affected the
table by typing the following in the mysql client:
Has the size of the Data_length changed? Is there a field for the Index_length now? If so, what is its
size?
_________________________
Also, for the purposes of the next few steps, record the number of records (Rows) that the table contains:
__________________________
Action: Execute the following command in the mysql client to delete approximately 10% random records in
the City_Test table:
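A statement of this form deletes roughly 10% of the rows at random (a sketch):

```sql
DELETE FROM City_Test WHERE RAND() < 0.1;
```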
Effect: This process deletes approximately 10% of the records from the City_Test table. This process could
also take some time to complete.
Action: Review the status of the City_Test table again to see how the deletion of records has affected the
table by typing the following in the mysql client:
Has the number of records (Rows) changed to reflect a deletion of records from the table? If so, what is the new
number of Rows?
__________________________________
If there was a deletion of records, did this affect the size of the Data_length or the Index_length?
2.5 Locks
When discussing locks, it is important to see them as a synchronization mechanism for enforcing limits on access to a
resource in an environment with many threads of execution. Locks are one way of enforcing concurrency
control policies and maintaining data integrity. There are two types of locks:
● Shared (read) locks – many clients may read from the resource at the same time and not interfere with each
other. This type of lock allows other shared locks to be set.
● Exclusive (write) locks – only one client can write to the resource at a given time, and others are prevented
from reading while a client is writing. This type of lock prevents any other locks being set.
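The two lock types can be illustrated with MySQL's explicit table-lock syntax (a simple sketch against the world.City table):

```sql
-- Shared (read) lock: other clients may still read, but none may write
LOCK TABLES City READ;
SELECT COUNT(*) FROM City;
UNLOCK TABLES;

-- Exclusive (write) lock: all other access to the table is blocked
LOCK TABLES City WRITE;
UPDATE City SET Population = Population + 1 WHERE ID = 1;
UNLOCK TABLES;
```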
Locking Types
Lock Granularity
In MySQL, there are two different levels of locking granularity implemented by different storage engines:
● Table locks – The table as a whole is locked on an all-or-nothing basis. When a client wants to
make a change to the data (INSERT, DELETE, UPDATE, and so on), the entire table is locked against writes and reads
from any other client.
● Row locks – An individual row is locked against reads or writes from other clients. Due to
the complexity of performing this type of lock, it is the most computationally expensive, but it offers the
greatest possible concurrency from the other clients' perspectives.
NOTE: Storage engine dependent
The algorithms used by MySQL concerning locking methods are completely dependent upon the storage
engines that are being utilized. Each storage engine is very specific in its implementation of locks and will be
discussed in detail later in this course.
Locking Issues
Three locking issues that arise with applications using MySQL are:
● Blocking locks – Locks that prevent another thread from performing work; their behavior depends on the locking
mechanism of the chosen storage engine.
● Deadlocks - This is a specific condition when two or more processes are each waiting for another to release
a resource, or more than two processes are waiting for resources in a circular chain. For every resource
request, the server sees if granting the request will mean that the system will enter an unsafe state, meaning a
state that could result in deadlock. The system then only grants requests that will lead to safe states. This
type of condition cannot occur on storage engines that only support table-level locking.
Deadlock prevention is not only a function of the storage engine; application design must also be
considered. For example, operations should always be done in the same order (e.g., if one transaction updates A
and then B, while another transaction updates B and then A, the system is prone to deadlocks).
● Locks on higher level than necessary - Locks can sometimes be acquired at a higher level than necessary,
usually when working with explicit locks versus implicit locks. Generally, this is an application issue.
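The ordering problem described above can be sketched as two interleaved transactions (the table names A and B are illustrative, not runnable as-is):

```sql
-- Client 1                          -- Client 2
START TRANSACTION;                   START TRANSACTION;
UPDATE A SET val = 1;                UPDATE B SET val = 2;
UPDATE B SET val = 1;  -- waits      UPDATE A SET val = 2;  -- deadlock
```

Performing the updates in the same order (A then B) in both clients avoids the circular wait.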
Further Practice
In this further practice, you will use different storage engines to determine the performance differences and size
differences.
1. In the world schema, create a new table called city_huge that contains the structure and all the data from
the City table.
2. Increase the number of records in the city_huge table by inserting all the records from the city_huge
table back into it between four (4) and eight (8) times. The data in the city_huge table grows exponentially
each time you run the INSERT ... SELECT against it. The last two executions should
start to run slower. Depending on the speed of your hardware, adjust the number of repetitions to challenge
__________________________________________________________________________________
__________________________________________________________________________________
5. Search each table created for all records where the city identification number (ID) is equal to 123456 and
compare the average response times. If the response times differ, why?
__________________________________________________________________________________
__________________________________________________________________________________
6. Using the SHOW TABLE STATUS, review the Data_length and the other information associated with
the tables created in step 3.
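Steps 1 and 2 above can be sketched as follows (the explicit column list avoids duplicate-key errors on the auto-increment ID column copied from City):

```sql
CREATE TABLE city_huge LIKE City;
INSERT INTO city_huge (Name, CountryCode, District, Population)
  SELECT Name, CountryCode, District, Population FROM City;

-- Repeat four to eight times; each run doubles the row count
INSERT INTO city_huge (Name, CountryCode, District, Population)
  SELECT Name, CountryCode, District, Population FROM city_huge;
```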
3 MYSQL PERFORMANCE TOOLS
Minimum arguments
The minimum required to run the mysqlslap diagnostic program is the executable itself (entered on the
operating system command line) along with a user that is allowed to create a database (and a password, if one is
assigned to that user):
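A minimal invocation looks like this (assuming a root account with create privileges; the password is prompted for):

```shell
mysqlslap --user=root --password
```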
When executed with the minimum number of arguments, mysqlslap will create a schema called mysqlslap, load the
schema with data (by creating a table called t1 to contain the data) and then will execute a general purpose query against
that data. The result will look similar to the output listed below:
NOTE: The mysqlslap database will need to be created prior to executing mysqlslap as described.
● --concurrency=N, -c N - when using a SELECT statement, this option identifies the number of clients
that would be accessing the data at the same time. N can be a single numeric value, or multiple
numeric values separated by commas, as shown in the example below:
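An invocation of this shape matches the five concurrency levels and ten iterations discussed below (a sketch using mysqlslap's auto-generated test schema):

```shell
mysqlslap --user=root --password --auto-generate-sql \
  --concurrency=1,4,16,64,256 --iterations=10
```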
NOTE: It may be necessary to increase the size of max_connections prior to executing the
example described above.
This will result in 5 runs (identified by --concurrency) of the application (and 5 result sets): the first run
simulates only one client accessing the SQL, followed by simulations with 4 concurrent clients
accessing the data, then 16, 64, and 256 concurrent clients. The following graph represents how the number
of concurrent clients, along with 10 iterations of each statement, may affect the response times of the
MySQL server:
When a schema or query is not provided to the mysqlslap application, mysqlslap will create a database
called mysqlslap with a table called t1 that includes an INT column and a VARCHAR(128) column. With
this database and table created, mysqlslap then performs multiple random inserts against the table (to fill the table
with data), followed by multiple random SELECT and INSERT statements.
● --create-schema - this option identifies the schema that will be created to execute the mysqlslap
application against.
● --create - this option identifies the file or string containing the statement that will be used to create the table the test will utilize.
● --query, -q - this option identifies the query string or file that mysqlslap will execute against the data.
● --only-print - this option tells mysqlslap not to execute against the actual data but rather just print what
it would have done if it had actually executed. The following example identifies how the previous components
would work together:
This will result in the following being displayed as output in the terminal window:
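A combined invocation might look like this (the schema name, table definition, and query are illustrative):

```shell
mysqlslap --user=root --password --only-print \
  --create-schema=world_test \
  --create="CREATE TABLE t1 (id INT, name VARCHAR(30))" \
  --query="SELECT * FROM t1"
```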
● --csv[=file] - this option creates an output in the form of comma-separated values. The output can be
passed into an optional file or to the screen itself if no file is identified.
● --engine=engine_name, -e engine_name - this option specifies the storage engine to use when creating tables.
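An invocation exercising several engines at the concurrency levels discussed earlier might look like this (a sketch):

```shell
mysqlslap --user=root --password --auto-generate-sql \
  --concurrency=1,4,16,64,256 --iterations=10 \
  --engine=myisam,archive,innodb
```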
Similar to the previous example, 5 runs of the application will be executed, but this time in three groups. The
first group contains the results for the MyISAM storage engine, the second group the
results for the ARCHIVE storage engine, and the third and last group the results using the InnoDB
storage engine. The following graph represents how the number of concurrent clients, storage engines, and
number of iterations may affect the average response time of the MySQL server:
[Graph: Multiple Storage Engines - average response time (seconds) versus concurrent operations (1, 4, 16, 64, 256) for 10 iterations of random INSERT and SELECT statements against the server, with separate curves for MyISAM, ARCHIVE, and InnoDB.]
Appendix F gives a detailed step-by-step approach to configuring and executing the SuperSmack application.
Please refer to that section if you wish to pursue using the application.
3.2.3 MyBench
This benchmarking tool handles the details of spawning clients along with gathering and computing statistics. It is a
Perl-based system that is hard on memory and should be run on a different machine from the MySQL server that it is
testing.
Perl Modules
The MyBench.pm module contains the common logic. To use MyBench, the following Perl modules need to be
installed:
● DBI
MyBench is useful for running a benchmark with complicated logic, such as a script that processes a
number of statements inside a transaction or relies on inline SQL variables. MyBench is the more flexible
benchmarking tool for meeting these needs.
3.2.4 SysBench
SysBench is a modular, cross-platform and multi-threaded benchmark tool for evaluating O/S parameters that are
important for a system running a database under intensive load. The idea of this benchmark suite is to quickly get an
impression about system performance without setting up complex database benchmarks or even without installing a
database at all.
The current SysBench feature set allows for testing the following system parameters:
● File I/O performance
● Scheduler performance
Test Modes
SysBench can be utilized to run a number of different test modes. The following is a list of the modes that
can be benchmarked:
● cpu - In this mode, each request consists of the calculation of prime numbers using 64-bit integers. Each thread
executes the requests concurrently until either the total number of requests or the total execution time
exceeds the limits specified with the common command-line options.
● threads - This test mode was written to benchmark scheduler performance, more specifically the cases when
a scheduler has a large number of threads competing for some set of mutexes.
● mutex - This test mode was written to emulate a situation when all threads run concurrently most of the
time, acquiring the mutex lock for only a short period of time (incrementing a global variable). So the
purpose of this benchmark is to examine the performance of mutex implementation.
● memory - This test mode can be used to benchmark sequential memory reads or writes. Depending on
command line options each thread can access either a global or a local block for all memory operations.
● fileio - This test mode can be used to produce various kinds of file I/O workloads. At the prepare stage
SysBench creates a specified number of files with a specified total size. Then at the run stage, each thread
performs specified I/O operations on this set of files.
● oltp - This test mode was written to benchmark real database performance. At the prepare stage a table is
created in the specified database (sbtest by default). Then this table is filled with a specified number of
rows. Then the benchmark is run with a number of client threads, limiting the total number of requests by a
defined number.
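A typical oltp run follows the prepare/run/cleanup pattern (SysBench 0.4 syntax; the credentials and sizes are illustrative):

```shell
# Create and populate the sbtest table
sysbench --test=oltp --mysql-user=root --mysql-password=secret \
  --oltp-table-size=1000000 prepare

# Run the benchmark with 16 client threads
sysbench --test=oltp --mysql-user=root --mysql-password=secret \
  --num-threads=16 --max-requests=10000 run

# Drop the test table when finished
sysbench --test=oltp --mysql-user=root --mysql-password=secret cleanup
```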
mysqld writes statements to the query log in the order that it receives them. This may be different from the order
in which they are executed. To enable the general query log, add --log[=file_name] to the startup
configuration file. If no file_name value is given, the default name is host_name.log in the data directory.
mysqld writes a statement to the slow query log after it has been executed and after all locks have been released.
Log order may be different from execution order. To enable the slow query log, add --log-slow-
queries[=file_name] to the startup configuration file. If no file_name value is given, the default is the
name of the host machine with a suffix of -slow.log. If a filename is given, but not as an absolute pathname,
the server writes the file in the data directory.
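In a my.cnf (or my.ini) startup configuration file, the two log options above might appear as follows (the paths are illustrative):

```ini
[mysqld]
log              = /var/log/mysql/general.log
log-slow-queries = /var/log/mysql/slow.log
```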
Queries that do not use indexes are logged in the slow query log if the --log-queries-not-using-indexes
option is specified. In addition, the --log-slow-admin-statements server option enables logging of slow
administrative statements such as OPTIMIZE TABLE, ANALYZE TABLE, and ALTER TABLE to the slow query
log.
Queries handled by the query cache are not added to the slow query log, nor are queries that would not benefit from
the presence of an index because the table has zero rows or one row.
Binary Log
The binary log, which replaces the earlier update log, is primarily used for recovery and replication; however, it also
helps performance tuning by providing a record of all changes that were made to the database.
Running the server with the binary log enabled makes performance about 1% slower; however, the advantages
to recovery, replication and performance monitoring far outweigh the performance drain. To enable the binary
log, add --log-bin[=base_name] to the startup configuration file. It is recommended that a basename
be specified.
Error Log
The error log keeps a record of major events such as server start/stop, as well as any serious errors. In addition, if
mysqld notices a table that needs to be automatically checked or repaired, it writes a message to the error log.
NOTE: Naming the Error Log
To specify where mysqld stores the error log, add --log-error[=file_name] to the configuration file.
If no file_name value is given, mysqld uses the name host_name.err and writes the file in the data
directory. If a FLUSH LOGS command is executed, the error log is renamed with the suffix -old and mysqld
creates a new empty log file. No renaming occurs if the --log-error option is not given.
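The binary and error logs described above might be enabled in the configuration file like this (the basename and path are illustrative):

```ini
[mysqld]
log-bin   = mysql-bin
log-error = /var/log/mysql/mysqld.err
```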
3.3.2 mysqladmin
This monitoring tool provides a host of administrative and diagnostic functions for maintaining and tuning the MySQL
database server. mysqladmin is executed from the operating system's command prompt and is located in the bin
directory under the root path of the MySQL server. The following is a list of some of the more important performance
monitoring and management mysqladmin commands:
● extended-status – This command provides a listing of server status variables from the server. The list is
lengthy and provides a great deal of information that can be used for monitoring the performance of the server.
Within the MySQL command line client, SHOW STATUS is equivalent to mysqladmin extended-
status.
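For example (prompting for the password):

```shell
mysqladmin --user=root --password extended-status
```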
An alternative to running SHOW PROCESSLIST within the MySQL client is to use "mysqladmin
processlist" in the operating system command prompt. This can be useful for setting up some type of
automated monitoring of the MySQL server.
● SHOW STATUS – SHOW STATUS provides server status information. This information can also be obtained
using the "mysqladmin extended-status" command. With the GLOBAL modifier, SHOW STATUS
displays the status values for all connections to MySQL. With SESSION, it displays the status values for the
current connection.
● SHOW VARIABLES - SHOW VARIABLES provides system variables information. This information can also
be obtained using the "mysqladmin variables" command. With the GLOBAL modifier, SHOW
VARIABLES displays the values that are used for new connections to MySQL. With SESSION, it displays the
system variables values for the current connection.
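For example, the output of both statements can be narrowed with a LIKE pattern (a sketch):

```sql
SHOW GLOBAL STATUS LIKE 'Threads%';
SHOW GLOBAL VARIABLES LIKE 'max_connections';
```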
4 SCHEMA/DATABASE DESIGN
[ERD: Instructor (Last Name, First Name, Phone) teaches Course (Name, Length, Level); Students attend Course, with an "Enough seats?" check against Classroom.]
An ERD will draw out possible many-to-many relationships, ambiguities in the data, the need for additional
entities or attributes, and the relationship level required to enforce business rules.
● Unified Modeling Language - The Unified Modeling Language (UML) consists of many components and structures
that are designed to display relationships between "data" in large systems. This data can be anything from
resources to processes, as well as attributes in a database. For the purposes of relational database structuring, only
a small subset of UML is utilized. The following is an example of a schema diagrammed with UML:
[UML diagram: Instructor (InstructorID, LastName, FirstName, Phone) Teaches ► Course (CourseID, CourseCode, CourseName, Category), with multiplicity 0..* to 1..1.]
MySQL Workbench is a visual database design tool that gives database users the ability to integrate database
design, modeling, creation and maintenance into a single, seamless environment. This tool can be downloaded
from http://dev.mysql.com/downloads/
4.3 Normalization
Normalization is a design technique that is widely used as a guide in designing relational databases. Normalization is
essentially a two-step process that puts data into tabular form by removing repeating groups and then removing duplicated
data from the relational tables. The purposes of normalization are:
● Eliminate redundant data – Removing entries of repeated data is a key to the success of normalization and
ensures integrity by keeping individual data in one location, making it easier to update and manage.
● Eliminate columns not dependent on key – Columns that are not dependent on a key can get disconnected
from data and cause data corruption. By having all data connected to a key identifier, data integrity is ensured.
● Isolate independent multiple relationships – With relationships of data across multiple tables, normalization
Advantages
Data normalization can be an expensive process in time and resources; however, besides the advantages derived from the
purposes of normalization, there are other advantages:
● ER diagrams – Developing ERDs is much easier when the data is normalized, making the relationships easily
identifiable.
● Compact – By having the data normalized it is easier to modify a single object property. If an attribute of a table
had student genders labeled as boy or girl and this was identified as incorrect labeling of students, it would be
easy to change boy to male and girl to female if there is a normalized table called gender.
● Joins – Joins can be expensive, and proper normalization can improve relationships between tables and
minimize the amount of data having to be searched.
● Optimizer – With normalization, the optimizer's choices for selection are limited, and it is easier for the optimizer
to pick the best solution.
● Updates - It is easier to update in one location versus multiple locations.
Disadvantages
Even with the numerous advantages associated with normalization, there are some disadvantages:
● Numerous tables - Multiple tables must be accessed for user reports and other user data.
● Maintenance - Maintenance may be in some conflict with the business processes.
4.3.1 Anomalies and Ideal Fields
In designing a database, there are times when anomalies will be evident within table structures based on the fields within
those tables. Some of the anomalies that can occur from poorly designed table structures are duplicate data, redundant
data and difficulties in using the data. When normalizing, it is imperative to ensure that any potential anomalies are
caught ahead of time. For example, redundancy in a database can cause what are called update anomalies. Update
anomalies are problems that arise when information is inserted, deleted, or updated.
● INSERT - There are times that the database design will force one table to have associated data in another table
to actually insert the desired data. Example: a customer could be prevented from being entered into a database
system without first ordering a product.
● DELETE - When deleting data from one table, associated data in other tables may be deleted. Example: inventory statistics and values could be affected when a supplier row is deleted (e.g., the supplier goes out of business) and the associated tables containing inventory from that supplier are deleted.
● UPDATE - This occurs when one field, or multiple associated fields, require an update, causing multiple associated rows to require updates as well. Example: a supplier moves to a larger facility and their address needs to be updated in multiple tables, creating the potential for one or more tables to be missed. Proper normalization prevents this from being a problem.
Eliminating these anomalies is a central advantage of normalization, and a central idea in normalization is using the correct fields in each table. The best way to detect fields that may lead to these anomalies is to ensure that each field meets the following criteria:
Inventory
sID sLoc sPostal pID1 pName1 pQty1 pID2 pName2 pQty2
1 Holtsville 00501 1 bed 15 2 chair 4
2 Waukesha 53146 1 bed 4 3 table 6
3 Waukesha 53146 2 chair 8 4 sofa 4
4 Ketchikan 99950 2 chair 24 4 sofa 10
Normalizing to first normal form (1NF) produces the Inventory_1NF table, shown below.
This table is now in first normal form; however, there is still a level of redundancy that can be removed through further normalization. Each row contains the supplier identification number (sID), the supplier location (sLoc), and the supplier postal code (sPostal), when all that is really needed is the supplier's identification number. This normalization level can result in inconsistencies and data integrity issues during updates and deletes: deleting all the rows associated with a supplier can result in the supplier's information being completely erased, and when supplier information changes, every record associated with the supplier must be updated.
● Second Normal Form (2NF) – A table at this level of normalization has been normalized at the first level (1NF), plus every non-key (supporting) value is dependent on the primary key value. The latter constraint means that a non-key value must depend on every column in a composite primary key.
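The 2NF split described in this bullet might be sketched as follows; the course does not show the DDL, so the column types here are assumptions:

```sql
-- Sketch of the 2NF decomposition of Inventory_1NF.
CREATE TABLE Supplier_2NF (
  sID     INT         NOT NULL PRIMARY KEY,
  sLoc    VARCHAR(40) NOT NULL,
  sPostal CHAR(5)     NOT NULL
);

CREATE TABLE Part_1NF (
  sID   INT         NOT NULL,
  pID   INT         NOT NULL,
  pName VARCHAR(40) NOT NULL,  -- depends only on pID, so this table stays in 1NF
  pQty  INT         NOT NULL,
  PRIMARY KEY (sID, pID)
);

-- Populate the new tables from the 1NF table:
INSERT INTO Supplier_2NF SELECT DISTINCT sID, sLoc, sPostal FROM Inventory_1NF;
INSERT INTO Part_1NF     SELECT sID, pID, pName, pQty       FROM Inventory_1NF;
```

The DISTINCT in the first INSERT is what collapses the repeated supplier rows into one row per supplier.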
Inventory_1NF
sID sLoc sPostal pID pName pQty
1 Holtsville 00501 1 bed 15
1 Holtsville 00501 2 chair 4
2 Waukesha 53146 1 bed 4
2 Waukesha 53146 3 table 6
3 Waukesha 53146 2 chair 8
3 Waukesha 53146 4 sofa 4
4 Ketchikan 99950 2 chair 24
4 Ketchikan 99950 4 sofa 10
The Inventory_1NF table has been separated into two distinct tables: Supplier_2NF and Part_1NF. The Supplier_2NF table has one primary key (sID) with two supporting columns (sLoc and sPostal) that are dependent on the primary key, thus making it a table in second normal form. The Part_1NF table, on the other hand, has a composite primary key (made up of sID and pID), but the supporting column (pName) is not dependent on both parts of the composite key (it depends only on pID), thus leaving this table in first normal form.
● Third Normal Form (3NF) – A table at this level of normalization has been normalized so that every non-key (supporting) value depends solely on the primary key and not on some other non-key (supporting) value.
Part_1NF
sID pID pName pQty
1 1 bed 15
1 2 chair 4
2 1 bed 4
2 3 table 6
3 2 chair 8
3 4 sofa 4
4 2 chair 24
4 4 sofa 10

Normalizing to 3NF splits this into:

Part_3NF
sID pID Qty
1 1 15
1 2 4
2 1 4
2 3 6
3 2 8
3 4 4
4 2 24
4 4 10

PartName_3NF
pID pName
1 bed
2 chair
3 table
4 sofa
In the case of the Part_1NF table, when an attempt is made to normalize it into second normal form, it turns out that the result is already in third normal form. Due to the lack of complexity in the initial table (Part_1NF), the second normal form was essentially passed over, and the two resulting tables (Part_3NF and PartName_3NF) meet all the criteria of third normal form.
Supplier_2NF
sID sLoc sPostal
1 Holtsville 00501
2 Waukesha 53146
3 Waukesha 53146
4 Ketchikan 99950

Normalizing to 3NF splits this into:

SupplierPostal_3NF
sPostal sLoc
00501 Holtsville
53146 Waukesha
99950 Ketchikan

Supplier_3NF
sID sPostal sName
1 00501 KidzRooms
2 53146 Zurniture, Inc.
3 53146 MyRoom
4 99950 Furventure, Inc.
Due to the transitive dependency in the Supplier_2NF table (sLoc is dependent on both sID and sPostal), the
table could not be considered third normal form. By normalizing that table into two separate third normal form
tables (SupplierPostal_3NF and Supplier_3NF), updates in the future will be less problematic.
1 to Many Relationships
One of the positive results of normalizing a database is that the resulting relationship between two tables can be a 1 to Many relationship. This relationship ensures an efficient approach to updates, deletes, and alterations to existing data. The following is the complete view of the normalization that was completed on the original Inventory table:
Inventory
sID sLoc sPostal pID1 pName1 pQty1 pID2 pName2 pQty2
1 Holtsville 00501 1 bed 15 2 chair 4
2 Waukesha 53146 1 bed 4 3 table 6
3 Waukesha 53146 2 chair 8 4 sofa 4
4 Ketchikan 99950 2 chair 24 4 sofa 10
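The 1 to Many relationships produced by this normalization can be exercised with a join; this sketch assumes the 3NF tables shown earlier in this section:

```sql
-- One supplier row relates to many part rows (1:N) via sID.
SELECT s.sID, s.sName, p.pID, pn.pName, p.Qty
FROM   Supplier_3NF     s
JOIN   Part_3NF         p  ON p.sID  = s.sID
JOIN   PartName_3NF     pn ON pn.pID = p.pID
ORDER  BY s.sID, p.pID;
```

Each supplier appears once in Supplier_3NF but can match many rows in Part_3NF, which is exactly the 1 to Many shape that keeps updates confined to one location.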
4.4 Denormalization
There are times when a normalized database is considerably slowed by the large number of tables that must be queried. One of the most common ways to optimize the performance of a normalized database is to add redundant data. This process is known as denormalizing a database.
Denormalizing a Database
The process of denormalizing a database begins with completely normalizing the database, thus removing all redundant data. Once a database is normalized, working backward to denormalize it is easier, which ensures that the database is denormalized only to the extent needed to improve performance. Examples of denormalization techniques include:
● Star Schemas – A central fact table references the dimension tables:

CourseDimension: CourseID, CourseCode, CourseName, Category, SubCategory, CourseMaterial, Duration
ScheduleDimension: ScheduleID, ScheduleCode, CostCode, StartDate, Location, Instructor
SalesFact: CourseID (FK), ScheduleID (FK), SalesDollars
The fact table (SalesFact) that combines the two dimension tables (CourseDimension and ScheduleDimension) is the center of the “star” approach to schema design and provides the facts that could be used in general reports, thus eliminating the need to join the two dimension tables to retrieve that same information. In this case there is only one measure (SalesDollars) available in the fact table, but more could easily be added.
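The star schema above might be declared as follows; the column types are assumptions, since the course shows only the column names:

```sql
-- Sketch of the star schema: two dimension tables and one fact table.
CREATE TABLE CourseDimension (
  CourseID       INT PRIMARY KEY,
  CourseCode     VARCHAR(10),
  CourseName     VARCHAR(60),
  Category       VARCHAR(30),
  SubCategory    VARCHAR(30),
  CourseMaterial VARCHAR(60),
  Duration       INT
);

CREATE TABLE ScheduleDimension (
  ScheduleID   INT PRIMARY KEY,
  ScheduleCode VARCHAR(10),
  CostCode     VARCHAR(10),
  StartDate    DATE,
  Location     VARCHAR(40),
  Instructor   VARCHAR(40)
);

CREATE TABLE SalesFact (
  CourseID     INT NOT NULL,
  ScheduleID   INT NOT NULL,
  SalesDollars DECIMAL(10,2),
  FOREIGN KEY (CourseID)   REFERENCES CourseDimension (CourseID),
  FOREIGN KEY (ScheduleID) REFERENCES ScheduleDimension (ScheduleID)
);
```

A report query then touches the narrow fact table and joins out to a dimension only for the attributes it actually needs.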
● Snowflake Schemas – This is another hierarchical approach to designing a database, but it breaks the dimension tables (CourseDimension and ScheduleDimension) out into more tables that further normalize the database. In the following example, the original CourseDimension table is normalized with the addition of another table.
[Diagram: the CourseDimension and ScheduleDimension tables, with CourseDimension broken out into an additional normalized table.]
This is a more normalized structure, but it leads to more difficult queries and slower response times. The normalization saves space; however, the dimension tables usually hold only about 1% of the records, so the space savings from normalizing, or snowflaking, are negligible.
NOTE: Dimension vs. fact tables
While the dimension tables are made up of multiple columns and hold large amounts of data in a minimal
number of rows, the fact tables are generally small in the number of columns but have large amounts of data in
a large number of rows.
• A foreign key field type should be identical to its connecting primary key field (e.g., if a primary key is an INT field type, all foreign key field types associated with that primary key should also be INT).
• It is always best to consider which data types and character sets will minimize storage (disk I/O).
• Whether to use fixed- or variable-length strings depends on the length distribution of the values.
• For multi-byte character sets, always consider variable-length data types (fixed-length columns always use the maximum storage).
• Different server SQL modes will affect the data types that can be used and the validity of data entry for specific data types.
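The first bullet can be illustrated with a minimal sketch; the table names here are hypothetical:

```sql
-- Matching key types: the child column mirrors the parent key's INT UNSIGNED.
CREATE TABLE parent (
  id INT UNSIGNED NOT NULL PRIMARY KEY
) ENGINE=InnoDB;

CREATE TABLE child (
  id        INT UNSIGNED NOT NULL PRIMARY KEY,
  parent_id INT UNSIGNED NOT NULL,      -- identical type to parent.id
  FOREIGN KEY (parent_id) REFERENCES parent (id)
) ENGINE=InnoDB;
```

Matching the types exactly avoids implicit conversions during joins and keeps index comparisons cheap.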
To specify bit values, b'value' notation can be used. A bit value is a binary value written using zeros and ones. For example, b'111' and b'10000000' represent 7 and 128, respectively. If a value assigned to a BIT(M) column is less than M bits long, the value is padded on the left with zeros. For example, assigning a value of b'101' to a BIT(6) column is, in effect, the same as assigning b'000101'.
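A short sketch of this padding behavior, using a hypothetical table:

```sql
-- BIT(6) padding: b'101' is stored as b'000101'.
CREATE TABLE bit_demo (flags BIT(6));
INSERT INTO bit_demo VALUES (b'101');

-- BIT values display as binary strings; add 0 to read them as integers.
SELECT flags + 0 AS as_int, BIN(flags + 0) AS as_bits FROM bit_demo;
-- as_int is 5: the left-padded zeros do not change the numeric value.
```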
4.5.3 Temporal Data Types
Date and time data types are referred to as temporal data types. The following is a chart of the temporal data types and
their associated attributes:
TEXT – Storage: length of content in bytes, plus 2 bytes to store the length. Variable-length string. Max size: 65,535 characters. If a maximum length (M) is entered, MySQL will use the smallest TEXT data type that will support the values entered.
MEDIUMTEXT – Storage: length of content in bytes, plus 3 bytes to store the length. Variable-length string. Max size: 16,777,215 characters.
LONGTEXT – Storage: length of content in bytes, plus 4 bytes to store the length. Variable-length string. Max size: 4,294,967,295 characters.
ENUM('VALUE1', 'VALUE2', ...) – Storage: 1 byte for up to 255 values, 2 bytes for 256 to 64K values. An enumeration column can have a maximum of 64K distinct values. ENUM values are represented internally as integers.
Above, City uses a VARCHAR(32) for the Name field. If we execute a query that does an ORDER BY Name, we
can watch the Sort* status variables to see whether the sort buffer overflowed and if so, how many passes were
required:
Above, we see the simple ORDER BY Name operation required a single sort operation, with no extra pass. This is the ideal situation (short of adding indexes to remove the sort step altogether). But what happens if we make the Name field size excessive? Make the VARCHAR(32) into a VARCHAR(1000):
mysql> alter table City change Name Name varchar(1000) not null default '';
Query OK, 4079 rows affected (0.06 sec)
Records: 4079 Duplicates: 0 Warnings: 0
This time, the same query caused a sort buffer overflow solely because we wasted space in the Name field. MySQL was forced to allocate more memory and perform a second sort pass. In a production situation, this problem could easily multiply until MySQL runs out of memory.
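The Sort* check described above can be reproduced with a sequence like the following, assuming the world database's City table used throughout this section:

```sql
-- Reset the session counters, run the sort, then inspect the results.
FLUSH STATUS;
SELECT Name FROM City ORDER BY Name;
SHOW SESSION STATUS LIKE 'Sort%';
-- A Sort_merge_passes value above 0 indicates the sort buffer overflowed
-- and extra merge passes were required.
```

Comparing the counters before and after the ALTER TABLE to VARCHAR(1000) makes the effect of the oversized column directly visible.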
4.6 Partitioning
Controlling the physical aspects of data storage is an important aspect of performance tuning, because data access speed is an important bottleneck for any database server. Traditionally, MySQL allows control of the storage method on a per-table basis through the choice of storage engines, and of course the operating system and/or system hardware can be custom configured. Partitioning refers to breaking a single table up into discrete segments, which can then be distributed across different file systems and/or hardware. It can lead to significant performance gains.
Partition Types
The partitioning function for a table can divide data up between partitions in different ways:
● RANGE - Assigns rows to partitions based on column values falling within a given range.
○ Suits queries that access ranges of records, particularly contiguous ranges (WHERE …
BETWEEN ...AND...). A common example is web applications doing frequent display of paginated
data.
○ Suits things like date-based partitioning, for example grouping records by year or month. This is particularly useful for archiving old data from large tables: instead of issuing a slow and expensive DELETE FROM … WHERE … < 2001, one might simply issue ALTER TABLE ... DROP PARTITION ...
The query will touch all 4079 rows. Now review the same table scan on the partitioned City_Test table:
Because the query filters by a column included in the partitioning function, the MySQL Optimizer knows that only
the third partition needs to be scanned, and that greatly reduces the number of rows that need to be touched.
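The partitioned City_Test table itself is not reproduced in this text; the following is a plausible sketch of how such a table could be declared and how pruning can be observed (the partition boundaries are assumptions):

```sql
-- RANGE-partitioned copy of the City table's key columns.
CREATE TABLE City_Test (
  ID   INT      NOT NULL,
  Name CHAR(35) NOT NULL
)
PARTITION BY RANGE (ID) (
  PARTITION p0 VALUES LESS THAN (1000),
  PARTITION p1 VALUES LESS THAN (2000),
  PARTITION p2 VALUES LESS THAN (3000),
  PARTITION p3 VALUES LESS THAN MAXVALUE
);

-- EXPLAIN PARTITIONS shows which partitions the optimizer will scan;
-- a filter on the partitioning column prunes the rest.
EXPLAIN PARTITIONS
SELECT * FROM City_Test WHERE ID BETWEEN 2100 AND 2200;
```

Because the WHERE condition constrains the partitioning column, the partitions column of the output lists only the partition containing that range.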
5 INDEXING
Index Issues
Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows.
● Speed versus maintenance – Indexes help to speed up data retrieval but are expensive to maintain.
● Slower writes – As indexes are added, the time to write data increases, because index integrity must be maintained.
● Index selectivity - The more selective an index is, the more benefit is obtained from using it.
● Duplicated data - Highly duplicated data should not be indexed (for example, boolean data types, and columns that represent gender, state abbreviations, or country codes). For an index to be more efficient than a complete table scan, the indexed column generally needs more than three distinct values.
● UNIQUE indexes – A UNIQUE index creates a constraint such that all values in the index must be distinct. An error occurs if a new row of data is added with a key that matches an existing row. If a column's values are unique, use the UNIQUE clause to force the column to be distinct, which in turn also improves the efficiency of the indexing for that column.
● Dead indexes - Avoid indexes that are never used by any queries. These cause unnecessary overhead, and removing them will improve overall efficiency, especially during updates and inserts.
● Duplicate indexes - Avoid more than one index on the same column(s). The optimizer must determine which one to use, and there is more maintenance work as the data changes.
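A minimal sketch of the UNIQUE indexes point, using a hypothetical table:

```sql
-- A UNIQUE index both enforces distinctness and gives the optimizer
-- a maximally selective index.
CREATE TABLE account (
  accountID INT          NOT NULL PRIMARY KEY,
  email     VARCHAR(100) NOT NULL,
  UNIQUE KEY idx_email (email)
);

INSERT INTO account VALUES (1, 'a@example.com');
INSERT INTO account VALUES (2, 'a@example.com');
-- The second INSERT fails with a duplicate-key error,
-- because idx_email requires every email value to be distinct.
```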
B-Tree
This widely used data structure emulates a tree structure with a set of linked nodes. There are four common terms for the nodes that tree indexes can contain:
● Root node – This is the starting node for the tree, all nodes within a tree will be connected to this node,
either directly or indirectly.
[Diagram: a B-Tree. The root node (non-leaf) holds the key values 20, 40, and 60 and has pointers to child leaf nodes; the leaf nodes point to the actual data pages.]
HASH
A hash is simply a key/value pair, and a hash index is a collection of those key/value pairs. The hash index works efficiently by taking the lookup key, transforming it with a hash function into a hash number, and then using that number with the associated hash table to locate the desired value.
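The MEMORY storage engine uses HASH indexes by default; a minimal sketch with hypothetical names:

```sql
-- USING HASH makes the default index type of MEMORY tables explicit.
CREATE TABLE session_cache (
  session_id CHAR(32) NOT NULL,
  user_id    INT      NOT NULL,
  PRIMARY KEY USING HASH (session_id)
) ENGINE=MEMORY;

-- An equality lookup hashes the key directly to its bucket:
SELECT user_id FROM session_cache WHERE session_id = 'abc123';
```

Hash indexes excel at exact-match lookups like this but, unlike tree indexes, cannot be used for range scans or ORDER BY.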
[Diagram: a hash lookup, with the key 'Gunter Greese' hashing to bucket 991 among adjacent buckets 990 through 992.]
Action: After logging into the MySQL server as root, type the following in the mysql client:
SOURCE /lab/scripts/pt_stored_procedures.sql
Effect: The pt_stored_procedures.sql file loads in all the stored procedures that will be used throughout the PT course. Change the location of this file from your home directory (~/) to whichever directory this file is located in.
Action: Create a table called city_memory_huge with approximately 20,000 records by executing the following
SQL statement:
CALL create_city_memory_huge(5);
Effect: This stored procedure will create a table called city_memory_huge that is based on the City table but uses the MEMORY storage engine along with other indexing options. Once the city_memory_huge table is created, the stored procedure will load in the records from the City table 5 times, creating approximately 20,000 records in the new table.
Action: Issue the following command to view the structure of the city_memory_huge table.
Effect: The response shows that there is one B-Tree index and one HASH index on the Name column.
Step 2. Run A - Benchmark using a B-Tree index
Action: Issue the following command in an O/S terminal window (command line) to run the mysqlslap
benchmarking tool against the city_memory_huge table:
If you wish to track how long it takes to actually run the mysqlslap executable, precede the executable with the
word time (such as time mysqlslap ...).
Effect: This will execute a pre-defined SQL file, containing multiple SELECT statements using the B-Tree index,
against the city_memory_huge table.
Record the number of seconds it takes to run all the SELECT statements (Run A):
Action: Issue the following command in an O/S terminal window (command line) to run the mysqlslap
benchmarking tool against the city_memory_huge table:
Effect: This will execute a pre-defined SQL file, containing multiple SELECT statements using the HASH index,
against the city_memory_huge table.
Record the number of seconds it takes to run all the SELECT statements (Run B):
Action: Issue the following command in an O/S terminal window (command line) to run the mysqlslap
benchmarking tool against the city_memory_huge table:
Effect: This will execute a pre-defined SQL file, containing multiple SELECT statements not using any index,
against the city_memory_huge table.
Record the number of seconds it takes to run all the SELECT statements (Run C):
Action: Issue the following command in an O/S terminal window (command line) to run the mysqlslap
benchmarking tool against the city_memory_huge table:
Effect: This will execute a pre-defined SQL file, containing multiple SELECT statements using the B-Tree index
along with range searches, against the city_memory_huge table.
Record the number of seconds it takes to run all the SELECT statements (Run D):
Action: Issue the following command in an O/S terminal window (command line) to run the mysqlslap
benchmarking tool against the city_memory_huge table:
Effect: This will execute a pre-defined SQL file, containing multiple SELECT statements using the HASH index
along with range searches, against the city_memory_huge table.
Record the number of seconds it takes to run all the SELECT statements (Run E):
Action: Issue the following command in an O/S terminal window (command line) to run the mysqlslap
benchmarking tool against the city_memory_huge table:
Effect: This will execute a pre-defined SQL file, containing multiple SELECT statements not using any index along
with range searches, against the city_memory_huge table.
Record the number of seconds it takes to run all the SELECT statements (Run F):
Step 9. Clean up
Action: Clean up the world database by issuing the following command to delete the city_memory_huge
table:
Effect: The city_memory_huge table has been deleted from the world database.
B+TREE
The B+Tree index is a data structure for storing vast amounts of information and is the next level of B-Tree. Typically, B+Trees are used to store amounts of data that will not fit in main system memory. To do this, secondary storage (usually disk) is used to store the leaf nodes.

[Diagram: a B+Tree with index nodes (i-nodes) holding the keys 3 and 5, and leaf nodes holding the key values 1 through 7.]
FULLTEXT
The MySQL server supports full-text indexing and searching. A full-text index in MySQL is an index of type FULLTEXT and can be used only with MyISAM tables. Full-text indexes can be created for CHAR, VARCHAR, or TEXT columns.
When displaying index status information for a specific table with SHOW INDEX FROM ..., the following
values will be displayed in the index_type field:
• BTREE - This value will be displayed for indexes that use B-Tree or B+Tree indexes.
• HASH - This value will be displayed for indexes that use Hash indexing.
• FULLTEXT - This value will be displayed for indexes that use Full-text indexing.
Index types supported by storage engine:

Tree index: MyISAM, InnoDB, MEMORY, NDB
Hash index: MEMORY, NDB
Full-text index: MyISAM
Quiz
In this exercise you will answer the following questions pertaining to indexing.
1. What are the differences between B-Trees and B+Trees?
__________________________________________________________________________________
__________________________________________________________________________________
2. What are the differences between tree indexes and hash indexes?
__________________________________________________________________________________
__________________________________________________________________________________
6 STATEMENT TUNING
A derived table is a special type of subquery that appears in the FROM clause of a query, as opposed to the
SELECT or WHERE clauses. Derived tables can also be referred to as virtual tables or inline tables. When a
derived table is used in a query, it should be provided a unique name (this can be accomplished with the AS
clause) that will act as the table name:
SELECT … FROM ( subquery ) AS unique_name
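For instance, a derived table aggregating the world database's City table might look like this; the alias name is arbitrary:

```sql
-- The sub-query in the FROM clause is the derived table; MySQL requires
-- that it be given an alias (here via AS).
SELECT big.CountryCode, big.city_count
FROM (
  SELECT CountryCode, COUNT(*) AS city_count
  FROM   City
  GROUP  BY CountryCode
) AS big
WHERE big.city_count > 100;
```

The outer query then treats `big` exactly as if it were a real table with columns CountryCode and city_count.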
● type – This column describes the access strategy deployed by MySQL to get at the data in the table or
index in the row. The access strategy type used by MySQL to execute the SELECT statement is a lengthy
and worthwhile discussion for improving performance on the MySQL server and will be presented in greater
detail later in the chapter.
● possible_keys – This column provides the available indexes (or NULL if there are none available) that
MySQL had to choose from in evaluating the access strategy for the table that the row describes. When this
field does not display any indexes, or minimal indexes, associated with the SELECT statement source data, it
is worthwhile to consider developing indexes to improve the performance of this query and additional
queries that will be executed on the source data.
● key - This column displays the actual key chosen to perform the data access (or NULL if none was available). Typically, when diagnosing a slow query, this should be the first place reviewed to ensure that MySQL is using an appropriate index.
● key_len - This column provides the length, in bytes, of the key chosen. The length is NULL if the key column says NULL. The number, if one is displayed, is often very useful in diagnosing whether a key’s length is hindering a SELECT statement’s performance, and helps in determining how many parts of a multiple-part key MySQL is using.
● ref - This column displays the columns within the key chosen that will be used to access data in the table, or a
constant, if the join has been optimized away with a single constant value. For instance, the following...
When a query is sent to MySQL, the optimizer sets about reorganizing it to match the best query execution plan. This might include steps like rearranging the table join order. The optional EXTENDED keyword causes EXPLAIN to output the final query form as an SQL warning, which can then be viewed with SHOW WARNINGS. This can sometimes be useful to gain insight into how the optimizer is treating problem queries.
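A sketch of that workflow, assuming the world database (EXPLAIN EXTENDED applies to the MySQL 5.x releases this course covers):

```sql
-- Ask for the optimizer's rewritten form of the query...
EXPLAIN EXTENDED
SELECT Name FROM City WHERE CountryCode = 'USA'\G

-- ...then read it back from the warning generated above.
SHOW WARNINGS\G
-- The Message column of the warning contains the reorganized statement
-- the optimizer actually plans to execute.
```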
● DEPENDENT SUBQUERY- Identifies a sub-query that contains a reference to a table that also appears in the
outer query (also known as a correlated sub-query).
● DERIVED- Identifies that the FROM clause contains a sub-query.
NOTE: Complicated select_type values
The more complicated select_type values will affect the performance of the server as it attempts to execute the statement. For select_type values that are not identified as SIMPLE or some other less taxing type, consider rewriting the statement if it is having negative effects on the overall performance of the server. If there is no other way to produce the output being requested, this output column can give insight into exactly what the SELECT statement is dealing with.
Selectivity is this number (cardinality) divided by the total number of records in the table; thus a unique index
has a selectivity of 1.0. Key distribution is a related concept describing how often each unique value occurs
within the index. So, an index with 100 unique values out of 1000 total records would have a selectivity of .10.
However, if 900 of the records contained a single value, and the other 100 records contained the other 9 unique
values, key distribution would be low, as the unique values would not be distributed evenly across the index.
Currently, the optimizer can only optimize for selectivity values, and not for key distribution values.
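Selectivity can be estimated by hand with a query like the following, shown here against the world database's City table as an assumed example:

```sql
-- Selectivity = distinct values / total rows.
SELECT COUNT(DISTINCT CountryCode) / COUNT(*) AS selectivity
FROM   City;
-- A result near 1.0 means a highly selective column worth indexing;
-- a result near 0 means highly duplicated data, where an index
-- gives little benefit over a table scan.
```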
● Sequential reads versus random seeks – Sequential reads of data on disk are relatively fast compared with reading index keys and then accessing the table data using random seeks from the index row pointers to the actual data locations. MySQL uses a threshold value to determine whether repeated seek operations will be faster than a sequential read. The threshold depends on the WHERE or ON conditions of the query along with specific storage engine values.
This EXPLAIN output column is a key performance tuning column for ensuring that MySQL has chosen an optimal path for joining the various data sets, or for determining whether the query itself needs additional fine-tuning. The following values are those that can appear in the type column of the EXPLAIN output (ordered from most efficient access to least efficient).
system
This refers to a special type of access strategy that MySQL can deploy when the SELECT statement requests data from an
in-memory table and the table has only one row of data. For example:
const
This access strategy type is used when the optimizer knows that it can get at most one row from the table, so the values from that row can be treated as constants when referenced elsewhere in the query. This can happen because a WHERE clause compares a column with a PRIMARY KEY or UNIQUE constraint against a constant, or because the table has at most one row.
For example, using a fresh copy of the world database, do a simple WHERE field comparison with a constant value:
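The original example output is not reproduced in this text; an equivalent query whose plan would use the const access type is:

```sql
-- Code is the PRIMARY KEY of Country, so comparing it with one constant
-- lets the optimizer resolve the row before execution begins.
EXPLAIN SELECT Name FROM Country WHERE Code = 'FIN'\G
-- The type column of the output reports const.
```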
But any expression that does not consist entirely of constant values known before query execution will not use const:
Because MySQL performs the lookup for constant conditions on a unique key before query execution begins, if no rows match the WHERE expression the process stops and the Extra column will contain the following statement:
Extra: Impossible WHERE noticed after reading const tables
In this case the access strategy type will contain a NULL value.
eq_ref
This access strategy type refers to a query that reads a single row from the table for each combination of rows returned from a previous data set. An eq_ref access strategy type can be used only if both of the following conditions are met:
● Keys used – All parts of a key must be used by the join in the query. This can include expressions that use
columns from tables that are read before this table or a constant.
● Unique key(s) are present – The table contains a unique, non-nullable key for which a WHERE condition
contains a single value which is present for each column in the key.
For example, using a fresh copy of the world database:
The Country table's PRIMARY KEY is the Code field. Since the primary key is also UNIQUE by definition, MySQL
knows that only a single row will be returned from the Country table for each value of CountryCode found in the
City table.
ref
This access strategy type is the same as the eq_ref access strategy type with the exception that one or more rows
that match rows returned from previous table retrieval will be read from the current table (thus making this less
efficient than the eq_ref access strategy). The ref access strategy type is selected when either of the following
occurs:
● Leftmost part of unique index is used – The JOIN (or WHERE clause) condition uses only the leftmost part
of a multicolumn key.
● Non-unique and non-null key – The key used is not unique but does not contain any null values.
For example, using a fresh copy of the world database:
ref_or_null
This access strategy type is identical to the ref access strategy type with the following exceptions:
● Null values – The key used can contain NULL values.
● OR null condition – The query contains a WHERE expression that includes an OR key_column IS NULL condition.
For example, using a fresh copy of the world database:
index_merge
This retrieval method uses more than one index for a single referenced table in the query. In an index_merge
access strategy type, multiple executions of ref, ref_or_null, or range accesses are used to retrieve key values
matching various WHERE conditions, and the results of these various retrievals are combined together to form a single
data set.
For example, using a fresh copy of the world database:
unique_subquery
This access strategy type uses a sub-query: a child query that returns a set of values to an IN clause in the WHERE condition. MySQL will use this access strategy type when the sub-query returns a list of unique values because its SELECT statement uses a unique, non-nullable index.
For example, using a fresh copy of the world database:
mysql> EXPLAIN SELECT * FROM City WHERE CountryCode IN (SELECT Code FROM
Country WHERE Continent='Asia')\G
*************************** 1. row ***************************
id: 1
index_subquery
This access strategy type is identical to the unique_subquery access strategy type except that MySQL has
determined that the values returned by the sub-query will not be unique.
NOTE: Correlated sub-queries
A correlated sub-query is a sub-query that contains a reference to a table that also appears in the outer query. Because such a sub-query cannot be reduced to a simple list of values, its WHERE clause references a table in the outer query and it is executed once for each row in the PRIMARY result set. When this occurs, derived tables should be considered as another option to remedy this potentially severe performance problem.
range
This access strategy type refers to a query whose SELECT statement involves a WHERE clause that uses any of the following operators: >, >=, <, <=, IN, LIKE, or BETWEEN. The LIKE operator can utilize the range access strategy type only if the first character of the comparison expression is not a wildcard: LIKE 'A%' can utilize the range access strategy type, but LIKE '%A' cannot. Range operations are not possible with HASH indexes.
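Why the position of the wildcard matters can be sketched with a sorted list standing in for an ordered (B-tree-like) index. This is illustrative Python; the country codes are a hypothetical sample:

```python
import bisect

# A sorted key list stands in for an ordered (B-tree-like) index.
codes = sorted(["ABW", "AFG", "AGO", "ALB", "BRA", "ZMB", "ZWE"])

def like_prefix(keys, prefix):
    """LIKE 'A%': all matches form one contiguous slice of the sorted
    keys, so only that range needs to be read (the range strategy)."""
    lo = bisect.bisect_left(keys, prefix)
    hi = bisect.bisect_left(keys, prefix[:-1] + chr(ord(prefix[-1]) + 1))
    return keys[lo:hi]

print(like_prefix(codes, "A"))   # ['ABW', 'AFG', 'AGO', 'ALB']

# LIKE '%A' (leading wildcard): matches are scattered throughout the
# index, so every key must be examined -- no contiguous range exists.
print([c for c in codes if c.endswith("A")])  # ['BRA']
```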
For example, using a fresh copy of the world database:
index
This access strategy type is chosen when MySQL does a sequential scan of all the key entries of an index. The access
type is usually seen when both the following conditions exist:
● Slow data retrieval – No WHERE clause is specified or the table utilized does not have an index that would
speed up data retrieval.
● Covering index – All columns in the SELECT list for this statement are available in the index.
For example, using a fresh copy of the world database:
"Using index" appears in the Extra column of the EXPLAIN output when the server was able to retrieve all the information it needed for the table from the index data pages alone. "Using index" can appear with many of the access strategy types, and it will always appear when the index access strategy type is shown in the type column of the EXPLAIN output. However, the index access strategy is not always a good thing, because all values of the index are being read. A sequential index scan is often quicker than a table scan, but not always. Regardless, both an index scan and a table scan are sub-optimal query execution plans.
ALL
This access strategy type is used when a sequential scan of the entire table’s data is necessary. This access type is used
if either of the following conditions exist in the query:
● No conditions on keys – No WHERE or ON condition is specified for any columns of the table’s keys.
● Poor index selectivity – When index selectivity is poor, a sequential scan is considered more efficient than
numerous index lookups.
For example, using a fresh copy of the world database:
USE INDEX, IGNORE INDEX, and FORCE INDEX affect only which indexes are used when MySQL decides
how to find rows in the table and how to do the join. They do not affect whether an index is used when resolving
an ORDER BY or GROUP BY.
STRAIGHT_JOIN
Another option for assisting MySQL with running a query in a more efficient way is to use the STRAIGHT_JOIN clause.
This clause is used when the developer believes that executing a join in a different order than MySQL intends would be
more efficient. STRAIGHT_JOIN tells the server to access tables in order from left to right as given in the
SELECT statement. This means that the first table in the FROM clause will be accessed first, then its resulting values
joined to the first joined table, and so on. In most cases, MySQL chooses the optimal join order and any change in
order will diminish efficiency; however, the ability to control how MySQL runs a query at this level is powerful
and, in rare cases, effective.
NOTE: ANALYZE TABLE
Before using a STRAIGHT_JOIN, it is best to ensure that MySQL is up to date with statistics on the tables to be
used. After running a baseline EXPLAIN to view MySQL’s chosen access strategy for the query, execute an
ANALYZE TABLE against the tables. By doing this, the tables analyzed will have their statistics updated to ensure
that MySQL uses the most efficient access strategy. Execute the EXPLAIN command again to see if any changes
took place. If not, then a STRAIGHT_JOIN may be warranted.
This time both the table scan and the file sort are skipped by using the new index on both CountryCode and
Population to locate and sort the rows in a single step.
This query is now quite efficient, and we could stop here, but we can take this one step further and add an index across all
three fields referenced in the query:
Now, not only have we still prevented the table scan and the file sort, but we have also allowed MySQL to skip touching the
table itself and draw all data from the index on CountryCode, Population, and Name.
Quiz
In this exercise, match each definition with the best EXPLAIN output term.
Term Definition
[Figure: The MySQL DBMS query compiler: the parser builds a parse tree, the preprocessor performs semantic checking and name resolution, and the optimizer applies query transformations to generate optimal query execution plans (QEPs).]
mysqld writes a statement to the slow query log after it has been executed and after all locks have been
released. Log order may be different from execution order. To enable the slow query log, start mysqld with
the --log-slow-queries[=file_name] option. If no file_name value is given, the default is the
name of the host machine with a suffix of -slow.log. If a filename is given, but not as an absolute
pathname, the server writes the file in the data directory.
○ Queries that do not use indexes are logged in the slow query log if the --log-queries-not-using-indexes
option is specified. In addition, the --log-slow-admin-statements server
option enables logging of slow administrative statements such as OPTIMIZE TABLE, ANALYZE
TABLE, and ALTER TABLE to the slow query log.
○ Queries handled by the query cache are not added to the slow query log, nor are queries that would not
benefit from the presence of an index because the table has zero rows or one row.
● General query log - The general query log is beneficial to performance tuning because it tracks all queries
that have been submitted on the MySQL server. The general query log is a general record of what mysqld
is doing and because of the additional overhead should probably be used only on development boxes. The
server writes information to this log when clients connect or disconnect, and it logs each SQL statement
received from clients. The general query log can be very useful when an error in a client is suspected and it is
beneficial to know exactly what the client sent to mysqld.
mysqld writes statements to the query log in the order that it receives them. This may be different from the order
in which they are executed. To enable the general query log, start mysqld with the --log[=file_name] or
-l [file_name] option. If no file_name value is given, the default name is host_name.log in the data
directory.
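Putting the options from the preceding paragraphs together, a my.cnf excerpt enabling both logs might look like the sketch below. The file paths and the long_query_time threshold are illustrative choices, not defaults; the option names are the MySQL 5.1-era spellings used in this guide:

```ini
[mysqld]
# Slow query log (statements slower than long_query_time seconds)
log-slow-queries = /tmp/mysql_slow.log
long_query_time  = 2
log-queries-not-using-indexes
log-slow-admin-statements

# General query log (every statement received; development use only)
log = /tmp/mysql_general.log
```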
● SHOW PROCESSLIST - The SHOW PROCESSLIST command provides a real-time list of all the connections
to the server and the actions that each one is performing. With the SUPER privilege, SHOW PROCESSLIST
provides the connection identifier, the user, the host the user is coming from, the database the user is using, and the
current command being run, along with the time and state for each open connection. Equivalent to executing
Action: Install a fresh copy of the world database to remove any indexes or other actions that may have altered the
contents of the database.
Action: Source the pt_stored_procedures.sql file to load the stored procedures that will be used in the labs
for this chapter. The SQL to accomplish this is below:
SOURCE /labs/scripts/pt_stored_procedures.sql
Action: In the mysql client, create the city_huge table with approximately 800,000 records by executing the
following SQL statement:
CALL create_city_huge(200);
Effect: This stored procedure will create a table called city_huge that is based on the City table. Once the
city_huge table is created, the stored procedure will load in the records from the City table 200 times, creating
approximately 800,000 records in the new table.
Note: A warning may appear due to the script attempting to drop a table that does not exist; this is not a problem
and should be ignored.
Action: Terminate the mysql client by executing the following SQL statement:
QUIT;
Effect: This will ensure that the MySQL server can be shut down cleanly in the next step (nothing accessing the
server while it is shutting down).
Action: Shut down the MySQL server by entering the following in an O/S terminal window:
/etc/init.d/mysql stop
Effect: This will terminate the currently running MySQL server on your system. This is an important step to
ensure that the changes to the my.cnf file will take effect.
Action: Create the /etc/my.cnf file using the vi editor (or another editor you are familiar with) and add
the content below:
Effect: These configuration options will be used to record information that will be used in the lab from this
point on.
Note: In some versions, the default for log_output is TABLE.
Action: Restart the MySQL server by entering the following in an O/S terminal window:
/etc/init.d/mysql start
Effect: This will restart the MySQL server with the configuration options added to my.cnf.
Step 3. Set up a second O/S terminal for monitoring
Action: In a new O/S terminal window, execute the following command to continually monitor the
mysql_slow.log:
tail -f /tmp/mysql_slow.log
Effect: Executing this command in a separate terminal window will allow you to switch back and forth to
monitor the current activity of the multiple processes that will be running.
Step 4. Execute mysqlslap
Action: Issue the following command in an O/S terminal window to run the mysqlslap benchmarking tool
against the city_huge table:
Effect: This will execute a pre-defined SQL file, containing multiple SELECT statements, against the
city_huge table. Continue to the next step while mysqlslap is still running.
Action: Examine the window with the tail -f running while the mysqlslap benchmark is running. This
may display queries that are running slowly and provide information similar to that displayed below:
Action: Issue the following command in an O/S terminal window to run the mysqldumpslow tool against the
/tmp/mysql_slow.log file:
mysqldumpslow /tmp/mysql_slow.log
Effect: This will display a summary of the content of the mysql_slow.log to the O/S window.
Further Practice
In this further practice, you will optimize queries that were slow in lab 6-A.
1. Create a clean version of the world database to remove any index structures that have been created.
2. Load the stored procedures that were deleted when creating a fresh copy of the world database by issuing
the following command:
SOURCE /labs/scripts/pt_stored_procedures.sql
3. Recreate the city_huge table by issuing the following command:
4. In an O/S terminal window, review the SQL statements that were slow. The following query was listed as
slow:
5. Improve the query in step 4 by including a combined index on the CountryCode and Population
columns in the city_huge table.
6. In the mysql client, execute the SQL statement from step 4 to see if the query execution time was
improved.
7. In the O/S terminal window, review another SQL statement that was slow. The following query was listed as
slow:
7 CACHING AND MYSQL
The architecture design decisions will have a huge influence on the types of caching to be implemented and
utilized. Having a proper understanding of the type of architecture that is planned provides a
better understanding of the caching that can and should be used. Not every architecture design will utilize all the
caching features or the same types of caching features; thus, it is important to have the architectural needs finalized
prior to determining which types of caches to utilize.
Scale out
Scale-out is the ability to distribute tasks across multiple machines. It is best to answer early the question of which is
better: thirty-two (32) single-CPU boxes or twenty (20) two-CPU boxes.
NOTE: Layered architecture
MySQL architecture is more layered than hierarchical. Subsystems in MySQL do not depend on each other in
order to function, although some subsystems themselves do function in a tree-like fashion.
Buffers allow processes to temporarily store input in memory until the process can execute it. Buffers are used
when there is a difference between the rate at which data is received and the rate at which it can be processed,
or when these rates are variable, as in a printer spooler. Buffers are usually handled in software rather than
hardware. Caches, on the other hand, contain the results of a previous process or the process itself.
This allows the cache to provide the result for a frequent request without having to interact with the hardware,
thus speeding up the response time. For example, when multiple processes access the same files, much of those files
can be cached to improve performance (RAM being much faster than hard drives).
Proxy cache
A proxy server is a computer that offers a computer network service to allow clients to make indirect network
connections to other network services. A client connects to the proxy server, then requests a connection, file, or other
resource available on a different server. The proxy provides the resource either by connecting to the specified server
or by serving it from a cache, which is referred to as the proxy cache. The proxy cache is the highest level of caching
and the most efficient caching in relation to performance.
Application cache
Application caches are components of the application that cache data to reduce the number of times that files and
databases have to be directly accessed. The majority of application caches have a defined expiration parameter for the
data cached to ensure that outdated information is not retrieved. In addition, many of the application caches create
dependencies between data in the cache and the data in the files the cache was populated with (such as application
caches that remove data from the cache when the original data is modified).
The MySQL query cache simplifies the execution of the application by allowing the MySQL server to manage the
caching process. However, even though the MySQL server has simplified the application by determining which
query results can be reused, the performance of the server can be affected.
Query cache
The MySQL query cache caches SQL query results and allows duplicate queries (which must be absolutely identical) to
be served the same query results from memory. It caches in MySQL server process memory and is fully transparent to the
applications accessing it. There is no invalidation control: a query is invalidated only when any involved table is
updated. This methodology works great for many read-intensive web applications.
Table cache
Due to MySQL being a multi-threaded application, there may be many clients issuing queries for a given table
simultaneously. To minimize the problem with multiple client threads having different states on the same table, the table
is opened independently by each concurrent thread. This uses additional memory but normally increases performance.
The table cache’s purpose is to maintain a thread for each connection to a table that is open. If three connections are using
the table, the table cache will maintain a minimum of three separate threads in memory.
Block/data/index cache
The implementation of the caches at the bottom of the triangle is dependent on the storage engine being used; however,
the purpose of the caches remain the same. These caches are designed to keep the most frequently accessed table blocks
in memory. In the case of an index cache, the structures usually contain a number of block buffers where the most-used
index blocks are placed. These cache types are the most resource heavy but can provide the most benefits to improve
performance.
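The "keep the most-used blocks in memory" behavior described above can be sketched with a tiny LRU structure. This is illustrative Python only; real storage engines use far more elaborate replacement policies:

```python
from collections import OrderedDict

class LRUBlockCache:
    """Tiny LRU sketch of a block/index cache: recently used blocks
    stay in memory and the least recently used block is evicted when
    the cache is full (conceptual, not a storage-engine design)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()          # block id -> block data

    def get(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)   # mark as recently used
            return self.blocks[block_id]
        return None                             # miss: would read from disk

    def put(self, block_id, data):
        self.blocks[block_id] = data
        self.blocks.move_to_end(block_id)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict least recently used
```

For example, with a capacity of two blocks, touching block "a" keeps it resident while an untouched block "b" is evicted when "c" arrives.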
[Figure: MySQL Server, showing the connection thread pool.]
The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client.
If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and
executing the statement again. The following list addresses additional query cache matters:
● Caches in MySQL server process memory.
● Fully transparent for applications.
● No invalidation control.
● Does not work with prepared statements!
● Works great for many read intensive web applications.
● The query cache can be emptied with RESET QUERY CACHE.
MySQL Query Cache Specifics
Of course, there are prerequisites and limitations regarding MySQL query cache usage, with the most important being:
● Only identical queries may be serviced from the cache. This includes spacing, text case, etc.
● Any modification to a table used by a query in the cache causes the query to be invalidated and removed from
the cache. InnoDB only causes query ejection when data is committed.
● Many functions, such as CURRENT_DATE, NOW, RAND and others, negate the use of the cache.
● No query that uses bind variables can be reused.
● No query that makes use of user defined functions can be cached.
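The first two rules above (byte-for-byte query matching and table-level invalidation) can be modeled in a few lines. This is a conceptual sketch in illustrative Python, not MySQL's implementation:

```python
class ToyQueryCache:
    """Conceptual sketch of query-cache behavior: results are keyed by
    the exact query text, and a write to a table ejects every cached
    query that referenced it."""
    def __init__(self):
        self._cache = {}               # query text -> (result, tables used)

    def get(self, query):
        entry = self._cache.get(query)        # byte-for-byte match required
        return entry[0] if entry else None

    def store(self, query, result, tables):
        self._cache[query] = (result, set(tables))

    def table_modified(self, table):
        # Invalidation is table-level: eject every dependent entry.
        self._cache = {q: e for q, e in self._cache.items()
                       if table not in e[1]}

qc = ToyQueryCache()
qc.store("SELECT * FROM City", [("Kabul",)], ["City"])
print(qc.get("SELECT * FROM City"))   # [('Kabul',)] -- exact text: hit
print(qc.get("select * from city"))   # None -- differs in case: miss
qc.table_modified("City")
print(qc.get("SELECT * FROM City"))   # None -- ejected by the write
```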
● query_cache_limit – This server variable establishes the maximum size result set that the query cache will
store. The default is 1M.
○ Qcache_hits – This status variable displays the number of query cache hits.
NOTE: Query Cache Utilization
If a query result is returned from query cache, the server increments the Qcache_hits status variable, not
Com_select. Thus the total number of select statements that were successfully executed can be obtained by
adding the Qcache_hits and Com_select numbers together. This would allow the following formula to
provide the query cache utilization: QC_Utilization = Qcache_hits/(Qcache_hits+Com_select)
In the event that there is suspicion that the query cache has become fragmented, you can execute the FLUSH
QUERY CACHE statement to defragment the query cache thus ensuring better utilization of memory. This
command does not remove queries from the cache, but coalesces memory free space chunks. Remove all queries
from the cache with the RESET QUERY CACHE command.
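The utilization formula above can be expressed directly. The counter values below are hypothetical; in practice they come from SHOW GLOBAL STATUS:

```python
def query_cache_utilization(qcache_hits, com_select):
    """QC_Utilization = Qcache_hits / (Qcache_hits + Com_select).

    Com_select counts only SELECTs the server actually executed; cache
    hits bypass execution, so the denominator is the true total number
    of SELECT requests."""
    total = qcache_hits + com_select
    return qcache_hits / total if total else 0.0

print(query_cache_utilization(3000, 1000))  # 0.75
```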
Action: Install a fresh copy of the world database to remove any indexes or other actions that may have altered
the contents of the database.
Action: Source the pt_stored_procedures.sql file to load the stored procedures that will be used in the
labs for this chapter.
Action: In the mysql client, create the city_huge table with approximately 200,000 records by executing the following SQL statement:
CALL create_city_huge(50);
Effect: This stored procedure will create a table called city_huge that is based on the City table. Once the
city_huge table is created, the stored procedure will load in the records from the City table 50 times,
creating approximately 200,000 records in the new table.
Action: Execute the following SQL statement to enable the query cache:
Effect: This activates the query cache by setting the size of the query cache to a non-zero value. Setting it to 0
disables the query cache. The default size is 0, so the query cache is disabled by default.
Step 2. Insert a query into the query cache
Action: Execute the following SQL statements to review the state of the query cache:
Effect: The query cache status is displayed, showing an output similar to the one below:
+-------------------------+---------+
| Variable_name | Value |
+-------------------------+---------+
| Qcache_free_blocks | 1 |
| Qcache_free_memory | 4185568 |
Action: Execute the following SQL statement again to review the query cache state:
Effect: The query cache status is displayed. Is there a change in any of the variables?
_______________________________________
_______________________________________
_______________________________________
_______________________________________
Effect: The query cache status is displayed. Is there a change in any of the variables?
_______________________________________
_______________________________________
MEMORY
The MEMORY storage engine stores all data in memory; once the MySQL server has been shut down, any information
stored in a MEMORY table is lost. However, the format of the individual tables is kept, enabling the
creation of temporary tables that can be used to store information for quick access without having to recreate the tables.
TEMPORARY
A TEMPORARY table is visible only to the current connection, and is dropped automatically when the connection is
closed. This means that two different connections can use the same temporary table name without conflicting with each
other or with an existing non-TEMPORARY table of the same name. (The existing table is hidden until the temporary table
is dropped.) By being only visible to the current connection, this table type can be used for result caching for a single
session.
Further Practice
In this further practice, you will size the query cache, followed by steps for creating a manual cache.
1. In the mysql client, execute the following SQL statement to reset the query cache statistics:
FLUSH STATUS;
2. Verify the query cache size is set to 4M. If there is a different value, set the query cache size to 4M.
3. Execute the Qcache-queries.sql file using the following mysqlslap benchmarking tool with one iteration:
4. Execute the same O/S command in step 3 again and then examine query cache statistics.
5. Reset the query cache statistics.
6. Set the query cache size to 16M.
7. Execute the same O/S command in step 3 two times and then examine the query cache statistics.
8. Reset the query cache statistics.
9. Set the query cache size to 32M.
10. Execute the same O/S command in step 3 two times and then examine the query cache statistics.
11. Reset the query cache statistics.
12. Set the query cache size to 64M.
13. Execute the same O/S command in step 3 two times and then examine the query cache statistics.
14. Reset the query cache statistics and then turn off query caching.
15. Of the different query cache sizes (4M, 16M, 32M or 64M), which size seems the best? Why do you think
this is?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
16. Execute the following query in the mysql client and record the time it took to execute:
17. Execute the query from step 16 to obtain the next 5 records (6-10) and record the time it took to execute.
How many seconds did the query take to execute: _____________
18. Execute the query from step 16 to obtain the next 5 records (11-15) and record the time it took to execute.
How many seconds did the query take to execute: _____________
19. Execute the query from step 16 to obtain the next 5 records (16-20) and record the time it took to execute.
How many seconds did the query take to execute: _____________
20. Execute the query from step 16 to obtain the next 5 records (21-25) and record the time it took to execute.
How many seconds did the query take to execute: _____________
28. Sum up the total time of all the previous 5 queries executed. How many seconds total did it take to run all 5
queries: _______________________.
How often would users need to go to the extra pages for this caching approach to be useful?
__________________________________________________________________________________
__________________________________________________________________________________
8 MYSQL SERVER CONFIGURATION
A file descriptor is an index for an entry in a kernel-resident data structure containing the details of all open files.
Applications, such as MySQL, that need to access data that is maintained by the operating system, must issue
system calls to the kernel that in turn accesses the files on behalf of the applications, based on the index keys. The
application itself cannot read or write the file descriptor table directly.
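The idea that a descriptor is just an index handed back by the kernel can be seen from any language's low-level file API. A small illustrative Python example:

```python
import os
import tempfile

# The kernel returns a small integer index (the file descriptor) into
# its table of open files; the process passes that index to subsequent
# system calls but never reads the kernel's table directly.
fd, path = tempfile.mkstemp()
print(isinstance(fd, int))   # True -- the descriptor is just an index

os.write(fd, b"hello")       # system calls take the index, not the file
os.close(fd)                 # releases the kernel table entry
os.unlink(path)
```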
The table_open_cache uses little RAM, and it is generally fine to increase it to values in the 10,000 range, or even
higher to 100,000 or more, if the operating system is configured to support that (ulimit in Linux). MyISAM uses
approximately one file descriptor for each use of the table in all queries currently running on the server, plus one for the first open.
NOTE: Separating table_cache
The table_cache server variable was separated in MySQL 5.1 into two server variables: table_open_cache and table_definition_cache. table_open_cache is similar to the original table_cache variable in providing a cache for internal table objects, but is no longer used for table definition lookups. table_definition_cache provides a cache of the .frm files without using file descriptors; it is ultimately responsible for handling definition lookups by describing the metadata of the tables.
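The split described in the note can be inspected directly in the mysql client; a minimal sketch (output values will vary per server):

```sql
-- Both caches introduced by the 5.1 split, plus their status counters.
SHOW GLOBAL VARIABLES LIKE 'table%cache';
SHOW GLOBAL STATUS LIKE 'Open%';
```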
● Open_table_definitions – This status variable displays the number of cached .frm files.
● Open_tables – This status variable displays the number of currently open tables. A single table opened two times is counted as two open tables. If Open_tables reports a large number of open tables and there are performance issues, it is a good sign that the table_open_cache server variable needs to be increased. Even if the database has only 64 tables, there may be more than 64 open tables: MySQL, being multithreaded, may be running many queries against a table at one time, and each of these opens the table.
● Opened_tables – This status variable displays the number of tables that have been opened.
The simplest approach to determining whether the Opened_tables status variable is prompting an increase in table_open_cache is to monitor the server over time. If the Opened_tables status variable climbs constantly after the MySQL server starts, table_open_cache is too low. However, if it rises only slowly (fewer than about 10 per hour), table_open_cache is configured well. The purpose of monitoring the Opened_tables status variable is to avoid unnecessary disk I/O requests: if the MySQL server has to open a table for each query, the server will be bogged down with performance issues.
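As a sketch of the monitoring approach just described, the counter can be sampled periodically in the mysql client (the interval and new cache size below are illustrative):

```sql
-- Sample the counter now, and again later; a fast-growing delta means
-- tables are being opened repeatedly rather than served from the cache.
SHOW GLOBAL STATUS LIKE 'Opened_tables';
-- ... wait an hour, re-run the SHOW, and compare the values ...
SET GLOBAL table_open_cache = 4096;   -- example value, if the delta is large
```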
Action: Install a fresh copy of the world database to remove any indexes or other actions that may have altered
the contents of the database.
Action: Source the pt_stored_procedures.sql file to load the stored procedures that will be used in the
labs for this chapter.
Action: After logging into the MySQL server as root, type the following in the mysql client:
FLUSH TABLES;
Action: Issue the following SQL command to view the status of the table cache:
__________________________________________________
Action: In the world database, issue the following SQL command to lock the Country table so that any other
clients accessing the database can only read it:
Effect: The current MySQL client connection is the only connection that can perform non-read SQL statements
against the Country table.
Action: Issue the following SQL command in the second mysql client started to view the status of the table cache:
Effect: Did the values change for any of the status variables from those that were recorded previously? If so, which
ones and why?
___________________________________________________
___________________________________________________
Step 5. Open and close a table
Action: Start up a third mysql client and perform the following in the world database:
Effect: The current record count (similar to the listing below) for the Country table will be displayed:
+----------+
| COUNT(*) |
+----------+
|      239 |
+----------+
1 row in set (#.## sec)
Action: Issue the following SQL command in the third mysql client started to view the status of the table cache:
Effect: Did the values change for any of the status variables from those that were recorded in step 3? If so, which
ones and why?
___________________________________________________
Action: Shut down all instances of the mysql clients currently running. This will ensure there are no longer any
connections to the MySQL server.
Action: Start up a single mysql client instance and issue the following command:
Effect: The Country table is locked, allowing only this connection to perform non-read SQL statements
against the table.
Action: View the contents of the table cache by issuing the following SQL command in the mysql client:
Effect: A response similar to the listing below will be displayed in the mysql client:
+----------+-----------------+--------+-------------+
| Database | Table           | In_use | Name_locked |
+----------+-----------------+--------+-------------+
| world    | City            |      0 |           0 |
| world    | Country         |      1 |           0 |
| world    | CountryLanguage |      0 |           0 |
+----------+-----------------+--------+-------------+
3 rows in set (#.## sec)
___________________________________________________
Step 8. Clean up
Action: Issue the following command to unlock any tables that were locked during this exercise:
UNLOCK TABLES;
Effect: Additional labs performed may have problems if any tables remain locked.
Static Linux binaries provided by Sun Microsystems can support up to 4,000 connections; however, the actual
number of connections a server can sustain depends on the quality of the thread library on a given platform, the
amount of RAM available, how much RAM each connection uses, and the workload from each connection.
● Max_used_connections – This status variable displays the number of connections that were opened
simultaneously up to a particular time. Monitoring this variable will assist in determining if the
max_connections server variable needs to be increased.
NOTE: Extra reserved connection
When mysqld runs, MySQL reserves one additional connection beyond max_connections for users who
have the SUPER privilege. This can be useful when MySQL rejects connections with a "Too many
connections" error because all available connections are in use: an administrator can still connect and
diagnose the problem.
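Monitoring and adjusting the limit follows directly from the variables above; a minimal sketch (the new value is illustrative):

```sql
-- If the peak approaches the cap, new connections are at risk of rejection.
SHOW GLOBAL STATUS LIKE 'Max_used_connections';
SHOW GLOBAL VARIABLES LIKE 'max_connections';
SET GLOBAL max_connections = 300;   -- example value
```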
8.2.3 open_files_limit
This server variable defines the number of open file descriptors to reserve (or open). When viewing the
open_files_limit value with SHOW VARIABLES, the number may differ from the number that was
assigned to the --open-files-limit option. This is because the server determines the open_files_limit
value by running the two formulas below and picking the larger of the two:
max_connections * 5
… or …
max_connections + table_cache * 2
If this value has been entered in the mysqld_safe section of the my.cnf file, the number entered is used if it
is larger than either of these calculated numbers. On systems where MySQL is unable to change the number of
open files, the value will always be zero (0). Given these equations, it is easy to see that this variable matters both
for opening files and for connections. If the MySQL server refuses connections with the error 'Too many
open files', it is a good indication that the open_files_limit server variable needs to be increased.
● Open_files – This status variable displays the number of open file descriptors that are currently active.
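The two formulas can be evaluated against the running server in one statement; a sketch (the column aliases are illustrative):

```sql
-- open_files_limit should normally be at least the larger of the two formulas.
SELECT @@global.open_files_limit                                 AS configured,
       @@global.max_connections * 5                              AS formula_1,
       @@global.max_connections + @@global.table_open_cache * 2  AS formula_2;
```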
A process is given an address space on the operating system for executing a set of instructions, with full
control over various resources (files, devices, and so on) to complete its task. When a new process is created, the
operating system has to create a new address space for it. Threads, on the other hand, work within the address
space given to their parent process (with all the control given to that process) and can change state without
requiring a new process to be created. MySQL uses a thread-based server architecture, which allows the various
executing threads (or lightweight processes) to access key shared resources.
Further Practice
In this further practice you will benchmark the table cache server connection parameters.
1. Create 500 tables by passing the number 500 into the single parameter of the create_many_tables()
stored procedure.
2. View the status of the table cache by looking at the associated status variables.
3. Set the two table cache server connection system variables to 256 each.
4. In the mysql client, source the qry_many_tables.sql file to execute a query against each of the tables
created.
6. In the mysql client, monitor the change of the table cache statistics.
7. Set the two table cache server connection system variables to 1024 each.
8. In the mysql client, source the qry_many_tables.sql file to execute a query against each of the tables
created again.
9. Issue the following command in an O/S terminal window (command line) to run the mysqlslap
benchmarking tool against the tables created again:
10. Compare execution time by monitoring the change of the table cache statistics in the mysql client. Is there a
noticeable difference?
11. Remove the tables created by calling the drop_many_tables() stored procedure.
NOTE: max_heap_table_size
It is important to realize that because the MEMORY storage engine is used for implicit in-memory temporary tables,
the max_heap_table_size variable places a maximum limit on tmp_table_size. For example, if
tmp_table_size is 32M but max_heap_table_size is 16M, implicit MEMORY temporary tables
convert to on-disk MyISAM tables when they grow past 16M. Thus both of these variables need to be considered
when dealing with implicit temporary tables.
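A minimal sketch of keeping the two limits in step (the 64M figure is illustrative):

```sql
-- Raise both caps together; the effective in-memory limit is the
-- smaller of the two values.
SET GLOBAL tmp_table_size      = 64 * 1024 * 1024;
SET GLOBAL max_heap_table_size = 64 * 1024 * 1024;
SHOW GLOBAL STATUS LIKE 'Created_tmp%tables';
```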
○ Created_tmp_disk_tables – This status variable displays the number of on-disk temporary tables
created automatically by the server while executing statements. If BLOB/TEXT fields are requested in the
executed statement, disk-based temporary tables are automatic. The number of these statements (containing
Further Practice
In this further practice you will tune the tmp_table_size system variable and determine its impact on
performance.
1. Display the current values of the variables associated with the tmp_table_size system variable by
issuing the following command in the mysql client:
2. Execute the vmstat command in the O/S terminal window to display the current memory usage recorded.
4. Display the current values of the variables associated with the tmp_table_size system variable. Was
there an increase in the values? Display the current memory usage. Was there an increase, decrease or no
change in the memory values?
5. Increase the size of the tmp_table_size system variable to 8 * 1024.
6. Execute the benchmark command from step 3 again.
7. Display the current values of the variables associated with the tmp_table_size system variable. Was
there an increase in the values? Display the current memory usage. Was there an increase, decrease or no
change in the memory values?
8. Increase the size of the tmp_table_size system variable to 8 * 1024 * 1024.
9. Execute the benchmark command from step 3 again.
10. Display the current values of the variables associated with the tmp_table_size system variable. Was
there an increase in the values? Display the current memory usage. Was there an increase, decrease or no
change in the memory values?
The binlog_cache_size server variable tells the server how large the cache that holds SQL statements
for the binary log during a transaction should be. A binary log cache is allocated for each client if the server
supports any transactional storage engines and has the binary log enabled (--log-bin option). If a
transaction is larger than the binary log cache, the server must open a temporary file to store the transaction,
and then delete that temporary file when the transaction ends. The Binlog_cache_disk_use and
Binlog_cache_use status variables can be used to tune binlog_cache_size to a value large enough
to avoid the use of these temporary files.
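The tuning loop described above can be sketched as follows (the new cache size is illustrative):

```sql
-- A rising Binlog_cache_disk_use relative to Binlog_cache_use means
-- transactions are spilling to temporary files: enlarge the cache.
SHOW GLOBAL STATUS LIKE 'Binlog_cache%';
SET GLOBAL binlog_cache_size = 1024 * 1024;   -- example value
```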
● Bytes_received – This status variable displays the number of bytes received from all clients.
● Bytes_sent - This status variable displays the number of bytes sent to all clients. The Bytes_received
and Bytes_sent can be used together to determine the traffic to and from the server.
● Com_xxx – These status statement counter variables indicate the number of times each xxx statement has been
executed. There is one status variable for each type of statement. For example, Com_delete and
Com_insert count DELETE and INSERT statements, respectively. All of the Com_xxx variables are
incremented even if a prepared statement argument is unknown or an error occurs during execution. In other
words, their values correspond to the number of requests issued, not to the number of requests successfully
completed. There are numerous Com_xxx variables; each brings valuable insight into the workings of the
server and should be monitored and analyzed in any performance-tuning exercise.
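For example, the full set of statement counters can be listed with a single pattern:

```sql
-- Counters reflect requests issued, not requests completed successfully.
SHOW GLOBAL STATUS LIKE 'Com_%';
```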
There is no SHOW command that displays all user variables, making it imperative that user variables be
remembered (or recorded in some external fashion) for later use within the session.
NOTE: Duplicating global variable names
Even though it is possible to give a user variable the same name as a global variable (e.g., wait_timeout), it is
best not to duplicate global variable names, to eliminate confusion.
● System and Server Variables – These are the variables that will be most beneficial to performance tuning
because they contain states or attributes of the MySQL server. Many of the system and server variables can exist
in two states:
○ Global – These variables are the same across all the sessions running on the MySQL server and
can only be modified by using the SET GLOBAL command. For example:
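The example listing appears to have been lost in extraction; a minimal sketch of a global change (the variable and value are illustrative):

```sql
SET GLOBAL wait_timeout = 28800;   -- visible to new sessions server-wide
```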
○ Current Connection – These variables are specific to the current user's connection and revert to their
global values when the user's session is terminated. These variable names must be prefixed with the @@ symbols:
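A sketch of session-scoped usage (the variable and value are illustrative):

```sql
SET @@sort_buffer_size = 256 * 1024;   -- affects only this connection
SELECT @@sort_buffer_size;             -- current-connection value
SELECT @@global.sort_buffer_size;      -- global value is unchanged
```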
When SHOW VARIABLES is executed, the system variables are the variables that are displayed. In the course of
performance tuning, these variables and their values will be crucial to tuning the server properly. Many of the
variables listed in SHOW VARIABLES will be discussed in greater detail throughout this course.
SELECT @@variable_name can be used to view specific global or current-connection system and server
variables (e.g., SELECT @@wait_timeout;). To ensure that the global value of a system or server variable is
being viewed, use SELECT @@global.variable_name (e.g., SELECT @@global.wait_timeout;).
User, system, and server variable names are case insensitive, making @foo, @FOO, and @Foo the same variable
(@thread_cache_size, @Thread_Cache_Size, and @THREAD_CACHE_SIZE identify the same
server variable).
● Stored Routine Variables and Parameters – These variables are declared within stored routines
and are valid only within them; they cannot be referenced outside of the stored routine.
● BIG_TABLES
When SELECT queries are complex (such as JOIN queries), MySQL may create an in-memory temporary
table to process them. When these queries grow large, this can lead to memory problems. Enabling
BIG_TABLES before running such queries tells the server to create all temporary tables on disk. This can
produce relatively slower query results, but it frees memory for other processes, and the time difference may
be well worth it.
A typical approach to using the BIG_TABLES variable is below:
SET @@BIG_TABLES = 1;
SELECT problematic_query;
● FOREIGN_KEY_CHECKS
With storage engines that support foreign keys, there can be problems with loading data or altering table
definitions and/or table data. The FOREIGN_KEY_CHECKS variable, which controls the check performed on
each transaction to ensure that foreign keys are not violated, can be a performance issue when performing
certain operations.
○ Disabled – By setting FOREIGN_KEY_CHECKS to 0 (switched off), transactions will not be tested
against foreign key constraints. This is useful when loading data and performing table alterations.
○ Enabled – By setting FOREIGN_KEY_CHECKS to 1 (switched on), each transaction is tested for
foreign key constraint violations. This setting affects not only large data loads and table alterations but
overall performance as well. However, when foreign key constraints are needed, their purpose is worth the
performance cost.
A typical approach to modifying the FOREIGN_KEY_CHECKS variable is shown below, with a previously
dumped file being restored with the SOURCE command:
SET FOREIGN_KEY_CHECKS = 0;
SOURCE dump_file_name;
SET FOREIGN_KEY_CHECKS = 1;
● SQL_BIG_SELECTS
MySQL is designed to run every correctly structured SELECT statement (including joins) that is sent to it.
However, this can be a bad thing for SELECT statements that are structured correctly but will result in
execution times that are quite long.
○ Disabled – By setting SQL_BIG_SELECTS to 0 (switched off), queries that are estimated to examine
a number of rows greater than the max_join_size variable are terminated. This setting can prevent
server delays and excessive memory loads.
○ Enabled – By setting SQL_BIG_SELECTS to 1 (switched on), all SELECT statements are allowed
and query execution is not tested against the max_join_size variable.
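A sketch of the guard in use (the threshold value is illustrative):

```sql
-- Estimated oversized joins now fail with an error instead of running
-- for a long time.
SET SESSION max_join_size   = 1000000;
SET SESSION SQL_BIG_SELECTS = 0;
```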
● UNIQUE_CHECKS
This variable, when set to 1, tells the MySQL server to perform uniqueness checks for secondary indexes in
InnoDB tables. If set to 0, uniqueness checks are not performed for index entries inserted into InnoDB's insert
buffer. Set this variable to 0 when performing large imports into InnoDB tables and the data has been verified
to contain no uniqueness violations.
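A typical bulk-import pattern, assuming the data is known to be free of duplicates (the file name is hypothetical):

```sql
SET unique_checks = 0;        -- skip secondary-index uniqueness checks
SOURCE big_innodb_dump.sql;   -- hypothetical dump file
SET unique_checks = 1;        -- re-enable checks afterwards
```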
● SQL_LOG_BIN
This variable disables or enables binary logging for the current connection (SQL_LOG_BIN is a session
variable) if the client has the SUPER privilege. The statement is refused with an error if the client does not have
that privilege.
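A sketch of suppressing binary logging for the current session (requires the SUPER privilege):

```sql
SET sql_log_bin = 0;   -- statements from this session are not logged
-- ... run maintenance statements that should not replicate ...
SET sql_log_bin = 1;   -- restore logging
```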
● SHOW OPEN TABLES - This command lists the non-TEMPORARY tables that are currently open in the table
cache. The LIKE clause, if present, indicates which table names to match. The WHERE clause can be given to
select rows using more general conditions.
● SHOW ENGINE {INNODB | NDB} STATUS – Some storage engines can report custom status listings to
assist with management, monitoring and performance tuning.
Quiz
In this exercise you will answer the following questions pertaining to MySQL server configurations.
6. How can you monitor whether joins that do not use indexes are being executed on the server?
__________________________________________________________________________________
__________________________________________________________________________________
7. How can you monitor if you might be having problems with lock contention on the server?
__________________________________________________________________________________
9 MYISAM STORAGE ENGINE
Storage Medium
MyISAM, like all MySQL storage engines, uses one or more files to handle operations on the data sets located
under the datadir directory. The datadir directory contains one subdirectory for each schema (database) residing
on the server. The MyISAM storage engine creates three separate files for each table within the associated schema
directory:
Transaction Capabilities
MyISAM does not provide atomicity, consistency, or durability for multiple statements executed within a transaction.
In the event of a server crash, statements executed against MyISAM cannot be guaranteed to have processed
successfully according to the application's (or, more appropriately, the designer's) intent.
Locking
The MyISAM storage engine protects data from corruption by different threads updating in conflicting ways by
using table-level locking. This has been seen as a "deficiency" in the MyISAM storage engine, but for the majority
of applications it is not an issue, given the processes in place and the speed of MyISAM, even in very high
concurrency situations.
Backup and Recovery
Of all the standard MySQL backup capabilities, the ability to stop the server and copy the actual files is the safest and
most complete backup scenario. MyISAM supports this through the three files mentioned earlier. By copying each of the
files associated with a table (*.frm, *.MYD and *.MYI), the user can move or restore the associated data through
simple O/S commands with no break in data or design integrity. Copying works even between dissimilar operating
systems and CPUs, such as from 32-bit Windows on Intel to 64-bit Linux on SPARC.
Optimization
With MyISAM's multiple indexing options and specific features, there are some simple approaches to optimizing
MyISAM data that improve performance and provide noticeable results with little to no overhead. This is the major
focus of the remainder of this chapter.
Special Features
Some of the special features of the MyISAM storage engine are:
● FULLTEXT indexing – When a record is inserted into a MyISAM table containing a FULLTEXT index, the
data for the indexed fields is analyzed and split into "words".
● Concurrent inserts – For a MyISAM table, concurrent inserts can be used to add rows at the same time that
SELECT statements are running, provided there are no deleted rows in the middle of the table.
● Prefix compression on indexes – String indexes are space compressed. If the first index part is a string, it is
also prefix compressed. Space compression makes the index file smaller than the worst-case figure if a string
column has a lot of trailing space or is a VARCHAR column that is not always used to the full length. Prefix
compression stores identical leading bytes of consecutive keys only once.
● Where there is a performance benefit from the database caching writes itself, which applies particularly if
there may be many updates to the same data in a short time.
● Where high concurrency and/or comparatively long application lock times are required.
● Where foreign keys are needed for referential integrity.
● Where transaction support is required, such as almost all activities involving money movement.
MyISAM tables can be forced to use a specific row format by using either ROW_FORMAT=FIXED or
ROW_FORMAT=DYNAMIC in the CREATE TABLE syntax.
The disadvantage of the fixed-length format is its need to pad certain column types, causing the disk
space required to be larger than for identical data stored in a dynamic format.
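A sketch of forcing the fixed-length format (the table and columns are illustrative):

```sql
-- CHAR columns keep every row the same length on disk.
CREATE TABLE lookup_fixed (
  id   INT NOT NULL,
  code CHAR(8) NOT NULL
) ENGINE=MyISAM ROW_FORMAT=FIXED;
```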
Dynamic
Dynamic storage format is used if a MyISAM table contains any variable-length columns (VARCHAR, VARBINARY,
BLOB, or TEXT). The one advantage that the dynamic format has over the fixed-length format is the ability for each
row to use only as much space as is required, thus taking up less disk space. This advantage may be all that is needed
to determine the row format to use; however, the following is a list of the associated disadvantages:
● Row fragmentation - If a row becomes larger than the allotted space, dynamic rows are split into as many
pieces as are required, resulting in row fragmentation. This may require running OPTIMIZE TABLE or
myisamchk -r from time to time to improve performance.
● Crash recovery – Dynamic-format tables are more difficult to reconstruct after a crash than fixed-length
tables, because rows may be fragmented into many pieces and links (fragments) may be missing.
To minimize the fragmentation that can occur in MyISAM tables, it is best to use fixed-length fields in tables
that are frequently updated or deleted. If this is not possible, split large table definitions into two or more
tables, separating the variable-length fields from the fixed-length fields.
myisampack tablename.MYI
myisampack compresses each column in the table separately. When the table is used later, the server reads into
memory the information needed to decompress columns. Compressed MyISAM tables are faster for random access
to rows because only the required rows must be read and decompressed. To list all the options that can be used with
myisampack, use the following syntax:
myisampack --help
After a table is compressed with myisampack, use myisamchk -rq to rebuild its indexes. Compressed
tables can be uncompressed with myisamchk -u tablename.MYI.
The idea behind B-trees is that non-leaf nodes (also known as internal nodes) can have a variable number of child
nodes within some pre-defined range. The size of the index blocks in a table is normally set by the server when the
MYI index file is created, depending on the size of the keys in the indexes present in the table definition. In most
cases, it is set equal to the I/O buffer size.
As data is inserted or removed from the data structure, the number of child nodes varies within a node and so
internal nodes are coalesced or split so as to maintain the designed range. A B-tree is kept balanced by requiring
that all leaf nodes (nodes that have zero child nodes and are the farthest from the root node) are at the same depth.
This depth will increase slowly as elements are added to the tree, but an increase in the overall depth is infrequent,
and results in all leaf nodes being one more hop further removed from the root.
Quiz
In this exercise you will answer the following questions pertaining to the MyISAM storage engine.
10. What is the disk footprint of MyISAM tables? What are the purposes of the different files?
__________________________________________________________________________________
__________________________________________________________________________________
11. What uses are best suited to MyISAM?
__________________________________________________________________________________
__________________________________________________________________________________
Because the MyISAM storage engine can perform INSERT operations while SELECT queries are running
against the data (through the READ LOCAL lock type), it is an excellent choice for applications that continually
log activity while also giving end users the ability to run reports or other query operations on the data. This is
possible because MyISAM knows that all new records are always appended at the end of the data file, so there is
no need to hold up SELECT statements that request data from anywhere else in the table.
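This behavior is governed by the concurrent_insert server variable; a sketch (in MySQL 5.1, the value 2 permits concurrent inserts even into tables with deleted rows in the middle):

```sql
SET GLOBAL concurrent_insert = 2;   -- permit inserts despite holes
SHOW GLOBAL VARIABLES LIKE 'concurrent_insert';
```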
Priorities
When running a SQL statement that interacts with a MyISAM table, an individual user can set the importance
level that their statements have with regard to priority of execution on the MySQL server. The following is a
list of the priority modifiers that can be used:
● HIGH_PRIORITY – This command gives the SQL statement being executed the ability to move to the top
of the list of any waiting SQL statements (from all threads) and can be used with the following SQL
statements:
○ SELECT – When SELECT HIGH_PRIORITY ... is used, MySQL gives the SELECT a higher
priority than a statement that updates a table. This should only be used for queries that are very fast and
must be done at once. A SELECT HIGH_PRIORITY query that is issued while the table is locked for
reading runs even if there is an update statement waiting for the table to be free.
When using the INSERT HIGH_PRIORITY statement, the server automatically overrides any existing
low_priority_updates setting. The low_priority_updates setting, when set to 1, forces all
INSERT, UPDATE, DELETE, and LOCK TABLE WRITE statements to wait until there is no pending
SELECT or LOCK TABLE READ on the affected table. This variable was previously named
sql_low_priority_updates. In addition, writing queries have a higher priority than reading queries
by default, so there are effectively two priority queues.
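A sketch of the modifier in use, against the world database used in these labs:

```sql
-- A quick lookup that jumps ahead of queued write statements.
SELECT HIGH_PRIORITY Name FROM Country WHERE Code = 'USA';
```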
● LOW_PRIORITY – This command places the SQL statement being executed at the bottom of the list of any
waiting and new SQL statements (from all threads) and will only run when all waiting and new SQL
statements have completed. This command can be used with the following SQL statements:
○ INSERT – When INSERT LOW_PRIORITY ... is used, MySQL delays execution of the
statement (and blocks any other work from that client) until no other clients are reading from the
table. This includes any clients that access the table while the INSERT LOW_PRIORITY
statement is waiting to execute. In an environment where a large number of clients may be reading
from the server at any one time, this could cause the client to wait a very long time. It should
normally not be used with MyISAM tables, because doing so disables concurrent inserts.
○ UPDATE – The UPDATE LOW_PRIORITY ... works identical to the INSERT LOW_PRIORITY
statement and should also be avoided unless absolutely necessary.
○ DELETE – The DELETE LOW_PRIORITY ... also works the same as the INSERT LOW_PRIORITY
by not executing until all clients are done reading from the table. This can cause integrity issues in that
another client can be viewing data that should be deleted immediately.
Using LOW_PRIORITY with UPDATE, INSERT, and DELETE can cause those queries to wait in the queue forever
if there is no time of day when clients are not reading from the table.
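As a sketch, these modifiers are placed immediately after the statement keyword; the table and column names here are illustrative:

```sql
-- Jump ahead of waiting writers; use only for very fast, urgent reads
SELECT HIGH_PRIORITY COUNT(*) FROM stats_table;

-- Defer the write until no clients are reading the table
-- (normally avoid on MyISAM: this disables concurrent inserts)
INSERT LOW_PRIORITY INTO stats_table (hits) VALUES (1);
```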
The use of the INSERT DELAYED statement is not directly related to locking issues in
MyISAM, but it has a bearing on many of the discussions in the locking section. When an INSERT DELAYED
statement is executed, the server puts the row or rows to be inserted into a buffer, and the client issuing the
statement can then continue immediately without waiting for the insert to complete.
Key Cache
The KEY_CACHE structure contains a number of linked lists of accessed index data blocks. These blocks each
represent a single, fixed size block of data read from an index file, and are kept in memory until they are
determined to be “cold” index blocks (not being used often enough to be kept in memory). When a block of
data is determined to be “cold”, it is purged from memory. Even though the KEY_CACHE uses a least recently used
(LRU) approach to determining which blocks of data are “cold” or “warm”, it is designed to keep all blocks in
memory that are associated with root B-tree levels.
With statements that have the potential to change existing data, the key cache first writes the changes to the
internally buffered index block and marks it as “dirty”. If this “dirty” block is accessed by the key cache for the
purpose of purging it, the key cache first writes the changes to the index file before removing it from memory.
key_buffer_size
The key_buffer_size system variable controls the size of the key cache. If this variable is set equal to zero, no
key cache is used. The key cache also is not used if the key_buffer_size value is too small to allocate the
minimal number of block buffers (8). When the key cache is not operational, index files are accessed using only the
native file system buffering provided by the operating system (in other words, table index blocks are accessed using
the same strategy as that employed for table data blocks).
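For instance, the default key cache can be sized at runtime; the 64MB figure below is purely illustrative:

```sql
SET GLOBAL key_buffer_size = 64 * 1024 * 1024;   -- 64MB default key cache
SHOW VARIABLES LIKE 'key_buffer_size';           -- verify the new size
```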
The key cache referred to in a CACHE INDEX statement can be created by setting its size with a SET GLOBAL
parameter setting statement or by using server startup options. For example:
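The listing can be sketched as follows; the cache name hot_cache and the size are illustrative:

```sql
SET GLOBAL hot_cache.key_buffer_size = 128 * 1024;   -- create a 128KB named key cache
CACHE INDEX t1 IN hot_cache;                         -- assign t1's indexes to it
```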
Note that the default key cache cannot be destroyed; any attempt to do so is ignored.
hot_cache.key_buffer_size = 500M
cold_cache.key_buffer_size = 500M
key_buffer_size = 2G
init_file=/path/to/data-directory/mysqld_init.sql
The statements in mysqld_init.sql are executed each time the server starts. The file should contain one SQL statement
per line. The following example assigns several tables each to hot_cache and cold_cache:
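A sketch of such a mysqld_init.sql file, with illustrative table names:

```sql
CACHE INDEX t1, t2 IN hot_cache;
CACHE INDEX t3, t4 IN cold_cache;
```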
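The statement discussed below takes the following form (t1 and t2 are the tables referenced in the text; the IGNORE LEAVES modifier applies only to t2):

```sql
LOAD INDEX INTO CACHE t1, t2 IGNORE LEAVES;
```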
This statement pre-loads all index blocks from t1. It pre-loads only the blocks for the non-leaf nodes from t2.
Pre-loading of indexes for a table is possible only if all of its indexes use the same block size. A table whose
indexes mix block sizes cannot be pre-loaded into the key cache; this usually happens when FULLTEXT indexes or
indexes with long character columns are present.
9.7.3 Midpoint Insertion Strategy
By default, the key cache management system uses the LRU strategy for choosing key cache blocks to be evicted, but it
also supports a more sophisticated method called the midpoint insertion strategy. When using the midpoint insertion
strategy, the LRU chain is divided into two parts: a hot sub-chain and a warm sub-chain. The division point between the two
parts is not fixed, but the key cache management system takes care that the warm part is not “too short,” always
containing at least key_cache_division_limit percent of the key cache blocks. The
key_cache_division_limit is a component of the structured key cache variables, so its value is a parameter that
can be set per cache.
● Initial Entry into Key Cache - When an index block is read from a table into the key cache, it is placed at the
end of the warm sub-chain. After a certain number of hits (accesses of the block), it is promoted to the hot sub-
chain. At present, the number of hits required to promote a block (3) is the same for all index blocks.
● Promotion to Hot sub-chain - A block promoted into the hot sub-chain is placed at the end of the chain.
The block then circulates within this sub-chain. If the block stays at the beginning of the sub-chain for a long
enough time, it is demoted to the warm chain. This time is determined by the value of the
key_cache_age_threshold component of the key cache.
● First Candidate for Eviction - The threshold value prescribes that, for a key cache containing N blocks, the
block at the beginning of the hot sub-chain not accessed within the last N ×
key_cache_age_threshold / 100 hits is to be moved to the beginning of the warm sub-chain. It
then becomes the first candidate for eviction, because blocks for replacement always are taken from the
beginning of the warm sub-chain.
The midpoint insertion strategy allows more valuable blocks to be kept in the cache. If the plain LRU strategy is
preferred, leave the key_cache_division_limit value set to its default of 100.
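As a sketch, midpoint insertion could be enabled on the default key cache like this (the values are illustrative):

```sql
SET GLOBAL key_cache_division_limit = 60;   -- warm sub-chain keeps at least 60% of blocks
SET GLOBAL key_cache_age_threshold = 200;   -- blocks age out of the hot sub-chain more slowly
```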
9.7.4 Other Key Cache Status Variables
There are a number of status variables that assist in maintaining, monitoring and optimizing the key cache for
MyISAM tables. The following lists the other key cache status variables not discussed up to this point:
● key_blocks_not_flushed – This variable displays the number of key blocks in the key cache that
have changed but have not yet been flushed to disk. These can be considered “dirty” key blocks in the key
cache. If the server crashes while dirty key blocks remain, the index will be corrupted and will need to be
repaired before it can be used again.
● key_blocks_used – This variable displays the number of used blocks in the key cache. This value is a
high-water mark that indicates the maximum number of blocks that have ever been in use at one time. This
value should be high. The more blocks in the key cache, the less the server is using disk-based I/O to
examine the index data.
● key_blocks_unused – This variable displays the number of unused blocks in the key cache. This value
can be used to determine how much of the key cache is in use.
● key_read_requests - This variable represents the number of requests that have been executed to read a key
block from the cache.
● key_reads – This variable displays the number of physical reads of a key block from disk. The cache miss
ratio can be calculated as:
key_reads/key_read_requests
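For instance, the miss ratio can be computed directly from the status counters; the counter values below are illustrative:

```sql
SHOW GLOBAL STATUS LIKE 'Key_read%';
-- Suppose this reports Key_reads = 120 and Key_read_requests = 97000:
SELECT 120 / 97000 AS miss_ratio;   -- about 0.0012, well under the 1-in-100 guideline
```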
If the ratio of key_reads to key_read_requests is high (more than about one in one hundred), consider
increasing key_buffer_size. Remember, however, to balance all of the RAM allocations on the server; the
variable may need to be left as it is, or even reduced, to obtain the best overall balance. In addition to the cache miss ratio, it is also
Action: Install a fresh copy of the world database to remove any indexes or other actions that may have altered
the contents of the database.
Action: Source the pt_stored_procedures.sql file to load the stored procedures that will be used in the
labs for this chapter.
Action: After logging into the MySQL server as root, type the following in the mysql client:
Action: Issue the following SQL command to view the status of the key cache:
SHOW STATUS LIKE 'Key%';
Effect: A response similar to the listing below will be displayed in the mysql client:
+------------------------+-------+
| Variable_name | Value |
+------------------------+-------+
| Key_blocks_not_flushed | 0 |
| Key_blocks_unused | 14 |
| Key_blocks_used | 0 |
| Key_read_requests | 0 |
| Key_reads | 0 |
| Key_write_requests | 0 |
| Key_writes | 0 |
+------------------------+-------+
7 rows in set (#.## sec)
Step 3. Execute some queries and monitor key cache state changes
Action: In the world database, issue the following SQL command to query the Country table:
Effect: The Country table is queried and the result ("Sweden") is displayed in the mysql client.
Action: Issue the following SQL command again to view the status of the key cache now:
SHOW STATUS LIKE 'Key%';
Are there changes to any key cache variables? If so, which variables have been changed? What are their values?
Action: In the world database, issue the following SQL command to query the Country table:
Effect: The Country table is queried and the result ("Swaziland") is displayed in the mysql client.
Are there changes to any key cache variables? If so, which variables have been changed? What are their values?
Action: In the world database, issue the following SQL command to query the Country table:
Effect: The Country table is queried and the result ("Finland") is displayed in the mysql client.
Action: Issue the following SQL command again to view the status of the key cache now:
SHOW STATUS LIKE 'Key%';
Are there changes to any key cache variables? If so, which variables have been changed? What are their values?
Further Practice
In this further practice you will test the features of the key cache, including the midpoint insertion strategy.
1. Populate the city_huge table with approximately 800,000 records by passing an argument of 200 to the
create_city_huge stored procedure. This might take some time to run, so now might be a good time
for a break.
2. Reset the key cache to a 16K key cache using the following SQL statement:
Record the time it took to execute the query: _____________ (Query A-1)
4. Execute a FLUSH TABLES command to flush the query cache. This will ensure that the next command
gives an accurate result.
5. Execute a large query against the city_huge table that performs a full table scan (eliminating the file
system cache effect) using the following SQL statement:
Record the time it took to execute the query: _____________ (Query A-2)
6. Reset the key cache to 75% of system memory.
7. Execute a FLUSH TABLES command again to flush the query cache.
8. Execute the original full index scan query (step 3) against the city_huge table again.
Record the time it took to execute the query: _____________ (Query B-1)
9. Execute a FLUSH TABLES command again to flush the query cache.
10. Execute the original full table scan query (step 5) against the city_huge table again.
Record the time it took to execute the query: _____________ (Query B-2)
11. Execute a FLUSH TABLES command again to flush the query cache.
12. Execute the original full index scan query (step 3) against the city_huge table again.
Record the time it took to execute the query: _____________ (Query C-1)
13. Compare the three different queries (A, B and C) performance against each other.
14. Reset the key cache to 50M.
15. Execute the mid-point.sql file using the following O/S command to populate the cache with one iteration:
16. In the mysql client, execute the following SQL statement to reset the key cache statistics:
FLUSH STATUS;
24. In the mysql client, determine the hit rate of key cache again.
What is the hit rate percentage: ______________ (hit rate C)
25. Compare hit rate A and hit rate C.
26. Set the key_cache_division_limit system variable to 50 by issuing the following SQL statement:
SET GLOBAL key_cache_division_limit = 50;
31. Execute a FLUSH TABLES command again to flush the query cache.
32. Execute the original full index scan query (step 3) against the city_huge table again.
33. In the mysql client, determine the hit rate of key cache again.
What is the hit rate percentage: ______________ (hit rate E)
34. Compare all the hit rates to determine the best use of the key cache.
Limitations:
● MyISAM tables - Only MyISAM tables can be merged into a MERGE table. Other storage engines
(for example, InnoDB) are not supported; in practice they would probably not need MERGE tables anyway.
[Figure: the MERGE table 2006_QTR1_Stats is represented on disk by an .frm metadata file and a
2006_QTR1_Stats.MRG file listing the underlying MyISAM tables 2006_01_Stats, 2006_02_Stats, and
2006_03_Stats, each of which has its own .MYD data file and .MYI index file.]
MyISAM tables can be added to or removed from an existing MERGE table with the ALTER TABLE command.
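A sketch tying the figure together; it assumes the three underlying MyISAM tables already exist with identical structure, and the column names are illustrative:

```sql
CREATE TABLE `2006_QTR1_Stats` (
    stat_day DATE,
    hits INT
) ENGINE=MERGE
  UNION=(`2006_01_Stats`, `2006_02_Stats`, `2006_03_Stats`)
  INSERT_METHOD=LAST;

-- Later, change the set of underlying tables with ALTER TABLE:
ALTER TABLE `2006_QTR1_Stats`
  UNION=(`2006_02_Stats`, `2006_03_Stats`);
```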
Advantages
● MERGE tables can help work around file size limitations imposed by the O/S.
Disadvantages
● Increase in the number of file descriptors required based on the increased number of files.
● Index reads are slower as MySQL has to search the indexes of multiple tables.
● No global indexes are available.
Quiz
In this exercise you will answer the following questions pertaining to the MyISAM Storage Engine.
1. Describe the different concurrent insert modes in MyISAM
__________________________________________________________________________________
__________________________________________________________________________________
2. How many priority queues are there for MyISAM tables? How can you change the priority of a query?
__________________________________________________________________________________
__________________________________________________________________________________
Further Practice
In this further practice you will test the MyISAM index building.
1. In the mysql client, export the city_huge table to a flat file using the following SQL statement:
2. In an O/S terminal, set up monitoring for the mysqladmin processlist by entering the following
statement:
watch -d ls -l /usr/local/mysql/data/world/city_huge*
vmstat 1
5. In the mysql client, set the key cache to a small number (16K).
6. Set the MyISAM sort buffer to 4 by entering the following SQL statement:
7. Truncate the city_huge table, and reload the records exported in step 1.
Watch the monitors; how long did the import take: _____________ (Import A)
8. In the mysql client, set the key cache to a big number (75% of system memory).
9. Truncate the city_huge table, and reload the records exported in step 1.
Watch the monitors; how long did the import take: _____________ (Import B)
10. In the mysql client, set the key cache to a small number (16K).
11. Set the MyISAM sort buffer to 80% of system memory.
12. Truncate the city_huge table, and reload the records exported in step 1.
Watch the monitors; how long did the import take: _____________ (Import C)
13. Compare the different times for each of the imports.
14. Shut down the other terminal windows that were opened during this lab exercise.
10 INNODB STORAGE ENGINE
If the server is started with no InnoDB configuration options, MySQL creates an auto-extending 10MB data file
named ibdata1 and two 5MB log files named ib_logfile0 and ib_logfile1 in the MySQL data directory.
To ensure that InnoDB’s performance is good, configuration options should be set to meet the needs of the
application that will be utilizing the tables. Configuration options and parameters will be discussed throughout the
remainder of this chapter.
● ib_logfile – These files are used with the InnoDB transaction processing system and are usually followed
by a sequential number to identify the log segment. The files hold information related to transaction history and
are located by default in the data directory.
● table_name.frm - This file (located in the schema directory under the MySQL datadir) contains all the
meta information about the InnoDB table definition. This is the only file, in an InnoDB shared tablespace setup,
that is located in the actual schema directory.
[Figure: shared tablespace layout. The MySQL data directory contains one directory per schema (mysql, test);
with a shared tablespace, each schema directory holds only the .frm metadata files (City.frm, Country.frm,
Countrylanguage.frm). The shared ibdata files hold the internal data dictionary, the insert undo logs, and the
update undo logs; the ib_logfile files hold the redo log records.]
To enable multiple tablespaces in InnoDB, the innodb_file_per_table setting should be placed in the
[mysqld] section of the server configuration file. After this setting is added and MySQL restarted, all new
InnoDB tables will use multiple tablespaces, and any InnoDB tables created previously will continue to use the
shared tablespace. If the innodb_file_per_table line is removed from the configuration file and
MySQL restarted, new tables are created in the shared tablespace, but access is still available for any tables
that were created using multiple tablespaces. InnoDB always needs the shared tablespace because it puts its
internal data dictionary and undo logs there. The .ibd files are not sufficient for InnoDB to operate.
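A sketch of the relevant configuration-file fragment:

```ini
[mysqld]
# Each new InnoDB table gets its own .ibd tablespace file;
# previously created tables stay in the shared ibdata tablespace.
innodb_file_per_table
```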
[Figure: multiple-tablespace layout. With innodb_file_per_table enabled, each new InnoDB table stores its
data and indexes in its own .ibd file in the schema directory (for example, City.ibd), alongside the .frm
metadata files (Country.frm, Countrylanguage.frm). The shared ibdata files still hold the internal data
dictionary and the insert and update undo logs, and the ib_logfile files still hold the redo log records.]
Locking
The InnoDB storage engine supports row-level locking, the finest level of lock granularity, where only the row
that is read or updated is locked. It also supports table-level locking, which by default is used only when there are
changes to the table structure itself (as is the case with ALTER TABLE).
● Row-level locking - With row-level locking, InnoDB allows other concurrent transactions to access other rows
on the same page. This is in contrast with page level locking, where an entire page containing the row is locked,
thus forcing concurrent transactions to wait to access the same page until the lock is released.
NOTE: Next-Key Locking
For row-level locking, InnoDB uses next-key locking. This means that besides index records, InnoDB can also lock
the “gap” before an index record to block insertions by other users immediately before that index record. A next-key
lock locks an index record and the gap before it; a gap lock locks only the gap before some index record.
● Table-level locking - With table-level locking, InnoDB ensures the utmost integrity of the data it contains when
doing direct changes to the table structure itself. This feature is controlled by the --innodb-table-locks
server setting, which is ON by default.
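A sketch of row-level locking using the world database's City table (the ID values are illustrative):

```sql
-- Session 1: lock a single row
START TRANSACTION;
SELECT * FROM City WHERE ID = 1 FOR UPDATE;

-- Session 2: touches a different row, so it proceeds immediately
UPDATE City SET Population = Population + 1 WHERE ID = 2;
-- ...but this statement blocks until session 1 commits, because it needs the same row
UPDATE City SET Population = Population + 1 WHERE ID = 1;

-- Session 1: release the row lock
COMMIT;
```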
Special Features
As stated earlier, InnoDB provides MySQL with a transactional storage engine that has commit, rollback, and crash
recovery capabilities. In addition, InnoDB does locking on the row level and also provides a consistent non-locking
read in SELECT statements. These features increase multi-user concurrency and performance. InnoDB also supports
FOREIGN KEY constraints. InnoDB tables can be freely mixed with tables from other MySQL storage engines, even
within the same statement.
[Figure: the in-memory buffer pool sits between the database server and disk storage, reading and writing data
pages through the O/S.]
● Log buffer - This buffer contains cached log records. The innodb_log_buffer_size server
configuration variable controls the size of the buffer that InnoDB uses to write to the log files on disk.
Sensible values range from 1MB to 8MB. The default is 1MB. A large log buffer allows large transactions to
run without a need to write the log to disk before the transactions commit. Thus, with big transactions,
making the log buffer larger will save disk I/O.
Doublewrite Buffer (From the InnoDB online manual – section 11.4.3)
InnoDB uses a novel file flush technique involving a structure called the doublewrite buffer. It adds safety to
recovery following an operating system crash or a power outage, and improves performance on most varieties
of Unix by reducing the need for fsync() operations. Before writing pages to a data file, InnoDB first writes
them to a contiguous tablespace area called the doublewrite buffer. Only after the write and the flush to the
doublewrite buffer has completed does InnoDB write the pages to their proper positions in the data file. If the
operating system crashes in the middle of a page write, InnoDB can later find a good copy of the page from the
doublewrite buffer during recovery.
[Figure: the log buffer is written to disk at COMMIT and at checkpoints. The ib_logfile files receive the redo
log records; the ibdata files hold the insert and update undo logs. Modified pages remain in the buffer pool.]
As the figure demonstrates, InnoDB inserts transactional statements as log records into the log buffer, while
simultaneously executing the modifications those statements make against the in-memory copy of the record data
in the buffer pool. (This dual buffering is distinct from the doublewrite buffer described earlier, which concerns
flushing pages to the data files.)
The innodb_flush_log_at_trx_commit server variable tells the InnoDB storage engine when a transaction
commit should flush the log buffer to disk, making the modifications made by the transaction permanent and able to
survive a database crash. With many short transactions, changing this variable can really improve performance. The
values that this variable can take are:
● 0 - The logs are flushed to disk approximately once per second, not at each commit.
● 1 - The COMMIT statement initiates the flush (some disks have internal caches, etc., that could affect this).
● 2 - The log buffer is written to the log file at commit, but the file is not necessarily flushed to disk, because the
file system may have a cache. An fsync() occurs every second.
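As a sketch, the trade-off can be adjusted at runtime:

```sql
-- Relax durability for busy, short-transaction workloads:
SET GLOBAL innodb_flush_log_at_trx_commit = 2;   -- write at commit, fsync() about once per second
SHOW GLOBAL VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
```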
[Figure: at COMMIT (plus checkpoint), the transaction's redo log records are flushed from the log buffer to the
ib_logfile files; the insert and update undo logs reside in the ibdata files, along with the additional memory pool.]
A checkpoint records that the log has been flushed to disk (the log buffer is also flushed). Checkpoints influence
crash recovery time, not the ACID properties of the storage engine. Checkpoints are performed when the database
is idle or when the redo log is filling up. They are written into the redo log header and are used by InnoDB at
recovery time to determine what to recover.
Recovery Management Subsystem (RMS)
The recovery management subsystem inside InnoDB flushes database pages and transaction operations to the log files
for the purpose of backup and recovery. The RMS implements a fuzzy checkpoint operation by continually flushing
modified database pages from the buffer pool in small batches. InnoDB writes to the log files in a circular fashion, so
if the log file has reached the configured limit set by the innodb_log_file_size system variable, a checkpoint
operation is executed at once to flush the modified database pages. This is done in order to make sure the committed
pages are available in case of recovery.
Log File Size
The size of each log file should be chosen to avoid executing checkpoint operations too often. The bigger log file size
reduces disk I/O in checkpointing. However, the larger size of the log file increases the “redo” recovery time in case
of a server crash.
The innodb_log_file_size server variable defines the size of each log file in a log group, in megabytes.
Sensible values range from 1M to 1/n-th of the size of the buffer pool (innodb_buffer_pool_size), where n is
the number of log files in the group. The larger the value, the less checkpoint flush activity is needed in the buffer
pool, saving disk I/O. But larger log files also mean that recovery will be slower in case of a crash. The combined
size of the log files must be less than 4GB on 32-bit computers. The default is 5M.
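A sketch of the corresponding configuration-file settings (the sizes are illustrative and must respect the 4GB combined limit on 32-bit systems):

```ini
[mysqld]
innodb_log_file_size = 256M     # larger redo logs mean fewer checkpoints...
innodb_log_files_in_group = 2   # ...but longer crash recovery
```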
Action: Install a fresh copy of the world database to remove any indexes or other actions that may have altered the
contents of the database.
Action: Source the pt_stored_procedures.sql file to load the stored procedures that will be used in the
labs for this chapter.
Step 2. Test InnoDB INSERT bulk performance with varying COMMIT intervals
CALL innodb_inserts(100000,1);
Effect: This stored procedure will create a table called innodb_test and load it with 100,000 random records,
committing changes after each record. (Note that slow disk speed may cause this procedure to be very slow!)
How long did it take to perform the insert of the 100,000 records with a commit after each record?
__________________________________________________________________________________________
Action: Run the innodb_inserts stored procedure again, but this time commit every 5 records:
CALL innodb_inserts(100000,5);
How long did it take to perform the insert of the 100,000 records with a commit performed every 5 records?
__________________________________________________________________________________________
Action: Run the innodb_inserts stored procedure again, but this time commit every 50 records:
CALL innodb_inserts(100000,50);
How long did it take to perform the insert of the 100,000 records with a commit performed every 50 records?
__________________________________________________________________________________________
Action: Run the innodb_inserts stored procedure again, but this time commit every 500 records:
CALL innodb_inserts(100000,500);
How long did it take to perform the insert of the 100,000 records with a commit performed every 500 records?
__________________________________________________________________________________________
Action: Run the innodb_inserts stored procedure again, but this time commit every 5000 records:
CALL innodb_inserts(100000,5000);
How long did it take to perform the insert of the 100,000 records with a commit performed every 5000 records?
__________________________________________________________________________________________
Action: Run the innodb_inserts stored procedure again, but this time commit every 10000 records:
CALL innodb_inserts(100000,10000);
How long did it take to perform the insert of the 100,000 records with a commit performed every 10000 records?
__________________________________________________________________________________________
Action: Run the innodb_inserts stored procedure again, but this time commit every 100000 records:
CALL innodb_inserts(100000,100000);
How long did it take to perform the insert of the 100,000 records with a single commit performed afterward?
__________________________________________________________________________________________
Step 3. Review the results
Of all the different commit intervals, which performed the fastest, and why?
__________________________________________________________________________________________
[Figure: an InnoDB B+tree index. The internal pages hold primary key values (PKV); the leaf pages hold the
index records, which for the clustered index are the rows themselves.]
All InnoDB indexes are B+trees where the index records are stored in the leaf pages of the tree. The default size of
an index page is 16KB. When new records are inserted, InnoDB tries to leave 1/16 of the page free for future
insertions and updates of the index records. If index records are inserted in a sequential order (ascending or
descending), the resulting index pages are about 15/16 full. If records are inserted in a random order, the pages are
from 1/2 to 15/16 full. If the fill factor of an index page drops below 1/2, InnoDB tries to contract the index tree to
free the page.
[Figure: at COMMIT, the log buffer is flushed and moved, and unnecessary log entries are freed. The rollback
segment (1000 slots per page) holds the undo logs for a transaction; the undo log covers updates and deletes,
and there are two undo logs per transaction.]
When the system recovers from a crash, the InnoDB logs will be applied to any rows that were being updated at the
time of the crash but had not been flushed to disk. First, the system applies the redo log, re-applying the changes it
has recorded, in a serial manner, to the tables and indexes in question. Then, since commits are also logged, the
system knows which undo logs to apply to bring the database back to a consistent state: the state in which all
committed data is consistent in the database.
● PageNo - The page number makes up four bytes of the undo log record. It identifies the page within the tablespace where the change was made.
● Primary Key Value - The undo log records begin with the primary key value.
NOTE: InnoDB and Primary Keys
If an InnoDB table is created without a PRIMARY KEY, MySQL picks the first UNIQUE index that has only NOT
NULL columns as the primary key and uses it as the clustered index (primary key value). If there is no such index
in the table, InnoDB internally generates a clustered index where the rows are ordered by the row ID that InnoDB
assigns to the rows in such a table. The row ID is a 6-byte field that increases monotonically as new rows are
inserted. Thus the rows ordered by the row ID will be physically in the insertion order.
● Old trx id - The next section of the record is the old transaction ID, the ID of the transaction that changed that
row.
● Old values on that row – This contains the old values that were replaced during the transaction.
Multiversioning
Multiversion Concurrency Control (MVCC) is an advanced technique for improving database performance in a
multiuser environment. The InnoDB tablespace contains many versions of the same rows in order to maintain
isolation between transactions. Different transactions see different versions of the same rows. To manage this, the
InnoDB storage engine maintains the following IDs:
● Create Version ID – Each data record is tied to a version; this identifier tells InnoDB when the data record was created.
● Delete Version ID – When a record is deleted (or expires), this identifier is maintained to track the deletion of the data record.
Due to MVCC's handling of UPDATE statements, the physical tablespaces can become riddled with holes. These holes can cause the ibdata files to grow quite large for tables that receive many updates. This should be considered when discussing multiple tablespaces in the application architecture.
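Although not part of this lesson's steps, the fragmentation described above can usually be addressed by rebuilding the table. A hedged sketch (the table name is illustrative; for InnoDB, OPTIMIZE TABLE maps to a table rebuild):

```sql
-- Rebuild a heavily updated InnoDB table to compact the holes left by MVCC.
-- Note: with a shared ibdata tablespace, the reclaimed space is reused by
-- InnoDB rather than returned to the operating system.
OPTIMIZE TABLE city_huge;
```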
Action: Install a fresh copy of the world database to remove any indexes or other actions that may have altered the
contents of the database.
Action: Source the pt_stored_procedures.sql file to load the stored procedures that will be used in the
labs for this chapter.
CALL create_city_huge(100);
Effect: This stored procedure will create a table called city_huge that is based on the City table. Once the
city_huge table is created, the stored procedure will load in the records from the City table 100 times, creating
approximately 400,000 records in the new table.
Action: Change the storage engine being used for the city_huge table to InnoDB by executing the following
SQL statement:
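The statement itself is not reproduced on this page; it would take roughly the following shape (assuming world is the current database):

```sql
-- Convert the city_huge table from its current engine to InnoDB.
ALTER TABLE city_huge ENGINE = InnoDB;
```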
This process will take some time, so it may be a good time to take a break.
Step 2. Put InnoDB under some traffic load
Action: Issue the following command in an O/S terminal window (command line) to run the mysqlslap
benchmarking tool against the city_huge table:
Effect: This will execute a pre-defined SQL file that will run multiple queries against the city_huge table to give us a good number of InnoDB statistics to look at. (Note: The innodb_query.sql file is identified here as being in the home directory of the current user; alter that location to wherever this file is located on your system.)
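The exact command is not reproduced on this page; a mysqlslap invocation of roughly the following shape would fit the description (the user, password, concurrency, and iteration values are assumptions to adjust for your system):

```shell
mysqlslap --user=root --password \
          --create-schema=world \
          --query=~/innodb_query.sql \
          --concurrency=10 --iterations=5
```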
Action: In the mysql client, enter the following command to display the InnoDB performance counters, information about transaction processing, and a large number of other statistics to help in the tuning process:
Effect: The InnoDB status report will display a large amount of information and takes some time to decipher.
(Note: With the way that MySQL displays information, you may have to scroll through the window to get to the
information that is required.)
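The command in question is the standard InnoDB monitor statement; the \G terminator gives vertical output that is easier to scroll through in the mysql client:

```sql
SHOW ENGINE INNODB STATUS\G
```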
Action: The first thing to check in the InnoDB status report is that the sampling period is long enough to produce meaningful statistics. This is shown on the first line under the title INNODB MONITOR OUTPUT. The following is an example of this line:
Because many of the values are per-second averages, it is important to have a sampling period of at least 20-30 seconds. Anything less is unusable and should be discarded.
Step 5. Review SEMAPHORES
Action: Under the SEMAPHORES section of the InnoDB status report, there are two sub-sections. The first is the
current waits (which will only be displayed if running in a high concurrency environment and InnoDB has to
rely on O/S waits) and the second is event counters. For our purposes, the event counters will provide the
information we are looking for. The following is an example of what this section looks like:
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 2685, signal count 2087
Mutex spin waits 0, rounds 171700, OS waits 1636
RW-shared spins 960, OS waits 319;
RW-excl spins 1832, OS waits 730
The reservation count and signal count display how often InnoDB uses the internal sync array; the ratio between them represents how frequently InnoDB falls back on the OS wait functions.
The OS waits listed under the exclusive (RW-excl) and shared (RW-shared) locks are not related to the wait array info: they count the number of times InnoDB relied on the OS to put the thread to sleep, hoping the object would already be freed on wake-up. These OS waits can be very slow; thousands of them per second indicate a problem and identify an area needing attention.
Spin waits and spin rounds identify a problem when both are high, showing that significant CPU resources are being wasted. Even then, there is cause for concern only at hundreds of thousands of spin rounds per second. The system variable innodb_sync_spin_loops can be used to help correct this problem.
Step 6. Review TRANSACTIONS
Action: Under the TRANSACTIONS section of the InnoDB status report, review the content associated with the transactions that occurred during the sampling period. This section can be very helpful in determining whether your application has lock contention and the reasons for transaction deadlocks. The following is an example of what this section looks like:
------------
TRANSACTIONS
------------
Trx id counter 0 264408
The Trx id counter, which is incremented for each transaction, is the current transaction identifier.
Purge done for trx's n:o shows the transaction number up to which purging is complete. Uncommitted transactions (which may be old and stale) can consume resources by blocking the purge process. Reviewing the difference between the current and last-purged transaction counters helps identify whether this is occurring. The system variable innodb_max_purge_lag can be used to help correct this problem.
History list length 6 is the number of unpurged transactions in the undo space. This value increases as transactions that have performed updates are committed and decreases as purge runs.
In the LIST OF TRANSACTIONS ... sub-section, the transactions that have occurred during the sampling will be
displayed. If the number of connections made during the sampling was small, all the connections will be printed in
the transaction list. However, if there were a large number of connections only a few will be printed. This ensures
that the report output will not grow too large.
Step 7. Review FILE I/O
Under the FILE I/O section of the InnoDB status report, the state of the file input/output helper threads is displayed. These helper threads are responsible for insert buffer merges (insert buffer thread), asynchronous log flushes (log thread), read-ahead (read thread), and flushing of dirty buffers (write thread). This section can be very helpful in determining whether your application's workload is I/O bound.
The following is an example of what this section looks like:
--------
FILE I/O
--------
For each helper thread, you can see the thread state: whether the thread is ready (waiting for i/o request) or executing a certain operation.
The number of pending operations is shown for each of the helper threads (these are the operations queued for execution or being executed at the same time). This includes the number of pending fsync operations. InnoDB calls fsync() on modified files to ensure data makes it to the disk (simply passing it to the OS cache is not enough). If any of these values are constantly high, an I/O-bound workload is possible. However, not all I/O is accounted for here (e.g., I/O requests submitted by threads executing requests), so the workload may still be I/O bound even while these numbers display zeroes.
The number of file I/O operations is shown as computed averages. These numbers can be very helpful for monitoring and graphing. The bytes/read value shows the average size of read requests and is best considered a measure of read-ahead efficiency. For random I/O it will most likely display 16K (a page size); for a full table scan or index scan, read-ahead may be performed, which can increase the average read size significantly.
Step 8. Review INSERT BUFFER AND ADAPTIVE HASH INDEX
Under the INSERT BUFFER AND ADAPTIVE HASH INDEX section of the InnoDB status report, the status of the insert buffer and adaptive hash index is displayed. An adaptive hash index is a hash index that InnoDB builds for some pages to speed up row lookups by replacing B-tree searches with hash searches. The following is an example of what this section looks like:
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
ibuf: size 1, free list len 405, seg size
For the most part, there is very little that can be done to improve the performance reported in this section, so it can be considered informational only.
The first line shows the status of the insert buffer (ibuf), including the segment size, the free list length, and any records located in the insert buffer. This is followed by how many inserts were done into the insert buffer, how many records were merged, and how many merges took place. The ratio of merges to inserts gives the insert buffer efficiency.
The last part displays the hash table size, the number of used cells, and the number of buffers used by the adaptive hash index. The numbers of hash index lookups and non-hash index lookups indicate the hash index efficiency.
Step 9. Review LOG
Under the LOG section of the InnoDB status report, information associated with the log subsystem is displayed. The
following is an example of what this section looks like:
---
LOG
---
Log sequence number 2 2284053902
Log flushed up to 2 2284053902
Last checkpoint at 2 2284053902
0 pending log writes, 0 pending chkp writes
129214 log i/o's done, 0.00 log i/o's/second
This section displays the current log sequence number (the number of bytes InnoDB has written to log files since system tablespace creation). It also identifies the point up to which the logs have been flushed, and therefore how much data remains unflushed in the log buffer, along with when the last checkpoint was performed. Monitoring this information provides the data needed to adjust innodb_log_buffer_size. If more than 30% of the log buffer is regularly being flushed, it can be an indication that this system variable needs to be increased.
The last two lines display the number of pending normal log writes, the number of pending checkpoint log writes, and the number of log I/O operations. This allows the tablespace-related I/O to be separated from the log-related I/O so that you can see how much I/O your log file requires.
Log writes may be more or less expensive based on the value of the innodb_flush_log_at_trx_commit system variable. When this variable is set to 2, log writes are sent only to the O/S cache; because these writes are sequential, such log writes are considered very fast.
Step 10. Review BUFFER POOL AND MEMORY
Under the BUFFER POOL AND MEMORY section of the InnoDB status report, buffer pool activity and memory
usage is displayed. This report provides the information needed to determine if the buffer pool is sized properly.
The following is an example of what this section looks like:
In this section, the display shows the total memory allocated by InnoDB, the amount of memory allocated in the additional memory pool, the total number of pages in the buffer pool, the number of free pages, the number of pages allocated to database pages, and the number of dirty pages.
In relation to sizing the buffer pool: if a large number of pages is consistently free, the active database size is probably smaller than the allocated buffer pool, and the pool can be tuned down.
The pending reads and writes lines identify pending requests at the buffer pool level. InnoDB may merge multiple requests into one at the file level, so these values can differ. In addition, the different types of I/O submissions by InnoDB are displayed: least recently used (LRU) flushes, which write out dirty pages that have not been accessed recently; flushes of old pages driven by the checkpointing process; and independent single-page writes.
The third-to-last line displays the number of pages being read and written. Created pages are empty pages created in the buffer pool for new data (the previous page content was not read into the buffer pool).
The last line displays the buffer pool hit ratio, which measures buffer pool efficiency; 1000/1000 corresponds to a 100% hit rate. Because the buffer pool hit rate is workload dependent, it is hard to define what constitutes poor performance: sometimes 950/1000 is enough, and sometimes a workload is I/O bound with a hit rate of 995/1000.
Step 11. Review ROW OPERATIONS
Under the ROW OPERATIONS section of the InnoDB status report, activity at the row level along with some additional system information is displayed. The following is an example of what this section looks like:
--------------
ROW OPERATIONS
--------------
2 queries inside InnoDB, 0 queries in queue
2 read views open inside InnoDB
Main thread process no. 4330, id 2996149136, state: sleeping
This section displays the InnoDB thread queue status, answering the questions: how many threads are waiting and how many are active?
The second line answers the question of how many read views are open inside InnoDB; a read view belongs to a transaction that has been started but has no statement currently active.
The third line displays the state of the InnoDB main thread, which is responsible for scheduling a number of system operations, including flushing dirty pages, checkpointing, purging, flushing logs, and insert buffer merges.
The fourth line displays the number of row operations since the system has started along with average values
providing valuable data for monitoring and graphing the InnoDB load.
Step 12. Tune InnoDB performance
Action: Now that the InnoDB status report has been explained, it is time to change some of the InnoDB system variables and observe how that affects performance. The first system variable to change is innodb_buffer_pool_size. The value assigned to this system variable should be approximately 80% of system memory. This system variable cannot be set dynamically, so the server will need to be stopped and restarted with the following command-line option:
--innodb_buffer_pool_size=430000000
Alternatively, place the option in the my.cnf file under the [mysqld] section:
[mysqld]
innodb_buffer_pool_size=430000000
The value of 430000000 should be altered to around 80% of the system memory available to mysqld. This value,
80%, is appropriate for a dedicated database server; however, setting it too high can create competition for physical
memory and cause paging in the O/S.
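As a quick sanity check on that number: 430000000 bytes is roughly 80% of a 512 MB machine. A small, illustrative calculation (not part of the lab itself; the helper function name is ours):

```python
def suggested_buffer_pool_size(total_memory_bytes, fraction=0.8):
    """Suggested InnoDB buffer pool size: a fraction (default 80%)
    of total system memory, appropriate for a dedicated server."""
    return int(total_memory_bytes * fraction)

# A 512 MB machine: 80% is about the 430000000 bytes used above.
print(suggested_buffer_pool_size(512 * 1024 * 1024))   # 429496729
# A 2 GB machine, as in the further practice later in this chapter:
print(suggested_buffer_pool_size(2 * 1024 ** 3))       # 1717986918
```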
Action: Issue the following command twice (the first is a buffer warm up) in an O/S terminal window to run the
mysqlslap benchmarking tool against the city_huge table again:
This will execute a pre-defined SQL file that will run multiple queries against the city_huge table again. Did the
performance of the operation improve or worsen after adjusting the InnoDB buffer pool size?
______________________
Action: In the mysql client, change the setting of the innodb_flush_log_at_trx_commit system variable.
When this system variable is changed to 0, it is possible to lose up to one second's worth of transactions during a crash. However, the trade-off is that performance generally improves with a setting other than 1.
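The variable is dynamic, so a statement of roughly the following shape can be used (the value 0 matches the discussion above; 2 is a middle ground that flushes only to the O/S cache on each commit):

```sql
SET GLOBAL innodb_flush_log_at_trx_commit = 0;
```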
Action: Issue the following command twice (the first is a buffer warm up) in an O/S terminal window to run the
mysqlslap benchmarking tool against the city_huge table again:
This will execute a pre-defined SQL file that will run multiple queries against the city_huge table again. Did the
performance of the query improve or worsen after adjusting the InnoDB log flush behavior?
______________________
Step 13. Clean up
Action: Return the system variables changed during this exercise to their original values. This will require a
restart of the MySQL server with the innodb_buffer_pool_size or
innodb_flush_log_at_trx_commit system variables removed from the command line and my.cnf.
Effect: The MySQL server is now back to its original configuration with the altered settings reverted to their
default values.
Quiz
In this exercise you will answer the following questions pertaining to the InnoDB storage engine.
10. What are the possible disk footprints for InnoDB tables?
__________________________________________________________________________________
__________________________________________________________________________________
11. What types of locking does InnoDB support?
__________________________________________________________________________________
__________________________________________________________________________________
Further Practice
In this further practice you will tune InnoDB settings for improved performance during an import operation.
1. In the mysql client, create a smaller city_huge table to complete the steps that follow by executing the following SQL statement:
CALL create_city_huge(25);
2. In the mysql client, export the data from the city_huge table in random order to a flat file using the following SQL statement:
3. Export the data from the city_huge table ordered by the primary key to a flat file using the following
SQL statement:
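The export statements themselves are not reproduced on this page; sketches of the two exports might look as follows (the file paths are illustrative, and the ID primary key column is assumed from the City table):

```sql
-- Step 2: export the rows in random order
SELECT * FROM city_huge
ORDER BY RAND()
INTO OUTFILE '/tmp/city_huge_random.txt';

-- Step 3: export the rows ordered by the primary key
SELECT * FROM city_huge
ORDER BY ID
INTO OUTFILE '/tmp/city_huge_pk.txt';
```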
4. Truncate the city_huge table and import the random text file (created in step 2) back into the table.
Record how long it took approximately to load the records back in: _________________
5. Truncate the city_huge table and import the primary key text file (created in step 3) back into the table.
Record how long it took approximately to load the records back in: _________________
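A hedged sketch of the truncate-and-reload cycle used in steps 4 and 5 (the file path is illustrative and must match wherever your export file was written):

```sql
TRUNCATE TABLE city_huge;
LOAD DATA INFILE '/tmp/city_huge_random.txt' INTO TABLE city_huge;
```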
6. Shut down the mysql client.
7. Edit the /etc/my.cnf file and add the following line:
innodb_buffer_pool_size = 1652704000
This will increase the innodb_buffer_pool_size system variable to approximately 80% of the system memory (using 2GB as an example). Save and close the file.
8. Restart the MySQL server.
9. Restart the mysql client.
10. Disable (turn off) unique checks in mysql by entering the following SQL statement:
SET UNIQUE_CHECKS=0;
11. Truncate the city_huge table and import the random text file (created in step 2) back into the table.
Record how long it took approximately to load the records back in this time: _________________
12. Truncate the city_huge table and import the primary key text file (created in step 3) back into the table.
Record how long it took approximately to load the records back in this time: _______________
11 ALTERNATE STORAGE ENGINES
Advantages
● Fast – Reading or writing data from MEMORY tables is very fast, due to everything being in memory.
○ Non-transactional and stores all data in memory.
Action: Install a fresh copy of the world database to remove any indexes or other actions that may have altered
the contents of the database.
Action: Source the pt_stored_procedures.sql file to load the stored procedures that will be used in the
labs for this chapter.
Action: Create (or recreate) the city_huge table (which defaults to using the MyISAM storage engine) with
approximately 160,000 records by executing the following SQL statement:
CALL create_city_huge(40);
Effect: This stored procedure will create (or recreate) a table called city_huge that is based on the City
table. Once the city_huge table is created, the stored procedure will load in the records from the City table
40 times, creating approximately 160,000 records in the new table.
Action: Set the key_buffer_size system variable to approximately 50% of system memory by typing the
following in the mysql client:
Effect: The key_buffer_size system variable controls the size of the key cache. If your system does not have
512M of system memory, change the 256M key_buffer_size system variable to approximately 50% of the
memory on your system.
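The statement is not reproduced on this page; on a 512M system it would take roughly this shape:

```sql
-- Half of a 512 MB machine for the MyISAM key cache.
SET GLOBAL key_buffer_size = 256 * 1024 * 1024;
```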
Action: Load the index into memory by typing the following SQL statement:
Effect: This LOAD INDEX INTO CACHE statement pre-loads a table index into the default key cache. (Note:
LOAD INDEX INTO CACHE is used only for MyISAM tables.)
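A statement of the following shape performs the preload (the bare table name targets the default key cache):

```sql
LOAD INDEX INTO CACHE city_huge;
```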
Action: Execute the following mysqlslap command in the O/S terminal window (command line client):
Effect: The MEMORY_query.sql file (which runs multiple SELECT statements) is executed against the
MyISAM city_huge table using the mysqlslap benchmarking tool 5 times.
Action: Record the number of seconds it takes to run all the SELECT statements:
Average: _______________
Minimum: _______________
Maximum: _______________
Step 3. Test the MyISAM table with UPDATE queries
Action: Execute the following mysqlslap command in the O/S terminal window (command line client):
Effect: The MEMORY_update.sql file (which runs multiple UPDATE statements) is executed against the
MyISAM city_huge table using the mysqlslap benchmarking tool.
Action: Record the number of seconds it takes to run all the UPDATE statements:
Average: _______________
Minimum: _______________
Maximum: _______________
Action: Reduce the size of the key_buffer_size system variable to approximately 10% of system memory
by typing the following in the mysql client:
If your system does not have 512M of system memory, change the 50M key_buffer_size value
to approximately 10% of the memory on your system.
Action: Increase the size of the max_heap_table_size system variable to approximately 50% of system
memory by typing the following in the mysql client:
This variable sets the maximum size to which MEMORY tables are allowed to grow. If your system does not
have 512M of system memory, change the 256M max_heap_table_size system variable to approximately
50% of the memory on your system.
Action: Change the city_huge table to utilize the MEMORY storage engine by entering the following SQL
statement:
Effect: The MEMORY storage engine moves all the city_huge data into memory (RAM).
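The statement referred to would take roughly the following shape:

```sql
ALTER TABLE city_huge ENGINE = MEMORY;
```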
Step 5. Test the MEMORY table with SELECT queries
Action: Execute the following mysqlslap command in the O/S terminal window (command line client):
Action: Record the number of seconds it takes to run all the SELECT statements now that city_huge table is
using the MEMORY storage engine:
Average: _______________
Minimum: _______________
Maximum: _______________
Step 6. Test the MEMORY table with UPDATE queries
Action: Execute the following mysqlslap command in the O/S terminal window (command line client):
Action: Record the number of seconds it takes to run all the UPDATE statements now that city_huge table is
using the MEMORY storage engine:
Average: _______________
Action: Compare the MyISAM results with the MEMORY results. What do the numbers tell you?
Effect: You may find that the SELECT statements have similar run times; however, the UPDATE statements will
most likely run faster using the MEMORY storage engine.
● The implementation uses SELECT, INSERT, UPDATE, and DELETE, but not HANDLER.
● FEDERATED tables do not work with the query cache.
NOTE: FEDERATED Tables and Performance
Due to its numerous performance limitations, the FEDERATED storage engine is not a good choice for applications
that require speed and/or extensive business rules. The FEDERATED storage engine has features that
are useful in less demanding capacities, but it should be carefully evaluated before being put into any system that
requires demanding performance.
A large number of SELECT statements during insertion can degrade the compression unless only bulk or
delayed inserts are used.
Performance Issues
The ARCHIVE storage engine reduces the size of the files produced by most other storage engines by up to 70%. This can be
extremely effective for data that is no longer being modified and is strictly historical in nature.
Furthermore, retaining historical data can be extremely important for a number of business reasons and
sometimes for government regulation.
The following is a sample of the data represented in either a CSV table or using the above command:
Action: Install a fresh copy of the world database to remove any indexes or other actions that may have altered the
contents of the database.
Action: Source the pt_stored_procedures.sql file to load the stored procedures that will be used in the
labs for this chapter.
Action: Recreate the city_huge table (which defaults to using the MyISAM storage engine) with approximately
800,000 records by executing the following SQL statement:
CALL create_city_huge(200);
Effect: This stored procedure will create (or recreate) a table called city_huge that is based on the City
table. Once the city_huge table is created, the stored procedure will load in the records from the City table
200 times, creating approximately 800,000 records in the new table.
Action: Execute the myisampack command in the world directory underneath the mysql data directory
(most likely /usr/local/mysql/data/world) to compress the MyISAM table:
myisampack city_huge.MYI
Effect: The myisampack utility compresses MyISAM tables by compressing each
column in the table separately. Usually, myisampack reduces the data file by 40%-70%. (Note: You must have O/S
access rights to the data directory to perform actions against the data in the database outside of the mysql
client.)
Step 4. Check the compressed MyISAM table
Action: Execute the following myisamchk command in the O/S terminal window (command line client):
Effect: The myisamchk utility gets information about your MyISAM database tables or checks, repairs, or
optimizes them. Using the -rq options tells myisamchk to repair (-r) using a faster repair (-q) by not
modifying the data file.
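The command referred to is likely of the following shape, run from the world data directory (as with myisampack, the index file name is the argument):

```shell
myisamchk -rq city_huge.MYI
```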
Step 5. Test the compressed MyISAM table with SELECT queries
Action: Execute the following mysqlslap command in the O/S terminal window (command line client):
Effect: The MEMORY_full_query.sql file (which runs two SELECT statements that produce a full search
against the data) is executed against the city_huge table using the mysqlslap benchmarking tool.
Action: Record the number of seconds it takes to run all the SELECT statements:
Average: _______________
Minimum: _______________
Maximum: _______________
Action: Execute the following mysqlslap command in the O/S terminal window (command line client):
Effect: The MEMORY_query.sql file (which runs multiple SELECT statements) is executed against the
city_huge table using the mysqlslap benchmarking tool.
Action: Create an identical table to city_huge called city_huge_archive by executing the following
SQL statements in the mysql client:
Effect: There are now two tables: city_huge_myisam that contains the data compressed using MyISAM
compression, and the city_huge table which is now utilizing the ARCHIVE storage engine.
Step 7. Review the files utilized by the two tables
Action: In the world directory under the MySQL data directory (most likely
/usr/local/mysql/data/world), compare the files city_huge_myisam.MYD and city_huge.ARZ
using the following command in an O/S terminal window:
ls -l city_huge*
Step 8. Test the ARCHIVE table with SELECT queries
Action: With the city_huge table now using the ARCHIVE storage engine, execute the following
mysqlslap command in the O/S terminal window (command line client) again:
Action: Record the number of seconds it takes to run all the SELECT statements:
Average: _______________
(Note: This step can take a considerable amount of time to complete; feel free to kill the process if the time is
excessive.)
Action: Record the number of seconds it takes to run all the SELECT statements:
Average: _______________
Minimum: _______________
Maximum: _______________
Step 9. Review the results
Action: Compare the MyISAM compression results with the ARCHIVE results. What do the numbers tell you?
Relay Slave
A BLACKHOLE table can be useful on a slave server whose purpose is to relay the master server's data to one or
multiple slaves through a relay slave. This relay slave (or middle server) would contain BLACKHOLE tables that
discard the data itself while the changes are still recorded in its binary log for the downstream slaves.
[Figure: a Master Host replicating through a Relay Slave to multiple Slave Servers]
[Figure: a Master Host running a "dummy" mysqld process that feeds a filtered binary log to the Slave Server]
The master writes to its binary log. The “dummy” mysqld process acts as a slave, applying the desired combination
of replicate-do-* and replicate-ignore-* rules, and writes a new, filtered binary log of its own. The dummy process
does not actually store any data, so there is little processing overhead incurred by running the additional mysqld
process on the replication master host. This type of setup can be repeated with additional replication slaves.
NOTE: Other possible uses for the BLACKHOLE storage engine
11.7 How Using Multiple Storage Engines Can Affect Performance
Storage engines can be mixed on the same server, or even in a single query. Constant data can be stored in MyISAM,
dynamic critical data in InnoDB, and temporary tables in MEMORY. Converting a table back and forth is simple:
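For example, a table can be moved between engines with a single statement (using the course's city_huge table as the example; converting a large table rebuilds it, so this can take time):

```sql
ALTER TABLE city_huge ENGINE=InnoDB;   -- convert to InnoDB
ALTER TABLE city_huge ENGINE=MyISAM;   -- and back to MyISAM
```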
On the downside, mixed configurations are more complicated to back up, maintain, and tune, and there is more
potential for bugs when multiple storage engines are in use; the optimizer may have an especially hard time. In
addition, every storage engine has its own memory buffers, so using many storage engines leaves less buffer
space for each engine.
Mixing storage engines can improve overall performance: for example, MEMORY for small tables that change
frequently or for caches; MyISAM for read-heavy data or where FULLTEXT indexes are needed; InnoDB for data
with many UPDATEs and DELETEs; ARCHIVE for logs; and so on.
One disadvantage of mixing storage engines is that each engine has its own buffers, so it becomes much harder to
configure MySQL for all the engines in use, because memory has to be shared efficiently among them. Another
disadvantage comes from mixing transactional and non-transactional tables in the same transaction: this should
not be done, because operations on non-transactional tables cannot be rolled back.
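As a hedged illustration of the shared-buffer point, a my.cnf fragment for a mixed-engine server has to budget a separate buffer per engine; the sizes below are placeholders for illustration, not recommendations:

```ini
[mysqld]
# Each engine caches independently, so these settings compete for the same RAM.
key_buffer_size         = 256M   # MyISAM index (key) cache
innodb_buffer_pool_size = 1G     # InnoDB data and index cache
max_heap_table_size     = 64M    # upper limit on MEMORY table size
```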
[Figure: MySQL Cluster architecture - SQL Nodes (multiple MySQL Servers) on top of Data Nodes running NDBCluster]
This architecture ensures there is no single point of failure. Applications continue to run and data remains consistent,
even if any one of the data, SQL, or management server nodes fails. In addition, nodes and computers can be
distributed across geographies. This is important for recovering a database or an entire cluster after a physical
disaster in a particular location.
NDBCluster Storage Engine
The NDBCluster storage engine runs in server memory and supports transactions and synchronous replication.
"Synchronous replication" is between servers only and is automatic: the engine spreads the data redundantly across
many servers in the cluster, which allows one of the servers to go offline without interrupting data availability. The
storage engine uses row-level locking; all reads are non-locking by default, and READ-COMMITTED is the only
supported isolation level. As with clusters in general, the NDBCluster storage engine ensures that individual node
crashes do not stop the cluster. The storage engine provides automatic synchronization of data nodes at restart, as
well as recovery from checkpoints and logs after a cluster crash. It can perform many maintenance tasks online,
including online backup and online software upgrades. In addition, the storage engine supports unique hash
indexes and T-tree ordered indexes.
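As a minimal sketch (the table name and columns are made up for illustration), the only change needed to place a table in the cluster is the ENGINE clause:

```sql
CREATE TABLE cluster_demo (
  id   INT NOT NULL PRIMARY KEY,   -- NDB backs this with a unique hash index
  name VARCHAR(50)
) ENGINE=NDBCLUSTER;
```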
NOTE: MySQL Cluster for High Availability Training Course
This section was designed as an introduction to MySQL Cluster; however, the topic is large enough that this
training course cannot give it the time it deserves. The MySQL Cluster for High Availability training course is
recommended instead. It is a 3-day course taught by an authorized MySQL instructor who explains and
demonstrates the important details of clustering: how to get started with MySQL Cluster, how to install the
different node types, how to properly configure and manage the cluster nodes to ensure high availability, and how
the internals of the cluster work.
Quiz
In this exercise, connect each storage engine type with the feature(s) associated with it. Note: Each storage engine
may connect to more than one feature.
Storage Engine Type Feature
MySQL Performance Tuning
12 CONCLUSION
Thank you in advance for taking the time to give us your opinions!
5) The registration at the top of the page is optional, so skip it and select the link at the bottom of the page that
reads No thanks, just take me to the downloads!. You will not perform registration during the course.
6) Scroll down the page to the section showing the “Mirrors in: ...”. Find the location closest to you and click
on the link next to it called HTTP:
7) A window will pop up with the option to Save File or Cancel. Select Save File to download the software
for installation. (If a window regarding security appears, click on Run to allow the download.)
8) When the file (mysql-essential-<release_#>-win32.msi) is finished downloading, open the
file to start the installation.
9) The Windows Setup Wizard will appear. Click on the Next> button to perform the set-up.
10) Select a Typical installation.
9) A window will pop up with the option to Save File or Cancel. Select Save File to download the software
for installation.
10) Unpack the tarball to /usr/local with (change /tmp/ to the proper path):
cd /usr/local
tar -zxvf /tmp/mysql-5.1.44-linux-i686-glibc23.tar.gz
ln -s mysql-5.1.44-linux-i686-glibc23 mysql
cd mysql
scripts/mysql_install_db
groupadd mysql
cp support-files/mysql.server /etc/init.d/mysql
/etc/init.d/mysql start
PATH=${PATH}:/usr/local/mysql/bin
3) Under the Example Databases section, select the Zip file for download of the world database:
4) Unzip the file into the <local_disk> folder (i.e., directly under the C:\ or D:\ drive).
Note: For the purposes of this course, you will not need to download the setup guide.
This appendix is intended as a quick introduction to the Linux operating system. Its purpose is to give individuals
not familiar with Linux or the Linux command line the ability to navigate the system, create files, and view
directories. This document was created as part of the classes offered by MySQL, so it is biased toward the
commands we use in our classes.
Windows in Linux?
This document assumes that the Linux version being used is recent and that a graphical environment has been
installed. The basic graphical environment on Linux is the X Window System (historically provided by
implementations such as XFree86), which is the Linux graphical user interface (GUI) and provides the basic
framework for graphical use of the Linux operating system.
General Notes
● Linux is CaSe-SeNsItIvE, which is different from a Windows environment. Dog.txt, DOG.txt and
doG.txt would be considered three different file names in Linux, whereas in Windows they would all be
treated as the same file name.
The resolution of the screen can be changed using the following keyboard commands:
○ <CTRL><ALT><+> - Increases the resolution (e.g., 800x600 => 1024x768).
○ <CTRL><ALT><-> - Decreases the resolution (e.g., 1024x768 => 800x600).
Note: In X-Windows it may be necessary to add the resolution sizes to the X-Windows
configuration file (e.g., /etc/X11/XF86Config). The first resolution size identified will be the default setting
when X-Windows starts, and the other settings will be used when the <CTRL><ALT><+> and
<CTRL><ALT><-> keys are pressed. If the entries are not in normal resolution order (e.g., "1024x768",
"800x600", "640x480", etc.), then these keys may not necessarily increase or decrease the resolution as
expected (they step through the list in the order given).
● Absolute path - This type of path is a full path that identifies the actual location of the file or directory
being called, regardless of the current directory. This can be useful when it is necessary to start an
application from a directory the executables are not located in. By calling the executable with its full path
(e.g., /usr/local/mysql) from within another directory (e.g., /tmp), any external dealings from within
the application will take place in the directory it was started in (unless otherwise identified by the
application itself).
● Relative path - This type of path identification allows applications located within the current directory to
be executed without having to identify the exact path (e.g., executing mysql within the /usr/local
directory). The operating system effectively prepends the current directory to the file name.
Time-saving features
When using the command line there are many time-saving features that can be used to minimize the amount of
typing that is needed. The following is a recap (or full description) of the most useful:
● TAB completion - <TAB> in a text terminal will autocomplete the command if there is only one option.
● Wildcards - Wildcards can be substituted for file names. The most common wildcard is *. By issuing the
following command (with a wildcard), the entire contents of the directory are displayed (including all
subdirectories and their contents):
ls /usr/local/*
● Shortcuts - In Linux there are multiple characters that assist in minimizing the amount of typing necessary.
The following are the most common:
○ ~ (tilde) - This character is short for the current user's home directory (most likely
/home/username). If cd ~/my_dir is entered, the operating system will go to the
/home/username/my_dir directory. The command cd alone is equivalent to typing cd ~.
○ . (dot) - This character refers to the current directory. It is sometimes necessary in order to execute
applications located within the current directory. For example, ./my_program will attempt to
execute the application my_program in the current working directory.
○ .. (two dots) - These characters (when used consecutively) reference the parent directory of the
current working directory. For example, cd .. executed in the /usr/local directory will
make the /usr directory the current working directory.
○ & (ampersand) - This character, when placed after the application name (with a space
between the application and the ampersand), will run the application in the background.
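The shortcuts above can be combined in a short, self-contained session; the directory and file names below are made up for illustration:

```shell
# Work in a throwaway directory so the demo is self-contained.
demo_dir=$(mktemp -d)
cd "$demo_dir"
mkdir -p parent/child
cd parent/child
cd ..                          # ".." moves up to the parent directory
current=$(basename "$PWD")     # now "parent"
cd "$demo_dir"
touch a.txt b.txt notes.md
txt_count=$(ls *.txt | wc -l)  # "*" wildcard matches both .txt files
home_dir=~                     # "~" expands to the user's home directory
sleep 2 &                      # "&" runs the command in the background
bg_pid=$!                      # "$!" holds the background job's PID
kill "$bg_pid"                 # stop the background job again
```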
Common commands
There are some commands that are absolutely necessary to know (or at least understand) when using the Linux
command line. The following are a number of these commands, including those most useful in this training along
with some general need-to-know commands:
● Getting around - The first step in learning Linux is learning how to get around. The following is a recap
(or full description) of these commands:
○ ls - Under Linux, the command "dir" is an alias for ls, which lists the contents of either the current
directory (when executed without any parameters) or the contents identified by the parameters entered.
The following feature of the ls command is useful to know:
■ ls -al|more - This can be used when the output flows off of the screen. The -al parameter
identifies to list all the files (including hidden files). The |more command pauses the display
after each screenful.
○ cd - This command is an abbreviation for "change directory". When followed by a directory name, that
directory will become the active working directory. cd when used by itself is equivalent to cd ~
○ rm filename - This command will delete the file identified by the filename parameter. The user
attempting to delete the file(s) must be the owner of those files (with the exception of
superusers). Most Linux distributions will prompt the user to verify that they
wish to delete the files; however, the user can force the deletion without this prompt by adding the -f
option.
■ Deleting large numbers of files - Using the rm command with wildcards can delete large numbers
of files in a single command (e.g., rm -f * will remove all files in the current working directory,
no questions asked).
rm -rf /usr/local/apache
In this example, the /usr/local/apache directory (along with all the files and subdirectories it
contains) will be removed from the system. The -f option is almost a necessity; otherwise, the end
user would have to verify each deletion that takes place. There is no undo for this action.
○ mkdir directoryname - This command will create a subdirectory (identified by the directoryname
parameter) in the current directory (unless an absolute path is used). This command will produce an
error if the user creating the subdirectory does not have the required privilege or the directory name is
already being used in the directory (either the working directory or the absolute path directory, if used).
● Viewing and editing files - The following is a listing of the most common Linux commands associated
with viewing and editing files:
○ cat filename |more - This command will display the contents of filename on the screen. If the file is
a text-based file, then what is displayed will most likely be readable. However, if the file is an
application or other binary file, the display will be unreadable and will most likely cause havoc with
the terminal attempting to display it. The contents are displayed with no pause unless |more is
appended; with the |more combination, the display pauses after every screenful. Note: Use the
command reset after displaying a binary file to ensure that the terminal returns to its proper default
character set. Other commands that are similar to the cat command:
■ more filename - This command, which has been used as a parameter with other commands, can
be used to scroll through the content of a file.
■ less filename - This command is roughly equivalent to more but can be more convenient to use.
Press the <q> key to terminate the command.
○ vi filename - This command provides simple and standard text editing capabilities. There is a separate
appendix that describes how to use the vi text editor. There are other text editors to choose from. The
following is a list of the most common:
■ emacs filename - This command utilizes the emacs text editor to display and manipulate text files.
■ kwrite filename - This command utilizes the kwrite text editor to display and manipulate text
files.
■ nedit filename - This command utilizes the nedit text editor to display and manipulate text files.
■ vim filename - This command utilizes the vim text editor to display and manipulate text files.
○ <Middle mouse button> - This hardware interface tells Linux to paste the currently highlighted text
somewhere else. This is the normal "copy-paste" operation in Linux.
○ touch filename - This command will alter the date/time stamp of the file filename to the current time,
for files that exist. If no file exists with the identified filename, the operating system will create
an empty file with that name.
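Both behaviors of touch can be demonstrated in a throwaway directory (the file names are illustrative):

```shell
# touch in both of its roles: create an empty file, or update a timestamp.
work=$(mktemp -d)
touch "$work/new_file.txt"            # did not exist: created empty
size=$(wc -c < "$work/new_file.txt")  # 0 bytes
printf 'data' > "$work/existing.txt"
touch "$work/existing.txt"            # exists: only the timestamp changes
content=$(cat "$work/existing.txt")   # contents are untouched
```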
● Finding files - The following commands provide the means by which the operating system searches for
files, rather than the end user having to search each directory individually for a file or files:
Getting help
With the large number of commands offered by the Linux operating system, it is difficult to remember the syntax
and/or parameters of every command. This is where the built-in manual, available for the majority of commands,
is helpful. The following are the most common ways to access the help features of Linux:
● commandname --help |more - This parameter, used with most commands, displays brief help on a
command. "--help" works similarly to the DOS "/h" switch. The "|more" pipe is needed if the output is
longer than one screen. help commandname is equivalent to using the --help parameter.
● man commandname - The man command, followed by a commandname, displays the contents of the
system manual pages (help) on that topic. Press "q" to quit the viewer. The command info
commandname works like the man command but may contain more up-to-date information. For the
majority of commands the manual pages are hard to read and understand, so it is advisable to try
commandname --help first. If that does not answer the question being asked, then look in
the /usr/doc directory for the documentation of the specific command. To display a manual page from a
specific section, add the section number between the man command and the commandname queried
(e.g., man 3 exit displays information on the command exit from section 3 of the manual pages).
● apropos commandname - This command will provide a list of commands associated with the
commandname entered.
Shutting Down
Shutting the operating system down from the Linux GUI is similar to Microsoft Windows; however, from the
command line the following commands can be used:
● exit - This command logs the user out of the current shell. If that shell is the login shell, the user is
logged out of the system (the operating system itself keeps running).
● shutdown -h now - This command, when used by the root user, shuts the system down to a halt.
● halt - This command, when used by the root user, halts the machine. It can be very useful for shutting
down a remote machine.
● reboot - This command, when used by the root user, reboots the machine. It can be very useful when
managing a remote server.
● reset - This command can be used to rescue a terminal window that has been frozen or garbled (usually by
trying to display a binary file with a command such as cat). The end user may not be able to see the
command being typed; however, typing the command followed by the Enter key should produce the
desired result of resetting the terminal.
B.3 Obtaining system information
Collecting and reviewing system information can be an important aspect for system administrators and end users
alike. System information can include everything from determining the name of the computer that is being used to
determining if the computer can connect to the internet or to another machine on the network. The following are the
most common commands:
● du -bh / | more - This command displays detailed disk usage (hence the command name du) for each
subdirectory, starting at the file system root (/). The -bh options work together to display the information in
"human-readable" form.
● env - This command displays information about the current user environment.
● echo $PATH - This command displays a specific component of the current user environment called PATH
(the list of directories whose commands can be executed without having to be in that specific
directory).
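The two environment commands above can be sketched in a self-contained snippet (the variable names used to capture the output are illustrative):

```shell
# PATH is a colon-separated list of directories searched for commands.
path_entries=$(echo "$PATH" | tr ':' '\n' | wc -l)   # count the entries
first_dir=$(echo "$PATH" | cut -d: -f1)              # first search directory
env_has_path=$(env | grep -c '^PATH=')               # env lists PATH too
```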
B.4 Accessing a system remotely
● gzip file_name.tar - This command is used to compress files (in most cases files that have already been
tar'd) using Lempel-Ziv (LZ77) coding. The original file is replaced with a new file of the same name
plus an extended identifier containing the characters .gz, creating a file named
file_name.tar.gz. Sometimes these files will be shortened even further by other users to have an
extension of only .tgz.
● tar -zxvf file_name.tar.gz - This command will separate and decompress files that have been tar'd and
zipped using the tar and gzip commands. The -zxvf parameters tell the program to compress or
decompress files automatically (z), extract (x) the contents of the archive, display a verbose (v) output to
the screen, and operate on the archive file (f) that follows.
● gunzip file_name.tar.gz - This command decompresses files that have been compressed with gzip. This
command is useful with files that are tar'd and zipped if the end user only wants the archive file without
having it untar'd in the process.
● bzip2 file_name - (bzip = big zip) This command compresses a file (identified by the file_name parameter)
using the Burrows-Wheeler block-sorting text compression algorithm and Huffman coding. This type of
compression is generally considered better than that achieved by gzip, which uses Lempel-Ziv (LZ77)
coding. The original file is deleted and a new compressed file is created with the original name of the
file plus a new extension identifying it as compressed with bzip2 (*.bz2). The following
are two examples of compressing files with bzip2. The first is a standard bzip2 action, while the second
shows how to compress multiple files into one bzip2 file (note the -c option, which writes the compressed
streams to standard output so they can be redirected into a single file):
bzip2 My_file.txt
bzip2 -c My_file.txt His_file.txt Her_file.txt > Our_files.bz2
● bunzip2 file_name.bz2 - This command decompresses files that have been compressed with bzip2. The
bzip2 compression tools are very useful for very large files.
● unzip file_name.zip - This command decompresses files that have been compressed with the PKZIP
command for the Microsoft disk operating system (DOS).
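The archive and compression tools above can be exercised end to end in a throwaway directory; all paths and file contents here are made up for illustration:

```shell
# Round-trip: archive and compress with tar/gzip, extract elsewhere,
# and confirm the contents survive.
src=$(mktemp -d)
dst=$(mktemp -d)
echo "hello" > "$src/data.txt"
tar -czf "$src/backup.tar.gz" -C "$src" data.txt   # create the .tar.gz
tar -zxf "$src/backup.tar.gz" -C "$dst"            # extract it again
restored=$(cat "$dst/data.txt")
gzip -t "$src/backup.tar.gz" && integrity=ok       # gzip can verify an archive
```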
B.6 User management
Adding, deleting, and otherwise managing access for new and existing users is an important task for system
administrators; for most users these commands will not be required for day-to-day tasks. However, if these tasks
are required for use in the classroom, or will be part of the tasks required of the end user, it is important to
understand them. The following are the most common commands used in Linux to handle user management:
● adduser username - This command creates a user (identified by the username parameter) on the local host
(or the remote host if being utilized). The useradd username command is equivalent to the adduser
command.
● groupadd groupname - This command creates a new group (identified by the groupname parameter). A
group can exist without having any members; however, it is not good user management to have groups
without any members.
● passwd username - This command changes the password (or adds a password for a new user) for the
user identified by the username parameter; the new password is prompted for interactively. If the
username parameter is omitted, the passwd program changes the current user's password.
● chmod permissions file_name - This command changes the file access permissions for files the end
user has ownership rights to (or all files, if the end user is root). File permissions are set using three modes:
read (r), write (w) and execute (x), which are assigned per class of user (three classes in all: owner,
group and others). The following is an example of a file that grants read, write and execute permission
to all three classes: rwxrwxrwx. The permissions can be set using the symbolic format shown or using a
numeric (octal) value (e.g., chmod 755 file_name).
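A short session showing chmod in both forms (the file is a throwaway created with mktemp, and the captured variable names are illustrative):

```shell
# chmod in octal and symbolic form on a scratch file.
f=$(mktemp)
chmod 644 "$f"                         # octal: rw-r--r--
perm_644=$(ls -l "$f" | cut -c1-10)    # first 10 chars of the mode string
chmod u+x "$f"                         # symbolic: add execute for the owner
perm_744=$(ls -l "$f" | cut -c1-10)    # now rwxr--r--
```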
● rpm -qf filename - This command displays the name of the *.rpm package to which the file filename
belongs. This is a useful command if the original RPM package is unknown and there is a need to reinstall
the application.
● rpm -e packagename - This command uninstalls the RPM package (identified by the packagename
parameter). The packagename is the same name as the original RPM package installed but without the
dash and version number (or the .rpm extension).
C.1.1 Start vi
To use vi on a file, type in vi filename. If the file named filename exists, then the first page (or screen) of
the file will be displayed; if the file does not exist, then an empty file and screen are created into which you may
enter text.
vi -r filename recover filename that was being edited when the system crashed
C.1.2 Stop vi
Usually the new or modified file is saved when you leave vi. However, it is also possible to quit vi without saving
the file.
NOTE: The cursor moves to the bottom of the screen whenever a colon (:) is typed. This type of command is
completed by hitting the <Return> (or <Enter>) key.
:x<Return> quit vi, writing out modified file to file named in original invocation
:wq<Return> quit vi, writing out modified file to file named in original invocation
:q!<Return> quit vi even though latest changes have not been saved for this vi call
0 (zero) move cursor to start of current line (the one with the cursor)
The main purpose of an editor is to create, add, or modify text for a file.
o open and put text in a new line below current line, until <Esc> hit
O open and put text in a new line above current line, until <Esc> hit
R replace characters, starting with current cursor position, until <Esc> hit
cw change the current word with new text, starting with the character under cursor, until <Esc> hit
cNw change N words starting from the cursor, until <Esc> hit; e.g., c5w changes 5 words
C change (replace) the characters in the current line, until <Esc> hit
Ncc or cNc change (replace) the next N lines, starting with the current line, stopping when <Esc> is hit
dNw delete N words beginning with character under cursor; e.g., d5w deletes 5 words
D delete the remainder of the line, starting with current cursor position
Ndd or dNd delete N lines, beginning with the current line; e.g., 5dd deletes 5 lines
Nyy or yNy copy (yank, cut) the next N lines, including the current line, into the buffer
p put (paste) the line(s) in the buffer into the text after the current line
:r filename<Return> read file named filename and insert after current line
:w! prevfile<Return> write entire content over a pre-existing file named prevfile
Least-recently-used keys are removed first; however, you can choose the size for the "cold", "hot" and
"warm" caches
6. How can MyISAM tables be optimized for bulk inserts?
Increase the bulk_insert_buffer_size variable, use DISABLE KEYS/ENABLE KEYS around the insert, and run
OPTIMIZE TABLE before the insert.
7. How can you optimize MyISAM table reparation and ALTER TABLE operations?
Increase myisam_sort_buffer_size and myisam_max_sort_file_size
2. Increase the number of records in the city_huge table by inserting all the records from the city_huge table back into it.
3. Create a copy of the city_huge table using other storage engines: MEMORY, ARCHIVE, MyISAM (if not the default), and InnoDB (if not the default). For MEMORY, you will likely need to increase the max_heap_table_size variable to fit the data in memory.
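For example, a MEMORY copy might be created as follows (the new table name and buffer size are illustrative):
SET SESSION max_heap_table_size = 268435456;
CREATE TABLE city_huge_memory ENGINE=MEMORY SELECT * FROM city_huge;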
4. Search each table created for all records where the city name is “Paris” and compare the average response
times. If the response times differ, why?
5. Search each table created for all records where the city identification number (ID) is equal to 123456 and
compare the average response times. If the response times differ, why?
6. Using the SHOW TABLE STATUS statement, review the Data_length and the other information associated with each table created.
SOURCE /labs/scripts/pt_stored_procedures.sql
3. Recreate the city_huge table by issuing the following command:
CALL create_city_huge_no_index(200);
4. In an O/S terminal window, review the SQL statements that were slow. The following query was listed as
slow:
5. Improve the query in step 4 by including a combined index on the CountryCode and Population
columns in the city_huge table.
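One possible form of such an index (the index name is illustrative):
ALTER TABLE city_huge ADD INDEX idx_country_pop (CountryCode, Population);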
6. In the mysql client, execute the SQL statement from step 4 to see if the query execution time was
improved.
7. In the O/S terminal window, review another SQL statement that was slow. The following query was listed
as slow:
9. In the mysql client, execute the SQL statement improved in step 8 to see if the query execution time was
improved.
FLUSH STATUS;
2. Verify the query cache size is set to 4M. If there is a different value, set the query cache size to 4M.
SELECT @@query_cache_size;
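If a different value is reported, the cache can be resized to 4M as follows:
SET GLOBAL query_cache_size = 4194304;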
4. Execute the same O/S command in step 3 again and then examine query cache statistics.
FLUSH STATUS;
7. Execute the same O/S command in step 3 two times and then examine the query cache statistics.
FLUSH STATUS;
10. Execute the same O/S command in step 3 two times and then examine the query cache statistics.
FLUSH STATUS;
13. Execute the same O/S command in step 3 two times and then examine the query cache statistics.
14. Reset the query cache statistics and then turn off query caching.
FLUSH STATUS;
SET GLOBAL query_cache_size = 0;
15. Of the different query cache sizes (4M, 16M, 32M or 64M), which size seems the best? Why do you think
this is?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
16. Execute the following query in the mysql client and record the time it took to execute:
17. Execute the query from step 16 to obtain the next 5 records (6-10) and record the time it took to execute.
24. Select the next 5 records (6-10) from the table created in step 22 and record how many seconds it took the
query to execute.
CALL create_many_tables(500);
3. Set the two table cache server connection system variables to 256 each.
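Assuming MySQL 5.1 or later, the two variables referred to here are table_open_cache and table_definition_cache:
SET GLOBAL table_open_cache = 256;
SET GLOBAL table_definition_cache = 256;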
4. In the mysql client, source the qry_many_tables.sql file to execute a query against each of the
tables created.
SOURCE /labs/scripts/qry_many_tables.sql
5. Issue the following command in an O/S terminal window (command line) to run the mysqlslap
benchmarking tool against the tables created:
6. In the mysql client, monitor the change of the table cache statistics.
7. Set the two table cache server connection system variables to 1024 each.
8. In the mysql client, source the qry_many_tables.sql file to execute a query against each of the
tables created again.
SOURCE /labs/scripts/qry_many_tables.sql
9. Issue the following command in an O/S terminal window (command line) to run the mysqlslap
benchmarking tool against the tables created again:
10. Compare execution time by monitoring the change of the table cache statistics in the mysql client. Is there
a noticeable difference?
CALL drop_many_tables();
2. Execute the vmstat command in the O/S terminal window to display the current memory usage.
shell> vmstat
3. Issue the following mysql and mysqlslap benchmark commands in the O/S terminal window to apply a noticeable load against the server and provide information for monitoring any future changes to status variables:
4. Display the current values of the variables associated with the tmp_table_size system variable. Was
there an increase in the values? Display the current memory usage. Was there an increase, decrease or no
change in the memory values?
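For example, the related values can be displayed with:
SHOW GLOBAL VARIABLES LIKE 'tmp_table_size';
SHOW GLOBAL STATUS LIKE 'Created_tmp%';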
7. Display the current values of the variables associated with the tmp_table_size system variable. Was
there an increase in the values? Display the current memory usage. Was there an increase, decrease or no
change in the memory values?
10. Display the current values of the variables associated with the tmp_table_size system variable. Was
there an increase in the values? Display the current memory usage. Was there an increase, decrease or no
change in the memory values?
CALL create_city_huge(200);
3. Execute a large query against the city_huge table that performs a full index scan using the following SQL
statement:
Record the time it took to execute the query: _____________ (Query A-1)
4. Execute a FLUSH TABLES command to flush the query cache. This will ensure that the next command
gives an accurate result.
5. Execute a large query against the city_huge table that performs a full table scan (eliminating the file
system cache effect) using the following SQL statement:
Record the time it took to execute the query: _____________ (Query A-2)
6. Reset the key cache to 75% of system memory. (Below is for a system with 2GB of memory)
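One way to do this (75% of 2GB is approximately 1.5GB):
SET GLOBAL key_buffer_size = 1610612736;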
Record the time it took to execute the query: _____________ (Query B-1)
9. Execute a FLUSH TABLES command again to flush the query cache.
10. Execute the original full table scan query (step 5) against the city_huge table again.
Record the time it took to execute the query: _____________ (Query B-2)
11. Execute a FLUSH TABLES command again to flush the query cache.
12. Execute the original full index scan query (step 3) against the city_huge table again.
Record the time it took to execute the query: _____________ (Query C-1)
13. Compare the three different queries (A, B and C) performance against each other.
15. Execute the mid-point.sql file using the following O/S command to populate the cache with one
iteration:
16. In the mysql client, execute the following SQL statement to reset the key cache statistics:
FLUSH STATUS;
18. In the mysql client, determine the hit rate by reviewing the key cache statistics and using the following
formula:
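The hit rate is conventionally derived from two status counters:
SHOW GLOBAL STATUS LIKE 'Key_read%';
-- hit rate = 1 - (Key_reads / Key_read_requests)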
19. Execute a FLUSH TABLES command again to flush the query cache.
20. Execute the original full index scan query (step 3) against the city_huge table again.
21. In the mysql client, determine the hit rate of key cache again.
FLUSH STATUS;
23. Execute the mid-point.sql file again using the following O/S command to populate the cache with ten iterations:
24. In the mysql client, determine the hit rate of key cache again.
FLUSH STATUS;
31. Execute a FLUSH TABLES command again to flush the query cache.
32. Execute the original full index scan query (step 3) against the city_huge table again.
33. In the mysql client, determine the hit rate of key cache again.
34. Compare all the hit rates to determine the best use of the key cache.
2. In an O/S terminal, set up monitoring of the mysqladmin processlist by entering the following statement:
3. In another O/S terminal, set up monitoring of disk usage by entering the following statement:
watch -d ls -l /usr/local/mysql/data/world/city_huge*
vmstat 1
5. In the mysql client, set the key cache to a small number (16K).
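For example:
SET GLOBAL key_buffer_size = 16384;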
6. Setting the myisam_sort_buffer_size variable to anything less than 4 produces a warning, and the value defaults to 4. Use the following command to set myisam_sort_buffer_size:
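For example, to use the minimum value described above:
SET GLOBAL myisam_sort_buffer_size = 4;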
7. Truncate the city_huge table, and reload the records exported in step 1.
TRUNCATE city_huge;
LOAD DATA INFILE '/tmp/city_huge.txt' INTO TABLE city_huge;
Watch the monitors; how long did the import take: _____________ (Import A)
8. In the mysql client, set the key cache to a large value (75% of system memory; below is a 2GB example).
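For example (75% of 2GB is approximately 1.5GB):
SET GLOBAL key_buffer_size = 1610612736;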
9. Truncate the city_huge table, and reload the records exported in step 1.
TRUNCATE city_huge;
LOAD DATA INFILE '/tmp/city_huge.txt' INTO TABLE city_huge;
11. Set the MyISAM sort buffer to 80% of system memory (Below is a 2G example).
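For example (80% of 2GB is approximately 1.7GB):
SET GLOBAL myisam_sort_buffer_size = 1717986918;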
12. Truncate the city_huge table, and reload the records exported in step 1.
TRUNCATE city_huge;
LOAD DATA INFILE '/tmp/city_huge.txt' INTO TABLE city_huge;
Watch the monitors; how long did the import take: _____________ (Import C)
13. Compare the different times for each of the imports.
14. Shut down the other terminal windows opened during this lab exercise.
CALL create_city_huge(25)
3. Export the data from the city_huge table ordered by the primary key to a flat file using the following
SQL statement:
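A statement of the following form can be used (this assumes ID is the primary key column; the file name matches the file loaded in step 5):
SELECT * FROM city_huge ORDER BY ID INTO OUTFILE '/tmp/innodb_pk_order.txt';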
4. Truncate the city_huge table and import the random text file (created in step 2) back into the table.
TRUNCATE city_huge;
LOAD DATA INFILE '/tmp/innodb_rand.txt' INTO TABLE city_huge;
Record how long it took approximately to load the records back in: _________________
5. Truncate the city_huge table and import the primary key text file (created in step 3) back into the table.
TRUNCATE city_huge;
LOAD DATA INFILE '/tmp/innodb_pk_order.txt' INTO TABLE city_huge;
Record how long it took approximately to load the records back in: _________________
6. Shut down the mysql client by executing the following command:
QUIT;
innodb_buffer_pool_size = 1652704000
This will increase the size of the innodb_buffer_pool_size system variable to approximately 80% of the system memory (using 2GB as an example). Close and save the file.
SET UNIQUE_CHECKS=0;
11. Using the world database, truncate the city_huge table and import the random text file (created in step
2) back into the table.
TRUNCATE city_huge;
LOAD DATA INFILE '/tmp/innodb_rand.txt' INTO TABLE city_huge;
Record how long it took approximately to load the records back in this time: _________________
12. Truncate the city_huge table and import the primary key text file (created in step 3) back into the table.
TRUNCATE city_huge;
LOAD DATA INFILE '/tmp/innodb_pk_order.txt' INTO TABLE city_huge;
Record how long it took approximately to load the records back in this time: _______________
APPENDIX F SUPERSMACK
Super Smack
Running Super Smack
Once Super Smack is installed, sample benchmarks are included in the /usr/share/smack directory:
select-key.smack
update-select.smack
These two files are example configuration files that load a table with a large number of records and execute numerous queries against it. The following demonstrates how files such as these can be used with Super Smack:
This script produces the equivalent of 40 concurrent users each running 8,000 iterations of the test queries
configured in the update-select.smack configuration file. This results in an output that would look
similar to the following example:
The first time Super Smack is run, the following lines are displayed, showing the application preparing the MySQL server with the required SQL components (creating the schema and table) and loading the data:
A data file with a single column is optimal for use with Super Smack; however, if there is a need to test with multiple columns of data, Super Smack does allow the end user to identify the delimiter to be used when reading the file, and it is common to use commas as the delimiter in many programming languages and applications.
Configuration file
The heart of Super Smack is the configuration file that can be used to define a series of tests (a query barrel) to
run against the server along with the data and tables needed to support those tests. When running the tests,
Super Smack gives the user the ability to control how many concurrent clients will be simulated and how many
iterations of each test the clients will execute using command-line arguments. There are four components to the
configuration file that need to be created: clients, tables, dictionaries, and queries.
• Clients - This section requires the preparation of the client that will be used. It includes the user that
will be used to connect to the server, the host of the server, the database that will be used, the password
associated with the user and the location of the mysql socket file. In addition, if the MySQL server is running
on a non-standard port, that also needs to be identified (port "{port_num}"). The following is an
example of the client section of the Super Smack configuration file:
client "world"
{
user "root";
host "localhost";
db "world";
pass "";
socket "/var/share/mysql/mysql.sock";
}
• Tables - This section is responsible for preparing the table structure and the data it will contain. It identifies the client to use, the structural components of the table, the minimum number of rows needed for the test, and the source of the data that will populate the table. If there are not enough rows of data in the data source to meet the minimum number of rows, a program called gen-data (which is included with Super Smack) can be used to fill in the missing rows. The following example shows what a typical table configuration section may look like:
table "city_smack"
{
client "world";
min_rows "110000";
data_file "world.dat";
gen_data_file "gen-data -n 110000 -f %n,%10-35s%,%3-3s%,%10-20s%,%d";
}
Gen-Data
The options of the gen-data program are not documented in the Super Smack distribution; however, a few conventions can be read from this command line. First, the number of rows needed is preceded by the -n option (-n 110000). Second, the -f option introduces a printf-style format string that describes each generated row (%n is an incrementing integer, %#-#s% identifies a string with a minimum and a maximum size, and %d is a random integer). This command can also be run from the operating system command line to test it prior to using it in the configuration file.
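For example, a quick test at the shell prompt using the format string from the table section above:
shell> gen-data -n 5 -f %n,%10-35s%,%3-3s%,%10-20s%,%d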
• Dictionary - This section of the configuration file is used to provide Super Smack with values to use
when creating the queries for testing purposes.
o type - This component identifies the method used to select the data (rand - for randomly selected values, seq - for values to be selected sequentially, or unique - for values to be generated uniquely using gen-data).
o source_type - This component identifies where the values will come from (file - states that a file will be used to obtain the values, list - states that a comma-separated list will be supplied in the configuration itself, or template - used with the unique method and the same printf-style formatting).
o source - The next component is the actual source data; this can consist of either the file to be used or the list itself, using a format such as ("Kai","Sarah","Max","Patricia","Tobias").
dictionary "world"
{
type "rand";
source_type "file";
source "world.dat";
delim ",";
}
• Queries - This section defines the queries that will be used and prepares them for use in the execution of
the Super Smack program. There are three components to this section: the query itself, the name of the query
and a flag to identify if the query will return a result set or not. In addition, if the query itself is going to use
values from the dictionary, the flag telling Super Smack that the query is parsed must be turned on (parsed =
"y"). The following is an example of two queries that could be used in the operation of the Super Smack
program:
query "select_city"
{
query "SELECT * FROM city_smack WHERE CountryCode = '$word'";
type "select_ctry_code";
has_result_set = "y";
parsed = "y";
}
query "update_city"
{
query "UPDATE city_smack SET CountryCode = '$word' WHERE District= '$word'";
type "update_ctry_code";
has_result_set = "n";
parsed = "y";
}
• Super Smack client - This section defines the actual client that performs the benchmark based on the values set previously. The difference between this client and the previous one (besides the differing values) is the query_barrel syntax, which defines the order in which the queries run and the number of times each is executed during an iteration. In the query_barrel syntax, each query name is preceded by the number of times it should be run, so in the example below each query runs once per iteration:
client "smacked"
{
user "test";
host "localhost";
db "test";
pass "";
query_barrel "1 select_city 1 update_city";
}
• Main section - This is the final section of the Super Smack configuration file and contains the actual flow used to execute the application. The syntax for this section rarely changes; it reads the command-line arguments as shell-style numbered values ($1, $2, and so on), making it easy to change the order of the arguments. The section executes in the following order:
1. The smacked client is initialized.
2. The number of rounds the application will execute is read from its command-line argument and set in the script.
3. The number of threads needed is read from its command-line argument and set in the script.
4. Each thread is connected to the server.
5. The barrel of queries is unloaded (executed), with the results stored for statistical purposes.
6. The collect_threads call ensures that all the clients have returned statistics before moving on to the next iteration.
7. The clients are disconnected, and the statistics are compiled and reported back to the end user.
Main
{
smacked.init();
smacked.set_num_rounds($2);
smacked.create_threads($1);
smacked.connect();
smacked.unload_query_barrel();
smacked.collect_threads();
smacked.disconnect();
}
(Figure: MySQL Enterprise Dashboard screenshot showing the ED Control/Help area, the Heat Chart for all servers, and Critical Events per server.)
Enterprise Dashboard
Custom Advisor
Upgrade Advisor
Administration Advisor
Security Advisor
Replication Monitor
Replication Advisor
Query Analyzer
Schema Advisor
Performance Advisor
Improved Availability
• Semi-synchronous replication
• Replication heartbeat
• Replication slave fsync options
• Automatic relay log recovery
Improved Usability
• SIGNAL/RESIGNAL
• More partitioning options
• Replication server filtering
• Replication slave side type conversions
• Individual log flushing
• MySQL 5.5 extends the usability of the stored objects and table/index
partitioning features.
• SIGNAL and RESIGNAL statements allow you to implement exception-
handling logic in your stored procedures, stored functions, triggers, events,
and database applications.
• With the RANGE COLUMNS and LIST COLUMNS clauses of the CREATE
TABLE statement, partitioning is more flexible and can optimize queries
better.
• Performance schema provides tables that let you see performance data in
real time, or historically.
Improved Scalability
(Figures: transactions-per-second benchmark charts comparing MySQL 5.5.6 (new InnoDB) with MySQL 5.1.50 (InnoDB built-in), for both InnoDB and MyISAM, scaling from 6 to 36 CPU cores. Linux system: AMD Opteron 7160 (Magny-Cours) at 2100 MHz, 4 sockets with a total of 48 cores, 64 GB memory, 2 x Intel X25E SSD drives, running Oracle Enterprise Linux with the Enterprise Kernel. Windows system: Intel x86_64, 4 CPUs x 2 cores per CPU at 3.166 GHz, 8 GB RAM, running Windows Server 2008.)