What Are Indexes?
Every time your web application runs a database query containing a WHERE clause, the database server has to find the rows
that match your request. Without an index, it does this by looking through every row in the table, so as the table grows,
an increasing number of rows need to be inspected each time.
Indexes solve this problem in exactly the same way as the index in a reference book: the database takes the data from a
column in your table and stores it in sorted order in a separate location called an index. The same process applies to all
data types: text is stored alphabetically, numeric data in numeric order, and dates in date order.
By doing this, the database does not need to look through every row in your table. Instead, it can find the value you are
searching for in the sorted index, then skip immediately to the row(s) where the data is located.
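The idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the index is modelled as a sorted list searched with binary search, whereas real database servers typically use tree structures such as B-trees, and the helper names (full_scan, build_index, index_lookup) are hypothetical.

```python
import bisect

# Sample rows: (id, first_name, last_name, class)
rows = [
    (1, "James", "Bond", "6A"),
    (2, "Chris", "Smith", "6A"),
    (3, "Jane", "Eyre", "6B"),
    (4, "Dave", "Smith", "6B"),
]

def full_scan(rows, last_name):
    """Without an index: inspect every row in the table."""
    return [row[0] for row in rows if row[2] == last_name]

def build_index(rows, column):
    """Sorted (value, row position) pairs, stored separately from the table."""
    return sorted((row[column], pos) for pos, row in enumerate(rows))

def index_lookup(rows, index, value):
    """Binary-search the sorted index, then jump straight to the matching rows."""
    i = bisect.bisect_left(index, (value,))
    ids = []
    while i < len(index) and index[i][0] == value:
        ids.append(rows[index[i][1]][0])
        i += 1
    return ids

by_last_name = build_index(rows, 2)  # column 2 is last_name
print(full_scan(rows, "Smith"))
print(index_lookup(rows, by_last_name, "Smith"))
```

Both calls return the same ids, but the indexed lookup only touches the entries for the value it is looking for instead of every row.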
In this example, we will be working with a table of students at an extraordinarily large school. This is quite a large table as
we have 100,000 students in our school. The first few rows of the table look like this:
ID  First Name  Last Name  Class
1   James       Bond       6A
2   Chris       Smith      6A
3   Jane        Eyre       6B
4   Dave        Smith      6B
The first query SELECT * FROM students WHERE id = 1 is a special case because it looks up a row using its
PRIMARY KEY. The primary key is indexed automatically, so the database can go straight to the row and no further
optimization is required. It is best to look rows up by primary key whenever possible, and almost all tables should have a
unique column defined as the primary key.
The second query SELECT * FROM students WHERE last_name = 'Smith' searches a column which does not currently
have an index. This is exactly the type of query that will search the whole table unless we do something about it. Let's
go right ahead and create an index on that column:
CREATE INDEX by_last_name ON students (last_name);
This creates an index named by_last_name on the table, which will contain an indexed copy of the last_name column,
allowing us to look these values up much more quickly.
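One way to see the effect of such an index is to ask the database how it plans to run the query before and after the index exists. The sketch below is an assumption-laden illustration: it uses SQLite through Python's built-in sqlite3 module so that it runs without a server, and SQLite's EXPLAIN QUERY PLAN output, which differs in format from MySQL's EXPLAIN. The query_plan helper is hypothetical.

```python
import sqlite3

def query_plan(conn, sql):
    """Return SQLite's description of how it would execute `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, class TEXT)"
)
conn.executemany(
    "INSERT INTO students (first_name, last_name, class) VALUES (?, ?, ?)",
    [
        ("James", "Bond", "6A"),
        ("Chris", "Smith", "6A"),
        ("Jane", "Eyre", "6B"),
        ("Dave", "Smith", "6B"),
    ],
)

query = "SELECT * FROM students WHERE last_name = 'Smith'"

plan_before = query_plan(conn, query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX by_last_name ON students (last_name)")
plan_after = query_plan(conn, query)   # the planner can now search the index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan depends on the SQLite version, but the second plan should name by_last_name while the first describes a scan of the table.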
Exactly the same principle can be applied to the class column, allowing us to efficiently look up the students in a
particular class. We should create a simple index on it as follows:
CREATE INDEX by_class ON students (class);
Now suppose we want to find students by both class and last_name. MySQL will normally only use one index per table in a
query, so we will not benefit from using both of our existing indexes to perform this query. However, we also do not need
to make any more indexes at this point!
The database server will look at the table and determine that we have an index on class and that each class only
contains about 20 students. It will use the by_class index we have already created to locate all 20 students in the class,
then check the last_name of each row individually. Searching 20 rows is no trouble for the server compared with
searching all 100,000, and we've avoided spending memory on any more indexes.
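The behaviour described above, picking one index and then filtering the remaining rows, can be observed in a small experiment. This sketch assumes SQLite (via Python's sqlite3 module) as a stand-in; MySQL's optimizer makes its own choices and may pick a different index, and the query text and the query_plan helper are hypothetical.

```python
import sqlite3

def query_plan(conn, sql):
    """Return SQLite's description of how it would execute `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, class TEXT)"
)
conn.executemany(
    "INSERT INTO students (first_name, last_name, class) VALUES (?, ?, ?)",
    [
        ("James", "Bond", "6A"),
        ("Chris", "Smith", "6A"),
        ("Jane", "Eyre", "6B"),
        ("Dave", "Smith", "6B"),
    ],
)
# The two single-column indexes created so far.
conn.execute("CREATE INDEX by_last_name ON students (last_name)")
conn.execute("CREATE INDEX by_class ON students (class)")

# Hypothetical query filtering on both indexed columns.
query = "SELECT * FROM students WHERE class = '6A' AND last_name = 'Smith'"

plan = query_plan(conn, query)          # names a single index, not both
matches = conn.execute(query).fetchall()

print(plan)
print(matches)
```

Which of the two indexes gets chosen depends on the planner and on the statistics it has; the point is only that one index narrows the search and the other condition is checked row by row.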
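For an even faster lookup, we can create a single multi-column index on class and last_name (in that order) and drop the
by_class index.
Why did we drop the by_class index? Because our new index will allow us to search by class and by last name together, but
it will also allow us to search by class alone, rendering the by_class index redundant.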
So why did we keep by_last_name? Unfortunately, a multi-column index can only be used starting from its first column. You
don't have to use the whole index, but the query has to use its columns in order, starting with the first. Our new index
starts with class, so a query on last_name alone cannot use it.
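The "starting at the beginning" rule can be checked with a multi-column index. In this sketch the index name by_class_and_last_name is hypothetical, SQLite (via Python's sqlite3 module) is assumed in place of MySQL, and by_last_name is deliberately left out so that the limitation is visible.

```python
import sqlite3

def query_plan(conn, sql):
    """Return SQLite's description of how it would execute `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, class TEXT)"
)
# Multi-column index: class first, then last_name (hypothetical name).
conn.execute(
    "CREATE INDEX by_class_and_last_name ON students (class, last_name)"
)

queries = {
    "class and last_name":
        "SELECT * FROM students WHERE class = '6A' AND last_name = 'Smith'",
    "class only":
        "SELECT * FROM students WHERE class = '6A'",
    "last_name only":
        "SELECT * FROM students WHERE last_name = 'Smith'",
}
plans = {label: query_plan(conn, sql) for label, sql in queries.items()}

for label, plan in plans.items():
    print(f"{label}: {plan}")
```

The first two queries can use the index because they start with its first column; the query on last_name alone falls back to scanning the table, which is why a separate by_last_name index is still worth keeping.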
Joins
It is very likely that our application will have some joins. In this example, our students will have some grades in a grades
table as follows:
ID  student_id  Timestamp            Grade
1   1           2014-01-20 15:00:00  A+
2   1           2014-02-20 15:00:00  A-
When viewing a student's record, our web application will fetch all grades related to that student with the following query:
SELECT * FROM grades WHERE student_id = 1
We also have a feature to display all grades from a particular class:
SELECT * FROM students JOIN grades ON grades.student_id = students.id WHERE students.class = '6A'
Both of these queries use the "foreign key" called student_id. It is almost always a good idea to index foreign keys, and
this is done in the same way as for text columns:
CREATE INDEX by_student_id ON grades (student_id);
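As a rough check that the foreign key index is picked up, the sketch below builds both tables in SQLite through Python's sqlite3 module (an assumption made so the example runs without a MySQL server) and inspects the plan for the per-student lookup. The query_plan helper is hypothetical, and the join is run only for its result, since the plan chosen for a join varies between databases.

```python
import sqlite3

def query_plan(conn, sql):
    """Return SQLite's description of how it would execute `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, class TEXT)"
)
conn.execute(
    "CREATE TABLE grades ("
    "id INTEGER PRIMARY KEY, student_id INTEGER, timestamp TEXT, grade TEXT)"
)
conn.executemany(
    "INSERT INTO students (first_name, last_name, class) VALUES (?, ?, ?)",
    [("James", "Bond", "6A"), ("Chris", "Smith", "6A"), ("Jane", "Eyre", "6B")],
)
conn.executemany(
    "INSERT INTO grades (student_id, timestamp, grade) VALUES (?, ?, ?)",
    [(1, "2014-01-20 15:00:00", "A+"), (1, "2014-02-20 15:00:00", "A-")],
)

# Index the foreign key, as in the statement above.
conn.execute("CREATE INDEX by_student_id ON grades (student_id)")

# Plan for fetching one student's grades.
plan = query_plan(conn, "SELECT * FROM grades WHERE student_id = 1")

# Grades for every student in class 6A.
class_grades = conn.execute(
    "SELECT students.last_name, grades.grade "
    "FROM students JOIN grades ON grades.student_id = students.id "
    "WHERE students.class = '6A' ORDER BY grades.id"
).fetchall()

print(plan)
print(class_grades)
```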
It is always wise to start by indexing columns that join with other tables (their names usually end in _id). After that,
look for columns you know you will commonly search on. Focus on your largest tables.
Indexes are used to retrieve data from the database more quickly. Users cannot see the indexes; they are only used
to speed up searches and queries.
Note: Updating a table with indexes takes more time than updating a table without (because the indexes also need
an update). So, only create indexes on columns that will be frequently searched against.
Note: The syntax for creating indexes varies among different databases, so check the syntax for creating indexes in
your database.