Django Mysql
Release 4.11.0
Adam Johnson
1 Exposition
2 Installation
3 Checks
4 QuerySet Extensions
5 Model Fields
6 Field Lookups
7 Aggregates
8 Database Functions
9 Migration Operations
10 Form Fields
11 Validators
12 Cache
13 Locks
14 Status
15 Management Commands
16 Test Utilities
17 Exceptions
18 Contributing
19 Changelog
Index
Django-MySQL Documentation, Release 4.11.0
Django-MySQL extends Django’s built-in MySQL and MariaDB support with their specific features not available on other
databases.
If you’re new, check out the Exposition to see all the features in action, or get started with Installation. Otherwise, take
your pick:
CHAPTER
ONE
EXPOSITION
1.1 Checks
Extra checks added to Django’s check framework to ensure your Django and MySQL configurations are optimal.
$ ./manage.py check
?: (django_mysql.W001) MySQL strict mode is not set for database connection 'default'
...
Read more
Django-MySQL comes with a number of extensions to QuerySet that can be installed in a number of ways - e.g.
adding the QuerySetMixin to your existing QuerySet subclass.
SELECT COUNT(*) ... can become a slow query, since it requires a scan of all rows; the approx_count function
solves this by returning the estimated count that MySQL keeps in metadata. You can call it directly:
Author.objects.approx_count()
Or if you have pre-existing code that calls count() on a QuerySet you pass it, such as the Django Admin, you can set
the QuerySet to try approx_count first automatically:
qs = Author.objects.all().count_tries_approx()
# Now calling qs.count() will try approx_count() first
Read more
Use MySQL’s query hints to optimize the SQL your QuerySets generate:
Author.objects.straight_join().filter(book_set__title__startswith="The ")
# Does SELECT STRAIGHT_JOIN ...
Read more
Sometimes you need to modify every single instance of a model in a big table, without creating any long running
queries that consume large amounts of resources. The ‘smart’ iterators traverse the table by slicing it into primary key
ranges which span the table, performing each slice separately, and dynamically adjusting the slice size to keep them
fast:
Read more
For interactive debugging of queries, this captures the query that the QuerySet represents, and passes it through
EXPLAIN and pt-visual-explain to get a visual representation of the query plan:
>>> Author.objects.all().pt_visual_explain()
Table scan
rows           1020
+- Table
   table          myapp_author
Read more
Use MariaDB’s Dynamic Columns for storing arbitrary, nested dictionaries of values:
class ShopItem(Model):
    name = models.CharField(max_length=200)
    attrs = DynamicField()
Read more
1.3.2 EnumField
A field class for using MySQL’s ENUM type, which allows strings that are restricted to a set of choices to be stored in a
space efficient manner:
class BookCover(Model):
    color = EnumField(choices=["red", "green", "blue"])
Read more
1.3.3 FixedCharField
A field class for using MySQL’s CHAR type, which allows strings to be stored at a fixed width:
class Address(Model):
    zip_code = FixedCharField(length=10)
Read more
Django’s TextField and BinaryField fields are fixed at the MySQL level to use the maximum size class for the
BLOB and TEXT data types - these fields allow you to use the other sizes, and migrate between them:
class BookBlurb(Model):
    blurb = SizedTextField(size_class=3)
    # Has a maximum length of 16MiB, compared to plain TextField which has
    # a limit of 4GB (!)
Read more
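As a rough check on those limits, the sketch below computes the maximum byte length for each size class. It assumes size classes 1 to 4 correspond to MySQL's TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT types, whose length prefixes are 1 to 4 bytes wide. The helper name max_bytes is illustrative only and is not part of Django-MySQL.

```python
# Assumed mapping of size classes to MySQL TEXT types (illustrative).
SIZE_CLASS_NAMES = {1: "TINYTEXT", 2: "TEXT", 3: "MEDIUMTEXT", 4: "LONGTEXT"}


def max_bytes(size_class: int) -> int:
    """Largest value that fits, given a length prefix of `size_class` bytes."""
    if size_class not in SIZE_CLASS_NAMES:
        raise ValueError("size_class must be 1, 2, 3, or 4")
    return 2 ** (8 * size_class) - 1


for size_class, name in SIZE_CLASS_NAMES.items():
    print(f"size_class={size_class} ({name}): {max_bytes(size_class):,} bytes")
```

Size class 3 comes out just under 16MiB and size class 4 just under 4GiB, which matches the comment in the example above.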
Some database systems, such as the Java Hibernate ORM, don’t use MySQL’s bool data type for storing boolean flags
and instead use BIT(1). This field class allows you to interact with those fields:
class HibernateModel(Model):
    some_bool = Bit1BooleanField()
    some_nullable_bool = NullBit1BooleanField()
Read more
>>> Author.objects.filter(name__sounds_like="Robert")
[<Author: Robert>, <Author: Rupert>]
Read more
1.5 Aggregates
MySQL’s powerful GROUP_CONCAT statement is added as an aggregate, allowing you to bring back the concatenation
of values from a group in one query:
Read more
>>> Author.objects.annotate(
... full_name=ConcatWS("first_name", "last_name", separator=" ")
... ).first().full_name
"Charles Dickens"
Read more
class Migration(migrations.Migration):
    dependencies = []
Read more
1.8 Cache
Read more
1.9 Locks
Read more
1.10 Status
Read more
The dbparams command helps you include your database parameters from settings in command-line tools:
Read more
Set some MySQL server variables on a test case for every method or just a specific one:
class MyTests(TestCase):
    @override_mysql_variables(SQL_MODE="ANSI")
    def test_it_works_in_ansi_mode(self):
        self.run_it()
Read more
CHAPTER
TWO
INSTALLATION
2.1 Requirements
2.2 Installation
INSTALLED_APPS = [
    ...,
    "django_mysql",
    ...,
]
Django-MySQL comes with some extra checks to ensure your database configuration is optimal. It’s best to run these
now that you’ve installed it, to see if there is anything to fix:
Half the fun features are extensions to QuerySet. You can add these to your project in a number of ways, depending
on what is easiest for your code - all imported from django_mysql.models.
class Model
The simplest way to add the QuerySet extensions - this is a subclass of Django’s Model that sets objects
to use the Django-MySQL extended QuerySet (below) via QuerySet.as_manager(). Simply change your
model base to get the goodness:
class MySuperModel(Model):
    pass  # TODO: come up with startup idea.
class QuerySet
The second way to add the extensions - use this to replace your model’s default manager:
class MySuperDuperModel(MyBaseModel):
    objects = QuerySet.as_manager()
    # TODO: what fields should this model have??
If you are using a custom manager, you can combine this like so:
class MySuperDuperManager(models.Manager):
    pass


class MySuperDuperModel(models.Model):
    objects = MySuperDuperManager.from_queryset(QuerySet)()
    # TODO: fields
class QuerySetMixin
The third way to add the extensions, and the container class for the extensions. Add this mixin to your custom
QuerySet class to add in all the fun:
class MySplendidModel(Model):
    objects = MySplendidQuerySet.as_manager()
    # TODO: profit
add_QuerySetMixin(queryset)
A final way to add the extensions, useful when you don’t control the model class - for example with built in
Django models. This function creates a subclass of a QuerySet's class that has the QuerySetMixin added in
and applies it to the QuerySet:
qs = User.objects.all()
qs = add_QuerySetMixin(qs)
# Now qs has all the extensions!
CHAPTER
THREE
CHECKS
Django-MySQL adds some extra checks to Django’s system check framework to advise on your database configuration.
If triggered, the checks give a brief message, and a link here for documentation on how to fix it.
Warning: From Django 3.1 onwards, database checks are not automatically run in most situations. You should use
the --database argument to manage.py check to run the checks. For example, with just one database connection
you can run manage.py check --database default.
Note: A reminder: as per the Django docs, you can silence individual checks in your settings. For example, if you
determine django_mysql.W002 doesn’t require your attention, add the following to settings.py:
SILENCED_SYSTEM_CHECKS = [
    "django_mysql.W002",
]
This check has been removed since Django itself includes such a check, mysql.W002, since version 1.10. See its
documentation.
InnoDB Strict Mode is similar to the general Strict Mode, but for InnoDB. It escalates several warnings around InnoDB-
specific statements into errors. Normally this just affects per-table settings for compression. It’s recommended you
activate this, but it’s not very likely to affect you if you don’t.
Docs: MySQL / MariaDB.
As above, the easiest way to set this is to add a SET statement for innodb_strict_mode to init_command in your DATABASES setting:
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "my_database",
        "OPTIONS": {
            # Enable InnoDB strict mode on every new connection
            "init_command": "SET innodb_strict_mode=1",
        },
    }
}
Note: If you use this along with the init_command for W001, combine them as SET
sql_mode='STRICT_TRANS_TABLES', innodb_strict_mode=1.
Also, as above for django_mysql.W001, it’s better that you set it permanently for the server with SET GLOBAL and a
configuration file change.
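For reference, a settings fragment using the combined init_command from the note above might look like the following. The database name is a placeholder, and only the init_command string is taken from the text.

```python
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "my_database",  # placeholder name
        "OPTIONS": {
            # One SET statement covering both W001 (strict mode) and
            # InnoDB strict mode, as described in the note above.
            "init_command": (
                "SET sql_mode='STRICT_TRANS_TABLES', innodb_strict_mode=1"
            ),
        },
    }
}
```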
MySQL’s utf8 character set does not include support for the largest, 4 byte characters in UTF-8; this basically means
it cannot support emoji and custom Unicode characters. The utf8mb4 character set was added to support all these
characters, and there’s really little point in not using it. Django currently suggests using the utf8 character set for
backwards compatibility, but it’s likely to move in time.
It’s strongly recommended you change to the utf8mb4 character set and convert your existing utf8 data as well, unless
you’re absolutely sure you’ll never see any of these ‘supplementary’ Unicode characters (note: it’s very easy for users
to type emoji on phone keyboards these days!).
Docs: MySQL / MariaDB.
Also see this classic blogpost: How to support full Unicode in MySQL databases.
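To see which characters are affected, the short sketch below measures the UTF-8 encoded width of a few sample characters in plain Python. The helper names are illustrative; the only assumption is that MySQL's utf8 character set stores at most 3 bytes per character, as described above.

```python
def utf8_width(char: str) -> int:
    """Number of bytes a single character needs in UTF-8."""
    return len(char.encode("utf-8"))


def fits_in_3_byte_utf8(text: str) -> bool:
    """True if every character could be stored in MySQL's 3-byte utf8."""
    return all(utf8_width(char) <= 3 for char in text)


# Latin letter, accented letter, euro sign, emoji
for char in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(repr(char), utf8_width(char), "byte(s)")
```

Only the emoji needs 4 bytes, so it is the one that requires utf8mb4.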
The easiest way to set this up is to make a couple of changes to your DATABASES settings. First, add OPTIONS with
charset to your MySQL connection, so MySQLdb connects using the utf8mb4 character set. Second, add TEST with
COLLATION and CHARSET as below, so Django creates the test database, and thus all tables, with the right character set:
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "my_database",
        "OPTIONS": {
            # Tell MySQLdb to connect with 'utf8mb4' character set
            "charset": "utf8mb4",
        },
        # Tell Django to build the test database with the 'utf8mb4' character set
        "TEST": {
            "CHARSET": "utf8mb4",
            "COLLATION": "utf8mb4_unicode_ci",
        },
    }
}
Note this does not transform the database, tables, and columns that already exist. Follow the examples in the ‘How to’
blog post link above to fix your database, tables, and character set. It’s planned to add a command to Django-MySQL
to help you do this, see Issue 216.
CHAPTER
FOUR
QUERYSET EXTENSIONS
MySQL-specific Model and QuerySet extensions. To add these to your Model/Manager/QuerySet trifecta, see
Installation. Methods below are all QuerySet methods; where standalone forms are referred to, they can be imported
from django_mysql.models.
min_size=1000
The threshold at which to use the approximate algorithm; if the approximate count comes back as less than
this number, count() will be called and returned instead, since it should be so small as to not bother your
database. Set to 0 to disable this behaviour and always return the approximation.
The default of 1000 is a bit pessimistic - most tables won’t take long when calling COUNT(*) on tens of
thousands of rows, but it could be slow for very wide tables.
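The min_size rule can be sketched in plain Python as below. This is not the library's code: the two callables stand in for the approximate and exact count queries, and the function name is hypothetical.

```python
from typing import Callable


def count_with_min_size(
    approx_count: Callable[[], int],
    exact_count: Callable[[], int],
    min_size: int = 1000,
) -> int:
    """Use the estimate unless it is below min_size, then count exactly."""
    estimate = approx_count()
    if estimate < min_size:
        # Small table: an exact COUNT(*) should be cheap enough.
        return exact_count()
    return estimate


# Stand-in values instead of real queries
print(count_with_min_size(lambda: 500, lambda: 487))
print(count_with_min_size(lambda: 50_000, lambda: 49_321))
```

With min_size=0 no estimate is ever below the threshold, so the approximation is always returned.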
django_mysql.models.count_tries_approx(activate=True, fall_back=True, return_approx_int=True,
min_size=1000)
This is the ‘magic’ method to make pre-existing code, such as Django’s admin, work with approx_count.
Calling count_tries_approx sets the QuerySet up such that then calling count will call approx_count
instead, with the given arguments.
To unset this, call count_tries_approx with activate=False.
To ‘fix’ an Admin class with this, simply do the following (assuming Author inherits from django_mysql’s
Model):
class AuthorAdmin(ModelAdmin):
    def get_queryset(self, request):
        qs = super(AuthorAdmin, self).get_queryset(request)
        return qs.count_tries_approx()
You’ll be able to see this is working on the pagination due to the word ‘Approximately’ appearing:
You can do this at a base class for all your ModelAdmin subclasses to apply the magical speed increase across
your admin interface.
The following methods add extra features to the ORM which allow you to access some MySQL-specific syntax. They
do this by inserting special comments which pass through Django’s ORM layer and get re-written by a function that
wraps the lower-level cursor.execute().
Because not every user wants these features and there is a (small) overhead to every query, you must activate this feature
by adding to your settings:
DJANGO_MYSQL_REWRITE_QUERIES = True
qs = Author.objects.label("AuthorListView").all()
You can add arbitrary labels, and as many of them as you wish - they will appear in the order added. They will
work in SELECT and UPDATE statements, but not in DELETE statements due to limitations in the way Django
performs deletes.
You should not pass user-supplied data in for the comment. As a basic protection against accidental SQL injection,
passing a comment featuring */ will raise a ValueError, since that would prematurely end the comment.
However, because MySQL supports executable comments, the comment is still prone to some forms of injection.
This is also a feature: by not including spaces around your string, you may use this injection to write executable
comments that add hints that are otherwise not supported, or to use MySQL 5.7+ optimizer hints.
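The basic protection described above amounts to a simple substring check. The sketch below is a hypothetical helper, not Django-MySQL's own code, and it assumes the label is wrapped directly in /* and */ with no surrounding spaces.

```python
def render_label(comment: str) -> str:
    """Wrap a label in an SQL comment, refusing text that would close it."""
    if "*/" in comment:
        raise ValueError("'*/' would end the SQL comment early")
    # Assumption: no spaces are added around the label text.
    return f"/*{comment}*/"


print(render_label("AuthorListView"))
```

Note that this check only blocks an early comment terminator; it does nothing about executable comments, which is why user-supplied data should still never be passed in.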
django_mysql.models.straight_join()
Adds the STRAIGHT_JOIN hint, which forces the join order during a SELECT. Note that you can’t force Django’s
join order, but it tends to be in the order that the tables get mentioned in the query.
Example usage:
# Note from Adam: sometimes the optimizer joined books -> author, which
# is slow. Force it to do author -> books.
Author.objects.distinct().straight_join().filter(books__age=12)[:10]
# Note from Adam: for some reason the optimizer didn’t use a temporary
# table for this, so we force it
Author.objects.distinct().sql_big_result()
Example usage:
Warning: The query cache was removed in MySQL 8.0, and is disabled by default from MariaDB 10.1.7.
Example usage:
Docs: MariaDB.
django_mysql.models.sql_no_cache()
Adds the SQL_NO_CACHE hint, which means the result set will not be fetched from or stored in the Query Cache.
This only has an effect when the MySQL system variable query_cache_type is set to 1 or ON.
Warning: The query cache was removed in MySQL 8.0, and is disabled by default from MariaDB 10.1.7.
Example usage:
# Avoid caching all the expired sessions, since we’re about to delete
# them
deletable_session_ids = (
    Session.objects.sql_no_cache().filter(expiry__lt=now()).values_list("id", flat=True)
)
Docs: MariaDB.
django_mysql.models.sql_calc_found_rows()
Adds the SQL_CALC_FOUND_ROWS hint, which means the total count of matching rows will be calculated when
you only take a slice. You can access this count with the found_rows attribute of the QuerySet after filling its
result cache, by e.g. iterating it.
This can be faster than taking the slice and then again calling .count() to get the total count.
Example usage:
Here’s a situation we’ve all been in - we screwed up, and now we need to fix the data. Let’s say we accidentally set the
address of all authors without an address to “Nowhere”, rather than the blank string. How can we fix them??
The simplest way would be to run the following:
Author.objects.filter(address="Nowhere").update(address="")
Unfortunately with a lot of rows (‘a lot’ being dependent on your database server and level of traffic) this will stall other
access to the table, since it will require MySQL to read all the rows and to hold write locks on them in a single query.
To solve this, we could try updating a chunk of authors at a time; such code tends to get ugly/complicated pretty quickly:
min_id = 0
max_id = 1000
biggest_author_id = Author.objects.order_by("-id")[0].id
while True:
    Author.objects.filter(id__gte=min_id, id__lte=...)
    # I'm not even going to type this all out, it's so much code
Here’s the solution to this boilerplate with added safety features - ‘smart’ iteration! There are two classes; one yields
chunks of the given QuerySet, and the other yields the objects inside those chunks. Nearly every data update can be
thought of in one of these two methods.
class django_mysql.models.SmartChunkedIterator(queryset, atomically=True, status_thresholds=None,
pk_range=None, chunk_time=0.5, chunk_size=2,
chunk_min=1, chunk_max=10000,
report_progress=False, total=None)
Implements a smart iteration strategy over the given queryset. There is a method iter_smart_chunks that
takes the same arguments on the QuerySetMixin so you can just:
bad_authors = Author.objects.filter(address="Nowhere")
for author_chunk in bad_authors.iter_smart_chunks():
    author_chunk.update(address="")
Iteration proceeds by yielding primary-key based slices of the queryset, and dynamically adjusting the size of
the chunk to try and take chunk_time seconds. In between chunks, the wait_until_load_low() method of
GlobalStatus is called to ensure the database is not under high load.
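The idea of waiting for low load between chunks can be sketched as a polling loop. This is a minimal illustration rather than the GlobalStatus implementation: read_status is a hypothetical callable returning a dict of status variables, and the sleep and timeout parameters are assumptions made for this example.

```python
import time
from typing import Callable, Mapping, Optional


def wait_until_load_low(
    read_status: Callable[[], Mapping[str, int]],
    thresholds: Optional[Mapping[str, int]] = None,
    sleep: float = 0.1,
    timeout: float = 60.0,
) -> None:
    """Block until every status variable is at or below its threshold."""
    if thresholds is None:
        thresholds = {"Threads_running": 10}
    deadline = time.monotonic() + timeout
    while True:
        status = read_status()
        if all(status[name] <= limit for name, limit in thresholds.items()):
            return
        if time.monotonic() >= deadline:
            raise TimeoutError("Database load did not drop in time")
        time.sleep(sleep)


# Simulated readings: busy, still busy, then quiet
readings = iter(
    [{"Threads_running": 25}, {"Threads_running": 12}, {"Threads_running": 4}]
)
wait_until_load_low(lambda: next(readings), sleep=0)
```

An empty thresholds dict makes the check pass immediately, which corresponds to disabling status checking.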
Warning: Because of the slicing by primary key, there are restrictions on what QuerySets you can use,
and a ValueError will be raised if the queryset doesn’t meet that. Specifically, only QuerySets on models
with integer-based primary keys, which are unsliced, and have no order_by will work.
There are a lot of arguments, and the defaults have been picked to be sensible, but please check them against your
own case!
queryset
The queryset to iterate over; if you’re calling via .iter_smart_chunks then you don’t need to set this
since it’s the queryset you called it on.
atomically=True
If true, wraps each chunk in a transaction via django’s transaction.atomic(). Recommended for any
write processing.
status_thresholds=None
A dict of status variables and their maximum tolerated values to be checked against after each chunk with
wait_until_load_low().
When set to None, the default, GlobalStatus will use its default of {"Threads_running": 10}. Set
to an empty dict to disable status checking - but this is not really recommended, as it can save you from
locking up your site with an overly aggressive migration.
Threads_running is the recommended variable to check against, and this is copied from the
default behaviour of pt-online-schema-change. The default value of 10 threads is deliberately conservative
to avoid locking small database servers. You should tweak it up based upon the live activity of your
server - check the running thread count during normal traffic and add some overhead.
pk_range=None
Controls the primary key range to iterate over with slices. By default, with pk_range=None, the QuerySet
will be searched for its minimum and maximum pk values before starting. On QuerySets that match few
rows, or whose rows aren’t evenly distributed, this can still execute a long blocking table scan to find these
two rows. You can remedy this by giving a value for pk_range:
• If set to 'all', the range will be the minimum and maximum PK values of the entire table, excluding
any filters you have set up - that is, for Model.objects.all() for the given QuerySet’s model.
• If set to a 2-tuple, it will be unpacked and used as the minimum and maximum values respectively.
Note: The iterator determines the minimum and maximum at the start of iteration and does not update them
whilst iterating, which is normally a safe assumption, since if you’re “fixing things” you probably aren’t
creating any more bad data. If you do need to process every row then set pk_range to have a maximum
far greater than what you expect would be reached by inserts that occur during iteration.
chunk_time=0.5
The time in seconds to aim for each chunk to take. The chunk size is dynamically adjusted to try and match
this time, via a weighted average of the past and current speed of processing. The default and algorithm are
taken from the analogous pt-online-schema-change flag --chunk-time.
chunk_size=2
The initial size of the chunk that will be used. As this will be dynamically scaled and can grow fairly
quickly, the initial size of 2 should be appropriate for most use cases.
chunk_min=1
The minimum number of objects in a chunk. You do not normally need to tweak this since the dynamic
scaling works very well, however it might be useful if your data has a lot of “holes” or if there are other
constraints on your application.
chunk_max=10000
The maximum number of objects in a chunk, a kind of sanity bound. Acts to prevent harm in the case of
iterating over a model with a large ‘hole’ in its primary key values, e.g. if only ids 1-10k and 100k-110k
exist, then the chunk ‘slices’ could grow very large in between 10k and 100k since you’d be “processing”
the non-existent objects 10k-100k very quickly.
report_progress=False
If set to true, prints a running counter and summary to sys.stdout. Useful for interactive use. The
message uses \r to erase itself when re-printing, to avoid spamming your screen. At the end, Finished! is
printed on a new line.
total=None
By default the total number of objects to process will be calculated with approx_count(), with
fall_back set to True. This count() query could potentially be big and slow.
total allows you to pass in the total number of objects for processing, if you can calculate it in a cheaper
way, for example if you have a read-replica to use.
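To make the interplay of chunk_time, chunk_size, chunk_min, and chunk_max more concrete, here is a simplified simulation of primary-key range slicing with dynamic resizing. It is a sketch under stated assumptions, not the library's algorithm: the 0.75/0.25 weighting, the injectable clock parameter, and the function name are all invented for illustration.

```python
import time
from typing import Callable, Iterator, Tuple


def smart_pk_ranges(
    min_pk: int,
    max_pk: int,
    chunk_time: float = 0.5,
    chunk_size: int = 2,
    chunk_min: int = 1,
    chunk_max: int = 10000,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) primary key ranges; end is an open bound."""
    rate = None  # weighted average of primary keys covered per second
    start = min_pk
    while start <= max_pk:
        end = min(start + chunk_size, max_pk + 1)
        began = clock()
        yield start, end  # the caller processes this slice now
        elapsed = max(clock() - began, 1e-6)
        current = (end - start) / elapsed
        # Assumed weighting: favour history over the latest measurement.
        rate = current if rate is None else 0.75 * rate + 0.25 * current
        # Aim for chunk_time seconds, clamped to the configured bounds.
        chunk_size = max(chunk_min, min(chunk_max, int(rate * chunk_time)))
        start = end


for start, end in smart_pk_ranges(1, 50):
    pass  # e.g. update rows with start <= pk < end
```

Because the rate is measured over primary key values rather than rows that actually exist, a large hole in the key space is "processed" very quickly, and chunk_max is what stops the next slice from growing without limit.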
class django_mysql.models.SmartIterator
A convenience subclass of SmartChunkedIterator that simply unpacks the chunks for you. Can be accessed
via the iter_smart method of QuerySetMixin.
For example, rather than doing this:
bad_authors = Author.objects.filter(address="Nowhere")
for authors_chunk in bad_authors.iter_smart_chunks():
    for author in authors_chunk:
        author.send_apology_email()
You can do this:
bad_authors = Author.objects.filter(address="Nowhere")
for author in bad_authors.iter_smart():
    author.send_apology_email()
In the first format we were forced to perform a dumb query to determine the primary key limits set by
SmartChunkedIterator, due to the QuerySet not otherwise exposing this information.
Note: There is a subtle difference between the two versions. In the first the end boundary, max_pk, is a
closed bound, whereas in the second, the end_pk from iter_smart_pk_ranges is an open bound. Thus the
<= changes to a <.
How does MySQL really execute a query? The EXPLAIN statement (docs: MySQL / MariaDB) gives a description of
the execution plan, and the pt-visual-explain tool can format this in an understandable tree.
This function is a shortcut to turn a QuerySet into its visual explanation, making it easy to gain a better understanding
of what your queries really end up doing.
django_mysql.models.pt_visual_explain(display=True)
Call on a QuerySet to print its visual explanation, or with display=False to return it as a string. It prepends
the SQL of the query with ‘EXPLAIN’ and passes it through the mysql and pt-visual-explain commands
to get the output. You therefore need the MySQL client and Percona Toolkit installed where you run this.
Example:
>>> Author.objects.all().pt_visual_explain()
Table scan
rows           1020
+- Table
   table          myapp_author
Can also be imported as a standalone function if you want to use it on a QuerySet that does not have the
QuerySetMixin added, e.g. for built-in Django models:
CHAPTER
FIVE
MODEL FIELDS
5.1 DynamicField
MariaDB has a feature called Dynamic Columns that allows you to store different sets of columns for each row in a
table. It works by storing the data in a blob and having a small set of functions to manipulate this blob. (Docs).
Django-MySQL supports the named Dynamic Columns of MariaDB 10.0+, as opposed to the numbered format of
5.5+. It uses the mariadb-dyncol python package to pack and unpack Dynamic Columns blobs in Python rather than in
MariaDB (mostly due to limitations in the Django ORM).
class django_mysql.models.DynamicField(spec=None, **kwargs)
A field for storing Dynamic Columns. The Python data type is dict. Keys must be strs and values must be one
of the supported value types in mariadb-dyncol:
• str
• int
• float
• datetime.date
• datetime.datetime
• datetime.time
• A nested dict conforming to this spec too
Note that there are restrictions on the range of values supported for some of these types, and that decimal.
Decimal objects are not yet supported though they are valid in MariaDB. For more information consult the
mariadb-dyncol documentation.
Values may also be None, though they will then not be stored, since dynamic columns do not store NULL, so you
should use .get() to retrieve values that may be None.
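The effect of NULL values not being stored can be pictured with a plain dict. The helper below is hypothetical and only mimics what a value might look like after a round trip through the database; it does not touch MariaDB.

```python
def drop_nulls(attrs: dict) -> dict:
    """Mimic storage: keys whose value is None are simply not kept."""
    return {name: value for name, value in attrs.items() if value is not None}


saved = {"size": "Large", "colour": None}
loaded = drop_nulls(saved)

# loaded["colour"] would raise KeyError, so prefer .get()
print(loaded.get("colour"))
```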
To use this field, you’ll need to:
1. Use MariaDB 10.0.2+
2. Install mariadb-dyncol (python -m pip install mariadb-dyncol)
3. Use either the utf8mb4 or utf8 character set for your database connection.
These are all checked by the field and you will see sensible errors for them when Django’s checks run if you have
a DynamicField on a model.
spec
This is an optional type specification that checks that the named columns, if present, have the given types.
It is validated against on save() to ensure type safety (unlike normal Django validation which is only used
in forms). It is also used for type information for lookups (below).
spec should be a dict with string keys and values that are the type classes you expect. You can also nest
another such dictionary as a value for validating nested dynamic columns.
For example:
import datetime


class SpecModel(Model):
    attrs = DynamicField(
        spec={
            "an_integer_key": int,
            "created_at": datetime.datetime,
            "nested_columns": {
                "lat": int,
                "lon": int,
            },
        }
    )
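The kind of check a spec implies can be sketched as a small recursive function over plain dicts. This is an illustration only: the function name and the choice of TypeError are assumptions, and the library's own validation may behave differently.

```python
import datetime

SPEC = {
    "an_integer_key": int,
    "created_at": datetime.datetime,
    "nested_columns": {"lat": int, "lon": int},
}


def validate_spec(value: dict, spec: dict, prefix: str = "") -> None:
    """Check that named columns, if present, have the expected types."""
    for key, expected in spec.items():
        if key not in value:
            continue  # columns are only checked when present
        item = value[key]
        if isinstance(expected, dict):
            if not isinstance(item, dict):
                raise TypeError(f"{prefix}{key} should be a dict")
            validate_spec(item, expected, prefix=f"{prefix}{key}.")
        elif not isinstance(item, expected):
            raise TypeError(f"{prefix}{key} should be {expected.__name__}")


validate_spec({"an_integer_key": 1, "nested_columns": {"lat": 51}}, SPEC)
```

Keys that are not mentioned in the spec are left unchecked, so a spec can describe only the columns you care about.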
By default a DynamicField has no form field, because there isn’t really a practical way to edit its contents. If required,
it is possible to add extra form fields to a ModelForm that then update specific dynamic column names on the instance in
the form’s save().
You can query by names, including nested names. In cases where names collide with existing lookups (e.g. you have a
column named 'exact'), you might want to use the ColumnGet database function. You can also use the ColumnAdd
and ColumnDelete functions for atomically modifying the contents of dynamic columns at the database layer.
We’ll use the following example model:
class ShopItem(Model):
    name = models.CharField(max_length=200)
    attrs = DynamicField(
        spec={
            "size": str,
        }
    )

    def __str__(self):
        return self.name
Exact Lookups
Name Lookups
To query based on a column name, use that name as a lookup with one of the below SQL types added after an underscore.
If the column name is in your field’s spec, you can omit the SQL type and it will be extracted automatically - this
includes keys in nested dicts.
The list of SQL types is:
• BINARY - dict (a nested DynamicField)
• CHAR - str
• DATE - datetime.date
• DATETIME - datetime.datetime
• DOUBLE - float
• INTEGER - int
• TIME - datetime.time
These will also use the correct Django ORM field so chained lookups based on that type are possible, e.g.
dynamicfield__age_INTEGER__gte=20.
Beware that getting a named column can always return NULL if the column is not defined for a row.
For example:
# As 'size' is in the field's spec, there is no need to give the SQL type
>>> ShopItem.objects.filter(attrs__size="Large")
[<ShopItem: T-Shirt>]
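How a lookup name might resolve to a column and a Python type can be sketched as follows. This is a hypothetical helper for illustration: it uses the SQL type list above and assumes the spec is consulted first, which may not match the library's internals exactly.

```python
import datetime

# SQL type suffixes and the Python types listed above
SQL_TYPES = {
    "BINARY": dict,
    "CHAR": str,
    "DATE": datetime.date,
    "DATETIME": datetime.datetime,
    "DOUBLE": float,
    "INTEGER": int,
    "TIME": datetime.time,
}


def resolve_column(lookup: str, spec: dict) -> tuple:
    """Return (column name, Python type) for a name such as 'age_INTEGER'."""
    if lookup in spec:
        return lookup, spec[lookup]  # the type comes from the spec
    name, underscore, suffix = lookup.rpartition("_")
    if underscore and name and suffix in SQL_TYPES:
        return name, SQL_TYPES[suffix]
    raise ValueError(f"Cannot determine a type for {lookup!r}")


print(resolve_column("size", {"size": str}))
print(resolve_column("age_INTEGER", {}))
```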
Legacy
These field classes are only maintained for legacy purposes. They aren’t recommended as comma separation is a fragile
serialization format.
For new uses, you’re better off using Django 3.1’s JSONField that works with all database backends. On earlier
versions of Django, you can use django-jsonfield-backport.
Two fields that store lists of data, grown-up versions of Django’s CommaSeparatedIntegerField, cousins of
django.contrib.postgres’s ArrayField. There are two versions: ListCharField, which is based on
CharField and appropriate for storing lists with a small maximum size, and ListTextField, which is based on
TextField and therefore suitable for lists of (near) unbounded size (the underlying LONGTEXT MySQL datatype has a
maximum length of 2^32 - 1 bytes).
class django_mysql.models.ListCharField(base_field, size=None, **kwargs)
A field for storing lists of data, all of which conform to the base_field.
base_field
The base type of the data that is stored in the list. Currently, must be IntegerField, CharField, or any
subclass thereof - except for ListCharField itself.
size
Optionally set the maximum number of items in the list. This is only checked on form validation, not on
model save!
As ListCharField is a subclass of CharField, any CharField options can be set too. Most importantly
you’ll need to set max_length to determine how many characters to reserve in the database.
Example instantiation:
class Person(Model):
    name = CharField()
    post_nominals = ListCharField(
        base_field=CharField(max_length=10),
        size=6,
        max_length=(6 * 11),  # 6 * 10 character nominals, plus commas
    )
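The max_length arithmetic in that example can be spelled out with a small helper. The function name is illustrative; it simply counts the characters of every item at its maximum length plus one comma between each pair of neighbours.

```python
def list_char_max_length(size: int, item_max_length: int) -> int:
    """Characters needed for `size` full-length items joined by commas."""
    separators = max(size - 1, 0)
    return size * item_max_length + separators


needed = list_char_max_length(size=6, item_max_length=10)
print(needed, "characters needed;", 6 * 11, "reserved in the example")
```

The example's 6 * 11 reserves one character more than strictly required, which is a simple and safe upper bound.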
Validation on save()
When performing the list-to-string conversion for the database, ListCharField performs some validation, and
will raise ValueError if there is a problem, to avoid saving bad data. The following are invalid:
• Any member containing a comma in its string representation
• Any member whose string representation is the empty string
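A minimal sketch of this kind of list-to-string conversion is shown below. It is not the field's actual code; the function name and error messages are made up, and only the two rules listed above are enforced.

```python
def join_list(values: list) -> str:
    """Serialize a list to a comma-separated string, rejecting bad members."""
    parts = []
    for index, value in enumerate(values):
        text = str(value)
        if "," in text:
            raise ValueError(f"Item {index} contains a comma: {text!r}")
        if text == "":
            raise ValueError(f"Item {index} is the empty string")
        parts.append(text)
    return ",".join(parts)


print(join_list(["PhD", "Esq.", "III"]))
```

Both rules exist because the stored string could not be split back into the same list: a comma inside a member would create an extra item, and an empty member would be indistinguishable from a missing one.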
class Widget(Model):
    widget_group_ids = ListTextField(
        base_field=IntegerField(),
        size=100,  # Maximum of 100 ids in list
    )
Warning: These fields are not built-in datatypes, and the filters use one or more SQL functions to parse the
underlying string representation. They may slow down on large tables if your queries are not selective on other
columns.
contains
The contains lookup is overridden on ListCharField and ListTextField to match where the list field contains
the given element, using MySQL’s FIND_IN_SET function (docs: MariaDB / MySQL).
For example:
>>> Person.objects.filter(post_nominals__contains="PhD")
[<Person: Horatio>, <Person: Severus>]
>>> Person.objects.filter(post_nominals__contains="Esq.")
[<Person: Horatio>]
>>> Person.objects.filter(post_nominals__contains="DPhil")
[<Person: Severus>]
>>> Person.objects.filter(
... Q(post_nominals__contains="PhD") & Q(post_nominals__contains="III")
... )
[<Person: Horatio>]
Note: ValueError will be raised if you try contains with a list. It’s not possible without using AND in the query, so
you should add the filters for each item individually, as per the last example.
len
A transform that converts to the number of items in the list. For example:
>>> Person.objects.filter(post_nominals__len=0)
[<Person: Paulus>]
>>> Person.objects.filter(post_nominals__len=2)
[<Person: Severus>]
>>> Person.objects.filter(post_nominals__len__gt=2)
[<Person: Horatio>]
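One way to picture the len transform is counting separators in the stored string. The helper below is a plain-Python illustration under that assumption and does not show the SQL the library generates.

```python
def list_len(stored: str) -> int:
    """Number of items in a comma-separated string; empty means no items."""
    if stored == "":
        return 0
    return stored.count(",") + 1


for stored in ["", "PhD,DPhil", "PhD,Esq.,III"]:
    print(repr(stored), list_len(stored))
```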
Index lookups
This class of lookups allows you to index into the list to check if the first occurrence of a given element is at a given
position. There are no errors if it exceeds the size of the list. For example:
>>> Person.objects.filter(post_nominals__0="PhD")
[<Person: Horatio>, <Person: Severus>]
>>> Person.objects.filter(post_nominals__1="DPhil")
[<Person: Severus>]
>>> Person.objects.filter(post_nominals__100="VC")
[]
Warning: The underlying function, FIND_IN_SET, is designed for sets, i.e. comma-separated lists of unique
elements. It therefore only allows you to query about the first occurrence of the given item. For example, this is a
non-match:
>>> Person.objects.create(name="Cacistus", post_nominals=["MSc", "MSc"])
>>> Person.objects.filter(post_nominals__1="MSc")
[] # Cacistus does not appear because his first MSc is at position 0
Note: FIND_IN_SET uses 1-based indexing for searches on comma-based strings when writing raw SQL. However, as the examples above show, these index lookups use 0-based indexing, as in Python.
Note: Unlike the similar feature on django.contrib.postgres’s ArrayField, ‘Index transforms’, these are
lookups, and only allow direct value comparison rather than continued chaining with the base-field lookups. This
is because the field is not a native list type in MySQL.
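To make the first-occurrence behaviour concrete, here is a plain-Python model of an index lookup. It is an illustrative sketch only: find_in_set and index_lookup_matches are hypothetical helper names, not part of Django-MySQL, and the real comparison happens in SQL.

```python
def find_in_set(needle, csv):
    """Mimic FIND_IN_SET: 1-based position of the first match, or 0 if absent."""
    if csv == "":
        return 0
    for position, item in enumerate(csv.split(","), start=1):
        if item == needle:
            return position
    return 0


def index_lookup_matches(values, index, needle):
    """Model of field__<index>=needle, where index is 0-based."""
    return find_in_set(needle, ",".join(values)) == index + 1


horatio = ["PhD", "Esq.", "III"]
cacistus = ["MSc", "MSc"]
print(index_lookup_matches(horatio, 0, "PhD"))    # True
print(index_lookup_matches(cacistus, 1, "MSc"))   # False: the first MSc is at 0
print(index_lookup_matches(horatio, 100, "VC"))   # False, and no error
```

Because only the first position is ever reported, a duplicated element can never match at its second position.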
Similar to Django’s F expression, this allows you to perform atomic add and remove operations on list fields at the
database level:
class django_mysql.models.ListF(field_name)
You should instantiate this class with the name of the field to use, and then call one of its methods.
Note that unlike F, you cannot chain the methods - the SQL involved is a bit too complicated, and thus only single
operations are supported.
append(value)
Adds the value of the given expression to the (right hand) end of the list, like list.append:
>>> Person.objects.update(post_nominals=ListF("post_nominals").append("DSocSci"))
>>> Person.objects.get().full_name
"Horatio Phd Esq. III DSocSci"
appendleft(value)
Adds the value of the given expression to the (left hand) end of the list, like deque.appendleft:
>>> Person.objects.update(post_nominals=ListF("post_nominals").appendleft("BArch"))
>>> Person.objects.get().full_name
"Horatio BArch Phd Esq. III DSocSci"
pop()
Takes one value from the (right hand) end of the list, like list.pop:
>>> Person.objects.update(post_nominals=ListF("post_nominals").pop())
>>> Person.objects.get().full_name
"Horatio BArch Phd Esq. III"
popleft()
Takes one value off the (left hand) end of the list, like deque.popleft:
>>> Person.objects.update(post_nominals=ListF("post_nominals").popleft())
>>> Person.objects.get().full_name
"Horatio Phd Esq. III"
Warning: All the above methods use SQL expressions with user variables in their queries, all of which start
with @tmp_. This shouldn’t affect you much, but if you use user variables in your queries, beware of any
conflicts.
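As a reading aid, the four methods can be compared with their in-memory Python counterparts. This is only an analogy under the assumption that the list starts as ["PhD", "Esq.", "III"]: a deque changes a value in Python memory, whereas ListF performs each change in a single UPDATE query on the server.

```python
from collections import deque

post_nominals = deque(["PhD", "Esq.", "III"])  # assumed starting value

post_nominals.append("DSocSci")    # analogue of ListF(...).append("DSocSci")
post_nominals.appendleft("BArch")  # analogue of ListF(...).appendleft("BArch")
post_nominals.pop()                # analogue of ListF(...).pop()
post_nominals.popleft()            # analogue of ListF(...).popleft()

print(list(post_nominals))  # ['PhD', 'Esq.', 'III']
```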
Legacy
These field classes are only maintained for legacy purposes. They aren’t recommended as comma separation is a fragile
serialization format.
For new uses, you’re better off using Django 3.1’s JSONField that works with all database backends. On earlier
versions of Django, you can use django-jsonfield-backport.
Two fields that store sets of a base field in comma-separated strings - cousins of Django’s
CommaSeparatedIntegerField. There are two versions: SetCharField, which is based on CharField and
appropriate for storing sets with a small maximum size, and SetTextField, which is based on TextField and
therefore suitable for sets of (near) unbounded size (the underlying LONGTEXT MySQL datatype has a maximum length
of 2^32 - 1 bytes).
SetCharField(base_field, size=None, **kwargs):
A field for storing sets of data, which all conform to the base_field.
django_mysql.models.base_field
The base type of the data that is stored in the set. Currently, must be IntegerField, CharField, or any
subclass thereof - except SetCharField itself.
django_mysql.models.size
Optionally set the maximum number of elements in the set. This is only checked on form validation, not
on model save!
As SetCharField is a subclass of CharField, any CharField options can be set too. Most importantly you’ll
need to set max_length to determine how many characters to reserve in the database.
Example instantiation:
Validation on save()
When performing the set-to-string conversion for the database, SetCharField performs some validation, and
will raise ValueError if there is a problem, to avoid saving bad data. The following are invalid:
• If there is a comma in any member’s string representation
• If the empty string is stored.
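The two rules above can be sketched in plain Python. This is not the field's actual implementation: set_to_db_string is a hypothetical name, and sorting the members is an assumption made here only to get a deterministic string.

```python
def set_to_db_string(values):
    """Join a set into one comma-separated string, rejecting unsafe members."""
    members = [str(value) for value in values]
    for member in members:
        if "," in member:
            # A comma would be indistinguishable from the separator.
            raise ValueError(f"Set members cannot contain commas: {member!r}")
        if member == "":
            # An empty member cannot be told apart from an empty set.
            raise ValueError("The empty string cannot be stored in the set.")
    # Assumption: sort so that equal sets always give the same string.
    return ",".join(sorted(members))


print(set_to_db_string({"django", "thoughts"}))  # django,thoughts
```

Both checks exist for the same reason: the stored string must split back into exactly the members that went in.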
class Post(Model):
    tags = SetTextField(
        base_field=CharField(max_length=32),
    )
Warning: These fields are not built-in datatypes, and the filters use one or more SQL functions to parse the
underlying string representation. They may slow down on large tables if your queries are not selective on other
columns.
contains
The contains lookup is overridden on SetCharField and SetTextField to match where the set field contains the
given element, using MySQL’s FIND_IN_SET (docs: MariaDB / MySQL).
For example:
>>> Post.objects.filter(tags__contains="thoughts")
[<Post: First post>, <Post: Second post>]
>>> Post.objects.filter(tags__contains="django")
[<Post: First post>, <Post: Third post>]
Note: ValueError will be raised if you try contains with a set. It’s not possible without using AND in the query, so
you should add the filters for each item individually, as per the last example.
len
A transform that converts to the number of items in the set. For example:
>>> Post.objects.filter(tags__len=1)
[<Post: Second post>]
>>> Post.objects.filter(tags__len=2)
[<Post: First post>, <Post: Third post>]
>>> Post.objects.filter(tags__len__lt=2)
[<Post: Second post>]
Similar to Django’s F expression, this allows you to perform an atomic add or remove on a set field at the database
level:
class django_mysql.models.SetF(field_name)
You should instantiate this class with the name of the field to use, and then call one of its two methods with a
value to be added/removed.
Note that unlike F, you cannot chain the methods - the SQL involved is a bit too complicated, and thus you can
only perform a single addition or removal.
add(value)
Takes an expression and returns a new expression that will take the value of the original field and add the
value to the set if it is not contained:
post.tags = SetF("tags").add("python")
post.save()
remove(value)
Takes an expression and returns a new expression that will remove the given item from the set field if it is
present:
post.tags = SetF("tags").remove("python")
post.save()
Warning: Both of the above methods use SQL expressions with user variables in their queries, all of which
start with @tmp_. This shouldn’t affect you much, but if you use user variables in your queries, beware of
any conflicts.
5.4 EnumField
Using a CharField with a limited set of strings leads to inefficient data storage since the string value is stored over and
over on disk. MySQL’s ENUM type allows a more compact representation of such columns by storing the list of strings
just once and using an integer in each row to refer to which string is there. EnumField allows you to use the ENUM type
with Django.
Docs: MySQL / MariaDB.
class django_mysql.models.EnumField(choices, **kwargs)
A subclass of Django’s CharField that uses a MySQL ENUM for storage.
choices is a standard Django argument for any field class; however, it is required for EnumField. It can either
be a list of strings, or a list of two-tuples of strings, where the first element in each tuple is the value used, and
the second the human readable name used in forms. The best way to form it is with Django’s TextChoices
enumeration type.
For example:
class BookCoverColour(models.TextChoices):
    RED = "red"
    GREEN = "green"
    BLUE = "blue"


class BookCover(models.Model):
    colour = EnumField(choices=BookCoverColour.choices)
Warning: It is possible to append new values to choices in migrations, as well as edit the human readable
names of existing choices.
However, editing or removing existing choice values will error if MySQL Strict Mode is on, and replace the
values with the empty string if it is not.
Also the empty string has strange behaviour with ENUM, acting somewhat like NULL, but not entirely. It is
therefore recommended you ensure Strict Mode is on.
5.5 FixedCharField
Django’s CharField uses the VARCHAR data type, which uses variable storage space depending on string length. This
normally saves storage space, but for columns with a fixed length, it adds a small overhead.
The alternative CHAR data type avoids that overhead, but it has the surprising behaviour of removing trailing space
characters, and consequently ignoring them in comparisons. FixedCharField provides a Django field for using CHAR.
This can help you interface with databases created by other systems, but it’s not recommended for general use, due to
the trailing space behaviour.
Docs: MySQL / MariaDB.
class Address(Model):
    zip_code = FixedCharField(length=5)
Django’s TextField and BinaryField fields are fixed at the MySQL level to use the maximum size class for the
BLOB and TEXT data types. This is fine for most applications; however, if you are working with a legacy database, or
you want to be stricter about the maximum size of data that can be stored, you might want one of the other sizes.
The following field classes are simple subclasses that allow you to provide an extra parameter to determine which
size class to use. They work with migrations, allowing you to swap them for the existing Django class and then use a
migration to change their size class. This might help when taking over a legacy database for example.
Docs: MySQL / MariaDB.
class django_mysql.models.SizedTextField(size_class: int, **kwargs)
A subclass of Django’s TextField that allows you to use the other sizes of TEXT data type. Set size_class
to:
• 1 for a TINYTEXT field, which has a maximum length of 255 bytes
• 2 for a TEXT field, which has a maximum length of 65,535 bytes
• 3 for a MEDIUMTEXT field, which has a maximum length of 16,777,215 bytes (16MiB)
• 4 for a LONGTEXT field, which has a maximum length of 4,294,967,295 bytes (4GiB)
class django_mysql.models.SizedBinaryField(size_class, **kwargs)
A subclass of Django’s BinaryField that allows you to use the other sizes of BLOB data type. Set size_class
to:
• 1 for a TINYBLOB field, which has a maximum length of 255 bytes
• 2 for a BLOB field, which has a maximum length of 65,535 bytes
• 3 for a MEDIUMBLOB field, which has a maximum length of 16,777,215 bytes (16MiB)
• 4 for a LONGBLOB field, which has a maximum length of 4,294,967,295 bytes (4GiB)
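The limits listed above all follow the pattern 2^(8 * size_class) - 1 bytes. The sketch below uses that observation to pick the smallest size class for a given payload; max_bytes and smallest_size_class are hypothetical helpers written for illustration, not part of Django-MySQL.

```python
def max_bytes(size_class):
    """Maximum length in bytes for size classes 1 to 4, as listed above."""
    if size_class not in (1, 2, 3, 4):
        raise ValueError("size_class must be 1, 2, 3 or 4")
    return 2 ** (8 * size_class) - 1


def smallest_size_class(n_bytes):
    """Return the smallest size class that can hold n_bytes bytes."""
    for size_class in (1, 2, 3, 4):
        if n_bytes <= max_bytes(size_class):
            return size_class
    raise ValueError("value is too large for any size class")


print([max_bytes(n) for n in (1, 2, 3, 4)])
# [255, 65535, 16777215, 4294967295]
print(smallest_size_class(70_000))  # 3
```

Note that the limits are in bytes, so multi-byte characters reduce the number of characters a TEXT column can hold.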
Some database systems, such as the Java Hibernate ORM, don’t use MySQL’s bool data type for storing boolean flags
and instead use BIT(1). Django’s default BooleanField and NullBooleanField classes can’t work with this.
The following subclasses are boolean fields that work with BIT(1) columns that will help when connecting to a legacy
database. If you are using inspectdb to generate models from the database, use these to replace the TextField output
for your BIT(1) columns.
class Bit1BooleanField
A subclass of Django’s BooleanField that uses the BIT(1) column type instead of bool.
class NullBit1BooleanField
Note: Django deprecated NullBooleanField in version 3.1 and retains it only for use in old migrations.
NullBit1BooleanField is similarly deprecated.
A subclass of Django’s NullBooleanField that uses the BIT(1) column type instead of bool.
SIX
FIELD LOOKUPS
ORM extensions for filtering. These are all automatically added for the appropriate field types when django_mysql
is in your INSTALLED_APPS. Note that lookups specific to included model fields are documented with the field, rather
than here.
MySQL string comparison has a case-sensitivity dependent on the collation of your tables/columns, as the Django
manual describes. However, it is possible to query in a case-sensitive manner even when your data is not stored with
a case-sensitive collation, using the BINARY keyword. The following lookup adds that capability to the ORM for
CharField, TextField, and subclasses thereof.
6.1.1 case_exact
Exact, case-sensitive match for character columns, no matter the underlying collation:
>>> Author.objects.filter(name__case_exact="dickens")
[]
>>> Author.objects.filter(name__case_exact="Dickens")
[<Author: Dickens>]
6.2 Soundex
MySQL implements the Soundex algorithm with its SOUNDEX function, allowing you to find words sounding similar to
each other (in English only, regrettably). These lookups allow you to use that function in the ORM and are added for
CharField and TextField.
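For readers unfamiliar with Soundex, here is a sketch of the classic four-character American variant in plain Python. It is for illustration only: MySQL's SOUNDEX differs in some details (for instance, it can return strings longer than four characters), so do not rely on this function to reproduce the server's output for every input.

```python
# Consonant groups that share a Soundex digit; vowels, H, W and Y have no digit.
_CODES = {
    letter: digit
    for digit, letters in {
        "1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}


def soundex(word):
    """Classic Soundex: first letter plus three digits, padded with zeros."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    digits = []
    previous = _CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "HW":
            continue  # H and W do not separate letters with the same digit
        code = _CODES.get(letter, "")
        if code and code != previous:
            digits.append(code)
        previous = code  # a vowel resets this, so repeats after it count again
    return (letters[0] + "".join(digits) + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

Both names reduce to the same code, which is why they match each other in the lookups below.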
6.2.1 soundex
>>> Author.objects.filter(name__soundex="R163")
[<Author: Robert>, <Author: Rupert>]
SQL equivalent:
6.2.2 sounds_like
>>> Author.objects.filter(name__sounds_like="Robert")
[<Author: Robert>, <Author: Rupert>]
SQL equivalent:
SEVEN
AGGREGATES
>>> Book.objects.create(bitfield=29)
>>> Book.objects.create(bitfield=15)
>>> Book.objects.all().aggregate(BitAnd("bitfield"))
{'bitfield__bitand': 13}
class django_mysql.models.BitOr(column)
Returns an int of the bitwise OR of all input values, or 0 if no rows match.
Docs: MySQL / MariaDB.
Example usage:
>>> Book.objects.create(bitfield=29)
>>> Book.objects.create(bitfield=15)
>>> Book.objects.all().aggregate(BitOr("bitfield"))
{'bitfield__bitor': 31}
class django_mysql.models.BitXor(column)
Returns an int of the bitwise XOR of all input values, or 0 if no rows match.
Docs: MySQL / MariaDB.
Example usage:
>>> Book.objects.create(bitfield=11)
>>> Book.objects.create(bitfield=3)
>>> Book.objects.all().aggregate(BitXor("bitfield"))
{'bitfield__bitxor': 8}
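The results in the examples above can be checked by hand with Python's bitwise operators. This sketch only reproduces the arithmetic; the real aggregates run inside the database. The starting value for AND (all 64 bits set) reflects what MySQL's BIT_AND returns when no rows match, and is included here so the empty case is defined.

```python
from functools import reduce
from operator import and_, or_, xor

ALL_BITS_SET = 2**64 - 1  # identity value for a 64-bit AND


def bit_and(values):
    return reduce(and_, values, ALL_BITS_SET)


def bit_or(values):
    return reduce(or_, values, 0)


def bit_xor(values):
    return reduce(xor, values, 0)


# 29 = 0b11101, 15 = 0b01111, 11 = 0b1011, 3 = 0b0011
print(bit_and([29, 15]))  # 13 (0b01101)
print(bit_or([29, 15]))   # 31 (0b11111)
print(bit_xor([11, 3]))   # 8  (0b1000)
```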
Warning: MySQL will truncate the value at the value of group_concat_max_len, which by default is
quite low at 1024 characters. You should probably increase it if you’re using this for any sizeable groups.
group_concat_max_len docs: MySQL / MariaDB.
Optional arguments:
distinct=False
If set to True, removes duplicates from the group.
separator=','
By default the separator is a comma. You can use any other string as a separator, including the empty string.
Warning: Due to limitations in the Django aggregate API, this is not protected against SQL injection.
Don’t pass in user input for the separator.
ordering=None
By default no guarantee is made on the order the values will be in pre-concatenation. Set ordering to 'asc'
to sort them in ascending order, and 'desc' for descending order. For example:
CHAPTER
EIGHT
DATABASE FUNCTIONS
>>> Author.objects.annotate(
... is_william=If(Q(name__startswith="William "), True, False)
... ).values_list("name", "is_william")
[('William Shakespeare', True),
('Ian Fleming', False),
('William Wordsworth', True)]
class django_mysql.models.functions.CRC32(expression)
Computes a cyclic redundancy check value and returns a 32-bit unsigned value. The result is NULL if the argument
is NULL. The argument is expected to be a string and (if possible) is treated as one if it is not.
Docs: MySQL / MariaDB.
Usage example:
>>> Author.objects.annotate(description_crc=CRC32("description"))
>>> # Females, then males - but other values of gender (e.g. empty string) first
>>> Person.objects.all().order_by(Field("gender", ["Female", "Male"]))
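The ordering in that example follows from how MySQL's FIELD function numbers its arguments: it returns the 1-based position of the value in the list, or 0 if the value is not listed, so unlisted values sort first. A plain-Python model, with a hypothetical helper named field and made-up sample data, might look like this (it ignores collation, so comparisons here are case-sensitive):

```python
def field(value, choices):
    """Mimic FIELD(): 1-based position of value in choices, or 0 if absent."""
    try:
        return choices.index(value) + 1
    except ValueError:
        return 0


genders = ["Male", "", "Female", "Male", "Other"]  # sample data, not from the docs
ordered = sorted(genders, key=lambda gender: field(gender, ["Female", "Male"]))
print(ordered)  # ['', 'Other', 'Female', 'Male', 'Male']
```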
Note: These work with MariaDB 10.0.5+ only, which includes PCRE regular expressions and these extra functions
to use them. More information can be found in its documentation.
Usage example:
>>> Author.objects.create(name="Euripides")
>>> Author.objects.create(name="Frank Miller")
>>> Author.objects.create(name="Sophocles")
>>> Author.objects.annotate(
... name_has_space=CharLength(RegexpSubstr("name", r"\s"))
... ).filter(name_has_space=0)
[<Author: Euripides>, <Author: Sophocles>]
class django_mysql.models.functions.LastInsertId(expression=None)
With no argument, returns the last value added to an auto-increment column, or set by another call to
LastInsertId with an argument. With an argument, sets the ‘last insert id’ value to the value of the given
expression, and returns that value. This can be used to implement simple UPDATE ... RETURNING style queries.
This function also has a class method:
get(using=DEFAULT_DB_ALIAS)
Returns the value set by a call to LastInsertId() with an argument, by performing a single query. It is
stored per-connection, hence you may need to pass the alias of the connection that set the LastInsertId
as using.
Note: Any queries on the database connection between setting LastInsertId and calling
LastInsertId.get() can reset the value. These might come from Django, which can issue multiple
queries for update() with multi-table inheritance, or for delete() with cascading.
>>> Countable.objects.filter(id=1).update(counter=LastInsertId("counter") + 1)
1
>>> # Get the pre-increase value of 'counter' as stored on the server
>>> LastInsertId.get()
242
These functions work with data stored in Django’s JSONField on MySQL and MariaDB only. JSONField is built in
to Django 3.1+ and can be installed on older Django versions with the django-jsonfield-backport package.
These functions use JSON paths to address content inside JSON documents - for more information on their syntax,
refer to the docs: MySQL / MariaDB.
class django_mysql.models.functions.JSONExtract(expression, *paths, output_field=None)
Given expression that resolves to some JSON data, extract the given JSON paths. If there is a single path,
the plain value is returned; if there is more than one path, the output is a JSON array with the list of values
represented by the paths. If the expression does not match for a particular JSON object, returns NULL.
If only one path is given, output_field may also be given as a model field instance like IntegerField(),
into which Django will load the value; the default is JSONField(), as it supports all return types including the
array of values for multiple paths.
Note that if expression is a string, it will refer to a field, whereas members of paths that are strings will be
wrapped with Value automatically and thus interpreted as the given string. If you want any of paths to refer to
a field, use Django’s F() class.
Docs: MySQL / MariaDB.
Usage examples:
>>> # Add power_level = 0 for those items that don't have power_level
>>> ShopItem.objects.update(attrs=JSONInsert("attrs", {"$.power_level": 0}))
Note that if expression is a string, it will refer to a field, whereas keys and values within the data dictionary
will be wrapped with Value automatically and thus interpreted as the given string. If you want a key or value to
refer to a field, use Django’s F() class.
Docs: MySQL / MariaDB.
>>> # Append the string '10m' to the array 'sizes' directly in MySQL
>>> shop_item = ShopItem.objects.latest()
>>> shop_item.attrs = JSONArrayAppend("attrs", {"$.sizes": "10m"})
>>> shop_item.save()
These are MariaDB 10.0+ only, and for use with DynamicField.
class django_mysql.models.functions.AsType(expression, data_type)
A partial function that should be used as part of a ColumnAdd expression when you want to ensure that
expression will be stored as a given type data_type. The possible values for data_type are the same as
documented for the DynamicField lookups.
Note that this is not a valid standalone function and must be used as part of ColumnAdd - see below.
class django_mysql.models.functions.ColumnAdd(expression, to_add)
Given expression that resolves to a DynamicField (most often a field name), add/update with the dictionary
to_add and return the new Dynamic Columns value. This can be used for atomic single-query updates on
Dynamic Columns.
Note that you can add optional types (and you should!). These cannot be drawn from the spec of the
DynamicField due to ORM restrictions, so there are no guarantees about the types that will get used if you
do not. To add a type cast, wrap the value with an AsType (above) - see examples below.
Docs: MariaDB.
Usage examples:
NINE
MIGRATION OPERATIONS
class Migration(migrations.Migration):
    dependencies = []
    operations = [
        # Install https://mariadb.com/kb/en/mariadb/metadata_lock_info/
        InstallPlugin("metadata_lock_info", "metadata_lock_info.so")
    ]
class django_mysql.operations.InstallSOName(soname)
MariaDB only.
An Operation subclass that installs a MariaDB plugin library. One library may contain multiple plugins that
work together; this installs all of the plugins in the named library file. Runs INSTALL SONAME soname. Note
that unlike InstallPlugin, there is no idempotency check to see if the library is already installed, since there
is no way of knowing if all the plugins inside the library are installed.
Docs: MariaDB.
soname
This is a required argument. The name of the library to install the plugin from. You may skip the file
extension (e.g. .so, .dll) to keep the operation platform-independent.
Example usage:
class Migration(migrations.Migration):
    dependencies = []
    operations = [
        # Install https://mariadb.com/kb/en/mariadb/metadata_lock_info/
        InstallSOName("metadata_lock_info")
    ]
Note: If you’re using this to move from MyISAM to InnoDB, there’s a page for you in the MariaDB knowledge
base - Converting Tables from MyISAM to InnoDB.
Example usage:
class Migration(migrations.Migration):
    dependencies = []
TEN
FORM FIELDS
10.1 SimpleListField
max_length
This is an optional argument which validates that the list does not exceed the given length.
min_length
This is an optional argument which validates that the list reaches at least the given length.
SimpleListField is not particularly user friendly in most cases; however, it’s better than nothing.
10.2 SimpleSetField
max_length
This is an optional argument which validates that the set does not exceed the given length.
min_length
This is an optional argument which validates that the set reaches at least the given length.
ELEVEN
VALIDATORS
TWELVE
CACHE
12.1 MySQLCache
An efficient cache backend using a MySQL table, an alternative to Django’s database-agnostic DatabaseCache. It has
the following advantages:
• Each operation uses only one query, including the *_many methods. This is unlike DatabaseCache which uses
multiple queries for nearly every operation.
• Automatic client-side zlib compression for objects larger than a given threshold. It is also easy to subclass and
add your own serialization or compression schemes.
• Faster probabilistic culling behaviour during write operations, which you can also turn off and execute in a
background task. This can be a bottleneck with Django’s DatabaseCache since it culls on every write operation,
executing a SELECT COUNT(*) which requires a full table scan.
• Integer counters with atomic incr() and decr() operations, like the MemcachedCache backend.
12.1.1 Usage
CACHES = {
    "default": {
        "BACKEND": "django_mysql.cache.MySQLCache",
        "LOCATION": "my_super_cache",
    }
}
You then need to make the table. The schema is not compatible with that of DatabaseCache, so if you are switching,
you will need to create a fresh table.
Use the management command mysql_cache_migration to output a migration that creates tables for all the
MySQLCache instances you have configured. For example:
class Migration(migrations.Migration):
    dependencies = [
        # Add a dependency in here on an existing migration in the app you
        # put this migration in, for example:
        # ('myapp', '0001_initial'),
    ]
    operations = [
        migrations.RunSQL(
            """
            CREATE TABLE `my_super_cache` (
                cache_key varchar(255) CHARACTER SET utf8 COLLATE utf8_bin
                                       NOT NULL PRIMARY KEY,
                value longblob NOT NULL,
                value_type char(1) CHARACTER SET latin1 COLLATE latin1_bin
                                   NOT NULL DEFAULT 'p',
                expires BIGINT UNSIGNED NOT NULL
            );
            """,
            "DROP TABLE `my_super_cache`"
        ),
    ]
Save this to a file in the migrations directory of one of your project’s apps, and add one of your existing migrations
to the file’s dependencies. You might want to customize the SQL at this time, for example switching the table to use
the MEMORY storage engine.
Django requires you to install sqlparse to run the RunSQL operation in the migration, so make sure it is installed.
Once the migration has run, the cache is ready to work!
If you use this with multiple databases, you’ll also need to set up routing instructions for the cache table. This can be
done with the same method that is described for DatabaseCache in the Django manual, except that the application
name is django_mysql.
Note: Even if you aren’t using multiple MySQL databases, it may be worth using routing anyway to put all your cache
operations on a second connection - this way they won’t be affected by transactions your main code runs.
MySQLCache is fully compatible with Django’s cache API, but it also extends it and there are, of course, a few details
to be aware of.
incr/decr
Like MemcachedCache (and unlike DatabaseCache), incr and decr are atomic operations, and can only be used with
int values. They have the range of MySQL’s SIGNED BIGINT (-9223372036854775808 to 9223372036854775807).
max_allowed_packet
MySQL has a setting called max_allowed_packet, which is the maximum size of a query, including data. This therefore
constrains the size of a cached value, but you’re more likely to run up against it first with the get_many/set_many
operations.
The MySQL 8.0 default is 4MB, and the MariaDB 10.2 default is 16MB. Most applications should be fine with these
limits. You can tweak the setting as high as 1GB - if this isn’t enough, you should probably be considering another
solution!
culling
MySQL is designed to store data forever, and thus doesn’t have a direct way of setting expired rows to disappear.
The expiration of old keys and the limiting of rows to MAX_ENTRIES is therefore performed in the cache backend by
performing a cull operation when appropriate. This deletes expired keys first, then if more than MAX_ENTRIES keys
remain, it deletes 1 / CULL_FREQUENCY of them. The options and strategy are described in more detail in the Django
manual.
Django’s DatabaseCache performs a cull check on every write operation. This runs a SELECT COUNT(*) on the
table, which means a full-table scan. Naturally, this takes a bit of time and becomes a bottleneck for medium or large
cache tables. MySQLCache helps you solve this in two ways:
1. The cull-on-write behaviour is probabilistic, by default running on 1% of writes. This is set with the
CULL_PROBABILITY option, which should be a number between 0 and 1. For example, if you want to use the
same cull-on-every-write behaviour as used by DatabaseCache (you probably don’t), set CULL_PROBABILITY
to 1.0:
CACHES = {
    "default": {
        "BACKEND": "django_mysql.cache.MySQLCache",
        "LOCATION": "some_table_name",
        "OPTIONS": {"CULL_PROBABILITY": 1.0},
    }
}
2. The cull() method is available as a public method so you can set up your own culling schedule in background
processing, never affecting any user-facing web requests. Set CULL_PROBABILITY to 0, and then set up your
task. For example, if you are using celery you could use a task like this:
@shared_task
def clear_caches():
    caches["default"].cull()
    caches["other_cache"].cull()
This functionality is also available as the management command cull_mysql_caches, which you might run as
a cron job. It performs cull() on all of your MySQLCache instances, or you can give it names to just cull those.
For example, this:
CACHES = {
    "default": {
        "BACKEND": "django_mysql.cache.MySQLCache",
        "LOCATION": "some_table_name",
        "OPTIONS": {"MAX_ENTRIES": -1},
    }
}
Note that you should then of course monitor the size of your cache table well, since it has no bounds on its growth.
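The culling strategy described in this section can be modelled with a small in-memory toy. This is a sketch of the idea, not MySQLCache's code: the class name, the dict storage, the default values and the choice of which keys to drop (the first ones in sorted order) are all assumptions made for illustration.

```python
import random
import time


class ToyCullingCache:
    """In-memory model of probabilistic cull-on-write (illustration only)."""

    def __init__(self, max_entries=300, cull_frequency=3, cull_probability=0.01):
        self.max_entries = max_entries
        self.cull_frequency = cull_frequency
        self.cull_probability = cull_probability
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value, timeout=300.0, now=None):
        now = time.time() if now is None else now
        self._data[key] = (value, now + timeout)
        # Only a fraction of writes pay the cost of a cull; 0 disables it.
        if self.cull_probability and random.random() < self.cull_probability:
            self.cull(now=now)

    def cull(self, now=None):
        """Delete expired keys, then trim if still too large. Returns the count."""
        now = time.time() if now is None else now
        before = len(self._data)
        # Step 1: expired keys go first.
        self._data = {k: v for k, v in self._data.items() if v[1] > now}
        # Step 2: if more than max_entries remain, drop 1 / cull_frequency of
        # them. A negative max_entries models the unbounded case.
        if self.max_entries >= 0 and len(self._data) > self.max_entries:
            to_drop = len(self._data) // self.cull_frequency
            for key in sorted(self._data)[:to_drop]:  # assumed drop order
                del self._data[key]
        return before - len(self._data)
```

With cull_probability set to 0, set() never culls, and a background job can call cull() on its own schedule, which mirrors the second approach above.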
compression
Like the other Django cache backends, stored objects are serialized with pickle (except for integers, which are stored
as integers so that the incr() and decr() operations will work). If pickled data has a size in bytes equal to or
greater than the threshold defined by the option COMPRESS_MIN_LENGTH, it will be compressed with zlib in Python
before being stored, reducing the on-disk size in MySQL and network costs for storage and retrieval. The zlib level is
set by the option COMPRESS_LEVEL.
COMPRESS_MIN_LENGTH defaults to 5000, and COMPRESS_LEVEL defaults to the zlib default of 6. You can tune these
options - for example, to compress all objects >= 100 bytes at the maximum level of 9, pass the options like so:
CACHES = {
    "default": {
        "BACKEND": "django_mysql.cache.MySQLCache",
        "LOCATION": "some_table_name",
        "OPTIONS": {"COMPRESS_MIN_LENGTH": 100, "COMPRESS_LEVEL": 9},
    }
}
To turn compression off, set COMPRESS_MIN_LENGTH to 0. The options only affect new writes - any compressed values
already in the table will remain readable.
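The threshold rule can be sketched with the standard library alone. The function names below are hypothetical and the code is not taken from MySQLCache; it only mirrors the behaviour described above, using the documented defaults of 5000 and 6.

```python
import pickle
import zlib


def serialize(obj, compress_min_length=5000, compress_level=6):
    """Pickle obj, compressing when the pickle reaches the threshold.

    Returns (data, compressed). A threshold of 0 turns compression off.
    """
    pickled = pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)
    if compress_min_length and len(pickled) >= compress_min_length:
        return zlib.compress(pickled, compress_level), True
    return pickled, False


def deserialize(data, compressed):
    """Reverse serialize(), whatever options were in force at write time."""
    if compressed:
        data = zlib.decompress(data)
    return pickle.loads(data)


data, compressed = serialize("x" * 10_000)
print(compressed, len(data) < 10_000)  # True True
```

Because the reader is told whether each value was compressed, changing the options later does not make existing values unreadable.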
custom serialization
You can implement your own serialization by subclassing MySQLCache. It uses two methods that you should override.
Values are stored in the table with two columns - value, which is the blob of binary data, and value_type, a single
latin1 character that specifies the type of data in value. MySQLCache by default uses three codes for value_type:
• i - The blob is an integer. This is used so that counters can be deserialized by MySQL during the atomic incr()
and decr() operations.
• p - The blob is a pickled Python object.
• z - The blob is a zlib-compressed pickled Python object.
For future compatibility, MySQLCache reserves all lower-case letters. For custom types you can use upper-case letters.
The methods you need to override (and probably call super() from) are:
django_mysql.cache.encode(obj)
Takes an object and returns a tuple (value, value_type), ready to be inserted as parameters into the SQL
query.
django_mysql.cache.decode(value, value_type)
Takes the pair of (value, value_type) as stored in the table and returns the deserialized object.
Studying the source of MySQLCache will probably give you the best way to extend these methods for your use case.
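As a rough illustration of the override pattern, the sketch below adds a JSON type under the upper-case code J. BaseCacheSketch is a stand-in written for this example so that it runs without Django; it is not MySQLCache, and a real subclass would inherit from django_mysql.cache.MySQLCache instead. The choice of J and of storing dicts as JSON are assumptions.

```python
import json
import pickle


class BaseCacheSketch:
    """Stand-in with only the two hook methods (hypothetical, simplified)."""

    def encode(self, obj):
        # Integers are kept as integers; everything else is pickled.
        if isinstance(obj, int) and not isinstance(obj, bool):
            return obj, "i"
        return pickle.dumps(obj, pickle.HIGHEST_PROTOCOL), "p"

    def decode(self, value, value_type):
        if value_type == "i":
            return int(value)
        if value_type == "p":
            return pickle.loads(value)
        raise ValueError(f"Unknown value_type {value_type!r}")


class JSONCacheSketch(BaseCacheSketch):
    """Stores dicts as JSON under 'J'; defers everything else to the parent."""

    def encode(self, obj):
        if isinstance(obj, dict):
            return json.dumps(obj).encode("utf-8"), "J"
        return super().encode(obj)

    def decode(self, value, value_type):
        if value_type == "J":
            return json.loads(value.decode("utf-8"))
        return super().decode(value, value_type)


cache = JSONCacheSketch()
value, value_type = cache.encode({"colour": "red"})
print(value_type, cache.decode(value, value_type))  # J {'colour': 'red'}
```

Handling only the new code and calling super() for the rest keeps existing rows readable after the subclass is deployed.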
prefix methods
Three extension methods are available to work with sets of keys sharing a common prefix. Whilst these would not be
efficient on other cache backends such as memcached, in an InnoDB table the keys are stored in order so range scans
are easy.
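To see why ordered storage makes prefix queries cheap, consider this plain-Python sketch over a sorted list of keys. It is an analogy for an index range scan, not the SQL that MySQLCache runs, and the function name is chosen to echo the method purely for readability.

```python
import bisect


def keys_with_prefix(sorted_keys, prefix):
    """Return all keys starting with prefix from an already-sorted list."""
    # Every key with the prefix sorts at or after the prefix itself, and
    # all such keys are contiguous, so one seek plus a short walk suffices.
    start = bisect.bisect_left(sorted_keys, prefix)
    found = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        found.append(key)
    return found


keys = sorted(["author:1", "author:2", "book:1", "book:10", "zebra"])
print(keys_with_prefix(keys, "book:"))  # ['book:1', 'book:10']
```

A hash-based store such as memcached has no such ordering, so the same query would have to examine every key.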
To use these methods, it must be possible to reverse-map the “full” key stored in the database to the key you would provide
to cache.get, via a ‘reverse key function’. If you have not set KEY_FUNCTION, MySQLCache will use Django’s default
key function, and can therefore default the reverse key function too, so you will not need to add anything.
However, if you have set KEY_FUNCTION, you will also need to supply REVERSE_KEY_FUNCTION before the prefix
methods can work. For example, with a simple custom key function that ignores key_prefix and version, you
might do this:
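def my_key_func(key, key_prefix, version):
    return key  # Ignore key_prefix and version


def my_reverse_key_func(full_key):
    # my_key_func returns the key unchanged, so the full key is the key
    key = full_key
    # key_prefix and version still need to be returned
    key_prefix = None
    version = None
    return key, key_prefix, version


CACHES = {
    "default": {
        "BACKEND": "django_mysql.cache.MySQLCache",
        "LOCATION": "some_table_name",
        "KEY_FUNCTION": my_key_func,
        "REVERSE_KEY_FUNCTION": my_reverse_key_func,
    }
}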
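For comparison, Django's default key function joins the prefix, version and key with colons, which is what makes it reversible. The sketch below shows one way such a reverse function could be written; reverse_default_key is a hypothetical name, and the code assumes that the prefix contains no colon and that the version is an integer.

```python
import re


def default_key_func(key, key_prefix, version):
    """Same shape as Django's default: '<prefix>:<version>:<key>'."""
    return "%s:%s:%s" % (key_prefix, version, key)


# Prefix without colons, integer version, then the key (which may hold colons).
_FULL_KEY_RE = re.compile(r"([^:]*):(\d+):(.*)", re.DOTALL)


def reverse_default_key(full_key):
    """Split a full key back into (key, key_prefix, version)."""
    match = _FULL_KEY_RE.fullmatch(full_key)
    if match is None:
        raise ValueError(f"Not a default-format cache key: {full_key!r}")
    return match.group(3), match.group(1), int(match.group(2))


full_key = default_key_func("user:42", "mysite", 1)
print(full_key)                       # mysite:1:user:42
print(reverse_default_key(full_key))  # ('user:42', 'mysite', 1)
```

The return order (key, key_prefix, version) matches the tuple returned by my_reverse_key_func above.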
Once you’re set up, the following prefix methods can be used:
django_mysql.cache.delete_with_prefix(prefix, version=None)
Deletes all keys that start with the string prefix. If version is not provided, it will default to the VERSION
setting. Returns the number of keys that were deleted. For example:
Note: This method does not require you to set the reverse key function.
django_mysql.cache.get_with_prefix(prefix, version=None)
Like get_many, returns a dict of key to value for all keys that start with the string prefix. If version is not
provided, it will default to the VERSION setting. For example:
django_mysql.cache.keys_with_prefix(prefix, version=None)
Returns a set of all the keys that start with the string prefix. If version is not provided, it will default to the
VERSION setting. For example:
12.1.4 Changes
Initially, in Django-MySQL version 0.1.10, MySQLCache did not force the columns to use case sensitive collations;
in version 0.2.0 this was fixed. You can upgrade by adding a migration with the following SQL, if you replace
yourtablename:
Or as a reversible migration:
class Migration(migrations.Migration):
    dependencies = []
    operations = [
        migrations.RunSQL(
THIRTEEN
LOCKS
try:
    with Lock("my_unique_name", acquire_timeout=2.0):
        mutually_exclusive_process()
except TimeoutError:
    print("Could not get the lock")
For more information on user locks refer to the GET_LOCK documentation on MySQL or MariaDB.
Warning: As the documentation warns, user locks are unsafe to use if you have replication running and
your replication format (binlog_format) is set to STATEMENT. Most environments have binlog_format
set to MIXED because it can be more performant, but do check.
name
This is a required argument.
Specifies the name of the lock. Since user locks share a global namespace on the MySQL server, it will
automatically be prefixed with the name of the database you use in your connection from DATABASES
and a full stop, in case multiple apps are using different databases on the same server.
MySQL enforces a maximum length on the total name (including the DB prefix that Django-MySQL adds)
of 64 characters. MariaDB doesn’t enforce any limit. The practical limit on MariaDB is maybe 1 million
characters or more, so most sane uses should be fine.
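To make the length limit concrete, here is a minimal sketch of how the prefixed name could be checked before use. The helper names are hypothetical and not part of Django-MySQL. Only the format (database name, a full stop, then the lock name) and the 64 character MySQL limit come from the description above.

```python
MYSQL_MAX_LOCK_NAME_LENGTH = 64  # limit stated above for MySQL


def full_lock_name(db_name, name):
    """Join the database name and the lock name with a full stop."""
    return f"{db_name}.{name}"


def fits_mysql_limit(db_name, name):
    """Return True if the prefixed name is within MySQL's 64 characters."""
    return len(full_lock_name(db_name, name)) <= MYSQL_MAX_LOCK_NAME_LENGTH


short_name = full_lock_name("mydb", "my_unique_name")
# "mydb." takes 5 characters, so a 60 character name is one too many.
too_long_fits = fits_mysql_limit("mydb", "x" * 60)
```

A check like this is only relevant on MySQL, since MariaDB does not enforce the limit.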
acquire_timeout=10.0
The time in seconds to wait to acquire the lock, as will be passed to GET_LOCK(). Defaults to 10 seconds.
using=None
The connection alias from DATABASES to use. Defaults to Django’s DEFAULT_DB_ALIAS to use your main
database connection.
is_held()
Returns True iff a query to IS_USED_LOCK() reveals that this lock is currently held.
holding_connection_id()
Returns the MySQL CONNECTION_ID() of the holder of the lock, or None if it is not currently held.
acquire()
For using the lock as a plain object rather than a context manager, similar to threading.Lock.acquire.
Note you should normally use try / finally to ensure unlocking occurs.
Example usage:
lock = Lock("my_unique_name")
lock.acquire()
try:
    mutually_exclusive_process()
finally:
    lock.release()
release()
Also for using the lock as a plain object rather than a context manager, similar to threading.Lock.
release. For example, see above.
classmethod held_with_prefix(prefix, using=DEFAULT_DB_ALIAS)
Queries the held locks that match the given prefix, for the given database connection. Returns a dict of lock
names to the CONNECTION_ID() that holds the given lock.
Example usage:
>>> Lock.held_with_prefix("Author")
{'Author.1': 451, 'Author.2': 457}
Note: Works with MariaDB 10.0.7+ only, when the metadata_lock_info plugin is loaded. You can
install this in a migration using the InstallSOName operation, like so:
from django.db import migrations
from django_mysql.operations import InstallSOName


class Migration(migrations.Migration):
    dependencies = []

    operations = [
        # Install https://mariadb.com/kb/en/mariadb/metadata_lock_info/
        InstallSOName("metadata_lock_info")
    ]
release()
Also for using the lock as a plain object rather than a context manager, similar to threading.Lock.
release. For example, see above.
Note: Transactions are not allowed around table locks, and an error will be raised if you try to use one inside
of a transaction. A transaction is created to hold the locks in order to cooperate with InnoDB. There are a number
of things you can’t do whilst holding a table lock, for example accessing tables other than those you have locked
- see the MySQL/MariaDB documentation for more details.
Note: Table locking works on InnoDB tables only if innodb_table_locks is set to 1. This is the default,
but may have been changed for your environment.
FOURTEEN
STATUS
MySQL gives you metadata on the server status through its SHOW GLOBAL STATUS and SHOW SESSION STATUS
commands. These classes make it easy to get this data, as well as providing utility methods to react to it.
The following can all be imported from django_mysql.status.
class django_mysql.status.GlobalStatus(name, using=None)
Provides easy access to the output of SHOW GLOBAL STATUS. These statistics are useful for monitoring purposes,
or ensuring queries your code creates aren’t saturating the server.
Basic usage:
Note that global_status is a pre-existing instance for the default database connection from DATABASES. If
you’re using more than one database connection, you should instantiate the class:
To see the names of all the available variables, refer to the documentation: MySQL / MariaDB. They vary based
upon server version, plugins installed, etc.
using=None
The connection alias from DATABASES to use. Defaults to Django’s DEFAULT_DB_ALIAS to use your main
database connection.
get(name)
Returns the current value of the named status variable. The name may not include SQL wildcards (%). If it
does not exist, KeyError will be raised.
The result set for SHOW STATUS returns values in strings, so numbers and booleans will be cast to their
respective Python types - int, float, or bool. Strings are left as-is.
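The casting described for get() can be pictured with the sketch below. It is an illustration only, not the library's code: cast_status_value is a hypothetical helper, and treating "ON" and "OFF" as the boolean spellings is an assumption about how the server reports such variables.

```python
def cast_status_value(value):
    """Cast a SHOW STATUS string to int, float, or bool where possible."""
    # Try the narrowest numeric type first.
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        pass
    # Assumption: booleans arrive as the strings "ON" and "OFF".
    if value == "ON":
        return True
    if value == "OFF":
        return False
    # Anything else is left as the original string.
    return value


threads = cast_status_value("12")
load = cast_status_value("0.75")
running = cast_status_value("ON")
version = cast_status_value("wsrep_25.10")
```

Trying int before float keeps whole-number counters as integers rather than turning them into floats.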
get_many(names)
Returns a dictionary of names to current values, fetching them in a single query. The names may not include
wildcards (%).
read_operations = session_status.get("Handler_read")
replica1_reads = SessionStatus(using="replica1").get("Handler_read")
FIFTEEN
MANAGEMENT COMMANDS
MySQL-specific management commands. These are automatically available with your manage.py when you add
django_mysql to your INSTALLED_APPS.
Outputs your database connection parameters in a form suitable for inclusion in other CLI commands, helping avoid
copy/paste errors and accidental copying of passwords to shell history files. It can output parameters in two
formats: flags for mysql-related tools, or the DSN format that some Percona tools take. For example:
If the database alias is given, it should be the alias of a connection from the DATABASES setting; it defaults to ‘default’. Only
MySQL connections are supported - the command will fail for other connection vendors.
Mutually exclusive format flags:
15.1.1 --mysql
Which will translate to include all the relevant flags, including your database.
15.1.2 --dsn
Outputs the parameters in the DSN format, which is what many Percona tools take, e.g.:
Note: If you are using SSL to connect, the Percona tools don’t support SSL configuration in their DSN
format; you must pass it via a MySQL configuration file instead. dbparams will output a warning on stderr if this
is the case. For more info see the Percona blog.
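The two output formats can be approximated with the sketch below. It is not the implementation of dbparams: the function names, the key order, and the example settings are all assumptions, and it does no shell quoting and ignores sockets and SSL. The flag names are those of the standard mysql client, and the single-letter DSN keys follow the Percona Toolkit convention.

```python
def to_mysql_flags(settings_dict):
    """Build mysql-client style flags, with the database name last."""
    flag_names = [
        ("USER", "--user"),
        ("PASSWORD", "--password"),
        ("HOST", "--host"),
        ("PORT", "--port"),
    ]
    # Skip any setting that is missing or empty.
    args = [
        f"{flag}={settings_dict[key]}"
        for key, flag in flag_names
        if settings_dict.get(key)
    ]
    if settings_dict.get("NAME"):
        args.append(settings_dict["NAME"])
    return " ".join(args)


def to_percona_dsn(settings_dict):
    """Build a comma-separated key=value DSN string."""
    dsn_keys = [
        ("USER", "u"),
        ("PASSWORD", "p"),
        ("HOST", "h"),
        ("PORT", "P"),
        ("NAME", "D"),
    ]
    return ",".join(
        f"{dsn_key}={settings_dict[key]}"
        for key, dsn_key in dsn_keys
        if settings_dict.get(key)
    )


# Hypothetical connection settings, shaped like an entry in DATABASES.
example = {
    "NAME": "mydatabase",
    "USER": "ausername",
    "PASSWORD": "apassword",
    "HOST": "ahost.example.com",
    "PORT": "3306",
}
flags = to_mysql_flags(example)
dsn = to_percona_dsn(example)
```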
SIXTEEN
TEST UTILITIES
from django.test import TestCase
from django_mysql.test.utils import override_mysql_variables


@override_mysql_variables(SQL_MODE="MSSQL")
class MyTests(TestCase):
    def test_it_works_in_mssql(self):
        run_it()

    @override_mysql_variables(SQL_MODE="ANSI")
    def test_it_works_in_ansi_mode(self):
        run_it()
During the first test, the SQL_MODE will be MSSQL, and during the second, it will be ANSI; each slightly changes
the allowed SQL syntax, meaning they are useful to test.
Note: This only sets the system variables for the session, so if the tested code closes and re-opens the database
connection the change will be reset.
django_mysql.test.utils.using
The connection alias to set the system variables for, defaults to ‘default’.
SEVENTEEN
EXCEPTIONS
Various exception classes that can be raised by django_mysql code. They can be imported from the
django_mysql.exceptions module.
exception django_mysql.exceptions.TimeoutError
Indicates a database operation timed out in some way.
EIGHTEEN
CONTRIBUTING
1. Install tox.
2. Run a supported version of MySQL or MariaDB. This is easiest with the official Docker images. For example:
3. Run the tests by passing environment variables with your connection parameters. For the above Docker
command:
You can also pass other pytest arguments after the --.
NINETEEN
CHANGELOG
• Make MySQLCache.touch() return True if the key was touched, False otherwise. This return value was
missing since the method was added for Django 2.1.
• Fix a bug where set fields’ contains lookups would put SQL parameters in the wrong order.
• Remove deprecated database functions which exist in Django 3.0+:
– Sign
– MD5
– SHA1
– SHA2.
• Drop pt_fingerprint(). Its complicated threading code leaked processes. Switch to calling
pt-fingerprint directly with subprocess.run() instead.
• Add model FixedCharField for storing fixed width strings using a CHAR type.
Thanks to Caleb Ely in PR #883.
• Fix query rewriting to install for recreated database connections. (Issue #677)
• Update Python support to 3.5-3.7, as 3.4 has reached its end of life.
• Always cast SQL params to tuples in ORM code.
• Remove authors file and documentation page. This was showing only 4 out of the 17 total contributors.
• Tested on Django 2.2. No changes were needed for compatibility.
• Remove universal wheel. Version 3.0.0 has been pulled from PyPI after being up for 3 hours to fix mistaken
installs on Python 2.
• Drop Django 1.8, 1.9, and 1.10 support. Only Django 1.11+ is supported now.
• Django 2.1 compatibility - no code changes were required, releasing for PyPI trove classifiers and documentation.
• Added JSONArrayAppend database function that wraps the respective JSON-modifying function from MySQL
5.7.
• Fixed some crashes from DynamicField instances without explicit spec definitions.
• Fixed a crash in system checks for ListCharField and SetCharField instances missing max_length.
• Fixed JSONField model field string serialization. This is a small backwards incompatible change.
Storing strings mostly used to crash with MySQL error -1 “error totally whack”, but in the case your string was
valid JSON, it would store it as a JSON object at the MySQL layer and deserialize it when returned. For example
you could do this:
The new behaviour now correctly returns what you put in:
>>> mymodel.attrs
'{"foo": "bar"}'
• Removed the connection.is_mariadb monkey patch. This is a small backwards incompatible change. Instead
of using it, use django_mysql.utils.connection_is_mariadb.
• Only use Django’s vendored six (django.utils.six). Fixes usage of EnumField and field lookups when six
is not installed as a standalone package.
• Added JSONInsert, JSONReplace and JSONSet database functions that wrap the respective JSON-modifying
functions from MySQL 5.7.
• Fixed JSONField to work with Django’s serializer framework, as used in e.g. dumpdata.
• Fixed JSONField form field so that it doesn’t overquote inputs when redisplaying the form due to invalid user
input.
• Fixed some features to work when there are non-MySQL databases configured
• Fixed JSONField to allow control characters, which MySQL does - but not in a top-level string, only inside a
JSON object/array.
• SmartChunkedIterator now fails properly for models whose primary key is a non-integer foreign key.
• pty is no longer imported at the top-level in django_mysql.utils, fixing Windows compatibility.
• Added new JSONField class backed by the JSON type added in MySQL 5.7.
• Added database functions JSONExtract, JSONKeys, and JSONLength that wrap the JSON functions added in
MySQL 5.7, which can be used with the JSON type columns as well as JSON data held in text/varchar columns.
• Added If database function for simple conditionals.
• Added manage.py command fix_datetime_columns that outputs the SQL necessary to fix any datetime
columns into datetime(6), as required when upgrading a database to MySQL 5.6+, or MariaDB 5.3+.
• SmartChunkedIterator output now includes the total time taken and number of objects iterated over in the
final message.
• Fixed EnumField so that it works properly with forms, and does not accept the max_length argument.
• SmartChunkedIterator output has been fixed for reversed iteration, and now includes a time estimate.
• Added three system checks that give warnings if the MySQL configuration can (probably) be improved.
• New function add_QuerySetMixin allows adding the QuerySetMixin to arbitrary QuerySets, for when you
can’t edit a model class.
• Added field class EnumField that uses MySQL’s ENUM data type.
• Allow approx_count on QuerySets for which only query hints have been used
• Added index query hints to QuerySet methods, via query-rewriting layer
• Added ordering parameter to GroupConcat to specify the ORDER BY clause
• Added index query hints to QuerySet methods, via query-rewriting layer
• Added sql_calc_found_rows() query hint that calculates the total rows that match when you only take a slice,
which becomes available on the found_rows attribute
• Added Regexp database functions for MariaDB - RegexpInstr, RegexpReplace, and RegexpSubstr
• Added the option to not limit the size of a MySQLCache by setting MAX_ENTRIES = -1.
• MySQLCache performance improvements in get, get_many, and has_key
• Added query-rewriting layer which allows the use of MySQL query hints such as STRAIGHT_JOIN via
QuerySet methods, as well as adding label comments to track where queries are generated.
• Added TableLock context manager
• More database functions added - Field and its complement ELT, and LastInsertId
• Case sensitive string lookup added to the ORM for CharField and TextField
• Migration operations added - InstallPlugin, InstallSOName, and AlterStorageEngine
• Extra ORM aggregates added - BitAnd, BitOr, and BitXor
• MySQLCache is now case-sensitive. If you are already using it, an upgrade ALTER TABLE and migration is
provided at the end of the cache docs.
• (MariaDB only) The Lock class gained a class method held_with_prefix to query held locks matching a
given prefix
• SmartIterator bugfix for chunks with 0 objects slowing iteration; such chunks most often occur on tables
with primary key “holes”
• Now tested against Django master for cutting edge users and forwards compatibility
• Added the MySQLCache backend for use with Django’s caching framework, a more efficient version of
DatabaseCache
• Fix a ZeroDivision error in WeightedAverageRate, which is used in smart iteration
• pt_visual_explain no longer executes the given query before fetching its EXPLAIN
• New pt_fingerprint function that wraps the pt-fingerprint tool efficiently
• For List fields, the new ListF class allows you to do atomic append or pop operations from either end of the
list in a single query
• For Set fields, the new SetF class allows you to do atomic add or remove operations from the set in a single
query
• The @override_mysql_variables decorator has been introduced which makes testing code with different
MySQL configurations easy
• The is_mariadb property gets added onto Django’s MySQL connection class automatically
• A race condition in determining the minimum and maximum primary key values for smart iteration was fixed.
• Add Set and List fields which can store comma-separated sets and lists of a base field with MySQL-specific
lookups
• Support MySQL’s GROUP_CONCAT as an aggregate!
• Add a functions module with many MySQL-specific functions for the new Django 1.8 database functions
feature
• Allow access of the global and session status for the default connection from a lazy singleton, similar to Django’s
connection object
• Fix a different recursion error on count_tries_approx
• Renamed connection_name argument to using on Lock, GlobalStatus, and SessionStatus classes, for
more consistency with Django.
• Fix recursion error on QuerySetMixin when using count_tries_approx
• Added manage.py command dbparams for outputting database parameters in formats useful for shell scripts
• Added Model and QuerySet subclasses which add the approx_count method