Django Doesn't Scale

(and what you can do about it.)

Given at OSCON 2012.

Jacob Kaplan-Moss

July 19, 2012

Transcript

  1. 2

  2. 6 DjangoCon 2008 70 Cal Henderson, Why I hate Django

    http://www.youtube.com/watch?v=i6Fr65PFqfk
  3. “Our site’s too slow!”

    front-end rendering time; database load; resource contention; DDoS attack; network latency; bugs in your software; bugs in third-party software; slow template rendering; misconfigured servers; not enough threads
  4. Data collection

    • logging: http://docs.python.org/library/logging
    • Sentry: http://sentry.readthedocs.org/
    • python-statsd: http://packages.python.org/python-statsd/
    • mmstats: http://mmstats.readthedocs.org/ and https://github.com/schmichael/django-mmstats/
    • Metrology: http://metrology.readthedocs.org
  5.

        import time
        from metrology import Metrology

        http_ok = Metrology.counter('http.ok')
        http_err = Metrology.counter('http.err')
        response_time = Metrology.histogram('request.time')

        class RequestMetricsMiddleware(object):
            def process_request(self, request):
                request._start_time = time.time()

            def process_response(self, request, response):
                response_time.update(time.time() - request._start_time)
                if 200 <= response.status_code < 400:
                    http_ok.increment()
                else:
                    http_err.increment()
                return response

            def process_exception(self, request, exception):
                http_err.increment()
  6. 14

  7. Required viewing: “Cache rules everything around me”, Jacob Burch and Noah Silas

    http://pyvideo.org/video/679
  8. Two-phased rendering

        <div id="header">
          {% load phased_tags %}
          {% phased with user %}
            Hello, {{ user.name }}
          {% endphased %}
        </div>
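
The idea behind two-phased rendering can be sketched without Django at all. The following is a minimal illustration using the standard library's `string.Template`, not the implementation behind the `phased` tag; `PAGE`, `render_first_phase`, and `render_second_phase` are hypothetical names chosen for this example. The first phase fills in the shared, cacheable parts and leaves the per-user placeholder untouched; the second phase fills in that placeholder on each request.

```python
from string import Template

# Hypothetical page with one per-user value and one shared value.
PAGE = Template(
    '<div id="header">Hello, $user_name</div>\n'
    '<div id="content">$book_count books in the catalog</div>'
)


def render_first_phase(book_count):
    """Render the shared parts and leave $user_name in place.

    safe_substitute keeps placeholders it has no value for, so the
    result is still a template and can be cached for all users.
    """
    return PAGE.safe_substitute(book_count=book_count)


def render_second_phase(cached_page, user_name):
    """Fill in the per-user parts of the cached first-phase output."""
    return Template(cached_page).substitute(user_name=user_name)
```

This sketch does no HTML escaping and assumes the first-phase output contains no stray `$` characters; a real template engine has to handle both.
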
  9. “There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors.”
  10.

        from books.models import Book
        from django.shortcuts import render

        def book_list(request):
            qs = Book.objects.all()
            return render(request, 'books.html', {'books': qs})
  11.

        from books.models import Book
        from django.shortcuts import render
        from django.views.decorators.cache import cache_page

        @cache_page(600)
        def book_list(request):
            qs = Book.objects.all()
            return render(request, 'books.html', {'books': qs})
  12. “There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors.”
  13. [Diagram: Cache, View, "cache outdated?", "Regenerate cache data", "Data changed"]

    Cache entry layout:

        key:arg:arg    -> (ttl, data)
        username:jacob -> (1342710010, "JKM")
  14. Rule of thumb: Writes are 10x as expensive as reads*

    * This isn’t actually true, but you should pretend it is anyway.
  15.

        # @task is Celery's task decorator; the slide omits the import.
        @task
        def mark_received(message_ids):
            Message.objects.filter(id__in=message_ids).update(received=True)

        def message_list(request):
            messages = Message.objects.filter(recipient=request.user)
            # Pass a plain list so the task arguments can be serialized.
            mark_received.delay(list(messages.values_list('id', flat=True)))
            return render(...)
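
The pattern on this slide, taking the write off the read path and handing it to a background worker, can be sketched with only the standard library. This swaps the task queue used on the slide for a `queue.Queue` and a thread, purely for illustration; `received`, `write_queue`, `writer`, and this version of `message_list` are hypothetical stand-ins, and a dict plays the role of the messages table.

```python
import queue
import threading

# Hypothetical stand-in for the messages table: id -> received flag.
received = {1: False, 2: False, 3: False}

write_queue = queue.Queue()


def writer():
    """Background worker: applies queued writes outside the request path."""
    while True:
        message_ids = write_queue.get()
        if message_ids is None:  # sentinel: stop the worker
            write_queue.task_done()
            break
        for message_id in message_ids:
            received[message_id] = True
        write_queue.task_done()


def message_list(user_message_ids):
    """Read path: queue the write and return without waiting for it."""
    ids = list(user_message_ids)  # materialize before handing off
    write_queue.put(ids)
    return ids


worker = threading.Thread(target=writer, daemon=True)
worker.start()
```

Unlike a real task queue, this sketch loses pending writes if the process exits, which is one reason to prefer a durable broker in production.
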
  16. ORM inefficiencies

    • Queryset cloning
      Copying a queryset requires cloning a heavy structure.
        Model.objects.filter(...).filter(...).order_by(...)
    • Model instantiation
      ~40k __init__s per second (see http://bit.ly/Muepgo).
    • Saving models
      Readable, but slow:
        m.foo = bar; m.save()
      Much much faster:
        Model.objects.filter(id=m.id).update(foo=bar)
  17. Bulk inserts

    • Horrifically slow:
        for title in big_title_list:
            Book.objects.create(title=title)
    • Faster, but still terrible:
        with transaction.commit_on_success():
            for title in big_title_list:
                Book.objects.create(title=title)
    • Fast:
        bl = [Book(title=t) for t in big_title_list]
        Book.objects.bulk_create(bl)
    • But COPY FROM wins, hands down.
      http://www.postgresql.org/docs/9.1/static/sql-copy.html
      http://initd.org/psycopg/docs/usage.html#using-copy-to-and-copy-from
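
The gap between row-at-a-time and batched inserts shows up below the ORM too. The sketch below uses the standard library's `sqlite3` module rather than Django or PostgreSQL, so it illustrates the shape of the two approaches rather than the numbers from the talk; the `book` table and the helper names are assumptions made for this example.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical book table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
    return conn


def insert_one_by_one(conn, titles):
    """One INSERT and one commit per row: the slow pattern."""
    for title in titles:
        conn.execute("INSERT INTO book (title) VALUES (?)", (title,))
        conn.commit()


def insert_bulk(conn, titles):
    """All rows in a single executemany call and a single transaction."""
    with conn:  # commits on success, rolls back on an exception
        conn.executemany(
            "INSERT INTO book (title) VALUES (?)",
            ((title,) for title in titles),
        )
```

Both functions leave the same rows in the table; timing them against an on-disk database with a large title list is a quick way to see why per-row commits hurt.
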
  18. Modern databases are incredible.

    See, e.g., Schemaless SQL, Craig Kerstiens: http://klewel.com/conferences/djangocon-2012/index.php?talkID=29