I was developing a small app as a playground and confidence builder, using Django, Heroku and S3. One of the packages it uses is easy-thumbnails.

Behaviour

I chose to use the easy-thumbnails app with S3, which the storages framework makes possible. At a glance, everything works OK: thumbnails are created and rendered correctly. However, on a thumbnail-heavy page, both I and my profiling app noticed significant load times (3s for a page with 7 thumbnails).

Drilling down, it became evident that something was checking the S3 bucket for each thumbnail (an httplib call occurred for each one).

Cause

TL;DR: If you use the default ImageField, the thumbnail storage is created by the Thumbnailer framework. Each time the storage object is created, it checks for the bucket's existence. This means a query to S3. Lots of images equals lots of queries to S3, checking for a bucket's existence.

Logic goes something like this:

  1. In thumbnail.py:

    get_thumbnailer(source)[alias]
    
  2. In easy_thumbnails.files.py, executing line 52 creates a new ThumbnailerFieldFile() object

  3. easy_thumbnails.fields.py creates a ThumbnailerField() object, which takes a thumbnail_storage parameter. This is empty in the above call, so the storage object ends up being created later

  4. In easy_thumbnails.files.py, the Thumbnailer class constructor contains:

    if not thumbnail_storage:
        thumbnail_storage = get_storage_class(
            settings.THUMBNAIL_DEFAULT_STORAGE)()
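
The effect of that constructor can be reproduced without Django or S3. The following is only a minimal sketch: FakeS3Storage, FakeThumbnailer and render_page are hypothetical stand-ins (not part of easy-thumbnails or storages), and the "bucket check" is just a counter standing in for the network call.

```python
class FakeS3Storage:
    """Hypothetical stand-in for an S3-backed storage class."""

    bucket_checks = 0  # counts simulated "does the bucket exist?" calls

    def __init__(self):
        # Assumption: constructing the storage triggers one bucket check.
        FakeS3Storage.bucket_checks += 1


class FakeThumbnailer:
    """Hypothetical stand-in mirroring the constructor logic quoted above."""

    def __init__(self, thumbnail_storage=None):
        if not thumbnail_storage:
            # No storage passed in, so a fresh one is built every time.
            thumbnail_storage = FakeS3Storage()
        self.thumbnail_storage = thumbnail_storage


def render_page(n_thumbnails, shared_storage=None):
    """Create one thumbnailer per image; return how many bucket checks ran."""
    before = FakeS3Storage.bucket_checks
    for _ in range(n_thumbnails):
        FakeThumbnailer(thumbnail_storage=shared_storage)
    return FakeS3Storage.bucket_checks - before
```

With no storage passed in, the number of checks grows with the number of thumbnails on the page; with a pre-built storage passed in, rendering triggers none.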
    

Solution

I have thought of two options:

  1. Either tweak the Thumbnailer class constructor in easy_thumbnails.files.py, or
  2. Find a way not to re-create the storage object

I ended up using something like this in my model:

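    from easy_thumbnails.fields import ThumbnailerImageField

    # get_file_path and picture_log_storage are defined elsewhere in the project
    source_image = ThumbnailerImageField(
        blank=True, null=True,
        upload_to=get_file_path,
        storage=picture_log_storage,
        thumbnail_storage=thumbnail_storage)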

where thumbnail_storage is initialised only once, at the top of the models file:

    thumbnail_storage = get_storage_class(settings.THUMBNAIL_DEFAULT_STORAGE)()

This way, the storage object is created only once per Django instance and the bucket existence is also queried only once.
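
A possible variant, which is not what I used, is to build the storage lazily on first use instead of at import time, so that importing the models module does not itself touch S3. This is only a sketch under assumptions: CountingStorage is a hypothetical stand-in for the real storage class, and get_thumbnail_storage is a made-up helper name.

```python
from functools import lru_cache


class CountingStorage:
    """Hypothetical stand-in for the real storage; counts constructions."""

    instances = 0

    def __init__(self):
        CountingStorage.instances += 1


@lru_cache(maxsize=None)
def get_thumbnail_storage():
    """Build the storage on the first call, then return the cached object."""
    # In a real project this line would presumably be
    # get_storage_class(settings.THUMBNAIL_DEFAULT_STORAGE)()
    return CountingStorage()
```

Every caller gets the same object, so the construction (and the bucket check that comes with it) still happens only once per process.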

Results

Now, a page query takes about 120ms on average instead of 3s. Wow. Much speedup.