Improve S3Boto Django-Storages Performance with Custom Settings
Django-storages is a great piece of software. It provides a uniform API for all kinds of file storage backends, including Amazon's S3. I use it for storing the assets (/static/ and /media/) of my Django app. However, particularly with the /static/ assets, I've noticed Firebug/YSlow complaining about performance, so I decided to dig into the settings.
First, there are some nice articles to help you decide what you need to improve performance (you can find even more with a simple search). These provided some hints on where to start.
S3Boto Settings
To my surprise, I found that there are quite a lot of settings in S3Boto's backend for django storages. You can tweak almost everything you want for your files. At the time of writing this article, here are the available settings.
A first batch of settings is used to establish the communication to the server:
- AWS API keys. You need those to connect to S3.
  - AWS_S3_ACCESS_KEY_ID or AWS_ACCESS_KEY_ID
  - AWS_S3_SECRET_ACCESS_KEY or AWS_SECRET_ACCESS_KEY
- AWS_S3_HOST (default S3Connection.DefaultHost) can be used to customise the S3 host. You probably don't need to change anything here.
- AWS_S3_USE_SSL (default True) uses SSL when communicating with S3.
- AWS_S3_PORT (default None) allows for possible customisation of the port when communicating with S3.
- AWS_QUERYSTRING_AUTH (default True) is useful for giving HTTP or browser access to resources that would normally require authentication. See here.
- AWS_QUERYSTRING_EXPIRE (default 3600) defines a timeout in seconds for the AWS_QUERYSTRING_AUTH above.
- AWS_S3_CALLING_FORMAT (default SubdomainCallingFormat()) defines the calling format for S3. This is an S3Boto component.
- AWS_S3_ENCRYPTION (default False) sets the S3 server-side encryption. If True, the object will be stored encrypted on S3.
A second batch of settings is used to specify file properties:
- AWS_REDUCED_REDUNDANCY (default False) specifies whether the file uses the Reduced Redundancy (RR) feature of S3.
- AWS_S3_FILE_OVERWRITE (default True) specifies whether files are overwritten on update. The mechanism is used by the method that retrieves a name: if overwrite is enabled, the file name is the same as the original. Otherwise, a new name is retrieved from S3Boto.
- AWS_STORAGE_BUCKET_NAME (default None) defines the bucket name. This is required to identify the bucket where files are stored.
- AWS_DEFAULT_ACL (default 'public-read') defines the per-file ACL. The accepted values seem to correspond to the Canned ACLs from S3.
- AWS_HEADERS (default {}) defines the HTTP headers returned by S3 for a file.
- AWS_IS_GZIPPED (default False) creates gzipped versions of the file on S3. It's used in conjunction with GZIP_CONTENT_TYPES.
- GZIP_CONTENT_TYPES (default ('text/css', 'application/javascript', 'application/x-javascript',)) defines the types available to gzip.
- AWS_S3_FILE_NAME_CHARSET (default 'utf-8') specifies the charset for file names. You can change it, but UTF-8 covers most characters.
- AWS_PRELOAD_METADATA (default False) allows for caching some information about a bucket's files.
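To make the AWS_IS_GZIPPED and GZIP_CONTENT_TYPES pair more concrete, here is a rough sketch of the idea using only the standard library. It is not the backend's actual code: the helper name maybe_gzip and its signature are made up for illustration, and I'm assuming the compressed body is served with a Content-Encoding header.

```python
import gzip
import io

# Same default tuple as listed above.
GZIP_CONTENT_TYPES = (
    'text/css',
    'application/javascript',
    'application/x-javascript',
)


def maybe_gzip(content, content_type, enabled=True):
    """Return (body, extra_headers) for an upload.

    Hypothetical helper: compresses `content` (bytes) only when gzip is
    enabled and the content type is listed in GZIP_CONTENT_TYPES.
    """
    if not enabled or content_type not in GZIP_CONTENT_TYPES:
        # Leave everything else (images, fonts, ...) untouched.
        return content, {}
    buffer = io.BytesIO()
    with gzip.GzipFile(fileobj=buffer, mode='wb') as gz:
        gz.write(content)
    return buffer.getvalue(), {'Content-Encoding': 'gzip'}


css = b'body { margin: 0; }\n' * 200
body, headers = maybe_gzip(css, 'text/css')
```

Repetitive text assets such as CSS and JavaScript shrink a lot this way, which is why the default list only contains those types.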
Another group is used for the storage backend API (e.g. helpers, generating the URL):
- AWS_AUTO_CREATE_BUCKET (default False) is a helper to avoid setup errors. If set to True, the bucket will be automatically created. One can imagine using it to allow for dynamic bucket creation (e.g. sharding or per-use bucket). It's also nicely similar to the get_or_create API in the ORM.
- AWS_S3_URL_PROTOCOL (default 'http:') defines the protocol used to access the files. The default is HTTP, but you can set it to HTTPS if necessary.
- AWS_LOCATION (default '') is used to normalise the URLs. One can use it to prepend a specific path like /media/my_app/images/ to the file name. Otherwise, the path name will be relative to the root of the bucket.
- AWS_S3_CUSTOM_DOMAIN (default None) allows a custom domain to be used when accessing the file from S3 (e.g. static.currio.com instead of static.currio.com.s3...com).
- AWS_S3_SECURE_URLS (default True) is used in URL building and specifies HTTPS as the protocol.
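As a rough illustration of how the URL-related settings fit together, here is a simplified sketch of composing a public file URL from the protocol, the custom domain and the location prefix. This is my own approximation, not the backend's implementation: the function build_url, the bucket name my-bucket and the fallback host pattern are assumptions, and query string authentication is left out entirely.

```python
import posixpath


def build_url(name, location='', custom_domain=None,
              url_protocol='http:', bucket='my-bucket'):
    """Hypothetical sketch of how a public file URL could be composed."""
    # AWS_LOCATION is prepended to the file name; without it the path is
    # relative to the root of the bucket.
    path = posixpath.join(location, name).lstrip('/')
    # Assumption: without a custom domain, use a subdomain-style S3 host.
    host = custom_domain or '%s.s3.amazonaws.com' % bucket
    # The protocol setting already carries the trailing colon ('http:').
    return '%s//%s/%s' % (url_protocol, host, path)


url = build_url(
    'logo.png',
    location='/media/my_app/images/',
    custom_domain='static.currio.com',
)
```

Note that the protocol value includes the colon, which matches the 'http:' default listed above.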
Purpose
Well, the point is that such settings can be tweaked, so the performance improves (better caching, gzipped files...). My settings are something similar to this:
from storages.backends.s3boto import S3BotoStorage

StaticRootS3BotoStorage = lambda: S3BotoStorage(
    bucket="static-bucket",
    reduced_redundancy=True,
    headers={
        'Cache-Control': 'max-age=2592000',
    }
)
MediaRootS3BotoStorage = lambda: S3BotoStorage(bucket="media-bucket")
This variant allows the static files to use reduced redundancy (cheaper) and have the Cache-Control header enabled. I should add gzip too. It also separates the static and media files into different buckets.
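The max-age value in the Cache-Control header is expressed in seconds, and 2592000 corresponds to 30 days. A small helper (the name cache_headers is just for illustration) makes that less of a magic number:

```python
from datetime import timedelta


def cache_headers(days):
    """Build a Cache-Control header dict for the given lifetime in days."""
    max_age = int(timedelta(days=days).total_seconds())
    return {'Cache-Control': 'max-age=%d' % max_age}


# 30 days gives the same value as the headers used for the static storage.
STATIC_HEADERS = cache_headers(30)
```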
The lambda expression takes advantage of the implementation, where all the above settings have their own variables that can be populated from the constructor. This approach allows per-storage parameters; defining the above as Django settings would result in the same settings for all S3 storages.
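To show why the factory approach keeps parameters separate, here is a minimal sketch that swaps the real backend for a stand-in class. StubStorage is purely hypothetical and only mimics the three constructor arguments used in my storages; I'm also assuming the storage referenced by a dotted path is simply called with no arguments.

```python
class StubStorage(object):
    """Stand-in for a storage backend; NOT the real S3BotoStorage."""

    def __init__(self, bucket=None, headers=None, reduced_redundancy=False):
        self.bucket = bucket
        self.headers = headers or {}
        self.reduced_redundancy = reduced_redundancy


# Each factory is a zero-argument callable, so it can be called exactly
# like a storage class while carrying its own parameters.
StaticStorage = lambda: StubStorage(
    bucket='static-bucket',
    reduced_redundancy=True,
    headers={'Cache-Control': 'max-age=2592000'},
)
MediaStorage = lambda: StubStorage(bucket='media-bucket')

static_storage = StaticStorage()
media_storage = MediaStorage()
```

The media storage ends up with no extra headers and no reduced redundancy, even though the static one has both.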
The S3-specific settings look like this:
import os

AWS_ACCESS_KEY_ID = os.environ.get('AWS_ACCESS_KEY_ID', None)
AWS_SECRET_ACCESS_KEY = os.environ.get('AWS_SECRET_ACCESS_KEY', None)
AWS_S3_SECURE_URLS = False

# Enable S3 deployment only if we have the AWS keys
#
S3_DEPLOYMENT = AWS_ACCESS_KEY_ID is not None
if S3_DEPLOYMENT:
    MEDIA_ROOT = '/media/'
    STATIC_ROOT = '/static/'
    AWS_STORAGE_BUCKET_NAME = 'app'
    DEFAULT_FILE_STORAGE = 'app.s3utils.MediaRootS3BotoStorage'
    STATICFILES_STORAGE = 'app.s3utils.StaticRootS3BotoStorage'
    THUMBNAIL_DEFAULT_STORAGE = 'app.s3utils.MediaRootS3BotoStorage'
    STATIC_URL = "http://static-bucket.s3-website-us-east-1.amazonaws.com/"
    MEDIA_URL = "http://media-bucket.s3-website-us-east-1.amazonaws.com/"
else:
    # Log that S3 is disabled
    pass
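One possible way to fill in the "log that S3 is disabled" branch is to move the decision into a small function that takes the environment as a parameter, which also makes it easy to try out without touching real environment variables. The function name storage_settings and the log message are my own choices; the storage paths are the ones from the settings above.

```python
import logging

logger = logging.getLogger(__name__)


def storage_settings(environ):
    """Return S3 storage settings, or an empty dict when no key is set."""
    if environ.get('AWS_ACCESS_KEY_ID') is None:
        logger.warning('AWS_ACCESS_KEY_ID not set; S3 storage is disabled')
        return {}
    return {
        'DEFAULT_FILE_STORAGE': 'app.s3utils.MediaRootS3BotoStorage',
        'STATICFILES_STORAGE': 'app.s3utils.StaticRootS3BotoStorage',
    }


# With an empty environment nothing is enabled.
disabled = storage_settings({})
# With a (fake) key the S3 storages are selected.
enabled = storage_settings({'AWS_ACCESS_KEY_ID': 'dummy-key'})
```

In a settings module, something like globals().update(storage_settings(os.environ)) would apply the result.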
HTH,