Docker has matured a lot since it was released nearly 4 years ago. We’ve been watching it closely at Caktus, and have been thrilled by the adoption -- both by the community and by service providers. As a team of Python and Django developers, we’re always searching for best-of-breed deployment tools. Docker is a clear fit for packaging the underlying code for many projects, including the Python and Django apps we build at Caktus.
Technical overview
There are many ways to containerize a Python/Django app, no one of which could be considered “the best.” That being said, I think the following approach provides a good balance of simplicity, configurability, and container size. The specific tools I’ll be using are: Docker (of course), Alpine Linux, and uWSGI.
Alpine Linux is a simple, lightweight Linux distribution based on musl libc and Busybox. Its main claim to fame on the container landscape is that it can create a very small (5MB) Docker image. Typically one’s application will be much larger than that after the code and all dependencies have been included, but the container will still be much smaller than if based on a general-purpose Linux distribution.
There are many WSGI servers available for Python, and we use both Gunicorn and uWSGI at Caktus. A couple of the benefits of uWSGI are that (1) it’s almost entirely configurable through environment variables (which fits well with containers), and (2) it includes native HTTP support, which can circumvent the need for a separate HTTP server like Apache or Nginx, provided static files are hosted on a 3rd-party CDN such as Amazon S3.
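As a quick illustration of that environment-variable configuration (using the same placeholder wsgi.py path as the Dockerfile below), the two invocations in this sketch are equivalent: uWSGI treats each UWSGI_*-prefixed variable as the matching command-line option.

# Configuring uWSGI on the command line...
/venv/bin/uwsgi --http :8000 --wsgi-file my_project/wsgi.py --master --workers 2

# ...or via environment variables, the style used in the Dockerfile below
UWSGI_HTTP=:8000 UWSGI_WSGI_FILE=my_project/wsgi.py UWSGI_MASTER=1 UWSGI_WORKERS=2 /venv/bin/uwsgi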
The Dockerfile
Without further ado, here’s a production-ready Dockerfile you can use as a starting point for your project (it should be added to your top-level project directory, or whichever directory contains the Python package(s) provided by your application):
FROM python:3.5-alpine

# Copy in your requirements file
ADD requirements.txt /requirements.txt

# OR, if you’re using a directory for your requirements, copy everything (comment out the above and uncomment this if so):
# ADD requirements /requirements

# Install build deps, then run `pip install`, then remove unneeded build deps all in a single step. Correct the path to your production requirements file, if needed.
RUN set -ex \
    && apk add --no-cache --virtual .build-deps \
            gcc \
            make \
            libc-dev \
            musl-dev \
            linux-headers \
            pcre-dev \
            postgresql-dev \
    && pyvenv /venv \
    && /venv/bin/pip install -U pip \
    && LIBRARY_PATH=/lib:/usr/lib /bin/sh -c "/venv/bin/pip install -r /requirements.txt" \
    && runDeps="$( \
            scanelf --needed --nobanner --recursive /venv \
                    | awk '{ gsub(/,/, "\nso:", $2); print "so:" $2 }' \
                    | sort -u \
                    | xargs -r apk info --installed \
                    | sort -u \
            )" \
    && apk add --virtual .python-rundeps $runDeps \
    && apk del .build-deps

# Copy your application code to the container (make sure you create a .dockerignore file if any large files or directories should be excluded)
RUN mkdir /code/
WORKDIR /code/
ADD . /code/

# uWSGI will listen on this port
EXPOSE 8000

# Add any custom, static environment variables needed by Django or your settings file here:
ENV DJANGO_SETTINGS_MODULE=my_project.settings.deploy

# uWSGI configuration (customize as needed):
ENV UWSGI_VIRTUALENV=/venv UWSGI_WSGI_FILE=my_project/wsgi.py UWSGI_HTTP=:8000 UWSGI_MASTER=1 UWSGI_WORKERS=2 UWSGI_THREADS=8 UWSGI_UID=1000 UWSGI_GID=2000

# Call collectstatic (customize the following line with the minimal environment variables needed for manage.py to run):
RUN DATABASE_URL=none /venv/bin/python manage.py collectstatic --noinput

# Start uWSGI
CMD ["/venv/bin/uwsgi", "--http-auto-chunked", "--http-keepalive"]
We extend from the Alpine flavor of the official Docker image for Python 3.5, copy the folder containing our requirements files to the container, and then, in a single line, (a) install the OS dependencies needed, (b) pip install the requirements themselves (edit this line to match the location of your requirements file, if needed), (c) scan our virtual environment for any shared libraries linked to by the requirements we installed, and (d) remove the C compiler and any other OS packages no longer needed, except those identified in step (c) (this approach, using scanelf, is borrowed from the underlying 3.5-alpine Dockerfile). It’s important to keep this all on one line so that Docker will cache the entire operation as a single layer.
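For contrast, here’s a sketch (for illustration only, not something to copy) of the same work split across multiple RUN instructions. Each RUN produces its own image layer, so the compiler and headers installed by the first instruction remain part of the final image even though a later instruction deletes them:

# Anti-pattern: the build deps stay baked into the first layer despite the later apk del
RUN apk add --no-cache --virtual .build-deps gcc make libc-dev musl-dev linux-headers pcre-dev postgresql-dev
RUN pyvenv /venv && /venv/bin/pip install -U pip && /venv/bin/pip install -r /requirements.txt
RUN apk del .build-deps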
You’ll notice I’ve only included a minimal set of OS dependencies here. If this is an established production app, you’ll most likely need to visit https://pkgs.alpinelinux.org/packages, search for the Alpine Linux package names of the OS dependencies you need, including the -dev supplemental packages as needed, and add them to the list above.
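For example, if your requirements include Pillow (jpeg-dev, zlib-dev) or lxml (libxml2-dev, libxslt-dev), the apk add portion of the RUN instruction above might grow roughly like this; this is a sketch, and the exact package names depend on your own requirements file:

&& apk add --no-cache --virtual .build-deps \
        gcc \
        make \
        libc-dev \
        musl-dev \
        linux-headers \
        pcre-dev \
        postgresql-dev \
        jpeg-dev \
        zlib-dev \
        libxml2-dev \
        libxslt-dev \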
Next, we copy our application code to the image, set some default environment variables, and run collectstatic. Be sure to change the values for DJANGO_SETTINGS_MODULE and UWSGI_WSGI_FILE to the correct paths for your application (note that the former requires a Python package path, while the latter requires a file system path). In the event you’re not serving static media directly from the container (e.g., with Whitenoise), the collectstatic command can also be removed.
Finally, the --http-auto-chunked and --http-keepalive options to uWSGI are needed in the event the container will be hosted behind an Amazon Elastic Load Balancer (ELB), because Django doesn’t set a valid Content-Length header by default, unless the ConditionalGetMiddleware is enabled. See the note at the end of the uWSGI documentation on HTTP support for further detail.
Building and testing the container
Now that you have the essentials in place, you can build your Docker image locally as follows:
docker build -t my-app .
This will go through all the commands in your Dockerfile, and if successful, store an image with your local Docker server that you could then run:
docker run -e DATABASE_URL=none -t my-app
This command is merely a smoke test to make sure uWSGI runs, and won’t connect to a database or any other external services.
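If you want to poke at the running container rather than just watch it start, you can also publish the uWSGI port and point it at a real database (the connection string below is a placeholder):

# Publish port 8000 and supply a real DATABASE_URL (placeholder shown)
docker run -e DATABASE_URL=postgres://user:password@your-db-host/your-db -p 8000:8000 -t my-app

# Then, from another shell, confirm uWSGI is answering
curl -I http://localhost:8000/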
Running commands during container start-up
As an optional final step, I recommend creating an ENTRYPOINT script to run commands as needed during container start-up. This will let us accomplish any number of things, such as making sure Postgres is available or running migrate or collectstatic during container start-up. Note that the Postgres check below uses the psql command-line client, which the Dockerfile above does not install; if you rely on it, add Alpine’s postgresql-client package as a permanent package (outside the .build-deps group). Save the following to a file named docker-entrypoint.sh in the same directory as your Dockerfile:
#!/bin/sh
set -e

until psql $DATABASE_URL -c '\l'; do
    >&2 echo "Postgres is unavailable - sleeping"
    sleep 1
done

>&2 echo "Postgres is up - continuing"

if [ "x$DJANGO_MANAGEPY_MIGRATE" = 'xon' ]; then
    /venv/bin/python manage.py migrate --noinput
fi

if [ "x$DJANGO_MANAGEPY_COLLECTSTATIC" = 'xon' ]; then
    /venv/bin/python manage.py collectstatic --noinput
fi

exec "$@"
Make sure the script is executable (e.g., chmod +x docker-entrypoint.sh) before building, since ADD preserves file permissions. Next, add the following line to your Dockerfile, just above the CMD statement:
ENTRYPOINT ["/code/docker-entrypoint.sh"]
This will (a) make sure a database is available (usually only needed when used with Docker Compose), (b) run outstanding migrations, if any, if DJANGO_MANAGEPY_MIGRATE is set to on in your environment, and (c) run collectstatic if DJANGO_MANAGEPY_COLLECTSTATIC is set to on in your environment. Even if you add this entrypoint script as-is, you can still choose to run migrate or collectstatic as separate steps in your deployment before releasing the new container. The main reasons to avoid running them at container start-up are if your application is highly sensitive to start-up time, or if you want to avoid any database calls as the container starts (e.g., for local testing). If you do rely on these commands being run during container start-up, be sure to set the relevant variables in your container’s environment.
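For example (placeholder values, mirroring the docker run smoke test above), you might enable migrations at start-up, or use the same entrypoint to run a one-off management command by overriding the default CMD:

# Start the container with migrations enabled at start-up
docker run -e DATABASE_URL=postgres://user:password@your-db-host/your-db \
    -e DJANGO_MANAGEPY_MIGRATE=on -p 8000:8000 my-app

# Or run migrate as a separate, one-off step (the entrypoint still waits for Postgres first)
docker run -e DATABASE_URL=postgres://user:password@your-db-host/your-db \
    my-app /venv/bin/python manage.py migrate --noinput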
Creating a production-like environment locally with Docker Compose
To run a complete copy of production services locally, you can use Docker Compose. The following docker-compose.yml will create a barebones, ephemeral, AWS-like container environment with Postgres and Redis for testing your production environment locally.
This is intended for local testing of your production environment only, and will not save data from stateful services like Postgres upon container shutdown.
version: '2'

services:
  db:
    environment:
      POSTGRES_DB: app_db
      POSTGRES_USER: app_user
      POSTGRES_PASSWORD: changeme
    restart: always
    image: postgres:9.6
    expose:
      - "5432"
  redis:
    restart: always
    image: redis:3.0
    expose:
      - "6379"
  app:
    environment:
      DATABASE_URL: postgres://app_user:changeme@db/app_db
      REDIS_URL: redis://redis
      DJANGO_MANAGEPY_MIGRATE: "on"
    build:
      context: .
      dockerfile: ./Dockerfile
    links:
      - db:db
      - redis:redis
    ports:
      - "8000:8000"
    depends_on:
      - db
      - redis
Copy this into a file named docker-compose.yml in the same directory as your Dockerfile, and then run:
docker-compose up --build -d
This downloads (or builds) and starts the three containers listed above. You can view output from the containers by running:
docker-compose logs
If all services launched successfully, you should now be able to access your application at http://localhost:8000/ in a web browser.
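From here, a few additional docker-compose commands (assuming a reasonably recent docker-compose) cover most day-to-day tasks:

# Follow the application container's logs only
docker-compose logs -f app

# Run a one-off management command inside the running app container
docker-compose exec app /venv/bin/python manage.py createsuperuser

# Stop and remove the containers (remember, nothing is persisted)
docker-compose down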
Extra: Blocking Invalid HTTP_HOST header errors with uWSGI
To avoid Django’s Invalid HTTP_HOST header errors (and prevent any such spurious requests from taking up any more CPU cycles than absolutely necessary), you can also configure uWSGI to return an HTTP 400 response immediately without ever invoking your application code. This can be accomplished by adding a command line option to uWSGI in your Dockerfile script, e.g., --route-host='^(?!www.myapp.com$) break:400' (note, the single quotes are required here, to prevent the shell from attempting to interpret the regular expression). If preferred (for example, in the event you use a different domain for staging and production), you can accomplish the same end by setting an environment variable via your hosting platform: UWSGI_ROUTE_HOST='^(?!www.myapp.com$) break:400'.
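To try this out locally before wiring it into your hosting platform, you can pass the same variable to docker run; a sketch with placeholder values (substitute your own domain and database URL), keeping the single quotes so the shell leaves the regular expression alone:

# Pass the route rule as an environment variable
docker run -e UWSGI_ROUTE_HOST='^(?!www.myapp.com$) break:400' \
    -e DATABASE_URL=postgres://user:password@your-db-host/your-db -p 8000:8000 my-app

# A request with an unexpected Host header should now get an immediate 400 without reaching Django
curl -i -H 'Host: not-my-app.example.com' http://localhost:8000/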
That concludes this high-level introduction to containerizing your Python/Django app for hosting on AWS Elastic Beanstalk (EB), Elastic Container Service (ECS), or elsewhere. Each application and Dockerfile will be slightly different, but I hope this provides a good starting point for your containers. Shameless plug: If you’re looking for a simple (and at least temporarily free) way to test your Docker containers on AWS using an Elastic Beanstalk Multicontainer Docker environment or the Elastic Container Service, check out AWS Container Basics (more on this soon). Good luck!