Intro
Have you heard a lot about Docker recently and been wondering what all the hubbub has been about? In this tutorial, I’ll introduce Docker, explain what all the hype is about, describe the advantages it can give you, and show how you can get started using it for your development projects. I assume familiarity with concepts like processes and operating systems.
This tutorial is a write-up of a Codementor Office Hours I recently hosted. You can check out the video at the end of this tutorial.
Table of Contents
What is Docker?
What is a Container Exactly?
Vocabulary You Need To Know
What Does the Docker Eco-System Look Like?
Getting Started: Installation
How Can We Use Docker for Development?
Example Python and Flask Application
Workflow for Working with Docker
How Can We Publish This?
Watch the Office Hours Video
Conclusion
What is Docker?
The official website for Docker states:
Docker provides a way to run applications securely isolated in a container, packaged with all its dependencies and libraries. Because your application can always be run with the environment it expects right in the build image, testing and deployment is simpler than ever, as your build will be fully portable and ready to run as designed in any environment.
What does this mean in layman’s terms? If you think of your application, or more concretely, any operating system process, that process has dependencies. For example, your application may have to open up configuration files, write data to a folder, or open up network sockets. These are all the things your code assumes will be made available by the operating system. Docker provides a way of packaging your code and all these dependencies into a single unit, called a container.
Docker provides a set of tools for creating, running, and stopping containers. Essentially, we want to treat containers as the “building blocks” of software.
Some of you reading this may be wondering if this is related to virtual machines. Let’s take a look at the image below.
The biggest difference is which layer of the stack is being virtualized. With virtual machines, entire operating systems run in parallel on a single host operating system. In the Docker world, each container is just a process grouped together with its dependencies. In some sense, this is virtualization at the process level, not the operating system level.
What is a Container Exactly?
From the official website for Docker:
Docker containers wrap a piece of software in a complete filesystem that contains everything needed to run: code, runtime, system tools, system libraries — anything that can be installed on a server. This guarantees that the software will always run the same, regardless of its environment.
That may sound a bit convoluted, so hopefully a picture from RightScale will clear things up a bit.
A running container is simply a standard OS process, plus its file system dependencies and network connections. In the picture above, the leftmost diagram shows an abstract model of a process, where the green PGM box shows the main program code and MEM is the memory used by the running process. The other two boxes are a bit more advanced, but basically, they are the current state of the CPU registers and a process table used by the OS.
The next diagram shows it a bit more concretely, in a more realistic scenario where the code running in memory needs to read in data from the hard drive, like config files in /etc, shared libraries in /lib, and the binary of the process in /bin.
The last diagram shows a container — notice the boundary surrounds the file system now, with a new box called “net”. Everything in the orange box is a container, a single distributable unit that represents a running process and what it needs to run.
Great! But why bother with this at all?
The main reasons I like to force Docker into my workflow are the following:
- Allows you to break large applications into their constituent parts, which helps keep the cognitive load low and allows for easier debugging and more transparency.
- Wraps up complex application setup instructions in a way that helps document how to build and configure your software.
- If you use Docker for development as well as production, then you have identical stacks. This reduces the cognitive load of knowing how to use multiple setups. It also prevents the “it works for me” excuses for when things break.
- You do not have to worry about crazy compilation steps to install software and maintain your machine.
- Helps keep your development work independent of your OS (so an Ubuntu upgrade doesn’t start touching your development libraries, etc.)
Vocabulary You Need To Know
Whenever I am learning something new, it always helps to outline new vocabulary. Here are some you will need to know in order to fully immerse yourself into the Docker ecosystem:
Container: A runnable process and its resource dependencies grouped together into a single unit (see above)
Image: A blueprint for creating containers. Much like class versus object, or VM image versus a VM. As an example, the Redis Docker Image allows you to create one or more actual Redis containers that can/will run the Redis process and whatever dependencies it has.
Dockerfile: A text file containing instructions for building a docker image. All images have an associated Dockerfile.
Registry: An external service for storing/referencing images that you can name and version. Dockerhub is a registry provided to you by Docker, but you can set up a private one if need be.
Volume: Like a traditional Linux mount, a volume is a folder on the Docker host system that is mapped into a running container.
What Does the Docker Eco-System Look Like?
The Docker Ecosystem of tools is always in flux. There is constant development going on, so much of this may change, but here are the core components of the Docker world:
Docker Engine: This is the core of Docker. This daemon presents a REST API and is responsible for actually running containers. You will run commands via a CLI that talks directly to this daemon.
Docker Machine: This tool allows you to provision a virtual machine, on many different cloud providers or locally via VirtualBox, to run a Docker Engine. Essentially, its sole purpose is to provision engines.
Docker Swarm: Docker’s distributed container platform to cluster many Docker Machine instances together. We will not cover this in this tutorial, but will in future ones.
Docker Compose: Coordinate the creation of containers by describing how containers fit together to provide the functionality of a whole application.
I am a big fan of visual learning, so here is a diagram showing much of the Docker ecosystem:
On the left we have commands you will run from the command line, which interfaces to the Docker daemon (i.e. engine). The engine is located in a Docker Host (i.e. machine) and it’s this daemon that hosts images and creates running containers from those images. Images can be built locally to a daemon or downloaded from the registry all the way on the right.
Getting Started: Installation
For the rest of this tutorial, I will be using Docker Engine, Machine, and Compose. I will also be using VirtualBox for the local Docker machine, and Digital Ocean in the last part. To keep this tutorial shorter, I’ll link to installation notes for all of these tools:
- Docker Installation Notes
- Virtualbox Installation Notes
For Docker Machine on Linux:
$ curl -L https://github.com/docker/machine/releases/download/v0.8.2/docker-machine-`uname -s`-`uname -m` > /usr/local/bin/docker-machine && chmod +x /usr/local/bin/docker-machine
For Docker Compose on Linux:
$ curl -L "https://github.com/docker/compose/releases/download/1.9.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose && chmod +x /usr/local/bin/docker-compose
How Can We Use Docker for Development?
Now that we know all this about Docker, how can we use it for a local development environment? Let’s come up with a motivating example: we have been hired to build a small REST server. For this application, I have chosen Python and Flask as my tools, but you could use other stacks as well. This tutorial is opinionated in the tools I choose to use, but the approach should be generally applicable to other languages.
We’ll call the application Flask-Tester. To see the finished repositories, please check:
- git clone https://github.com/jquacinella/flask-tester-backend
- git clone https://github.com/jquacinella/flask-tester-docker
Example Python and Flask Application
When I develop software, I like to keep multiple repositories:
- One for the application code itself, in this case, Python and Flask code
- One for the docker orchestration of the application
To get started, run these commands (on Linux; use the equivalent for Windows/MacOSX):
mkdir code; cd code
git clone https://github.com/jquacinella/flask-tester-backend
git clone https://github.com/jquacinella/flask-tester-docker
This creates two folders inside code/, one for each repo, which we’ll start building now. As for the application itself, this is not a tutorial on Python and Flask, so we’ll only review it quickly. Here is a link to the main wsgi.py file in Github for this application. This is the initial version, which is incomplete and does not yet use Redis; the version on master has the updated code with Redis.
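To make this concrete, here is a rough sketch of what a minimal initial wsgi.py could look like (the real file lives in the flask-tester-backend repo and may differ; the route and variable names here are illustrative, though uWSGI does look for a module-level callable named application by default):
from flask import Flask

# uWSGI's default entry point is a module-level callable named "application"
application = Flask(__name__)

@application.route("/")
def index():
    # Simple placeholder endpoint
    return "Hello world!"

if __name__ == "__main__":
    # Run the development server directly (without uWSGI) for quick local checks;
    # port 5000 avoids needing root privileges for port 80
    application.run(host="0.0.0.0", port=5000)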
When it comes to applications in Docker, here are a few tips:
- Tip #1: Make your application respond to environment variables for configuration loading. We will not do this here, but will in the future (see the short sketch after these tips).
- Tip #2: Make sure your setup will reload the application on source changes, to allow live coding from your local machine.
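As a small illustration of Tip #1, configuration could be read from environment variables so the same image runs unchanged across environments. This is not part of the tutorial’s code, and the variable names below (REDIS_HOST, DEBUG) are hypothetical:
import os

# Hypothetical settings read from the environment, with sensible defaults;
# set these via `docker run -e` or the `environment:` key in a compose file
REDIS_HOST = os.environ.get("REDIS_HOST", "localhost")
DEBUG = os.environ.get("DEBUG", "false").lower() == "true"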
Workflow for Working with Docker
Now that we got that out of the way, I am going to outline how I approach development using Docker. When I start a new project, my thought process goes something like this:
- Create a docker machine to host your docker engine.
Note: Since I use Linux as my main OS, I could use a Docker engine installed locally on my laptop. However, AFAIK, using Docker on Windows or MacOSX requires a VM, so I am doing it this way to be as general as possible.
- Create an image for a container that can host and run our Python application.
- Outline volumes needed for source code and config files.
- Write a compose file to allow us to automate the creation and deletion of the application container with proper volumes.
- Write some scripts to wrap Docker compose.
Let’s get started!
Step 1: Create a Local Docker Machine with Shared Volumes
First, create a virtualbox VM called flask-tester-engine:
$ docker-machine create --driver virtualbox flask-tester-engine
To see what machines you have available, run:
$ docker-machine ls
However, we now need to tell the docker client to use this machine as its docker engine. We do this using the env command:
$ docker-machine env flask-tester-engine
$ eval $(docker-machine env flask-tester-engine)
We can now run a docker command to list all containers; you should see none right now:
$ docker ps --all
Here is what my output looks like so far:
$ docker-machine create --driver virtualbox flask-tester-engine
Running pre-create checks...
(flask-tester-engine) You are using version 4.3.36_Ubuntur105129 of VirtualBox. If you encounter issues, you might want to upgrade to version 5 at https://www.virtualbox.org
Creating machine...
(flask-tester-engine) Copying /home/james/.docker/machine/cache/boot2docker.iso to /home/james/.docker/machine/machines/flask-tester-engine/boot2docker.iso...
(flask-tester-engine) Creating VirtualBox VM...
(flask-tester-engine) Creating SSH key...
(flask-tester-engine) Starting the VM...
(flask-tester-engine) Check network to re-create if needed...
(flask-tester-engine) Waiting for an IP...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with boot2docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Checking connection to Docker...
Docker is up and running!
To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env flask-tester-engine
$ docker-machine env flask-tester-engine
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.99.101:2376"
export DOCKER_CERT_PATH="/home/james/.docker/machine/machines/flask-tester-engine"
export DOCKER_MACHINE_NAME="flask-tester-engine"
# Run this command to configure your shell:
# eval $(docker-machine env flask-tester-engine)
$ eval $(docker-machine env flask-tester-engine)
$ docker ps --all
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
Now that we created a machine, how exactly does this VM know about files on my hard-drive? How can you develop software interactively like this? The next portion here might be a bit confusing, so make sure to slow down and understand this point. Let’s look at what we have now:
As you can see, there is no way for the machine to touch your local file system yet. Technically, Docker Machine sets up the VirtualBox VM with a mount from your home directory to the VM. However, I do not like this automatic setup and prefer setting up the mount manually. See https://github.com/docker/machine/issues/1814 for more details.
The following commands will let us set up a VirtualBox shared folder (note: this is not a Docker volume):
$ VBoxManage sharedfolder add flask-tester-engine --automount --name "code" --hostpath /path/to/code
$ docker-machine ssh flask-tester-engine 'sudo mkdir --parents /mnt/code'
$ docker-machine ssh flask-tester-engine 'sudo mount -t vboxsf code /mnt/code'
$ docker-machine ssh flask-tester-engine 'ls -l /mnt/code'
drwxrwxr-x 1 root root 4096 Nov 29 19:53 flask-tester-backend
drwxrwxr-x 1 root root 4096 Nov 29 19:53 flask-tester-docker
Here is what we have now:
Step 2: Create a Docker Image For Our App
Now we need to create a container to host the application. I always think in terms of “What are the dependencies?” It’s a good question to ask when thinking about what needs to be in a container. There are always two things to think about: the process the container is wrapping and the files that process needs. The Docker philosophy is to have one process per container, but there is much discussion about this; check Phusion BaseImage for an alternative view.
In our case, what will be our central process? Our application is going to be a uWSGI process, which will start up threads running our Python Flask app. Now we need to ask, “What file dependencies do we have?”
We need:
- The Python source files
- Files related to uWSGI, including a binary and conf files
- Access to open port 80 for client connections
Docker makes it easy to create your own image by basing it on another image. In our case, the easiest thing to do is to use the standard Ubuntu image. This means the image starts off with a filesystem that is a copy of the Ubuntu Linux distribution. Given this base image, what do we need to do to get our application running?
- Install uWSGI in the container filesystem.
- Configure the container to run uWSGI as its main process.
- Get our source code in the container.
If you remember from the vocabulary section, we need to write a Dockerfile to do the tasks above. This Dockerfile will be the definition of the image. Here is what the Dockerfile will look like:
# Inspired by https://github.com/atupal/dockerfile.flask-uwsgi-nginx/blob/master/Dockerfile
# Base our image off the ubuntu docker container from Dockerhub
from ubuntu:14.04
# Update ubuntu filesystem with updates
run apt-get update
# Install python and easy_install from Ubuntu repos
run apt-get install -y build-essential python python-dev python-setuptools
# Install pip, and then use pip to install flask and uwsgi
run easy_install pip
run pip install uwsgi flask
# We are exposing a network port
expose 80
# uWSGI command that this container is running
cmd ["/usr/local/bin/uwsgi", "--http", ":80", "--wsgi-file", "/src/wsgi.py", "--py-autoreload", "1"]
For a complete reference on Dockerfile syntax, check the documentation. Each line of the Dockerfile starts with a keyword, like run or expose. The main one to know is run, which runs a command just as if you were at a Linux command line. The expose keyword tells Docker that this container will need access to a network port, and the from keyword tells Docker which image to base this container on. All of these commands are essentially setting up the container filesystem.
The last one, cmd, tells Docker that this container’s main process will be uWSGI, with CLI parameters separated into a list. Notice the --py-autoreload parameter, which will reload the uWSGI process when the source code changes.
Another thing to notice is that this container is assuming that the Python source code will be located at /src/wsgi.py.
With that being done, how do we build the image?
# In flask-tester-docker repo:
$ mkdir -p images/flask_tester
$ vim images/flask_tester/Dockerfile # content from before
$ docker build -t flask_tester:0.1 images/flask_tester # Build an image from the dockerfile
... a lot of output ...
$ docker images # We can see our image and the ubuntu image it is based on
REPOSITORY TAG IMAGE ID CREATED SIZE
flask_tester 0.1 5cb4ef863e80 2 minutes ago 439.7 MB
ubuntu 14.04 aae2b63c4946 About an hour ago 188 MB
Notice each image is identified by a name and a tag; the tag is useful for keeping track of versions, so when we update the Dockerfile later in this tutorial, we’ll update the tag from 0.1 to 0.2.
Step 3: Think about Volumes for Code and Config
Now we need to think about volumes. What else does our container need? Typically, we need to get configuration files and data or source code into the container as well. We will create a volume for source code, and we’ll use the virtual machine mount to map the directory in the VM to a /src folder in the container.
Normally, we would also have a volume for uWSGI configuration files. To keep things simple in our case, I am running uWSGI with CLI arguments only.
Why are we bothering with volumes? Can’t we copy everything into the container at build time? We could, but that would make things less dynamic: you would have to rebuild the image on every source or config change. Volumes let us update configs and source without rebuilding the image, which makes debugging much easier and allows config and source files to live in Git.
Interlude: Test the App
At this point, we can run our app! With everything we have, we should be able to test our app using docker run, which is the command to create and start a container process:
$ docker run -d --name flask_tester_container -p 80:80 -v /mnt/code/flask-tester-backend:/src flask_tester:0.1
$ docker ps --all
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0fc5c1909584 flask_tester:0.1 "/usr/local/bin/uwsgi" 7 seconds ago Up 7 seconds 0.0.0.0:80->80/tcp flask_tester_container
Remember, everything is in reference to the virtual machine, so the volume mount path refers to /mnt/code, which is mapped to the code directory on your machine. You should be able to browse to the machine’s IP (run docker-machine ip flask-tester-engine to find it) and see Hello world!
Here is what we have now:
Step 4: Docker Compose to Automate Docker Run
Great, now that we can run the app, we can write a Docker Compose script to mimic the run command, so we do not have to keep remembering the parameters. To see the compose script as it stands right now, check it out here:
version: '2'
services:
  flask_tester_app:
    container_name: flask_tester
    hostname: flask_tester
    image: flask_tester:0.1
    volumes:
      - /mnt/code/flask-tester-backend:/src
    ports:
      - "80:80"
Notice that most lines have a counterpart to the run command, including container_name, image, volumes, and ports. The version line at the top simply says to use the version 2 syntax; keep that as it is, as there is no need to use version 1 at this point. See more about Docker Compose syntax here. First, we stop and remove the old container from the run command, and then use the docker-compose command to run this file:
$ docker stop flask_tester; docker rm flask_tester
$ /usr/local/bin/docker-compose -f compose/flask_tester.yml -p local up -d
Creating network "local_default" with the default driver
Creating flask_tester
$ /usr/local/bin/docker-compose -f compose/flask_tester.yml -p local ps
Name Command State Ports
--------------------------------------------------------------------------
flask_tester /usr/local/bin/uwsgi --htt ... Up 0.0.0.0:80->80/tcp
$ docker ps --all
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9098751a9976 flask_tester:0.1 "/usr/local/bin/uwsgi" 7 seconds ago Up 7 seconds 0.0.0.0:80->80/tcp flask_tester
Step 5: Wrapper Scripts
Since the Docker Compose file has everything we need to run our app, we can create some bash scripts to make our life easier. They are just one-liners, which you can check out on the GitHub page.
Extra Credit: Adding Redis to the Mix
Uh oh, the client we are working for wants a new feature (above and beyond just printing Hello world; who saw that coming?). They want to add Redis, a NoSQL database that can store various data types, and use it to store a count of web page hits. With Docker, this isn’t a problem. We’ll have to:
- Add the Python redis library to the flask-tester container Dockerfile.
- Create a new container to house the redis process and its dependencies. We’ll use the redis image from Dockerhub.
- Update our compose file to link them together. You can reference the other container using its name, via the DNS provided by Docker.
$ … update dockerfile for app …
$ docker build -t flask_tester:0.2 images/flask_tester # Build image flask_tester:0.2
$ … update compose file …
$ ./scripts/stop_app.sh # Stop all containers for this app
$ mkdir data/redis # Create folder for persistent redis data
$ ./scripts/start_app.sh # Re-create all the containers
$ curl $(docker-machine ip flask-tester-engine) # cURL access to our app
Counter: 27
There is a lot going on here, so let’s break it down. What did we add to the Dockerfile? We added line 12:
run pip install uwsgi flask redis
All we did was add redis to the line, so the Python redis library is now installed in the filesystem of the new image.
What do we need to change in the source code? We update the code to connect to Redis and increment a counter every time an API call is made. This is great but how does the code on line 7 know to connect to Redis? We’ll see below.
What do we need to change in the Docker Compose file? Basically everything from line 13 on, which creates a new container from the Redis image provided by those who created Redis. From the documentation on the Redis image, it will save data to the /data folder. We can mount a local directory there to keep the data persistent across containers. We create that folder with the commands above. Once we rebuild the new image, compose will take care of bringing up both containers.
Back to how these two containers will communicate. We can now jump inside the process space of the container using the docker exec command to see that the containers can ping one another using their container names as DNS names:
$ docker exec -it flask_tester /bin/bash
root@flask_tester:/# ping redis
PING redis (172.18.0.2) 56(84) bytes of data.
64 bytes from flask_tester_redis.local_default (172.18.0.2): icmp_seq=1 ttl=64 time=0.142 ms
64 bytes from flask_tester_redis.local_default (172.18.0.2): icmp_seq=2 ttl=64 time=0.176 ms
^C
--- redis ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.142/0.159/0.176/0.017 ms
Therefore, the container that represents our app can simply connect to Redis using its DNS name.
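To tie this together, here is a rough sketch of how the updated wsgi.py might use that DNS name (the real code is on the master branch of flask-tester-backend and may differ; the key name "counter" and other identifiers are illustrative):
from flask import Flask
import redis

application = Flask(__name__)

# "redis" resolves to the Redis container via Docker's DNS, as shown by the ping above
counter_db = redis.StrictRedis(host="redis", port=6379, db=0)

@application.route("/")
def index():
    # Increment and return the page-hit counter stored in Redis
    hits = counter_db.incr("counter")
    return "Counter: {}".format(hits)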
How Can We Publish This?
Whew, that was a lot. Hopefully, you have learned a lot about how to do things with Docker locally and how to use it for development. However, if you wanted to push this to the web, could you? Yes.
First, we will have to create a new machine on Digital Ocean, and then switch to using that machine. We’ll then update our Dockerfile to copy the application source code directly in the build process (which still allows us to overwrite it via a volume, so we get the best of both worlds). Once we do that, we’ll run our compose script again, and everything should work.
You will have to sign up with DO and get an API key. Once you do that, you can run the following commands:
$ docker-machine create --driver digitalocean --digitalocean-access-token XXXXXXXXXX flask-tester-do
$ eval $(docker-machine env flask-tester-do)
After a few minutes, you will have a new Docker machine on your DO account. The eval command now means all your docker commands are in reference to that machine, including compose scripts.
Next, we’ll update the Dockerfile to copy the source code into the container. Herein lies a weird caveat: Docker refuses to let you reference files outside the directory of the Dockerfile. There are some reasons for this, but to get around it, we can use a slight hack and tell Linux to mount one directory inside another using the standard mount command:
$ sudo mount --bind <path/to/dir>/code/flask-tester-backend images/flask_tester/src
For Windows, you might have to symlink, or skip the two-repo layout and keep your Dockerfile closer to the source. For MacOS X, the mount command should work. Look at lines 18 and 19, where we run mkdir /src and copy our source files into the new directory created in the container.
All we need to do now is build the app, and tell compose to run it:
$ docker build -t flask_tester:0.2 images/flask_tester # Build image flask_tester:0.2
$ … update compose file …
$ ./scripts/stop_app.sh # Stop all containers for this app
$ mkdir data/redis # Create folder for persistent redis data
$ ./scripts/start_app.sh # Re-create all the containers
You should now be able to access the application on your new machine!
Watch the Office Hours Video
Conclusion
I hope you enjoyed this tutorial and see the power of using Docker for development. If you enjoyed it, stay tuned for more tutorials on how to use Docker Swarm for easy scale-out of your machines in production.