Asynchronous programming has been gaining a lot of traction in the past few years, and for good reason. Although it can be more difficult than the traditional linear style, it is also much more efficient.
For example, instead of waiting for an HTTP request to finish before continuing execution, Python's async coroutines let you submit the request and then work on other tasks waiting in a queue while the response comes back. It might take a bit more thinking to get the logic right, but you'll be able to handle a lot more work with fewer resources.
That said, the syntax and execution of asynchronous functions in Python aren't actually that hard to pick up. JavaScript is a different story, but Python handles it fairly well.
Asynchronicity seems to be a big reason why Node.js is so popular for server-side programming. Much of the code we write, especially in heavy-IO applications like websites, depends on external resources. This could be anything from a remote database call to POSTing to a REST service. As soon as you ask for any of these resources, your code is left waiting around with nothing to do.
With asynchronous programming, you allow your code to handle other tasks while waiting for these other resources to respond.
Coroutines
An asynchronous function in Python is typically called a 'coroutine', which is just a function defined with the async keyword or decorated with @asyncio.coroutine. Either of the functions below would work as a coroutine, and they are effectively equivalent in type:
import asyncio

async def ping_server(ip):
    pass

@asyncio.coroutine
def load_file(path):
    pass
These are special functions that return coroutine objects when called. If you're familiar with JavaScript Promises, you can think of this returned object almost like a Promise. Calling either of these functions doesn't actually run it; instead, a coroutine object is returned, which can then be handed to the event loop to be executed later on.
In case you ever need to check whether you're dealing with a coroutine, asyncio provides asyncio.iscoroutine(obj) (and asyncio.iscoroutinefunction(func) for checking the function itself) that does exactly this for you.
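You can see both points for yourself with a couple of lines; a minimal sketch, reusing the ping_server() coroutine from above:

import asyncio

async def ping_server(ip):
    pass

coro = ping_server('192.168.1.1')   # nothing runs yet -- we just get a coroutine object
print(coro)                         # <coroutine object ping_server at 0x...>
print(asyncio.iscoroutine(coro))    # True
coro.close()                        # discard it so Python doesn't warn that it was never awaited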
Yield from
There are a few ways to actually call a coroutine, one of which is the yield from expression. It was introduced in Python 3.3, and has been improved upon in Python 3.5 in the form of async/await (which we'll get to later).
The yield from expression can be used as follows:
import asyncio

@asyncio.coroutine
def get_json(client, url):
    file_content = yield from load_file('/Users/scott/data.txt')
As you can see, yield from is being used within a function decorated with @asyncio.coroutine. If you were to try to use yield from outside this function, you'd get an error from Python like this:
File "main.py", line 1
file_content = yield from load_file('/Users/scott/data.txt')
^
SyntaxError: 'yield' outside function
In order to use this syntax, it must be within another function (typically with the coroutine decorator).
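To make that concrete, here's a small, self-contained sketch of two generator-based coroutines where one delegates to the other with yield from. The file IO is only simulated with asyncio.sleep(), since reading a file asynchronously needs extra libraries, and actually running print_file() requires the event loop covered further down:

import asyncio

@asyncio.coroutine
def load_file(path):
    # Pretend to do some slow IO, then hand back a result.
    yield from asyncio.sleep(0.1)
    return 'contents of ' + path

@asyncio.coroutine
def print_file(path):
    # yield from drives load_file() to completion and gives us its return value.
    contents = yield from load_file(path)
    print(contents)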
Async/await
The newer and cleaner syntax is to use the async/await keywords. Introduced in Python 3.5, async is used to declare a function as a coroutine, much like what the @asyncio.coroutine decorator does. It is applied to a function by putting it at the front of the definition:
async def ping_server(ip):
    # ping code here...
    pass
To actually call this function, we use await instead of yield from, but in much the same way:
async def ping_local():
    return await ping_server('192.168.1.1')
Again, just like yield from, you can't use this outside of another coroutine, otherwise you'll get a syntax error.
In Python 3.5, both ways of calling coroutines are supported, but the async/await way is meant to be the primary syntax.
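The two styles also interoperate: an async def coroutine can await a generator-based coroutine, as long as the latter is decorated with @asyncio.coroutine. A quick sketch:

import asyncio

@asyncio.coroutine
def old_style():
    yield from asyncio.sleep(0.1)
    return 'result from the old style'

async def new_style():
    # Awaiting a decorated generator-based coroutine works just like
    # awaiting a native one.
    return await old_style()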
Running the event loop
None of the coroutine stuff I described above will matter (or work) if you don't know how to start and run an event loop. The event loop is the central point of execution for asynchronous functions, so when you want to actually execute the coroutine, this is what you'll use.
The event loop provides quite a few features to you:
- Register, execute, and cancel delayed calls (asynchronous functions)
- Create client and server transports for communication
- Create subprocesses and transports for communication with another program
- Delegate function calls to a pool of threads (see the sketch just below)
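That last item is worth a quick illustration, since it's the standard way to keep ordinary blocking code from stalling the loop. A minimal sketch, assuming a blocking helper called slow_blocking_call() (how to actually run the loop is covered next):

import asyncio
import time

def slow_blocking_call():
    # An ordinary blocking function, e.g. a call into a legacy library.
    time.sleep(1)
    return 'done'

async def run_blocking_work():
    loop = asyncio.get_event_loop()   # the loop that is running this coroutine
    # The call is handed off to the loop's default thread pool, so the event
    # loop stays free to run other coroutines until the result is ready.
    return await loop.run_in_executor(None, slow_blocking_call)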
While there are actually quite a few configurations and event loop types you can use, most of the programs you write will just need to use something like this to schedule a function:
import asyncio

async def speak_async():
    print('OMG asynchronicity!')

loop = asyncio.get_event_loop()
loop.run_until_complete(speak_async())
loop.close()
The last three lines are what we're interested in here. We start by getting the default event loop (asyncio.get_event_loop()), then schedule and run the async task, and finally close the loop when it is done running.
The loop.run_until_complete() function is actually blocking, so it won't return until all of the asynchronous methods are done. Since we're only running this on a single thread, there is no way it can move forward while the loop is in progress.
Now, you might think this isn't very useful since we end up blocking on the event loop anyway (instead of just the IO calls), but imagine wrapping your entire program in an async function, which would then allow you to run many asynchronous requests at the same time, like on a web server.
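Here's a rough sketch of that idea using asyncio.gather(), with the 'requests' simulated by asyncio.sleep():

import asyncio

async def fake_request(n):
    # Stand-in for a slow IO call, like an HTTP request.
    await asyncio.sleep(1)
    return 'response %d' % n

async def main():
    # All three requests run concurrently, so this takes roughly one
    # second in total instead of three.
    results = await asyncio.gather(fake_request(1), fake_request(2), fake_request(3))
    print(results)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
loop.close()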
You could even break the event loop off into its own thread, letting it handle all of the long IO requests while the main thread handles the program logic or UI.
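A minimal sketch of that pattern, using asyncio.run_coroutine_threadsafe() (added in Python 3.5.1) to hand work to a loop running on a background thread:

import asyncio
import threading

async def ping_server(ip):
    await asyncio.sleep(1)    # stand-in for a long IO call
    return ip + ' is up'

# Run an event loop forever on a background thread.
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

# The main thread stays free for program logic or UI; it just submits
# coroutines to the loop and gets back a concurrent.futures.Future.
future = asyncio.run_coroutine_threadsafe(ping_server('192.168.1.1'), loop)
print(future.result())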
An example
Okay, so let's see a slightly bigger example that we can actually run. The following code is a pretty simple asynchronous program that fetches JSON from Reddit, parses the JSON, and prints out the top posts of the day from /r/Python.
The first method shown, get_json(), is called by get_reddit_top() and just creates an HTTP GET request to the appropriate Reddit URL. When this is called with await, the event loop can continue on and service other coroutines while waiting for the HTTP response to come back. Once it does, the JSON is returned to get_reddit_top(), parsed, and printed out.
import asyncio
import aiohttp
import json

async def get_json(client, url):
    async with client.get(url) as response:
        assert response.status == 200
        return await response.read()

async def get_reddit_top(client):
    data = await get_json(client, 'https://www.reddit.com/r/Python/top.json?sort=top&t=day&limit=5')

    j = json.loads(data.decode('utf-8'))
    for i in j['data']['children']:
        score = i['data']['score']
        title = i['data']['title']
        link = i['data']['url']
        print(str(score) + ': ' + title + ' (' + link + ')')

loop = asyncio.get_event_loop()
client = aiohttp.ClientSession(loop=loop)
loop.run_until_complete(get_reddit_top(client))
client.close()
To run this, you'll need to install aiohttp first, which you can do with pip:
$ pip install aiohttp
Now just make sure you run it with Python 3.5 or higher, and you should get an output like this:
$ python main.py
170: How To Make Python Run As Fast As Julia (https://www.ibm.com/developerworks/community/blogs/jfp/entry/Python_Meets_Julia_Micro_Performance)
25: Python Is Not C: Take Two (https://www.ibm.com/developerworks/community/blogs/jfp/entry/Python_Is_Not_C_Take_Two?lang=en)
23: Malicious traffic detection system (https://github.com/stamparm/maltrail)
18: What is the best Python 3 library to automate SSH sessions under Windows? (https://www.reddit.com/r/Python/comments/3x17qu/what_is_the_best_python_3_library_to_automate_ssh/)
16: SSH Python 3.5 (https://www.reddit.com/r/Python/comments/3x4haz/ssh_python_35/)
Conclusion
Although Python's built-in asynchronous functionality isn't quite as smooth as JavaScript's, that doesn't mean you can't use it for interesting and efficient applications. Just take 30 minutes to learn its ins and outs and you'll have a much better sense of how you can integrate this into your own applications.
What do you think of Python's async/await? How have you used it in the past? Let us know in the comments!