This article is brought with ❤ to you by Semaphore.
Introduction
Most websites we use provide an HTTP API to enable developers to access their data from their own applications. For developers utilizing the API, this usually involves making some HTTP requests to the service, and using the responses in their applications. However, this may get tedious since you have to write HTTP requests for each API endpoint you intend to use. Furthermore, when a part of the API changes, you have to edit all the individual requests you have written.
A better approach would be to use a library in your language of choice that helps you abstract away the API's implementation details. You would access the API through calling regular methods provided by the library, rather than constructing HTTP requests from scratch. These libraries also have the advantage of returning data as familiar data structures provided by the language, hence enabling idiomatic ways to access and manipulate this data.
In this tutorial, we are going to write a Python library to help us communicate with The Movie Database's API from Python code.
By the end of this tutorial, you will learn:
- How to create and test a custom library which communicates with a third-party API and
- How to use the custom library in a Python script.
Prerequisites
Before we get started, ensure you have one of the following Python versions installed:
- Python 2.7, 3.3, 3.4, or 3.5
We will also make use of the Python packages listed below:
- requests: We will use this to make HTTP requests,
- vcrpy: This will help us record HTTP responses during tests and test those responses, and
- pytest: We will use this as our testing framework.
Project Setup
We will organize our project as follows:
.
├── requirements.txt
├── tests
│   ├── __init__.py
│   ├── test_tmdbwrapper.py
│   └── vcr_cassettes
└── tmdbwrapper
    ├── __init__.py
    └── tv.py
This sets up a folder for our wrapper and one for holding the tests. The
vcr_cassettes
subdirectory inside tests
will store our recorded HTTP
interactions with The Movie Database's API.
Our project will be organized around the functionality we expect to provide in
our wrapper. For example, methods related to TV functionality will be in the
tv.py
file under the tmdbwrapper
directory.
We need to list our dependencies in the requirements.txt
file as follows.
At the time of writing, these are the latest versions. Update the version numbers
if later versions have been published by the time you are reading this.
requests==2.11.1
vcrpy==1.10.3
pytest==3.0.3
Finally, let's install the requirements and get started:
pip install -r requirements.txt
Test-driven Development
Following the test-driven development practice, we will write the tests for our application first, then implement the functionality to make the tests pass.
For our first test, let's test that our module will be able to fetch a TV show's info from TMDb successfully.
# tests/test_tmdbwrapper.py
from tmdbwrapper import TV

def test_tv_info():
    """Tests an API call to get a TV show's info"""
    tv_instance = TV(1396)
    response = tv_instance.info()
    assert isinstance(response, dict)
    assert response['id'] == 1396, "The ID should be in the response"
In this initial test, we are demonstrating the behavior we expect our complete
module to exhibit. We expect that our tmdbwrapper
package will contain a TV
class, which we can then instantiate with a TMDb TV ID.
Once we have an instance of the class, when we call the info
method, it should
return a dictionary containing the TMDb TV ID we provided under the 'id'
key.
To run the test, execute the py.test
command from the root directory.
As expected, the test will fail with an error message that should contain
something similar to the following snippet:
ImportError while importing test module '/Users/kevin/code/python/tmdbwrapper/tests/test_tmdbwrapper.py'.
'cannot import name TV'
Make sure your test modules/packages have valid Python names.
This is because the tmdbwrapper
package is empty right now. From now on, we will
write the package as we go, adding new code to fix the failing tests, adding
more tests and repeating the process until we have all the functionality we need.
Implementing Functionality in Our API Wrapper
To start with, the minimal functionality we can add at this stage is creating the TV
class inside our package.
Let's go ahead and create the class in the tmdbwrapper/tv.py
file:
# tmdbwrapper/tv.py
class TV(object):
    pass
Additionally, we need to import the TV
class in the tmdbwrapper/__init__.py
file,
which will enable us to import it directly from the package.
# tmdbwrapper/__init__.py
from .tv import TV
At this point, we should re-run the tests to see if they pass. You should now see the following error message:
> tv_instance = TV(1396)
E TypeError: object() takes no parameters
We get a TypeError. This is good. We seem to be making some progress.
Reading through the error, we can see that it occurs when we try to instantiate
the TV
class with a number.
Therefore, what we need to do next is implement a constructor for the TV
class
that takes a number. Let's add it as follows:
# tmdbwrapper/tv.py
class TV(object):
    def __init__(self, id):
        pass
As we just need the minimal viable functionality right now, we will leave the
constructor empty, but ensure that it receives self
and id
as parameters.
This id
parameter will be the TMDb TV ID that will be passed in.
Now, let's re-run the tests and see if we made any progress. We should see the following error message now:
> response = tv_instance.info()
E AttributeError: 'TV' object has no attribute 'info'
This time around, the problem is that we are calling the info method on tv_instance,
and this method does not exist. Let's add it.
# tmdbwrapper/tv.py
class TV(object):
    def __init__(self, id):
        pass

    def info(self):
        pass
After running the tests again, you should see the following failure:
> assert isinstance(response, dict)
E assert False
E + where False = isinstance(None, dict)
For the first time, it's the actual test failing, and not an error in our code.
To make this pass, we need to make the info
method return a dictionary. Let's
also pre-empt the next failure we expect. Since we know that the returned
dictionary should have an id
key, we can return a dictionary with an
'id'
key whose value will be the TMDb TV ID provided when the class is initialized.
To do this, we have to store the ID as an instance variable, in order to access
it from the info
function.
# tmdbwrapper/tv.py
class TV(object):
    def __init__(self, id):
        self.id = id

    def info(self):
        return {'id': self.id}
If we run the tests again, we will see that they pass.
Writing Foolproof Tests
You may be asking yourself why the tests are passing, since we clearly have not fetched any info from the API. Our tests were not exhaustive enough. We need to actually ensure that the correct info that has been fetched from the API is returned.
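To see concretely why the first version of the test is too weak, here is a minimal, self-contained sketch. It repeats the stub class from the previous section and runs the same two assertions against it, without any network access:

```python
class TV(object):
    """Stub from the previous section: it never contacts the API."""

    def __init__(self, id):
        self.id = id

    def info(self):
        # Echoes the ID back instead of fetching anything
        return {'id': self.id}


response = TV(1396).info()

# Both assertions from the original test hold for the stub...
assert isinstance(response, dict)
assert response['id'] == 1396

# ...even though fields a real TV show would have, such as 'name', are absent
print('name' in response)  # prints False
```

The assertions only check what we hard-coded ourselves, so they cannot tell a real API response from this placeholder.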
If we take a look at the TMDb documentation for the TV info method, we can see
that there are many additional fields returned from the TV info response, such
as poster_path, popularity, name, overview, and so on.
We can add a test to check that the correct fields are returned in the response,
and this would in turn help us ensure that our tests are indeed checking for a correct
response object back from the info
method.
For this case, we will select a handful of these properties and ensure that they are in the response. We will use pytest fixtures for setting up the list of keys we expect to be included in the response.
Our test will now look as follows:
# tests/test_tmdbwrapper.py
from pytest import fixture
from tmdbwrapper import TV

@fixture
def tv_keys():
    # Responsible only for returning the test data
    return ['id', 'origin_country', 'poster_path', 'name',
            'overview', 'popularity', 'backdrop_path',
            'first_air_date', 'vote_count', 'vote_average']

def test_tv_info(tv_keys):
    """Tests an API call to get a TV show's info"""
    tv_instance = TV(1396)
    response = tv_instance.info()
    assert isinstance(response, dict)
    assert response['id'] == 1396, "The ID should be in the response"
    assert set(tv_keys).issubset(response.keys()), "All keys should be in the response"
Pytest fixtures help us create test data that we can then use in other tests.
In this case, we create the tv_keys
fixture which returns a list of some of the
properties we expect to see in the TV response.
The fixture helps us keep our code clean, and explicitly separate the scope of the two
functions.
You will notice that the test_tv_info function now takes tv_keys as a parameter.
In order to use a fixture in a test, the test has to receive the fixture name as
an argument; pytest then passes in the fixture's return value, so we can make
assertions using the test data.
The test now helps us ensure that the keys from our fixture are a subset of the
keys present in the response.
This makes it a lot harder for us to cheat in our tests in future, as we did before.
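The subset check itself can be tried out on plain dictionaries. The sketch below is illustrative only: both responses are made up, and missing_keys is a hypothetical helper (not part of pytest or our wrapper) that shows which keys would make the assertion fail.

```python
expected_keys = ['id', 'name', 'overview', 'popularity']

# Hypothetical responses, made up for illustration only
stub_response = {'id': 1396}
fuller_response = {'id': 1396, 'name': 'Example Show',
                   'overview': '...', 'popularity': 1.0, 'extra': True}


def missing_keys(keys, response):
    """Return the expected keys that are absent from the response, sorted."""
    return sorted(set(keys) - set(response.keys()))


# The stub from earlier would fail the new assertion
assert not set(expected_keys).issubset(stub_response.keys())

# Extra keys in the response are fine; only missing ones matter
assert set(expected_keys).issubset(fuller_response.keys())

print(missing_keys(expected_keys, stub_response))
# prints ['name', 'overview', 'popularity']
```

Because issubset ignores additional keys, the test keeps passing if the API adds new fields later on.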
Running our tests again should now produce a failure with an informative message, because our response does not contain all the expected keys.
Fetching Data from TMDb
To make our tests pass, we will have to construct a dictionary object
from the TMDb API response and return that in the info
method.
Before we proceed, please ensure you have obtained an API key from TMDb. All methods require an API key, and you can request one after registering an account on TMDb. Everything the API provides is described on the API Overview page.
First, we need a requests session
that we will use for all HTTP interactions.
Since the api_key
parameter is required for all requests, we will attach it to
this session object so that we don't have to specify it every time we need to make an
API call. For simplicity, we will write this in the package's __init__.py
file.
# tmdbwrapper/__init__.py
import os
import requests

TMDB_API_KEY = os.environ.get('TMDB_API_KEY', None)

class APIKeyMissingError(Exception):
    pass

if TMDB_API_KEY is None:
    raise APIKeyMissingError(
        "All methods require an API key. See "
        "https://developers.themoviedb.org/3/getting-started/introduction "
        "for how to retrieve an authentication token from "
        "The Movie Database"
    )
session = requests.Session()
session.params = {}
session.params['api_key'] = TMDB_API_KEY

# Imported last, because tv.py imports `session` from this module
from .tv import TV
We define a TMDB_API_KEY
variable which gets the API key from the
TMDB_API_KEY
environment variable. Then, we go ahead and initialize a requests
session and provide the API key in the params
object. This means that it will
be appended as a parameter to each request we make with this session object.
If the API key is not provided, we will raise a custom APIKeyMissingError
with
a helpful error message to the user.
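The key lookup can be exercised on its own, without touching the real environment. In the sketch below, read_api_key is a hypothetical helper introduced only for this illustration (the package above does the same check at import time), and the key value is a dummy:

```python
import os


class APIKeyMissingError(Exception):
    pass


def read_api_key(environ=os.environ):
    """Hypothetical helper: return the API key, or raise if it is not set."""
    key = environ.get('TMDB_API_KEY', None)
    if key is None:
        raise APIKeyMissingError("All methods require an API key.")
    return key


# Plain dicts stand in for os.environ to keep the demo deterministic
print(read_api_key({'TMDB_API_KEY': 'dummy-key'}))

try:
    read_api_key({})
except APIKeyMissingError as error:
    print('raised: {}'.format(error))
```

Passing the environment in as a parameter is just a convenience for trying both branches; it is not something the tutorial's package requires.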
Next, we need to make the actual API request in the info
method as follows:
# tmdbwrapper/tv.py
from . import session

class TV(object):
    def __init__(self, id):
        self.id = id

    def info(self):
        path = 'https://api.themoviedb.org/3/tv/{}'.format(self.id)
        response = session.get(path)
        return response.json()
First of all, we import the session
object that we defined in the package root.
We then need to send a GET request to the TV info URL that returns details about a single TV show, given its ID.
The resulting response object is then returned as a dictionary by calling the
.json()
method on it.
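It can help to picture the URL that such a GET request ends up targeting once the session's api_key parameter is attached. The sketch below only illustrates that outcome with the standard library; build_tv_info_url is a hypothetical helper, the key is a dummy value, and requests performs the actual merging of parameters itself.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2


def build_tv_info_url(tv_id, params):
    """Hypothetical helper: the TV info path plus an encoded query string."""
    path = 'https://api.themoviedb.org/3/tv/{}'.format(tv_id)
    return '{}?{}'.format(path, urlencode(params))


url = build_tv_info_url(1396, {'api_key': 'dummy-key'})
print(url)
# prints https://api.themoviedb.org/3/tv/1396?api_key=dummy-key
```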
There's one more thing we need to do before wrapping this up. Since we are now making actual API calls, we need to take into account some API best practices. We don't want to make the API calls to the actual TMDb API every time we run our tests, since this can get you rate limited.
A better way would be to save the HTTP response the first time a request is made,
then reuse this saved response on subsequent test runs. This way, we minimize
the number of requests we need to make to the API and ensure that our tests still
have access to the correct data. To accomplish this, we will use the vcr module,
which the vcrpy package provides:
# tests/test_tmdbwrapper.py
import vcr

@vcr.use_cassette('tests/vcr_cassettes/tv-info.yml')
def test_tv_info(tv_keys):
    """Tests an API call to get a TV show's info"""
    tv_instance = TV(1396)
    response = tv_instance.info()
    assert isinstance(response, dict)
    assert response['id'] == 1396, "The ID should be in the response"
    assert set(tv_keys).issubset(response.keys()), "All keys should be in the response"
We just need to tell vcr where to store the HTTP response for the
request that will be made in any specific test. See vcr's docs for
detailed usage information.
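The record-once, replay-afterwards idea can be sketched in a few lines of standard-library Python. This is a toy illustration of the concept only, not how vcrpy is implemented; replay_or_record and fake_fetch are hypothetical names, and fake_fetch stands in for a real HTTP call.

```python
import json
import os
import tempfile


def replay_or_record(cassette_path, fetch):
    """Return the saved response if one exists, else call fetch() and save it."""
    if os.path.exists(cassette_path):
        with open(cassette_path) as cassette:
            return json.load(cassette)
    response = fetch()
    with open(cassette_path, 'w') as cassette:
        json.dump(response, cassette)
    return response


calls = []


def fake_fetch():
    # Stands in for a real HTTP request and counts how often it is invoked
    calls.append(1)
    return {'id': 1396}


cassette_path = os.path.join(tempfile.mkdtemp(), 'tv-info.json')
first = replay_or_record(cassette_path, fake_fetch)   # records to disk
second = replay_or_record(cassette_path, fake_fetch)  # replays from disk

# The "network" was only hit once, yet both calls return the same data
assert first == second
assert len(calls) == 1
```

vcrpy does the equivalent at the HTTP layer, so the code under test does not need to know that a recording is being used.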
At this point, running our tests requires that we have a TMDB_API_KEY environment
variable set, or else we'll get an APIKeyMissingError.
One way to do this is by setting it right before running the tests,
i.e. TMDB_API_KEY='your-tmdb-api-key' py.test.
Running the tests with a valid API key should have them passing.
Adding More Functions
Now that we have our tests passing, let's add some more functionality to our wrapper. Let's add the ability to return a list of the most popular TV shows on TMDb. We can add the following test:
# tests/test_tmdbwrapper.py
@vcr.use_cassette('tests/vcr_cassettes/tv-popular.yml')
def test_tv_popular(tv_keys):
    """Tests an API call to get popular TV shows"""
    response = TV.popular()
    assert isinstance(response, dict)
    assert isinstance(response['results'], list)
    assert isinstance(response['results'][0], dict)
    assert set(tv_keys).issubset(response['results'][0].keys())
Note that we are instructing vcr
to save the API response in a different file.
Each API response needs its own file.
For the actual test, we need to check that the response is a dictionary
and contains a results
key, which contains a list of TV show dictionary objects.
Then, we check the first item in the results
list to ensure it is a
valid TV info object, with a test similar to the one we used for the info
method.
To make the new tests pass, we need to add the popular
method to the TV
class.
It should make a request to the popular TV shows path, and then return
the response serialized as a dictionary.
Let's add the popular
method to the TV
class as follows:
# tmdbwrapper/tv.py
    @staticmethod
    def popular():
        path = 'https://api.themoviedb.org/3/tv/popular'
        response = session.get(path)
        return response.json()
Also, note that this is a staticmethod, which means the class doesn't need
to be instantiated for it to be used. This is because it doesn't use any instance
variables, and it's called directly on the class.
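The difference between the two kinds of methods is easy to see in isolation. In this sketch the method bodies are stand-ins that return fixed dictionaries instead of calling the API, so it runs without a session or an API key:

```python
class TV(object):
    def __init__(self, id):
        self.id = id

    def info(self):
        # Needs an instance, because it reads self.id
        return {'id': self.id}

    @staticmethod
    def popular():
        # No self parameter: nothing here depends on a particular show
        return {'results': []}


print(TV.popular())      # called on the class itself, no instance needed
print(TV(1396).info())   # called on an instance created with an ID
```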
All our tests should now be passing.
Taking Our API Wrapper for a Spin
Now that we've implemented an API wrapper, let's check if
it works by using it in a script. To do this, we will write a program that
lists out all the popular TV shows on TMDb along with their popularity rankings.
Create a file in the root folder of our project. You can name the file anything you like; ours is called testrun.py.
# testrun.py
from __future__ import print_function
from tmdbwrapper import TV

popular = TV.popular()

for number, show in enumerate(popular['results'], start=1):
    print("{num}. {name} - {pop}".format(num=number,
                                         name=show['name'],
                                         pop=show['popularity']))
If everything is working correctly, you should see an ordered list of the current popular TV shows and their popularity rankings on The Movie Database.
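If you would like to check the formatting logic without calling the API, the loop can be tried on canned data. The entries below are made up and only mimic the 'name' and 'popularity' keys used in the script; format_shows is a hypothetical helper written for this sketch.

```python
def format_shows(results):
    """Hypothetical helper: build one numbered line per show."""
    return ["{num}. {name} - {pop}".format(num=number,
                                           name=show['name'],
                                           pop=show['popularity'])
            for number, show in enumerate(results, start=1)]


# Made-up entries, not real TMDb data
sample_results = [
    {'name': 'Show A', 'popularity': 12.5},
    {'name': 'Show B', 'popularity': 9.75},
]

for line in format_shows(sample_results):
    print(line)
# prints:
# 1. Show A - 12.5
# 2. Show B - 9.75
```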
Filtering Out the API Key
Since we are saving our HTTP interactions to files on disk, and each recorded
request includes the full URL with its api_key query parameter, we risk exposing
our API key to other people, which is a Very Bad Idea™,
since other people might use it for malicious purposes. To deal with this, we
need to filter out the API key from the saved interactions.
To do this, we need to add a filter_query_parameters keyword argument to each
vcr.use_cassette decorator as follows:
@vcr.use_cassette('tests/vcr_cassettes/tv-popular.yml', filter_query_parameters=['api_key'])
This will save the API responses, but it will leave out the API key.
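To make the effect of the filter concrete, the sketch below removes a query parameter from a URL using only the standard library. It illustrates the outcome, not vcrpy's own implementation; strip_query_parameters is a hypothetical helper, and the URL uses a dummy key plus a page parameter added purely for illustration.

```python
try:
    from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode  # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit, parse_qsl  # Python 2
    from urllib import urlencode


def strip_query_parameters(url, names):
    """Return url without the query parameters listed in names."""
    parts = urlsplit(url)
    kept = [(key, value) for key, value in parse_qsl(parts.query)
            if key not in names]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))


recorded = 'https://api.themoviedb.org/3/tv/popular?api_key=dummy-key&page=1'
print(strip_query_parameters(recorded, ['api_key']))
# prints https://api.themoviedb.org/3/tv/popular?page=1
```

If cassettes were recorded before the filter was added, they still contain the key, so it is worth deleting and re-recording them.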
Continuous Testing on Semaphore CI
Lastly, let's add continuous testing to our application using Semaphore CI.
We want to ensure that our package works on various platforms and that we don't accidentally break functionality in future versions. We do this through continuous automatic testing.
Ensure you've committed everything to Git, and push your repository to GitHub or Bitbucket, which will enable Semaphore to fetch your code. Next, sign up for a free Semaphore account, if you don't have one already. Once you've confirmed your email, it's time to create a new project.
Follow these steps to add the project to Semaphore:
Once you're logged into Semaphore, navigate to your list of projects and click the "Add New Project" button:
Next, select the account where you wish to add the new project.
Select the repository that holds the code you'd like to build:
Configure your project as shown below:
Finally, wait for the first build to run.
It should fail since, as we recall, the TMDB_API_KEY environment variable is required
for the tests to run.
Navigate to the Project Settings
page of your application and add your API key
as an environment variable as shown below:
Make sure to check the Encrypt content
checkbox when adding the key to ensure
the API key will not be publicly visible.
Once you've added that and re-run the build, your tests should be passing again.
Conclusion
We have learned how to write a Python wrapper for an HTTP API by writing one ourselves. We have also seen how to test such a library, along with some best practices for doing so, such as not exposing our API keys publicly when recording HTTP responses.
Adding more methods and functionality to our API wrapper should be straightforward, since we have set up methods that should guide us if we need to add more. We encourage you to check out the API and implement one or two extra methods to practice. This should be a good starting point for writing a Python wrapper for any API out there.
Please reach out with any questions or feedback that you may have in the comments section below. You can also check out the complete code and contribute on GitHub.