Python’s packaging ecosystem is one of its biggest strengths, but Windows users are often frustrated by packages that do not install properly. One of the most common errors you’ll see, for example when installing a package such as lxml from source, is “unable to find vcvarsall.bat”.
As far as errors go, “unable to find vcvarsall.bat” is not the most helpful. What is this mythical batch file? Why do I need it? Where can I get it? How do I help Python find it? When will we be freed from this pain? Let’s look at some answers to these questions.
What is vcvarsall.bat, and why do I need it?
To explain why we need this tool, we need to look at a common pattern in Python packages. One of the benefits of installing a separate package is the ability to do something that you couldn’t normally do – in many cases, something that would be completely impossible otherwise, such as image processing with Pillow, high-performance machine learning with scikit-learn, or micro-threading with greenlet. But how can these packages do things that aren’t possible in regular Python?
The answer is that they include extension modules, sometimes called native modules. Unlike Python modules, these are not .py files containing Python source code – they are .pyd files that contain native, platform-specific code, typically written in C. In many cases the extension module is an internal detail; all the classes and functions you’re actually using have been written in Python, but the tricky parts or the high-performance parts are in the extension module.
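To make the distinction concrete, here is a minimal sketch that reports whether an importable module is backed by Python source or by a compiled extension. The helper name describe_module is my own invention for illustration; it relies only on the standard library’s importlib, and on Windows the list of extension suffixes it checks includes .pyd.

```python
import importlib.machinery
import importlib.util


def describe_module(name):
    """Classify a module as 'source', 'extension', 'built-in', 'frozen',
    'other' or 'not found'. (Hypothetical helper, for illustration only.)"""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    # Modules compiled into the interpreter itself have a marker, not a path.
    if spec.origin in ("built-in", "frozen"):
        return spec.origin
    origin = spec.origin or ""
    # EXTENSION_SUFFIXES lists the native module suffixes for this platform
    # (for example '.pyd' on Windows, '.so' on Linux).
    if any(origin.endswith(s) for s in importlib.machinery.EXTENSION_SUFFIXES):
        return "extension"
    if origin.endswith(".py"):
        return "source"
    return "other"


if __name__ == "__main__":
    for module_name in ("json", "sys"):
        print(module_name, "->", describe_module(module_name))
```

Running this against a package you have installed is a quick way to see whether it carries native code that would need a compiler when no wheel is available.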
When you see “unable to find vcvarsall.bat”, it means you’re installing a package that contains an extension module but ships only its source code. “vcvarsall.bat” is a batch file that comes with the Visual Studio compiler and sets up the environment needed to compile the module.
As a Windows user, you’re probably used to downloading programs that are ready to run. This is largely due to the very impressive compatibility that Windows provides – you can take a program that was compiled twenty years ago and run it on versions of Windows that nobody had imagined at that time. However, Python comes from a very different world where every single machine can be different and incompatible. This makes it impossible to precompile programs and only distribute the build outputs, because many users will not be able to use them. So the culture is one where only source code is distributed, and every machine is set up with a compiler and the tools necessary to build extension modules on install. Because Windows has a different culture, most people do not have (or need) a compiler.
The good news is that the culture is changing. For Windows platforms, a package developer can upload wheels of their packages as well as the source code. Extension modules included in wheels have already been compiled, so you do not need a compiler on the machine you are installing onto.
When you use pip to install your package, if a wheel is available for your version of Python, it will be downloaded and extracted. For example, running pip install numpy on Python 3.5, 3.4 or 2.7 will download a prebuilt wheel – no compiler needed!
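How does pip know whether a wheel matches your version of Python? The answer is in the file name, which follows the convention defined in PEP 427. The sketch below splits a wheel file name into its tags. The function name parse_wheel_filename and the example file name are hypothetical, and pip’s real matching logic is considerably more involved than this.

```python
from collections import namedtuple

WheelTags = namedtuple("WheelTags", "distribution version python abi platform")


def parse_wheel_filename(filename):
    """Split a wheel file name into its PEP 427 components.

    The layout is: name-version[-build]-python-abi-platform.whl
    (Illustrative helper only; not how pip itself is implemented.)
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):  # six parts when a build tag is present
        raise ValueError("unexpected wheel file name: %r" % filename)
    distribution, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(distribution, version, python_tag, abi_tag, platform_tag)


if __name__ == "__main__":
    # A made-up file name for a 64-bit Windows build targeting CPython 3.5.
    print(parse_wheel_filename("example_pkg-1.0-cp35-cp35m-win_amd64.whl"))
```

A python tag such as cp35 paired with a platform tag such as win_amd64 tells you the extension modules inside were compiled for 64-bit CPython 3.5 on Windows.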
I need a package that has no wheel – what can I do?
Firstly, this is becoming an increasingly rare occurrence. The pythonwheels.com site tracks the most popular 360 packages, showing which ones have made wheels available (nearly 60% when this blog post was written). But from time to time you will encounter a package whose developer has not produced wheels.
The first thing you should do is report an issue on the project’s issue tracker, requesting (politely) that they include wheels with their releases. If the project supports Windows at all, they ought to be testing on Windows, which means they have already handled the compiler setup. (And if a project is not testing on Windows, and you care a lot about that project, maybe you should volunteer to help them out? Most projects do not have paid staff, and volunteers are always appreciated.)
If a project is not willing or able to produce wheels themselves, you can look elsewhere. For many people, using a distribution such as Anaconda or Python(x,y) is an easy way to get access to a lot of packages.
However, if you just need to get one package, it’s worth seeing if it is available on Christoph Gohlke’s Python Extension Packages for Windows page. On this page there are unofficial wheels (that is, the original projects do not necessarily endorse them) for hundreds of packages. You can download any of them and then use pip install (full path to the .whl file) to install it.
If none of these options is available, you will need to consider building the extension yourself. In many cases this is not difficult, though it does require setting up a build environment. (These instructions are adapted from Build Environment.)
First you’ll need to install the compiler toolset. Depending on which version of Python you care about, you will need to choose a different download, but all of them are freely available. The table below lists the downloads for versions of Python as far back as 2.6.
Python Version | You will need |
---|---|
3.5 and later | Visual C++ Build Tools 2015 or Visual Studio 2015 |
3.3 and 3.4 | Windows SDK for Windows 7 and .NET 4.0 (Alternatively, Visual Studio 2010 if you have access to it) |
2.6 to 3.2 | Microsoft Visual C++ Compiler for Python 2.7 |
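
If you would rather look this up programmatically, the table can be expressed as a small function. This is only a sketch of the table above; the name compiler_for is hypothetical, and the strings simply repeat the download names listed in the table.

```python
import sys


def compiler_for(major, minor):
    """Return the compiler download suggested for a given Python version.

    Mirrors the table above; illustrative helper only.
    """
    version = (major, minor)
    if version >= (3, 5):
        return "Visual C++ Build Tools 2015 or Visual Studio 2015"
    if version >= (3, 3):
        return "Windows SDK for Windows 7 and .NET 4.0 (or Visual Studio 2010)"
    if version >= (2, 6):
        return "Microsoft Visual C++ Compiler for Python 2.7"
    raise ValueError("no compiler listed for Python %d.%d" % version)


if __name__ == "__main__":
    # Look up the entry for the interpreter running this script.
    print(compiler_for(*sys.version_info[:2]))
```
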
After installing the compiler tools, you should ensure that your version of setuptools is up-to-date.
For Python 3.5 and later, installing Visual Studio 2015 is sufficient and you can now try to pip install the package again. Python 3.5 resolves a significant compatibility issue on Windows that will make it possible to upgrade the compilers used for extensions, so when a new version of Visual Studio is released, you will be able to use that instead of the current one.
For Python 2.6 through 3.2, you also don’t need to do anything else. The compiler package (though labelled for “Python 2.7”, it works for all of these versions) is detected by setuptools, and so pip install will use it when needed.
However, if you are targeting Python 3.3 or 3.4 (and do not have access to Visual Studio 2010), building is slightly more complicated. You will need to open a Visual Studio Command Prompt (selecting the x64 version if using 64-bit Python) and run set DISTUTILS_USE_SDK=1 before calling pip install.
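If you want to script that step rather than type it each time, the following sketch sets the same variable for a child pip process. It assumes the script itself is started from the Visual Studio Command Prompt, so that the compiler’s own environment variables are already present and get inherited. The function names are hypothetical.

```python
import os
import subprocess
import sys


def sdk_build_env(base_env=None):
    """Return a copy of the environment with DISTUTILS_USE_SDK set to 1.

    Assumes base_env (or os.environ) already holds the variables configured
    by the Visual Studio Command Prompt. Illustrative helper only.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DISTUTILS_USE_SDK"] = "1"
    return env


def pip_install_with_sdk(package):
    """Run 'pip install <package>' with DISTUTILS_USE_SDK=1 set.

    Returns pip's exit code (0 on success).
    """
    command = [sys.executable, "-m", "pip", "install", package]
    return subprocess.call(command, env=sdk_build_env())
```

Calling pip_install_with_sdk with a package name is then equivalent to running set DISTUTILS_USE_SDK=1 followed by pip install in the same prompt.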
If you have to install these packages on a lot of machines, I’d strongly suggest installing the wheel package first and using pip wheel (package name) to create your own wheels. Then you can install those on other machines without having to install the compilers.
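As a rough sketch of that workflow, the helpers below assemble the two pip command lines involved: one to build wheels into a directory on the machine that has the compiler, and one to install from that directory elsewhere without contacting the package index. The function names and the wheelhouse directory name are my own choices; the pip options used (--wheel-dir, --no-index and --find-links) are standard pip options.

```python
import subprocess
import sys


def build_wheel_command(package, wheel_dir="wheelhouse"):
    """Command that builds wheels for a package into wheel_dir.

    Run this on the machine that has the compiler installed.
    """
    return [sys.executable, "-m", "pip", "wheel",
            "--wheel-dir", wheel_dir, package]


def install_from_wheelhouse_command(package, wheel_dir="wheelhouse"):
    """Command that installs a package using only the wheels in wheel_dir.

    Run this on the other machines, after copying wheel_dir across.
    """
    return [sys.executable, "-m", "pip", "install",
            "--no-index", "--find-links", wheel_dir, package]


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python this_script.py <package name>
    subprocess.check_call(build_wheel_command(sys.argv[1]))
```
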
And while this sounds simple, there is a downside. Many, many packages that need a compiler also need other dependencies. For example, the lxml example we started with also requires copies of libxml2 and libxslt – more libraries that you will need to find, download, install, build, test and verify. Just because you have a compiler installed does not mean the pain ends.
When will the pain end?
The issues surrounding Python packaging are some of the most complex in our industry right now. Versioning is difficult, dependency resolution is difficult, ABI compatibility is difficult, secure hosting is difficult, and software trust is difficult. But just because these problems are difficult does not mean that they are impossible to solve, that we cannot have a viable ecosystem despite them, or that people are not actively working on better solutions.
For example, wheels are a great distribution solution for Windows and Mac OS X, but not so great on Linux due to the range of differences between installs. However, there are people actively working on making it possible to publicly distribute wheels that will work with most versions of Linux, such that soon all platforms will benefit from faster installation and no longer require a compiler for extension modules.
Most of the work solving these issues for Python goes on at the distutils-sig mailing list, and you can read the current recommendations at packaging.python.org. We are all volunteers, and so over time the discussion moves from topic to topic as people develop an interest and have time available to work on various problems. More contributors are always welcome.
But even if you don’t want to solve the really big problems, there are ways you can help. Report an issue to package maintainers who do not yet have wheels. If they don’t currently support Windows, offer to help them with testing, building, and documentation. Consider donating to projects that accept donations – these are often used to fund the software and hardware (or online services such as Appveyor) needed to support other platforms.
And always thank project maintainers who actively support Windows, Mac OS X and Linux. It is not an easy task to build, test, debug and maintain code that runs on such a diverse set of platforms. Those who take on the burden deserve our encouragement.