Quantcast
Channel: Planet Python
Viewing all articles
Browse latest Browse all 22875

Sebastian Witowski: Upgrade Your Python Version

$
0
0

Here is an idea for a completely free[1] speed improvement for your code - upgrade your Python version!

I started this series of articles using Python 3.8, but today we already have version 3.11. Python 3.11 is the first version of Python that brings pretty significant speed improvements thanks to the Faster CPython project. If you never heard about it, it's something that started as Mark Shannon's idea to improve the overall performance of CPython and now a dedicated team of developers (including Guido van Rossum) is working to bring some hefty speed improvements over the next few releases.

So I decided to benchmark some Python scripts to see how much faster they can get by simply updating the Python versions. I will check out some of the examples that I described in this "Writing Faster Python" series, but also some random, computationally intensive programs.

Setup #

Here are the scripts I will take for a spin. Each link will take you to the corresponding article on that topic.

Ask for Forgiveness or Look Before You Leap - example 2, where we check if all 3 attributes exist (and they do):

# permission_vs_forgiveness.py

classBaseClass:
hello ="world"
bar ="world"
baz ="world"

classFoo(BaseClass):
pass

FOO = Foo()

# Look before you leap
deftest_permission2():
ifhasattr(FOO,"hello")andhasattr(FOO,"bar")andhasattr(FOO,"baz"):
FOO.hello
FOO.bar
FOO.baz

# Ask for forgiveness
deftest_forgiveness2():
try:
FOO.hello
FOO.bar
FOO.baz
except AttributeError:
pass

Ask for Forgiveness or Look Before You Leap - example 3, where we check for an attribute, but that attribute doesn't exist:

# permission_vs_forgiveness2.py

classBaseClass:
pass# "hello" attribute is now removed

classFoo(BaseClass):
pass

FOO = Foo()

# Look before you leap
deftest_permission3():
ifhasattr(FOO,"hello"):
FOO.hello

# Ask for forgiveness
deftest_forgiveness3():
try:
FOO.hello
except AttributeError:
pass

Find Item in a List - for loop and a generator expression for finding the first number divisible by 42 and 43. They both use count() function inside:

# find_item.py

from itertools import count

defcount_numbers():
for item in count(1):
if(item %42==0)and(item %43==0):
return item

defgenerator():
returnnext(item for item in count(1)if(item %42==0)and(item %43==0))

For Loop vs. List Comprehension - for loop and a list comprehension for creating a filtered list of numbers:

# filter_list.py

MILLION_NUMBERS =list(range(1_000_000))

deffor_loop():
output =[]
for element in MILLION_NUMBERS:
ifnot element %2:
output.append(element)
return output

deflist_comprehension():
return[number for number in MILLION_NUMBERS ifnot number %2]

Sorting Lists - list.sort() and sorted() for sorting a list of random numbers:

# sorting.py

from random import sample

# List of 1 000 000 integers randomly shuffled
MILLION_RANDOM_NUMBERS = sample(range(1_000_000),1_000_000)

deftest_sort():
random_list = MILLION_RANDOM_NUMBERS[:]
return random_list.sort()

deftest_sorted():
random_list = MILLION_RANDOM_NUMBERS[:]
returnsorted(random_list)

Remove Duplicates From a List - removing duplicates from a list with a for loop and by converting list to a set and back to a list:

# duplicates.py

from random import randrange

DUPLICATES =[randrange(100)for _ inrange(1_000_000)]

deftest_for_loop():
unique =[]
for element in DUPLICATES:
if element notin unique:
unique.append(element)
return unique

deftest_set():
returnlist(set(DUPLICATES))

Slower scripts #

With the examples from "Writing Faster Python" articles we have a good variety of common operations. We do attribute lookups, we handle exceptions, we test iterators, generators, loops and lists comprehensions, etc.

But all those examples are rather fast to run, so just for a good measure let's add two more functions that are intended to be more computational-heavy and run for at least a few seconds:

Bubble sort - a fairly slow sorting algorithm. Let's run it on a list of 10 000 numbers in a descending order, which should take a couple of seconds on my computer:

# bubble_sort.py

DESCENDING_10_000 =list(range(10_000,0,-1))

defbubble_sort():
numbers = DESCENDING_10_000[:]
changed =True
while changed:
changed =False
for i inrange(len(numbers)-1):
if numbers[i]> numbers[i+1]:
numbers[i], numbers[i+1]= numbers[i+1], numbers[i]
changed =True
return numbers

Monte Carlo estimation of the π number. This is a simple simulation where we draw a square with side of 1 and inside we draw a circle (so it has a diameter of 1). Then we throw a bunch of darts (or generate random points in case we don't have a large pile of virtual darts) inside that square. This lets us estimate the area of both the square and the circle by simply counting the number of darts that landed inside each of them. By definition all the darts will end up inside the square, but only some of them will land in the circle. Finally, we know from school that the area of the circle divided by the area of the square is equal to π/4. So we do that division and we get the estimation of π. The more darts we throw, the better the estimation is. Here is a visual explanation of this method.

Again, there are more efficient algorithms to do this simulation (e.g. using numpy), but I want a slow version on purpose:

# pi_estimation.py

from random import random
from math import sqrt

# Total number of darts to throw.
TOTAL =100_000_000

defestimate_pi():
# Number of darts that land inside the circle.
inside =0

for _ inrange(TOTAL):
x2 = random()**2
y2 = random()**2
# Check if the x and y points lie inside the circle
if sqrt(x2 + y2)<1.0:
inside +=1
return(float(inside)/ TOTAL)*4

Benchmarks #

With 14 functions to check, we are ready to start our benchmarks. To run all of them at once, I've created a simple bash script that will run all functions under different Python versions. I use pyenv to install latest versions of Python starting from 3.7 and then I use Python executables from each of those versions. Finally, I print the results in a nice table.

Here is the bash script I came up with. Don't worry if you don't understand how it works, I probably won't understand it one month from now either.

#!/usr/bin/env bash

# Python versions that we will test
PYENV_VERSIONS=(3.7.14 3.8.14 3.9.14 3.10.7 3.11.0)

# Setup code and the actual functions that we will benchmark
COMMANDS=(
"-s 'from permission_vs_forgiveness import test_permission2' 'test_permission2()'"
"-s 'from permission_vs_forgiveness import test_forgiveness2' 'test_forgiveness2()'"
"-s 'from permission_vs_forgiveness2 import test_permission3' 'test_permission3()'"
"-s 'from permission_vs_forgiveness2 import test_forgiveness3' 'test_forgiveness3()'"
"-s 'from find_item import count_numbers' 'count_numbers()'"
"-s 'from find_item import generator' 'generator()'"
"-s 'from filter_list import for_loop' 'for_loop()'"
"-s 'from filter_list import list_comprehension' 'list_comprehension()'"
"-s 'from sorting import test_sort' 'test_sort()'"
"-s 'from sorting import test_sorted' 'test_sorted()'"
"-s 'from duplicates import test_for_loop' 'test_for_loop()'"
"-s 'from duplicates import test_set' 'test_set()'"
"-s 'from bubble_sort import bubble_sort' 'bubble_sort()'"
"-s 'from pi_estimation import estimate_pi' 'estimate_pi()'"
)

OUTPUT="Function,"
# Create a header with version numbers
forvin${PYENV_VERSIONS[@]}
do
OUTPUT+="$v,"
done

# Last column will contain difference between 1st and last version of Python in the PYENV_VERSIONS
OUTPUT+="${PYENV_VERSIONS[0]}/${PYENV_VERSIONS[${#PYENV_VERSIONS[@]}-1]}"
OUTPUT+="\n"

for(( i =0; i < ${#COMMANDS[@]} ; i++))
do
# Remove the single quotes from function name
OUTPUT+=$(echo ${COMMANDS[$i]##*\ } | tr -d "'")

forvin${PYENV_VERSIONS[@]}
do
OUTPUT+=","
OUTPUT+=$(eval"/Users/switowski/.pyenv/versions/$v/bin/python -m timeit ${COMMANDS[$i]}"|sed -e 's/.*: \(.*\) per loop/\1/')
done
# Divide timings for the first and last Python version and add it in the last column
v1=$(eval"/Users/switowski/.pyenv/versions/${PYENV_VERSIONS[0]}/bin/python -m timeit ${COMMANDS[$i]}"|sed -e 's/.*: \(.*\) per loop/\1/' -e 's/[^0-9\.]//g')
v2=$(eval"/Users/switowski/.pyenv/versions/${PYENV_VERSIONS[${#PYENV_VERSIONS[@]}-1]}/bin/python -m timeit ${COMMANDS[$i]}"|sed -e 's/.*: \(.*\) per loop/\1/' -e 's/[^0-9\.]//g')
difference=$(echo"scale=2; $v1 / $v2"|bc)
OUTPUT+=",$difference"

OUTPUT+="\n"
done

# Print in a table-like format
printf"$OUTPUT"|column -ts,

I've put all the code examples together with the benchmark script and the results in this repository. The actual benchmark script has one more version in case you don't care about the table, but the raw output from the timeit functions.

Results #

Let's see the results. The lower the number, the faster a given code example runs. In the last column we can see the comparison of how long it takes to run the code in Python 3.7 vs. Python 3.11. "1.68" means that a given example runs 68% slower in Python 3.7.

I did a bit of cleanup by moving the units next to the function name (instead of next to each number as in the original output).

Function3.7.143.8.143.9.143.10.73.11.03.7/3.11
test_permission2() [nsec]2181451481451401.68
test_forgiveness2() [nsec]91.970.47283.171.71.31
test_permission3() [nsec]77.460.961.957.140.51.88
test_forgiveness3() [µsec]256251239283307.83
count_numbers() [µsec]46.847.547.446.6411.14
generator() [µsec]47.147.747.645.339.51.18
for_loop() [msec]27.226.526.825.619.41.39
list_comprehension() [msec]18.31818.617.717.31.04
test_sort() [msec]175175176176175.97
test_sorted() [msec]1831831861831851.00
test_for_loop() [msec]3603643163053081.17
test_set() [msec]5.595.575.836.096.08.91
bubble_sort() [sec]8.058.248.237.894.691.72
estimate_pi() [sec]17.117.918.117.414.31.21

We can see that in most cases, as we upgrade Python version, our examples run faster. And Python 3.11 gives us the best improvements. Upgrading your Python version now makes even more sense than before if you're looking for some speed improvements.

But for some examples we see degradation of performance. The 0.97 for test_sort() and 0.91 for test_set() differences are so small that I assume it's the small randomness of the benchmarks results. But the test_forgiveness3() with around 20% decrease in performance in Python 3.11 looked interesting. I checked the release notes for Python 3.11 to find what might be causing this and I found nothing. So I decided to compared how Python handles exceptions for the most common example - division by zero:

# division.py
defdivide_by_zero():
try:
1/0
except ZeroDivisionError:
pass

Benchmarking the above code under different Python versions gave me the following results:

  • Python 3.7.14: 161 nsec
  • Python 3.8.14: 170 nsec
  • Python 3.9.14: 165 nsec
  • Python 3.10.7: 141 nsec
  • Python 3.11.0: 169 nsec

In Python 3.11.0 it's almost as slow as in Python 3.7 or 3.8 were. So it seems like the slowdown for my test_forgiveness3() was specific to this one particular example and not something we should be worried about. And while this example is slower, all the other examples of testing permission and forgiveness got much faster in the newer Python versions. In Python 3.11 the "ask for permission" gets and additional speed boost from the "zero cost" exception handling.

"Zero cost" exception handling #

Python 3.11 introduced something called "zero cost" exception handling. This Hacker News submission has some good explanation how this works in Python and other languages. The gist of this feature is that everything inside the "try" block (the "happy path" of the exception) will now be faster - almost as fast as if there was no try/except block at all.

Let's see this in action!

I created one more short benchmarking script. I took 3 code examples (for loop for filtering a list, bubble sort, and the pi estimation) and wrapped their most inner instructions in a try/except block (so that this try/except block is executed as often as possible). At the same time, since there are no exceptions, the "except" block is never called, so I can just put pass inside.

So, for example, the first test case will compare those two variants:

MILLION_NUMBERS =list(range(1_000_000))

deffor_loop():
output =[]
for element in MILLION_NUMBERS:
ifnot element %2:
output.append(element)
return output

deffor_loop_with_try_except():
output =[]
for element in MILLION_NUMBERS:
ifnot element %2:
try:
output.append(element)
except Exception:
pass
return output

With zero cost exceptions handling, Python 3.11 should run those code examples faster that Python 3.10 or 3.9.

Let's see the results by running the exceptions_benchmark.sh script:

Function3.9.143.10.73.11.0
Filter [msec]26.7 (28.4)26 (27.1)19.6 (20.4)
Pi [sec]18.4 (19.2)17.3 (17.5)14.1 (14.3)
Bubble [sec]8.26 (8.46)7.96 (8.06)4.72 (4.75)

First number in each column is how long it takes to run the original version (without try/except blocks). The number in parenthesis is how long it takes to run the same function with the try/except blocks called multiple times.

The differences between both variants are tiny for all 3 Python versions. But for Python 3.11 they are even smaller! Take this simple benchmark with a grain of salt, but I hope it helped illustrate what's the benefit of "zero cost" exception handling.

Conclusions #

Upgrading Python version is one of a few ways of making your code a bit faster without changing it. And no matter if you upgrade from Python 3.7. to 3.8 or from Python 3.9 to Python 3.10, you will always get some improvements for a large codebase. But it's the Python 3.11 where a dedicated effort was made to really speed it up. According to the release notes, it should speed up your code by around 10-60%. So now is a good time to think about upgrading your Python projects.

If you want to run your own benchmarks, with more advanced code examples, the Python Performance Benchmark Suite is a good place to look for some inspiration.


TODO: todo #

  • update date

  1. Completely free if you have good tests coverage (in case of some subtle bugs between minor Python versions), all the libraries you are using work with newer Python version, and you have a few moments to install new Python version. ↩︎


Viewing all articles
Browse latest Browse all 22875

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>