Quantcast
Channel: Planet Python
Viewing all articles
Browse latest Browse all 22462

PyPy Development: PyPy and ijson - a guest blog post

$
0
0
This gem was posted in the ijson issue tracker after some discussion on #pypy, and Dav1dde kindly allowed us to repost it here:

"So, I was playing around with parsing huge JSON files (19GiB, testfile is ~520MiB) and wanted to try a sample code with PyPy, turns out, PyPy needed ~1:30-2:00 whereas CPython 2.7 needed ~13 seconds (the pure python implementation on both pythons was equivalent at ~8 minutes).

"Apparantly ctypes is really bad performance-wise, especially on PyPy. So I made a quick CFFI mockup: https://gist.github.com/Dav1dde/c509d472085f9374fc1d

Before:

CPython 2.7:
    python -m emfas.server size dumps/echoprint-dump-1.json
    11.89s user 0.36s system 98% cpu 12.390 total 

PYPY:
    python -m emfas.server size dumps/echoprint-dump-1.json
    117.19s user 2.36s system 99% cpu 1:59.95 total


After (CFFI):

CPython 2.7:
     python jsonsize.py ../dumps/echoprint-dump-1.json
     8.63s user 0.28s system 99% cpu 8.945 total 

PyPy:
     python jsonsize.py ../dumps/echoprint-dump-1.json
     4.04s user 0.34s system 99% cpu 4.392 total

"



Dav1dd goes into more detail in the issue itself, but we just want to emphasize a few significant points from this brief interchange:
  • His CFFI implementation is faster than the ctypes one even on CPython 2.7.
  • PyPy + CFFI is faster than CPython even when using C code to do the heavy parsing.
 The PyPy Team


Viewing all articles
Browse latest Browse all 22462

Trending Articles