Thursday, August 16, 2012

A quick fix to improve Python3 startup time

My web server is very low-end. Dating from the mid-90s, it has a 200MHz Pentium and 96MB of RAM. It was running Debian 2.2 (potato), but I recently upgraded to the most recent Debian 6.0 (squeeze). I'm impressed that it even runs.

I also upgraded to Python3 in order to handle the recent overhaul of my Tamarin automated grading system. I compiled Python3 from source to do this, since Debian doesn't include it yet. They're still shipping Python 2.6.

Tamarin is little more than a bunch of CGI scripts. I expect it to run a little slowly on this machine, but, after the upgrade, any CGI request is taking about 4 seconds, which is fairly intolerable. Static webpages are still responsive enough, though. So I started up the Python3 interpreter... and waited. Yep, that's where the lag is. See for yourself:

ztomasze@tamarin:~$ time python3 -c 'pass'

real    0m2.493s
user    0m2.336s
sys     0m0.160s

Here, I'm just starting python3 to execute a single 'pass' statement that does nothing. This takes 2.5 seconds.

I read somewhere that not loading the local site libraries by using the -S option can give a performance boost. Since Tamarin uses only standard modules, I gave it a shot:

ztomasze@tamarin:~$ time python3 -S -c 'pass'

real    0m0.465s
user    0m0.392s
sys     0m0.072s

More than five times faster! I even installed the default Python 2.6, just to compare times:

ztomasze@tamarin:~$ time python -c 'pass'

real    0m0.448s
user    0m0.348s
sys     0m0.060s
ztomasze@tamarin:~$ time python -S -c 'pass'

real    0m0.185s
user    0m0.148s
sys     0m0.036s

So Python3 starts significantly slower for me than Python2 did, but using the -S option at least gets me back to standard Python2 times.
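To repeat this comparison on another machine without typing each command by hand, something like the following sketch could work. It starts the same interpreter that runs the script, with and without -S, and keeps the best of a few runs. The function name startup_time and the default of five runs are my own choices for illustration, and time.perf_counter needs Python 3.3 or later.

```python
import subprocess
import sys
import time


def startup_time(args, runs=5):
    """Return the best wall-clock time (in seconds) over several runs of a command."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.check_call(args)  # raises if the command exits with an error
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    exe = sys.executable  # the interpreter currently running this script
    with_site = startup_time([exe, "-c", "pass"])
    without_site = startup_time([exe, "-S", "-c", "pass"])
    print("with site:    {:.3f} s".format(with_site))
    print("without site: {:.3f} s".format(without_site))
```

The numbers include the cost of spawning a process from Python, so they will not match the output of time exactly, but the gap between the two lines should tell the same story.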

These savings didn't translate directly into improved CGI performance, though. Running two of my scripts from the command line, I saw the following:

                       status.py upload.py
original time          4 sec     6 sec
adding -S to #! line   3 sec     5 sec

A delay this long is still fairly intolerable. And I don't think the lag is inherent to Tamarin, since the delays weren't this long with Python2 and Debian 2.2 on the same machine.
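For reference, adding -S to the #! line of a CGI script looks roughly like the sketch below. The interpreter path is an assumption (a common location for a build from source), and the render function is just a stand-in, not one of Tamarin's scripts. It reports whether the site module was loaded, which is a quick way to confirm the option is taking effect.

```python
#!/usr/local/bin/python3 -S
# The path above is an assumption; adjust it to wherever python3 is installed.
# On Linux, everything after the interpreter path is passed as one argument,
# so a single option such as -S works here, but "#!/usr/bin/env python3 -S"
# generally does not.
import sys


def render():
    """Build a minimal CGI response saying whether the site module was loaded."""
    site_loaded = "site" in sys.modules
    return "Content-Type: text/plain\n\nsite module loaded: {}\n".format(site_loaded)


if __name__ == "__main__":
    sys.stdout.write(render())
```

Any script that relies on packages installed under site-packages would fail to import them when run this way, so this only suits scripts that stick to the standard library.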

I know I could probably shave off some more time for Tamarin by using FastCGI, SCGI, or mod_python. SCGI looks most useful to me given my existing codebase. Whenever I get some free time, I'll look into that.
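The reason those options should help is that they keep one interpreter running between requests, so the startup cost is paid once rather than on every hit. The sketch below illustrates that idea with the standard library's wsgiref server. It is not SCGI or FastCGI, just a stand-in for the same long-lived-process approach, and the names app and serve, along with the host and port, are arbitrary choices for illustration.

```python
import time
from wsgiref.simple_server import make_server

STARTED = time.time()  # set once, when the long-lived process starts


def app(environ, start_response):
    """Answer a request from an already-running interpreter."""
    uptime = time.time() - STARTED
    body = "process uptime: {:.1f} s\n".format(uptime).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def serve(host="localhost", port=8000):
    """Run the app in one persistent process until interrupted."""
    with make_server(host, port, app) as httpd:
        httpd.serve_forever()
```

Calling serve() and reloading the page a few times should show the uptime growing, which means every request after the first is handled without starting Python again.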