People sometimes ask me how Pyston is going and what we’re currently working on. It’s a bit hard to answer, both because we haven’t had a release recently with any headline-worthy features and because a lot of the stuff we’re working on is individually pretty small. Sometimes I try to find some way of expressing this, maybe saying something like “there are a lot of small optimizations that we have to include” or “there is a very long tail of compatibility work”. It never feels that satisfying, so I thought I’d just jot down some of the random things that I’ve done lately and hope that it ends up being somewhat representative.
- Single-character string optimizations. I noticed that we were running the following code somewhat slowly:
query_string = url.split('?')[1]
It turned out that we actually did a pretty good job at most of this: we would get into url.split quickly, and we would take the result and find the element at index 1 in it quickly. It was just that our str.split implementation was much slower than CPython’s. In particular, we were using the string.find(string) function, which, even though it was fast and had special-casing for small strings, was not as fast as the corresponding string.find(char) function. So we needed to add an optimization: if the string that we are splitting on is a single character, we call string.find(char) instead. (CPython also has this optimization.)
- Tracing-jit aggressiveness backoff. This is probably the most along the lines of what I thought I’d be working on: some JIT level features dealing with some cool dynamic-language properties. Cool.
- Running code inside execs quickly. Well, I haven’t actually done this yet but I’m going to. Currently we bail on efficient handling of execs, since they have some special name-resolution rules (or rather, they are vastly more likely to use those rules than normal Python code), so we restrict that code to the interpreter. I’m noticing that this is starting to affect us: collections.namedtuple creates your class by constructing a class definition string and exec’ing it. Even though the resulting code is small, every time we have to run through it we pay some extra cost via the not-as-fast interpreter.
- Efficient unicode attribute lookup. I didn’t anticipate this at all, but there are definitely cases where it’s important for us to be able to handle unicode-based attribute lookups quickly, such as getattr(obj, u"foo"). People don’t often explicitly request unicode attribute names, but any code that does "from __future__ import unicode_literals" will get this behavior by default.
- Initializing sets in __new__ vs __init__. This is the kind of “long tail” compatibility issue I mentioned. You wouldn’t think that it would matter to the user whether the set did its initialization work in __new__ or __init__. Sure, there are ways that the user could tell if they really wanted to, but does “real code” depend on it? Turns out the answer is yes: this causes errors in sqlalchemy. So I need to go back and make sure we do the initialization at the same time that CPython does, so that we can support sqlalchemy’s use of set-subclassing.
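To make that last item a bit more concrete, here is a minimal sketch of how a set subclass can observe where the initialization happens. The class name LazySet and its pending attribute are made up for illustration; this is not sqlalchemy’s actual code, just an example of the general pattern of overriding __init__ without calling set.__init__.

```python
# Hypothetical subclass, not taken from sqlalchemy: it overrides __init__
# and deliberately does not call set.__init__.
class LazySet(set):
    def __init__(self, iterable=()):
        # Stash the items instead of letting set.__init__ add them.
        self.pending = list(iterable)


s = LazySet([1, 2, 3])

# In CPython, set.__new__ only allocates an empty set; the elements are
# added by set.__init__, which this subclass never calls.
print(len(s))      # 0 under CPython
print(s.pending)   # [1, 2, 3]
```

An implementation that filled the set in __new__ would instead report three elements here, which is the kind of difference that subclassing code can end up relying on.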
So anyway, that’s just some of the random stuff that I’ve been up to lately (or am about to do). There are definitely way more details to be worked out than I expected.