Stack vs Register bytecodes for Python

There seems to be a consensus that register bytecodes are superior to stack bytecodes. I don’t quite know how to cite “common knowledge”, but doing a google search for “Python register VM” or “stack vs register vm” supports the fact that many people believe this. There was a comment on this blog to this effect as well.

Anyway, regardless of whether it truly is something that everyone believes or not, I thought I’d add my two cents. Pyston uses a register bytecode for Python, and I wouldn’t say it’s as great as people claim.

Lifetime management for refcounting

Why? One of the commonly-cited reasons that register bytecodes are better is that they don’t need explicit push/pop instructions. I’m not quite sure I agree that you don’t need push instructions — you still need an equivalent “load immediate into register”. But the more interesting one (at least for this blog post) is pop.

The problem is that in a reference-counted VM, we need to explicitly kill registers. While the Python community has made great strides to support deferred destruction, there is still code (especially legacy code) that relies on immediate destruction. In Pyston, we’ve found that it’s not good enough to just decref a register the next time it is set: we need to decref a register the last time it is used. This means that we had to add explicit “kill flags” to our instructions that say which registers should be killed as a result of the instruction. In certain cases we need to add explicit “kill instructions” whose only purpose is to kill a register.

In the end it’s certainly manageable. But because we use a register bytecode, we need to add explicit lifetime management, whereas in a stack bytecode you get that for free.

I don’t think it’s a huge deal either way, because I don’t think interpretation overhead is the main factor in Python performance, and a JIT can smooth over the differences anyway. But the lifetime-management aspect was a surprise to me and I thought I’d mention it.

Kevin Modzelewski

July 28, 2016

Uncategorized

2 responses to “Stack vs Register bytecodes for Python”

John says:

July 28, 2016 at 2:57 pm

What if python had been designed from the start with a register VM?

LikeLike

Reply
Freddie Witherden says:

August 4, 2016 at 3:12 pm

The consensus is one which is well supported in the literature with the most extensive research having been conducted within the context of the Java VM. The best reference I know of is Yunhe et al.

Click to access p153-yunhe.pdf

One caveat here is that to truly realize the benefit an interpreter needs to be optimizing and transforming the AST (something I alluded to in my previous comment). Now, it is certainly true that Python bytecode is a very different beast to Java bytecode. However, a few years ago Victor Stinner did investigate the use of a register based VM for CPython

http://faster-cpython.readthedocs.io/registervm.html

although the project appears to be dormant the initial results were very promising. One point made by both pieces of work surround the use of computed goto’s within the context of a threaded interpreter to reduce the number of branch mispredictions. However, with modern CPUs — which have dramatically improved branch predictors compared with those of prior generators — this statement is no longer as true as it once was, see Rohou et al.

https://hal.inria.fr/hal-00911146/document

LikeLike

Reply

kmod's blog

Stack vs Register bytecodes for Python

Lifetime management for refcounting

2 responses to “Stack vs Register bytecodes for Python”

Leave a comment Cancel reply

Stack vs Register bytecodes for Python

Lifetime management for refcounting

Share this:

2 responses to “Stack vs Register bytecodes for Python”

Leave a comment Cancel reply