kmod's blog


Why generator functions suck

I recently had to debug a piece of code similar to the following:

def f(x):
  return globals().get(x)

def g(x):
  # Very long function that calls f() at certain points
  ...

# In a different module:
def h(x):
  return g(x)

The problem was that something was breaking inside f. I started inserting print statements to debug it, and things were going pretty well. I found a minimal test that showed things were messed up, and it turned out they were already bad as soon as f was entered. So then I started looking at g. I found all the places where it called f. Things were indeed messed up before calling f, and in fact they were messed up at the beginning of g as well!

So then I started looking into h. But things were fine right before g was entered, and were fine after it exited as well. How could this happen? The fact that they're in different modules seemed a likely culprit, so I did some tests on that. No luck -- it wasn't because they were in separate modules. It wasn't because they were being run on separate threads. None of these functions were decorated in any way. It was not obvious at all what was going on.

And then I looked closer at g. Buried in the middle of it was a

      yield foo

which means that g is actually a generator function. There was no way for me to know that from the call site; the function was called like a normal function, and it looked like a normal function. Yet its semantics are completely different! Calling g only creates a generator object; the actual body of g wasn't being run until much later, after h had returned, by which point the conditions it expects had disappeared. If g were somehow marked as a generator function, the problem would have been fairly straightforward to find. As it stands, though, you have to read through the entire body of g to discover that it does not behave at all the way it looks when you use it. And that is why generator functions (and other similar syntactic sugar in other languages) suck.
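To make the deferred execution concrete, here is a minimal sketch in plain Python. The names state and log are made up for illustration; state stands in for whatever conditions the body of g relies on. It also shows that the standard library's inspect.isgeneratorfunction can tell you what the call site cannot.

```python
import inspect

# Hypothetical stand-ins: `state` plays the role of the conditions g expects,
# `log` records when the body of g actually runs.
state = {"ready": True}
log = []

def g(x):
    # Nothing in this body runs until someone iterates over the generator.
    log.append(("g body ran", state["ready"]))
    yield x

def h(x):
    # Looks like an ordinary call, but it only creates a generator object.
    return g(x)

gen = h(1)
body_ran_inside_h = len(log) > 0  # the body has not run yet

state["ready"] = False            # conditions change after h has returned
values = list(gen)                # the body of g runs only now

g_is_generator = inspect.isgeneratorfunction(g)
h_is_generator = inspect.isgeneratorfunction(h)
```

By the time the body runs, it sees state["ready"] as False, even though everything was fine both before and after the call to g inside h.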

In case you were wondering, the specific example concerns Pylons. You can return a generator from your web handler, and Pylons will iterate through it until the end. This is nice for things like serving large files off disk. You can create a generator that reads the file a bit at a time when it is needed, instead of loading the entire file and sending it all at once.
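A chunked file reader of that kind might look like the sketch below. This is generic Python, not code from Pylons; the name read_in_chunks and the 8192-byte chunk size are arbitrary choices, and the temporary file exists only to exercise the generator.

```python
import os
import tempfile

def read_in_chunks(path, chunk_size=8192):
    """Yield the contents of the file at `path`, chunk_size bytes at a time."""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break  # end of file
            yield chunk

# Usage: write a 20000-byte temporary file and read it back in pieces.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 20000)

sizes = [len(chunk) for chunk in read_in_chunks(tmp.name)]
os.remove(tmp.name)
```

Only one chunk is held in memory at a time, which is the whole appeal of returning a generator from a handler.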

The problem, though, is that the global 'request' object disappears by the time Pylons actually calls your generator. So anything inside your generator function, or anything that your generator function calls, will die a violent death if it wants to access anything about the request that isn't saved. This took forever to figure out.
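One way to sidestep this, sketched under assumptions: make the handler an ordinary function that reads what it needs from the request right away, and have it return an inner generator that only touches the saved values. FakeRequest and stream_chunks below are hypothetical stand-ins, not the Pylons API, and setting request to None merely simulates the framework discarding the request.

```python
class FakeRequest:
    """Hypothetical stand-in for a framework's per-request global object."""
    def __init__(self, path):
        self.path = path

request = FakeRequest("/download/report.txt")

def stream_chunks(data, chunk_size=4):
    # Ordinary function: this part runs immediately, while `request` is valid.
    path = request.path  # save what the generator will need later

    def body():
        # Generator: runs later, and only uses the saved local `path`.
        yield "serving %s\n" % path
        for i in range(0, len(data), chunk_size):
            yield data[i:i + chunk_size]

    return body()

response = stream_chunks("abcdefghij")
request = None           # simulate the request going away before iteration
chunks = list(response)  # still works, because `path` was saved up front
```

The outer function is not itself a generator function, so the request lookup happens at call time rather than at iteration time.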
