Re: Generators vs. Functions?



On Sun, 05 Feb 2006 19:14:29 +1100, Steven D'Aprano <steve@xxxxxxxxxxxxxxxxxxxxxx> wrote:

On Sun, 05 Feb 2006 03:31:24 +0000, Neil Schemenauer wrote:

Peter Hansen <peter@xxxxxxxxxxx> wrote:
More precisely, the state of the function is *saved* when a yield
occurs, so you certainly don't *recreate* it from scratch, but merely
restore the state, and this should definitely be faster than creating it
from scratch in the first place.

Right. Resuming a generator is faster than calling a function.
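
To make the "saved, not recreated" point concrete, here is a minimal sketch. It is written for Python 3, where next(gen) replaces the gen.next() spelling used later in this thread; the names counter and count_once are made up for illustration.

```python
def counter():
    # The local variable n lives in the generator's frame and survives each yield.
    n = 0
    while True:
        n += 1
        yield n

def count_once():
    # A plain function gets a fresh frame on every call, so n always restarts at 0.
    n = 0
    n += 1
    return n

gen = counter()
first = next(gen)    # runs the body up to the first yield
second = next(gen)   # resumes right after the yield with n still set to 1

print(first, second, count_once(), count_once())
```

The generator hands back 1 and then 2 because its frame is kept between resumptions, while the plain function returns 1 every time. This says nothing about which is faster; it only shows what "restoring the state" refers to.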

Have you actually measured this, or are you just making a wild guess?

According to a short test performed by Magnus Lycka, resuming a generator
takes more time than calling a function. My own test agrees.

Here is my test, using Python 2.3. I've tried to make the test as fair as
possible, with the same number of name lookups in both pieces of test code.

# straight function, two name lookups

>>> import timeit
>>>
>>> t1 = timeit.Timer(stmt="func.next()", setup=
... """class K:
...     pass
...
... def next():
...     return 1
...
... func = K()
... func.next = next
... """)
>>>
>>> t1.timeit()
0.63980388641357422


# generator, two name lookups

>>> t2 = timeit.Timer(stmt="gen.next()", setup=
... """def g():
...     while 1: yield 1
...
... gen = g()
... """)
>>>
>>> t2.timeit()
0.82081794738769531


# straight function, one name lookup

>>> t3 = timeit.Timer(stmt="f()", setup=
... """def f():
...     return 1
... """)
>>>
>>> t3.timeit()
0.47273492813110352


# generator, one name lookup

>>> t4 = timeit.Timer(stmt="gnext()", setup=
... """def g():
...     while 1: yield 1
...
... gnext = g().next
... """)
>>>
>>> t4.timeit()
0.55085492134094238


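So on the basis of my tests, there is a small but measurable speed
advantage to _calling_ a function versus _resuming_ a generator: the
generator versions took roughly 28% longer with two name lookups and
roughly 17% longer with one.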
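
For anyone who wants to repeat the one-lookup comparison (t3 versus t4) on a current interpreter, here is a rough sketch for Python 3. The helper name time_call_vs_resume and the iteration counts are assumptions made for this example, not part of the original test, and the numbers it prints will differ from machine to machine.

```python
import timeit

# Setup code for each Timer; f, g and gnext match the names used in t3 and t4.
FUNC_SETUP = """
def f():
    return 1
"""

GEN_SETUP = """
def g():
    while 1:
        yield 1

gnext = g().__next__  # Python 3 spelling of g().next
"""

def time_call_vs_resume(number=200000, repeat=5):
    """Return (best function-call time, best generator-resume time) in seconds."""
    call = timeit.Timer(stmt="f()", setup=FUNC_SETUP)
    resume = timeit.Timer(stmt="gnext()", setup=GEN_SETUP)
    # Taking the minimum of several repeats reduces noise from other processes.
    return min(call.repeat(repeat, number)), min(resume.repeat(repeat, number))

call_time, resume_time = time_call_vs_resume()
print("function call:    %.4f s" % call_time)
print("generator resume: %.4f s" % resume_time)
```

Both statements do a single name lookup followed by a call, so the difference between the two figures is an estimate of call versus resume overhead only.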

Of course, the other advantages of generators often far outweigh the tiny
extra cost each time you resume one. In addition, for any complex function
with significant execution time, the call/resume overhead may be an
insignificant fraction of the total execution time. There is little or no
point in avoiding generators in a misplaced and foolish attempt to
optimise your code.

My measurements show an advantage for generator resumption over a function call:

>>> from time import clock
>>> def f(): return clock()
...
>>> def g(): yield clock(); yield clock()
...
>>> max(f()-f() for x in xrange(10000))
-9.2190462142316409e-006
>>> max(f()-f() for x in xrange(10000))
-9.2190462139818408e-006
>>> max(float.__sub__(*g()) for x in xrange(10000))
-7.5428559682677587e-006
>>> max(float.__sub__(*g()) for x in xrange(10000))
-7.5428559682677587e-006
>>> max(float.__sub__(*g()) for x in xrange(10000))
-7.5428559682677587e-006

(It'll probably go ten times faster on a recent box ;-)
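
The session above takes the difference of two consecutive clock readings (first minus second, so the result is negative) and keeps the value closest to zero, which is the shortest gap seen. Below is a rough Python 3 sketch of the same idea. It assumes time.perf_counter as the timer, since time.clock is gone from current Python versions, and the helper names min_gap_function and min_gap_generator are invented for this example.

```python
from time import perf_counter

def f():
    return perf_counter()

def g():
    yield perf_counter()
    yield perf_counter()

def min_gap_function(n=10000):
    # Shortest interval between two back-to-back calls of f().
    gaps = []
    for _ in range(n):
        start = f()
        stop = f()
        gaps.append(stop - start)
    return min(gaps)

def min_gap_generator(n=10000):
    # Shortest interval between the two readings taken inside one generator;
    # unpacking resumes g() once per value (plus a final check for exhaustion).
    gaps = []
    for _ in range(n):
        start, stop = g()
        gaps.append(stop - start)
    return min(gaps)

print("function call gap:    %.3e s" % min_gap_function())
print("generator resume gap: %.3e s" % min_gap_generator())
```

Note that the two gaps do not cover exactly the same work (return plus call in one case, yield plus resume in the other), and both are close to the timer's resolution, so small differences should be read with care.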

Regards,
Bengt Richter