Python Generator - Mutate Last Result?


I'm trying to decide between the following two definitions of my generator. Which is better? Which is "more pythonic"? And is there any way to mitigate the drawbacks of each one?

def myGenerator1(howMany):
    result = [0,0,0]
    yield result
    for i in range(howMany):
        modifyListInPlace(result)
        yield result

for val in myGenerator1(1000):
    useValThenForgetIt(val)

def myGenerator2(howMany):
    result = (0,0,0)
    yield result
    for i in range(howMany):
        result = createNewUpdatedTuple(result)
        yield result

for val in myGenerator2(1000):
    useValThenForgetIt(val)

The first one modifies a value that has already been returned by the generator, possibly messing with calling code that I haven't foreseen yet. The second produces 1000 tuples' worth of garbage in this case, or more if I increase "howMany" (which I might).

The loops I give as an example are just my current use of the generator. I don't think I would ever save the values that come out of it, but it is a bit of a utility that could possibly be useful elsewhere.


There are 2 answers

1
Raymond Hettinger On BEST ANSWER

Looking to the standard library as a guide, the combinatoric functions in the itertools module all return tuples even though the underlying algorithm is a mutate-in-place algorithm. For example, look at the code for itertools.permutations.

This design (returning tuples instead of lists) has proven to be robust. I worry that the mutating list approach will create some hard-to-find bugs depending on what the caller is doing with the iterator's return value.
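A minimal sketch of that pattern, assuming a hypothetical in-place update (incrementing every element, since the question's modifyListInPlace is not shown): the generator keeps a private list as its working state and yields a tuple snapshot of it, so callers never hold a reference to the mutable buffer.

```python
def my_generator(how_many):
    # Private working buffer, mutated in place between yields.
    state = [0, 0, 0]
    yield tuple(state)
    for _ in range(how_many):
        # Hypothetical stand-in for the question's modifyListInPlace().
        for i in range(len(state)):
            state[i] += 1
        # Snapshot: the caller receives an immutable copy, not the buffer.
        yield tuple(state)


saved = list(my_generator(3))
print(saved)
```

Because each yielded value is an independent tuple, saving the results with list() keeps every intermediate state intact.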

One other thought. I wouldn't worry too much about "creating thousands of tuples worth of garbage" for the unused results. Python's tuple implementation is very good at reusing previously disposed tuples (by using an array of freelists, it can create a new tuple from a previously used one without making a call to the memory allocator). So, the tuple version may be just as performant as the list version or even a little better.
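If in doubt, it is easy to measure rather than guess. The following is only a rough sketch using timeit; the update steps are hypothetical stand-ins for modifyListInPlace and createNewUpdatedTuple, and the numbers will vary with the interpreter version and with the real update logic.

```python
import timeit


def gen_list(how_many):
    result = [0, 0, 0]
    yield result
    for _ in range(how_many):
        result[0] += 1  # hypothetical in-place update
        yield result


def gen_tuple(how_many):
    result = (0, 0, 0)
    yield result
    for _ in range(how_many):
        # Hypothetical update that builds a fresh tuple each time.
        result = (result[0] + 1, result[1], result[2])
        yield result


def consume(gen):
    # Use each value once and drop it, as in the question's loops.
    for _ in gen:
        pass


list_time = timeit.timeit(lambda: consume(gen_list(1000)), number=200)
tuple_time = timeit.timeit(lambda: consume(gen_tuple(1000)), number=200)
print(f"list version:  {list_time:.4f} s")
print(f"tuple version: {tuple_time:.4f} s")
```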

2
Ben On

The fact that the first one can return an object, then un-obviously modify it after it's been returned is a HUGE code smell to me, regardless of what language you're using (i.e. it's not an issue of being "pythonic"). Plus, why would you want a function that yields an iterator for the same value over and over again, modifying between yields? Seems very unintuitive to me.

If you use the values, then the tuples created by myGenerator2 aren't garbage. If you use them one at a time, they'll never all exist at the same time, and your program will almost certainly be doing many other memory allocations/deallocations. Contrast that with the list returned by range(howMany), which creates 1,000 integers that you never actually use (unless you're on Python 3, where range returns a lazy range object rather than a list).

If there's any chance at all that a caller may want to hang on to a reference to something returned by your generator (and Python programmers generally expect, when given a generator, to be able to go items = list(generator) if they need to use them more than once), then the second is far superior.
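To make the problem concrete, here is a small sketch of what a caller sees when saving the output of the mutating version. The in-place update is a hypothetical stand-in for modifyListInPlace.

```python
def mutating_generator(how_many):
    result = [0, 0, 0]
    yield result
    for _ in range(how_many):
        result[0] += 1  # hypothetical stand-in for modifyListInPlace()
        yield result


items = list(mutating_generator(2))
# Every entry is the same list object, so all of them show the final state.
print(items)
all_same_object = all(item is items[0] for item in items)
print(all_same_object)
```

The caller ends up with three references to one list, and the intermediate values are lost. With the tuple version, each entry would be a distinct value.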