I have a `defer.inlineCallbacks` function for incrementally updating a large (>1k item) list one piece at a time. The list may change at any time, and I'm getting bugs because of that.
The simplest representation of what I'm doing is:

```python
@defer.inlineCallbacks
def _get_details(self, dt=None):
    data = self.data
    for e in data:
        if needs_update(e):
            more_detail = yield get_more_detail(e)
            do_the_update(e, more_detail)
    schedule_future(self._get_details)
```
`self.data` is a list of dictionaries which is initially populated with basic information (e.g. a name and an ID) at application start. `_get_details` runs whenever the reactor allows it to, fetching more detailed information for each item in `data` and updating the item as it goes along.
This works well when `self.data` does not change, but once it is changed (which can happen at any point) the loop obviously refers to the wrong information. In fact, in that situation it would be better to just stop the loop entirely.
I'm able to set a flag in my class (which the `inlineCallbacks` function can then check) when the data is changed.
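For concreteness, a minimal sketch of that flag mechanism (the `DataHolder` class name and `replace_data` method are hypothetical stand-ins, not my actual code):

```python
class DataHolder:
    """Holds the list and records when it is replaced."""

    def __init__(self, data):
        self.data = data
        self.data_changed = False  # flag the update loop can check

    def replace_data(self, new_data):
        # Any swap of the underlying list sets the flag, so a
        # running update loop knows its iteration is stale.
        self.data = new_data
        self.data_changed = True


holder = DataHolder([{"id": 1, "name": "a"}])
holder.replace_data([{"id": 2, "name": "b"}])
print(holder.data_changed)  # True
```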
- Where should this check be conducted?
- How does `inlineCallbacks` code execute compared to a normal `Deferred` (and indeed to a normal Python generator)?
- Does code execution stop every time it encounters a `yield` (i.e. can I rely on the code between one `yield` and the next being atomic)?
- In the case of unreliable large lists, should I even be looping through the data (`for e in data`), or is there a better way?
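My (possibly wrong) mental model for the second and third questions: an `inlineCallbacks` function is an ordinary generator that Twisted resumes via `send()` each time a yielded `Deferred` fires, so the code between two `yield`s runs without interruption. A plain-generator sketch of that driving loop (no Twisted involved; `drive` is a crude, hypothetical stand-in for the real trampoline):

```python
def _get_details(data):
    # Stand-in for the decorated method: yields where a
    # Deferred would be yielded in the real code.
    for e in data:
        e["detail"] = yield e["id"]  # pauses here until resumed

events = []

def drive(gen):
    # Each send() resumes the generator, which then runs
    # synchronously until its next yield (or StopIteration).
    try:
        request = next(gen)
        while True:
            events.append(("requested", request))
            # Pretend the "Deferred" fired with a result:
            request = gen.send("detail-for-%s" % request)
    except StopIteration:
        pass

data = [{"id": 1}, {"id": 2}]
drive(_get_details(data))
print(events)  # each yield produced one "request"
print(data)    # items updated in place between yields
```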
Based on more testing, the following are my observations:

- `for e in data` iterates through the elements, and each element still exists even if `data` itself no longer does, both before and after the `yield` statement.
- As far as I can tell, execution is atomic between one `yield` and the next.
- Looping through the data is more transparently done by using a counter, which also allows checking whether the data has changed. The check can be done any time after a `yield`, because any changes must have occurred before the `yield` returned. This results in the code shown above.
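A sketch of that counter-based loop with the post-`yield` staleness check, again driven by hand so it runs without a reactor (the `Updater` class and the synchronous `drive` helper are assumptions for illustration; in the real code the `yield` would return a `Deferred` result):

```python
class Updater:
    def __init__(self, data):
        self.data = data
        self.data_changed = False  # set elsewhere when data is swapped

    def _get_details(self):
        # Index-based loop: re-reads self.data each pass and bails
        # out if the list was changed while "waiting" on a result.
        i = 0
        while i < len(self.data):
            e = self.data[i]
            more_detail = yield e["id"]  # a Deferred in the real code
            # Any change happened before the yield resumed us,
            # so checking the flag here is safe.
            if self.data_changed:
                return
            e["detail"] = more_detail
            i += 1

def drive(updater):
    # Hand-driven stand-in for the inlineCallbacks machinery.
    gen = updater._get_details()
    try:
        request = next(gen)
        while True:
            request = gen.send("detail-for-%s" % request)
    except StopIteration:
        pass

u = Updater([{"id": 1}, {"id": 2}])
drive(u)
print(u.data)  # both items gain a "detail" key
```

If `data_changed` is set while the generator is suspended at a `yield`, the next resume hits the check and the loop stops before touching a stale item.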