I have a `defer.inlineCallbacks`-decorated function for incrementally updating a large (>1k items) list one piece at a time. This list may change at any time, and I'm getting bugs because of that behaviour.
The simplest representation of what I'm doing is:

```python
@defer.inlineCallbacks
def _get_details(self, dt=None):
    data = self.data
    for e in data:
        if needs_update(e):
            more_detail = yield get_more_detail(e)
            do_the_update(e, more_detail)
    schedule_future(self._get_details)
```
self.data is a list of dictionaries which is initially populated with basic information (e.g. a name and ID) at application start. _get_details will run whenever allowed to by the reactor to get more detailed information for each item in data, updating the item as it goes along.
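For concreteness, `self.data` might look something like this (the field names and the `needs_update` test are assumptions for illustration, not from the question):

```python
# Assumed shape of self.data: a list of dicts holding basic info at
# startup, with detail fields filled in later (field names hypothetical).
data = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
]

def needs_update(e):
    # One possible needs_update: treat any item without a "detail"
    # field as still needing the expensive lookup.
    return "detail" not in e

pending = [e["id"] for e in data if needs_update(e)]
print(pending)  # [1, 2]
```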
This works well when self.data does not change, but once it does change (which can happen at any point) the loop obviously refers to the wrong information. In fact, in that situation it would be better to just stop the loop entirely.
I'm able to set a flag in my class (which the inlineCallbacks code can then check) when the data is changed.
- Where should this check be conducted?
- How does the `inlineCallbacks` code execute compared to a normal `deferred` (and indeed to a normal Python generator)?
- Does code execution stop every time it encounters `yield` (i.e. can I rely on the code between one `yield` and the next being atomic)?
- In the case of unreliable large lists, should I even be looping through the data (`for e in data`), or is there a better way?
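One hedged sketch of where such a flag check could go: immediately after each `yield`, since in a single-threaded reactor that is the only point where other code can have run. A plain generator driven by `send` stands in for the `inlineCallbacks` machinery here, and the names (`_data_changed`, `detail`) are hypothetical:

```python
# Sketch of the flag idea: check a changed-flag right after each yield,
# because state can only have been mutated while the generator was
# suspended there. A plain generator simulates inlineCallbacks resumption.
class Updater:
    def __init__(self, data):
        self.data = data
        self._data_changed = False

    def _get_details(self):
        # 'yield e' stands in for 'yield get_more_detail(e)'
        for e in self.data:
            detail = yield e
            if self._data_changed:   # data changed during the yield
                return               # abandon the now-stale loop
            e["detail"] = detail

u = Updater([{"id": 1}, {"id": 2}, {"id": 3}])
gen = u._get_details()
next(gen)                # run up to the first yield
gen.send("d1")           # resume: updates item 1, suspends at item 2
u._data_changed = True   # data changes while suspended
try:
    gen.send("d2")       # resume: flag seen, generator exits early
except StopIteration:
    pass
print(u.data)            # only item 1 was updated
```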
Based on more testing, the following are my observations.
- `for e in data` iterates through the elements, with each element still existing even if `data` itself does not, both before and after the `yield` statement.
- As far as I can tell, execution is atomic between one `yield` and the next.
- Looping through the data is more transparently done by using a counter. This also allows for checking whether the data has changed. The check can be done any time after `yield`, because any changes must have occurred before `yield` returned. This results in the code shown above.
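A hedged reconstruction of the counter-based loop these observations describe: index into the list each iteration instead of holding an iterator, and after each `yield` check whether the list was swapped out (an identity check is one possible way to detect the change; again, a plain generator stands in for `inlineCallbacks`):

```python
# Counter-based loop with a change check after each yield (sketch).
class Updater:
    def __init__(self, data):
        self.data = data

    def _get_details(self):
        data = self.data              # snapshot the current list object
        i = 0
        while i < len(data):
            e = data[i]
            detail = yield e          # stands in for yield get_more_detail(e)
            if self.data is not data: # list was replaced during the yield
                return                # stop rather than update stale items
            e["detail"] = detail      # safe: still the same list
            i += 1

u = Updater([{"id": 1}, {"id": 2}])
gen = u._get_details()
next(gen)                 # run up to the first yield
gen.send("d1")            # item 1 updated, suspended before item 2
u.data = [{"id": 9}]      # list replaced while suspended
try:
    gen.send("d2")        # identity check fails, loop stops
except StopIteration:
    pass
print(u.data)             # the new list is untouched: [{'id': 9}]
```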