Using post_import_function in App Engine bulkloader.yaml


I'm trying to upload some data to my App Engine datastore using the bulkloader. One of my entity kinds has a property that is computed from another, so I'd like to post-process each entity as it's imported to do this calculation. I keep seeing brief mentions of the post_import_function transform tag, but no comprehensive documentation or examples.

For now, I'm just running a simple test to get my post_import_function to work.

My entity model:

from google.appengine.ext import db

class TestEntity(db.Model):
    location = db.GeoPtProperty()
    cells = db.StringListProperty()  # Computed from location

The relevant part of my bulkloader.yaml file looks like this:

- kind: TestEntity
  [... connector info ...]
  property_map:
    [... transform info for __key__ and location here ...]
  post_import_function: post_transform.post_process_testentity

And my post_process_testentity function:

def post_process_testentity(input_dict, entity_instance, bulkload_state):
    entity_instance.cells = [u'Hello there!']
    return entity_instance

When I upload data with this configuration, I get no errors, and I know that post_process_testentity is being called because the print statements I added inside it ran. Everything about the upload works, except that my post-processing function has no effect: no "Hello there!" values show up in my datastore when I check with the data viewer.

Could someone help me out a bit? Thank you!

1 answer

Answer by Jen S (best answer)

In case others are having similar problems, I got the test described above to work. It seems that entity_instance in the post-processing function is actually of type google.appengine.api.datastore.Entity, which is a subclass of dict, so properties have to be set as dictionary items rather than as attributes. This modification to the post_process_testentity function worked:

def post_process_testentity(input_dict, entity_instance, bulkload_state):
    entity_instance['cells'] = [u'Hello there!']
    return entity_instance
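
To illustrate why the attribute version ran without errors yet stored nothing, here is a minimal sketch that runs without the App Engine SDK. FakeEntity is a hypothetical stand-in I made up for this example, not the real datastore.Entity; the only thing it shares with the real class is being a subclass of dict, and I am assuming the real class keeps its properties as dictionary items in the same way.

```python
class FakeEntity(dict):
    """Hypothetical stand-in for google.appengine.api.datastore.Entity.

    Assumption: like the real class, it is just a dict subclass.
    """


def post_process_attribute(input_dict, entity_instance, bulkload_state):
    # Sets an ordinary Python attribute; the dict contents are untouched.
    entity_instance.cells = [u'Hello there!']
    return entity_instance


def post_process_key(input_dict, entity_instance, bulkload_state):
    # Stores the value as a dict item instead.
    entity_instance['cells'] = [u'Hello there!']
    return entity_instance


by_attribute = post_process_attribute({}, FakeEntity(), None)
by_key = post_process_key({}, FakeEntity(), None)

print('cells' in by_attribute)  # False: the attribute is not a dict item
print('cells' in by_key)        # True
```

Python accepts the attribute assignment silently on a dict subclass, which would explain why the original function raised no error even though nothing ended up in the datastore.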

However, I only figured this out by printing various debugging messages. It would be great if this were documented somewhere. Does anyone know where I can find such documentation?