Querying embedded entities in Cloud Datastore


I have a couple of questions about the use of embedded entities in Datastore.

Consider the following simple test case:

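// Imports needed by the enclosing test class (App Engine Datastore API; JUnit 4 assumed):
import static org.junit.Assert.assertEquals;

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.EmbeddedEntity;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Query;
import com.google.appengine.api.datastore.Query.FilterOperator;
import com.google.appengine.api.datastore.Query.FilterPredicate;
import java.util.List;

// Test method body: build a Person entity with an embedded address.
Entity entity = new Entity("Person");
entity.setProperty("name", "Alice");
EmbeddedEntity address = new EmbeddedEntity();
address.setProperty("streetAddress", "100 Main Street");
address.setProperty("addressLocality", "Springfield");
address.setProperty("addressRegion", "VA");
entity.setProperty("address", address);

// Store the entity.
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
datastore.put(entity);

// Query on a subproperty of the embedded entity, using a dot-separated path.
Query query = new Query("Person");
FilterPredicate regionFilter =
    new FilterPredicate("address.addressRegion", FilterOperator.EQUAL, "VA");
query.setFilter(regionFilter);

List<Entity> results = datastore.prepare(query)
    .asList(FetchOptions.Builder.withDefaults());

// Expect exactly the one entity stored above.
assertEquals(1, results.size());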

This test is failing; the result set is empty.

Here are my questions:

  1. Am I using FilterPredicate correctly? The documentation does not explain how to reference properties of an EmbeddedEntity. I am guessing that the convention is to use a dot-separated path. But maybe this is not correct.
  2. Does my test case need to declare indexes for subproperties within the embedded address entity? If so, how?

The Datastore documentation contains the following statement:

"When an embedded entity is included in indexes, you can query on subproperties."

I am following the instructions in the article about Local Unit Testing for Java, but there is nothing in the article that explains how to define indexes in a JUnit test.

Answer by Dan McGrath (best answer)

This test is flaky due to eventual consistency.

Because your query (SELECT * FROM Person WHERE address.addressRegion = "VA") is not an ancestor query, it is served from eventually consistent indices. The insert and the query are not guaranteed to hit the same replica, nor is the index entry for address.addressRegion guaranteed to have been updated by the time the query runs.

By default the embedded entity should be indexed, so that isn't the problem.

Eventual consistency is generally resolved within milliseconds, but since you are writing and then querying immediately, there is an increased chance you'll hit it.

There are two strategies you can employ to reduce the flakiness of the test.

1. Sleeping

Adding a 1 or 2 second sleep between the put and the query will reduce the flakiness of the test, but not eliminate it. It might be a reasonable first step. I didn't run your code, but at a glance it seems correct.
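A variation on the fixed sleep is to re-run the query a bounded number of times with a short pause in between, so the test only waits as long as it has to. The following is a minimal sketch of that idea, not something from the Datastore API: the class name EventualConsistencyPoll and the method pollUntil are made up for illustration, and the query is passed in as a plain Supplier so the helper does not depend on any Datastore types.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical test helper (not part of any Datastore library).
final class EventualConsistencyPoll {

    private EventualConsistencyPoll() {}

    /**
     * Runs the query, and while it returns fewer than minResults rows,
     * sleeps and runs it again, up to maxAttempts runs in total.
     * Returns whatever the last run produced.
     */
    static <T> List<T> pollUntil(Supplier<List<T>> query, int minResults,
                                 int maxAttempts, long sleepMillis) {
        List<T> results = query.get();
        for (int attempt = 1;
             attempt < maxAttempts && results.size() < minResults;
             attempt++) {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop retrying.
                Thread.currentThread().interrupt();
                break;
            }
            results = query.get();
        }
        return results;
    }
}
```

In the test from the question, the supplier would wrap the existing call, for example () -> datastore.prepare(query).asList(FetchOptions.Builder.withDefaults()). Like the plain sleep, this only reduces flakiness: if the index has still not caught up after the last attempt, the assertion fails as before.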

2. Forcing the index write to be applied

Cloud Datastore synchronously writes entities to a majority of replicas, but the indices are applied asynchronously after that step. This is what leads to eventual consistency for some queries.

You can force indices to be applied by performing a read of the entity in question. When an entity is read, the write log for its entity group is checked for outstanding writes; if there are any, they are applied before the read completes. You can use this mechanism in unit tests to reduce eventual consistency issues.
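To make the mechanism concrete, here is a toy in-memory model of it. This is an illustration under simplifying assumptions, not the Datastore API and not how Datastore is implemented: ToyStore and all of its methods are invented names, there is a single write log instead of one per entity group, and nothing is ever applied in the background, so an index query stays stale until a read by key forces the log to be applied.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: every name here is hypothetical.
final class ToyStore {
    // Committed writes that have not been applied yet (key -> addressRegion).
    private final Map<String, String> writeLog = new LinkedHashMap<>();
    // Applied entity data (key -> addressRegion).
    private final Map<String, String> applied = new HashMap<>();
    // Index used by queries (addressRegion -> keys).
    private final Map<String, List<String>> regionIndex = new HashMap<>();

    // A put only records the write in the log; the index is untouched.
    void put(String key, String region) {
        writeLog.put(key, region);
    }

    // A read by key first applies any outstanding writes, then returns the data.
    String get(String key) {
        applyOutstandingWrites();
        return applied.get(key);
    }

    // A query consults only the index, so it can miss unapplied writes.
    List<String> queryByRegion(String region) {
        return new ArrayList<>(
            regionIndex.getOrDefault(region, Collections.emptyList()));
    }

    private void applyOutstandingWrites() {
        for (Map.Entry<String, String> write : writeLog.entrySet()) {
            String key = write.getKey();
            String region = write.getValue();
            String previous = applied.put(key, region);
            if (previous != null) {
                // Drop the stale index entry for the old region.
                regionIndex.get(previous).remove(key);
            }
            regionIndex.computeIfAbsent(region, r -> new ArrayList<>()).add(key);
        }
        writeLog.clear();
    }
}
```

With this model, a put followed immediately by queryByRegion returns nothing, while a put, then a get of the same key, then queryByRegion finds the entity. In the test from the question, the corresponding step would be a call such as datastore.get(entity.getKey()) between the put and the query.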

Misc.

Also, to verify the entity was written as expected, you can jump into the Cloud console and execute the GQL statement from above.