Avoiding contention in AppEngine


I am trying to wrap my head around contention and how it applies to the App Engine stack.

I have a model that is built like so:

from google.appengine.ext import db

class Events(db.Model):
    #Owner Identification Number
    owner_id        = db.StringProperty(required=True)

    #Authentication Token
    auth_token      = db.StringProperty(required=True)

    #generic, b, c, d, ...
    driver          = db.StringProperty(required=True)

    #Start Time and Date
    tStart          = db.DateTimeProperty(auto_now=True)

    #Define whether the event is active or inactive
    active          = db.BooleanProperty(default=False)

    #Payload store, this will store each payload block sent or pulled
    payloads        = db.StringListProperty(indexed=False)

This model holds several events. Each event has an owner and a list of payloads. The owner writes payloads to his event and reads them back, while many other users read from the event. It's sort of a transcription stack.

My question is about contention: will I be affected by it and, if so, how can I restructure the model to prevent it?

Thank you.

There are 4 answers

0
systempuntoout On BEST ANSWER

I don't see any problem with your model:

  1. Events entities will not pay any contention tax because, judging from your description and example, they are just root entities that do not share an entity group with anything else.
  2. Frequent updates to a single entity can cause contention, but I highly doubt that the owner will update any one entity more than once per second (1 QPS is the threshold you have to keep in mind; above that you are in the danger zone).
  3. Datastore read operations do not cause contention problems.
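
To check point 2 against real traffic, a rough sketch like the following could scan a log of writes and flag any entity group that exceeds the roughly 1 write per second guideline. It is plain Python, not App Engine API code; the function name `hot_entity_groups`, the `(group_id, timestamp)` log format and the 10 second window are all assumptions made for illustration.

```python
from collections import defaultdict


def hot_entity_groups(write_log, max_writes_per_second=1.0, window_seconds=10.0):
    """Return the ids of entity groups whose write rate exceeds the limit.

    write_log is an iterable of (group_id, timestamp_in_seconds) pairs.
    The log format and the window size are illustrative assumptions.
    """
    by_group = defaultdict(list)
    for group_id, timestamp in write_log:
        by_group[group_id].append(timestamp)

    hot = set()
    allowed = max_writes_per_second * window_seconds
    for group_id, stamps in by_group.items():
        stamps.sort()
        start = 0
        for end, timestamp in enumerate(stamps):
            # Slide the window start forward so it covers less than window_seconds.
            while timestamp - stamps[start] >= window_seconds:
                start += 1
            if end - start + 1 > allowed:
                hot.add(group_id)
                break
    return hot


# One owner writing every 2 seconds, another writing twice per second.
log = [("event-a", t * 2.0) for t in range(20)]
log += [("event-b", t * 0.5) for t in range(40)]
print(hot_entity_groups(log))
```

Only the event written twice per second is reported; an owner who writes every couple of seconds stays under the threshold.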
0
Peter Knego On

In your case the limitation that applies is the entity write/update limitation, which is 1 write/update per entity (or entity group) per second.

There is no limitation for reads.

It's still a good idea to use memcache for caching reads in order to lower the cost and improve response time. If you use Python NDB then caching is enabled by default.

Solution: IMHO, backends are a good way to increase write throughput while still serving concurrent reads. They are (mostly) always-on instances that you can use as shared memory, so you can batch writes (and flush them via the Task Queue) while serving reads at the same time.

Note: Instances get restarted about once a day, so you cannot treat them as reliable storage. You can use them as a smart cache while asynchronously (via backend threads or the Task Queue) updating the entity in the datastore inside a transaction.
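
A minimal sketch of the batching idea, in plain Python rather than App Engine API code: payloads are collected in instance memory, readers can see the pending ones, and a periodic flush hands each event's batch to a persistence callback. The class name `PayloadBuffer` and the `flush_fn` callback are hypothetical; in a real app `flush_fn` would be whatever appends the batch to the entity in the datastore, for example from a task.

```python
import threading


class PayloadBuffer(object):
    """Collects payloads per event in memory and flushes them in batches."""

    def __init__(self, flush_fn):
        # flush_fn(event_id, payloads) is a hypothetical callback that
        # persists one batch, e.g. with a single datastore write.
        self._flush_fn = flush_fn
        self._pending = {}
        self._lock = threading.Lock()

    def add(self, event_id, payload):
        """Buffer one payload without touching persistent storage."""
        with self._lock:
            self._pending.setdefault(event_id, []).append(payload)

    def read_pending(self, event_id):
        """Return a copy of the payloads not yet flushed for this event."""
        with self._lock:
            return list(self._pending.get(event_id, []))

    def flush(self):
        """Persist every buffered batch; returns the number of events flushed."""
        with self._lock:
            batch, self._pending = self._pending, {}
        for event_id, payloads in batch.items():
            self._flush_fn(event_id, payloads)
        return len(batch)


written = []
buf = PayloadBuffer(lambda event_id, payloads: written.append((event_id, payloads)))
buf.add("e1", "a")
buf.add("e1", "b")
buf.add("e2", "c")
buf.flush()
```

However many payloads arrive between flushes, each event costs one write per flush. As the note above says, anything still in the buffer is lost if the instance restarts or if `flush_fn` fails, so this sketch is a cache in front of the datastore, not a replacement for it.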

0
dragonx On

In App Engine, each instance of an Event is read and written as an entire object, so contention is something to consider per instance of Event. If you must update a single instance of Event frequently, then you may need to worry about contention. If you update different instances, then there's nothing to worry about.

I'm not sure what exactly you mean by contention. You may either be referring to a) transactional integrity or b) limited write performance. You should have no problems with read performance, though you do have the issue of eventual consistency to deal with.

a) If you must read the correct data after an Event instance has been updated, you need to use a datastore get() request by key. A query() request may return old data.

b) If you are worried about write performance, you need to somehow split your entity into multiple entities. You might consider having multiple Payload entities for each Event, something like:

class Payload(db.Model):
    event = db.ReferenceProperty(Events)
    payload = db.StringProperty()

This way you can write each payload separately, but it'll be slightly more expensive since the payloads need to be indexed and you'll need to query them by event to get them. You may want to set the Event as the ancestor so you can use ancestor queries for consistent results, but keep in mind that this puts the payloads in the Event's entity group, so the per-group write limit applies to them as well.

1
dvliman On

I am new to Google App Engine too. Basically, asking how to avoid contention is really asking how to increase write throughput. The solutions I can think of are:

  1. Sacrifice transactions
  2. Batch writes in memcache
  3. Sharded counters
  4. Background task queues

https://developers.google.com/appengine/articles/sharding_counters

https://developers.google.com/appengine/articles/scaling/contention
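
To illustrate the sharding idea from the list above, here is a minimal plain-Python sketch, not the code from the linked article: each increment goes to one randomly chosen shard, and a read sums all shards. The class name `ShardedCounter`, the key format and the dict standing in for the datastore are all assumptions; in a real app each shard would be its own entity.

```python
import random


class ShardedCounter(object):
    """Spreads increments over several shards so no single key is hot."""

    def __init__(self, name, store, num_shards=20):
        self.name = name
        self.store = store  # a dict standing in for the datastore (assumption)
        self.num_shards = num_shards

    def _shard_key(self, index):
        return "%s-shard-%d" % (self.name, index)

    def increment(self, amount=1):
        """Add to one randomly chosen shard."""
        key = self._shard_key(random.randrange(self.num_shards))
        self.store[key] = self.store.get(key, 0) + amount

    def total(self):
        """Sum every shard to get the counter's value."""
        return sum(self.store.get(self._shard_key(i), 0)
                   for i in range(self.num_shards))


store = {}
counter = ShardedCounter("payloads", store, num_shards=5)
for _ in range(100):
    counter.increment()
print(counter.total())
```

With N shards, the writes are spread over N separate keys, which raises the sustainable write rate roughly N times at the cost of N reads to compute the total.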

Any other ideas? I would like to know too!