I'm having a very difficult time understanding the mechanics of the Google App Engine Datastore.
I want to understand the mechanics so I can build my database in an optimal way for the database.
Given my example below, can someone help me in:
- structure the database optimally
- understand the performance of both read and write given that structure
Example:
Let's say I have N baseball players and each has a unique ID.
I'd like to keep a daily tally of the homeruns hit by each player (storing a "total daily homeruns" property) and basically increment it when a homerun is hit.
So, as time increases, I'd like to show a graph of homeruns each day for each baseball player over X years.
Player 1
1/21/2011 - 2 homeruns
1/22/2011 - 0 homeruns
1/23/2011 - 1 homeruns
Read Requirement: Read the last 5 years of daily "homerun" data for a certain player?
Write Requirement: Increment a daily homerun count for a certain baseball player.
I would love to understand how to structure the data and also the mechanics of both read and write? Will this simple storage task scale? Thanks all.
I would model your requirements with a one-to-many relationship like this:
To retrieve all the
DailyHomeruns
of a givenPlayer
you can do it in this way:Read Requirement:
This means that if your last 5 years homeruns query set contains in average 800 entities *, this query performs the same whether it searches over a thousand entities or a million entities.
Write Requirement:
Writes are slow in Google App Engine but your scenario seems quite trivial and I don't see any possible contention/timeout problem; after all you just need to serially update the
DailyHomeruns
incrementing the counter for a small number of times per day.Other thoughts:
If you need to calculate some stats, for example the total number of Homeruns of a given
Player
,don't even think to use GQL for this purpose because it does not provide any aggregate function à laSQL
.Instead, you have to design your database upfront, defining a model to store the total count of Homeruns per player.
Using the transactions API, each time you increment the
DailyHomeruns
you will need to increment the TotalHomeruns entity for that Player.* I have estimated 3 matches per week for 52 weeks multiplied per 5 years