Consider the following models and their properties:
Media
├── Image
│ - CreatedAt
│ - Width
│ - Height
├── Audio
│ - CreatedAt
│ - Duration
└── Video
  - CreatedAt
  - Width
  - Height
  - Duration
There are a few ways of implementing this:
1. Different kinds, different properties
This can be done with PolyModel in ndb. We have a base model Media which factors out the common properties, and Image, Audio and Video can be subclasses.
Since each subclass has its own kind, querying the N newest media is not possible without in-memory sorting.
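To make the in-memory sorting concrete, here is a minimal sketch in plain Python (no Datastore involved). It assumes each kind has already been queried separately for its newest entities, newest first; the dict layout and the helper name newest_across_kinds are made up for illustration.

```python
import heapq
from itertools import islice

def newest_across_kinds(per_kind_results, n):
    """Merge per-kind result lists (each sorted newest first) and keep the n newest."""
    merged = heapq.merge(
        *per_kind_results,
        key=lambda entity: entity["created_at"],
        reverse=True,  # inputs are sorted from newest to oldest
    )
    return list(islice(merged, n))

# Hypothetical query results, one list per kind, each already sorted newest first.
images = [{"kind": "Image", "created_at": 50}, {"kind": "Image", "created_at": 10}]
audio = [{"kind": "Audio", "created_at": 40}]
videos = [{"kind": "Video", "created_at": 60}, {"kind": "Video", "created_at": 20}]

top3 = newest_across_kinds([images, audio, videos], 3)
```

The cost is one query per kind plus the merge, which is why a single shared kind is attractive for this use case.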
2. Same kind, same properties
Implementing this in datastore using single table inheritance is possible, but there is no point in enforcing 'unused' properties in a schemaless database. For example, there will be many rows for Audio entities with unused Width and Height properties.
Querying the N newest media is possible here, but it has the disadvantage of unused properties.
3. Same kind, different properties
Unlike relational databases, datastore does not require entities of the same kind to have the same properties. It is possible for Image, Audio and Video to all share the same Media kind while each has its own set of properties. An extra property called Type is necessary to distinguish them from each other.
Querying the N newest media is possible with this method, and there are no unused properties. But are there any gotchas with this approach? Are we losing any application level schema safety and data integrity with this?
This is exactly what ndb's polymodel was designed for. However, it works differently than you described. Consider the following definitions:
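As a minimal library-free sketch of such a hierarchy (plain Python classes, not ndb): the properties tuples follow the model tree in the question, and class_key is a hypothetical helper standing in for the inheritance path that gets recorded with each entity.

```python
class Media(object):
    """Base model holding the properties shared by every media type."""
    properties = ("created_at",)

    @classmethod
    def class_key(cls):
        # Collect the class names from the root model (Media) down to cls.
        chain = [c.__name__ for c in cls.__mro__ if issubclass(c, Media)]
        return list(reversed(chain))

class Image(Media):
    properties = Media.properties + ("width", "height")

class Audio(Media):
    properties = Media.properties + ("duration",)

class Video(Media):
    properties = Media.properties + ("width", "height", "duration")

image_key = Image.class_key()
video_key = Video.class_key()
```

Each subclass declares only the properties it actually uses, yet all of them trace back to the same root model.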
Inside Datastore, an Image would be stored with the kind Media and with a property named class equal to ["Media", "Image"].
Using the models, you can query for any Media using:
But you can also query for individual types:
Note that the above query gets converted by ndb into the query:
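The effect of that conversion can be sketched without ndb. The sketch assumes entities are plain dicts and that an equality filter on a list-valued property matches when any element equals the value, which is how Datastore treats multi-valued properties. ENTITIES and newest are illustrative names, not part of any library.

```python
# Every entity shares the kind "Media"; "class" holds the inheritance path.
ENTITIES = [
    {"class": ["Media", "Image"], "created_at": 3, "width": 640, "height": 480},
    {"class": ["Media", "Audio"], "created_at": 2, "duration": 12.5},
    {"class": ["Media", "Video"], "created_at": 4, "width": 1280, "height": 720,
     "duration": 30.0},
    {"class": ["Media", "Image"], "created_at": 1, "width": 100, "height": 100},
]

def newest(class_name, n):
    """Return the n newest entities whose "class" list contains class_name."""
    matches = [e for e in ENTITIES if class_name in e["class"]]
    return sorted(matches, key=lambda e: e["created_at"], reverse=True)[:n]

newest_media = newest("Media", 3)   # spans images, audio and video
newest_images = newest("Image", 3)  # images only
```

Because every entity carries "Media" in its class list, the newest-N query over all media stays a single query on one kind, while a per-type query is the same query with one extra filter.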