Does MongoDB duplicate subdocuments with identical data?


I'm completely new to MongoDB and am looking at moving my base persistence code (for many projects) over to it, using JDO as an agnostic layer. So I'm asking this question from the perspective of a Java developer who likes to work with beans as the basic model unit.

My question is about subdocuments and whether they exist independently or are internally consolidated by MongoDB, i.e. if I had a domain structure like this:

Household - collection of Persons

Person
 - name
 - address

Address
 - street
 - postcode

If I had a document for a household it would have multiple Persons but each Person would have the same address.

Would each address be a distinct and separate entity within MongoDB (even though they are the same 'class' and have the same values)? Or does Mongo somehow identify that they refer to the same entity and internally store a UID for each Address?

More importantly, if I update the postcode for one address, does that mean that the address subdocument of every member of the Household would reflect that change?

If it does, that seems to stray into the relational sphere, but without such referencing I can see horrible inefficiencies arising.


There are 2 answers

evanchooly

No, Mongo will not deduplicate those subdocuments for you. If you want to normalize that data, you'll need to save those addresses into a different collection (ideally) and store DBRefs to those documents when you save the enclosing documents. Using something like Morphia or Spring Data can help manage those references for you.
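To make the normalized layout concrete, here is a minimal plain-Java sketch. It is only an in-memory stand-in: the maps play the role of collections, and the names `addr-1`, `addressId` and `postcodesAfterUpdate` are made up for illustration. It does not use the MongoDB driver, Morphia or Spring Data, and a real DBRef would be resolved by the driver or mapping library rather than by a map lookup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for two collections. Nothing here is the MongoDB driver API.
class ReferencedAddressSketch {

    // Builds an address "document" (hypothetical field names).
    static Map<String, Object> address(String street, String postcode) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("street", street);
        doc.put("postcode", postcode);
        return doc;
    }

    // Builds a person "document" that stores only the id of its address.
    static Map<String, Object> person(String name, String addressId) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("name", name);
        doc.put("addressId", addressId);
        return doc;
    }

    // Resolves the reference by looking the id up in the addresses "collection".
    static String postcodeOf(Map<String, Object> person,
                             Map<String, Map<String, Object>> addresses) {
        Map<String, Object> addr = addresses.get((String) person.get("addressId"));
        return (String) addr.get("postcode");
    }

    // Updates the single stored address, then reads the postcode through each person.
    static List<String> postcodesAfterUpdate(String newPostcode) {
        Map<String, Map<String, Object>> addresses = new HashMap<>();
        addresses.put("addr-1", address("1 High Street", "AB1 2CD"));

        List<Map<String, Object>> household = new ArrayList<>();
        household.add(person("Ann", "addr-1"));
        household.add(person("Bob", "addr-1"));

        // One write to the one stored address.
        addresses.get("addr-1").put("postcode", newPostcode);

        List<String> seen = new ArrayList<>();
        for (Map<String, Object> p : household) {
            seen.add(postcodeOf(p, addresses));
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(postcodesAfterUpdate("ZZ9 9ZZ"));
    }
}
```

Because both persons hold only the id, the single update is visible through each of them. The cost is the extra lookup on every read, which is the trade-off against embedding.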

DataNucleus

If persisting data via JDO, you have the choice of embedding the Person+Address into Household, or persisting them as individual objects (just like you do with an RDBMS). If storing as not-embedded, then it's up to you whether you have multiple copies of the same Person, or a single one referred to by multiple Households. If storing as embedded, then they are embedded and so part of the Household, hence the info is duplicated.
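The difference can be shown with plain beans. This is only an analogy under stated assumptions: `embeddedHousehold` gives each Person its own copy of the Address, which is roughly what ends up in the document when the address is embedded, while `sharedHousehold` lets every Person point at one Address object, which is analogous to a single separately persisted Address. All class and method names are made up for illustration, and the JDO metadata that actually selects embedded or non-embedded storage is not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-bean analogy only; no JDO or MongoDB API is used here.
class EmbeddedVersusSharedSketch {

    static class Address {
        String street;
        String postcode;

        Address(String street, String postcode) {
            this.street = street;
            this.postcode = postcode;
        }

        // Independent copy with the same values.
        Address copy() {
            return new Address(street, postcode);
        }
    }

    static class Person {
        final String name;
        final Address address;

        Person(String name, Address address) {
            this.name = name;
            this.address = address;
        }
    }

    // "Embedded": every person carries a separate copy of the address data.
    static List<Person> embeddedHousehold(Address template, String... names) {
        List<Person> persons = new ArrayList<>();
        for (String n : names) {
            persons.add(new Person(n, template.copy()));
        }
        return persons;
    }

    // "Not embedded": every person refers to the same single address.
    static List<Person> sharedHousehold(Address shared, String... names) {
        List<Person> persons = new ArrayList<>();
        for (String n : names) {
            persons.add(new Person(n, shared));
        }
        return persons;
    }

    public static void main(String[] args) {
        List<Person> embedded =
                embeddedHousehold(new Address("1 High Street", "AB1 2CD"), "Ann", "Bob");
        embedded.get(0).address.postcode = "ZZ9 9ZZ";
        // Bob's copy is untouched by the change to Ann's copy.
        System.out.println(embedded.get(1).address.postcode);

        List<Person> shared =
                sharedHousehold(new Address("1 High Street", "AB1 2CD"), "Ann", "Bob");
        shared.get(0).address.postcode = "ZZ9 9ZZ";
        // Bob sees the change because there is only one address.
        System.out.println(shared.get(1).address.postcode);
    }
}
```

With the embedded layout, changing a postcode for the whole household means updating every copy; with the shared layout it is one update.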