If I have 2 million rows in a db4o database, would you recommend a large flat table, or a hierarchy?

I have 2 million rows in a flat db4o table. A lot of the information is repeated - for example, the first column has only three possible strings.

I could easily break the table into a 4-tier hierarchy (i.e. navigate from root >> symbol >> date >> final table) - but is this worth it from a speed and software maintenance point of view?

If it turns out that it would be cleaner to break the table into a hierarchy, any recommendations on a good method to achieve this within the current db4o framework?

Answers to questions

To actually answer your question, I would need more information. What kind of information do you store?

I'm storing objects containing strings and doubles. The hierarchy is, in concept, exactly like a file system with directories, sub-directories, and sub-sub-directories: a single root node contains a collection of sub-nodes, and each sub-node in turn contains a collection of sub-sub-nodes, etc. Here is an example of the code:

// rootNode---|
//            sub-node 1----|
//                          |-----sub-sub-node 1
//                          |-----sub-sub-node 2
//                          |-----sub-sub-node 3
//                          |-----sub-sub-node X (others, N elements)
//            sub-node 2----|
//                          |-----sub-sub-node 1
//                          |-----sub-sub-node 2
//                          |-----sub-sub-node 3
//                          |-----sub-sub-node X (others, N elements)
//            sub-node 3----|
//                          |-----sub-sub-node 1
//                          |-----sub-sub-node 2
//                          |-----sub-sub-node 3
//                          |-----sub-sub-node X (others, N elements)
//            sub-node X (others, N elements)
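using System.Collections.Generic;

// Top level: holds the collection of sub-nodes.
class rootNode
{
  public IList<subNode> subNodeCollection = new List<subNode>();
  public string rootNodeParam;
}
// Middle level: holds the collection of sub-sub-nodes.
class subNode
{
  public IList<subSubNode> subSubNodeCollection = new List<subSubNode>();
  public string subNodeParam;
}
// Leaf level: holds only its own parameter.
class subSubNode
{
  public string subSubNodeParam;
}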

// Now, we have to work out a way to create a query that filters 
// by rootNodeParam, subNodeParam and subSubNodeParam.

And what are the access patterns of your data? Are you reading single objects by a query / search? Or are you reading a lot of objects which are related to each other?

I'm trying to navigate down the tree, filtering by parameters as I go.

In general, db4o (and other object databases) are good at navigational access. This means that you first query for some objects, and from there you navigate to related objects. For example, you first query for a user object. From there you navigate to the user's home, city, job, friends, etc. That kind of access works great in db4o.

This is exactly what I'm trying to do, and exactly what works well in db4o if you only have 1-to-1 references between an object and its child objects. If you have 1-to-many, by holding a collection of child objects within a class, it can't do a query without instantiating the whole tree. Or am I misled on this one?

So in your case the 4-tier hierarchy can work great with db4o, but only when you can navigate from the root to the symbol object and so on. That means that the root object has a collection of its 'children' objects.

Yes - but is there any way to do a query, if each subNode contains a collection?

Gamlor On BEST ANSWER

As Sam Stainsby already pointed out in his comment, db4o doesn't have the notion of tables. It stores objects, and the object is db4o's unit of storage. Don't try to think in terms of tables; that doesn't really work with db4o.

As you said, you repeat information, so that's a good candidate to be separated into other objects, which can then be referenced by other objects. In general, I would first design a good domain model, to be aware of how the data is organized and related, and think about what kind of data access patterns you have. Then try to find out how you can design your classes/objects in a way which works with db4o.
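To make the "separate the repeated information into other objects" idea concrete, here is a minimal sketch in plain C#. It does not use any db4o API, and the names `Symbol`, `Row` and `SymbolRegistry` are hypothetical, not taken from the question. It assumes the repeated first column (the three possible strings) is the symbol, and shows each row referencing one shared object instead of carrying its own copy of the string.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class: one instance per distinct value of the repeated column.
public class Symbol
{
    public string Name;
    public Symbol(string name) { Name = name; }
}

// Hypothetical class: a row that references a shared Symbol
// instead of repeating the string itself.
public class Row
{
    public Symbol Symbol;
    public DateTime Date;
    public double Value;
}

// Hypothetical helper: hands out the single shared Symbol for each name.
public static class SymbolRegistry
{
    private static readonly Dictionary<string, Symbol> Known =
        new Dictionary<string, Symbol>();

    // Returns the existing Symbol for this name, creating it on first use.
    public static Symbol For(string name)
    {
        Symbol symbol;
        if (!Known.TryGetValue(name, out symbol))
        {
            symbol = new Symbol(name);
            Known[name] = symbol;
        }
        return symbol;
    }
}
```

With this shape, every `Row` for the same symbol points at the same `Symbol` instance, so the relationship can be followed in either direction once you add a collection on the `Symbol` side. How the shared objects are stored and looked up in db4o itself is left out here.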

To actually answer your question, I would need more information. What kind of information do you store? And what are the access patterns of your data? Are you reading single objects by a query / search? Or are you reading a lot of objects which are related to each other?

In general, db4o (and other object databases) are good at navigational access. This means that you first query for some objects, and from there you navigate to related objects. For example, you first query for a user object. From there you navigate to the user's home, city, job, friends, etc. That kind of access works great in db4o.

So in your case the 4-tier hierarchy can work great with db4o, but only when you can navigate from the root to the symbol object and so on. That means that the root object has a collection of its 'children' objects.
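As a rough illustration of navigating from the root down to the final entries, here is a sketch in plain C# with LINQ over in-memory collections. The classes `Root`, `SymbolNode`, `DateNode`, `Entry` and the helper `Navigation.Find` are hypothetical names chosen to mirror the root >> symbol >> date >> final table layout from the question. No db4o calls are shown: fetching the root object from the database, and how far child objects get loaded as you navigate (db4o's activation settings), are assumed to be handled separately.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical leaf object: one entry of the "final table".
public class Entry
{
    public double Value;
}

// Hypothetical third tier: all entries for one date.
public class DateNode
{
    public DateTime Date;
    public IList<Entry> Entries = new List<Entry>();
}

// Hypothetical second tier: all dates for one symbol.
public class SymbolNode
{
    public string Name;
    public IList<DateNode> Dates = new List<DateNode>();
}

// Hypothetical first tier: the single root holding all symbols.
public class Root
{
    public IList<SymbolNode> Symbols = new List<SymbolNode>();
}

public static class Navigation
{
    // Walks root -> symbol -> date, filtering at each level.
    // Returns an empty sequence when nothing matches.
    public static IEnumerable<Entry> Find(Root root, string symbolName, DateTime date)
    {
        return root.Symbols
            .Where(s => s.Name == symbolName)
            .SelectMany(s => s.Dates)
            .Where(d => d.Date == date)
            .SelectMany(d => d.Entries);
    }
}
```

Because each level is filtered before descending, only the branches that match the symbol and the date are walked in this sketch. Whether db4o materialises more of the tree than that depends on its activation configuration, which is outside what this example covers.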

Btw: If you feel that it is more natural to think in terms of tables for your data, then I recommend using a relational database. Relational databases are awesome at dealing with tables.