Multimodel database vs multiple individual databases?

I am working on an application that needs features of both a graph database (to store raw data) and a document database (to store reports extracted from the raw data). I planned to use Neo4j and MongoDB, but I am having second thoughts and am now looking at OrientDB. Is it better to have a single multimodel database than two separate databases? I leaned towards Neo4j because of its native graph storage, which might help with memory locality for large graphs. OrientDB doesn't store graphs natively, or does it?

There are 2 answers

Lvca (best answer)

OrientDB stores graphs natively. Its engine is 100% a graph database, like Neo4j's. In fact, OrientDB and Neo4j are the only graph databases with index-free adjacency. Other graph databases act as a layer on top of an existing model (RDBMS, column, or document stores).

So there is nothing you can do with Neo4j that you can't do with OrientDB. But OrientDB also lets you model more complex data, as a document DBMS such as MongoDB does. For example, every vertex and edge in OrientDB is a document (JSON), so you can store complex types in vertices and edges: embedded properties, lists, sets, dates, decimals, etc.
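To make the "vertex and edge as document" idea concrete, here is a minimal sketch in plain Python. It does not use any OrientDB API; the dicts only stand in for document records, and every field name, class name, and record id (`person`, `works_at`, `"person:1"`, and so on) is an assumption made up for illustration.

```python
from datetime import date
from decimal import Decimal

# Hypothetical vertex record: a plain dict standing in for a document.
person = {
    "class": "Person",                             # illustrative class name
    "name": "Ada",
    "born": date(1990, 5, 17),                     # date value
    "balance": Decimal("1024.50"),                 # decimal value
    "tags": {"admin", "beta"},                     # set
    "phones": ["555-0100", "555-0101"],            # list
    "address": {"city": "Rome", "zip": "00100"},   # embedded document
}

# Hypothetical edge record: the edge carries its own properties too.
works_at = {
    "class": "WorksAt",
    "out": "person:1",      # made-up id of the source vertex
    "in": "company:7",      # made-up id of the target vertex
    "since": date(2015, 1, 1),
    "roles": ["engineer", "lead"],
}

def embedded_city(vertex):
    """Read a field from the embedded 'address' document of a vertex."""
    return vertex["address"]["city"]
```

The point of the sketch is only that a vertex or an edge can hold nested and typed values directly, instead of being limited to flat scalar properties.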

Sebastian Good

Don't be dazzled by terminology. "Index-free adjacency" is a term that simply means graph vertices are stored "with" their edges. Each database does this in a slightly different way. Neo4j stores them on disk in a linked list. If they are in memory, and there aren't too many of them, they're fast. If you have to hit them on disk, then you may need an index. Titan stores them as columns in a wide-column database such as Cassandra. If they're in memory, they're fast. If you have to hit them on disk, the underlying database's range queries make them fast to load in bulk, and extra indexing can decrease the cost of searching large edge lists.
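As a rough illustration of what "stored with their edges" means, the sketch below contrasts two in-memory layouts in plain Python. It is not how Neo4j, OrientDB, or Titan lay out data on disk; the `Vertex`, `Edge`, and `edge_index` names are hypothetical and exist only for this example.

```python
from collections import defaultdict

class Vertex:
    def __init__(self, name):
        self.name = name
        self.out_edges = []          # edges are kept "with" the vertex

class Edge:
    def __init__(self, label, src, dst):
        self.label = label
        self.src = src
        self.dst = dst
        src.out_edges.append(self)   # attach the edge to its source vertex

def neighbors_adjacent(vertex):
    """Follow references held by the vertex; no graph-wide lookup needed."""
    return [e.dst.name for e in vertex.out_edges]

# Contrast: edges live in one shared structure and are found by key lookup.
edge_index = defaultdict(list)       # source name -> [(label, target name)]

def add_indexed_edge(label, src_name, dst_name):
    edge_index[src_name].append((label, dst_name))

def neighbors_indexed(src_name):
    """Find neighbors through the shared index keyed by source name."""
    return [dst for _, dst in edge_index[src_name]]

# Build the same tiny graph both ways.
a, b, c = Vertex("a"), Vertex("b"), Vertex("c")
Edge("knows", a, b)
Edge("knows", a, c)
add_indexed_edge("knows", "a", "b")
add_indexed_edge("knows", "a", "c")
```

Both functions return the same neighbors here. The difference only matters at scale: the first walks data that sits next to the vertex, while the second depends on how cheap the shared lookup is.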

This discussion is fairly valuable: "How does Titan achieve constant time lookup using HBase / Cassandra?"

Whether you're using OrientDB or any other database, your efficiency at graph queries will rely in large part on the indexing you put in place so that you start your graph queries on, and traverse through, a relatively small set of nodes. Be sure to model some of the queries you're doing to make sure that whatever database you choose will support the right indexes, whether they're across the whole graph, or local to each vertex.
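The two kinds of index mentioned above can be sketched in plain Python as follows. This is a toy model under stated assumptions, not any database's API: `Graph.by_key` plays the role of a graph-wide index used to find the starting vertex, and `Vertex.edges_by_label` plays the role of a vertex-local index that keeps each hop from scanning unrelated edges. All names are hypothetical.

```python
from collections import defaultdict

class Vertex:
    def __init__(self, key):
        self.key = key
        # Vertex-local index: edge label -> list of neighbor vertices.
        self.edges_by_label = defaultdict(list)

class Graph:
    def __init__(self):
        self.by_key = {}             # graph-wide index: key -> vertex

    def add_vertex(self, key):
        v = Vertex(key)
        self.by_key[key] = v
        return v

    def add_edge(self, src, label, dst):
        self.by_key[src].edges_by_label[label].append(self.by_key[dst])

    def traverse(self, start_key, labels):
        """Start from an indexed lookup, then follow one edge label per hop."""
        frontier = [self.by_key[start_key]]
        for label in labels:
            frontier = [n for v in frontier
                        for n in v.edges_by_label.get(label, [])]
        return [v.key for v in frontier]

g = Graph()
for key in ("alice", "bob", "carol", "acme"):
    g.add_vertex(key)
g.add_edge("alice", "knows", "bob")
g.add_edge("alice", "knows", "carol")
g.add_edge("alice", "likes", "acme")
g.add_edge("bob", "works_at", "acme")

# Where do the people alice knows work?
employers = g.traverse("alice", ["knows", "works_at"])
```

In this toy graph the query touches only alice, her two "knows" neighbors, and one employer, which is the "relatively small set of nodes" the answer is asking you to aim for when you model your own queries.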