ArangoDB comparison of documents in different databases

I'd like to know whether it's possible to compare two documents with the same "_id" (same collection name and "_key") that are stored in different databases.

My use case is a custom "map / layout engine" that is "mainly" fed by "automatic import / conversion jobs" from an external geo-data system.

So far that works fine.

In some cases, however, it's necessary to manually adjust e.g. the "x/y"-coordinates of some objects to make them more usable. When the import job is run again (e.g. to fetch the latest data), all manual adjustments are lost, as they're simply overwritten by the "auto" data.

Therefore I'm thinking of a system setup consisting of several identically structured ArangoDB databases, used for different "stages" of the data lifecycle, like:

  • "staging" - newly "auto imported" data is placed here.
  • "production" - the "final data" that's presented to the user including all the latest manual adjustments is stored here.

The corresponding (simplified) lifecycle would look like this:

  1. Auto-import into "staging"
  2. Compare and import all manual adjustments from "production" into "staging"
  3. Deploy "merged" contents from 1. and 2. as the new "production" version.

So, this topic is all about step 2's "comparison phase" between the "production" and the "staging" data values.
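
To make the comparison phase concrete, here is a minimal sketch in plain Python. It assumes the documents from "staging" and "production" have already been fetched separately (one query per database) into lists of dicts; the function names and the sample values are hypothetical and only illustrate matching by "_key" and reporting differing "x"/"y" values.

```python
def index_by_key(docs):
    """Map each document's _key to the document itself."""
    return {doc["_key"]: doc for doc in docs}


def diff_documents(staging_docs, production_docs, fields=("x", "y")):
    """Return {_key: {field: (staging_value, production_value)}} for
    documents present in both inputs whose listed fields differ."""
    staging = index_by_key(staging_docs)
    production = index_by_key(production_docs)
    differences = {}
    # Only keys that exist on both sides can be compared.
    for key in staging.keys() & production.keys():
        changed = {
            field: (staging[key].get(field), production[key].get(field))
            for field in fields
            if staging[key].get(field) != production[key].get(field)
        }
        if changed:
            differences[key] = changed
    return differences


# Hypothetical sample data standing in for the two "layout" collections.
staging_docs = [
    {"_key": "a", "x": 10, "y": 20},
    {"_key": "b", "x": 5, "y": 5},
]
production_docs = [
    {"_key": "a", "x": 12, "y": 20},  # x was adjusted manually
    {"_key": "b", "x": 5, "y": 5},
]

print(diff_documents(staging_docs, production_docs))
# prints {'a': {'x': (10, 12)}}
```

Note that a difference alone does not say whether it comes from a manual adjustment or from changed source data; telling those apart would need extra information, such as a marker on manually edited documents.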

In SQL I'd express it with something like this:

SELECT
layoutA.x, layoutA.y, layoutB.x, layoutB.y
FROM databaseA.layout AS layoutA
JOIN databaseB.layout AS layoutB ON (layoutA.id = layoutB.id)
WHERE
...

Thanks for any hints on how to solve this in ArangoDB using an AQL query or a Foxx service!

There is 1 answer

Answer by adityamukho:

Hypothetically, if you had a versioning graph database handy, you could do the following:

  1. On first import, insert new data creating a fresh revision R0 for each inserted node.
  2. Manually change some fields of a node in this data, say N, giving rise to a new revision of N, say R1. Your previous version R0 is not lost, though.
  3. Repeat steps 1 and 2 as many times as you like.

Finally, when you need to show this data to the end user, use custom application logic to merge as many previous versions as you want with the current version, doing an n-way merge rather than a 2-way merge.
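
As a rough illustration of such merge logic (not CivicGraph's actual API), the sketch below folds a list of revisions into one document. It assumes each revision is a dict with a hypothetical "source" ("import" or "manual") and a "changes" mapping, ordered oldest first, and that manually set fields should win over later imports.

```python
def merge_revisions(revisions):
    """Fold revisions (oldest first) into a single document.

    Fields that were ever set manually keep their latest manual value,
    even if a later import revision provides a different one.
    """
    merged = {}
    manual_fields = set()
    for revision in revisions:
        for field, value in revision["changes"].items():
            if revision["source"] == "manual":
                merged[field] = value
                manual_fields.add(field)
            elif field not in manual_fields:
                # Imported values only apply to fields never touched by hand.
                merged[field] = value
    return merged


# Hypothetical revision history of one node N.
revisions = [
    {"source": "import", "changes": {"x": 10, "y": 20, "name": "A"}},  # R0
    {"source": "manual", "changes": {"x": 12}},                        # R1
    {"source": "import", "changes": {"x": 11, "y": 21, "name": "A"}},  # re-import
]

print(merge_revisions(revisions))
# prints {'x': 12, 'y': 21, 'name': 'A'}
```

Here the manually adjusted "x" survives the re-import, while "y" follows the latest imported value. Other precedence rules (for example, letting a newer import override an old manual edit) would be equally valid choices for the application logic.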

If you think this could be a potential solution, you can take a look at CivicGraph, which is a version control layer built on top of ArangoDB.

Note: I am the creator of CivicGraph, and this answer could qualify as a promotion for the product, but I also believe it could help solve your problem.