Merge Two TinyDB Databases


In Python, I'm trying to merge multiple JSON files produced by TinyDB.

I could not find a way to directly merge two TinyDB JSON files so that the autogenerated keys continue in sequence instead of restarting at 1 when the next file is opened.

In code terms, I want to merge a large amount of data like this:

hello1 = {"1": "bye", "2": "good", ..., "20000": "goodbye"}

hello2 = {"1": "dog", "2": "cat", ..., "15000": "monkey"}

into:

hello3 = {"1": "bye", "2": "good", ..., "20000": "goodbye", "20001": "dog", "20002": "cat", ..., "35000": "monkey"}
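For illustration, the renumbering described above can be sketched in plain Python, without TinyDB. The function name `merge_renumbered` and the three-entry sample dicts are hypothetical stand-ins for the real data; the sketch assumes every key is an integer or a numeric string.

```python
def merge_renumbered(first, second):
    """Return a new dict holding `first`, followed by `second` re-keyed to continue the sequence."""
    # Normalise keys to strings, as TinyDB stores them in its JSON file.
    merged = {str(key): value for key, value in first.items()}
    # Numbering for the second dict starts right after the highest existing key.
    next_id = max((int(key) for key in merged), default=0) + 1
    # Walk the second dict in numeric key order so the original ordering is kept.
    for key in sorted(second, key=int):
        merged[str(next_id)] = second[key]
        next_id += 1
    return merged


# Small stand-ins for the 20000- and 15000-entry dicts in the question.
hello1 = {"1": "bye", "2": "good", "3": "goodbye"}
hello2 = {"1": "dog", "2": "cat", "3": "monkey"}
hello3 = merge_renumbered(hello1, hello2)
```

Here `hello3` keeps keys "1" to "3" from `hello1` and stores the `hello2` values under "4" to "6".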

Because I could not find the right way to do this with TinyDB, I opened the files as plain JSON instead, loading each file and then doing:

Data = Data['_default']

The problem is that the code currently works, but it has serious memory problems. After a few seconds the merged DB contains about 28 MB of data, but then (probably) the cache saturates, and the remaining data is added very slowly.

So I need to empty the cache after a certain amount of data, or I probably need to change the way I do this!

This is the code I use:

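import json

from tinydb import TinyDB

Try1 = TinyDB('FullDB.json')
# Empty the target DB; this must come after Try1 is created.
# In TinyDB 4 and later this method is called truncate().
Try1.purge()

with open('FirstDataBase.json') as Part1:
    Datapart1 = json.load(Part1)
    Datapart1 = Datapart1['_default']

    # Keys run from "1" to str(len(Datapart1)), so the upper bound needs + 1
    for dets in range(1, len(Datapart1) + 1):
        Try1.insert(Datapart1[str(dets)])

with open('SecondDatabase.json') as Part2:
    Datapart2 = json.load(Part2)
    Datapart2 = Datapart2['_default']

    for dets in range(1, len(Datapart2) + 1):
        Try1.insert(Datapart2[str(dets)])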

There is 1 answer

Answered by stovfl

Question: Merge Two TinyDB Databases ... probably I need to change the way to do this!


From the TinyDB documentation, section "Why Not Use TinyDB?":
...
You are really concerned about performance and need a high speed database.

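Inserting rows one at a time into a DB is always slow. With TinyDB's default JSON storage, every insert() call rewrites the whole file, so try db.insert_multiple(...) instead.
Of the two variants shown here, the second one, which uses a generator, lets you keep the memory footprint down.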

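# From a list: one call inserts every document loaded from the first file
Try1.insert_multiple(list(Datapart1.values()))

or

# From a generator function: generator() stands for any function
# that yields one document (dict) at a time
Try1.insert_multiple(generator())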
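As a rough sketch of what such a generator function could look like for the files in the question: the name `iter_documents` is hypothetical, and the code assumes each file has the usual TinyDB layout with a `_default` table keyed by numeric strings. Note that `json.load` still reads one whole source file into memory at a time; the generator only avoids building a combined list of all documents.

```python
import json


def iter_documents(*paths):
    """Yield every document from the `_default` table of each TinyDB JSON file, in order."""
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            # Only the table of the file currently being processed is held in memory.
            table = json.load(handle)["_default"]
        # Document IDs are stored as strings, so sort them numerically.
        for doc_id in sorted(table, key=int):
            yield table[doc_id]
```

With the names from the question, this would be called as `Try1.insert_multiple(iter_documents('FirstDataBase.json', 'SecondDatabase.json'))`. TinyDB assigns fresh document IDs on insertion, so the documents from the second file continue the numbering of the first.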