I have several image datasets that I need to classify with NVIDIA DIGITS, but importing this much data into DIGITS takes a long time (2-4 days). Suppose I have already converted two image sets into .lmdb form, like this:
data1
--> folder train1_db: data.mdb, lock.mdb
--> folder val1_db: data.mdb, lock.mdb
--> mean.binaryproto
--> some other txt files...
data2
--> folder train2_db: data.mdb, lock.mdb
--> folder val2_db: data.mdb, lock.mdb
--> mean.binaryproto
--> some other txt files...
Now I want to concatenate these two .lmdb databases to save time. I did that separately in Python, following "Merge two LMDB databases for feeding to the network (caffe)", and I now have a third dataset containing train_db and val_db folders, each with data.mdb and lock.mdb files like above:
data3
--> folder train3_db: train1_db + train2_db
--> folder val3_db: val1_db + val2_db
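A merge of this kind can be sketched as follows. This is only a minimal sketch, not the script from the linked question: it assumes the third-party `lmdb` Python package, and the helper names `rekey` and `merge_lmdb`, the `map_size` and `batch` defaults, and the key scheme (a zero-padded running index placed in front of the old key) are illustrative assumptions.

```python
import itertools


def rekey(*sources):
    """Yield (key, value) pairs from all sources under fresh, unique keys.

    Each source is an iterable of (key, value) byte pairs. The new key is an
    8-digit running index followed by the old key. This scheme is an
    assumption, chosen only so entries from different databases cannot collide.
    """
    for index, (old_key, value) in enumerate(itertools.chain(*sources)):
        new_key = "{:08d}_".format(index).encode("ascii") + old_key
        yield new_key, value


def merge_lmdb(src_paths, dst_path, map_size=1 << 40, batch=1000):
    """Copy every record of the source LMDBs into a new LMDB at dst_path.

    Returns the number of records written.
    """
    import lmdb  # third-party package, imported here so rekey() works without it

    envs = [lmdb.open(path, readonly=True, lock=False) for path in src_paths]
    read_txns = [env.begin() for env in envs]
    dst = lmdb.open(dst_path, map_size=map_size)

    written = 0
    txn = dst.begin(write=True)
    for key, value in rekey(*(t.cursor() for t in read_txns)):
        txn.put(key, value)
        written += 1
        if written % batch == 0:
            # Commit in batches so a single transaction does not grow too large.
            txn.commit()
            txn = dst.begin(write=True)
    txn.commit()

    for t in read_txns:
        t.abort()  # read-only transactions are simply released
    for env in envs + [dst]:
        env.close()
    return written
```

With the folder names above, a call would look like `merge_lmdb(["train1_db", "train2_db"], "train3_db")`. The records are copied in order, so the merged database is not shuffled.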
I need to import these into DIGITS so that I can train a network on them.
My questions are:
1- Should I import the train_db and val_db folders in the "image LMDB" part?
2- I searched for "label LMDB" but did not understand what I should do in that part. Could you please explain clearly what I should do?
Many thanks for your help.
You have to create the LMDB files the same way DIGITS does. I read theirs first, then wrote mine to match.
This works if you are changing an existing classification dataset with the same class structure. You do have to edit the pickle file to update the total number of images for both train and val, in 2 places. You have to generate the lmdb files just like they have them.
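The image totals for the pickle file can be read from the databases themselves rather than counted by hand. Here is a minimal sketch, assuming the third-party `lmdb` Python package. `count_entries` is a hypothetical helper name, and the sketch does not touch the pickle file itself.

```python
def count_entries(db_path):
    """Return the number of records stored in the LMDB folder at db_path."""
    import lmdb  # third-party package (pip install lmdb)

    # Open read-only and without locking, since nothing is written here.
    env = lmdb.open(db_path, readonly=True, lock=False)
    try:
        return env.stat()["entries"]
    finally:
        env.close()
```

Calling it once on the train folder and once on the val folder gives the two totals to put into the pickle file.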
By the way, of course they don't recommend this. Check out: https://github.com/NVIDIA/DIGITS/issues/1035
Here is my code: https://github.com/GemHunt/lmdb-testing/blob/master/create_lmdb_rotate_whole_image.py