I have a large CSV file containing 15 columns and approximately 1 million rows. I want to parse the data into TinyDB. The code I use is below:
import csv
from tinydb import TinyDB
db = TinyDB('db.monitor')
table = db.table('Current')
i=0
datafile = open('newData.csv', 'rb')
data=csv.reader(datafile, delimiter = ';')
for row in data:
    table.insert({'WT_ID': row[0], 'time': row[1], 'MeanCurrent': row[2],
                  'VapourPressure': row[3], 'MeanVoltage': row[4],
                  'Temperature': row[5], 'Humidity': row[6], 'BarPressure': row[7],
                  'RPM': row[8], 'WindSector': row[9], 'WindSpeed': row[10],
                  'AirDensity': row[12], 'VoltageDC': row[13], 'PowerSec': row[14],
                  'FurlingAngle': row[15]})
    i = i + 1
    print i
However, it really takes forever. I set the i variable to track the progress, and while the first lines run fast, it has now been more than an hour and it has only parsed about 10,000 lines, at a pace of roughly 1 Hz.
I couldn't find anything similar, so any help would be appreciated.
Thank you
Is TinyDB the best choice? You seem to need a transactional database, and TinyDB is document-oriented. On top of that, the documentation itself has a section on this: Why not use TinyDB?
Your process runs really slowly because you are accumulating the data in RAM. As a workaround, you could split your CSV into smaller chunks and feed your script with them, so the memory can be cleared between each iteration.
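Something along these lines, for instance. This is only a sketch: the chunk size and the column-to-field mapping are my assumptions, so adjust them to your file (your original code skips column 11, for example). insert_multiple is TinyDB's batch insert, so the database file is written once per chunk instead of once per row:

import csv
from tinydb import TinyDB

# Assumption: 10,000 rows per chunk; tune this to your available memory.
CHUNK_SIZE = 10000

# Assumption: the CSV columns map to these fields in order;
# adjust the list (or skip columns) to match your file.
FIELDS = ['WT_ID', 'time', 'MeanCurrent', 'VapourPressure', 'MeanVoltage',
          'Temperature', 'Humidity', 'BarPressure', 'RPM', 'WindSector',
          'WindSpeed', 'AirDensity', 'VoltageDC', 'PowerSec', 'FurlingAngle']

db = TinyDB('db.monitor')
table = db.table('Current')

with open('newData.csv', 'rb') as datafile:  # 'rb' as in your Python 2 code
    reader = csv.reader(datafile, delimiter=';')
    chunk = []
    for i, row in enumerate(reader, 1):
        chunk.append(dict(zip(FIELDS, row)))
        if len(chunk) >= CHUNK_SIZE:
            # one write to the database per chunk instead of one per row
            table.insert_multiple(chunk)
            chunk = []
            print(i)
    if chunk:
        table.insert_multiple(chunk)  # flush the last partial chunk

Batching like this mainly reduces how often TinyDB has to serialize and write its storage; it does not turn TinyDB into a database suited for millions of rows.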
Still, TinyDB is simply not able to manage this amount of information.