I'm using SQLite with R (the package RSQLite). After importing RSQLite, I'm trying to pipe 500 GB of data into an SQL table created with the following standard command:
CREATE TABLE name (
field1 INTEGER,
field2 VARCHAR,
field3 INTEGER,
field4 INTEGER,
field5 VARCHAR
);
However, loading the data takes approximately 3 weeks to complete. How does one speed this up? Are other options commonly used?
I am trying to perform around 100 queries in under one second on 20 billion rows with 5 columns. I am not using multiple commodity machines, but rather a single HPC server with 200 threads and 1 TB of RAM. What would be the correct approach?