I feel like this is a very basic question, but I cannot find a comprehensive answer to it. I am wondering what the best practice is for copying Python data structures, such as lists or dictionaries, into a PostgreSQL database (using psycopg2).
Assume I have (or create) a table that I want to fill with data from a previous calculation, for example:
import random

data = [(n, random.randint(0, 100)) for n in range(0, 100)]
As far as I know, the standard way to copy the data would be something like
curs.executemany("""
INSERT INTO my_table (id, rand_int)
VALUES (%s, %s)""", data)
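For reference, a minimal end-to-end sketch of this approach might look like the following; the connection string and the my_table definition are assumptions for illustration:
import random

import psycopg2

# Connection parameters are placeholders for illustration.
conn = psycopg2.connect("dbname=mydb user=myuser")
curs = conn.cursor()

curs.execute("CREATE TABLE IF NOT EXISTS my_table (id integer, rand_int integer)")

data = [(n, random.randint(0, 100)) for n in range(0, 100)]

# executemany runs the INSERT once per tuple in data.
curs.executemany("""
    INSERT INTO my_table (id, rand_int)
    VALUES (%s, %s)""", data)
conn.commit()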
I am guessing this will loop through the list and be rather slow. Is there a smarter or more efficient way?
UPDATE:
In the meantime I found this answer, which suggests using something like:
# mogrify returns bytes in Python 3, so decode before joining
args_str = ','.join(cur.mogrify("(%s,%s)", x).decode('utf-8') for x in data)
cur.execute("INSERT INTO my_table VALUES " + args_str)
And Craig suggests in his answer here using copy_from. So my question has changed a little:
What is the most efficient method and how could it be implemented?
For big data sets, use COPY, via psycopg2's copy_from function: http://initd.org/psycopg/docs/cursor.html#cursor.copy_from
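A minimal sketch of how that could look, assuming the my_table layout from the question (the tab separator and the in-memory StringIO buffer are illustration choices, not requirements):
import io
import random

import psycopg2

# Connection parameters are placeholders for illustration.
conn = psycopg2.connect("dbname=mydb user=myuser")
curs = conn.cursor()

data = [(n, random.randint(0, 100)) for n in range(0, 100)]

# Write the rows into a file-like buffer in COPY's text format:
# one line per row, columns separated by the chosen delimiter.
buf = io.StringIO()
for row in data:
    buf.write('\t'.join(map(str, row)) + '\n')
buf.seek(0)

# Load the whole buffer with a single COPY command.
curs.copy_from(buf, 'my_table', sep='\t', columns=('id', 'rand_int'))
conn.commit()
This issues one COPY statement for the whole data set instead of one INSERT per row, which is where the speedup comes from.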
See also how to speed up insertion performance in PostgreSQL.