I have a txt file with about 100 million records (numbers). I am reading this file in Python and inserting the records into a MySQL database with simple insert statements, but it's taking very long and it looks like the script will never finish. What would be the optimal way to carry out this process? The script is using less than 1% of memory and 10 to 15% of CPU.
Any suggestions on handling such a large amount of data and inserting it efficiently into the database would be greatly appreciated.
Thanks.
Sticking with Python, you may want to try building a list of tuples from your input and passing it to the executemany() method of MySQL Connector/Python.
If the file is too big to hold in memory, you can use a generator to chunk it into batches of a more digestible size (see the sketch below).
http://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-executemany.html
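Here's a minimal sketch of that approach. The file name, connection parameters, table name (numbers), and column name (value) are all placeholders, and it assumes one number per line in the input file:

    import mysql.connector

    CHUNK_SIZE = 10000  # rows per executemany() batch; tune to taste

    def read_chunks(path, size):
        """Yield lists of single-element tuples, `size` rows at a time."""
        chunk = []
        with open(path) as f:
            for line in f:
                chunk.append((line.strip(),))
                if len(chunk) == size:
                    yield chunk
                    chunk = []
        if chunk:  # don't drop the final partial batch
            yield chunk

    # Placeholder credentials and schema -- replace with your own.
    conn = mysql.connector.connect(user='user', password='secret',
                                   host='localhost', database='mydb')
    cursor = conn.cursor()

    for chunk in read_chunks('numbers.txt', CHUNK_SIZE):
        cursor.executemany("INSERT INTO numbers (value) VALUES (%s)", chunk)
        conn.commit()  # commit per chunk to keep transactions small

    cursor.close()
    conn.close()

Batching like this cuts the per-statement round-trip overhead dramatically compared to one INSERT per row, and committing per chunk keeps each transaction a manageable size.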