I am working on a single table (with no partitioning) having 700+ million rows. I wanted to load this data to another database so I used following pg_dump command,
pg_dump -Fc --column-inserts --data-only --table='tname' -U
postgres -d dbname > /root/tname_experiment_inserts_custom_format.dump
On the destination system I used the following command,
pg_restore -d dest_dbname -U postgres -j 7 /root/tname_experiment_inserts_custom_format.dump
The destination database already had the table I was trying to restore so I used TRUNCATE and then deleted all the indexes. The destination system has 32GB physical memory, I did the following settings in the postgres config file,
log_min_duration_statement = -1
autovacuum = off
maintenance_work_memory = 7gb
wal_level = minimal
fsync = off
full_page_writes= off
synchronous_commit= off
max_wal_size= 20GB
wal_buffers= 16MB
When I time the pg_restore then in an hour only about 16 million rows are getting inserted. Which means it is going to take 40+ hours (!) to restore the data. After that I have to create indexes and foreign constraints that I dropped which might take another several hours. I have a feeling I can do something differently to make this whole process much faster. Please give me any pointers which can help me make this process efficient. I also want to mention that I already looked at COPY but because it doesn't maintain the order of the primary keys this option is not good for me. If I don't know of any peculiar setting of COPY which preserves the order of the data then it would be great to know!
The purpose of whole thing was to change some datatypes of the columns which when done using alter table alter column query was also taking similar amount of time.
Given this table:
pg_dump -U postgres mydb --column-inserts --data-only
will generate this kind of output:
Omitting the
--colum-inserts
will produce:So the
--column-inserts
will generate one insert statement for every row which is very slow. The--data-only
flag only suppresses generation of the DDL to create the table(s).