COPY csv file with additionnal datas

251 views Asked by At

I have the following table :

persons(id,id_tech,name,nationality,id_list)

And a CSV file containing the datas for the column id_tech, name, and nationality.

Importing the data like so is working :

\copy persons(id_tech,name,nationality) FROM '/path/to/file.csv' DELIMITER ',' CSV

The id_list value is not in the CSV file because the file has to be imported on many servers where the value of this id can be different.

Is there a way to import the CSV file while providing an additional value for a specific column? I can't find anything relevant on the documentation.

Edit 1--
Note that all my command will be perform with pqxx in C++ (multi platform). I'm trying to avoid editing the file because of it's size.

Edit 2 --
I'm considering the following approach:

  • Create a temp table with a correct default value for the field I need
  • Import File into this temp table
  • Copy temp table to final table
  • Remove temp table

But I'm unsure of the performance. Biggest import can be close to a 500K lines.

2

There are 2 answers

0
grunk On BEST ANSWER

Found a solution which seems more than ok.

As stated in my OP , i use libpqxx to insert data , so instead of running a COPY sql request directly , i simply use pqxx::stream_to

Which allow me to add addtional field if needed :

pqxx::stream_to stream(w, mTable, std::vector<std::string>{"id_tech","name","nationality","extra_col"});
csv::CSVReader reader(filePath);
for (csv::CSVRow& row : reader) {

    stream << std::make_tuple(row[0].get<long long>(), row[1].get<std::string>(), row[2].get<std::string>(), custom_id);
}
stream.complete();

Taking around 10s to import 300K lines which is fine for my needs.

2
James Brown On

On Linux you could use for example awk to add the additional value field to your data and use psql to read from from stdin:

$ cat copy.sql
\copy persons(id_tech,name,nationality,extra_col) FROM '/dev/stdin' DELIMITER ',' CSV

and then:

$ awk '
BEGIN {
    FS=OFS=","
}
{
    print $1,$2,$3,"additional value"
}' file.csv | psql -h host -d database -f file.sql

(tested on PostgreSQL 12.4)