Postgres Bulk load using control card


We have a case where we are loading data from a delimited flat file into a Postgres table.

File Content:

A|B|C
1.1|2016-12-20|3

I want to load only columns A, B & C, but the table has 10 columns.

In Oracle, SQL*Loader lets us use a control file to pick the fields we need and to manipulate them as required:

OPTIONS (DIRECT=TRUE, PARALLEL=FALSE)
LOAD DATA
INFILE FILENAME
APPEND
INTO TABLE TABLENAME
FIELDS TERMINATED BY '|'
TRAILING NULLCOLS
(
A DECIMAL EXTERNAL,
B "to_date(:B,'YYYY-MM-DD')",
C "case when REGEXP_LIKE(:C,'^[0-9.,]') then :C else '0' end",
)

We used the COPY command to load columns A, B & C. But the tricky part is that COPY cannot apply any manipulation to the source data the way SQL*Loader can. So we cannot use the COPY command directly, and I set that option aside.
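The usual workaround with plain COPY is to load the raw fields into a staging table and do the manipulation in SQL afterwards. A rough sketch with psycopg2 (the connection string, file path and target_table are made up for illustration):

import psycopg2

# Assumed connection string and table names; adjust to your environment.
conn = psycopg2.connect("dbname=mydb")
cur = conn.cursor()

# Stage the three raw fields as text, then transform in SQL.
cur.execute("CREATE TEMP TABLE staging (a text, b text, c text)")

with open("/path/to/input_data_file.txt") as f:
    next(f)  # skip the header line A|B|C
    cur.copy_expert("COPY staging FROM STDIN WITH (FORMAT csv, DELIMITER '|')", f)

# Roughly the same manipulation as the SQL*Loader control card above.
cur.execute("""
    INSERT INTO target_table (a, b, c)
    SELECT a::numeric,
           to_date(b, 'YYYY-MM-DD'),
           CASE WHEN c ~ '^[0-9.,]' THEN c ELSE '0' END
    FROM staging
""")
conn.commit()

This costs one extra write of the staged rows, which is the performance trade-off we were trying to avoid.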

Other option:

The pg_bulkload utility was much faster than COPY, and it also takes a control file. But the options are limited, as below:

#
# sample_csv.ctl -- Control file to load CSV input data
#
OUTPUT = table_name                   # [<schema_name>.]table_name
INPUT = /path/to/input_data_file.csv  # Input data location (absolute path)
TYPE = CSV                            # Input file type
QUOTE = "\""                          # Quoting character
ESCAPE = \                            # Escape character for Quoting
DELIMITER = ","                       # Delimiter

But I can't find any option to:

  • Specify which fields need to be loaded
  • Do manipulation as in the Oracle control file

Is there any way to accomplish the above scenario using pg_bulkload, with some reference? I can't find any other sources. I assume pg_bulkload is the fastest way to load the data; kindly correct me if I'm wrong.

Are any other options available without compromising performance? Kindly share your thoughts.


There are 2 answers

JosMac

Check pgloader: it can also load data from CSV-like files, and it can do conversions on the fly. There is a description and an example at http://pgloader.io/howto/csv.html
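For illustration, a pgloader command file for the sample data above might look roughly like this (the database name mydb and the table name target_table are assumptions; only the three wanted fields are listed):

LOAD CSV
     FROM '/path/to/input_data_file.txt' (a, b, c)
     INTO postgresql:///mydb?target_table (a, b, c)
     WITH skip header = 1,
          fields terminated by '|';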

Anand Rajakrishnan

I am using the copy_from utility from psycopg2, which uses PostgreSQL COPY. You can provide the input data via STDIN (a file-like object). This works pretty fast and is comparable with SQL*Loader in Oracle (I haven't tested it against SQL*Loader's direct-path load option). Otherwise, it works pretty well. https://www.psycopg.org/docs/cursor.html#cursor.copy_from

I use the below syntax:

cur.copy_from(dataIterator, schemaTable, sep=chr(31), null='', columns=objColList)

where,

  • dataIterator is a subclass of io.TextIOBase
  • schemaTable is the target table into which data is being loaded
  • sep is the column separator
  • objColList is the list of columns that need to be loaded in the target table
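Put together, a minimal self-contained sketch (the connection string, target_table and column names are made up; any io.TextIOBase subclass can replace the StringIO buffer):

import io
import psycopg2

conn = psycopg2.connect("dbname=mydb")  # assumed connection string
cur = conn.cursor()

# chr(31), the ASCII unit separator, rarely occurs in real data,
# which makes it a safe column delimiter.
buf = io.StringIO("1.1\x1f2016-12-20\x1f3\n")

# Load only columns a, b and c; the table's other columns keep their defaults.
cur.copy_from(buf, "target_table", sep=chr(31), null="", columns=("a", "b", "c"))
conn.commit()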