COPY s3 files to Redshift table with IDENTITY column without EXPLICIT_IDS

Question

Asked by ezamur on 16 December 2016 at 22:41

I have a set of S3 files that I want to copy into Redshift (using AWS Data Pipeline and RedshiftCopyActivity). The challenge is that my S3 files have one column fewer than the target Redshift table. The table itself has an "id" column, an IDENTITY column whose values are auto-generated during insert.

I understand that I could use the transformSql property of RedshiftCopyActivity, but I have not managed to construct a query that works. Execution always fails with this error:

Exception ERROR: cannot set an identity column to a value

Some more details: the identity column is the first column of the table.

The data is successfully inserted into a table called staging, as it should be. I can also see that my transformSql was run and that the data was inserted into a table called staging2. The logs show:

create temporary table staging2 as select myField1, myField2, ..., myFieldN from staging

but after that comes:

INSERT INTO target_table SELECT * FROM staging2

which causes the error.

So, how can I approach this and make Redshift accept input that has one column fewer than the table? One option might be to make "id" the last column; I haven't tried that yet. To be honest, I don't like the sound of it, because it seems like a very fragile approach.

There are 2 answers

sandeep rawat On 17 December 2016 at 03:53

Assuming your table table-name has these columns:

id (identity) | Name (string) | Address (string)

the COPY command should look like this:

COPY table-name (Name, Address)
FROM 'data-source'
CREDENTIALS 'aws-auth-args';

Note: the general syntax for COPY is:

COPY table-name
[ column-list ]
FROM data_source
[ WITH ] CREDENTIALS [AS] 'aws-auth-args'
[ [ FORMAT ] [ AS ] data_format ]
[ parameter [ argument ] [, ... ] ]

**ezamur** · Accepted Answer · 2017-02-24T14:55:45+00:00

In the end, I wasn't able to make this work using RedshiftCopyActivity. It always complained that a value cannot be set for an identity column. Even the transformSql parameter didn't help.

The solution that fits my needs uses a ShellCommandActivity that runs a simple shell script. The idea is to install psql on the EC2 instance that runs the script, connect to Redshift using psql, and issue a COPY command that copies the data from S3 into the Redshift tables.

The COPY command has no problems with the identity column.
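A minimal sketch of what such a script might look like, not the exact script used here. It assumes psql is already installed on the instance, that the connection details are supplied through the standard PGHOST, PGPORT, PGDATABASE, PGUSER and PGPASSWORD environment variables, that the files are CSV, and that access is granted through an IAM role. The bucket path, role ARN, the RUN_COPY switch and the function names are hypothetical placeholders; target_table, myField1 and myField2 are the names from the question. The key point is the column list, which names every column except "id" so that Redshift generates the identity values itself.

```shell
#!/usr/bin/env bash
# Sketch only: bucket, role ARN and RUN_COPY are hypothetical placeholders.

# Build a COPY statement whose column list omits the IDENTITY column,
# so Redshift fills in "id" on its own.
build_copy_sql() {
  local table="$1" columns="$2" s3_path="$3" iam_role="$4"
  # FORMAT AS CSV is an assumption about the input files.
  printf "COPY %s (%s) FROM '%s' IAM_ROLE '%s' FORMAT AS CSV;\n" \
    "$table" "$columns" "$s3_path" "$iam_role"
}

# Pipe the statement into psql; connection settings come from the PG*
# environment variables. ON_ERROR_STOP makes psql exit non-zero on failure.
run_copy() {
  build_copy_sql "$@" | psql -v ON_ERROR_STOP=1
}

# Only contact the cluster when explicitly asked to.
if [ "${RUN_COPY:-0}" = "1" ]; then
  run_copy target_table "myField1, myField2" \
    "s3://my-bucket/my-prefix/" \
    "arn:aws:iam::123456789012:role/MyRedshiftCopyRole"
fi
```

Running the script with RUN_COPY=1 set in the environment of the ShellCommandActivity would issue the COPY; without it, the script only defines the functions, which makes it easy to inspect the generated statement first.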
