How to set the delimiter, Postgresql

5.7k views Asked by At

I am wondering what the delimiter from this .csv file is. I am trying to import the .csv via the COPY FROM Statement, but somehow it throws always an error. When I change the delimiter to E'\t' it throws an error. When I change the delimiter to '|' it throws a different error. I have been trying to import a silly .csv file for 3 days and I cannot achieve a success. I really need your help. Here is my .csv file: Download here, please

My code on postgresql looks like this:

CREATE TABLE movie
(
 imdib  varchar NOT NULL,
 name varchar NOT NULL,
 year integer,
 rating float ,
 votes integer,
 runtime varchar  ,
 directors varchar ,
 actors varchar ,
genres varchar
);

MY COPY Statement:

COPY movie FROM '/home/max/Schreibtisch/imdb_top100t_2015-06-18.csv' (DELIMITER E'\t', FORMAT CSV, NULL '', ENCODING 'UTF8');

When I use SHOW SERVER_ENCODING it says "UTF8". But why the hell can't postgre read the datas from the columns? I really do not get it. I use Ubuntu 64 bit, the .csv file has all the permissions it needs, postgresql has also. Please help me. These are my errors:

ERROR: missing data for column "name" CONTEXT: COPY movie, line 1: "tt0468569,The Dark Knight,2008,9,1440667,152 mins.,Christopher Nolan,Christian Bale|Heath Ledger|Aar..."

********** Error **********

ERROR: missing data for column "name" SQL state: 22P04 Context: COPY movie, line 1: "tt0468569,The Dark Knight,2008,9,1440667,152 mins.,Christopher Nolan,Christian Bale|Heath Ledger|Aar..."

2

There are 2 answers

1
AudioBubble On

The following works for me:

COPY movie (imdib,name,year,rating,votes,runtime,directors,actors,genres) 
    FROM 'imdb_top100t_2015-06-18.csv' 
    WITH (format csv, header false, delimiter E'\t', NULL '');

Unfortunately the file is invalid because on line 12011 the column year contains the value 2015 Video and thus the import fails because this can't be converted to an integer. And then further down (line 64155) there is an invalid value NA for the rating which can't be converted to a float and then one more for the votes.

But if you create the table with all varchar columns the above command worked for me.

0
Pankaj Sharma On

Use this code instead it is working fine on Linux as well on windows

\COPY movie(imdib,name,year,rating,votes,runtime,directors,actors,genres) FROM 'D:\test.csv' WITH DELIMITER '|' CSV HEADER;

and one more thing insert header in your csv file like shown below:

imdib|name|year|rating|votes|runtime|directors|actors|genres
tt0111161|The Shawshank Redemption|1994|9.3|1468273|142 mins.|Frank Darabont|Tim Robbins|Morgan Freeman

and use single byte delimiter like ',','|' etc.

Hope this will work for you ..!