How to delimit a compressed fixed-length file without uncompressing it


I'm dealing with gzip-compressed, fixed-length flat files that I need to turn into delimited flat files so I can feed them to gpload. I was told it is possible to delimit a file without decompressing it and feed it directly to gpload, since gpload can handle compressed files.

Does anybody know of a way to delimit the file while it is in .gz format?

There is 1 answer

0x0FFF On

There is no way to delimit gzip-compressed data without decompressing it. However, you don't need to delimit it at all: you can load it as fixed-width data, and gpfdist will decompress it on the fly. See the "Importing and Exporting Fixed Width Data" chapter of the admin guide: http://gpdb.docs.pivotal.io/4330/admin_guide/load.html

Here's an example:

[gpadmin@localhost ~]$ gunzip -c testfile.txt.gz 
Bob                 Jones                         27  
Steve               Balmer                        50  

[gpadmin@localhost ~]$ gpfdist -d ~ -p 8080 &
[1] 41525
Serving HTTP on port 8080, directory /home/gpadmin

[gpadmin@localhost ~]$ psql -c "
>     CREATE READABLE EXTERNAL TABLE students (
>         name    varchar(20),
>         surname varchar(30),
>         age int)
>     LOCATION ('gpfdist://127.0.0.1:8080/testfile.txt.gz')
>     FORMAT 'CUSTOM' (formatter=fixedwidth_in, 
>              name='20', surname='30', age='4');
>     "
CREATE EXTERNAL TABLE

[gpadmin@localhost ~]$ psql -c "select * from students;"
  name | surname | age 
-------+---------+-----
 Bob   | Jones   |  27
 Steve | Balmer  |  50
(2 rows)
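
If a delimited file is still required, the data has to be decompressed at some point, but it never has to land on disk uncompressed. Below is a minimal Python sketch of that idea, not a Greenplum feature: it streams the .gz file, slices each record by column width, and writes a new .gz file with delimiters. The function name delimit_gz, the "|" delimiter, and the output file name are assumptions made for illustration; the widths 20, 30 and 4 come from the example table above. It slices by characters rather than bytes and does not escape a delimiter that happens to occur inside a field.

```python
import gzip

# Column widths from the example above: name=20, surname=30, age=4.
WIDTHS = (20, 30, 4)


def delimit_gz(src_path, dst_path, widths=WIDTHS, delimiter="|"):
    """Stream a gzipped fixed-width file into a gzipped delimited file.

    Hypothetical helper for illustration. Records are decompressed,
    split, and recompressed one line at a time, so no uncompressed
    copy of the file is written to disk.
    """
    with gzip.open(src_path, "rt", encoding="utf-8", newline="") as src, \
         gzip.open(dst_path, "wt", encoding="utf-8", newline="") as dst:
        for line in src:
            record = line.rstrip("\r\n")
            fields = []
            pos = 0
            for width in widths:
                # Slice the column and drop the padding spaces.
                fields.append(record[pos:pos + width].strip())
                pos += width
            dst.write(delimiter.join(fields) + "\n")


if __name__ == "__main__":
    # Assumed file names; adjust to the actual input and output paths.
    delimit_gz("testfile.txt.gz", "testfile_delimited.txt.gz")
```

The resulting file is still gzip-compressed, so it can be served the same way as the fixed-width file in the example, with a delimited text format in place of the fixedwidth_in formatter.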