ERROR: protocol "gphdfs" does not exist


When I run:

postgres=#   CREATE EXTERNAL TABLE csv_hdfs_lineitem (like a) LOCATION (
    'gphdfs://xxxxx/gptest/lineitem.csv'
) FORMAT 'text' (delimiter E'|' null E'\\N' escape E'off' fill missing fields)
ENCODING 'UTF8'
;

it shows:

ERROR: protocol "gphdfs" does not exist

How do I configure Greenplum to support the gphdfs protocol?


There is 1 answer

Sung Yu-wei
  1. Install the Hadoop client on all GPDB nodes and add the Hadoop libraries to the classpath.
  2. Set two GUCs: gp_hadoop_target_version, which identifies the Hadoop distribution, and gp_hadoop_home, which points to the Hadoop installation directory.
  3. Restart GPDB.
  4. Grant access on the gphdfs protocol to gpadmin.
  5. Retry the gphdfs external table.
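
Steps 2 to 4 might look roughly like the sketch below, run as gpadmin on the master host. The values 'hdp2' and /usr/lib/hadoop are assumptions for illustration only, not taken from the question; substitute the target version and install path that match your own Hadoop distribution, and check the linked documentation for the values your GPDB release accepts.

```shell
# Step 2: set the two GUCs. 'hdp2' and /usr/lib/hadoop are example values only;
# use the target version and path that match your Hadoop distribution.
gpconfig -c gp_hadoop_target_version -v "'hdp2'"
gpconfig -c gp_hadoop_home -v "'/usr/lib/hadoop'"

# Step 3: restart Greenplum so the new settings take effect.
gpstop -r

# Step 4: allow gpadmin to read from and write to gphdfs external tables.
psql -d postgres -c "GRANT SELECT ON PROTOCOL gphdfs TO gpadmin;"
psql -d postgres -c "GRANT INSERT ON PROTOCOL gphdfs TO gpadmin;"
```

After that, rerun the CREATE EXTERNAL TABLE statement from the question. If the same error still appears, the Hadoop client install and classpath from step 1 are worth rechecking on every node.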

For details, see the following link:

http://gpdb.docs.pivotal.io/43110/admin_guide/load/topics/g-one-time-hdfs-protocol-installation.html#topic20