Is there a more efficient way to dump an entire Greenplum database than through multiple JDBC connections to the master node?
I need to export the full contents of a Greenplum database over JDBC. To speed this up, I plan to use Spark's parallelism and fetch the data in parallel over multiple JDBC connections. As I understand it, all of those connections will go to Greenplum's single master node. I will store the data in HDFS in Parquet format.
For a parallel export, you can try a gphdfs writable external table. Greenplum (GPDB) segments can read from and write to external sources in parallel, so the data does not have to pass through the master node.
http://gpdb.docs.pivotal.io/4340/admin_guide/load/topics/g-gphdfs.html
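As a rough sketch of what that could look like, the Python helper below only builds the two SQL statements involved: one that creates a writable external table pointing at HDFS, and one that fills it from the source table. The table names, NameNode address, and HDFS path are made-up placeholders, and the helper itself (`build_gphdfs_export_sql`) is hypothetical, not part of any Greenplum tooling. It assumes the gphdfs protocol is already configured on the segment hosts and that your role is allowed to use it. It also writes delimited text rather than Parquet, so converting to Parquet would be a separate step.

```python
def build_gphdfs_export_sql(source_table, ext_table, namenode, hdfs_path, delimiter=","):
    """Build the SQL for exporting one table to HDFS via a gphdfs writable external table.

    All arguments are illustrative placeholders; nothing is executed here.
    Returns a (create_statement, insert_statement) tuple of SQL strings.
    """
    # gphdfs locations have the form gphdfs://<namenode host:port>/<path>
    location = f"gphdfs://{namenode}/{hdfs_path.lstrip('/')}"

    # The external table copies its column definitions from the source table.
    create_stmt = (
        f"CREATE WRITABLE EXTERNAL TABLE {ext_table} (LIKE {source_table})\n"
        f"LOCATION ('{location}')\n"
        f"FORMAT 'TEXT' (DELIMITER '{delimiter}');"
    )

    # Inserting into the external table makes each segment write its own rows to HDFS.
    insert_stmt = f"INSERT INTO {ext_table} SELECT * FROM {source_table};"
    return create_stmt, insert_stmt


# Hypothetical names, for illustration only.
create_sql, insert_sql = build_gphdfs_export_sql(
    source_table="public.sales",
    ext_table="public.sales_export",
    namenode="namenode.example.com:8020",
    hdfs_path="/exports/sales",
)
print(create_sql)
print(insert_sql)
```

You would run the two statements once per table through any client (psql or a single JDBC connection is enough), since the bulk data moves from the segments to HDFS rather than through that connection.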