I am using Beeline to export data to HDFS with this command:
INSERT OVERWRITE DIRECTORY $export_tmp
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
select * from xxx_table_name;
I want to get (echo) the size of the exported output, for example 1024M.
The output can consist of many files if many mappers or reducers run in the last vertex.
The easiest way is to check the size of the output directory from a shell.
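A minimal sketch of what that could look like, assuming the export path from the question ($export_tmp) and the standard hdfs dfs -du flags (-s for a single summary line, -h for human-readable sizes):

hdfs dfs -du -s -h "$export_tmp"   # total size of the export directory
hdfs dfs -du -h "$export_tmp"      # size of each individual output file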
or
You can also try to execute it inside Beeline using the !sh command.
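For example (a sketch, using a hypothetical export path /tmp/export_dir; !sh runs the rest of the line as a shell command from the Beeline prompt):

!sh hdfs dfs -du -s -h /tmp/export_dir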
Also, maybe the counters printed at the end of the job can be used, like HDFS: Number of bytes written (I am not sure whether this figure is correct or not).
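If you want to capture that counter in a script, one rough sketch is to redirect Beeline's progress output to a file and grep for the counter; this assumes the counter appears as "HDFS: Number of bytes written=<n>", which depends on the execution engine and logging settings, and the file names ($jdbc_url, export.hql, beeline.log) are placeholders:

beeline -u "$jdbc_url" -f export.hql 2> beeline.log   # job progress and counters go to stderr
grep -o 'HDFS: Number of bytes written=[0-9]*' beeline.log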