I downloaded and built parquet-1.5.0 from https://github.com/apache/parquet-mr.
I now want to run some commands on my Parquet files that are in HDFS. I tried this:
cd ~/parquet-mr/parquet-tools/src/main/scripts
./parquet-tools meta hdfs://localhost/my_parquet_file.parquet
and I got:
Error: Could not find or load main class parquet.tools.Main
The script is built on the assumption that parquet-tools-<version>.jar is located in a directory called lib next to the script file itself.
You can set up such a file layout from the root of the parquet-mr git repo (of course, many alternative ways and installation locations are possible).
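A minimal sketch of one way to create that layout. The helper name install_parquet_tools is hypothetical (it is not part of parquet-mr), and the sketch assumes the build placed the jar under parquet-tools/target/ in the checkout; adjust the paths if your build differs.

```shell
# Hypothetical helper, not part of parquet-mr: copies the launcher script and
# the built jar into the layout the script expects (script + lib/ next to it).
install_parquet_tools() {
    repo_root="$1"   # root of the parquet-mr checkout
    prefix="$2"      # install location, e.g. ~/.local/share/parquet-tools

    # Create the install directory together with its lib/ subdirectory.
    mkdir -p "$prefix/lib" || return 1

    # Launcher script from the source tree.
    cp "$repo_root/parquet-tools/src/main/scripts/parquet-tools" "$prefix/" || return 1

    # Assumption: the build output lives under parquet-tools/target/.
    cp "$repo_root"/parquet-tools/target/parquet-tools-*.jar "$prefix/lib/" || return 1

    # Make sure the launcher is executable.
    chmod +x "$prefix/parquet-tools"
}
```

From the root of the repo this could be called as install_parquet_tools . ~/.local/share/parquet-tools, which matches the install location used in the next step.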
After this you can run:
~/.local/share/parquet-tools/parquet-tools
(I tested this with version 1.10.1-SNAPSHOT rather than 1.5.0, though.)