I'm trying to use pyarrow, and I keep getting the following error:
ImportError: Can not find the shared library: libhdfs3.so
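For context, this is roughly the kind of call that raises it (a minimal sketch using pyarrow's legacy pyarrow.hdfs API, where driver='libhdfs3' selects the libhdfs3.so backend; the host and port are placeholders):
python -c "import pyarrow as pa; pa.hdfs.connect('localhost', 8020, driver='libhdfs3')"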
So I read some Stack Overflow answers, and they say I need to set the environment variable ARROW_LIBHDFS_DIR.
The path to libhdfs.so is /usr/local/hadoop/native/
I tried to set it in .bashrc, but it didn't work.
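For reference, this is roughly what I put in ~/.bashrc (using the path above) and how I reloaded it:
# added to ~/.bashrc
export ARROW_LIBHDFS_DIR=/usr/local/hadoop/native/
# then reloaded in the current shell
source ~/.bashrc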
The conda installation doesn't seem to work either, i.e. I have tried:
conda install libhdfs3
pip install libhdfs3
conda install -c clinicalgraphics libgcrypt11
conda install libprotobuf=2.5
conda update libhdfs3
It would be a great help if I could get this working. Thanks in advance.
Ensure libhdfs.so is in $HADOOP_HOME/lib/native as well as in $ARROW_LIBHDFS_DIR.
Use this to check whether you have the variable set in your bash environment:
ls $ARROW_LIBHDFS_DIR
If not, locate the file using:
locate -l 1 libhdfs.so
Then assign the directory you find to the ARROW_LIBHDFS_DIR variable:
export ARROW_LIBHDFS_DIR=<directory containing libhdfs.so>
Referenced here on SO: https://stackoverflow.com/a/62749351/6263217
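Once the variable is set, an end-to-end check could look like this (a sketch: dirname just turns the file path from locate into its parent directory, and the host/port passed to the legacy pyarrow.hdfs.connect call are placeholders for your NameNode):
# point ARROW_LIBHDFS_DIR at the directory locate found
export ARROW_LIBHDFS_DIR=$(dirname "$(locate -l 1 libhdfs.so)")
# confirm pyarrow can now load the library
python -c "import pyarrow as pa; pa.hdfs.connect('localhost', 8020)"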