I'm new to Python and I'm trying to connect to a Hadoop HDFS system. I found the following reference code and tried to use it, but it raises an error when importing the package.
from pyarrow import HdfsClient
# Using libhdfs
hdfs = HdfsClient('192.168.0.119', '50070', 'cloudera', driver='libhdfs')
Error: ImportError: cannot import name 'HdfsClient'
I even tried to install it using "pip", but
Could not find a version that satisfies the requirement HdfsClient (from versions: )
No matching distribution found for HdfsClient
then I tried using "conda", but again
Collecting package metadata: done
Solving environment: failed
PackagesNotFoundError: The following packages are not available from current channels:
- hdfsclient
Current channels:
- https://repo.anaconda.com/pkgs/main/win-64
- https://repo.anaconda.com/pkgs/main/noarch
- https://repo.anaconda.com/pkgs/free/win-64
- https://repo.anaconda.com/pkgs/free/noarch
- https://repo.anaconda.com/pkgs/r/win-64
- https://repo.anaconda.com/pkgs/r/noarch
- https://repo.anaconda.com/pkgs/msys2/win-64
- https://repo.anaconda.com/pkgs/msys2/noarch
To search for alternate channels that may provide the conda package you're looking for, navigate to
https://anaconda.org and use the search bar at the top of the page.
What I'm actually trying to do is connect to Hue using:
IP address -> 192.168.0.119
Port -> 50070
Username -> cloudera
password -> cloudera
But it's not working. Can anyone please suggest a better way to connect, or explain how to import the "HdfsClient" package in Python 3?
HdfsClient is deprecated. You might want to use pyarrow.hdfs.connect instead. Also try pip freeze to see whether the relevant library (pyarrow) is installed in your Python environment.
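A minimal sketch of both suggestions, under some assumptions: pyarrow_installed and connect_hdfs are hypothetical helper names, the installed pyarrow is an older release that still ships the legacy pyarrow.hdfs module (newer releases point to pyarrow.fs.HadoopFileSystem instead), and the machine has a working Hadoop client (libhdfs, JAVA_HOME, HADOOP_HOME). Note that HdfsClient was never a separate package, so pip install HdfsClient cannot work; the package to install is pyarrow.

```python
import importlib.util


def pyarrow_installed():
    """Return True if pyarrow can be found in the current environment.

    This is a programmatic counterpart to scanning `pip freeze` output.
    """
    return importlib.util.find_spec("pyarrow") is not None


def connect_hdfs(host, port, user):
    """Open an HDFS connection through pyarrow's legacy hdfs module.

    Assumption: the installed pyarrow still provides pyarrow.hdfs.connect.
    """
    if not pyarrow_installed():
        raise ImportError("pyarrow is not installed; try: pip install pyarrow")
    # Imported lazily so the check above can produce a clearer message.
    from pyarrow import hdfs
    # The port has to be an integer, not a string such as '50070'.
    return hdfs.connect(host=host, port=int(port), user=user)
```

You would then call something like fs = connect_hdfs('192.168.0.119', 8020, 'cloudera') followed by fs.ls('/'). The port 8020 here is an assumption: 50070 is typically the NameNode web UI port, while the libhdfs driver talks to the NameNode RPC port, so check fs.defaultFS in your cluster's core-site.xml for the real value.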