I am trying to create a Hive external table using Beeline on top of S3 object storage using the "s3a://" scheme. I have followed the official Cloudera documentation and configured the properties below.
- fs.s3a.access.key
- fs.s3a.secret.key
- fs.s3a.endpoint
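For reference, a sketch of how these three properties look as a Hadoop configuration fragment. It assumes they are set in core-site.xml (the file is an assumption; they could equally be set through a Cloudera Manager override), and the values are placeholders only.

```xml
<!-- Sketch only: assumes core-site.xml; all values are placeholders -->
<configuration>
  <property>
    <name>fs.s3a.access.key</name>
    <value>ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>SECRET_KEY</value>
  </property>
  <property>
    <!-- host:port of the S3-compatible object store -->
    <name>fs.s3a.endpoint</name>
    <value>HOST:PORT</value>
  </property>
</configuration>
```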
The following command runs successfully and lists the directories:
hadoop fs -Dfs.s3a.access.key=<access_key> -Dfs.s3a.secret.key=<secret_key> -Dfs.s3a.endpoint=<host:port> -ls s3a://<bucket_name>/dir/
So I know my credentials, bucket access, and overall Hadoop setup are valid.
However, when I try to access the same S3 resources from Hive (Beeline), e.g. by running a CREATE EXTERNAL TABLE statement with LOCATION 's3a://<bucket_name>/dir/', it fails.
Configurations
set fs.s3a.access.key=<access_key>;
set fs.s3a.secret.key=<secret_key>;
set fs.s3a.endpoint=<host:port>;
Query
CREATE EXTERNAL TABLE NAME_TEST_S3(name string, age int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE LOCATION 's3a://<bucket_name>/dir/';
I am getting the error below:
ERROR : FAILED: Execution error, return code 40000 from org.apache.hadoop.hive.ql.ddl.DDLTask. MetaException(message:Got exception: java.nio.file.AccessDeniedException <bucket_name>: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS Credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY)) (state=08S01, code=40000)
Note: I am using CDH-7.1.6, Hive 3.1.3, and S3 object storage. I am able to access the same S3 resources using hadoop fs as well as the Spark Scala read API.
Does anyone have any idea what's missing from this equation?