I have decided to read/copy files straight from their online repository to avoid downloading them first. This is my first attempt at this, and it has been my first interaction with aws.s3.
First, just to make sure I could run something simple, I checked whether the bucket exists. I did so with bucket_exists, specifying both the bucket and the region. The bucket does exist.
However, the file I want to inspect is an .h5 file. To work with it, I installed the rhdf5 library via BiocManager. Then, to inspect the one file, I did the following:
s3read_using(
  FUN = rhdf5::H5Fopen,
  bucket = "s3://arpa-e-perform/ERCOT/",
  region = "us-west-2",
  object = "s3://arpa-e-perform/ERCOT/2018/Solar/Actuals/BA_level/BA_solar_actuals_2018.h5")
Unfortunately, it didn't work. The output and the error message I got follow:
List of 6
$ Code : chr "PermanentRedirect"
$ Message : chr "The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future "| truncated
$ Endpoint : chr "arpa-e-perform.s3.amazonaws.com"
$ Bucket : chr "arpa-e-perform"
$ RequestId: chr "BGEZ97HJH10KAPRE"
$ HostId : chr "pxKXcYNLchSYTwEaPLDoFRo11qkWontw+kWAtb8ZqTTEYwTptAkSgl8dbJoI8a2URXIxDCOE7/g="
- attr(*, "headers")=List of 7
..$ x-amz-bucket-region: chr "us-west-2"
..$ x-amz-request-id : chr "BGEZ97HJH10KAPRE"
..$ x-amz-id-2 : chr "pxKXcYNLchSYTwEaPLDoFRo11qkWontw+kWAtb8ZqTTEYwTptAkSgl8dbJoI8a2URXIxDCOE7/g="
..$ content-type : chr "application/xml"
..$ transfer-encoding : chr "chunked"
..$ date : chr "Mon, 06 Jun 2022 17:35:35 GMT"
..$ server : chr "AmazonS3"
..- attr(*, "class")= chr [1:2] "insensitive" "list"
- attr(*, "class")= chr "aws_error"
NULL
Error in parse_aws_s3_response(r, Sig, verbose = verbose) :
Moved Permanently (HTTP 301).
Today's been my first interaction with aws.s3 and I'm still going through the manual/forums, so all help will be appreciated. Thank you.
I think the problem here is that you're not accessing the file at the correct location. The error message says "The bucket you are attempting to access must be addressed using the specified endpoint" and then gives the endpoint as "arpa-e-perform.s3.amazonaws.com", which looks much more like a regular HTTP URL.
Here's an example of reading the meta dataset from the file using rhdf5.
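A minimal sketch of that approach, assuming the object can be read anonymously over HTTPS at the endpoint named in the error message, and that the installed rhdf5 build supports the s3 argument of h5read() and h5ls() (this may not be available on every platform). The helper names s3_https_url and read_meta are made up for illustration and are not part of rhdf5 or aws.s3.

```r
# Hypothetical helper: build an HTTPS URL for an S3 object from the
# bucket name and the object key, using the endpoint form shown in the
# error message ("<bucket>.s3.amazonaws.com").
s3_https_url <- function(bucket, key) {
  sprintf("https://%s.s3.amazonaws.com/%s", bucket, key)
}

url <- s3_https_url(
  bucket = "arpa-e-perform",
  key    = "ERCOT/2018/Solar/Actuals/BA_level/BA_solar_actuals_2018.h5"
)

# Hypothetical wrapper: read the "meta" dataset directly from the bucket.
# Assumption: s3 = TRUE makes rhdf5 treat `file` as a URL to an S3 object.
# Calling this needs network access and an rhdf5 build with S3 support.
read_meta <- function(url) {
  rhdf5::h5read(file = url, name = "meta", s3 = TRUE)
}
```

Calling read_meta(url) should then return the meta dataset without downloading the whole file first. To see what else the file contains, rhdf5::h5ls(url, s3 = TRUE) lists its groups and datasets under the same assumptions.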