I'm looking to access a GRIB file in the cloud and extract parameters (such as temperature) from it without ever storing the file locally. I've heard this can be done with the cfgrib API, but I can't find any example documentation (I checked the source documentation here, but it doesn't cover accessing files in the cloud).
From experience working with pygrib, I know that API reads in a GRIB file as a bytes representation, and cfgrib appears to handle it similarly. After some research and trial and error, I've come up with this code, which tries to read a byte string representation of the file:
    import boto3
    import boto
    from botocore.config import Config
    from botocore import UNSIGNED
    import pygrib
    import cfgrib

    if __name__ == '__main__':
        # Define boto config
        my_config = Config(
            signature_version = UNSIGNED,
            retries = {
                'max_attempts': 10,
                'mode': 'standard'
            }
        )
        session = boto3.Session(profile_name='default')
        s3 = session.resource('s3')
        my_bucket = s3.Bucket('nbmdata')
        # Get a unique key for each file in s3
        file_keys = []
        for my_bucket_object in my_bucket.objects.all():
            file_keys.append(my_bucket_object.key)
        # Extract each file as a binary string (without downloading)
        grib_files = []
        for key in file_keys:
            s3 = boto.connect_s3()
            bucket = s3.lookup('bucket') # Removed bucket name
            key = bucket.lookup(key)
            your_bytes = key.get_contents_as_string(headers={'Range' : 'bytes=73-1024'})
            grib_files.append(your_bytes)
        # Interpret binary string into pygrib
        for grib_file in grib_files:
            grbs = pygrib.open(grib_file)
This appears to ALMOST work. I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xee in position 7: invalid continuation byte
I get the same error when I try to swap this out with cfgrib. What am I missing here?
Try something like this. I was using the GEFS data hosted on AWS instead and it worked great. I believe the NBM data is also on AWS and can be found here: https://registry.opendata.aws/noaa-nbm/. No account should be needed, so it would just be a matter of changing the s3_object name to the path/filename of the file you want from here: https://noaa-nbm-pds.s3.amazonaws.com/index.html
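For reference, here is a minimal sketch of that idea, not a definitive implementation. It assumes the bucket is public (so boto3 can read it anonymously with an UNSIGNED config), that the whole object fits in memory, and that your pygrib build provides pygrib.fromstring, which parses one GRIB message from a bytes object. The helper names (split_grib_messages, fetch_s3_object, read_grib_from_s3) are made up for illustration. The likely cause of the UnicodeDecodeError is that pygrib.open is being handed raw bytes where it expects a file name, and the Range header in the question only fetches a slice of the file anyway, so the sketch reads the full object and splits it into messages using the length stored in each GRIB header.

```python
import struct


def split_grib_messages(buf: bytes) -> list:
    """Split an in-memory buffer into individual GRIB messages (editions 1 and 2)."""
    messages = []
    pos = 0
    while True:
        start = buf.find(b"GRIB", pos)
        if start == -1 or start + 16 > len(buf):
            break
        edition = buf[start + 7]  # octet 8 of Section 0 is the edition number
        if edition == 2:
            # GRIB2: octets 9-16 hold the total message length (big-endian).
            (length,) = struct.unpack(">Q", buf[start + 8:start + 16])
        elif edition == 1:
            # GRIB1: octets 5-7 hold the total message length (big-endian).
            length = int.from_bytes(buf[start + 4:start + 7], "big")
        else:
            pos = start + 4
            continue
        end = start + length
        if length < 16 or end > len(buf) or buf[end - 4:end] != b"7777":
            # Not a real header, or the message is truncated: keep scanning.
            pos = start + 4
            continue
        messages.append(buf[start:end])
        pos = end
    return messages


def fetch_s3_object(bucket: str, key: str) -> bytes:
    """Read a public S3 object fully into memory, without credentials."""
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()


def read_grib_from_s3(bucket: str, key: str) -> list:
    """Return one pygrib message object per GRIB message in the S3 object."""
    import pygrib  # assumption: pygrib.fromstring is available in your version

    data = fetch_s3_object(bucket, key)
    return [pygrib.fromstring(msg) for msg in split_grib_messages(data)]
```

You would call it as read_grib_from_s3('noaa-nbm-pds', s3_object), where s3_object is the key of the file you picked from the bucket index. Nothing is written to disk, but the whole file is held in memory, so this may not suit very large files. As far as I know cfgrib wants a path on disk, so if you need cfgrib specifically you would probably have to write the bytes to a temporary file first.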