I am using ibm_boto3, as described in the IBM COS documentation. I have defined the resource as follows:
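import ibm_boto3
from ibm_botocore.client import Config, ClientError

# COS_API_KEY_ID, SERVICE_INSTANCE_ID, COS_AUTH_ENDPOINT and COS_ENDPOINT are defined elsewhere
cos = ibm_boto3.resource("s3",
    ibm_api_key_id=COS_API_KEY_ID,
    ibm_service_instance_id=SERVICE_INSTANCE_ID,
    ibm_auth_endpoint=COS_AUTH_ENDPOINT,
    config=Config(signature_version="oauth"),
    endpoint_url=COS_ENDPOINT
)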
This is the code I am using to get the contents of the PDF file:
def get_item(bucket_name, item_name):
    print("Retrieving item from bucket: {0}, key: {1}".format(bucket_name, item_name))
    try:
        file = cos.Object(bucket_name, item_name).get()
        file_content = file["Body"].read()  # returns the data as bytes
        # print("\nFILE:-------------------------\n", file)  # shows the metadata of the object
        return file_content
    except ClientError as be:
        print("CLIENT ERROR: {0}\n".format(be))
    except Exception as e:
        print("Unable to retrieve file contents: {0}\n".format(e))
The "Body" of the response is an ibm_botocore.response.StreamingBody object, and reading it gives me bytes. I am not able to convert those bytes to a string. I have tried decoding them as UTF-8 and as base64, but neither works. I get the following error when I try to decode as UTF-8:
Unable to retrieve file contents: 'utf-8' codec can't decode byte 0xb5 in position 11: invalid start byte
I am also unable to figure out what type of encoding is used by IBM COS.
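The error can be reproduced without COS, which may help narrow it down. The following is a minimal sketch, assuming the object starts with a typical PDF version header followed by a binary comment line. The sample bytes and the helper names `first_undecodable_byte` and `looks_like_pdf` are hypothetical and not taken from the actual file or from ibm_boto3.

```python
# Hypothetical sample: a PDF version header followed by a binary comment line.
# These bytes are an assumption for illustration, not the real object contents.
sample = b"%PDF-1.5\r\n%\xb5\xb5\xb5\xb5\r\n"


def first_undecodable_byte(data, encoding="utf-8"):
    """Return (position, byte value) of the first byte that fails to decode, or None."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        return err.start, data[err.start]
    return None


def looks_like_pdf(data):
    """Check for the '%PDF-' signature at the start of the data."""
    return data.startswith(b"%PDF-")


print(looks_like_pdf(sample))          # True
print(first_undecodable_byte(sample))  # (11, 181), i.e. byte 0xb5 at position 11
```

If the same check passes on the bytes returned by get_item, the content is probably raw binary PDF data rather than text in some encoding, so it would need to be written to a file opened in binary mode or passed to a PDF parsing library instead of being decoded.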
Thanks in advance.