I'm building a CSV processing pipeline: a raw CSV file is uploaded to S3, a Lambda function cleans and transforms it with Python, and the result is uploaded to a separate receiving S3 bucket. In the same Lambda function I call create_data_set for each CSV file processed, so that each dataset references the particular CSV that was sent to the receiving S3 bucket.

The objective is that when a raw CSV is uploaded to S3, it is processed by the Lambda function, sent to the receiving S3 bucket, and an already created dashboard is updated automatically (or as automatically as possible) to use the latest CSV file, through the dataset generated by that same Lambda function, as its new dataset.

However I am getting this error message:

[ERROR] ClientError: An error occurred (ValidationException) when calling the CreateDataSet operation: 1 validation error detected: Value 'arn:aws:s3:::randombucket-processed-salestotalv4' at 'physicalTableMap.string.member.s3Source.dataSourceArn' failed to satisfy constraint: Specified resource is not reachable in this region ('us-east-1')
Traceback (most recent call last):
  File "/var/task/handler.py", line 100, in featureengineering
    responsedataset = quicksight_client.create_data_set(
  File "/var/runtime/botocore/client.py", line 316, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/var/runtime/botocore/client.py", line 635, in _make_api_call
    raise error_class(parsed_response, operation_name)

That threw me off, and I'm not sure how to resolve it. Below is the segment of the Lambda function that handles the dataset creation and ingestion; it sits below the data processing code in the same function:

    responsedataset = quicksight_client.create_data_set(
        AwsAccountId='123456',
        DataSetId=csv_file_1,  # the data set ID and name will be the same as the file
        Name=csv_file_1,
        PhysicalTableMap={
            'string': {
                'S3Source': {
                    'DataSourceArn': 'arn:aws:s3:::randombucket-processed-salestotalv4',
                    'UploadSettings': {
                        'Format': 'CSV',
                        'StartFromRow': 1,
                        'ContainsHeader': True,
                        'TextQualifier': 'SINGLE_QUOTE',
                        'Delimiter': ','
                    },
                    'InputColumns': [
                        # columns removed
                    ]
                }
            }
        },
        ImportMode='DIRECT_QUERY',
        Tags=[
            {
                'Key': 'Example',
                'Value': 'test'
            },
        ]
    )
    responseingestion = quicksight_client.create_ingestion(
        DataSetId=csv_file_1,
        IngestionId=csv_file_1,
        AwsAccountId='123456'
    )

Note that csv_file_1 is defined at the beginning of the function as csv_file_1 = 'total_sales_' + str(datetime.now().strftime('%Y_%m_%d_%H_%M_%S')) + '.csv'. I use it to distinguish the processed CSV from previous versions in S3, and also as a unique DataSetId and IngestionId (I'm not sure whether that's a smart idea).
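To make the naming scheme concrete, here is a minimal sketch of it as a standalone helper. The function name build_names and the idea of returning a second, suffix-free identifier are assumptions for illustration, not part of the Lambda above. The suffix-free variant is included because, as far as I can tell from the API reference, the ingestion ID only accepts letters, digits, hyphens, and underscores, so an ID ending in '.csv' may be rejected.

```python
from datetime import datetime


def build_names(now: datetime, prefix: str = "total_sales_") -> tuple[str, str]:
    """Return (s3_key, quicksight_id) built from one shared timestamp.

    build_names is a hypothetical helper; the prefix and timestamp
    format are taken from the csv_file_1 definition in the post.
    """
    stamp = now.strftime("%Y_%m_%d_%H_%M_%S")
    s3_key = f"{prefix}{stamp}.csv"    # object name in the receiving bucket
    quicksight_id = f"{prefix}{stamp}"  # same name without the '.csv' suffix
    return s3_key, quicksight_id


# Example: both names derive from the same instant, so they stay in sync.
s3_key, quicksight_id = build_names(datetime.now())
```

Passing the timestamp in as an argument, rather than calling datetime.now() inside the helper, keeps the two names consistent and makes the helper easy to test.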

There is probably quite a bit more that needs to change here, but any help with the error message and the region issue would be appreciated.
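For reference, here is a hedged sketch of one thing that may be relevant to the error. In the QuickSight API, the S3Source DataSourceArn field is documented as the ARN of a QuickSight data source, not of an S3 bucket. The helper names, the data source ID, and the manifest key below are all hypothetical. The sketch also assumes a manifest file describing the CSV already exists in the bucket, and it only builds the request, it does not call AWS.

```python
def build_s3_data_source_request(account_id, data_source_id, bucket, manifest_key):
    """Build keyword arguments for quicksight_client.create_data_source.

    Hypothetical helper: an S3 data source in QuickSight points at a
    manifest file (bucket + key), not directly at the bucket ARN.
    """
    return {
        "AwsAccountId": account_id,
        "DataSourceId": data_source_id,
        "Name": data_source_id,
        "Type": "S3",
        "DataSourceParameters": {
            "S3Parameters": {
                "ManifestFileLocation": {"Bucket": bucket, "Key": manifest_key}
            }
        },
    }


def quicksight_data_source_arn(region, account_id, data_source_id):
    """Format the QuickSight data source ARN (not an arn:aws:s3::: ARN)."""
    return f"arn:aws:quicksight:{region}:{account_id}:datasource/{data_source_id}"


request = build_s3_data_source_request(
    account_id="123456",
    data_source_id="processed-sales-source",  # assumed ID, for illustration
    bucket="randombucket-processed-salestotalv4",
    manifest_key="manifest.json",             # assumed manifest location
)
# In the Lambda this would be sent with:
#   response = quicksight_client.create_data_source(**request)
# and response["Arn"] could then be passed as 'DataSourceArn' in create_data_set.
```

Whether this resolves the region error in this particular setup is untested; it only shows the ARN shape that the field is documented to take.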
