When I create an S3 VPC endpoint for the AWS Glue service to unload data from a Redshift cluster, the ETL job only works when the VPC endpoint policy is set to "Full Access", i.e.
{
  "Statement": [
    {
      "Action": "*",
      "Effect": "Allow",
      "Resource": "*",
      "Principal": "*"
    }
  ]
}
It does not work when the policy is set to "Custom" and modified as below.
{
  "Statement": [
    {
      "Action": "*",
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::examplebucket",
        "arn:aws:s3:::examplebucket/*"
      ],
      "Principal": "*"
    }
  ]
}
In the ETL job I have specified examplebucket as the location for both the ETL script and the temporary files, so I find it hard to understand why the job fails only when the policy is set to Custom. Does Glue try to access any S3 resource other than the bucket specified in the job?
Glue jobs also need access to the following:
1. A temporary directory in S3.
2. A location in S3 to store the generated Python script.
For example, if the script location is not specified, Glue automatically picks the following location: "s3://aws-glue-scripts-YourAccountId-us-east-1/".
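If you want to keep the "Custom" endpoint policy, one option is to extend the Resource list so it covers every bucket the job actually touches. A minimal sketch, assuming the default script bucket shown above is in use (substitute your own account ID, region, and bucket names):
{
  "Statement": [
    {
      "Action": "*",
      "Effect": "Allow",
      "Principal": "*",
      "Resource": [
        "arn:aws:s3:::examplebucket",
        "arn:aws:s3:::examplebucket/*",
        "arn:aws:s3:::aws-glue-scripts-YourAccountId-us-east-1",
        "arn:aws:s3:::aws-glue-scripts-YourAccountId-us-east-1/*"
      ]
    }
  ]
}
The exact set of ARNs depends on how the job is configured, so check both the script location and the temporary directory set on the job.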
Make sure your IAM role policies also cover the S3 locations that you picked.
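As an illustration, the S3 portion of the Glue job's IAM role policy could look something like the sketch below, again with placeholder bucket names that you would replace with your own:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::examplebucket",
        "arn:aws:s3:::examplebucket/*",
        "arn:aws:s3:::aws-glue-scripts-YourAccountId-us-east-1",
        "arn:aws:s3:::aws-glue-scripts-YourAccountId-us-east-1/*"
      ]
    }
  ]
}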