I have provisioned a Snowplow pipeline on AWS following the Snowplow open source docs (quick start: https://docs.snowplow.io/docs/getting-started-on-community-edition/quick-start/).
When I send a click event that uses my custom schema, which is stored in the Snowplow Iglu DB behind the Snowplow Iglu Server on AWS EC2, I can see the enriched event arriving in S3,
but I can't see this data in my Redshift cluster. My guess is that the RS Loader server is not picking up the events from S3 and loading them into Redshift. I have configured the Redshift security group to allow traffic from the RS Loader server's security group. I am attaching the IAM policy for the RS Loader server for clarity:
{
  "Statement": [
    {
      "Action": [
        "s3:ListBucket",
        "s3:PutObject",
        "s3:GetObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::dev-snowplow-terraform-sample-bucket/",
        "arn:aws:s3:::dev-snowplow-terraform-sample-bucket/*"
      ]
    },
    {
      "Action": [
        "s3:GetObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::dev-snowplow-terraform-sample-bucket/*/shredding_complete.json"
      ]
    },
    {
      "Action": [
        "sqs:DeleteMessage",
        "sqs:GetQueueUrl",
        "sqs:ListQueues",
        "sqs:ChangeMessageVisibility",
        "sqs:ReceiveMessage",
        "sqs:SendMessage"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:sqs:<region>:<account_id>:dev-snowplow-rs-loader.fifo"
      ]
    },
    {
      "Action": [
        "logs:PutLogEvents",
        "logs:CreateLogStream",
        "logs:DescribeLogStreams"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:logs:<region>:<account_id>:log-group:/aws/ec2/dev-snowplow-rs-loader-server:*"
      ]
    },
    {
      "Action": [
        "sts:AssumeRole"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:iam::<account_id>:role/dev-snowplow-rs-loader-server-sts-credentials"
      ]
    }
  ],
  "Version": "2012-10-17"
}
(In the real policy I am using the correct region and account ID.)
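For anyone wanting to sanity-check a policy like the one above offline, here is a rough sketch of a checker that asks "does any Allow statement cover this action on this resource?". It is a simplification and not how IAM actually evaluates policies: it ignores Deny statements, Condition blocks, NotAction/NotResource and policy variables, and it uses `fnmatch` as a stand-in for IAM wildcard matching. The function name `is_allowed`, the region `eu-west-1` and the account ID `123456789012` are placeholders made up for the example. The IAM policy simulator remains the authoritative tool.

```python
import fnmatch

# Hypothetical region/account values, only so the example ARN is concrete.
QUEUE_ARN = "arn:aws:sqs:eu-west-1:123456789012:dev-snowplow-rs-loader.fifo"

# Excerpt of the loader policy: the first S3 statement and the SQS statement.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket", "s3:PutObject", "s3:GetObject"],
            "Resource": [
                "arn:aws:s3:::dev-snowplow-terraform-sample-bucket/",
                "arn:aws:s3:::dev-snowplow-terraform-sample-bucket/*",
            ],
        },
        {
            "Effect": "Allow",
            "Action": [
                "sqs:DeleteMessage",
                "sqs:GetQueueUrl",
                "sqs:ListQueues",
                "sqs:ChangeMessageVisibility",
                "sqs:ReceiveMessage",
                "sqs:SendMessage",
            ],
            "Resource": [QUEUE_ARN],
        },
    ],
}


def _as_list(value):
    """IAM allows a single string or a list for Action/Resource."""
    return [value] if isinstance(value, str) else list(value)


def is_allowed(policy, action, resource):
    """Return True if some Allow statement matches both action and resource.

    Simplified: no Deny, Condition, NotAction or NotResource handling.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        # Action names are compared case-insensitively; ARNs are not.
        action_ok = any(
            fnmatch.fnmatchcase(action.lower(), pattern.lower())
            for pattern in _as_list(stmt.get("Action", []))
        )
        resource_ok = any(
            fnmatch.fnmatchcase(resource, pattern)
            for pattern in _as_list(stmt.get("Resource", []))
        )
        if action_ok and resource_ok:
            return True
    return False


if __name__ == "__main__":
    key_arn = "arn:aws:s3:::dev-snowplow-terraform-sample-bucket/some/key.json"
    print("s3:GetObject on key:", is_allowed(POLICY, "s3:GetObject", key_arn))
    print("sqs:ReceiveMessage:", is_allowed(POLICY, "sqs:ReceiveMessage", QUEUE_ARN))
```

A check like this only tells you what the policy document says. It cannot tell you whether the role is actually attached to the instance or whether the loader process is running.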
Also, after I send a test (click) event, the SQS queue shows 1 message available, but it is never consumed. Am I missing something? I'm really stuck here. I am using the Snowplow Terraform configs and haven't changed anything myself.
I expect the RS Loader to load the data from S3 into the Redshift cluster. Please mention any steps I may have missed while provisioning the Snowplow pipeline.
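The "1 message available but never consumed" symptom can also be watched from a script instead of the console. Below is a minimal sketch, assuming a boto3-style SQS client is passed in. `queue_depth` is a name invented for this example, while `get_queue_url` and `get_queue_attributes` are the standard SQS client calls. Note that the loader policy above does not grant `sqs:GetQueueAttributes`, so this would need to run under credentials that do.

```python
def queue_depth(sqs_client, queue_name):
    """Return approximate visible and in-flight message counts for a queue.

    `sqs_client` is expected to behave like boto3.client("sqs").
    """
    url = sqs_client.get_queue_url(QueueName=queue_name)["QueueUrl"]
    attrs = sqs_client.get_queue_attributes(
        QueueUrl=url,
        AttributeNames=[
            "ApproximateNumberOfMessages",            # waiting to be received
            "ApproximateNumberOfMessagesNotVisible",  # received, not yet deleted
        ],
    )["Attributes"]
    return {
        "visible": int(attrs["ApproximateNumberOfMessages"]),
        "in_flight": int(attrs["ApproximateNumberOfMessagesNotVisible"]),
    }
```

With boto3 installed and a region configured, this could be called as `queue_depth(boto3.client("sqs"), "dev-snowplow-rs-loader.fifo")`. If `visible` stays above zero and `in_flight` stays at zero across repeated calls, nothing is polling the queue, which points at the loader process itself (not running or not deployed correctly) and not at the data in S3.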
The issue is fixed. All I needed to do was take a fresh clone of the Snowplow repository and re-deploy the pipeline. I guess I was missing some modules in my local repo.