Snowplow RS Loader server not loading data from S3 to Redshift cluster


I have provisioned a Snowplow pipeline on AWS following the Snowplow docs for open source (quick start: https://docs.snowplow.io/docs/getting-started-on-community-edition/quick-start/). When I send a click event using my custom schema, which is stored in the Snowplow Iglu DB via the Snowplow Iglu Server on AWS EC2, I can see the enriched event arriving in S3, but I can't see this data in my Redshift cluster. I guess the RS Loader server is not picking up events from S3 and loading them into Redshift. I have configured the Redshift security group to allow traffic from the RS Loader server's security group. I am attaching the IAM policy for the RS Loader server for clarity:

{
    "Statement": [
        {
            "Action": [
                "s3:ListBucket",
                "s3:PutObject",
                "s3:GetObject"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::dev-snowplow-terraform-sample-bucket/",
                "arn:aws:s3:::dev-snowplow-terraform-sample-bucket/*"
            ]
        },
        {
            "Action": [
                "s3:GetObject"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::dev-snowplow-terraform-sample-bucket/*/shredding_complete.json"
            ]
        },
        {
            "Action": [
                "sqs:DeleteMessage",
                "sqs:GetQueueUrl",
                "sqs:ListQueues",
                "sqs:ChangeMessageVisibility",
                "sqs:ReceiveMessage",
                "sqs:SendMessage"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:sqs:<region>:<account_id>:dev-snowplow-rs-loader.fifo"
            ]
        },
        {
            "Action": [
                "logs:PutLogEvents",
                "logs:CreateLogStream",
                "logs:DescribeLogStreams"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:logs:<region>:<account_id>:log-group:/aws/ec2/dev-snowplow-rs-loader-server:*"
            ]
        },
        {
            "Action": [
                "sts:AssumeRole"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:iam::<account_id>:role/dev-snowplow-rs-loader-server-sts-credentials"
            ]
        }
    ],
    "Version": "2012-10-17"
}

(I am using the correct region and account ID in place of the placeholders.)
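
One detail in the policy above may be worth checking, although it is not confirmed as the cause of this problem: `s3:ListBucket` is a bucket-level action, so it is matched by a bucket ARN of the form `arn:aws:s3:::<bucket>`, while the first resource listed ends in a trailing slash. The following is a minimal Python sketch that flags bucket-level actions with no bucket-level ARN in the same statement. The function names and the "no slash after the prefix" heuristic are illustrative assumptions, not part of any AWS or Snowplow tooling.

```python
# Heuristic check: does each Allow statement that grants a bucket-level
# S3 action also list a bucket-level ARN (one with no "/" after the prefix)?
BUCKET_LEVEL_ACTIONS = {"s3:ListBucket", "s3:GetBucketLocation"}
S3_ARN_PREFIX = "arn:aws:s3:::"


def _as_list(value):
    """IAM allows a single string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def is_bucket_arn(resource):
    """True if the resource looks like a bucket ARN rather than an object ARN."""
    if resource == "*":
        return True
    if not resource.startswith(S3_ARN_PREFIX):
        return False
    return "/" not in resource[len(S3_ARN_PREFIX):]


def unmatched_bucket_actions(policy):
    """Return bucket-level actions whose statement has no bucket-level ARN."""
    problems = []
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        resources = _as_list(statement.get("Resource", []))
        if any(is_bucket_arn(r) for r in resources):
            continue
        actions = _as_list(statement.get("Action", []))
        problems.extend(a for a in actions if a in BUCKET_LEVEL_ACTIONS)
    return problems


# First statement of the policy above, copied as posted.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": ["s3:ListBucket", "s3:PutObject", "s3:GetObject"],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::dev-snowplow-terraform-sample-bucket/",
                "arn:aws:s3:::dev-snowplow-terraform-sample-bucket/*",
            ],
        }
    ],
}

print(unmatched_bucket_actions(policy))
```

If the list printed is non-empty, comparing the resource ARNs against what the Terraform module generates is a reasonable next step. The check ignores wildcards inside bucket names and `NotResource`, so an empty list does not prove the policy is sufficient.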

Also, after I send a test (click) event, the SQS queue shows 1 message available, but it is never consumed. Am I missing something? I am really stuck here. I am using the Snowplow Terraform configs and haven't changed anything on my own.
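
To make the "available but never consumed" observation concrete, the queue's approximate counters can be read programmatically. The following is a rough sketch, assuming `boto3` is installed and credentials with permission to read queue attributes are available; `fetch_queue_counts` and `summarize_queue` are hypothetical helper names, and the interpretation strings are only a rule of thumb.

```python
def fetch_queue_counts(queue_url):
    """Read the approximate message counters for one SQS queue."""
    import boto3  # imported lazily so summarize_queue works without AWS access

    sqs = boto3.client("sqs")
    response = sqs.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=[
            "ApproximateNumberOfMessages",
            "ApproximateNumberOfMessagesNotVisible",
        ],
    )
    return response["Attributes"]


def summarize_queue(attributes):
    """Turn the counters (returned by SQS as strings) into a short diagnosis."""
    visible = int(attributes.get("ApproximateNumberOfMessages", 0))
    in_flight = int(attributes.get("ApproximateNumberOfMessagesNotVisible", 0))
    if visible > 0 and in_flight == 0:
        return "messages waiting, none in flight: no consumer appears to be receiving"
    if in_flight > 0:
        return "messages in flight: a consumer has received them but not deleted them yet"
    return "queue is empty"


# Counters matching the situation described above: 1 available, 0 in flight.
print(summarize_queue({
    "ApproximateNumberOfMessages": "1",
    "ApproximateNumberOfMessagesNotVisible": "0",
}))
```

A queue that stays at "messages waiting, none in flight" suggests that the loader process is not polling at all, which points toward the loader's own logs or deployment rather than toward Redshift connectivity.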

I expect the RS Loader to load data from S3 into the Redshift cluster. Please mention any steps I may have missed while provisioning the Snowplow pipeline.


1 answer

jahanzaib younis

The issue is fixed. All I needed to do was take a fresh clone of the Snowplow repository and redeploy the pipeline. I guess some modules were missing from my local repo.