Formatting for Firehose transformation output

I am using AWS Kinesis Firehose with a custom Data Transformation. The Lambda's written in Python 3.6 and returns strings that look like the following:

{
    "records": [
        {
            "recordId": "...",
            "result": "Ok",
            "data": "..."
        },
        {
            "recordId": "...",
            "result": "Ok",
            "data": "..."
        },
        {
            "recordId": "...",
            "result": "Ok",
            "data": "..."
        }
    ]
}

The Lambda runs without errors and logs output like the above just before returning it to Firehose. However, the Firehose S3 logs then show an error:

Invalid output structure: Please check your function and make sure the processed records contain valid result status of Dropped, Ok, or ProcessingFailed.

Looking at the examples for this spread across the web in JS and Java, it's not clear to me what I need to be doing differently; I'm quite confused.

There are 2 answers

G. Nebiolo On

I hit the same error using Node.js.

According to the documentation (http://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html), my mistake was not base64-encoding the data field of every record.

I resolved it by building each output record like this:

{
    recordId: record.recordId,
    result: 'Ok',
    // Buffer.from replaces the deprecated new Buffer(...) constructor
    data: Buffer.from(JSON.stringify(data)).toString('base64')
}
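
For the Python Lambda in the question, the same requirement can be checked locally before deploying. The following is only a rough sketch: check_response is a hypothetical helper (not part of any AWS SDK), and it looks only at the three fields shown in the question, flagging "Ok" records whose data is not a valid base64 string.

```python
import base64

# Result values named in the Firehose error message quoted in the question.
VALID_RESULTS = {"Ok", "Dropped", "ProcessingFailed"}


def check_response(response):
    """Return a list of problems found in a transformation response.

    Hypothetical helper for local testing only.
    """
    problems = []
    for i, rec in enumerate(response.get("records", [])):
        if "recordId" not in rec:
            problems.append(f"record {i}: missing recordId")
        result = rec.get("result")
        if result not in VALID_RESULTS:
            problems.append(f"record {i}: invalid result {result!r}")
        if result == "Ok":
            data = rec.get("data")
            if not isinstance(data, str):
                problems.append(f"record {i}: data must be a str")
                continue
            try:
                # validate=True rejects characters outside the base64 alphabet.
                base64.b64decode(data, validate=True)
            except ValueError:
                problems.append(f"record {i}: data is not valid base64")
    return problems
```

Calling check_response on the dict the handler is about to return and logging the result should make an unencoded data field show up in the Lambda logs rather than only in the Firehose S3 error logs.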
David Meng On

If your data is a JSON object, you can try the following:

import base64
import json


def lambda_handler(event, context):
    output = []
    for record in event['records']:
        # Your own business logic goes here.
        json_object = {...}  # placeholder: replace with the transformed record as a dict
        output_record = {
            'recordId': record['recordId'],
            'result': 'Ok',
            # b64encode takes and returns bytes; decode to get a plain str.
            'data': base64.b64encode(json.dumps(json_object).encode('utf-8')).decode('utf-8')
        }
        output.append(output_record)
    return {'records': output}

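base64.b64encode only accepts bytes (b'xxx') and also returns bytes, while the 'data' attribute of output_record needs a normal str ('xxx'). That is why the JSON text is passed through encode('utf-8') before encoding and decode('utf-8') afterwards.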
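
To make the bytes/str distinction concrete, here is a minimal sketch of the full round trip for a single record. It assumes the incoming payload is UTF-8 encoded JSON; transform_record and the added "processed" field are hypothetical names used only for illustration.

```python
import base64
import json


def transform_record(record):
    """Decode one Firehose record, modify it, and re-encode it (illustrative only)."""
    # The incoming 'data' is a base64 str; b64decode returns bytes.
    payload = json.loads(base64.b64decode(record["data"]).decode("utf-8"))
    payload["processed"] = True  # hypothetical business logic

    # b64encode needs bytes and returns bytes, so encode before and decode after.
    encoded = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("utf-8")
    return {"recordId": record["recordId"], "result": "Ok", "data": encoded}


# Example input built the same way Firehose delivers it: base64 text in 'data'.
record = {"recordId": "1", "data": base64.b64encode(b'{"a": 1}').decode("utf-8")}
out = transform_record(record)
```

Here out["data"] is a plain str, so the dict returned by the handler can be serialized to JSON without further conversion.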