Parse CloudTrail logs with Python


I'm working on a Lambda function that gets events from CloudTrail and analyses them.

I have this script:

s3.download_file(bucket, key, download_path)
with gzip.open(download_path, "r") as f:
    data = json.loads(f.read())
    print json.dumps(data)
    for event in data['Records']:
        if event['eventName'] in event_list:
            dateEvent = datetime.strptime(event['eventTime'], "%Y-%m-%dT%H:%M:%SZ")
            for element in event['userIdentity']:
                for session in element[0]['sessionContext']:
                    username = session['userName']
                    role = session['arn']

I can't extract the userName and arn values from the event. I get this error:

string indices must be integers: TypeError
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 34, in lambda_handler
for session in element[0]['sessionContext']:
TypeError: string indices must be integers

How can I make this work? What is the right way to do it?

Here is the relevant part of the JSON:

"userIdentity": {
    "principalId": "aaaaaaaaaaaaaaaaaaaa",
    "accessKeyId": "aaaaaaaaaaaaaaaaaaaaa",
    "sessionContext": {
        "sessionIssuer": {
            "userName": "aaaaaaaaaaaaa",
            "type": "Role",
            "arn": "arn:aws:iam::aaaaaaaaaaaaaaaaaa:role/aaaaaaa",
            "principalId": "aaaaaaaaaaaaaaaaaa",
            "accountId": "aaaaaaaaaaaaaaaaaaa"
        },
        "attributes": {
            "creationDate": "2017-09-14T15:03:08Z",
            "mfaAuthenticated": "false"
        }
    },
    "type": "AssumedRole",
    "arn": "aaaaaaaaaaaaaaaaaaaaaaaa",
    "accountId": "aaaaaaaaaaaaaaaaaa"
},
Answer by wkl (best answer):

The userIdentity element may or may not have a sessionContext element, because sessionContext only exists if temporary IAM credentials were used during that event.

A userIdentity element without sessionContext looks like this:

"userIdentity": {
  "type": "IAMUser",
  "principalId": "AIDAJ45Q7YFFAREXAMPLE",
  "arn": "arn:aws:iam::123456789012:user/Alice",
  "accountId": "123456789012",
  "accessKeyId": "AKIAIOSFODNN7EXAMPLE",
  "userName": "Alice"
}

But a userIdentity with a sessionContext element would look like this:

"userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROAIDPPEZS35WEXAMPLE:AssumedRoleSessionName",
    "arn": "arn:aws:sts::123456789012:assumed-role/RoleToBeAssumed/MySessionName",
    "accountId": "123456789012",
    "accessKeyId": "AKIAIOSFODNN7EXAMPLE",
    "sessionContext": {
      "attributes": {
        "creationDate": "20131102T010628Z",
        "mfaAuthenticated": "false"
      },
      "sessionIssuer": {
        "type": "Role",
        "principalId": "AROAIDPPEZS35WEXAMPLE",
        "arn": "arn:aws:iam::123456789012:role/RoleToBeAssumed",
        "accountId": "123456789012",
        "userName": "RoleToBeAssumed"
      }
    }
}

...or it could even look like this if no role federation occurred:

"userIdentity": {
    "type": "IAMUser",
    "principalId": "EX_PRINCIPAL_ID",
    "arn": "arn:aws:iam::123456789012:user/Alice",
    "accountId": "123456789012",
    "accessKeyId": "EXAMPLE_KEY_ID",
    "userName": "Alice",
    "sessionContext": {"attributes": {
        "mfaAuthenticated": "false",
        "creationDate": "2014-03-06T15:15:06Z"
    }}
}

So going back to your code:

for element in event['userIdentity']:
    for session in element[0]['sessionContext']:
        username = session['userName']
        role = session['arn']

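This fails because userIdentity is a dict, not a list. Iterating over a dict yields its keys, which are strings, so element[0] is just the first character of a key, and indexing that character with 'sessionContext' raises "string indices must be integers". sessionContext isn't a list either, so there is nothing to loop over; you can index into the nested dicts directly.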
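To see where the TypeError comes from, here is a small sketch that mimics the loop from the question against a trimmed, made-up userIdentity dict. The function name broken_lookup and the sample values are only for illustration, and the code is written for Python 3.

```python
# A trimmed userIdentity dict, shaped like the one in the question (values are made up).
user_identity = {
    "type": "AssumedRole",
    "sessionContext": {
        "sessionIssuer": {
            "userName": "example-role",
            "arn": "arn:aws:iam::123456789012:role/example-role",
        }
    },
}

# Iterating over a dict yields its keys, which are plain strings.
keys = list(user_identity)


def broken_lookup(identity):
    """Mimic the loop from the question and return the error message it raises."""
    try:
        for element in identity:
            # element is a key such as "type"; element[0] is its first character,
            # and a string cannot be indexed with another string.
            element[0]["sessionContext"]
    except TypeError as exc:
        return str(exc)
    return None


error = broken_lookup(user_identity)

# Direct indexing into the nested dicts works, no loops needed.
issuer = user_identity["sessionContext"]["sessionIssuer"]
print(keys, error, issuer["userName"])
```

The last lines show the shape of the fix: index into sessionContext and sessionIssuer directly instead of looping.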

If you want the user name and role ARN that was used or assumed, I think this would work. It handles events made directly by an IAMUser as well as events made through an AssumedRole.

user_identity = event['userIdentity']

# check to see if we have a sessionContext[sessionIssuer]
if 'sessionIssuer' in user_identity.get('sessionContext', {}):
    user_name = user_identity['sessionContext']['sessionIssuer']['userName']
    arn = user_identity['sessionContext']['sessionIssuer']['arn']
else:
    user_name = user_identity['userName']
    arn = user_identity['arn']
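As far as I know, some identity types (for example, events issued by an AWS service) may carry neither userName nor arn, in which case the plain indexing above would raise a KeyError. A more defensive variant could look like the sketch below. The helper name extract_identity is hypothetical, and returning None for missing values is an assumption about what your code wants, not something CloudTrail prescribes.

```python
def extract_identity(user_identity):
    """Return (user_name, arn); either value may be None if the event lacks it."""
    # sessionContext is optional, and sessionIssuer is optional inside it.
    session_context = user_identity.get("sessionContext") or {}
    issuer = session_context.get("sessionIssuer")
    if issuer:
        # Assumed role: report the role that issued the session.
        return issuer.get("userName"), issuer.get("arn")
    # Direct IAM user (or another type without a session issuer).
    return user_identity.get("userName"), user_identity.get("arn")


# Made-up sample identities covering the three shapes shown earlier, plus a service event.
assumed_role = {
    "type": "AssumedRole",
    "arn": "arn:aws:sts::123456789012:assumed-role/RoleToBeAssumed/MySessionName",
    "sessionContext": {
        "sessionIssuer": {
            "userName": "RoleToBeAssumed",
            "arn": "arn:aws:iam::123456789012:role/RoleToBeAssumed",
        }
    },
}
iam_user = {
    "type": "IAMUser",
    "arn": "arn:aws:iam::123456789012:user/Alice",
    "userName": "Alice",
    "sessionContext": {"attributes": {"mfaAuthenticated": "false"}},
}
service_event = {"type": "AWSService", "invokedBy": "s3.amazonaws.com"}

for identity in (assumed_role, iam_user, service_event):
    print(extract_identity(identity))
```

Whether to skip events that come back as (None, None) or to fall back to another field such as principalId depends on what the analysis needs.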

And as a part of your processing loop:

for event in data['Records']:
    if event['eventName'] in event_list:
        dateEvent = datetime.strptime(event['eventTime'], "%Y-%m-%dT%H:%M:%SZ")
        user_identity = event['userIdentity']

        # check to see if we have a sessionContext[sessionIssuer]
        if 'sessionIssuer' in user_identity.get('sessionContext', {}):
            user_name = user_identity['sessionContext']['sessionIssuer']['userName']
            arn = user_identity['sessionContext']['sessionIssuer']['arn']
        else:
            user_name = user_identity['userName']
            arn = user_identity['arn']
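The question's snippet uses the Python 2 print statement. If the function runs on Python 3, the file-reading and filtering part might be sketched roughly as follows. The function name load_matching_events and the sample file are made up for illustration; the S3 download step is left out, and the identity lookup from above would go inside the loop over the results.

```python
import gzip
import json
import os
import tempfile
from datetime import datetime


def load_matching_events(path, event_names):
    """Yield (event, event_time) for records whose eventName is in event_names."""
    # "rt" opens the gzip stream in text mode so json.load receives str, not bytes.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        data = json.load(f)
    for event in data.get("Records", []):
        if event.get("eventName") in event_names:
            event_time = datetime.strptime(event["eventTime"], "%Y-%m-%dT%H:%M:%SZ")
            yield event, event_time


# Usage with a tiny made-up log file written to a temporary directory.
sample = {
    "Records": [
        {"eventName": "ConsoleLogin", "eventTime": "2017-09-14T15:03:08Z"},
        {"eventName": "ListBuckets", "eventTime": "2017-09-14T15:04:00Z"},
    ]
}
sample_path = os.path.join(tempfile.mkdtemp(), "sample.json.gz")
with gzip.open(sample_path, "wt", encoding="utf-8") as f:
    json.dump(sample, f)

matches = list(load_matching_events(sample_path, {"ConsoleLogin"}))
for event, event_time in matches:
    print(event["eventName"], event_time.isoformat())
```

Passing event_names as a set keeps the membership check cheap when the list of interesting events grows.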