Choosing or Aggregating Dimensions Recorded Against CloudWatch Agent Metrics


I'm using the procstat plugin of the CloudWatch agent to record per-process CPU usage.

https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-procstat-process-metrics.html

This is all being gathered OK, but the metric is being recorded with the dimensions 'Instance ID' (e.g. i-143...), 'Image ID' (e.g. ami-123...) and 'Instance Type' (e.g. t3.small).

When servers are scaled in and out, my alarms break because the Instance ID changes. I also update the AMI and may at some point change the Instance Type. If I'm addressing an instance ID, the AMI and instance type will be fixed anyway.

Is there a way to configure the CloudWatch agent to record the metrics without those dimensions, or a way for CloudWatch Metrics to aggregate across all instance IDs?

Answer by Tom Harvey (BEST ANSWER)

I found what I needed in the append_dimensions and aggregation_dimensions options of the CloudWatch agent config.

In the top level of the "metrics" block in the config you can add dimensions:

"metrics": {
        "append_dimensions": {
            "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
            "ImageId": "${aws:ImageId}",
            "InstanceId": "${aws:InstanceId}",
            "InstanceType": "${aws:InstanceType}"
        },
...

But at this top level you can only add those four AWS-provided dimensions (AutoScalingGroupName, ImageId, InstanceId and InstanceType).

You can add custom dimensions, but only inside the section for a specific metric. So, for example, in the CPU metrics collector:

         "metrics_collected": {
             "cpu": {
                 "append_dimensions": {
                     "CustomDimension": "Foo"
                 },
                 "measurement": [
                     "cpu_usage_idle",
                     "cpu_usage_iowait",
                     "cpu_usage_user",
                     "cpu_usage_system"
                 ],
                 "metrics_collection_interval": 60,
                 "resources": [
                     "*"
                 ],
                 "totalcpu": false
             },

You can add these to the procstat group as well, despite it being a list:

            "procstat": [
                 {
                     "append_dimensions": {
                         "CustomDimension": "Foo"
                     },
                     "pid_file": "/var/run/celerybeat/beat.pid",
                     "measurement": [
                         "cpu_usage",
                         "memory_locked",
                         "pid_count"
                     ]
                 }

Finally, you can aggregate on these custom dimensions using aggregation_dimensions at the top level of the metrics block.

Although the custom dimension is appended inside a specific metrics_collected section, you can still reference it in the top-level aggregation_dimensions list:

"metrics": {
         "append_dimensions": {
             "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
             "InstanceId": "${aws:InstanceId}",
             "InstanceType": "${aws:InstanceType}"
         },
         "aggregation_dimensions" : [
             ["AutoScalingGroupName"],
             ["AutoScalingGroupName", "InstanceType"],
             ["CustomDimension"],
             ["CustomDimension", "InstanceType"],
             ["CustomDimension", "pidfile"]
         ],
         "metrics_collected": {
...

The docs for these are at https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-Configuration-File-Details.html but it took me a while to dig them out and test them, and to work out that the custom dimensions need to live in the specific metrics_collected sections.
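
For reference, the fragments above can be combined into one complete file. This is only a minimal sketch assembled from the snippets in this answer, not a tested production config: "CustomDimension" and "Foo" are placeholder names, the pid_file path is the one from my setup, and I have trimmed the aggregation list to the roll-ups that matter for alarms that should survive scaling.

```json
{
  "metrics": {
    "append_dimensions": {
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
      "InstanceId": "${aws:InstanceId}",
      "InstanceType": "${aws:InstanceType}"
    },
    "aggregation_dimensions": [
      ["AutoScalingGroupName"],
      ["CustomDimension"],
      ["CustomDimension", "pidfile"]
    ],
    "metrics_collected": {
      "procstat": [
        {
          "append_dimensions": {
            "CustomDimension": "Foo"
          },
          "pid_file": "/var/run/celerybeat/beat.pid",
          "measurement": [
            "cpu_usage",
            "memory_locked",
            "pid_count"
          ]
        }
      ]
    }
  }
}
```

With something like this in place, an alarm can target the rolled-up series that carries only CustomDimension (or only AutoScalingGroupName), so it should keep working when the instance ID, AMI or instance type changes.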