AWS CloudWatch Logs filter pattern issues

12.4k views Asked by At

I have a several crawlers that crawls multiple sites and stores the contents in a database. The logs from the program are stored in CloudWatch Logs.

If the crawlers successfully pulls back content it looks like similarly to below

HTTP GET: 200 - https://www.thecheyennepost.com/news/national/r

HTTP GET: 200 - https://www.thecheyennepost.com/news/f-e-warren-hous

The issue I'm dealing with is identifying when 400 errors pop up. Below is an example:

HTTP GET: 429 - https://www.livingstonparishnews.com/search/?l=25&sort=

HTTP GET: 429 - https://www.livingstonparishnews.com/search/?l=25&sort=rele

HTTP GET: 429 - https://www.ktbs.com/search/?l=25&s=start_time&sd=desc&f=

I tried using status_code=4* but that didn't do anything

I just want to be able to filter any and all 400 errors.

Any help that can be provided would be greatly appreciated.

3

There are 3 answers

0
Derek Menénedez On BEST ANSWER

Yes! Now you can with Logs Insights :)

First... you need to have the new UI or in another way go to "Logs Insights" service... jaja

CloudWatch -> CloudWatch Logs -> Log groups -> [your service logs]

With the new UI you can see this button (or go to Logs Insights in the search engine of aws cli):

Cloud Watch Example

Now you can see this:

Logs Insights UI

  1. It's a box for querys, it's like a SQL.
  2. The time range in which you will search

Now in your case.. you need this query (tell me if you need to filter another thing)

fields @message
| sort @timestamp desc
| filter @message like /4{1}[0-9]{1}[0-9]{1}/

I see your logs and you have spaces between your status code and I think this is the best

fields @message
| sort @timestamp desc
| filter @message like / 4{1}[0-9]{1}[0-9]{1} /

And that's all

Now run the query and you will see only logs that contains status codes [4xx]. I hope that solve your problem

NOTE: if you go directly from search engine to Logs Insights you need to select the service logs that you scan with the query. On the combobox in top of query box.

0
Carl On

There's an example buried in the user guide under Filter pattern syntax for metric filters, subscription filters, filter log events, and Live Tail.

In your case, try { $.status="4*" }. (This is different from the suggestion in another answer using square brackets which I couldn't get to work despite referencing the same documentation page.)

Here's what it looks like on some API Gateway logs in CloudWatch (I only had 5xx errors available to test).

enter image description here

0
sihaya On

You can also use the special filtering syntax in the "Search log groups" functionality of CloudWatch logs. In your case you would enter the following search term:

[proto, verb, status=4*, ...]

I find it a bit simpler to use. However, it's not possible to save the queries anywhere.

The syntax is described here:

https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/FilterAndPatternSyntax.html#extract-log-event-values