I'm trying to parse Datadog logs that have a mixed format.
I have some logs coming to Datadog that look like this:
... some_field="value123" org_id="test-org-id" query_params="{"updated_date": {"eq": "2020-01-01T00:00:00Z", "lte": "2025-01-01T00:00:00Z"}, "active": {"eq": true}}" request_id="test-request-id" request_method="GET" ... Response: {"data": []}
So far I've been parsing them with the following rules in the Grok parser:
rule ...%{some_field}?%{org_id}?%{query_params}?%{request_id}?%{request_method}?...%{regex(".*"):msg}
org_id (%{regex("org_id=")}\"%{regex("[a-zA-Z_0-9\\-]+"):org_id}\"\s)
query_params (%{regex("query_params=")}\"%{regex("\\{.*\\}"):query_params}\"\s)
request_id (%{regex("request_id=")}\"%{regex("[a-zA-Z_0-9\\-\\=]+"):request_id}\"\s)
request_method (%{regex("request_method=")}\"%{regex("[a-zA-Z]+"):request_method}\"\s)
and here's what the output looks like:
{
"msg": "Response: {\"data\": []}",
"request_method": "GET",
"org_id": "test-org-id",
"query_params": "{\"updated_date\": {\"eq\": \"2020-01-01T00:00:00Z\", \"lte\": \"2025-01-01T00:00:00Z\"}, \"active\": {\"eq\": true}}",
"request_id": "test-request-id"
}
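For anyone who wants to reproduce this outside Datadog, here is a rough Python sketch of the same extraction. It is only an approximation: Python's `re` module is not Datadog's Grok engine, the `FIELD_PATTERNS` table and `extract` function are hypothetical names made up for this illustration, and the elided `...` parts of the log line are simply left out. It shows that the captured `query_params` value is a plain string until something explicitly decodes it as JSON.

```python
import json
import re

# Sample line adapted from the question; the elided "..." parts are omitted.
LINE = (
    'some_field="value123" org_id="test-org-id" '
    'query_params="{"updated_date": {"eq": "2020-01-01T00:00:00Z", '
    '"lte": "2025-01-01T00:00:00Z"}, "active": {"eq": true}}" '
    'request_id="test-request-id" request_method="GET" '
    'Response: {"data": []}'
)

# Hypothetical Python counterparts of the helper rules above.
FIELD_PATTERNS = {
    "org_id": r'org_id="([a-zA-Z_0-9\-]+)"\s',
    "query_params": r'query_params="(\{.*\})"\s',
    "request_id": r'request_id="([a-zA-Z_0-9\-=]+)"\s',
    "request_method": r'request_method="([a-zA-Z]+)"\s',
}


def extract(line: str) -> dict:
    """Capture each known field as a string, plus the trailing message."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, line)
        if match:
            fields[name] = match.group(1)
    tail = re.search(r"Response: .*", line)
    if tail:
        fields["msg"] = tail.group(0)
    return fields


fields = extract(LINE)

# The regex capture is still a string; decoding it is a separate step.
decoded = dict(fields, query_params=json.loads(fields["query_params"]))
```

Here `fields["query_params"]` is a `str`, while `decoded["query_params"]` is a nested `dict`, which is the shape I'm after.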
I'd like to:
- have the query params parsed as JSON so they come out like this:
{
  "msg": "Response: {\"data\": []}",
  "request_method": "GET",
  "org_id": "test-org-id",
  "query_params": {
    "updated_date": {
      "eq": "2020-01-01T00:00:00Z",
      "lte": "2025-01-01T00:00:00Z"
    },
    "active": true
  },
  "request_id": "test-request-id"
}
- get any advice on simplifying my master rule. I've tried something like this:
api_rule %{celery_prefix}?\s*%{data::keyvalue}?%{query_params}?%{data::keyvalue}?%{regex("Response: .*"):msg}
but it threw away the data from query_params for some reason.
I have control over the way the logs come out, so if anything needs to be changed from the backend that's also an option.
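One backend-side option, sketched here under assumptions: Datadog parses JSON-formatted logs automatically, so emitting each log line as a single JSON object could remove the need for custom Grok rules altogether. The sketch assumes a Python backend using the standard `logging` module; `JsonFormatter`, `build_logger`, and the `fields` attribute name are hypothetical names invented for this example, not part of any Datadog library.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    # Hypothetical attribute name used to pass structured fields via `extra=`.
    FIELDS_ATTR = "fields"

    def format(self, record: logging.LogRecord) -> str:
        payload = {"msg": record.getMessage(), "level": record.levelname}
        # Structured fields stay nested objects instead of quoted strings.
        payload.update(getattr(record, self.FIELDS_ATTR, {}))
        return json.dumps(payload)


def build_logger(name: str = "api") -> logging.Logger:
    """Create a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A call might then look like `build_logger().info('Response: {"data": []}', extra={"fields": {"org_id": "test-org-id", "query_params": {"active": {"eq": True}}}})`, which keeps `query_params` as a nested object in the emitted line rather than an escaped string.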
Thanks!
Solved with some help from the great folks at Datadog support.