I am in the process of replacing FluentD with Fluent Bit to ship logs from K8S to S3. I need some help with rewrite_tag and pushing logs to the correct path in S3.
FluentD config (excerpt):
  <record>
    environment ${record["kubernetes"]["namespace_name"]}
    pod ${record["kubernetes"]["pod_name"]}
    podid ${record["kubernetes"]["pod_id"]}
    container ${record["kubernetes"]["container_name"]}
  </record>
</filter>

<match **>
  @type s3
  s3_bucket logs.bucket.us-east-1.domain.com
  s3_region us-east-1
  s3_object_key_format %Y/%m/%d/${environment}/${container}/${container}-${environment}-%Y%m%d-%H%M-${podid}-%{index}.%{file_extension}
  store_as text
</match>
Fluent Bit config:
[INPUT]
    Name              tail
    Tag               s3logs.*
    Path              /var/log/containers/*.log
    Parser            cri
    multiline.parser  cri
    Mem_Buf_Limit     5MB
    Skip_Long_Lines   On
    Skip_Empty_Lines  On
    Refresh_Interval  10

[FILTER]
    Name                 kubernetes
    Match                s3logs.*
    Merge_Log            On
    K8S-Logging.Parser   On
    K8S-Logging.Exclude  On
    Keep_Log             Off
    Labels               Off
    Annotations          Off

[FILTER]
    Name    record_modifier
    Match   s3logs.*
    Record  cluster_name ${CLUSTER}

[FILTER]
    Name    lua
    Alias   set_std_keys
    Match   s3logs.*
    Script  /fluent-bit/scripts/s3_path.lua
    Call    set_std_keys

[FILTER]
    Name   rewrite_tag
    Match  s3logs.*
    Rule   $log ^.*$ s3.$namespace_name.$app_name.$container_name.$pod_id true
[OUTPUT]
    Name                    s3
    Match                   s3logs.*
    bucket                  logs.bucket.us-east-1.domain.com
    region                  us-east-1
    s3_key_format           /%Y/%m/%d/$TAG[1]/$TAG[2]/$TAG[3]/$TAG[3]-$TAG[1]-%Y%m%d-%H%M-${podid}.txt
    store_dir               /var/log/fluentbit-s3-buffers
    total_file_size         256MB
    upload_timeout          2m
    use_put_object          On
    preserve_data_ordering  On
S3 Lua script; based on my reading, you have to use a Lua script to extract the K8S metadata:
function set_std_keys(tag, timestamp, record)
    -- Default the cluster name if the record_modifier filter did not set one
    if record["cluster_name"] == nil then
        record["cluster_name"] = "mycluster"
    end

    if record["kubernetes"] ~= nil then
        local kube = record["kubernetes"]

        -- Pull up namespace
        if kube["namespace_name"] ~= nil and string.len(kube["namespace_name"]) > 0 then
            record["namespace_name"] = kube["namespace_name"]
        else
            record["namespace_name"] = "default"
        end

        -- Pull up container name
        if kube["container_name"] ~= nil and string.len(kube["container_name"]) > 0 then
            record["container_name"] = kube["container_name"]
        end

        -- Pull up pod id
        if kube["pod_id"] ~= nil and string.len(kube["pod_id"]) > 0 then
            record["pod_id"] = kube["pod_id"]
        end

        -- Pull up app name (Deployments, StatefulSets, DaemonSets, Jobs, CronJobs, etc.)
        if kube["labels"] ~= nil then
            local labels = kube["labels"]
            if labels["app"] ~= nil and string.len(labels["app"]) > 0 then
                record["app_name"] = labels["app"]
            elseif labels["app.kubernetes.io/instance"] ~= nil and string.len(labels["app.kubernetes.io/instance"]) > 0 then
                record["app_name"] = labels["app.kubernetes.io/instance"]
            elseif labels["k8s-app"] ~= nil and string.len(labels["k8s-app"]) > 0 then
                record["app_name"] = labels["k8s-app"]
            elseif labels["name"] ~= nil and string.len(labels["name"]) > 0 then
                record["app_name"] = labels["name"]
            end
        end
    end

    -- 2 = record was modified, keep the original timestamp
    return 2, timestamp, record
end
Shipping logs via FluentD goes to the correct path in S3, for example:
2023/10/14/namespace/container_name/container-name-namespace_name-2023-10-14-UUID.txt
Shipping logs via Fluent Bit goes to the wrong path in S3, for example:
2023/10/14/var/log/containers/containers-var-20231014-0759-.log-object00N1PX3n
How can I get the logs from Fluent Bit shipped to the correct path? I'm sure it's just a configuration issue, but I've been through the Fluent Bit docs, sought help on Slack, and even scrolled through GitHub, all to no avail.
If I understand correctly, you have a working FluentD pipeline that writes to the S3 paths you want, and a Fluent Bit pipeline that is meant to replace it but writes to the wrong paths.
The issue comes down to constructing the correct s3_key_format and making sure the tag rewrite and Lua script populate the fields in the log record that are used to build the S3 object key. In other words, you need to align Fluent Bit's S3 path formatting with FluentD's.
Try updating the s3_key_format field in the [OUTPUT] section of your Fluent Bit configuration to match the format used in FluentD, and make sure the field references are correctly mapped to the record fields populated by your Lua script and other filters. Then verify your Lua script: check that it is actually extracting and setting the required fields (namespace_name, container_name, and pod_id) in the log records. To do that, add some logging in the script and inspect the values being set.
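For instance, a throwaway debug line placed just before the return in set_std_keys (debug-only; this assumes you can read the Fluent Bit pod's stdout, e.g. with kubectl logs):

    -- temporary debug: dump the derived fields so you can confirm they are set
    print(string.format("s3_path debug: tag=%s ns=%s app=%s container=%s pod=%s",
        tag,
        tostring(record["namespace_name"]),
        tostring(record["app_name"]),
        tostring(record["container_name"]),
        tostring(record["pod_id"])))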
Also, make sure the rewrite_tag filter is configured to generate the tags you need. Verify that the regex and replacement pattern are correct and that the fields referenced in the pattern are populated as expected. One thing your bad sample key hints at: the var/log/containers segments suggest the [OUTPUT] is matching the original s3logs.* file-path tag rather than a rewritten s3.* tag, so $TAG[1] through $TAG[3] resolve to var, log, and containers. A sketch of a possible fix follows.
Notes:
In Fluent Bit, the s3_key_format_tag_delimiters option lets you specify the characters used to split the tag into parts, which can then be referenced in s3_key_format as $TAG[n]. Make sure your tags are structured so this splitting yields the path segments you want; there is a small sketch after these notes.
FluentD allows for specific time formatting and tagging through its <buffer> configuration. If possible, replicate that setup in Fluent Bit to achieve similar path formatting.
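A minimal sketch of the tag-splitting idea, using a hypothetical tag layout s3.myapp_prod and splitting on the underscore as well as the dot:

[OUTPUT]
    Name    s3
    Match   s3.*
    bucket  logs.bucket.us-east-1.domain.com
    region  us-east-1
    # split the tag on "." and "_": s3.myapp_prod yields
    #   $TAG[0]=s3, $TAG[1]=myapp, $TAG[2]=prod
    s3_key_format_tag_delimiters ._
    s3_key_format /%Y/%m/%d/$TAG[2]/$TAG[1]/$TAG[1]-$TAG[2]-%Y%m%d-%H%M-$UUID.txt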