The jobs in my system log to files: each job's output goes to its own file, named after the job's id.
Most of the actions in the system are executed through these jobs, so I'm expecting a high number of them.
My first approach was to extract a label from the file name pattern, which works fine but produces a separate stream for every job id.
From the Loki documentation about label fundamentals I understand I should be striving to minimize the cardinality of the logs being indexed, which leaves me with a problem.
I want users to be able to query by job id,
- either through the label
{job="myjob", job_id="1234"}
- or through a text search
{job="myjob"} |= "job_id=1234"
(which is the recommended way to avoid high index cardinality).
Given that the job id is derived from the log file's name and not the log text, how should I approach this?
Is it possible to artificially add the job id to the text of every extracted log line? If so, is that good practice?
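To make clear what I mean by "artificially adding" the job id, here is a rough sketch in plain Python. It is not Loki or agent configuration, just an illustration of the transformation I have in mind. The function name `tag_line`, the example path, and the assumption that files are named `<job_id>.log` are all made up for this example.

```python
from pathlib import Path


def tag_line(file_name: str, line: str) -> str:
    """Prefix a log line with job_id=<id>, where <id> is taken from the file name.

    Assumption (hypothetical layout): the file is named "<job_id>.log".
    """
    job_id = Path(file_name).stem  # "1234.log" -> "1234"
    return f"job_id={job_id} {line.rstrip()}"


if __name__ == "__main__":
    # Hypothetical path and log line, for illustration only.
    print(tag_line("/var/log/jobs/1234.log", "job started\n"))
```

With lines rewritten like this, the text search {job="myjob"} |= "job_id=1234" would have something to match even though the id never appears in the original log text.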
How is label cardinality affected over time? That is, if a label value exists only for a limited time span (like a job id, which only lives as long as its job), is using it as a label still considered bad practice?