I'm new to ELK and I'm currently struggling with my setup - maybe I missed a point. I have set up Logstash to poll my MySQL database, and I have two cases:
"Unforeseen maintenance" -> In this case, I would need to stop/start Logstash manually without purging my DB, so when it starts, I need it to track the records that have not been ingested yet.
"Planned maintenance" -> In this case, I make a TRUNCATE on my DB, thus getting back to 0 with my Ids. I thought that by altering the file set in "last_run_metadata_path", Logstash would start using this new value that I give it. Though, it looks like it keeps the old Id in memory. Hence, my new entries are lost and never ingested.
Here is my setup:
input {
  jdbc {
    codec => plain { charset => "UTF-8" }
    clean_run => false
    jdbc_driver_library => "C:\com.mysql.jdbc_5.1.5.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/talend_logs?characterEncoding=utf8"
    jdbc_user => "root"
    jdbc_password => "************"
    schedule => "* * * * *"
    statement => "SELECT id, pid, ...... FROM flowlogs WHERE id > :sql_last_value ORDER BY id"
    use_column_value => true
    tracking_column => "id"
    last_run_metadata_path => "C:\logstash-8.6.1\config\lastValueOfMySqlForLogs.yml"
  }
}
output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    index => "mylogs"  # Elasticsearch index names must be lowercase
    user => "elastic"
    password => "*********"
    cacert => "C:\elasticsearch-8.6.1\config\certs\http_ca.crt"
  }
  stdout {}
}
Could you please let me know if I misunderstood something in the behaviour of Logstash with these tracked ids? Is it even possible to achieve what I'm trying to do with ids? (I couldn't work with dates, unfortunately.)
Thank you very much.
I have tried working with the clean_run parameter. When set to true, it restarts at 0 on every restart, which is normal per the documentation. When I set it to false together with last_run_metadata_path, I expected it to restart from the last value stored in that file. It does, but the value only ever increases - I'm not able to reset it (which I would need after I truncate my table).
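Concretely, this is the kind of value I wrote into lastValueOfMySqlForLogs.yml to try to force a reset (same YAML format as what Logstash writes into it):

--- 0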
Nevermind, I think I solved my issue by changing the setup: I'm no longer tracking by id but by timestamp.
I'm using a date column from my DB (moment), which is the date at which the record is inserted into the DB. Then I let Logstash set sql_last_value to the last query execution date.
I just had to handle the timezones in this configuration. So far it looks like it solves both my issues, so I'll close this topic.
Here is the new setup in case it helps anyone.
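This is a minimal sketch of the timestamp-based input: the connection details are the ones from my original setup, moment is my insertion-date column, and the jdbc_default_timezone value (Europe/Paris here) is a placeholder - set it to whatever timezone your DB server stores dates in:

input {
  jdbc {
    codec => plain { charset => "UTF-8" }
    clean_run => false
    jdbc_driver_library => "C:\com.mysql.jdbc_5.1.5.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/talend_logs?characterEncoding=utf8"
    jdbc_user => "root"
    jdbc_password => "************"
    # tell Logstash which timezone the DB timestamps are stored in,
    # so the :sql_last_value comparison lines up (placeholder value)
    jdbc_default_timezone => "Europe/Paris"
    schedule => "* * * * *"
    # use_column_value is left at its default (false), so sql_last_value
    # is the last query execution time - it keeps advancing on its own
    # and survives a TRUNCATE of the table
    statement => "SELECT id, pid, moment, ...... FROM flowlogs WHERE moment > :sql_last_value ORDER BY moment"
    last_run_metadata_path => "C:\logstash-8.6.1\config\lastValueOfMySqlForLogs.yml"
  }
}

The output section is unchanged from the setup above.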