Am a Flink beginner.
Tutorials whatever I see focus on very simple logic with on the fly data. Like when temperature is > 100 within x seconds, etc.
How do I bring in a a logic like this
1. When the temperature is 90 deg for 10 consecutive records
2. When the temperature for the last 10 minutes is < 90
A stypid question , Does apache flink support this kind of pattern
I could see within x seconds
but there is nothing like for x minutes / for x records
You can certainly address such use cases with Flink.
I'll sketch some solutions using Flink's window operators below. Note that there are also other ways to do this using custom functions / operators which can provide lower latency and less state to handle but which require more user-defined code.
This can be done using a sliding window that collects ten records and slides by one record. You should implement the window function as a ReduceFunction which immediately combines the records of the window into a boolean value that encodes whether all temperatures are > 90 degs or if one is not. This will reduce the space requirements to one record per window, i.e., 10 records at a time (because 10 windows are simultaneously computed). Note that count windows can be problematic because ordering is somewhat hard to reason about in a distributed stream processor.
This can be done using a sliding time window, e.g., a window over 10 minutes which slides by one minute. This will give you a 1 minute resolution, i.e., it will check every minute for the temperature of the last 10 minutes. Again, you'll have one record for each window (10 at a time for a 10min/1min window, 20 at a time for a 10min/30sec window, ...). The other logic is the same as with the count approach above. If you use event-time logic, you can control records with out-of-order timestamps.
Depending on you use case the window approach might be sufficient. If you need better latency or have you can implement your use case also in a stateful
FlatMapFunction
(last 10 records) or statefulProcessFunction
which gives you access to timestamps and watermarks for better time control.More complex patterns can be detected by Flink's CEP library.