For a Flink streaming / Flink stateful function, it is known that the setBufferTimeout to small value (e.g., 5ms) would give 'best' latency experience. What are the other recommended configuration values one must take care (set, reset, modify..) when optimizing for latency in Flink stream or stateful functions jobs?
Flink optimal configuration for minimum Latency
1k views Asked by Mazen Ezzeddine At
1
There are 1 answers
Related Questions in APACHE-FLINK
- Fine grained resource mangement and heap memory in flink task slot
- Does parallel flink tasks affect each other if they are unioned at the end?
- I am facing issue with ParquetFileWriting n hdfs in flink where parquet file size is around 382 KB . I want the parquet file in MB
- Apache Flink (AWS) does not recognize saved temporary function
- Flink 1.19 error Cannot determine simple type name "com"
- Unsupported options found for 'hudi'
- Flink 1.18 register custom API endpoint handler
- Flink Stuck on Broadcast
- Blunder about RichCoFlatMapFunction in flink 1.17.2 according to the official leanring guide
- Is there a way to store & retrieve a window's state in flink
- puzzled with flink window state
- Flink 1.15.2 OOM issue due to RocksDB
- How to create custom metrics with labels (python SDK + Flink Runner)
- flink-rpc-akka-loader - Security Vulnerability Issues
- I am new to Apache Flink and getting error FileNotFoundError: [WinError 2] at in_streaming_mode() The system cannot find the file specified
Related Questions in FLINK-STREAMING
- Fine grained resource mangement and heap memory in flink task slot
- Flink 1.19 error Cannot determine simple type name "com"
- Getting FlinkRuntime Exception during oracle exactly once jdbc sink
- Is there a way to store & retrieve a window's state in flink
- puzzled with flink window state
- Flink 1.15.2 OOM issue due to RocksDB
- If I emit an event from an operator after holding it in state for certain duration will the downstream operator accept it if it is past the watermark?
- How to write to Kafka Topic(Or to a file) from a Flink Stream
- Flink marks source late arriving events
- Why is flink UI not showing the right numbers?
- Union of bounded and unbounded streams in flink
- gRPC Connection Cancelled with "Multiplexer Hanging Up" Error in PyFlink
- Delta Lake as ingress for Flink Stateful Functions
- implement custom partitioning with windowAll()
- implementation of RoundRobin partitioning in Apache Flink
Related Questions in FLINK-STATEFUN
- Run an Apache flink job in python, using Kafka without Docker
- How to manage joining metadata against an event in Flink with large, rarely changing metadata
- Does the Flink Kubernetes Operator support the deployment of Apache Flink Stateful Functions?
- How to handling completion of multiple asynchronous messages and ensuring exactly-once semantic in Flink Statefun
- Programmatically determining when Flink Stateful Functions has fully processed a batch of Kafka events
- Multiple Flink Statefun jobs on the same Flink cluster
- In StateFun with Apache DataStream examples how do we connect to remote Stateful Functions
- Timeout issue with Flink Stateful Functions with Azure Event Hub Kafaka endpoint
- How can you set the parallelism for a specific ingress for an embedded statefun application
- What is the correct way to scale flink statefun remote function
- Is there a way to broadcast configuration into all task managers or all FlatMapFunctions?
- How can I programmatically terminate the Statefun Harness
- Using Flink connector within Flink StateFun
- Using AWS Kinesis with localstack and Apache Flink ingress
- Flink Statefun Under Backpressure application crashes
Popular Questions
- How do I undo the most recent local commits in Git?
- How can I remove a specific item from an array in JavaScript?
- How do I delete a Git branch locally and remotely?
- Find all files containing a specific text (string) on Linux?
- How do I revert a Git repository to a previous commit?
- How do I create an HTML button that acts like a link?
- How do I check out a remote Git branch?
- How do I force "git pull" to overwrite local files?
- How do I list all files of a directory?
- How to check whether a string contains a substring in JavaScript?
- How do I redirect to another webpage?
- How can I iterate over rows in a Pandas DataFrame?
- How do I convert a String to an int in Java?
- Does Python have a string 'contains' substring method?
- How do I check if a string contains a specific word?
Trending Questions
- UIImageView Frame Doesn't Reflect Constraints
- Is it possible to use adb commands to click on a view by finding its ID?
- How to create a new web character symbol recognizable by html/javascript?
- Why isn't my CSS3 animation smooth in Google Chrome (but very smooth on other browsers)?
- Heap Gives Page Fault
- Connect ffmpeg to Visual Studio 2008
- Both Object- and ValueAnimator jumps when Duration is set above API LvL 24
- How to avoid default initialization of objects in std::vector?
- second argument of the command line arguments in a format other than char** argv or char* argv[]
- How to improve efficiency of algorithm which generates next lexicographic permutation?
- Navigating to the another actvity app getting crash in android
- How to read the particular message format in android and store in sqlite database?
- Resetting inventory status after order is cancelled
- Efficiently compute powers of X in SSE/AVX
- Insert into an external database using ajax and php : POST 500 (Internal Server Error)
End-to-end latency is affected by many factors. Ignoring latency accrued before events are ingested by Flink, that leaves these issues to consider:
Take advantage of operator chains. Avoid unnecessary use of keyBy and changes of parallelism. Use
reinterpretAsKeyedStreamwhere appropriate.These points above will help avoid unnecessary serialization, but you should also take care to optimize serialization. Using a slow serializer can have an enormous impact, as can using complex, deeply nested collection types where something simpler would do.
You should always enable object reuse. By default, Flink defensively makes copies of objects being passed down operator chains. When enabling object reuse, keep in mind that it is not safe to
If you avoid those two points, you may
If you are using event time processing, the optimal situation would be to be able to rely on having ascending timestamps, and to generate watermarks accordingly (with zero delay). If you are doing windowing, doing pre-aggregation will avoid load spikes as windows are closed, and configuring a short auto-watermarking interval will help minimize latency.
The FsStateBackend maintains state as objects on the heap, which are then subject to GC. This state backend has the best average latency, but you will want to carefully tune your garbage collector to avoid GC stalls. While much slower overall, the RocksDB state backend may have better worst-case latency, especially if you need to run with many task slots per task manager. With the FsStateBackend, one slot per TM will keep the scope for GC smaller, which helps reduce latency.
Avoid having many timers that fire simultaneously. Arrange for windows for different keys to fire at different times.
Keep in mind that downstream consumers of transactional sinks will experience latency that is governed by the checkpointing interval.
If you don't need exactly once guarantees, disable checkpoint barrier alignment by configuring checkpointing to use
CheckpointConfigInfo.ProcessingMode.AT_LEAST_ONCE.Unaligned checkpoints can, in some cases, be very helpful.
And finally, do whatever you can to avoid backpressure. Give the job more-than-adequate resources. Don't do any blocking i/o in your user functions. Try to avoid data skew (hot keys).