I often have typical ETL code that looks like this:
./call_some_api.py | ./extract_transform | ./load_to_some_sql
What if either of the first two scripts stops sending bytes because of some internal error that causes it to stall? I wish there were another program I could put before ./load_to_some_sql
that would detect that no bytes have been sent in 5 minutes and exit with a non-zero code.
Curl has --speed-limit and --speed-time, which do exactly this, but they aren't generalized for other CLI apps.
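For reference, this is how those two flags are used with curl itself (the URL is just a placeholder):
curl --speed-limit 1 --speed-time 300 https://example.com/stream
That makes curl abort with a non-zero exit code if the average transfer speed stays below 1 byte per second for 300 seconds.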
Is there a way to do this? Some way to make the pipe fail if its throughput drops below a certain threshold? I know state machines and other orchestration tools can do this, but if there's a way to do it from bash, that would be helpful!
If you are interested in an inactivity timeout, these can do it.
One-liner (mentioned in the comments). Adjust $t to the number of seconds of inactivity to allow:
perl -e'$t=300;$SIG{ALRM}=sub{die"Timeout\n"};alarm$t;while(<>){print;alarm$t}'
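Dropped into your pipeline, it sits just before the loader (a usage sketch reusing the script names from the question):
./call_some_api.py | ./extract_transform \
  | perl -e'$t=300;$SIG{ALRM}=sub{die"Timeout\n"};alarm$t;while(<>){print;alarm$t}' \
  | ./load_to_some_sql
If no line arrives for 300 seconds, the perl process dies with a non-zero exit status and the loader sees end-of-input; with set -o pipefail in bash the pipeline as a whole then reports the failure. Note that the one-liner resets the alarm per line, so data arriving without newlines will not reset it.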
More robust:
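Along these lines, a sketch that reads raw bytes instead of lines, so binary output and data without newlines also reset the timer (the 64 KiB buffer, the 300-second default, and taking the timeout as an optional argument are my own choices here):

    #!/usr/bin/perl
    # Pass stdin through to stdout; die if no bytes arrive for $timeout seconds.
    use strict;
    use warnings;

    my $timeout = shift // 300;      # inactivity window in seconds (optional argument)

    $SIG{ALRM} = sub { die "Timeout: no input for ${timeout}s\n" };

    $| = 1;                          # autoflush so downstream sees data promptly
    my $buf;
    alarm $timeout;
    while (1) {
        my $n = sysread(STDIN, $buf, 65536);   # raw read, works for binary data too
        die "read error: $!\n" unless defined $n;
        last if $n == 0;             # EOF: upstream closed the pipe normally
        alarm $timeout;              # got data, restart the inactivity clock
        print $buf or die "write error: $!\n";
    }
    alarm 0;                         # cancel the pending alarm before a clean exit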
If you are interested in measuring bytes per second, this can do it, checking on an interval (the calculation could be improved to track the bandwidth over the last n seconds instead of a single interval).
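A sketch of that approach, passing data through while counting bytes and checking the average rate once per interval (the 60-second interval and the 1024 bytes-per-second floor are illustrative defaults, not values from the question):

    #!/usr/bin/perl
    # Pass stdin through to stdout; every $interval seconds, die if the average
    # throughput over that interval fell below $min_bps bytes per second.
    use strict;
    use warnings;
    use Errno qw(EINTR);

    my $interval = 60;        # length of each measurement window, in seconds
    my $min_bps  = 1024;      # minimum acceptable bytes per second

    my $bytes = 0;            # bytes seen in the current window

    $SIG{ALRM} = sub {
        my $rate = $bytes / $interval;
        die sprintf("Throughput %.1f B/s is below %d B/s\n", $rate, $min_bps)
            if $rate < $min_bps;
        $bytes = 0;           # start a new window
        alarm $interval;
    };

    $| = 1;
    alarm $interval;
    while (1) {
        my $n = sysread(STDIN, my $buf, 65536);
        if (!defined $n) {
            next if $! == EINTR;   # read was interrupted by the periodic ALRM check
            die "read error: $!\n";
        }
        last if $n == 0;           # EOF
        $bytes += $n;
        print $buf or die "write error: $!\n";
    }
    alarm 0;

The improvement mentioned above would replace the single $bytes counter with per-second counts over a sliding window of the last n seconds.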
Both programs are for Linux only.