I want to back up my system by invoking a tar'ing script over ssh that pipes the archive back to stdout, so the ssh-initiating host can store the tar.
However, I also want to include logical dumps of some services running on that host, and I do not have enough disk space there to write these huge dumps to disk first and then have tar pick them up.
I do know that tar cannot handle streams (or any file whose size is not known in advance). So I figured I would split the dumps, while they are being produced, into fixed-size chunks, store each chunk on disk temporarily, hand it to tar for processing, and delete it before processing the next chunk.
My script for this looks something like:
mkfifo filenames
tar --files-from filenames -cf - &
TAR_PID=$!
exec 100>filenames
# tar all relevant host-level directories/files
echo "/etc" >&100
echo "/root" >&100
function splitfilter() {
    # write the current chunk to disk (discard tee's pass-through copy)
    tee "$1" > /dev/null
    (
        # wait for tar to finish reading the file and delete it after being processed
        inotifywait -e close_nowrite "$1"
        rm "$1"
    ) &
    RM_SHELL_PID=$!
    # send the filename for processing to tar
    echo "$1" >&100
    wait $RM_SHELL_PID
}
export -f splitfilter
# perform the logical dumps of my services
dump_program_1 | split -b 4K --filter 'splitfilter "$FILE"' - /var/backup/PREFIX_DUMP_1_
dump_program_2 | split -b 4K --filter 'splitfilter "$FILE"' - /var/backup/PREFIX_DUMP_2_
exec 100>&-
wait $TAR_PID
rm filenames
However, I cannot figure out why this works only some of the time. I have observed two distinct failure behaviours so far:
- tar not stopping. At the end of the script I close the file descriptor, so I expect the fifo to signal EOF to tar. This should end the tar process rather quickly, as it only needs to finish processing the last 4K chunk (if it hasn't already). I cannot explain why it randomly hangs. The resulting archive is actually complete (except for tar's EOF marker)...
- tar processing 0-byte files. After some time of processing, it seems `inotifywait` wakes up before tar has closed the chunk file for reading, so the chunk gets deleted and shows up as a 0-byte entry in the archive. I have mitigated this somewhat by putting a `sleep 1` after the `echo "$1" >&100` call. With that, the first couple of chunks do get filled, but after running for a while the later chunks become 0-sized again. I sense a timing problem here somewhere, but cannot see it at the moment.
After a day of debugging I am losing hope in this approach, but it would be so good if it worked reliably: it could actually produce streamed tars! Don't get me wrong: it worked once or twice while debugging. I just cannot figure out why it does not work every time.
The tar format is fairly simple. We can stream it ourselves with this TXR Lisp program.
Caveat: this doesn't handle long paths; it puts out only one header block per object.
The backup list consists of a mixture of paths and command entries.
Commands are executed, and their output chopped into 4K pieces which become numbered files. These are deleted as we go, so nothing accumulates.
Even when we write our own implementation of tar, we still have to do this because the format requires the size of every object to be known in advance and placed into the header. There is no way to stream arbitrarily long command output as a tar stream.
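To make the size requirement concrete (just an illustration, with throwaway example paths): in the ustar header layout, a member's size is stored as a 12-byte octal field starting at byte offset 124 of its 512-byte header block, so it has to be known before the first data block of that member can be written.

# Illustration only: peek at the octal size field of the first header block.
printf 'hello world\n' > /tmp/demo.txt           # 12 bytes of content
tar -cf /tmp/demo.tar -C /tmp demo.txt
dd if=/tmp/demo.tar bs=1 skip=124 count=12 2>/dev/null; echo
# prints something like 00000000014 (octal 14 = 12 bytes)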
I don't have a regression test suite for this; I tested it manually by archiving individual objects of various kinds and doing comparisons of hex dumps between that and GNU tar, and then unpacking directory trees archived by this implementation, doing recursive diffs to the original tree.
However, I wonder whether the backup service you are using won't handle catenated archives. If it handles catenated archives, then you can just use multiple invocations of `tar` to produce the stream, and not have all these process coordination issues. For a tar consumer to handle catenated archives, it just has to ignore all-zero blocks (not treat them as the end of the archive) and keep reading; GNU tar's `-i` / `--ignore-zeros` option does this when extracting, for instance.
If the backup service is like this then you can basically do it along the lines of this:
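The following is a rough, untested sketch of that; it reuses the dump commands, the 4K chunk size and the /var/backup prefixes from the question, and everything else is made up for illustration:

#!/bin/sh
# Emit several tar archives back to back on stdout. A consumer that
# ignores zero blocks (e.g. GNU "tar -x --ignore-zeros") reads them as
# one archive.

# Host-level paths go out as one ordinary archive.
tar -cf - /etc /root

# Each 4K chunk of a dump is written to disk, wrapped in its own small
# archive, and deleted immediately, so only one chunk exists at a time.
# (split runs the filter once per chunk, sequentially, with $FILE set.)
chunk_filter='
    cat > "$FILE"
    tar -cf - -C /var/backup "$(basename "$FILE")"
    rm -f "$FILE"
'

dump_program_1 | split -b 4K --filter "$chunk_filter" - /var/backup/PREFIX_DUMP_1_
dump_program_2 | split -b 4K --filter "$chunk_filter" - /var/backup/PREFIX_DUMP_2_

The receiving side then reassembles a dump by extracting the chunks and cat'ing them in name order; for big dumps you would want a larger chunk size or split's -a option so the two-letter suffixes don't run out.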
I can't see any option in GNU Tar not to write the terminating zeros. It might be possible to write a filter to get rid of these:
The not-yet-written `remove-zero-blocks` filter reads 512-byte blocks through a block-oriented FIFO that is long enough to cover the blocking factor used by `tar`. It places each newly read block into one end of the FIFO and writes out the oldest block that is bumped from the other end. When EOF is encountered, the FIFO is flushed, but all trailing 512-byte blocks that are zero are omitted. That should defeat a backup service that refuses to ignore zero blocks.
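Here is an untested sketch of such a filter in shell; instead of the fixed-length FIFO described above it simply withholds runs of all-zero blocks and drops any run that extends to EOF, which yields the same output. GNU dd's iflag=fullblock is assumed, and the one-dd-per-block loop is for clarity, not speed.

#!/bin/sh
# remove-zero-blocks: copy stdin to stdout in 512-byte blocks, dropping
# any run of all-zero blocks that extends to EOF. Interior zero blocks
# are kept, because they are flushed as soon as a non-zero block follows.

blk=$(mktemp) || exit 1
zeroblk=$(mktemp) || exit 1
trap 'rm -f "$blk" "$zeroblk"' EXIT
dd if=/dev/zero of="$zeroblk" bs=512 count=1 2>/dev/null

withheld=0                             # all-zero blocks not yet written

while :; do
    dd of="$blk" bs=512 count=1 iflag=fullblock 2>/dev/null
    [ -s "$blk" ] || break             # zero bytes read: EOF
    if cmp -s "$blk" "$zeroblk"; then
        withheld=$((withheld + 1))     # possibly the terminator; hold it back
    else
        if [ "$withheld" -gt 0 ]; then
            # the zeros were interior after all; flush them
            dd if=/dev/zero bs=512 count="$withheld" 2>/dev/null
            withheld=0
        fi
        cat "$blk"
    fi
done
# Whatever is still withheld here was trailing zero padding; drop it.

Each tar invocation would be piped through it (tar -cf - /etc | remove-zero-blocks ...), with the very last archive in the stream left unfiltered so one terminator remains at the end.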