I have 6,369 files of 256 MB each (1.63 TB total) stored in a RAM disk volume on a Linux server equipped with 4 TB of RAM. I need to merge them into a single file stored in the same RAM disk. What kind of merge operation would give me the best performance? If more RAM is needed, I can store the original parts on a 1.9 TB NVMe drive. The server has 128 cores.
Notes:
- Files are already compressed
- We have no constraints on available RAM or NVMe capacity
Given that the files are named so that they sort in the correct order (such as continuous zero-padded numbering or a formatted date), cat should do the trick from the shell prompt. You may want to make sure that glob sorting is not an issue in your particular shell: https://unix.stackexchange.com/questions/368318/does-the-bash-star-wildcard-always-produce-an-ascending-sorted-list
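A minimal sketch, assuming the parts carry zero-padded sequence numbers such as part_0001 through part_6369 (the part_* pattern and the merged.bin output name are placeholders; adjust both to your actual naming scheme):

    # Sanity-check that the glob expands in the intended order
    printf '%s\n' part_* | head -n 3
    printf '%s\n' part_* | tail -n 3

    # Concatenate all parts into a single file on the same RAM disk
    cat part_* > merged.bin

With both source and destination on a RAM disk this is effectively a memory-to-memory copy, and cat processes the parts strictly in argument order, which is exactly why the glob's sort order matters.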
Other than that, the number of input files should not be relevant when they are passed directly on the command line (instead of through scripting), but you should still check against your setup beforehand: https://unix.stackexchange.com/questions/356386/is-there-a-maximum-to-bash-file-name-expansion-globbing-and-if-so-what-is-it
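If you want to confirm there is headroom under the kernel's argument-length limit, a quick check (again assuming the hypothetical part_* naming):

    # Maximum combined size of argv + environment, in bytes
    getconf ARG_MAX

    # Approximate size the expanded file list will occupy in argv
    printf '%s\0' part_* | wc -c

At 6,369 names of typical length, the expanded list is on the order of 100 KB, well below the usual ARG_MAX of around 2 MB on Linux.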