I'm using a pipeline that includes sort to merge multiple large text files and remove duplicates.
I don't have root permissions, but the box isn't configured to restrict non-root privileges any further than a default Debian jessie install.
The box has 32GB RAM and 16GB are in use.
Regardless of how I call sort (GNU sort 8.13), it fills up all the remaining RAM and crashes with "out of memory".
It really fills up all the memory before crashing. I followed the process in top.
I tried to explicitly set the maximum memory usage with the -S parameter, using values ranging from 80% down to 10% and from 8G down to 500M.
The whole pipe looks similar to:
cat * | tr -cd '[:print:]' | sort {various params tested here} -T /other/tmp/path/ | uniq > ../output.txt
Always the same behavior.
Does anyone know what could cause such an issue?
And, of course, how to solve it?
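I found the issue myself. It's fairly simple.
The "tr -cd '[:print:]'" step deletes line breaks, because the newline character is not in the [:print:] class, and sort reads its input line by line.
So sort tries to read all the files as one single line, which it has to hold in memory in full, and the -S parameter can't do its job.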
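
One way to fix this is to add \n to the set of characters that tr keeps, so the line breaks survive the filter. The snippet below is only a small sketch on made-up sample data (the strings and variable names are mine, not from the real files). It also uses sort -u as a stand-in for "sort | uniq", which should give the same result here.

```shell
# Made-up sample input: three lines, one duplicate, one non-printable byte (\001).
sample=$(printf 'banana\napple\001\nbanana\n')

# Original filter: newline is not in [:print:], so all line breaks are deleted
# and sort would receive everything as one single line.
broken_lines=$(printf '%s\n' "$sample" | tr -cd '[:print:]' | wc -l)

# Adjusted filter: adding \n to the set keeps the line breaks,
# so sort sees separate lines again.
fixed=$(printf '%s\n' "$sample" | tr -cd '[:print:]\n' | sort -u)

echo "newlines left by the original filter: $broken_lines"
printf '%s\n' "$fixed"

# Applied to the pipeline from the question, this would look roughly like
# (the -S value is just one of the sizes tried above):
#   cat * | tr -cd '[:print:]\n' | sort -S 2G -T /other/tmp/path/ | uniq > ../output.txt
```

With the original filter no newline is left in the sample, while the adjusted filter yields two sorted, de-duplicated lines. Once sort sees real lines again, it can respect -S and spill to the -T directory as intended.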