I have a folder containing a lot of images. I have a script that converts these images to black and white and then uses tesseract to convert them into text files. I have been using the following code to split these files into subgroups:
i=0; for f in *; do d=dir_$(printf %03d $((i/(number of files in each folder)+1))); mkdir -p "$d"; mv "$f" "$d"; let i++; done
This command works well for splitting up the files (it puts each group into its own folder), but because I plan to use this procedure on a very large number of files, I would like to make it less time consuming: moving the files into folders takes too long. Is there a way to specify a subgroup of files to run a process on, and use & to run multiple instances at once? For example, I would like to run a process on the first 400 files in a folder and then use "&" to run the same process on files 401-800.
Here is the code that I am using for the conversion:
parallel -j 5 convert {} "-resample 200 -colorspace Gray" {.}BW.png ::: *.png ; parallel -j 5 tesseract {} {} -l tla -psm 6 ::: *BW.png ; rm *BW.png
By group I simply mean the first 400 files; the second group would be the following 400 files, and so on.
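To make the idea concrete, here is a minimal bash sketch of that kind of grouping: the file names are sliced into consecutive groups in an array, each group is handed to a background job with &, and the files stay where they are. The names run_in_groups and process_group are hypothetical, and process_group only prints a summary here; it stands in for whatever should really run on each group.

```bash
#!/usr/bin/env bash
# Sketch only: one background job per group of files, leaving the files in place.

# Hypothetical placeholder for the real work on one group.
# "$@" is the list of file names in that group.
process_group() {
    printf 'group of %d files starting at %s\n' "$#" "$1"
}

# run_in_groups SIZE FILE...
run_in_groups() {
    local size=$1
    shift
    local files=("$@")
    local start
    for ((start = 0; start < ${#files[@]}; start += size)); do
        # Slice out up to SIZE names and hand them to a background job.
        process_group "${files[@]:start:size}" &
    done
    wait  # return only after every group has finished
}
```

It would be called as run_in_groups 400 *.png. Because run_in_groups is a shell function rather than an external program, the expanded glob is not subject to the kernel's argument-size limit. Note that this starts every group at once, so with a very large directory the number of simultaneous jobs would need to be capped in some way.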
So my whole ordeal was with trying to use my code on a directory with a lot of files. In order to get rid of the error stating that there are too many arguments, I used this code that I gathered from previous Ole Tange posts:
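For anyone hitting the same error, the general pattern is to send the file names to parallel on standard input instead of listing them after ":::". The sketch below illustrates that pattern and is not necessarily the exact command from those posts. It assumes GNU parallel (for -0) and reuses the convert and tesseract options from my command above; list_files and convert_all are hypothetical names.

```bash
#!/usr/bin/env bash
# Sketch only: feed file names on stdin so no external command
# ever receives the whole expanded glob as arguments.

# list_files SUFFIX: print every name ending in SUFFIX, NUL-delimited.
# printf is a shell builtin, so the expanded glob never goes through
# an exec call and the "too many arguments" limit does not apply.
list_files() {
    local suffix=$1
    printf '%s\0' *"$suffix"
}

# Assumes GNU parallel, ImageMagick convert and tesseract are installed.
convert_all() {
    list_files .png   | parallel -0 -j 5 convert {} -resample 200 -colorspace Gray {.}BW.png
    list_files BW.png | parallel -0 -j 5 tesseract {} {} -l tla -psm 6
    list_files BW.png | xargs -0 rm --
}
```

Running convert_all inside the image directory would do the conversion, the OCR, and the cleanup. The NUL delimiter keeps file names with spaces or newlines intact.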
Thanks to everyone that contributed.