I am trying to calculate the FID between 2 datasets, using pytorch-fid. The GitHub link for the code is: https://github.com/mseitzer/pytorch-fid
Attempt 1: I started off small, with 2800 images, all 512×512, in each dataset. Unfortunately my generated images were PNGs, and the real images were JPGs. So I cd-ed into my generated image folder, ran the following bash script (which uses ImageMagick), and converted all the PNGs into JPGs:
# Convert every PNG in the current directory to a JPG with ImageMagick
for file in *.png; do
    convert "$file" "$(basename "$file" .png).jpg"
done
With these, I managed to get a FID of 143.6, which I acknowledge is pretty bad.
Attempt 2: I thought increasing the number of images might lower the FID, so I tried again with 5000 images in each dataset. The bash script above was taking a very long time to run, so I switched to the Python script below:
import os

directory = "<PATH TO MY DIRECTORY>"
files = os.listdir(directory)

# Rename the files
for file_name in files:
    # Build the full path of the file
    old_name = os.path.join(directory, file_name)
    # Swap the extension -- note this only renames the file,
    # it does not re-encode the image data
    new_name = old_name.replace('.png', '.jpg')
    os.rename(old_name, new_name)
I later realized this only changes the file extension to '.jpg', and not the file format. This obviously did not work, and I ended up getting the following error:
RuntimeError: Trying to resize storage that is not resizable
Attempt 3: I went back to my generated image folder, reversed the code from Attempt 2 (renamed all the JPGs back into PNGs), and ran the script below, which uses Pillow to actually change the file format. Keep in mind, both datasets now have 5000 512×512 JPG images.
from PIL import Image
import os

path = "<PATH TO MY DIRECTORY>"
files = os.listdir(path)

for file in files:
    if file.endswith(".png"):
        img = Image.open(os.path.join(path, file))
        file_name, file_ext = os.path.splitext(file)
        # JPEG has no alpha channel, so convert to RGB before saving;
        # saving an RGBA PNG as JPEG raises an error otherwise
        img = img.convert('RGB')
        img.save('<PATH TO OUTPUT DIRECTORY>{}.jpg'.format(file_name))
But I am still getting the runtime error from Attempt 2. What could be wrong? Is there an issue with the PIL JPG conversion? Should I revert to ImageMagick? Please help!
Edit: I made a silly mistake. I overlooked the fact that a couple of my real images weren't 512×512. I am now able to get a FID of 131.02. That's still pretty bad, but I will continue trying to improve it.
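In case it helps anyone else, a quick Pillow sketch (not my exact code) for flagging images that are not 512×512:

from PIL import Image
import os

path = "<PATH TO MY DIRECTORY>"
for file in os.listdir(path):
    if file.endswith(".jpg"):
        with Image.open(os.path.join(path, file)) as img:
            # Flag anything that is not the expected size
            if img.size != (512, 512):
                print(file, img.size)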
Also, thanks for suggesting multiprocessing. I used it and my conversion process was way faster than before.
If you are processing 5,000 images, you don't want to pay the start-up time for 5,000 convert processes. You would be better to use a single mogrify process, and then you only pay one process start-up cost.
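Something like this (untested; note that mogrify -format writes a new .jpg beside each .png rather than overwriting the originals, so run it inside your generated-image folder):

mogrify -format jpg *.png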
However, that only uses a single CPU core, and will probably overflow your command-line length, I mean the ARG_MAX parameter. So you will do better to use GNU Parallel, which will use all your cores in parallel. Depending on your CPU, your RAM and your image sizes, you will likely want a command like this.
That runs find to identify all the PNG files and sends the list to GNU Parallel with null-termination so spaces do not upset you. Then GNU Parallel will run as many mogrify processes as you have CPU cores and pass each process 64 files to convert, which means you will only pay 1 process start-up for every 64 files. It will keep starting new jobs, each processing 64 images, as previous jobs exit, thereby keeping all your CPU cores busy. You may need to tweak the numbers, and maybe add --eta or --bar to get a progress bar so you can watch them "whoosh" past. If you cannot install GNU Parallel, at least start each of your convert processes in the background, and then add a wait after every 4 jobs or so. Something very approximately like this (untested):
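i=0
for file in *.png; do
    # Launch each conversion in the background
    convert "$file" "${file%.png}.jpg" &
    i=$((i+1))
    # Pause after every 4 background jobs so we don't swamp the machine
    if [ $((i % 4)) -eq 0 ]; then
        wait
    fi
done
wait   # let the final batch finish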
Likewise with your PIL/Pillow code, you should consider multiprocessing, along the lines of the sketch below.
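A rough sketch using multiprocessing.Pool, with the same placeholder paths as your script (Pool() starts one worker per CPU core by default):

from multiprocessing import Pool
from PIL import Image
import os

path = "<PATH TO MY DIRECTORY>"
out = "<PATH TO OUTPUT DIRECTORY>"

def convert_one(file):
    # Open the PNG, drop any alpha channel, and save it as a JPG
    file_name, _ = os.path.splitext(file)
    with Image.open(os.path.join(path, file)) as img:
        img.convert("RGB").save(os.path.join(out, file_name + ".jpg"))

if __name__ == "__main__":
    pngs = [f for f in os.listdir(path) if f.endswith(".png")]
    with Pool() as pool:
        pool.map(convert_one, pngs)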
As regards the poor results, I suspect JPEG's "chroma subsampling" may have affected your results. You could either turn off chroma subsampling, or try converting with -quality 90.
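For example (untested; -sampling-factor 4:4:4 is how ImageMagick disables chroma subsampling, and subsampling=0 is the Pillow equivalent):

convert input.png -quality 90 -sampling-factor 4:4:4 output.jpg

or, in the Pillow script:

img.save('output.jpg', quality=90, subsampling=0)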