How to ignore corrupted files?

How to loop through a directory in Python and open wave files that are good whilst ignoring bad (corrupted) ones?

I want to open various wave files from a directory. However, some of these files may be corrupted, and some may not conform to the specification. In particular, the directory will contain files that raise the following error when opened:

wave.Error: file does not start with RIFF id

I want to ignore those files. I want to catch the error and continue with the loop. How can this be done?

My code:

for file_path in files:
            sig=0
            file = str(file_path)
            sig, wave_params = DataGenerator.open_wave(file)
            if sig == 0:
                print(
                    'WARNING: Could not open wave file during data creation: ' + file)
                continue
            if wave_params[0] != 1:
                print("WARNING: Wrong NUMBER OF CHANNELS in " + file)
                txt.write(
                    "WARNING: Wrong NUMBER OF CHANNELS in " + file + "\n")
                continue
            if wave_params[1] != 2:
                print("WARNING: Wrong SAMPLE WIDTH in " + file)
                txt.write("WARNING: Wrong SAMPLE WIDTH in " + file + "\n")
                continue
            if wave_params[2] != RATE:
                print("WARNING: Wrong FRAME RATE in " + file)
                txt.write("WARNING: Wrong FRAME RATE in " + file + "\n")
                continue
            if wave_params[3] != SAMPLES:
                print("WARNING: Wrong NUMBER OF SAMPLES in " + file)
                txt.write(
                    "WARNING: Wrong NUMBER OF SAMPLES in " + file + "\n")
                continue
            if wave_params[4] != 'NONE':
                print("WARNING: Wrong comptype: " + file)
                txt.write("WARNING: Wrong comptype: " + file + "\n")
                continue
            if wave_params[5] != 'not compressed':
                print("WARNING: File appears to be compressed " + file)
                txt.write(
                    "WARNING: File appears to be compressed " + file + "\n")
                continue
            if bit_depth != (wave_params[2] * (2**4) * wave_params[1]):
                print("WARNING: Wrong bit depth in " + file)
                txt.write("WARNING: Wrong bit depth in " + file + "\n")
                continue
            if isinstance(sig, int):
                print("WARNING: No signal in " + file)
                txt.write("WARNING: No signal in " + file + "\n")
                continue

My code for opening the wave file:

import wave
from pathlib import Path

import numpy as np


def open_wave(sound_file):
    """
    Open wave file
    Links:
         https://stackoverflow.com/questions/16778878/python-write-a-wav-file-into-numpy-float-array
         https://stackoverflow.com/questions/2060628/reading-wav-files-in-python
    """
    if Path(sound_file).is_file():
        sig = 0
        with wave.open(sound_file, 'rb') as f:
            n_channels = f.getnchannels()
            samp_width = f.getsampwidth()
            frame_rate = f.getframerate()
            num_frames = f.getnframes()
            wav_params = f.getparams()
            snd = f.readframes(num_frames)
        # Interpret the raw bytes as 16-bit samples, then convert to float
        audio_as_np_int16 = np.frombuffer(snd, dtype=np.int16)
        sig = audio_as_np_int16.astype(np.float32)
        return sig, wav_params
    else:
        print('ERROR: File ' + sound_file + ' does not exist. BAD.')
        print("Problem with opening wave file")
        exit(1)

The lines that would scale the output of the wave file correctly are omitted on purpose.

I am interested in how to catch the error mentioned above. A tip on how to open wave files defensively would be nice, too. That is, how can I simply ignore wave files that throw errors?

There are 2 answers

ti7 On BEST ANSWER

Just wrap the function call in a try/except block:

for file_path in files:
    sig=0
    file = str(file_path)
    try:  # attempt to use `open_wave`
        sig, wave_params = DataGenerator.open_wave(file)
    except wave.Error as ex:
        print(f"caught Exception reading '{file}': {repr(ex)}")
        continue  # next file_path
    # opportunity to catch other or more generic Exceptions
    ...  # rest of loop
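
For the "open defensively" part of the question, here is a minimal sketch of the same idea moved into a helper. The names `load_wave_or_none` and `load_good_waves` are made up for illustration and are not part of the question's `DataGenerator`. It assumes that besides `wave.Error` you also want to skip files that raise `EOFError` (empty or truncated files) or `OSError` (unreadable files); drop any of these if you would rather have them stop the program. It returns the raw frame bytes, so the NumPy conversion from the question would still happen afterwards.

```python
import wave
from pathlib import Path


def load_wave_or_none(path):
    """Return (params, frames) for a readable WAV file, or None if it is bad."""
    try:
        with wave.open(str(path), "rb") as f:
            params = f.getparams()
            frames = f.readframes(params.nframes)
    except (wave.Error, EOFError, OSError) as ex:
        # Assumption: all three error types mean "skip this file".
        print(f"WARNING: skipping '{path}': {ex!r}")
        return None
    return params, frames


def load_good_waves(directory):
    """Map file name -> (params, frames) for every readable *.wav in directory."""
    good = {}
    for path in sorted(Path(directory).glob("*.wav")):
        result = load_wave_or_none(path)
        if result is None:
            continue  # bad file: move on to the next one
        good[path.name] = result
    return good
```

Returning `None` instead of the `sig = 0` sentinel keeps the "could not open" check in the loop unambiguous, since comparing a NumPy array to `0` inside an `if` does not behave like a plain integer comparison.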
Grimmace_23 On

You could use a try/except block (Python's equivalent of try-catch): 'try' to open the file and catch a potential exception. In the except branch you could simply use 'pass'.
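
A rough sketch of how that could look inside a loop. The function name `readable_wave_files` is hypothetical, and catching `EOFError` alongside `wave.Error` is an assumption about which failures count as "corrupted". Note that inside a loop `continue` is usually what you want in the except branch: a bare `pass` would fall through to the code after the try block for the bad file as well.

```python
import wave


def readable_wave_files(paths):
    """Return the subset of paths that wave.open accepts."""
    readable = []
    for path in paths:
        try:
            with wave.open(str(path), "rb"):
                pass  # opening succeeded; nothing else to do here
        except (wave.Error, EOFError):
            continue  # `pass` here would fall through to the append below
        readable.append(path)
    return readable
```

A missing file still raises `FileNotFoundError` here, which is deliberate in this sketch: only files that exist but cannot be parsed are skipped.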