Python: why is my O(n) slowing down as it progresses?

All I'm doing is iterating over a list of files. For some reason the progress shoots right up to about 30%, then slows to an infuriating crawl.

I've looked at the files being processed around the point where it starts slowing down. They do get somewhat larger, but not linearly or consistently. Code below:

for i in wavs:
    gc.collect()
    print_progress_bar(wavs.index(i)/len(wavs))
    shapes.append(getshape(i.src_path))

I was reading that map() is a way to speed up loops, but I feel like the loop overhead is not what's growing. Is gc.collect() helping me at all here?
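
One thing in the loop above that may be worth ruling out: `wavs.index(i)` searches the list from the start on every iteration, so the loop as a whole does quadratic rather than O(n) work, and it returns the first matching position if the list contains equal items. Separately, `gc.collect()` forces a full collection on every pass, and the cost of that depends on how many container objects are alive at the time. Whether either accounts for the slowdown seen here is an assumption that would need measuring. A minimal sketch of tracking the position with `enumerate` instead; `process_all`, `handle` and `report` are hypothetical names standing in for the loop, `getshape` and `print_progress_bar`:

```python
def process_all(items, handle, report):
    """Apply handle to each item, reporting fractional progress via report.

    handle and report are hypothetical stand-ins for getshape and
    print_progress_bar from the question.
    """
    results = []
    total = len(items)
    for position, item in enumerate(items):
        # enumerate supplies the position directly, so no list search is needed
        report(position / total)
        results.append(handle(item))
    return results


if __name__ == "__main__":
    progress = []
    lengths = process_all(["a.wav", "bb.wav", "a.wav"], len, progress.append)
    print(lengths, progress)
```

With this shape the progress value costs the same at the end of the list as at the start, and duplicate entries no longer report the position of their first occurrence.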

Here are the functions being called:

def print_progress_bar(percent):
    done_len = int(percent * progress_bar_length)
    bar = "["
    bar += "=" * done_len
    bar += ">"
    bar += " " * (progress_bar_length - done_len)
    bar += "] "
    bar += str(round(percent*100, 2))
    bar += "%"
    print("\r", end='')
    print(bar+(" " * 5), end='')
    sys.stdout.flush()

def getshape(path):
    if getsize(path) == 0:
        return (0,)

    try:
        this_wav = wave.open(path, 'r')
    except:
        temp_path = "C:\\soundfiles\\tempwav.wav"
        command = ["sox", path, "-r", "8000", "-c", "1", "-e", "unsigned", temp_path]
        convert = subprocess.Popen(command)
        convert.wait()

        this_wav = wave.open(temp_path, 'r')
    finally:
        num_frames = this_wav.getnframes()
        frames = this_wav.readframes(num_frames)
        this_wav.close()
        this_wav_array = list(map(lambda x: x, frames))

        return numpy.fft.fft(this_wav_array).shape or (0,)
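
In `getshape` above, `list(map(lambda x: x, frames))` builds a Python list with one int per byte, and the FFT is computed only so its shape can be read. For a one-dimensional input, `numpy.fft.fft` returns an array of the same length as its input, so if only the shape is needed it could be taken from the byte count directly. A rough sketch under that assumption; `frame_bytes_shape` is a hypothetical name, it treats each byte as one sample exactly as the list conversion above does, and it leaves out the sox fallback:

```python
import wave


def frame_bytes_shape(path):
    """Return the shape a byte-wise 1-D array of the wav's frames would have."""
    # The with-block closes the file even if reading the frames fails.
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.readframes(wav_file.getnframes())
    # One element per byte, matching list(map(lambda x: x, frames)).
    return (len(frames),)
```

If the sample values themselves are needed later, `numpy.frombuffer(frames, dtype=numpy.uint8)` gives the same per-byte values without building an intermediate list.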

I tried profiling, but later parts of the script overshadow what is shown here. This loop populates the shapes list for comparison later. From what I've checked manually, the wav files that hit the exception path (and need the sox conversion) are pretty evenly distributed through the list.
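
Since the later stages dominate a whole-script profile, it may help to time this loop's calls in isolation. A minimal sketch using `time.perf_counter`; `timed_calls` is a hypothetical helper, and `func` stands in for `getshape`:

```python
import time


def timed_calls(func, items):
    """Call func on each item and return (results, seconds_per_call)."""
    results = []
    durations = []
    for item in items:
        start = time.perf_counter()
        results.append(func(item))
        # Elapsed wall-clock time for this single call only
        durations.append(time.perf_counter() - start)
    return results, durations


if __name__ == "__main__":
    values, seconds = timed_calls(len, ["a", "bb", "ccc"])
    print(values, seconds)
```

If the per-call durations stay roughly flat while the loop as a whole keeps slowing down, the growth is more likely in the rest of the loop body than in the function being timed.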

This is not the 'meat' of the script, but it is getting annoying. Any ideas?
