I am trying to enhance an audio file (3:16 minutes in length, available here) using SpeechBrain. If I run the code below (from this tutorial), I get the following error:
OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 39.59 GiB total capacity; 33.60 GiB already allocated; 3.19 MiB free; 38.06 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF.
What is the recommended way to fix the issue? Should I just cut the audio file into pieces?
from speechbrain.pretrained import SepformerSeparation as separator
import torchaudio
model = separator.from_hparams(
    source="speechbrain/sepformer-wham-enhancement",
    savedir="pretrained_models/sepformer-wham-enhancement",
    run_opts={"device": "cuda"},
)
est_sources = model.separate_file(path=audio_file)
torchaudio.save("enhanced_wham.wav", est_sources[:, :, 0].detach().cpu(), 8000)
Yes. I tried to process a 1:30-minute audio file and got the same CUDA out-of-memory error in Google Colab. When I split the audio into pieces, processing completed in a couple of seconds.
Attaching my code (not production-ready):
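A minimal sketch of the splitting approach described above, not the exact code referred to. It assumes the model is the SepformerSeparation object from the question and that its separate_batch method accepts a [batch, time] tensor and returns [batch, time, n_sources]. The helper names chunk_bounds and enhance_in_chunks, the 10-second chunk length, and the file paths are hypothetical choices for illustration. Inference is wrapped in torch.no_grad() so that no autograd graph is kept between chunks, which should also reduce memory use.

```python
def chunk_bounds(num_samples, chunk_samples):
    """Return (start, end) sample index pairs that cover [0, num_samples)."""
    if chunk_samples <= 0:
        raise ValueError("chunk_samples must be positive")
    return [
        (start, min(start + chunk_samples, num_samples))
        for start in range(0, num_samples, chunk_samples)
    ]


def enhance_in_chunks(model, audio_path, out_path,
                      chunk_seconds=10, sample_rate=8000):
    """Enhance a long file piece by piece (hypothetical helper)."""
    # Imported here so chunk_bounds can be used without torch installed.
    import torch
    import torchaudio

    waveform, sr = torchaudio.load(audio_path)      # [channels, time]
    waveform = waveform.mean(dim=0, keepdim=True)   # downmix to mono: [1, time]
    if sr != sample_rate:
        # The model in the question works on 8 kHz audio.
        waveform = torchaudio.functional.resample(waveform, sr, sample_rate)

    pieces = []
    with torch.no_grad():  # inference only, do not build an autograd graph
        bounds = chunk_bounds(waveform.shape[1], chunk_seconds * sample_rate)
        for start, end in bounds:
            chunk = waveform[:, start:end]
            # Assumption: separate_batch moves the input to the model's device.
            est = model.separate_batch(chunk)       # [1, time, n_sources]
            pieces.append(est[:, :, 0].cpu())       # keep the enhanced source

    enhanced = torch.cat(pieces, dim=1)             # [1, total_time]
    torchaudio.save(out_path, enhanced, sample_rate)
```

With the model loaded as in the question, this could be called as enhance_in_chunks(model, audio_file, "enhanced_wham.wav"). Hard cuts between chunks may be audible at the boundaries, and a very short final chunk may need padding or merging with the previous one; overlapping the chunks and cross-fading is a possible refinement that this sketch does not attempt.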