How to find the optimal batch size for ASR model inference on GPU


I trained a Whisper ASR model (763,857,920 parameters in total). My GPU has 8 GB of memory, and the model alone takes 4 GB of it. New data arrives at around 200k items per hour, so I need transcription to run as fast as possible, which means working out the maximum batch size that still fits in the remaining GPU memory.
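Something like the doubling probe below is what I have in mind for finding the largest batch that fits (just a sketch: `make_batch` is a hypothetical helper that builds a batch of input features, not real code from my pipeline, and catching `torch.cuda.OutOfMemoryError` needs PyTorch 1.13+):

```python
import torch

def find_max_batch_size(model, make_batch, limit=256):
    """Double the batch size until inference hits CUDA OOM, return the last size that fit."""
    best, n = 0, 1
    while n <= limit:
        try:
            with torch.inference_mode():   # inference only, no gradient buffers
                model(make_batch(n))
            best = n                       # this size fits, try double
            n *= 2
        except torch.cuda.OutOfMemoryError:
            torch.cuda.empty_cache()       # release the failed allocation
            break
    return best
```

Since the probe doubles, the true maximum may lie between `best` and `2 * best`; a binary search over that range would tighten it.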

I tried running a batch of 10 audio inputs and got a CUDA out-of-memory error. The data looked like this:

Chunk size       1.38 MB
Chunk length     9
GPU free         3777.94 MB
GPU utilization  1700.00 %
GPU memory       200.00 %
Error message    Tried to allocate 20.00 MiB
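For reference, the GPU-free figure is the kind of value `torch.cuda.mem_get_info` reports in PyTorch; a minimal sketch of such a check (not my exact logging code):

```python
import torch

# mem_get_info returns (free, total) in bytes for the current CUDA device
free_b, total_b = torch.cuda.mem_get_info()
print(f"GPU free: {free_b / 2**20:.2f} MB / total: {total_b / 2**20:.2f} MB")
```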