Reshaping a wav file to 8000 Hz


I am developing an ML model. I read a sample_wav.wav file and pass its data to the predict function to get a prediction.

python_model.py

import librosa
import numpy as np

# Python and TensorFlow ML model
[...]

# sample_wav.wav sample rate : 44100 Hz
path_to_sample_wav='../sample-wavs/sample_wav.wav'

samples, sample_rate = librosa.load(path_to_sample_wav, sr = 44100)
print(sample_rate)
print(type(samples))
print(samples.shape)

print('resampling process:-------------- ')
samples = librosa.resample(samples, orig_sr=sample_rate, target_sr=8000)
print(samples.shape)

Actual output:

44100
<class 'numpy.ndarray'>
(52992,)
resampling process:-------------- 
(9614,)

Expected output:

  • There is a training_record_01.wav file. It works with the predict(audio) function without any issues.
    Sample rate: 16000 Hz
    samples.shape after resampling: (8000,)
16000
<class 'numpy.ndarray'>
(16000,)
resampling process:-------------- 
(8000,)
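
The two outputs differ because the files differ in duration, not only in sample rate. Here is a small sketch of the arithmetic, assuming the resampler rounds the output length up (which matches the 9614 shown above). The helper name resampled_length is made up for illustration and is not part of librosa.

```python
def resampled_length(n_samples: int, orig_sr: int, target_sr: int) -> int:
    """Number of samples after resampling, assuming the length is rounded up."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-n_samples * target_sr // orig_sr)


# sample_wav.wav: 52992 samples at 44100 Hz, roughly 1.2 s of audio
print(resampled_length(52992, 44100, 8000))

# training_record_01.wav: 16000 samples at 16000 Hz, exactly 1 s of audio
print(resampled_length(16000, 16000, 8000))
```

So the model input of 8000 samples corresponds to exactly one second of audio at 8000 Hz, and sample_wav.wav is about 0.2 s longer than that.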

predict function

def predict(audio):
    prob=model.predict(audio.reshape(1,8000,1))
    index=np.argmax(prob[0])
    return classes[index]

How can I get the samples from sample_wav.wav into the expected shape, (8000,), if that is possible? I want to call predict(audio) with this audio data and get a prediction.

Currently, I get the following error when calling the predict function:

print("Text:",predict(samples))

# issue

ValueError                                Traceback (most recent call last)
<ipython-input-198-5637a981cc7f> in <module>
----> 1 print("Text:",predict(samples))

<ipython-input-147-213bc78e946a> in predict(audio)
      1 def predict(audio):
----> 2     prob=model.predict(audio.reshape(1,8000,1))
      3     index=np.argmax(prob[0])
      4     return classes[index]

ValueError: cannot reshape array of size 9614 into shape (1,8000,1)
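
One possible way around the error is to force the resampled array to exactly 8000 samples before reshaping, either by cutting off the extra samples or by padding with zeros. The following is a minimal sketch using only NumPy. The helper name fit_to_length is hypothetical, and the zero array stands in for the real resampled audio. librosa.util.fix_length appears to offer similar behavior, but it is not used here.

```python
import numpy as np


def fit_to_length(audio: np.ndarray, target_len: int = 8000) -> np.ndarray:
    """Truncate or zero-pad a 1-D signal so it has exactly target_len samples."""
    n = audio.shape[0]
    if n >= target_len:
        # Too long: keep only the first target_len samples.
        return audio[:target_len]
    # Too short: append zeros (silence) at the end.
    return np.pad(audio, (0, target_len - n), mode="constant")


# Stand-in for the resampled sample_wav.wav data (9614 samples at 8000 Hz).
resampled = np.zeros(9614, dtype=np.float32)

fixed = fit_to_length(resampled)
print(fixed.shape)
print(fixed.reshape(1, 8000, 1).shape)
```

Note that truncating drops everything after the first second, so this only makes sense if the relevant sound falls inside that window. Whether the model then predicts well on such input is a separate question.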


FYI: librosa v0.8.1
