How to do voice activity detection with torchaudio in R


I am doing some voice activity detection using the function functional_vad from the R package torchaudio. Currently, I have the following code:

library(torch)
library(torchaudio)

# Download a sample WAV file to a temporary location
url <- "https://www2.cs.uic.edu/~i101/SoundFiles/CantinaBand3.wav"
filename <- tempfile(fileext = ".wav")
httr::GET(url, httr::write_disk(filename, overwrite = TRUE))

s <- transform_to_tensor(torchaudio_load(filename, unit = "samples"))
waveform    <- s[[1]]
sample_rate <- s[[2]]

plot(waveform[1], col = "royalblue", type = "l")

v <- functional_vad(waveform, sample_rate = sample_rate)
plot(v[1], col = "royalblue", type = "l")

Now I have two questions.

First, the documentation gives the following warning: "The effect can trim only from the front of the audio, so in order to trim from the back, the reverse effect must also be used." How exactly can I trim from the back?
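One reading of that warning is "reverse the signal, trim its front, then reverse it again". The following is only a sketch of that idea, not a confirmed torchaudio recipe: vad_both_ends is a hypothetical helper name (it is not part of torchaudio), and it assumes the waveform tensor has time as its last dimension, which is how the tensor is indexed in the code above.

```r
library(torch)
library(torchaudio)

# Hypothetical helper (not part of torchaudio): trim silence from both ends
# by running the VAD on the signal and then on its time-reversed copy.
vad_both_ends <- function(waveform, sample_rate) {
  # Assumption: time is the last dimension of the tensor.
  # torch for R uses 1-based dimension indices.
  time_dim <- length(waveform$shape)

  # Step 1: trim from the front, as functional_vad normally does.
  front_trimmed <- functional_vad(waveform, sample_rate = sample_rate)

  # Step 2: reverse in time so the back of the audio becomes the front.
  reversed <- torch_flip(front_trimmed, dims = time_dim)

  # Step 3: trim the (former) back, then restore the original direction.
  back_trimmed <- functional_vad(reversed, sample_rate = sample_rate)
  torch_flip(back_trimmed, dims = time_dim)
}
```

With the variables from the code above, this would be called as v_both <- vad_both_ends(waveform, sample_rate).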

Second, in the example code the variable v holds the result. How can I save this result as a WAV file on disk?
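For the saving part, one route that might work is to go through the tuneR package instead of torchaudio itself. This is a sketch under several assumptions: save_tensor_as_wav is a hypothetical helper name, only the first channel is written, and the samples are assumed to be floats in the range [-1, 1] (if the tensor is scaled differently, it would need rescaling first).

```r
library(torch)
library(tuneR)

# Hypothetical helper: write the first channel of a (channels, time) tensor
# to disk as a 16-bit PCM WAV file using tuneR.
save_tensor_as_wav <- function(waveform, sample_rate, path) {
  # Move the first channel out of torch into a plain numeric vector.
  samples <- as.numeric(as_array(waveform[1]$cpu()))

  # Assumption: samples are in [-1, 1]. Clamp, then scale to 16-bit integers.
  samples <- pmax(pmin(samples, 1), -1)
  pcm <- as.integer(round(samples * 32767))

  wav <- Wave(left = pcm, samp.rate = as.numeric(sample_rate), bit = 16)
  writeWave(wav, filename = path)
  invisible(path)
}
```

With the variables from the code above, the call would look like save_tensor_as_wav(v, sample_rate, "vad_result.wav"), where the output file name is just an example.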

There are 0 answers