Is there any way to do this without writing the file to disk first?

I have this code that transcribes the audio stream of a YouTube video given its link. As you might expect, it is quite slow, since I must first download the stream as an .mp4, convert it to .wav using moviepy, then record the audio and transcribe it. I want exactly the same functionality, but without downloading the stream to a file first, for example by writing the stream's data to a buffer instead.

from pytube import YouTube
import speech_recognition as sr
from moviepy.editor import AudioFileClip

# link holds the URL of the YouTube video
video = YouTube(url=link)
audio_stream = video.streams.get_by_itag(140)

recognizer = sr.Recognizer()

# Download the audio stream to disk, then convert it to WAV with moviepy
audio_stream.download(filename="mp4_output.mp4")
audio = AudioFileClip("mp4_output.mp4")
audio.write_audiofile("wav_output.wav")

# Read up to 100 seconds of the WAV file and transcribe it
with sr.AudioFile("./wav_output.wav") as audio_file:
    audio_data = recognizer.record(audio_file, duration=100)
    transcript = recognizer.recognize_sphinx(audio_data=audio_data)

I've tried the following

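import io

from pytube import YouTube
import speech_recognition as sr

video = YouTube(url=link)
audio_stream = video.streams.get_by_itag(140)

# Write the stream into an in-memory buffer instead of a file
buffer = io.BytesIO()
audio_stream.stream_to_buffer(buffer)
buffer.seek(0)  # rewind so readers start at the first byte
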
recognizer = sr.Recognizer()

with sr.AudioFile(buffer) as audio_file:
    audio_data = recognizer.record(audio_file, duration=100)
    transcript = recognizer.recognize_sphinx(audio_data=audio_data) 
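
To see what the buffer in this attempt actually holds, a quick check of its first bytes can help. The sketch below is only an illustration: `sniff_container` is a hypothetical helper, not part of pytube or speech_recognition, and it recognizes just a few well-known signatures. The stream with itag 140 is commonly AAC audio in an MP4 container, which would show up here as "mp4" rather than "wav".

```python
def sniff_container(data: bytes) -> str:
    """Guess the container format from the leading magic bytes.

    Hypothetical helper for illustration; it only knows a handful of signatures.
    """
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:4] == b"FORM" and data[8:12] in (b"AIFF", b"AIFC"):
        return "aiff"
    if data[:4] == b"fLaC":
        return "flac"
    # MP4/M4A files start with a box whose type field (bytes 4..8) is "ftyp"
    if data[4:8] == b"ftyp":
        return "mp4"
    return "unknown"
```

Calling `sniff_container(buffer.getvalue())` on the buffer filled by `stream_to_buffer` would then show whether the data is in a format that `sr.AudioFile` can read directly.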

I get the following error: "audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC; check if file is corrupted or in another format". Is there a way to convert the buffer into one of those formats?
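
One possible direction, sketched here under assumptions rather than as a confirmed fix: decode the compressed bytes to raw PCM with an external `ffmpeg` process over pipes, then wrap the PCM in a WAV header using the standard-library `wave` module, so nothing is written to disk. The names `decode_to_pcm` and `pcm_to_wav_buffer` are hypothetical helpers. The sketch assumes `ffmpeg` is installed and on PATH, and that it can read this particular stream from a non-seekable pipe, which is not guaranteed for every MP4 file. The whole stream still has to be held in memory.

```python
import io
import subprocess
import wave

SAMPLE_RATE = 16000  # assumed target rate; 16 kHz mono is a common choice for speech


def decode_to_pcm(data: bytes, sample_rate: int = SAMPLE_RATE) -> bytes:
    """Decode compressed audio bytes to mono 16-bit little-endian PCM.

    Assumes an ffmpeg executable is available on PATH.
    """
    cmd = [
        "ffmpeg", "-loglevel", "error",
        "-i", "pipe:0",            # read the compressed input from stdin
        "-vn",                     # ignore any video track
        "-ac", "1",                # downmix to mono
        "-ar", str(sample_rate),   # resample
        "-f", "s16le", "pipe:1",   # write headerless 16-bit PCM to stdout
    ]
    result = subprocess.run(cmd, input=data, capture_output=True, check=True)
    return result.stdout


def pcm_to_wav_buffer(pcm: bytes, sample_rate: int = SAMPLE_RATE) -> io.BytesIO:
    """Wrap mono 16-bit PCM samples in a WAV container held in memory."""
    wav_buffer = io.BytesIO()
    with wave.open(wav_buffer, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 2 bytes per sample = 16-bit
        writer.setframerate(sample_rate)
        writer.writeframes(pcm)
    wav_buffer.seek(0)  # rewind so the next reader starts at the header
    return wav_buffer
```

With these in place, something like `wav_buffer = pcm_to_wav_buffer(decode_to_pcm(buffer.getvalue()))` could be passed to `sr.AudioFile(wav_buffer)` in place of the original buffer. Producing raw PCM and letting `wave` write the header avoids relying on the size fields of a WAV header that was streamed through a pipe.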
