I'm developing an app and need a way to compare two voice recordings to decide whether they match. I know a speech recognizer is one way to do that, but since (I think) it has to convert the voice into a string first, it won't be suitable for languages the recognizer doesn't support. Any ideas? I'm after something like the voice tags on older phones, which simply compared the voice input with a recording made earlier during setup.
Compare voice WAV in Android, or a voice tag (voice commands) API
7.8k views Asked by rami At
There are 2 answers
c'quet
One idea is to compare the similarity of the two voices' spectrograms. Spectrogram features are fairly robust and resistant to noise, which makes them a good basis for comparing two voices. If you take this approach, you first need to extract the features of each voice and then work out how to compare the features of the two spectrograms; that second step is a pattern recognition problem.
This API, http://code.google.com/p/musicg-sound-api/, is written in Java and can be used on Android. It can produce the spectrogram of a wave file.
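As a rough illustration of the two steps (extract spectrogram features, then compare them), here is a minimal sketch in plain Java. It does not use the musicg API; the class and method names are hypothetical, the spectrogram is computed with a slow but simple per-frame DFT, and the comparison uses dynamic time warping (DTW), which is my own choice of pattern-matching technique rather than anything the library prescribes. It assumes mono PCM samples already decoded into a double array.

```java
final class SpectrogramCompare {

    /** Magnitude spectrogram: one row per frame, one column per frequency bin. */
    static double[][] spectrogram(double[] x, int frameSize, int hop) {
        int frames = x.length < frameSize ? 0 : 1 + (x.length - frameSize) / hop;
        int bins = frameSize / 2;
        double[][] spec = new double[frames][bins];
        for (int f = 0; f < frames; f++) {
            for (int k = 0; k < bins; k++) {
                double re = 0, im = 0;
                for (int n = 0; n < frameSize; n++) {
                    // Hann window reduces leakage between neighbouring bins
                    double w = 0.5 - 0.5 * Math.cos(2 * Math.PI * n / (frameSize - 1));
                    double s = x[f * hop + n] * w;
                    double ang = 2 * Math.PI * k * n / frameSize;
                    re += s * Math.cos(ang);
                    im -= s * Math.sin(ang);
                }
                spec[f][k] = Math.hypot(re, im);
            }
        }
        return spec;
    }

    /** Distance between two frames: 1 - cosine similarity (0 means same spectral shape). */
    static double frameDistance(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        if (na == 0 || nb == 0) return 1.0; // a silent frame matches nothing
        return 1.0 - dot / Math.sqrt(na * nb);
    }

    /** DTW cost between two spectrograms, divided by (rows of a + rows of b). */
    static double dtwDistance(double[][] a, double[][] b) {
        int n = a.length, m = b.length;
        if (n == 0 || m == 0) return Double.POSITIVE_INFINITY;
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) java.util.Arrays.fill(row, Double.POSITIVE_INFINITY);
        cost[0][0] = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double d = frameDistance(a[i - 1], b[j - 1]);
                // Allow a frame to be stretched, skipped over, or matched one-to-one
                double best = Math.min(cost[i - 1][j - 1], Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = d + best;
            }
        }
        return cost[n][m] / (n + m);
    }
}
```

A lower DTW distance means the two recordings are more alike. Any accept/reject threshold is an assumption that would have to be tuned on real recordings, and real voice features (for example mel-scaled bands) would likely work better than raw bins.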
A relatively simple way to do this is to use an FFT (Fast Fourier Transform) to convert the time-domain data of each WAV file into frequency-domain data, in which each value of the transformed array represents the relative magnitude (intensity) of a particular frequency band.
If the same person speaks the same word twice, the time-domain data in the two WAV files will still be very different numerically. Converting both files to the frequency domain (using the same FFT window size for both, even if the two files are of slightly different lengths) will produce frequency arrays that are much more similar to each other than the original waveforms were.
Unfortunately, I haven't been able to find any FFT libraries specifically for Android. Here's a question that references some Java-based libraries:
Signal processing library in Java?
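To make the FFT idea concrete without depending on any particular library, here is a minimal sketch that includes its own small radix-2 FFT. The class and method names are hypothetical, the samples are assumed to be mono PCM already decoded into double arrays, and cosine similarity is just one reasonable way to score how alike the two frequency arrays are.

```java
final class SpectrumCompare {

    /** In-place iterative radix-2 FFT; the array length must be a power of two. */
    static void fft(double[] re, double[] im) {
        int n = re.length;
        // Bit-reversal permutation
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) j ^= bit;
            j ^= bit;
            if (i < j) {
                double t = re[i]; re[i] = re[j]; re[j] = t;
                t = im[i]; im[i] = im[j]; im[j] = t;
            }
        }
        // Butterfly passes over blocks of size 2, 4, 8, ...
        for (int len = 2; len <= n; len <<= 1) {
            double ang = -2 * Math.PI / len;
            double wRe = Math.cos(ang), wIm = Math.sin(ang);
            for (int i = 0; i < n; i += len) {
                double curRe = 1, curIm = 0;
                for (int k = 0; k < len / 2; k++) {
                    int lo = i + k, hi = i + k + len / 2;
                    double xRe = re[hi] * curRe - im[hi] * curIm;
                    double xIm = re[hi] * curIm + im[hi] * curRe;
                    re[hi] = re[lo] - xRe; im[hi] = im[lo] - xIm;
                    re[lo] += xRe; im[lo] += xIm;
                    double nextRe = curRe * wRe - curIm * wIm;
                    curIm = curRe * wIm + curIm * wRe;
                    curRe = nextRe;
                }
            }
        }
    }

    /** Magnitudes of the first fftSize/2 bins; input is truncated or zero-padded to fftSize. */
    static double[] magnitudeSpectrum(double[] samples, int fftSize) {
        double[] re = new double[fftSize];
        double[] im = new double[fftSize];
        System.arraycopy(samples, 0, re, 0, Math.min(samples.length, fftSize));
        fft(re, im);
        double[] mag = new double[fftSize / 2];
        for (int k = 0; k < mag.length; k++) mag[k] = Math.hypot(re[k], im[k]);
        return mag;
    }

    /** Cosine similarity of two equal-length arrays; 1.0 means identical shape. */
    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return (na == 0 || nb == 0) ? 0 : dot / Math.sqrt(na * nb);
    }

    /** Same FFT size for both recordings, as the answer recommends. */
    static double similarity(double[] a, double[] b, int fftSize) {
        return cosineSimilarity(magnitudeSpectrum(a, fftSize), magnitudeSpectrum(b, fftSize));
    }
}
```

Calling `SpectrumCompare.similarity(first, second, fftSize)` with a power-of-two `fftSize` gives a score close to 1 for spectra of similar shape. A single FFT over the whole recording discards timing, so this is only a first approximation, and the cut-off score for a "match" would need to be found by experiment.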