I need a good pitch-shift solution for my project to change a voice. There are a lot of pitch-shift JS libraries around - I've tried them all, but they don't give the desired result. The main problem is that there is no control over the resulting voice's timbre, so I get Mickey Mouse or hell-zombie sounding stuff instead of realistic voices. Meanwhile, the result here is outstanding if you test it with Vega's voice: http://www.sonicapi.com/docs/live-task-demo?task=process-elastiqueTune#demo_form Unfortunately I'm a total zero at audio processing and would like to know at least how it's done: what type of shifting algorithm is used there, and how can we achieve timbre/formant control over the process? Any hints highly appreciated. Thanks ;)
This question touches a very broad subject. Here are a few pointers.
Pitch can, in general, be shifted by offsetting the frequencies that make up the voice material. An easy version of this is resampling in the time domain, where in essence the recording is played back at a different speed. This naturally changes the tempo as well, which is often not desirable.
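As a minimal sketch of that time-domain resampling, here is what it looks like with the Web Audio API (which the question is tagged with); the file URL and semitone amount are just placeholders:

```js
// Hypothetical sketch: naive pitch shift by changing playback rate
// with the Web Audio API. Faster playback raises the pitch but also
// shortens the clip, which is the tempo side effect mentioned above.
// The file URL and semitone amount are placeholders.
const ctx = new AudioContext();

async function playShifted(url, semitones) {
  const response = await fetch(url);
  const audioBuffer = await ctx.decodeAudioData(await response.arrayBuffer());

  const source = ctx.createBufferSource();
  source.buffer = audioBuffer;
  // 2^(n/12) converts a semitone offset into a playback-rate ratio.
  source.playbackRate.value = Math.pow(2, semitones / 12);
  source.connect(ctx.destination);
  source.start();
}

playShifted('voice-recording.wav', 4); // up a major third, but ~26% faster
```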
To preserve the tempo, you need to "explode" the material into its components, in other words, change domains from the time domain to the frequency domain. This is what the Fourier transform is for. Once done, you have, for each analysis frame, an estimate of the set of frequencies present (and their respective phases, if this is done properly in the complex plane).
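To make the domain change concrete, here is a plain (and slow) DFT of one frame of samples. Real code would use an FFT for speed, but the output is the same idea: a magnitude and a phase per frequency bin.

```js
// Plain (slow) DFT of one windowed frame: time domain in, a magnitude and
// a phase per frequency bin out. Real code would use an FFT for speed,
// but the domain change is the same idea. Assumes an even frame length.
function dft(frame) {
  const N = frame.length;
  const magnitudes = new Float32Array(N / 2);
  const phases = new Float32Array(N / 2);

  for (let k = 0; k < N / 2; k++) {        // each output frequency bin
    let re = 0;
    let im = 0;
    for (let n = 0; n < N; n++) {          // correlate with a complex sinusoid
      const angle = (-2 * Math.PI * k * n) / N;
      re += frame[n] * Math.cos(angle);
      im += frame[n] * Math.sin(angle);
    }
    magnitudes[k] = Math.sqrt(re * re + im * im);
    phases[k] = Math.atan2(im, re);        // keeping phase matters for resynthesis
  }
  return { magnitudes, phases };
}
```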
The perceived timbre of the voice depends on the relative amplitudes of that frequency set, the overtones. Overtones are shaped by the speaker's vocal tract and are heard by the listener together with the fundamental frequency. (The "Mickey Mouse" effect you describe comes from shifting the overtones, and thus the formants, right along with the fundamental.) You can control the timbre with filters in the time domain, the spectral (frequency) domain, or the cepstral domain; this kind of signal processing is the subject of a library section full of books.
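As a hedged illustration of one common formant-preservation trick (not necessarily what sonicAPI's elastiqueTune does): estimate the spectral envelope, flatten the spectrum with it, move the harmonic fine structure, then re-apply the original envelope so the formants stay put. The moving-average envelope below is a crude stand-in for proper cepstral or LPC envelope estimation.

```js
// Hypothetical sketch of formant preservation: estimate the spectral
// envelope (crude moving-average smoothing here, standing in for proper
// cepstral or LPC analysis), flatten the magnitudes with it, move the
// harmonic fine structure, then re-apply the ORIGINAL envelope so the
// formants, and hence the timbre, stay roughly in place.
function smoothEnvelope(magnitudes, radius = 8) {
  const env = new Float32Array(magnitudes.length);
  for (let k = 0; k < magnitudes.length; k++) {
    let sum = 0;
    let count = 0;
    const lo = Math.max(0, k - radius);
    const hi = Math.min(magnitudes.length - 1, k + radius);
    for (let j = lo; j <= hi; j++) {
      sum += magnitudes[j];
      count++;
    }
    env[k] = Math.max(sum / count, 1e-9);  // avoid dividing by zero later
  }
  return env;
}

function shiftMagnitudesPreservingFormants(magnitudes, ratio) {
  const env = smoothEnvelope(magnitudes);
  const out = new Float32Array(magnitudes.length);
  for (let k = 0; k < out.length; k++) {
    const src = Math.round(k / ratio);     // which original bin lands here
    if (src >= 0 && src < magnitudes.length) {
      // Shift the flattened harmonic structure, keep the local envelope.
      out[k] = (magnitudes[src] / env[src]) * env[k];
    }
  }
  return out;
}
```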
You can move back from the spectral (frequency) domain to the time domain using the inverse Fourier transform.
To sum up the naive approach: transform successive frames of samples from the time domain to the spectral domain, resample along the time axis (a time stretch), do the inverse Fourier transform to get back to the time domain, and finally resample the stretched audio in the time domain (as in the first step above) so it keeps the original duration at the new pitch.
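Below is a rough end-to-end sketch of that naive spectral route, reusing dft() from the sketch above. For simplicity it shifts each frame's frequency bins directly instead of time-stretching and resampling, and it skips the phase-coherence bookkeeping a real phase vocoder does, which is exactly where the "zombie" artifacts come from. Frame and hop sizes are only illustrative.

```js
// Rough end-to-end sketch of the naive spectral approach, reusing dft()
// from the sketch above. For simplicity it shifts each frame's bins
// directly (rather than time-stretching and resampling) and skips the
// phase-coherence bookkeeping of a real phase vocoder, so audible
// artifacts are expected. Frame and hop sizes are only illustrative.
function idft(magnitudes, phases, N) {
  const frame = new Float32Array(N);
  for (let n = 0; n < N; n++) {
    let sample = 0;
    for (let k = 0; k < N / 2; k++) {
      sample += magnitudes[k] * Math.cos((2 * Math.PI * k * n) / N + phases[k]);
    }
    frame[n] = (2 * sample) / N;           // undo the forward DFT scaling
  }
  return frame;
}

function naivePitchShift(samples, ratio, frameSize = 1024, hopSize = 256) {
  const output = new Float32Array(samples.length);
  const window = new Float32Array(frameSize);
  for (let n = 0; n < frameSize; n++) {
    window[n] = 0.5 - 0.5 * Math.cos((2 * Math.PI * n) / frameSize); // Hann
  }

  for (let pos = 0; pos + frameSize <= samples.length; pos += hopSize) {
    // Analysis: window a frame and move it to the spectral domain.
    const frame = new Float32Array(frameSize);
    for (let n = 0; n < frameSize; n++) frame[n] = samples[pos + n] * window[n];
    const { magnitudes, phases } = dft(frame);

    // Crude frequency shift: move each bin's energy to bin k * ratio.
    const shiftedMag = new Float32Array(magnitudes.length);
    const shiftedPhase = new Float32Array(phases.length);
    for (let k = 0; k < magnitudes.length; k++) {
      const target = Math.round(k * ratio);
      if (target < shiftedMag.length) {
        shiftedMag[target] += magnitudes[k];
        shiftedPhase[target] = phases[k];
      }
    }

    // Synthesis: back to the time domain, then overlap-add the frames.
    const resynth = idft(shiftedMag, shiftedPhase, frameSize);
    for (let n = 0; n < frameSize; n++) {
      output[pos + n] += resynth[n] * window[n] * (hopSize / frameSize); // rough gain compensation
    }
  }
  return output;
}
```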
Besides the Fourier transform, you could also look into wavelets. I hope this gets you started.