Real time pitch shifting from scratch using python

2.3k views Asked by At

I need help with a project which consists of 2 parts:

  1. real time pitch shifter in python (from scratch).
  2. switch the pitches of 2 voices from 2 different speakers.

I have 2 questions:

  1. I couldn't find the proper math behind pitch shifting to implement it from scratch so a simple explanation would be appreciated.
  2. Do I need to extract pitches from 2 voices to switch them or there's a simpler solution? If not an explanation on how to properly extract pitch from a sound and switching it is appreciated.

Thanks in advance.

2

There are 2 answers

2
jnnnnn On

librosa does this. The source is at

https://github.com/librosa/librosa/blob/main/librosa/effects.py#L253

The algorithm used there is summarised by the comment,

# Stretch in time, then resample

To explain this a little more, you can change the pitch by "stretching out" (or squashing) the waveform in the horizontal direction. This would, for example, make the vibrations of Middle C (262 Hz) be further apart and thus lower in frequency -- and, as a result, also lower in pitch. Stretching it out to double (and then filling in samples so that the sample rate remains unchanged) would change the pitch down an octave to C3 at 131Hz.

It looks like the hard part is resampling effectively, but a variety of algorithms are mentioned in the code.

0
Sadegh On

The first part which needs scratch code is done here

for two voices you need two pitches for sure, however, you can just make unsupervised training in order to recognize the speaker so it is not very hard.

You can also use frames containing their voice if they are mixed and you want to do it without machine learning methods.

There are also lots of more robust ways of finding the speaker with ML and the most famous is MFCC which is explained here.