7.2 Python implementation
In the IPython notebook (Section 4), we already have code that implements pitch shifting: the function DFT_pshift.
Unlike the granular synthesis implementation, DFT_pshift is better suited to a real-time implementation, because its input buffer x[n:n+G] and output buffer y[n:n+G] have the same length. Recall that this was not the case for the granular synthesis implementation in the IPython notebook (cell 70 under Section 3).
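To make the equal-buffer-length point concrete, here is a minimal sketch of a grain-by-grain overlap-add loop. It is not the notebook's code: the helper win_taper, the function name dft_pshift_sketch, and the rescale argument (any function that maps a grain to a grain of the same length) are assumptions made for illustration.

```python
import numpy as np


def win_taper(grain_len, overlap):
    """Hypothetical helper: tapering window, hop size and overlap length.

    overlap is the fraction of the grain shared with its neighbour (0 to 0.5).
    """
    ov = int(grain_len * overlap)
    ramp = np.arange(1, ov + 1) / (ov + 1)   # linear fade, empty if ov == 0
    win = np.r_[ramp, np.ones(grain_len - 2 * ov), ramp[::-1]]
    return win, grain_len - ov, ov


def dft_pshift_sketch(x, rescale, grain_len, overlap=0.0):
    """Overlap-add of processed grains; rescale must preserve the grain length."""
    y = np.zeros(len(x))
    win, step, _ = win_taper(grain_len, overlap)
    for n in range(0, len(x) - grain_len, step):
        w = rescale(x[n:n + grain_len] * win)   # input grain: grain_len samples
        y[n:n + grain_len] += w * win           # output grain: grain_len samples
    return y
```

Because every iteration reads grain_len samples and writes grain_len samples, the same loop body can be run once per incoming audio buffer in a real-time setting.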
Lookup tables.
State variables.
Integer data types.
With our DFT rescaling functions now complete, we can proceed to applying the effect on a fixed WAV file.
At the top of the script, there is a new code snippet that defines a global variable N_BINS: the number of frequency bins in the positive half of the spectrum after performing a DFT of length GRAIN_LEN_SAMP (the grain length in samples).
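As a small illustration, the number of positive-half bins for a real input of length GRAIN_LEN_SAMP can be computed as below. The grain length of 512 is an arbitrary value chosen for this sketch, and the use of np.fft.rfft to check the count is an assumption about how the spectrum is obtained.

```python
import numpy as np

GRAIN_LEN_SAMP = 512                 # illustrative value, not taken from the script
N_BINS = GRAIN_LEN_SAMP // 2 + 1     # bins 0 (DC) up to and including Nyquist

# The real-input FFT returns exactly this many complex bins.
spectrum = np.fft.rfft(np.zeros(GRAIN_LEN_SAMP))
print(N_BINS, len(spectrum))
```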
Now for the fun part! That is, implementing the pitch shifting and trying it out.
If we look at the function DFT_pshift of the IPython notebook, we can observe three steps:
Apply the tapering window to the input grain: x[n:n+G] * win.
Apply the DFT rescaling function.
Apply the tapering window to the DFT-rescaled grain: w * win.
The last step is already performed as part of our granular synthesis code, so we only have to implement the first two. We also need two additional steps: casting the input grain to floating point, since the FFT operates on this data type, and casting the result back to integer. In total:
Apply tapering window to the (integer) input grain.
Cast input grain to floating point.
Apply the DFT rescaling function.
Cast rescaled grain to integer type.
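The four steps above can be sketched sample by sample as follows. This is only an illustration under stated assumptions: the window is taken to be stored as 16-bit integers in Q15 format (1.0 stored as 32767), the names WIN_INT, input_concat, output_grain and rescale_placeholder are hypothetical, and the rescaling function is replaced by a stand-in that returns its input unchanged.

```python
import numpy as np

GRAIN_LEN_SAMP = 8               # tiny grain, for illustration only
MAX_VAL = 2 ** 15 - 1            # largest int16 value

# Assumed convention: window stored in Q15, here a flat window of "almost 1.0".
WIN_INT = np.full(GRAIN_LEN_SAMP, MAX_VAL, dtype=np.int16)

input_concat = np.array([1000, -2000, 3000, -4000, 500, -600, 700, -800],
                        dtype=np.int16)
input_concat_float = np.zeros(GRAIN_LEN_SAMP, dtype=np.float32)
output_grain = np.zeros(GRAIN_LEN_SAMP, dtype=np.int16)


def rescale_placeholder(grain_float):
    """Stand-in for the DFT rescaling function; returns the grain unchanged."""
    return grain_float


# Steps 1 and 2: window in integer arithmetic, then cast to floating point.
for n in range(GRAIN_LEN_SAMP):
    windowed = (int(WIN_INT[n]) * int(input_concat[n])) >> 15
    input_concat_float[n] = float(windowed)

# Step 3: apply the (placeholder) rescaling function.
rescaled = rescale_placeholder(input_concat_float)

# Step 4: cast back to int16, clipping to the valid range first.
for n in range(GRAIN_LEN_SAMP):
    output_grain[n] = int(min(max(rescaled[n], -MAX_VAL - 1), MAX_VAL))
```

Clipping before the cast avoids wrap-around if the rescaled grain exceeds the 16-bit range.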
You may notice that the provided script has a new global array called input_concat_float, which we suggest you use.
With this additional code, your DFT-based pitch shifter should be complete!
Congrats on implementing the DFT-based pitch shifter! You may notice some unwanted artifacts when applying this effect. More advanced versions (such as commercial "auto-tune" applications) take great care to minimize such artifacts by doing a more sophisticated analysis of each speech segment. We saw a similar type of analysis when computing the LPC coefficients in the previous chapter.
The code here is essentially ready for porting to C. However, we still need to introduce you to a library for computing the DFT. This will be covered in a (yet to come) section.
For now, enjoy your various voice effects!
However, in order to make the task of porting to C much easier, we would like to implement the above effect without using array operations and using some of the techniques we saw in the alien voice chapter, namely lookup tables, state variables, and integer data types.
There exist C libraries for performing the FFT, so we will still use numpy's FFT library in our Python implementation.
In the provided script, you can find an incomplete function build_dft_rescale_lookup for building this lookup table. Read the TODO comment for hints on how to compute it!
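One possible way to build such a table is sketched below. It assumes that source bin k is mapped to target bin int(k * shift_factor) and that source bins whose target falls outside the positive half of the spectrum are dropped; the function name, its signature and its return values are assumptions for illustration, not the provided script's definition.

```python
import numpy as np


def build_dft_rescale_lookup_sketch(n_bins, shift_factor):
    """Return (shift_idx, max_bin) for a positive shift_factor.

    shift_idx[k] is the target bin of source bin k; only the first max_bin
    source bins have a target inside the range 0 .. n_bins - 1.
    """
    shift_idx = np.zeros(n_bins, dtype=np.int16)
    max_bin = 0
    for k in range(n_bins):
        target = int(k * shift_factor)
        if target >= n_bins:        # targets only grow with k, so stop here
            break
        shift_idx[k] = target
        max_bin = k + 1
    return shift_idx, max_bin


shift_idx, max_bin = build_dft_rescale_lookup_sketch(9, 1.5)
print(shift_idx[:max_bin], max_bin)
```

Computing the table once at start-up means the per-grain processing needs no floating-point multiplication to find the target bins.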
In the provided script, you can find an incomplete function dft_rescale for performing this operation. Edit the code under the TODO comment in order to perform the spectrum rescaling.
Notice that we use numpy's FFT functions for real-valued input, so that we only work with the positive half of the spectrum, as we have a real-valued input signal.
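A minimal sketch of spectrum rescaling with a lookup table follows. It assumes numpy's real-input transforms np.fft.rfft and np.fft.irfft, and a table shift_idx with a bin limit max_bin as inputs; the function name dft_rescale_sketch and its signature are hypothetical and may differ from the provided script.

```python
import numpy as np

GRAIN_LEN_SAMP = 16                  # illustrative grain length
N_BINS = GRAIN_LEN_SAMP // 2 + 1


def dft_rescale_sketch(grain_float, shift_idx, max_bin):
    """Move each positive-frequency bin k to bin shift_idx[k], for k < max_bin."""
    spectrum = np.fft.rfft(grain_float)
    rescaled = np.zeros(N_BINS, dtype=complex)
    for k in range(max_bin):                  # explicit loop, no array operations
        rescaled[shift_idx[k]] += spectrum[k]
    return np.fft.irfft(rescaled, n=len(grain_float))


# Usage: shift a cosine sitting on bin 2 up by a factor of 1.5, to bin 3.
shift_idx = [int(1.5 * k) for k in range(N_BINS)]
max_bin = 6                                   # int(1.5 * 6) = 9 is out of range
n = np.arange(GRAIN_LEN_SAMP)
shifted = dft_rescale_sketch(np.cos(2 * np.pi * 2 * n / GRAIN_LEN_SAMP),
                             shift_idx, max_bin)
```

Since irfft rebuilds the negative frequencies by conjugate symmetry, the output grain is real-valued without any extra bookkeeping.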
TASK 3: As a sanity check, copy your code from your granular synthesis implementation below the comments # copy from granular synthesis into the provided script.
Also complete the ms2smp function.
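For reference, a conversion from milliseconds to samples could look like the sketch below. The argument order (duration first, then sampling rate) and the truncation to an integer are assumptions; check them against the function stub you are given.

```python
def ms2smp_sketch(ms, fs):
    """Convert a duration in milliseconds to a whole number of samples.

    ms: duration in milliseconds; fs: sampling rate in Hz.
    """
    return int(float(fs) * float(ms) / 1000.0)


print(ms2smp_sketch(20, 16000))  # 20 ms at 16 kHz -> 320
```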
TASK 5: When your implementation works with the fixed WAV file, you can copy your init and process functions over in order to run the effect in real-time with your laptop's soundcard.