Tuesday, March 15, 2022

Python: Fast Fourier Transform (FFT) for speech processing using Scipy

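SciPy provides FFT and inverse FFT functions that are useful for sound processing. The example reads a WAV file, transforms it to the frequency domain, transforms it back, and saves the result.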

Example:

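import soundfile as sf
from scipy.fftpack import fft, ifft

# Read the samples and the sample rate of the input file.
x, rate = sf.read('1.wav')

# Transform along the time axis (axis 0), so that multi-channel files are processed per channel.
X = fft(x, axis=0)

# The ifft function returns a complex array, but for a real input signal the imaginary part is small enough to be neglected.
y_complex = ifft(X, axis=0)
y = y_complex.real

# Write the result using the sample rate read from the input file.
sf.write('2.wav', y, rate)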
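The claim that the imaginary part of the inverse FFT is negligible can be checked on a synthetic signal. The following is a minimal sketch, assuming NumPy is installed; the 440 Hz tone and the 16000 Hz sample rate are made-up test values standing in for a recorded file.

```python
import numpy as np
from scipy.fftpack import fft, ifft

rate = 16000                             # assumed sample rate in Hz
t = np.arange(rate) / rate               # one second of sample times
x = 0.5 * np.sin(2 * np.pi * 440 * t)    # hypothetical test tone

X = fft(x)
y_complex = ifft(X)

# Largest imaginary residue left by the round trip (floating-point noise).
max_imag = np.max(np.abs(y_complex.imag))
# Largest difference between the original and the reconstructed signal.
max_err = np.max(np.abs(y_complex.real - x))

# Frequency of the strongest bin in the first half of the spectrum.
peak_hz = np.argmax(np.abs(X[: rate // 2])) * rate / len(x)

print(max_imag, max_err, peak_hz)
```

Both residues should be many orders of magnitude below the signal amplitude, and the strongest bin should sit at the frequency of the test tone.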


Reference:

Matlab: Discrete Cosine Transform (DCT) for speech processing (StudyEECC)

Python: Discrete Cosine Transform (DCT) for speech processing (StudyRaspberryPi)

Fourier Transforms (scipy.fft)

scipy and numpy inverse fft returns complex numbers not floats, can't save as wav (StackOverflow)

Python - How to use Soundfile to read and write WAV and FLAC files (Study Raspberry Pi)

Wednesday, March 9, 2022

Python: Discrete Cosine Transform (DCT) for speech processing using Scipy

The DCT and inverse DCT may be used to convert a speech signal into the transform domain and back to the speech waveform. Unlike the FFT, the DCT of a real signal is itself real-valued, so no complex numbers are involved.

Example using Python and Scipy:

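import soundfile as sf
from scipy.fftpack import dct, idct

# DCT and IDCT functions have to be normalized (norm='ortho') so that idct_norm(dct_norm(x)) returns x.
def dct_norm(block):
    # Transposing makes the transform run along the time axis for multi-channel input.
    return dct(block.T, norm='ortho').T

def idct_norm(block):
    return idct(block.T, norm='ortho').T

# Read the samples and the sample rate of the input file.
x, rate = sf.read('1.wav')

X = dct_norm(x)
y = idct_norm(X)

# Write the result using the sample rate read from the input file.
sf.write('2.wav', y, rate)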


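Note that the DCT and IDCT functions have to be normalized with norm = 'ortho' to obtain results identical to MATLAB. Without normalization, scipy.fftpack returns idct(dct(x)) scaled by a factor of 2N, where N is the signal length.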
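The effect of the normalization can be checked on a short random block. This is a minimal sketch, assuming NumPy is installed; the block length of 8 and the random seed are arbitrary test values.

```python
import numpy as np
from scipy.fftpack import dct, idct

rng = np.random.default_rng(0)
x = rng.standard_normal(8)    # hypothetical short test block
N = len(x)

# Unnormalized round trip: the result is the input scaled by 2 * N.
y_plain = idct(dct(x))
# Orthonormal round trip: the result is the input itself.
y_ortho = idct(dct(x, norm='ortho'), norm='ortho')

scaled_ok = np.allclose(y_plain, 2 * N * x)
ortho_ok = np.allclose(y_ortho, x)

# The orthonormal DCT also preserves the energy of the block.
energy_ok = np.allclose(np.sum(x ** 2), np.sum(dct(x, norm='ortho') ** 2))

print(scaled_ok, ortho_ok, energy_ok)
```

The check uses scipy.fftpack, as in the example above; other SciPy modules may scale the inverse transform differently.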

Reference:

Matlab: Discrete Cosine Transform (DCT) for speech processing (StudyEECC)

scipy.fftpack.dct

Discrete Cosine Transforms (Scipy Tutorial)

In scipy why doesn't idct(dct(a)) equal to a? (StackOverflow)

Python - How to use Soundfile to read and write WAV and FLAC files (Study Raspberry Pi)

Tuesday, March 8, 2022

Python - How to use Soundfile to read and write WAV and FLAC files

WAV and FLAC are two commonly used sound file formats:

WAV - Waveform Audio File Format (Wikipedia)

FLAC - Free Lossless Audio Codec (Wikipedia)

The major difference between the two formats is that WAV usually stores audio without compression, whereas FLAC uses lossless compression.

In Python, Soundfile can read and write WAV and FLAC files, and convert between the two formats, with the code below:

import soundfile as sf

# Read each file together with its own sample rate.
x_wav, rate_wav = sf.read('wav/1.wav')
x_flac, rate_flac = sf.read('flac/2.flac')

# Exchange sound formats by saving; the output format is chosen from the file extension.
sf.write('1.flac', x_wav, rate_wav)
sf.write('2.wav', x_flac, rate_flac)