Wednesday, March 9, 2022

Python: Discrete Cosine Transform (DCT) for speech processing using Scipy

The DCT converts a speech signal into real-valued transform-domain coefficients, and the inverse DCT (IDCT) converts those coefficients back into the speech waveform.
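As a small illustration of the "real values" point, the sketch below compares the DCT with the FFT on a synthetic tone. The 440 Hz test signal and the 16 kHz rate are assumptions made for this example only. The DCT returns real coefficients of the same length as the input, whereas the FFT returns complex ones.

```python
import numpy as np
from scipy.fftpack import dct, idct

rate = 16000                              # assumed sampling rate in Hz
t = np.arange(rate // 10) / rate          # 100 ms of sample times
x = 0.5 * np.sin(2 * np.pi * 440 * t)     # synthetic stand-in for a speech frame

X_dct = dct(x, norm='ortho')              # real-valued coefficients
X_fft = np.fft.fft(x)                     # complex-valued coefficients
x_rec = idct(X_dct, norm='ortho')         # back to the time domain

print(X_dct.dtype, X_fft.dtype)
print(np.allclose(x, x_rec))
```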

Example using Python and Scipy:

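import soundfile as sf
from scipy.fftpack import dct, idct


# The DCT and IDCT functions have to be normalized (norm='ortho').
def dct_norm(block):
    # Transpose so that the transform runs along the time axis.
    return dct(block.T, norm='ortho').T


def idct_norm(block):
    return idct(block.T, norm='ortho').T


# Read the speech samples and the sampling rate of the file.
with open('1.wav', 'rb') as f_wav:
    x, rate = sf.read(f_wav)

X = dct_norm(x)     # DCT coefficients (real values)
y = idct_norm(X)    # reconstructed speech waveform

# Write the result at the sampling rate of the input instead of a fixed 16000.
sf.write('2.wav', y, rate)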
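The transposes in dct_norm and idct_norm matter for multi-channel audio: soundfile returns such data with shape (frames, channels), while dct transforms along the last axis by default. The rough sketch below uses random numbers as a stand-in for a stereo recording (an assumption for illustration) and suggests that passing axis=0 gives the same result as transposing.

```python
import numpy as np
from scipy.fftpack import dct, idct

# Hypothetical stereo signal: 1000 frames, 2 channels.
rng = np.random.default_rng(0)
stereo = rng.standard_normal((1000, 2))

# Transform along time by transposing, as in dct_norm ...
via_transpose = dct(stereo.T, norm='ortho').T
# ... or by naming the time axis explicitly.
via_axis = dct(stereo, axis=0, norm='ortho')
print(np.allclose(via_transpose, via_axis))

# The inverse along the same axis recovers the input.
restored = idct(via_axis, axis=0, norm='ortho')
print(np.allclose(restored, stereo))
```

For a mono file the array is one-dimensional, so the transpose has no effect and either form works.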


Note that both the DCT and the IDCT must be called with norm='ortho' to obtain results identical to MATLAB's dct and idct functions.
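To see what the normalization changes, the following sketch compares the round trip with and without norm='ortho' on a short test vector (the vector is hypothetical, chosen only for illustration). With the default settings, scipy.fftpack leaves both transforms unscaled, so the unnormalized round trip comes back multiplied by 2N, where N is the signal length.

```python
import numpy as np
from scipy.fftpack import dct, idct

x = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical test vector
N = len(x)

# Default (unnormalized) round trip: scaled by 2N.
y_plain = idct(dct(x))
# Orthonormal round trip: returns the original signal.
y_ortho = idct(dct(x, norm='ortho'), norm='ortho')

print(np.allclose(y_plain, 2 * N * x))
print(np.allclose(y_ortho, x))
```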

Reference:

Matlab: Discrete Cosine Transform (DCT) for speech processing (StudyEECC)

scipy.fftpack.dct

Discrete Cosine Transforms (Scipy Tutorial)

In scipy why doesn't idct(dct(a)) equal to a? (StackOverflow)

Python - How to use Soundfile to read and write WAV and FLAC files (Study Raspberry Pi)
