transform_mfcc.Rd
Create the Mel-frequency cepstrum coefficients from an audio signal.
transform_mfcc(
sample_rate = 16000,
n_mfcc = 40,
dct_type = 2,
norm = "ortho",
log_mels = FALSE,
...
)
sample_rate (int, optional): Sample rate of the audio signal. (Default: 16000)
n_mfcc (int, optional): Number of MFC coefficients to retain. (Default: 40)
dct_type (int, optional): Type of DCT (discrete cosine transform) to use. (Default: 2)
norm (str, optional): Norm to use. (Default: 'ortho')
log_mels (bool, optional): Whether to use log-mel spectrograms instead of dB-scaled ones. (Default: FALSE)
... (optional): Arguments passed on to transform_mel_spectrogram.
tensor: specgram_mel_db of size (..., n_mfcc, time).
forward param: waveform (tensor): Tensor of audio of dimension (..., time)
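A minimal usage sketch follows, assuming the torch and torchaudio R packages are installed. The random waveform is only a stand-in for real audio, and the variable names are illustrative.

```r
library(torch)
library(torchaudio)

# One second of synthetic noise at 16 kHz, shaped (channel, time).
# This stands in for real audio and is purely illustrative.
waveform <- torch_randn(1, 16000)

# Build the transform with two of the documented defaults made explicit.
mfcc <- transform_mfcc(sample_rate = 16000, n_mfcc = 40)

# Calling the transform on a waveform runs its forward pass.
out <- mfcc(waveform)
out$shape  # expected layout: (channel, n_mfcc, time)
```

The number of time frames depends on the spectrogram settings forwarded through `...`, so it is not fixed by the arguments shown here.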
By default, this calculates the MFCC on the dB-scaled Mel spectrogram. This output depends on the maximum value in the input spectrogram, and so may return different values for an audio clip split into snippets vs. a full clip.
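The dependence on the maximum value can be illustrated in base R. This is a sketch under stated assumptions, not the package's implementation: `power_to_db` is a hypothetical helper, and the 80 dB floor below the peak is an assumed value chosen for illustration.

```r
# Hypothetical helper: convert power to decibels, then floor every value
# that lies more than `top_db` below the loudest one.
# top_db = 80 is an assumption made for this illustration.
power_to_db <- function(power, top_db = 80) {
  db <- 10 * log10(pmax(power, 1e-10))
  pmax(db, max(db) - top_db)
}

# A toy "spectrogram": a quiet stretch followed by a loud one.
quiet <- c(1e-9, 1e-8)
loud  <- c(1, 10)

# Scale the quiet values as part of the full clip, then on their own.
full_clip <- power_to_db(c(quiet, loud))
snippet   <- power_to_db(quiet)

full_clip[1:2]  # quiet values are raised to the floor set by the loud peak
snippet         # the same values scaled alone keep their own levels
```

Because the floor is set relative to the peak of whatever is passed in, the quiet values come out differently in the two cases, which is why MFCCs computed on snippets can differ from those computed on the full clip.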