transform_inverse_mel_scale.Rd
Solve for a linear-frequency STFT from a mel-frequency STFT, using a conversion matrix built from triangular filter banks.
transform_inverse_mel_scale(
n_stft,
n_mels = 128,
sample_rate = 16000,
f_min = 0,
f_max = NULL,
max_iter = 1e+05,
tolerance_loss = 1e-05,
tolerance_change = 1e-08,
...
)
n_stft (int): Number of bins in STFT. See n_fft in transform_spectrogram.
n_mels (int, optional): Number of mel filterbanks. (Default: 128)
sample_rate (int, optional): Sample rate of audio signal. (Default: 16000)
f_min (float, optional): Minimum frequency. (Default: 0.)
f_max (float or NULL, optional): Maximum frequency. (Default: sample_rate %/% 2)
max_iter (int, optional): Maximum number of optimization iterations. (Default: 100000)
tolerance_loss (float, optional): Value of loss to stop optimization at. (Default: 1e-5)
tolerance_change (float, optional): Difference in losses to stop optimization at. (Default: 1e-8)
... (optional): Arguments passed to the SGD optimizer. Argument lr will default to 0.1 if not specified. (Default: NULL)
Tensor: Linear scale spectrogram of size (..., freq, time)
forward param:
melspec (Tensor): A Mel frequency spectrogram of dimension (..., n_mels, time)
It minimizes the Euclidean norm between the input mel-spectrogram and the product of the estimated spectrogram and the filter banks using SGD.
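The optimization described above can be illustrated with a small base R sketch. It is not the package's implementation: it uses plain gradient descent with a fixed step instead of the torch SGD optimizer, a tiny hand-built triangular filter bank, and a 2-D matrix laid out as (n_mels, time). The names invert_mel_sketch, fb, and spec_true are hypothetical and exist only for this illustration; the stopping arguments merely mirror the roles of max_iter, tolerance_loss, and tolerance_change.

```r
# Illustrative sketch only: base R gradient descent, not the torchaudio internals.
invert_mel_sketch <- function(melspec, fb, lr = 0.1, max_iter = 5000,
                              tolerance_loss = 1e-5, tolerance_change = 1e-8) {
  # fb: (n_mels x n_freq) filter bank; melspec: (n_mels x time)
  specgram <- matrix(0, nrow = ncol(fb), ncol = ncol(melspec))
  loss <- Inf
  for (i in seq_len(max_iter)) {
    residual <- melspec - fb %*% specgram
    new_loss <- sum(residual^2)  # squared Euclidean norm of the mismatch
    # Stop when the loss is small enough or has stopped changing.
    if (new_loss < tolerance_loss || abs(loss - new_loss) < tolerance_change) {
      loss <- new_loss
      break
    }
    loss <- new_loss
    # Gradient of the loss w.r.t. specgram is -2 * t(fb) %*% residual.
    specgram <- specgram + lr * 2 * crossprod(fb, residual)
  }
  list(specgram = specgram, loss = loss, iterations = i)
}

# Toy stand-in for a triangular filter bank: 4 overlapping triangles over 6 bins.
fb <- matrix(0, nrow = 4, ncol = 6)
for (m in 1:4) fb[m, m:(m + 2)] <- c(0.5, 1, 0.5)

set.seed(1)
spec_true <- matrix(runif(6 * 5), nrow = 6)  # hypothetical linear-scale spectrogram
melspec <- fb %*% spec_true                  # its mel-scale projection

result <- invert_mel_sketch(melspec, fb)
```

Because there are fewer mel bands than frequency bins, the problem is underdetermined: the estimate in result$specgram reproduces melspec closely when passed through fb, but it need not equal spec_true.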