Sliding-window Cepstral Mean Normalization — transform_sliding_window_cmn

Apply sliding-window cepstral mean (and optionally variance) normalization per utterance.

transform_sliding_window_cmn(
  cmn_window = 600,
  min_cmn_window = 100,
  center = FALSE,
  norm_vars = FALSE
)

Arguments

cmn_window: (int, optional): Window size, in frames, for the running-average CMN computation. Default: 600.
min_cmn_window: (int, optional): Minimum CMN window used at the start of decoding (adds latency only at the start). Applies only if center == FALSE; ignored if center == TRUE. Default: 100.
center: (bool, optional): If TRUE, use a window centered on the current frame (to the extent possible, modulo end effects). If FALSE, the window extends to the left of the current frame. Default: FALSE.
norm_vars: (bool, optional): If TRUE, normalize the variance to one. Default: FALSE.
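
The way these four arguments interact can be sketched in base R. This is only an illustration and not the package's implementation: `sliding_cmn_sketch` is a hypothetical helper, it works on a plain (frames x features) numeric matrix rather than a tensor, and its edge handling assumes Kaldi-style window logic (including a left window that spans the current frame plus the `cmn_window` frames before it). The variance floor of 1e-10 is likewise an assumption.

```r
# Hypothetical helper, not part of the package: sliding-window CMN on a
# (frames x features) numeric matrix, written in base R for illustration.
sliding_cmn_sketch <- function(x, cmn_window = 600, min_cmn_window = 100,
                               center = FALSE, norm_vars = FALSE) {
  n <- nrow(x)
  out <- x
  for (t in seq_len(n)) {
    t0 <- t - 1  # zero-based frame index; window is [start, end)
    if (center) {
      start <- t0 - cmn_window %/% 2
      end <- start + cmn_window
    } else {
      start <- t0 - cmn_window
      end <- t0 + 1
    }
    # Shift the window right if it starts before the first frame.
    if (start < 0) {
      end <- end - start
      start <- 0
    }
    # Left window: never look past the current frame, except to reach
    # min_cmn_window frames at the start (assumed Kaldi-style behaviour).
    if (!center && end > t0) {
      end <- max(t0 + 1, min_cmn_window)
    }
    # Shift the window left if it runs past the last frame.
    if (end > n) {
      start <- max(0, start - (end - n))
      end <- n
    }
    win <- x[(start + 1):end, , drop = FALSE]
    mu <- colMeans(win)
    out[t, ] <- x[t, ] - mu
    if (norm_vars) {
      # Population variance of the window, floored (assumed) to avoid 0/0.
      v <- pmax(colMeans(win^2) - mu^2, 1e-10)
      out[t, ] <- out[t, ] / sqrt(v)
    }
  }
  out
}

# Four frames, one feature. With the defaults the utterance is shorter than
# min_cmn_window, so every frame is normalized by the whole-utterance mean.
x <- matrix(c(1, 2, 3, 4), ncol = 1)
as.vector(sliding_cmn_sketch(x))  # -1.5 -0.5  0.5  1.5
```

In this sketch, setting center = TRUE makes min_cmn_window irrelevant, which mirrors the note in the argument description above.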

Value

Tensor: Tensor of audio of dimension (..., time).

Details

forward param: waveform (Tensor): Tensor of audio of dimension (..., time).