It is implemented using a normalized cross-correlation function (NCCF) and median smoothing.
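
The two ingredients named above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the helper names nccf_at_lag and detect_pitch_sketch are hypothetical, the lag search is a plain argmax, and stats::runmed stands in for the median smoothing step (it needs an odd window, so the sketch defaults to 3 rather than 30).

```r
# Illustrative sketch only; nccf_at_lag() and detect_pitch_sketch() are
# hypothetical helpers, not part of torchaudio.

# Normalized cross-correlation between the first n samples of x and the
# same n samples shifted by `lag`.
nccf_at_lag <- function(x, n, lag) {
  a <- x[1:n]
  b <- x[(1 + lag):(n + lag)]
  sum(a * b) / (sqrt(sum(a^2) * sum(b^2)) + 1e-12)
}

detect_pitch_sketch <- function(x, sample_rate, frame_time = 10^(-2),
                                win_length = 3, freq_low = 85,
                                freq_high = 3400) {
  frame_size <- ceiling(sample_rate * frame_time)
  # Candidate lags (in samples) spanning the detectable frequency range.
  min_lag <- max(1, floor(sample_rate / freq_high))
  max_lag <- ceiling(sample_rate / freq_low)
  lags <- min_lag:max_lag

  n_frames <- floor((length(x) - max_lag) / frame_size)
  pitch <- numeric(max(n_frames, 0))
  for (i in seq_len(n_frames)) {
    start <- (i - 1) * frame_size
    seg <- x[(start + 1):(start + frame_size + max_lag)]
    scores <- vapply(lags, function(l) nccf_at_lag(seg, frame_size, l),
                     numeric(1))
    # The best-matching lag is taken as the period of the frame.
    pitch[i] <- sample_rate / lags[which.max(scores)]
  }

  # Median smoothing across frames (window must be odd for runmed).
  if (n_frames >= win_length) {
    pitch <- as.numeric(stats::runmed(pitch, k = win_length,
                                      endrule = "median"))
  }
  pitch
}

# Half a second of a 200 Hz sine sampled at 16 kHz.
sr <- 16000
t <- (0:(sr / 2 - 1)) / sr
x <- sin(2 * pi * 200 * t)
est <- detect_pitch_sketch(x, sr, freq_low = 150, freq_high = 400)
```

The search range in the usage lines is narrowed to 150-400 Hz so that only one period of the tone (80 samples) fits inside the candidate lags. With a wider range, a plain argmax can lock onto a multiple of the period and report a lower octave.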

functional_detect_pitch_frequency(
  waveform,
  sample_rate,
  frame_time = 10^(-2),
  win_length = 30,
  freq_low = 85,
  freq_high = 3400
)

Arguments

waveform

(Tensor): Tensor of audio of dimension (..., freq, time)

sample_rate

(int): The sample rate of the waveform (Hz)

frame_time

(float, optional): Duration of a frame in seconds (Default: 10^(-2)).

win_length

(int, optional): The window length for median smoothing (in number of frames) (Default: 30).

freq_low

(int, optional): Lowest frequency that can be detected (Hz) (Default: 85).

freq_high

(int, optional): Highest frequency that can be detected (Hz) (Default: 3400).

Value

Tensor: Tensor of detected pitch frequencies (Hz) of dimension (..., frame)
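
A hedged usage sketch follows. It assumes the torch and torchaudio packages are installed with a working backend, and that a (channel, time) tensor is an acceptable input shape. The synthetic 440 Hz tone and the variable names are illustrative only.

```r
# Build one second of a 440 Hz sine at 16 kHz (illustrative input).
sample_rate <- 16000
signal <- sin(2 * pi * 440 * (0:(sample_rate - 1)) / sample_rate)

# Run only when both packages and the torch backend are available.
if (requireNamespace("torch", quietly = TRUE) &&
    requireNamespace("torchaudio", quietly = TRUE) &&
    torch::torch_is_installed()) {
  # Add a leading channel dimension: assumed shape (1, time).
  waveform <- torch::torch_tensor(signal)$unsqueeze(1)
  pitch <- torchaudio::functional_detect_pitch_frequency(
    waveform,
    sample_rate,
    freq_low = 85,
    freq_high = 3400
  )
  print(pitch$shape)
}
```

The remaining arguments keep their defaults, so frames are 10^(-2) seconds long and the median smoothing window spans 30 frames.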