For each frame, take the highest value of NCCF, apply centered median smoothing, and convert to frequency.
functional__find_max_per_frame(nccf, sample_rate, freq_high)
nccf (tensor): Usually a tensor returned by functional__compute_nccf
sample_rate (int): Sampling rate of the waveform, e.g. 44100 (Hz)
freq_high (int): Highest frequency that can be detected (Hz)
Note: If the maximum over all lags is very close to the maximum over the first half of the lags, then the latter is taken.
Returns: tensor with indices
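
The selection rule in the note can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: the function name `find_max_per_frame_sketch`, the tolerance `tol`, the list-of-lists input, and the choice of `ceil(sample_rate / freq_high)` as the smallest lag searched are all assumptions made for illustration. Median smoothing, the conversion to frequency, and any index offset the real function may apply are left out.

```python
import math


def find_max_per_frame_sketch(nccf, sample_rate, freq_high, tol=0.99):
    """Pick one lag per frame from NCCF values (illustrative only).

    nccf: list of frames, each a list of NCCF values indexed by lag.
    tol:  hypothetical closeness threshold for preferring the first half.
    """
    # Assumption: lags shorter than this would imply a pitch above freq_high.
    lag_min = math.ceil(sample_rate / freq_high)
    indices = []
    for frame in nccf:
        half_size = len(frame) // 2
        # Lag with the highest NCCF value over all admissible lags.
        best = max(range(lag_min, len(frame)), key=lambda lag: frame[lag])
        first_half = range(lag_min, half_size)
        if first_half:  # an empty range is falsy
            # Lag with the highest NCCF value over the first half only.
            half = max(first_half, key=lambda lag: frame[lag])
            # Prefer the first-half peak when it is nearly as high.
            if frame[half] >= tol * frame[best]:
                best = half
        indices.append(best)
    return indices


frames = [
    [0.0, 0.0, 0.1, 0.95, 0.2, 0.3, 0.955, 0.1],  # near tie: lag 3 vs lag 6
    [0.0, 0.0, 0.1, 0.50, 0.2, 0.3, 0.960, 0.1],  # clear winner at lag 6
]
print(find_max_per_frame_sketch(frames, sample_rate=8, freq_high=4))  # [3, 6]
```

In the first frame the peak at lag 6 is only marginally higher than the one at lag 3, so the shorter lag is chosen. Preferring the shorter lag in a near tie is a common way to avoid picking a multiple of the true period, which would report a pitch one or more octaves too low.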