Four energy-related LLD |
Sum of auditory spectrum (loudness) |
Sum of RASTA-style filtered auditory spectrum (modulation loudness) |
RMS energy, zero-crossing rate |
Fifty-five spectral LLD |
RASTA-style auditory spectrum, bands 1–26 (0–8 kHz) |
MFCC 1–14 |
Spectral energy 250–650 Hz, 1–4 kHz |
Spectral roll-off points 0.25, 0.50, 0.75, 0.90 |
Spectral flux, centroid, entropy, slope |
Spectral variance, skewness, kurtosis |
Psychoacoustic sharpness and harmonicity |
Six voicing-related LLD |
F0 via subharmonic summation (SHS) and Viterbi smoothing |
Probability of voicing, logarithmic HNR by waveform matching |
Jitter (local and delta), shimmer (local) |
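
To make a few of the descriptors listed above concrete, the following is a minimal NumPy sketch that computes RMS energy, zero-crossing rate, spectral centroid, and the four spectral roll-off points for a single frame. It is an illustration under stated assumptions, not the extraction toolkit's implementation: the function name `frame_lld`, the dictionary keys, the use of an unwindowed power spectrum, and the roll-off definition (lowest frequency below which a given fraction of the spectral power lies) are all choices made here for clarity.

```python
import numpy as np


def frame_lld(frame, sample_rate, rolloff_fractions=(0.25, 0.50, 0.75, 0.90)):
    """Compute a few illustrative low-level descriptors for one frame.

    `frame` is a 1-D array of samples; `sample_rate` is in Hz.
    The name and return layout are hypothetical, chosen for this sketch.
    """
    frame = np.asarray(frame, dtype=float)

    # RMS energy: square root of the mean squared amplitude.
    rms_energy = float(np.sqrt(np.mean(frame ** 2)))

    # Zero-crossing rate: fraction of adjacent sample pairs with differing sign.
    signs = np.signbit(frame)
    zero_crossing_rate = float(np.mean(signs[1:] != signs[:-1]))

    # One-sided power spectrum (no window applied; an assumption of this sketch).
    power = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    total = power.sum()

    if total == 0.0:
        # Silent frame: report zeros rather than dividing by zero.
        centroid = 0.0
        rolloff = {p: 0.0 for p in rolloff_fractions}
    else:
        # Spectral centroid: power-weighted mean frequency.
        centroid = float((freqs * power).sum() / total)

        # Roll-off point p: lowest frequency whose cumulative power
        # reaches the fraction p of the total power.
        cumulative = np.cumsum(power)
        rolloff = {}
        for p in rolloff_fractions:
            idx = min(int(np.searchsorted(cumulative, p * total)), len(freqs) - 1)
            rolloff[p] = float(freqs[idx])

    return {
        "rms_energy": rms_energy,
        "zero_crossing_rate": zero_crossing_rate,
        "spectral_centroid_hz": centroid,
        "rolloff_hz": rolloff,
    }


if __name__ == "__main__":
    # 25 ms frame of a 1 kHz tone sampled at 16 kHz (phase offset avoids exact zeros).
    sr = 16000
    t = np.arange(400) / sr
    tone = np.sin(2 * np.pi * 1000 * t + 0.1)
    print(frame_lld(tone, sr))
```

For a pure tone, the centroid and all four roll-off points coincide with the tone frequency, which gives a quick sanity check. The auditory-spectrum, MFCC, and voicing descriptors in the table require filter banks and pitch tracking and are not covered by this sketch.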