From: ALBAYZIN Query-by-example Spoken Term Detection 2016 evaluation
Description | No. of features |
---|---|
Sum of auditory spectra | 1 |
Zero-crossing rate | 1 |
Sum of RASTA-style filtered auditory spectra | 1 |
Frame intensity | 1 |
Frame loudness | 1 |
Root mean square energy and log-energy | 2 |
Energy in frequency bands 250–650 Hz and 1000–4000 Hz | 2 |
Spectral rolloff points at 25%, 50%, 75%, 90% | 4 |
Spectral flux | 1 |
Spectral entropy | 1 |
Spectral variance | 1 |
Spectral skewness | 1 |
Spectral kurtosis | 1 |
Psychoacoustical sharpness | 1 |
Spectral harmonicity | 1 |
Spectral flatness | 1 |
MFCCs | 16 |
MFCC filterbank | 26 |
Line spectral pairs | 8 |
Cepstral PLP coefficients | 9 |
RASTA PLP coefficients | 9 |
Fundamental frequency (F0) | 1 |
Probability of voicing | 1 |
Jitter | 2 |
Shimmer | 1 |
Log harmonics-to-noise ratio (logHNR) | 1 |
LPC formant frequencies and bandwidths | 6 |
Formant frame intensity | 1 |
First derivatives of the features above | 102 |
Total | 204 |
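
The totals in the table can be checked with a short tally: the static rows sum to 102, and appending one first derivative per static feature doubles the vector to 204 dimensions. The sketch below is only illustrative. The names `STATIC_COUNTS` and `add_deltas` are hypothetical, and the delta is computed as a plain frame-to-frame difference, which is an assumption; the table does not state how the derivatives were obtained.

```python
# Static feature counts copied row by row from the table.
STATIC_COUNTS = [
    ("Sum of auditory spectra", 1),
    ("Zero-crossing rate", 1),
    ("Sum of RASTA-style filtered auditory spectra", 1),
    ("Frame intensity", 1),
    ("Frame loudness", 1),
    ("Root mean square energy and log-energy", 2),
    ("Energy in frequency bands 250-650 Hz and 1000-4000 Hz", 2),
    ("Spectral rolloff points at 25%, 50%, 75%, 90%", 4),
    ("Spectral flux", 1),
    ("Spectral entropy", 1),
    ("Spectral variance", 1),
    ("Spectral skewness", 1),
    ("Spectral kurtosis", 1),
    ("Psychoacoustical sharpness", 1),
    ("Spectral harmonicity", 1),
    ("Spectral flatness", 1),
    ("MFCCs", 16),
    ("MFCC filterbank", 26),
    ("Line spectral pairs", 8),
    ("Cepstral PLP coefficients", 9),
    ("RASTA PLP coefficients", 9),
    ("Fundamental frequency (F0)", 1),
    ("Probability of voicing", 1),
    ("Jitter", 2),
    ("Shimmer", 1),
    ("Log harmonics-to-noise ratio (logHNR)", 1),
    ("LPC formant frequencies and bandwidths", 6),
    ("Formant frame intensity", 1),
]


def add_deltas(frames):
    """Append a first derivative to every frame of static features.

    Assumption: the delta is the difference to the previous frame,
    with zeros for the first frame. The source does not specify this.
    """
    out = []
    prev = None
    for frame in frames:
        if prev is None:
            delta = [0.0] * len(frame)
        else:
            delta = [cur - old for cur, old in zip(frame, prev)]
        out.append(list(frame) + delta)
        prev = frame
    return out


n_static = sum(count for _, count in STATIC_COUNTS)

# Three dummy frames whose values are 0.0, 1.0 and 2.0 in every dimension.
frames = [[float(t)] * n_static for t in range(3)]
full = add_deltas(frames)

print(n_static, len(full[0]))  # 102 204
```

Keeping the counts in one list makes it easy to confirm that any change to a row still agrees with the "Total" line.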