Search on speech from spoken queries: the Multi-domain International ALBAYZIN 2018 Query-by-Example Spoken Term Detection Evaluation

EURASIP Journal on Audio, Speech, and Music Processing

Table 7 Acoustic features used in the A-Hybrid DTW+LVCSR QbE STD system

Description	Number of features
Sum of auditory spectra	1
Zero-crossing rate	1
Sum of RASTA-style filtered auditory spectra	1
Frame intensity	1
Frame loudness	1
Root mean square energy and log-energy	2
Energy in the frequency bands 250–650 Hz and 1000–4000 Hz	2
Spectral roll-off points at 25%, 50%, 75%, and 90%	4
Spectral flux	1
Spectral entropy	1
Spectral variance	1
Spectral skewness	1
Spectral kurtosis	1
Psychoacoustical sharpness	1
Spectral harmonicity	1
Spectral flatness	1
Mel-frequency cepstral coefficients	16
MFCC filterbank	26
Line spectral pairs	8
Perceptual linear predictive cepstral coefficients	9
RASTA PLP coefficients	9
Fundamental frequency (F0)	1
Probability of voicing	1
Jitter	2
Shimmer	1
Log harmonics-to-noise ratio (logHNR)	1
LPC formant frequencies and bandwidths	6
Formant frame intensity	1
Deltas	102
Total	204
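
The arithmetic behind the last two rows can be checked with a minimal Python sketch: the static feature counts listed above sum to 102, and appending one delta per static feature doubles the dimension to 204. The names `STATIC_COUNTS` and `add_deltas` are hypothetical, and the two-point central difference is an assumption made only for illustration; the table does not state which delta formula the system uses.

```python
# Static feature counts copied from Table 7 (description -> number of features).
STATIC_COUNTS = {
    "sum of auditory spectra": 1,
    "zero-crossing rate": 1,
    "sum of RASTA-filtered auditory spectra": 1,
    "frame intensity": 1,
    "frame loudness": 1,
    "RMS energy and log-energy": 2,
    "band energies (250-650 Hz, 1000-4000 Hz)": 2,
    "spectral roll-off points": 4,
    "spectral flux": 1,
    "spectral entropy": 1,
    "spectral variance": 1,
    "spectral skewness": 1,
    "spectral kurtosis": 1,
    "psychoacoustical sharpness": 1,
    "spectral harmonicity": 1,
    "spectral flatness": 1,
    "MFCC": 16,
    "MFCC filterbank": 26,
    "line spectral pairs": 8,
    "PLP cepstral coefficients": 9,
    "RASTA PLP coefficients": 9,
    "F0": 1,
    "probability of voicing": 1,
    "jitter": 2,
    "shimmer": 1,
    "logHNR": 1,
    "formant frequencies and bandwidths": 6,
    "formant frame intensity": 1,
}


def add_deltas(frames):
    """Append one first-order delta per static feature to every frame.

    Assumption: a two-point central difference with the edge frames
    repeated; the source does not specify the delta computation.
    """
    n = len(frames)
    out = []
    for t, frame in enumerate(frames):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        deltas = [(b - a) / 2.0 for a, b in zip(prev, nxt)]
        out.append(list(frame) + deltas)
    return out


n_static = sum(STATIC_COUNTS.values())

# Toy input: 5 frames whose static features all equal the frame index.
frames = [[float(t)] * n_static for t in range(5)]
augmented = add_deltas(frames)

print(n_static, len(augmented[0]))  # 102 204
```

With the toy input, each interior frame gets a delta of 1.0 for every feature, which makes the doubling from 102 to 204 dimensions easy to inspect by hand.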