Table 7 Acoustic features used in the A-Hybrid DTW+LVCSR QbE STD system

From: Search on speech from spoken queries: the Multi-domain International ALBAYZIN 2018 Query-by-Example Spoken Term Detection Evaluation

| Description | Number of features |
| --- | --- |
| Sum of auditory spectra | 1 |
| Zero-crossing rate | 1 |
| Sum of RASTA-style filtered auditory spectra | 1 |
| Frame intensity | 1 |
| Frame loudness | 1 |
| Root mean square energy and log-energy | 2 |
| Energy in frequency bands 250–650 Hz (energy 250–650) and 1000–4000 Hz | 2 |
| Spectral roll-off points at 25%, 50%, 75%, and 90% | 4 |
| Spectral flux | 1 |
| Spectral entropy | 1 |
| Spectral variance | 1 |
| Spectral skewness | 1 |
| Spectral kurtosis | 1 |
| Psychoacoustical sharpness | 1 |
| Spectral harmonicity | 1 |
| Spectral flatness | 1 |
| Mel-frequency cepstral coefficients | 16 |
| MFCC filterbank | 26 |
| Line spectral pairs | 8 |
| Cepstral perceptual linear predictive coefficients | 9 |
| RASTA PLP coefficients | 9 |
| Fundamental frequency (F0) | 1 |
| Probability of voicing | 1 |
| Jitter | 2 |
| Shimmer | 1 |
| Log harmonics-to-noise ratio (logHNR) | 1 |
| LPC formant frequencies and bandwidths | 6 |
| Formant frame intensity | 1 |
| Deltas | 102 |
| Total | 204 |

  1. PLP: perceptual linear predictive; LPC: linear predictive coding
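
The totals in Table 7 can be checked with a short tally. The sketch below copies the per-row counts from the table and assumes that the 102 "Deltas" are one first-order delta per static feature, which matches the counts but is not stated in the table itself. The names `STATIC_FEATURES` and `total_dimension` are hypothetical and used only for illustration.

```python
# Static (per-frame) feature counts copied from Table 7.
STATIC_FEATURES = {
    "Sum of auditory spectra": 1,
    "Zero-crossing rate": 1,
    "Sum of RASTA-style filtered auditory spectra": 1,
    "Frame intensity": 1,
    "Frame loudness": 1,
    "Root mean square energy and log-energy": 2,
    "Energy in bands 250-650 Hz and 1000-4000 Hz": 2,
    "Spectral roll-off points (25%, 50%, 75%, 90%)": 4,
    "Spectral flux": 1,
    "Spectral entropy": 1,
    "Spectral variance": 1,
    "Spectral skewness": 1,
    "Spectral kurtosis": 1,
    "Psychoacoustical sharpness": 1,
    "Spectral harmonicity": 1,
    "Spectral flatness": 1,
    "Mel-frequency cepstral coefficients": 16,
    "MFCC filterbank": 26,
    "Line spectral pairs": 8,
    "Cepstral PLP coefficients": 9,
    "RASTA PLP coefficients": 9,
    "Fundamental frequency (F0)": 1,
    "Probability of voicing": 1,
    "Jitter": 2,
    "Shimmer": 1,
    "Log harmonics-to-noise ratio (logHNR)": 1,
    "LPC formant frequencies and bandwidths": 6,
    "Formant frame intensity": 1,
}


def total_dimension(static_counts, with_deltas=True):
    """Return (static, delta, total) feature-vector dimensions.

    Assumption: each static feature contributes exactly one delta
    feature, so the delta block has the same size as the static block.
    """
    static_dim = sum(static_counts.values())
    delta_dim = static_dim if with_deltas else 0
    return static_dim, delta_dim, static_dim + delta_dim


static_dim, delta_dim, total = total_dimension(STATIC_FEATURES)
print(static_dim, delta_dim, total)  # 102 102 204
```

Under that assumption, the 28 static rows sum to 102 features, and adding an equal number of deltas gives the 204-dimensional vector reported in the "Total" row.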