From: An evolutionary feature synthesis approach for content-based audio retrieval
FS | Basic database (6 classes): ANMRR (μ ± σ) | Basic database (6 classes): AP (%) (μ ± σ) | Extended database (12 classes): ANMRR (μ ± σ) | Extended database (12 classes): AP (%) (μ ± σ) |
---|---|---|---|---|
Segment | ||||
STAT | 0.167 ± 0.013 | 81.7 ± 1.5 | 0.425 ± 0.012 | 55.0 ± 1.2 |
MFCC | 0.221 ± 0.021 | 75.5 ± 2.2 | 0.398 ± 0.012 | 57.5 ± 1.2 |
Δ-MFCC | 0.332 ± 0.015 | 64.7 ± 1.5 | 0.511 ± 0.096 | 43.9 ± 3.1 |
ΔΔ-MFCC | 0.313 ± 0.016 | 66.7 ± 1.6 | 0.526 ± 0.033 | 45.0 ± 3.2 |
LPC | 0.524 ± 0.016 | 46.1 ± 1.5 | 0.675 ± 0.008 | 31.1 ± 0.8 |
LPCC | 0.556 ± 0.010 | 42.8 ± 1.0 | 0.720 ± 0.017 | 26.7 ± 1.6 |
S_AUDIO | 0.171 ± 0.011 | 81.5 ± 1.2 | 0.442 ± 0.029 | 53.4 ± 2.9 |
Key-frame | ||||
MFCC + deltas | 0.371 ± 0.012 | 60.3 ± 1.2 | 0.470 ± 0.012 | 50.5 ± 1.1 |
LPC + LPCC | 0.634 ± 0.013 | 35.3 ± 1.2 | 0.775 ± 0.011 | 21.6 ± 1.0 |
K_AUDIO | 0.289 ± 0.029 | 68.7 ± 2.8 | 0.441 ± 0.009 | 53.2 ± 0.9 |