From: An evolutionary feature synthesis approach for content-based audio retrieval
FS | Basic database (6 classes): ANMRR (μ ± σ) | Basic database (6 classes): AP (%) (μ ± σ) | Extended database (12 classes): ANMRR (μ ± σ) | Extended database (12 classes): AP (%) (μ ± σ) |
---|---|---|---|---|
Segment | | | | |
STAT | 0.278 ± 0.016 | 70.4 ± 1.7 | 0.534 ± 0.024 | 44.1 ± 2.4 |
MFCC | 0.280 ± 0.022 | 69.6 ± 2.1 | 0.528 ± 0.023 | 44.9 ± 2.2 |
Δ-MFCC | 0.375 ± 0.016 | 60.6 ± 1.6 | 0.574 ± 0.011 | 40.6 ± 1.0 |
ΔΔ-MFCC | 0.393 ± 0.013 | 59.0 ± 1.3 | 0.598 ± 0.020 | 38.3 ± 2.0 |
LPC | 0.549 ± 0.006 | 43.8 ± 0.6 | 0.729 ± 0.013 | 25.7 ± 1.2 |
LPCC | 0.589 ± 0.015 | 39.6 ± 1.5 | 0.740 ± 0.006 | 24.8 ± 0.5 |
S_AUDIO | 0.236 ± 0.015 | 74.7 ± 1.6 | 0.480 ± 0.013 | 49.9 ± 1.2 |
Key-frame | | | | |
MFCC + deltas | 0.463 ± 0.024 | 51.4 ± 2.3 | 0.629 ± 0.030 | 35.2 ± 2.8 |
LPC + LPCC | 0.709 ± 0.007 | 28.4 ± 0.7 | 0.813 ± 0.003 | 18.0 ± 0.3 |
K_AUDIO | 0.346 ± 0.016 | 63.0 ± 1.5 | 0.532 ± 0.012 | 44.4 ± 1.2 |