From: An Adaptive Framework for Acoustic Monitoring of Potential Hazards
Classification problem | No. of mixtures | Feature set | Recognition rate (%) |
---|---|---|---|
Vocalic versus non-vocalic sound events (subway environment) | 64 | MFCC+dMFCC | 100 |
Vocalic versus non-vocalic sound events (urban environment) | 128 | MFCC+dMFCC | 99.85 |
Vocalic versus non-vocalic sound events (military environment) | 128 | MFCC+dMFCC+MPEG-7 LLDs | 100 |
Typical versus atypical non-vocalic sound events (subway environment) | 128 | MFCC+dMFCC+MPEG-7 LLDs | 97.2 (87.6) |
Typical versus atypical non-vocalic sound events (urban environment) | 128 | MFCC+dMFCC+MPEG-7 LLDs | 92.95 (88.2) |
Typical versus atypical non-vocalic sound events (military environment) | 32 | MFCC+dMFCC+MPEG-7 LLDs | 100 (91.6) |
Explosion versus gunshot sound events | 512 | MFCC+dMFCC+MPEG-7 LLDs | 83.9 (76.4) |
Normal versus screamed speech | 128 | MFCC+dMFCC+intonation+CB-TEO-auto-Env | 100 (89.1) |