Table 2 Recognition rates achieved at each stage of the system's topology for different kinds of environments. The recognition rate without the additional feature extraction stage is shown in parentheses for comparison.

From: An Adaptive Framework for Acoustic Monitoring of Potential Hazards

| Classification problem | No. of mixtures | Feature set | Recognition rate (%) |
|---|---|---|---|
| Vocalic versus non-vocalic sound events (subway environment) | 64 | MFCC+dMFCC | 100 |
| Vocalic versus non-vocalic sound events (urban environment) | 128 | MFCC+dMFCC | 99.85 |
| Vocalic versus non-vocalic sound events (military environment) | 128 | MFCC+dMFCC+MPEG-7 LLDs | 100 |
| Typical versus atypical non-vocalic sound events (subway environment) | 128 | MFCC+dMFCC+MPEG-7 LLDs | 97.2 (87.6) |
| Typical versus atypical non-vocalic sound events (urban environment) | 128 | MFCC+dMFCC+MPEG-7 LLDs | 92.95 (88.2) |
| Typical versus atypical non-vocalic sound events (military environment) | 32 | MFCC+dMFCC+MPEG-7 LLDs | 100 (91.6) |
| Explosion versus gunshot sound events | 512 | MFCC+dMFCC+MPEG-7 LLDs | 83.9 (76.4) |
| Normal versus screamed speech | 128 | MFCC+dMFCC+intonation+CB-TEO-auto-Env | 100 (89.1) |
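
The parenthesized values in the table can be compared with the main recognition rates by simple subtraction. The following is a minimal sketch of that arithmetic and is not part of the original paper; the helper names `parse_rate` and `gain` and the shortened row labels are illustrative assumptions, and only the numbers are taken from the table.

```python
import re

# Recognition-rate cells copied from Table 2. The number in parentheses is the
# rate obtained without the additional feature extraction stage.
# The dictionary keys are shortened labels chosen for this sketch.
CELLS = {
    "Typical vs. atypical (subway)": "97.2 (87.6)",
    "Typical vs. atypical (urban)": "92.95 (88.2)",
    "Typical vs. atypical (military)": "100 (91.6)",
    "Explosion vs. gunshot": "83.9 (76.4)",
    "Normal vs. screamed speech": "100 (89.1)",
}


def parse_rate(cell):
    """Return (rate, baseline) for a cell such as '97.2 (87.6)'.

    baseline is None when the cell has no parenthesized value.
    """
    match = re.fullmatch(r"\s*([\d.]+)\s*(?:\(([\d.]+)\))?\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    rate = float(match.group(1))
    baseline = float(match.group(2)) if match.group(2) else None
    return rate, baseline


def gain(cell):
    """Absolute difference in percentage points between rate and baseline."""
    rate, baseline = parse_rate(cell)
    if baseline is None:
        return None
    # Round to two decimals to hide floating-point noise.
    return round(rate - baseline, 2)


if __name__ == "__main__":
    for name, cell in CELLS.items():
        print(f"{name}: +{gain(cell)} percentage points")
```

The differences are absolute percentage points, not relative improvements. Rows without a parenthesized value return `None`.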