Table 6 The results of MIL and source-separation-based methods

From: Frequency-dependent auto-pooling function for weakly supervised sound event detection

Method                                   Parameters  Audio tagging            Sound event detection    Error rate
                                         (10 k)      F-score  AUC    mAP      F-score  AUC    mAP      ER     D      I
MIL                      Attention [20]  54.15       0.671    0.923  0.723    0.341    0.861  0.348    1.574  0.885  0.689
                         TALNet [19]     94.06       0.646    0.911  0.687    0.397    0.849  0.390    1.339  0.865  0.474
Source-separation-based  VGG-GWRP [24]   58.76       0.572    0.923  0.635    0.429    0.803  0.372    1.991  0.780  1.210
                         VGG-AP          58.76       0.538    0.909  0.639    0.352    0.823  0.362    1.886  0.844  1.061
                         VGG-FAP         58.76       0.590    0.923  0.672    0.407    0.848  0.385    1.776  0.823  0.952
                         DDC-GWRP        28.84       0.626    0.931  0.689    0.468    0.808  0.404    1.850  0.813  1.037
                         DDC-AP          28.84       0.573    0.919  0.684    0.382    0.845  0.398    1.831  0.853  0.978
                         DDC-FAP         29.10       0.633    0.931  0.719    0.446    0.868  0.427    1.689  0.845  0.844