Table 11 Top-10 positive distractor events for speech (event labels related to false positive decisions of the network about the “Speech” class)

From: Exploring convolutional, recurrent, and hybrid deep neural networks for speech and music detection in a large audio dataset

Event              Event ID      Ratio     \(d^{+}_{sp}\)
Crowd              /m/03qtwd     76/94     0.521
Insect             /m/03vt0      67/111    0.411
Water              /m/0838f      76/137    0.403
Sizzle             /m/07p9k1k    35/40     0.381
Battle cry         /m/04gy_2     36/44     0.376
Fowl               /m/025rv6n    64/119    0.375
Cheering           /m/053hz1     36/45     0.372
Stir               /m/07ptfmf    32/36     0.364
Children shouting  /t/dd00135    33/39     0.363
Mechanisms         /t/dd00077    42/67     0.353
  1. The \(d^{+}_{sp}\) score (Eq. 13) is used to rank the events. The Ratio column gives the number of speech false positives in which the distractor event label is present (numerator) over the number of non-speech segments that contain the distractor event (denominator).
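
To make the Ratio column concrete, the following minimal Python sketch recomputes the plain fraction numerator/denominator for each row. The counts are copied from the table; the names `ROWS`, `raw_rates`, and `by_raw_rate` are illustrative and not from the paper. The sketch does not compute \(d^{+}_{sp}\): Eq. 13 is not reproduced here, and the raw fraction is not a substitute for it, since sorting by the raw fraction gives a different order from the table's ranking.

```python
# Counts copied from the Ratio column of Table 11:
# (event, speech false positives containing the label,
#  non-speech segments containing the label)
ROWS = [
    ("Crowd", 76, 94),
    ("Insect", 67, 111),
    ("Water", 76, 137),
    ("Sizzle", 35, 40),
    ("Battle cry", 36, 44),
    ("Fowl", 64, 119),
    ("Cheering", 36, 45),
    ("Stir", 32, 36),
    ("Children shouting", 33, 39),
    ("Mechanisms", 42, 67),
]

# Raw fraction of non-speech segments with this label that were
# misclassified as speech. This is NOT the d+_sp score of Eq. 13.
raw_rates = {event: fp / total for event, fp, total in ROWS}

# Events ordered by the raw fraction, highest first (illustrative only).
by_raw_rate = sorted(raw_rates, key=raw_rates.get, reverse=True)

for event in by_raw_rate:
    print(f"{event:<18} {raw_rates[event]:.3f}")
```

For example, Sizzle has a raw fraction of 35/40 = 0.875 but ranks fourth by \(d^{+}_{sp}\), while Crowd (76/94, roughly 0.809) ranks first.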