Table 11 Top-10 positive distractor events for speech (event labels related to false positive decisions of the network about the “Speech” class)

From: Exploring convolutional, recurrent, and hybrid deep neural networks for speech and music detection in a large audio dataset

| Event | Event ID | Ratio | \(d^{+}_{sp}\) |
|---|---|---|---|
| Crowd | /m/03qtwd | 76/94 | 0.521 |
| Insect | /m/03vt0 | 67/111 | 0.411 |
| Water | /m/0838f | 76/137 | 0.403 |
| Sizzle | /m/07p9k1k | 35/40 | 0.381 |
| Battle cry | /m/04gy_2 | 36/44 | 0.376 |
| Fowl | /m/025rv6n | 64/119 | 0.375 |
| Cheering | /m/053hz1 | 36/45 | 0.372 |
| Stir | /m/07ptfmf | 32/36 | 0.364 |
| Children shouting | /t/dd00135 | 33/39 | 0.363 |
| Mechanisms | /t/dd00077 | 42/67 | 0.353 |

Events are ranked by the \(d^{+}_{sp}\) score (Eq. 13). The Ratio column gives the number of speech false positives in which the distractor event label is present (numerator) over the number of non-speech segments that contain the distractor event (denominator).
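As a reading aid, the Ratio column can be evaluated as a plain fraction and compared with the \(d^{+}_{sp}\) ranking. The sketch below is only illustrative: the rows are copied from the table, the helper name `false_positive_fraction` is hypothetical, and the fraction it computes is not the \(d^{+}_{sp}\) score, which is defined by Eq. 13 and is not reproduced here.

```python
# Rows copied from the table: (event, ratio string, d+_sp score).
rows = [
    ("Crowd", "76/94", 0.521),
    ("Insect", "67/111", 0.411),
    ("Water", "76/137", 0.403),
    ("Sizzle", "35/40", 0.381),
    ("Battle cry", "36/44", 0.376),
    ("Fowl", "64/119", 0.375),
    ("Cheering", "36/45", 0.372),
    ("Stir", "32/36", 0.364),
    ("Children shouting", "33/39", 0.363),
    ("Mechanisms", "42/67", 0.353),
]


def false_positive_fraction(ratio: str) -> float:
    """Hypothetical helper: turn 'numerator/denominator' into a float.

    Numerator: speech false positives carrying the distractor label.
    Denominator: non-speech segments that contain the distractor event.
    """
    numerator, denominator = (int(part) for part in ratio.split("/"))
    return numerator / denominator


# Re-rank the same ten events by the raw fraction instead of d+_sp.
by_fraction = sorted(
    rows, key=lambda row: false_positive_fraction(row[1]), reverse=True
)

for event, ratio, d_plus in by_fraction:
    fraction = false_positive_fraction(ratio)
    print(f"{event:18s} {ratio:>7s}  fraction={fraction:.3f}  d+_sp={d_plus:.3f}")
```

Sorting by the raw fraction puts "Stir" (32/36) first rather than "Crowd", which shows that the \(d^{+}_{sp}\) score is not simply the value of the Ratio column.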