Table 10 Top 10 negative distractor events for speech (event labels related to false negative decisions of the network about the “speech” class)

From: Exploring convolutional, recurrent, and hybrid deep neural networks for speech and music detection in a large audio dataset

| Event | Event ID | Ratio | \(d^{-}_{sp}\) |
| --- | --- | --- | --- |
| Whispering | /m/02rtxlg | 24/30 | 0.301 |
| Male singing | /t/dd00003 | 22/52 | 0.216 |
| Musical instrument | /m/04szw | 66/293 | 0.193 |
| Female singing | /t/dd00004 | 19/50 | 0.191 |
| Singing | /m/015lz1 | 17/45 | 0.179 |
| Violin, fiddle | /m/07y_7 | 13/23 | 0.179 |
| Music | /m/04rlf | 810/5636 | 0.143 |
| Disco | /m/026z9 | 10/23 | 0.137 |
| Bass guitar | /m/018vs | 10/23 | 0.137 |
| Guitar | /m/0342h | 34/204 | 0.134 |

  1. Events are ranked by the \(d^{-}_{sp}\) score (Eq. 12). The Ratio column gives, for each distractor event, the number of speech false negatives in which that event label is present (numerator) over the number of speech segments that contain that event (denominator)
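
To make the Ratio column concrete, the following minimal sketch evaluates each ratio as a fraction and compares the resulting order with the order given by the \(d^{-}_{sp}\) column. The rows are copied from the table; the helper name `miss_rate` is hypothetical and is not taken from the paper. Eq. 12 is not reproduced here, so the sketch does not attempt to recompute \(d^{-}_{sp}\); it only shows that the raw ratio and the score order the events differently.

```python
# Rows copied from Table 10:
# (event, speech false negatives with the label, speech segments with the label, d score)
rows = [
    ("Whispering", 24, 30, 0.301),
    ("Male singing", 22, 52, 0.216),
    ("Musical instrument", 66, 293, 0.193),
    ("Female singing", 19, 50, 0.191),
    ("Singing", 17, 45, 0.179),
    ("Violin, fiddle", 13, 23, 0.179),
    ("Music", 810, 5636, 0.143),
    ("Disco", 10, 23, 0.137),
    ("Bass guitar", 10, 23, 0.137),
    ("Guitar", 34, 204, 0.134),
]


def miss_rate(false_negatives, segments):
    """Hypothetical helper: the Ratio column evaluated as a fraction."""
    return false_negatives / segments


# Order by the raw ratio (largest first); ties keep their table order.
by_ratio = sorted(rows, key=lambda r: miss_rate(r[1], r[2]), reverse=True)

# Order by the published score column (largest first); this is the table order.
by_score = sorted(rows, key=lambda r: r[3], reverse=True)

for event, fn, total, score in by_ratio:
    print(f"{event:20s} ratio={miss_rate(fn, total):.3f} score={score:.3f}")
```

"Whispering" tops both orderings, but "Music" drops to last place by raw ratio even though it ranks seventh by score, so the score should not be read as a plain miss rate.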