Table 3. Event detection performance with various components of the proposed model. The standard deviations are computed from 5 iterations of each model.

From: Multi-rate modulation encoding via unsupervised learning for audio event detection

 

| Model | Condition | PSDS1 (avg \(\pm\) std) | PSDS2 (avg \(\pm\) std) |
|---|---|---|---|
| 1\(\times\)CRNN | \(f_c\) = 0.8 Hz | 0.361 ± 0.007 | 0.601 ± 0.007 |
| | \(f_c\) = 2.4 Hz | 0.371 ± 0.006 | 0.603 ± 0.002 |
| | \(f_c\) = 4 Hz | 0.365 ± 0.005 | 0.593 ± 0.008 |
| | VAE | 0.365 ± 0.003 | 0.605 ± 0.004 |
| 2\(\times\)CRNN | \(f_{c1}\) = 0.8 Hz, \(f_{c2}\) = 2.4 Hz | 0.368 ± 0.004 | 0.612 ± 0.007 |
| | \(f_{c1}\) = 0.8 Hz, \(f_{c2}\) = 4 Hz | 0.365 ± 0.003 | 0.594 ± 0.003 |
| | \(f_{c1}\) = 2.4 Hz, \(f_{c2}\) = 4 Hz | 0.374 ± 0.009 | 0.611 ± 0.009 |
| 3\(\times\)CRNN | Low-pass only | 0.373 ± 0.007 | 0.607 ± 0.005 |
| | High-pass only | 0.376 ± 0.007 | 0.613 ± 0.008 |
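
Each "avg ± std" entry above summarizes 5 iterations of one model configuration. A minimal sketch of how such an entry could be computed is given below. The per-run scores are hypothetical placeholders, not values from the paper, and the helper name `summarize_runs` is an illustrative assumption. The source does not state whether the sample or the population standard deviation was used; the sketch assumes the sample version.

```python
from statistics import mean, stdev


def summarize_runs(scores):
    """Return (average, sample standard deviation) of per-run scores.

    Assumption: sample standard deviation (n - 1 denominator). Swap in
    statistics.pstdev for the population version if that is what was used.
    """
    return mean(scores), stdev(scores)


# Hypothetical PSDS scores from 5 runs of one configuration
# (placeholder numbers, not taken from the paper).
runs = [0.36, 0.37, 0.38, 0.37, 0.37]

avg, std = summarize_runs(runs)
print(f"{avg:.3f} ± {std:.3f}")
```

With only 5 runs, the sample and population estimates differ noticeably (by a factor of about 1.12), so it is worth stating which one a reported value uses.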