
Table 1 Performance of CRNNs. All results are given in macro-average F1. Tamp: amplitude threshold; lr: learning rate

From: Towards cross-modal pre-training and learning tempo-spatial characteristics for audio recognition with convolutional and recurrent neural networks

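| System | lr | Batch size | Tamp [dB] | Devel | Test |
|---|---|---|---|---|---|
| CRNN | 0.001 | 128 | −30 | 66.5 | |
| CRNN | 0.001 | 128 | −45 | 57.8 | |
| CRNN | 0.001 | 64 | −30 | 55.4 | |
| CRNN | 0.01 | 128 | −30 | 66.2 | |
| CRNN | 0.01 | 128 | −60 | 69.2 | |
| CRNN | 0.01 | 64 | −30 | 70.7 | 72.3 |
| CRNN | 0.01 | 64 | −45 | 73.7 | 79.3 |
| CRNN | 0.01 | 64 | −60 | 78.8 | 74.2 |
| Fusion of best 3 CRNNs | | | | 81.4 | 82.2 |
| DCASE 2018, task 5 baseline [8] | 0.0001 | 256 | | 84.5 | 83.1 |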
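
For readers unfamiliar with the metric, macro-average F1 is the unweighted mean of the per-class F1 scores, so every class counts equally regardless of how many samples it has. The following is a minimal sketch of that definition only; the function name `macro_f1` and the toy labels are hypothetical and do not come from the article, whose evaluation code is not shown here. The function returns a fraction in [0, 1].

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # class p was predicted but is wrong
            fn[t] += 1   # class t was missed
    scores = []
    for c in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (hypothetical, not data from the article)
y_true = ["a", "a", "b", "c"]
y_pred = ["a", "b", "b", "c"]
score = macro_f1(y_true, y_pred)
```

In this toy case the per-class F1 scores are 2/3, 2/3 and 1, so the macro average is 7/9.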