Table 5 Music event detection results with different network architectures

From: Exploring convolutional, recurrent, and hybrid deep neural networks for speech and music detection in a large audio dataset

Model     L   N      p      Train           Validation      Test
                            Cost    Acc.%   Cost    Acc.%   Cost    Acc.%
FConn     4   2048   7.15   0.518   74.73   0.552   72.50   0.554   72.74
CNN3x3    7   256    6.60   0.362   85.28   0.386   84.14   0.396   83.51
CNN7x7    6   128    6.69   0.355   85.46   0.379   84.19   0.379   84.20
LSTM      3   32     4.57   0.559   72.39   0.553   72.98   0.554   72.65
C1-LSTM   3   256    6.40   0.431   81.08   0.466   79.48   0.460   79.75
C2-LSTM   6   128    6.00   0.333   86.61   0.383   84.34   0.380   84.49
The Model column gives the network architecture; L is the number of hidden layers and N is the number of nodes in each layer (the detailed role of these parameters in each architecture is described in Section 3.3). p is the base-10 logarithm of the number of parameters. The value of the cost (loss) function and the classification accuracy are reported for the training, validation, and test subsets. The best model in terms of validation cost is highlighted in italics in the original table; among the values listed here, the lowest validation cost is 0.379 (CNN7x7).
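
Because p is a base-10 logarithm, the approximate parameter count of each model can be recovered as 10^p. The following is a minimal sketch of that arithmetic, not code from the paper: the dictionary `table5` and the helper `param_count` are hypothetical names introduced here, and the numbers are copied from the table (p and validation cost only). Since p is rounded to two decimals, the recovered counts are only approximate.

```python
# Values copied from Table 5: model -> (p, validation cost).
# `table5` and `param_count` are illustrative names, not from the paper.
table5 = {
    "FConn": (7.15, 0.552),
    "CNN3x3": (6.60, 0.386),
    "CNN7x7": (6.69, 0.379),
    "LSTM": (4.57, 0.553),
    "C1-LSTM": (6.40, 0.466),
    "C2-LSTM": (6.00, 0.383),
}


def param_count(p):
    """Invert p = log10(number of parameters)."""
    return 10 ** p


# Approximate parameter count per model (p is rounded, so this is a rough figure).
approx_params = {name: param_count(p) for name, (p, _) in table5.items()}

# Model with the lowest validation cost, the selection criterion named in the note.
best_by_val_cost = min(table5, key=lambda name: table5[name][1])

for name, n in approx_params.items():
    print(f"{name:8s} ~{n:,.0f} parameters")
print("Lowest validation cost:", best_by_val_cost)
```

Read this way, the fully connected network (p = 7.15) has roughly 14 million parameters, while the plain LSTM (p = 4.57) has on the order of 37 thousand, so differences of one unit in p correspond to a tenfold difference in model size.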