From: Music detection from broadcast contents using convolutional neural networks with a Mel-scale kernel
| Test data | Model type | F-score (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|
| Korean drama (dev) | Spectrogram + CNN with melCL (proposed) | 95.9 | 95.9 | 96.0 |
| | Spectrogram + CNN | 92.2 | 94.0 | 90.5 |
| | Mel-spectrogram + CNN | 94.2 | 95.7 | 92.8 |
| | Spectrogram + bi-GRU | 88.0 | 87.0 | 89.0 |
| | Mel-spectrogram + bi-GRU | 93.4 | 91.9 | 95.0 |
| | Mel-spectrogram + bi-LSTM | 90.6 | 90.1 | 91.1 |
| Korean reality | Spectrogram + CNN with melCL (proposed) | 94.7 | 93.0 | 96.4 |
| | Spectrogram + CNN | 90.7 | 91.4 | 89.9 |
| | Mel-spectrogram + CNN | 93.5 | 91.1 | 95.9 |
| | Spectrogram + bi-GRU | 90.6 | 84.9 | 97.2 |
| | Mel-spectrogram + bi-GRU | 92.3 | 88.5 | 87.8 |
| | Mel-spectrogram + bi-LSTM | 92.6 | 87.5 | 98.4 |
| British 8 h | Spectrogram + CNN with melCL (proposed) | 86.5 | 85.3 | 87.8 |
| | Spectrogram + CNN | 83.5 | 79.8 | 87.5 |
| | Mel-spectrogram + CNN | 86.8 | 83.3 | 90.5 |
| | Spectrogram + bi-GRU | 75.0 | 65.7 | 87.4 |
| | Mel-spectrogram + bi-GRU | 78.5 | 67.8 | 93.1 |
| | Mel-spectrogram + bi-LSTM | 80.5 | 72.5 | 90.5 |
| Spanish 12 h | Spectrogram + CNN with melCL (proposed) | 88.9 | 84.7 | 93.4 |
| | Spectrogram + CNN | 86.6 | 80.0 | 94.4 |
| | Mel-spectrogram + CNN | 80.9 | 70.6 | 94.6 |
| | Spectrogram + bi-GRU | 75.3 | 63.8 | 92.0 |
| | Mel-spectrogram + bi-GRU | 74.1 | 61.5 | 93.2 |
| | Mel-spectrogram + bi-LSTM | 75.6 | 63.4 | 93.6 |
| MIREX 2015 | Spectrogram + CNN with melCL (proposed) | 95.3 | 99.4 | 91.6 |
| | Spectrogram + CNN | 93.8 | 98.8 | 89.3 |
| | Mel-spectrogram + CNN | 92.5 | 93.8 | 91.2 |
| | Spectrogram + bi-GRU | 92.8 | 94.9 | 90.8 |
| | Mel-spectrogram + bi-GRU | 94.3 | 92.3 | 96.4 |
| | Mel-spectrogram + bi-LSTM | 95.3 | 94.1 | 92.7 |
| Dafx 07 | Spectrogram + CNN with melCL (proposed) | 84.9 | 84.0 | 85.9 |
| | Spectrogram + CNN | 84.4 | 77.7 | 92.3 |
| | Mel-spectrogram + CNN | 80.1 | 69.2 | 95.1 |
| | Spectrogram + bi-GRU | 68.4 | 57.5 | 84.5 |
| | Mel-spectrogram + bi-GRU | 69.0 | 53.3 | 98.0 |
| | Mel-spectrogram + bi-LSTM | 70.6 | 55.4 | 97.3 |
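
The three metric columns are related, so a row can be cross-checked from its precision and recall. The sketch below assumes the reported F-score is the balanced F1 measure (the harmonic mean of precision and recall); the table itself does not state which F-measure was used, and the function name `f_score` is illustrative only.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1).

    Assumption: the table's F-score is F1. Inputs and output are in percent.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Korean drama (dev), proposed model: precision 95.9, recall 96.0
print(round(f_score(95.9, 96.0), 1))  # 95.9, matching the reported F-score
```

Under the F1 assumption the result always lies between precision and recall, which gives a quick plausibility check for any row.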