TY  - JOUR
AU  - Jang, Byeong-Yong
AU  - Heo, Woon-Haeng
AU  - Kim, Jung-Hyun
AU  - Kwon, Oh-Wook
PY  - 2019
DA  - 2019/06/26
TI  - Music detection from broadcast contents using convolutional neural networks with a Mel-scale kernel
JO  - EURASIP Journal on Audio, Speech, and Music Processing
SP  - 11
VL  - 2019
IS  - 1
AB  - We propose a new method for music detection from broadcasting contents using the convolutional neural networks with a Mel-scale kernel. In this detection task, music segments should be annotated from the broadcast data, where music, speech, and noise are mixed. The convolutional neural network is composed of a convolutional layer with kernel that is trained to extract robust features. The Mel-scale changes the kernel size, and the backpropagation algorithm trains the kernel shape. We used 52 h of mixed broadcast data (25 h of music) to train the convolutional network and 24 h of collected broadcast data (ratio of music of 50–76%) for testing. The test data consisted of various genres (drama, documentary, news, kids, reality, and so on) that are broadcast in British English, Spanish, and Korean languages. The proposed method consistently showed better performance in all the three languages than the baseline system, and the F-score ranged from 86.5% for British data to 95.9% for Korean drama data. Our music detection system takes about 28 s to process a 1-min signal using only one CPU with 4 cores.
SN  - 1687-4722
UR  - https://doi.org/10.1186/s13636-019-0155-y
DO  - 10.1186/s13636-019-0155-y
ID  - Jang2019
ER  - 