Fig. 10

From: Exploring convolutional, recurrent, and hybrid deep neural networks for speech and music detection in a large audio dataset

Two-class confusion matrices of the best network setting for speech and music event detection (C2-LSTM model with L=6 and N=256). The percentages indicate the total amount of test segments in each possible real class-predicted class combination.