Fig. 9

From: Exploring convolutional, recurrent, and hybrid deep neural networks for speech and music detection in a large audio dataset

Four-class confusion matrix of the best network setting for speech and music event detection (C2-LSTM model with L=6 and N=256). The percentages indicate the share of all test segments that falls in each real class/predicted class combination.
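
The caption's normalization (each cell as a percentage of all test segments, rather than of its row) can be illustrated with a minimal sketch. The class names, the integer label encoding, and the toy labels below are assumptions made for illustration only; they are not taken from the article and do not reproduce its results.

```python
import numpy as np

# Hypothetical class names: the caption only states that there are four classes.
CLASSES = ["neither", "speech", "music", "speech+music"]


def confusion_percentages(y_true, y_pred, n_classes=4):
    """Confusion matrix whose cells are percentages of ALL segments.

    Rows index the real class, columns the predicted class, so the
    whole matrix (not each row) sums to 100.
    """
    counts = np.zeros((n_classes, n_classes), dtype=np.int64)
    # np.add.at accumulates correctly when the same (row, col) pair repeats.
    np.add.at(counts, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return 100.0 * counts / counts.sum()


# Toy labels (invented): eight segments, encoded as indices into CLASSES.
y_true = [0, 1, 1, 2, 3, 3, 3, 2]
y_pred = [0, 1, 3, 2, 3, 3, 1, 2]

pct = confusion_percentages(y_true, y_pred)
accuracy = np.trace(pct)  # diagonal cells are the correctly classified share
```

With this normalization the diagonal sums directly to the overall accuracy in percent, whereas a row-normalized matrix would instead show per-class recall on the diagonal.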