Fig. 5 | EURASIP Journal on Audio, Speech, and Music Processing

Fig. 5

From: A large TV dataset for speech and music activity detection

Mean error rate (lower is better) across all datasets, computed as defined in the sed_eval toolbox. The models CRNN-P-Cue and CRNN-P-Pseu are selected for comparison. TVSM-test (music) and Muspeak (music) are the music evaluations, while TVSM-test (speech) is the speech evaluation. Each of the other test datasets contains only speech labels or only music labels, as described in Section 3.
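
For readers unfamiliar with the metric, the following is a minimal sketch of the segment-based error rate as sed_eval defines it: per segment, substitutions S = min(FN, FP), deletions D = max(0, FN - FP), insertions I = max(0, FP - FN), and ER = (S + D + I) / N, where N is the number of active reference labels. The function name `segment_error_rate` and the binary activity-matrix input are assumptions made for illustration; this is not the sed_eval API, and the segment length and cross-dataset averaging used for the figure are not stated in the caption.

```python
from typing import Sequence


def segment_error_rate(
    reference: Sequence[Sequence[int]],
    estimate: Sequence[Sequence[int]],
) -> float:
    """Segment-based error rate, ER = (S + D + I) / N.

    reference[k][c] and estimate[k][c] are 1 if class c is active in
    segment k, else 0 (hypothetical input layout, chosen for illustration).
    """
    substitutions = deletions = insertions = n_reference = 0
    for ref_seg, est_seg in zip(reference, estimate):
        # Per-segment false negatives and false positives across classes.
        fn = sum(1 for r, e in zip(ref_seg, est_seg) if r and not e)
        fp = sum(1 for r, e in zip(ref_seg, est_seg) if e and not r)
        # A miss paired with a false alarm counts as one substitution.
        substitutions += min(fn, fp)
        deletions += max(0, fn - fp)
        insertions += max(0, fp - fn)
        n_reference += sum(1 for r in ref_seg if r)
    if n_reference == 0:
        raise ValueError("reference contains no active labels")
    return (substitutions + deletions + insertions) / n_reference


# Toy example with two classes and four segments (made-up values).
reference = [[1, 0], [1, 1], [0, 0], [0, 1]]
estimate = [[1, 0], [1, 0], [0, 1], [1, 0]]
er = segment_error_rate(reference, estimate)
```

When a single class is scored on its own, a segment cannot hold both a false negative and a false positive, so the substitution term is always zero and the error rate reduces to (D + I) / N. Note that the error rate can exceed 1 when insertions are frequent.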
