Table 4 F-measures for segment-level evaluation on speech detection

From: A large TV dataset for speech and music activity detection

| Model | Arch. | Training data | PCEN | Muspeak | AVASpeech | TVSM-test |
| --- | --- | --- | --- | --- | --- | --- |
| Third-party method (T1) | CNN | | | 0.94 | 0.79 | 0.84 |
| Third-party method (T2) | CRNN | | | 0.97 | 0.77 | 0.81 |
| TCN-Cue | TCN | TVSM-cuesheet | | 0.60 | 0.86 | 0.90 |
| TCN-P-Cue | TCN | TVSM-cuesheet | ✓ | 0.61 | 0.86 | 0.89 |
| TCN-P-Pseu | TCN | TVSM-pseudo | ✓ | 0.60 | 0.88 | 0.91 |
| CRNN-Cue | CRNN | TVSM-cuesheet | | 0.63 | 0.86 | 0.91 |
| CRNN-P-Cue | CRNN | TVSM-cuesheet | ✓ | 0.63 | 0.86 | 0.91 |
| CRNN-P-Pseu | CRNN | TVSM-pseudo | ✓ | 0.67 | 0.88 | 0.91 |
  1. The highest result on each evaluation dataset is marked in boldface.
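
For readers unfamiliar with the metric, the following is a minimal sketch of how a segment-level F-measure for speech detection can be computed. It assumes the reference annotation and the model output have already been reduced to one binary speech/non-speech decision per fixed-length segment. The function name `segment_f_measure`, the example labels, and the choice to return 0.0 when there are no true positives are illustrative assumptions, not the evaluation code behind the table.

```python
from typing import Sequence


def segment_f_measure(reference: Sequence[bool], estimate: Sequence[bool]) -> float:
    """F-measure over per-segment speech decisions (hypothetical helper)."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must cover the same segments")

    pairs = list(zip(reference, estimate))
    # Count segments by agreement between reference and estimate.
    true_pos = sum(1 for r, e in pairs if r and e)
    false_pos = sum(1 for r, e in pairs if not r and e)
    false_neg = sum(1 for r, e in pairs if r and not e)

    # Assumption: report 0.0 when no speech segment is correctly detected.
    if true_pos == 0:
        return 0.0

    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: six segments, True means speech is active.
reference = [True, True, True, False, False, True]
estimate = [True, True, False, False, True, True]
print(round(segment_f_measure(reference, estimate), 2))  # prints 0.75
```

In this toy case there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and the F-measure is 0.75.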