
Table 4 F-measures for segment-level evaluation on speech detection

From: A large TV dataset for speech and music activity detection

 

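| Model | Model Arch. | Training data | PCEN | Muspeak | AVASpeech | TVSM-test |
| --- | --- | --- | --- | --- | --- | --- |
| Third-party method (T1) | CNN | | | 0.94 | 0.79 | 0.84 |
| Third-party method (T2) | CRNN | | | **0.97** | 0.77 | 0.81 |
| TCN-Cue | TCN | TVSM-cuesheet | | 0.60 | 0.86 | 0.90 |
| TCN-P-Cue | TCN | TVSM-cuesheet | ✓ | 0.61 | 0.86 | 0.89 |
| TCN-P-Pseu | TCN | TVSM-pseudo | ✓ | 0.60 | **0.88** | **0.91** |
| CRNN-Cue | CRNN | TVSM-cuesheet | | 0.63 | 0.86 | **0.91** |
| CRNN-P-Cue | CRNN | TVSM-cuesheet | ✓ | 0.63 | 0.86 | **0.91** |
| CRNN-P-Pseu | CRNN | TVSM-pseudo | ✓ | 0.67 | **0.88** | **0.91** |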

  1. The highest result for each evaluation dataset is marked in boldface.
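
For readers unfamiliar with the metric, the following is a minimal sketch of how a segment-level F-measure can be computed for a binary speech detector. It assumes the reference and the prediction have already been cut into the same fixed-length segments and labelled 1 (speech) or 0 (non-speech). The function name `segment_f_measure` and the toy label sequences are illustrative assumptions; the segment length and evaluation tooling used to produce Table 4 are not stated in this excerpt and are not reproduced here.

```python
def segment_f_measure(reference, estimate):
    """F-measure over aligned binary segment labels (1 = speech, 0 = non-speech).

    Hypothetical helper for illustration only; not the evaluation code
    behind the table above.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must cover the same segments")

    pairs = list(zip(reference, estimate))
    tp = sum(1 for r, e in pairs if r == 1 and e == 1)  # speech correctly detected
    fp = sum(1 for r, e in pairs if r == 0 and e == 1)  # false alarms
    fn = sum(1 for r, e in pairs if r == 1 and e == 0)  # missed speech

    if tp == 0:
        # No correct detections: precision or recall is zero (or undefined).
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: eight segments, made-up labels.
reference = [1, 1, 1, 0, 0, 1, 0, 0]
estimate = [1, 1, 0, 0, 1, 1, 0, 0]
score = segment_f_measure(reference, estimate)
print(f"segment-level F-measure: {score:.2f}")
```

In this toy case there are three true positives, one false alarm and one miss, so precision and recall are both 0.75 and the F-measure is 0.75. Scores in the table are read the same way: a value near 1 means few missed or spuriously detected speech segments.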