Table 7 The effect of VAD and speech separation

From: Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit

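| Models | CN-Celeb-T EER | CN-Celeb-T DCF08 | CN-Celeb-T DCF10 | CN-Celeb-T-VAD EER | CN-Celeb-T-VAD DCF08 | CN-Celeb-T-VAD DCF10 |
|---|---|---|---|---|---|---|
| RawNet2 | 17.25% | 0.58 | 0.89 | 16.28% | 0.60 | 0.86 |
| RawNet2* | 17.30% | 0.60 | 0.90 | 16.51% | 0.61 | 0.87 |
| RawNet-MHSA | 15.34% | 0.56 | 0.86 | 15.16% | 0.57 | 0.85 |
| RawNet-all-SA | 15.51% | 0.57 | 0.91 | 15.18% | 0.59 | 0.87 |
| RawNet-origin-SA* | 16.14% | 0.58 | 0.87 | 15.89% | 0.60 | 0.87 |
| RawNet-SA | 15.04% | 0.56 | 0.87 | 14.81% | 0.58 | 0.86 |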

  1. “*” denotes that the network is initialized with the trained RawNet2 parameters
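
The EER columns can be compared directly to see how much each model changes between the two test sets. The following is a minimal sketch, assuming that CN-Celeb-T-VAD denotes the CN-Celeb-T test set after VAD preprocessing. The EER values are copied from the table above, while the names `EER`, `eer_reduction`, `reductions`, and `best_model` are illustrative and do not come from the paper.

```python
# EER values in percent, copied from Table 7 as (CN-Celeb-T, CN-Celeb-T-VAD).
EER = {
    "RawNet2": (17.25, 16.28),
    "RawNet2*": (17.30, 16.51),
    "RawNet-MHSA": (15.34, 15.16),
    "RawNet-all-SA": (15.51, 15.18),
    "RawNet-origin-SA*": (16.14, 15.89),
    "RawNet-SA": (15.04, 14.81),
}


def eer_reduction(model: str) -> float:
    """Absolute EER reduction in percentage points from CN-Celeb-T to CN-Celeb-T-VAD."""
    without_vad, with_vad = EER[model]
    # Round to two decimals to match the table's precision.
    return round(without_vad - with_vad, 2)


# Per-model reduction, and the model with the lowest EER on the VAD test set.
reductions = {model: eer_reduction(model) for model in EER}
best_model = min(EER, key=lambda model: EER[model][1])

if __name__ == "__main__":
    for model, delta in reductions.items():
        print(f"{model}: {delta:.2f} percentage points")
    print("Lowest EER on CN-Celeb-T-VAD:", best_model)
```

Under that assumption, every model has a lower EER on the VAD test set, with the largest absolute reduction for RawNet2. The DCF08 and DCF10 columns do not all move in the same direction, so this comparison covers EER only.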