Table 7 The effect of VAD and speech separation

From: Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit

Models              CN-Celeb-T                  CN-Celeb-T-VAD
                    EER      DCF08   DCF10      EER      DCF08   DCF10
RawNet2             17.25%   0.58    0.89       16.28%   0.60    0.86
RawNet2*            17.30%   0.60    0.90       16.51%   0.61    0.87
RawNet-MHSA         15.34%   0.56    0.86       15.16%   0.57    0.85
RawNet-all-SA       15.51%   0.57    0.91       15.18%   0.59    0.87
RawNet-origin-SA*   16.14%   0.58    0.87       15.89%   0.60    0.87
RawNet-SA           15.04%   0.56    0.87       14.81%   0.58    0.86
Note: “*” denotes that the network is initialized with the trained RawNet2 parameters.