Table 9 The influence of different similarity measurements on the recognition performance

From: Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit

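| Models | Similarity | VoxCeleb-E EER | VoxCeleb-E DCF08 | VoxCeleb-E DCF10 | VoxCeleb-H EER | VoxCeleb-H DCF08 | VoxCeleb-H DCF10 | CN-Celeb EER | CN-Celeb DCF08 | CN-Celeb DCF10 |
|---|---|---|---|---|---|---|---|---|---|---|
| RawNet2 | Cosine | 2.57% | 0.14 | 0.52 | 4.89% | 0.24 | 0.64 | 24.27% | 0.78 | 0.97 |
| | PLDA | 3.78% | 0.19 | 0.58 | 6.43% | 0.29 | 0.70 | 27.76% | 0.82 | 1.00 |
| | B-vector | 3.39% | 0.19 | 0.69 | 5.99% | 0.32 | 0.87 | 26.16% | 0.82 | 1.00 |
| RawNet-origin-SA* | Cosine | 2.37% | 0.13 | 0.50 | 4.54% | 0.22 | 0.63 | 23.49% | 0.78 | 0.94 |
| | PLDA | 3.51% | 0.17 | 0.59 | 6.04% | 0.28 | 0.72 | 27.46% | 0.81 | 0.97 |
| | B-vector | 3.17% | 0.18 | 0.69 | 5.60% | 0.29 | 0.84 | 26.24% | 0.81 | 1.00 |
| RawNet-SA | Cosine | 2.54% | 0.14 | 0.47 | 4.52% | 0.22 | 0.65 | 22.24% | 0.76 | 0.94 |
| | PLDA | 3.94% | 0.19 | 0.59 | 6.48% | 0.29 | 0.76 | 24.67% | 0.80 | 0.96 |
| | B-vector | 3.54% | 0.21 | 0.72 | 6.46% | 0.37 | 0.91 | 22.84% | 0.84 | 1.00 |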
  1. “*” denotes that the network is initialized with the trained RawNet2 parameters
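
For readers unfamiliar with the "Cosine" rows and the EER columns, the following is a minimal sketch of how cosine scoring between two speaker embeddings and an equal error rate over a set of trial scores could be computed. It is not the authors' code: the function names `cosine_score` and `equal_error_rate` and the toy scores are assumptions made for illustration, the threshold sweep is a simple approximation, and the PLDA and B-vector back-ends and the DCF metrics are not covered.

```python
import math


def cosine_score(a, b):
    """Cosine similarity between two embedding vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER by sweeping a threshold over every observed score.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where false-accept and false-reject rates are closest,
    and is reported as their average at that point.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False rejects: same-speaker trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False accepts: different-speaker trials scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy scores, invented for illustration only (not from the paper).
    same_speaker = [0.9, 0.8, 0.7, 0.4]
    different_speaker = [0.5, 0.3, 0.2, 0.1]
    print(f"EER: {equal_error_rate(same_speaker, different_speaker):.2%}")
```

In this sketch a lower EER means better verification performance, which is how the EER columns in the table are read.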