Table 6 The effect of different feature aggregation methods on recognition performance

From: Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit

Aggregation Method   VoxCeleb-E              VoxCeleb-H              CN-Celeb
                     EER     DCF08   DCF10   EER     DCF08   DCF10   EER     DCF08   DCF10
Average Pooling      3.20%   0.16    0.52    5.35%   0.25    0.65    21.85%  0.73    0.93
Max Pooling          6.66%   0.34    0.80    10.56%  0.47    0.86    23.66%  0.80    0.96
ASP                  5.85%   0.31    0.77    9.41%   0.44    0.85    23.50%  0.79    0.96
SAP                  4.24%   0.22    0.63    6.97%   0.33    0.78    22.40%  0.76    0.94
Ghost VLAD           3.01%   0.16    0.52    5.01%   0.24    0.63    21.32%  0.73    0.93
Bi-GRU               2.80%   0.15    0.52    5.04%   0.25    0.67    22.31%  0.77    0.95
GRU                  2.54%   0.14    0.47    4.52%   0.22    0.65    22.24%  0.76    0.94