Table 6 The effect of different feature aggregation methods on recognition performance

From: Anchor voiceprint recognition in live streaming via RawNet-SA and gated recurrent unit

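| Aggregation Method | VoxCeleb-E EER | VoxCeleb-E DCF08 | VoxCeleb-E DCF10 | VoxCeleb-H EER | VoxCeleb-H DCF08 | VoxCeleb-H DCF10 | CN-Celeb EER | CN-Celeb DCF08 | CN-Celeb DCF10 |
|---|---|---|---|---|---|---|---|---|---|
| Average Pooling | 3.20% | 0.16 | 0.52 | 5.35% | 0.25 | 0.65 | 21.85% | 0.73 | 0.93 |
| Max Pooling | 6.66% | 0.34 | 0.80 | 10.56% | 0.47 | 0.86 | 23.66% | 0.80 | 0.96 |
| ASP | 5.85% | 0.31 | 0.77 | 9.41% | 0.44 | 0.85 | 23.50% | 0.79 | 0.96 |
| SAP | 4.24% | 0.22 | 0.63 | 6.97% | 0.33 | 0.78 | 22.40% | 0.76 | 0.94 |
| Ghost VLAD | 3.01% | 0.16 | 0.52 | 5.01% | 0.24 | 0.63 | 21.32% | 0.73 | 0.93 |
| Bi-GRU | 2.80% | 0.15 | 0.52 | 5.04% | 0.25 | 0.67 | 22.31% | 0.77 | 0.95 |
| GRU | 2.54% | 0.14 | 0.47 | 4.52% | 0.22 | 0.65 | 22.24% | 0.76 | 0.94 |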