EURASIP Journal on Audio, Speech, and Music Processing

Table 14 Summary of results obtained with i-vector systems and different speaker embeddings in the Tagalog and Cantonese subsets of NIST SRE 2016

From: Introducing phonetic information to speaker embedding for speaker verification

System	Tagalog EER (%)	Tagalog minDCF16	Cantonese EER (%)	Cantonese minDCF16	Pooled EER (%)	Pooled minDCF16
i-vector	21.37	0.8901	10.07	0.6564	15.73	0.7861
DNN/i-vector	22.25	0.9059	11.47	0.6950	16.90	0.8127
x-vector	13.60	0.7877	5.33	0.4429	9.46	0.6365
x-vector-pc (c=0)	13.06	0.7794	4.86	0.4207	8.96	0.6214
x-vector-pv (c=0.2)	12.60	0.7593	4.55	0.4127	8.58	0.6045
x-vector-mt (1-layer sharing)	12.82	0.7629	4.36	0.3878	8.59	0.5941
sc-vector (1-layer sharing)	12.92	0.7675	4.00	0.3835	8.44	0.5951
c-vector (c=0.2 + 1-layer sharing)	12.00	0.7451	4.04	0.3629	8.04	0.5692

The pooled results are also shown.
