A robust polynomial regression-based voice activity detector for speaker verification

EURASIP Journal on Audio, Speech, and Music Processing

Table 6 Female speaker verification results of the i-vector method, given as percent EER with minDCF in parentheses, for the proposed algorithm, Drugman’s VAD method [27], and Rangachari’s noise tracking method [21]. The last two columns show the relative EER reduction (in percent) achieved by the proposed algorithm compared to Drugman’s VAD and Rangachari’s method, respectively

Noise type	SNR level (dB)	Proposed algorithm	Drugman’s VAD	Rangachari’s noise tracking	EER reduction compared to Drugman’s (%)	EER reduction compared to Rangachari’s (%)
Lynx	− 10	31.32 (0.58)	41.84 (0.78)	44.52 (0.84)	25.14	29.65
	− 5	20.16 (0.38)	29.44 (0.55)	36.76 (0.86)	31.52	45.15
	0	11.82 (0.22)	15.80 (0.30)	23.93 (0.44)	25.19	50.60
	5	6.81 (0.12)	8.34 (0.15)	14.35 (0.26)	18.34	52.54
	10	4.13 (0.07)	4.85 (0.08)	9.64 (0.18)	14.84	57.15
F16	− 10	38.79 (0.71)	46.26 (0.85)	47.71 (0.88)	16.14	18.69
	− 5	27.99 (0.52)	37.63 (0.70)	42.20 (0.78)	25.61	33.67
	0	17.40 (0.33)	24.43 (0.46)	31.54 (0.59)	28.77	44.83
	5	9.93 (0.18)	11.89 (0.22)	19.29 (0.36)	16.48	48.52
	10	5.87 (0.10)	6.16 (0.11)	12.54 (0.23)	4.70	53.19
Car	− 10	3.62 (0.06)	3.77 (0.06)	6.74 (0.12)	3.97	47.29
	− 5	2.82 (0.05)	3.19 (0.05)	6.23 (0.11)	11.59	54.73
	0	2.75 (0.04)	3.12 (0.05)	6.09 (0.11)	11.86	54.84
	5	2.75 (0.04)	3.04 (0.05)	6.02 (0.11)	9.54	54.32
	10	2.75 (0.04)	3.04 (0.05)	6.09 (0.11)	9.54	54.82
Babble	− 10	33.21 (0.63)	44.81 (0.84)	46.12 (0.84)	25.88	27.99
	− 5	21.68 (0.40)	34.15 (0.63)	40.32 (0.75)	36.51	46.23
	0	12.98 (0.24)	20.08 (0.37)	27.19 (0.51)	35.35	52.26
	5	6.89 (0.12)	9.42 (0.17)	16.75 (0.31)	26.85	58.86
	10	4.06 (0.07)	5.07 (0.09)	10.73 (0.19)	19.92	62.16
Stitel	− 10	33.21 (0.63)	46.04 (0.85)	45.17 (0.83)	27.86	26.47
	− 5	26.83 (0.50)	34.37 (0.64)	35.53 (0.65)	21.93	24.48
	0	15.08 (0.28)	18.92 (0.35)	22.33 (0.42)	20.29	32.46
	5	8.12 (0.14)	10.37 (0.19)	13.77 (0.26)	21.69	41.03
	10	4.20 (0.07)	6.23 (0.11)	8.99 (0.17)	32.58	53.28
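
The relative reduction columns can be checked against the EER columns. The sketch below assumes the reduction is defined as (baseline EER − proposed EER) / baseline EER × 100; this definition is not stated in the table but matches the tabulated values for the row used here. The function name `relative_eer_reduction` is illustrative only.

```python
def relative_eer_reduction(baseline_eer: float, proposed_eer: float) -> float:
    """Relative EER reduction in percent.

    Assumed definition (not given explicitly in the table):
    100 * (baseline - proposed) / baseline.
    """
    if baseline_eer <= 0:
        raise ValueError("baseline EER must be positive")
    return 100.0 * (baseline_eer - proposed_eer) / baseline_eer


# Lynx noise at -10 dB SNR, EER values taken from Table 6.
proposed, drugman, rangachari = 31.32, 41.84, 44.52

vs_drugman = relative_eer_reduction(drugman, proposed)
vs_rangachari = relative_eer_reduction(rangachari, proposed)

print(f"vs. Drugman's VAD:            {vs_drugman:.2f} %")
print(f"vs. Rangachari's noise track: {vs_rangachari:.2f} %")
```

For this row the computed values agree with the tabulated 25.14 and 29.65 to two decimal places.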