A robust polynomial regression-based voice activity detector for speaker verification

EURASIP Journal on Audio, Speech, and Music Processing

Table 1 Male speaker verification results of the GMM-UBM method, given as percent EER with minDCF in parentheses, for the proposed algorithm, Drugman’s VAD method [27], and Rangachari’s noise tracking method [21]. The last two columns show the relative EER reduction (in percent) achieved by the proposed algorithm compared to Drugman’s VAD and Rangachari’s method, respectively; negative values indicate that the baseline achieved a lower EER

Noise type	SNR (dB)	Proposed algorithm, EER % (minDCF)	Drugman’s VAD, EER % (minDCF)	Rangachari’s noise tracking, EER % (minDCF)	Relative EER reduction vs. Drugman’s VAD (%)	Relative EER reduction vs. Rangachari’s method (%)
Lynx	−10	34.25 (0.64)	46.10 (0.85)	47.4 (0.87)	25.70	27.74
	−5	25.30 (0.47)	32.18 (0.60)	39.22 (0.72)	21.38	35.49
	0	15.29 (0.28)	14.60 (0.27)	22.47 (0.42)	−4.72	31.95
	5	8.41 (0.15)	8.41 (0.15)	13.45 (0.24)	0	37.47
	10	5.42 (0.10)	6.50 (0.12)	9.93 (0.18)	16.61	45.41
F16	−10	41.28 (0.78)	48.31 (0.88)	48.16 (0.89)	14.55	14.28
	−5	31.88 (0.60)	41.82 (0.80)	45.18 (0.84)	23.77	29.43
	0	20.87 (0.39)	24.38 (0.46)	33.4 (0.60)	14.40	37.51
	5	11.85 (0.22)	11.54 (0.21)	18.19 (0.34)	−2.68	34.85
	10	6.95 (0.13)	7.8 (0.14)	12.46 (0.23)	10.89	44.22
Car	−10	5.96 (0.10)	6.27 (0.11)	8.94 (0.16)	4.94	33.33
	−5	4.74 (0.08)	5.88 (0.10)	8.35 (0.15)	19.38	43.23
	0	4.35 (0.08)	5.50 (0.10)	8.18 (0.15)	20.91	46.82
	5	4.05 (0.07)	5.27 (0.09)	7.95 (0.14)	23.15	49.05
	10	4.05 (0.07)	5.12 (0.09)	7.95 (0.14)	20.90	49.05
Babble	−10	36.85 (0.69)	48.08 (0.87)	47.85 (0.88)	23.35	22.98
	−5	26.83 (0.50)	38.45 (0.72)	43.94 (0.87)	30.22	42.84
	0	17.50 (0.33)	19.49 (0.36)	28.28 (0.51)	10.21	38.11
	5	10.01 (0.18)	10.16 (0.18)	14.52 (0.27)	1.47	31.06
	10	6.72 (0.12)	7.26 (0.13)	10.93 (0.20)	7.44	38.51
Stitel	−10	42.66 (0.79)	47.17 (0.86)	45.18 (0.84)	9.56	5.57
	−5	33.71 (0.62)	37.23 (0.69)	37.15 (0.69)	9.45	9.26
	0	19.95 (0.37)	19.26 (0.36)	20.41 (0.38)	−3.58	2.25
	5	9.40 (0.17)	9.71 (0.18)	11.62 (0.21)	3.19	19.10
	10	5.96 (0.11)	6.95 (0.12)	9.32 (0.17)	14.24	36.05
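
The relative EER reduction columns of Table 1 appear to follow the usual definition, 100 × (EER_baseline − EER_proposed) / EER_baseline. The caption does not state the formula explicitly, so the definition used here is an assumption inferred from the tabulated values. The minimal Python sketch below applies it to the Car noise row at −10 dB SNR; the function name `relative_eer_reduction` is hypothetical and not taken from the paper.

```python
def relative_eer_reduction(eer_proposed: float, eer_baseline: float) -> float:
    """Relative EER reduction in percent (assumed definition).

    Positive values mean the proposed method has the lower EER;
    negative values mean the baseline has the lower EER.
    """
    if eer_baseline <= 0:
        raise ValueError("baseline EER must be positive")
    return 100.0 * (eer_baseline - eer_proposed) / eer_baseline


# EER values (percent) copied from Table 1, Car noise at -10 dB SNR.
proposed, drugman, rangachari = 5.96, 6.27, 8.94

print(f"vs. Drugman's VAD:       {relative_eer_reduction(proposed, drugman):.2f} %")
print(f"vs. Rangachari's method: {relative_eer_reduction(proposed, rangachari):.2f} %")
```

For this row the sketch gives 4.94 % and 33.33 %, matching the last two columns of the table. For some other rows the recomputed value may differ from the table in the last decimal digit, depending on whether values are rounded or truncated.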