EURASIP Journal on Audio, Speech, and Music Processing

Table 5 AUC (%) comparison between the proposed method and other methods (Ramirez et al. [8], Kinnunen et al. [11], Sohn et al. [12], and Segbroeck et al. [44])

From: Enhancement of speech dynamics for voice activity detection using DNN

		AUC (%)—mean ± standard deviation
Noise	SNR (dB)	Proposed	Ramirez	Kinnunen	Sohn	Segbroeck
Clean		99.06 ±0.13	71.03 ±1.02	95.65 ±0.43	88.48 ±1.45	84.57 ±1.21
White	10	97.91 ±0.28	74.09 ±1.50	93.65 ±1.05	93.32 ±0.50	79.63 ±0.24
	5	97.44 ±0.43	73.57 ±1.36	93.02 ±1.31	87.84 ±0.50	77.75 ±0.34
	0	96.59 ±0.50	72.33 ±1.24	91.06 ±1.59	77.34 ±1.14	75.21 ±0.67
	−5	94.69 ±0.66	68.97 ±1.07	83.85 ±1.28	66.79 ±1.85	71.89 ±0.77
Babble	10	96.84 ±0.60	68.84 ±1.32	87.71 ±0.91	87.56 ±0.87	81.25 ±0.35
	5	95.19 ±0.71	67.28 ±0.71	84.19 ±0.80	79.97 ±0.70	79.05 ±0.69
	0	91.30 ±0.74	63.62 ±0.91	76.59 ±0.99	70.05 ±0.94	72.99 ±1.13
	−5	83.20 ±0.87	59.37 ±1.01	66.73 ±1.52	60.33 ±0.93	62.71 ±1.25
Factory	10	97.25 ±0.39	70.35 ±1.85	88.12 ±1.73	88.04 ±0.67	81.19 ±0.96
	5	95.96 ±0.43	67.54 ±1.57	84.42 ±1.67	79.55 ±0.85	78.99 ±1.06
	0	93.18 ±0.46	62.78 ±1.53	77.70 ±1.07	67.15 ±0.82	74.67 ±0.87
	−5	85.91 ±0.29	57.81 ±1.71	66.38 ±0.76	56.28 ±0.56	67.23 ±0.52
Car	10	99.02 ±0.11	69.06 ±1.60	94.62 ±0.36	91.56 ±1.29	84.46 ±0.96
	5	98.94 ±0.11	68.27 ±1.32	93.64 ±0.80	92.15 ±0.83	84.38 ±0.96
	0	98.79 ±0.09	68.50 ±1.02	92.42 ±0.40	92.41 ±0.34	84.08 ±0.91
	−5	98.40 ±0.05	68.77 ±1.87	90.16 ±0.17	91.81 ±0.10	83.49 ±1.06
Pink	10	97.79 ±0.39	73.11 ±1.67	90.37 ±1.33	90.54 ±0.40	81.00 ±0.94
	5	96.82 ±0.59	72.36 ±1.63	88.87 ±1.85	82.51 ±1.39	78.96 ±1.21
	0	95.26 ±0.70	70.32 ±1.53	84.13 ±1.76	71.70 ±1.70	76.10 ±1.31
	−5	91.56 ±1.03	65.69 ±1.40	74.46 ±1.36	62.81 ±1.82	71.78 ±1.04

The numbers in italics indicate the best results; in every row of this table, that is the value in the Proposed column.
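
Each cell in the table reports an AUC in percent as a mean and a standard deviation. The following is a minimal sketch of how such a cell could be computed, not the authors' evaluation code. It assumes that each run yields per-frame speech/non-speech labels and detector scores, that AUC is taken as the area under the ROC curve, and that the sample standard deviation is used; the source states none of these details. The names `roc_auc`, `summarize_auc`, `format_cell`, and `toy_runs` are hypothetical, and the toy scores are made up for illustration only.

```python
import statistics


def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 1 for speech frames, 0 for non-speech frames.
    scores: detector output, where higher means more speech-like.
    A tie between a speech score and a non-speech score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one frame of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize_auc(runs):
    """Return (mean, sample std) of AUC in percent over two or more runs."""
    aucs = [100.0 * roc_auc(labels, scores) for labels, scores in runs]
    return statistics.mean(aucs), statistics.stdev(aucs)


def format_cell(mean, std):
    """Format a cell in the same 'mean ±std' style as the table."""
    return f"{mean:.2f} ±{std:.2f}"


# Made-up toy data (assumption): two runs of four frames each.
toy_runs = [
    ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]),  # perfectly separated, AUC = 1.00
    ([1, 1, 0, 0], [0.9, 0.2, 0.3, 0.1]),  # one misordered pair, AUC = 0.75
]

mean, std = summarize_auc(toy_runs)
print(format_cell(mean, std))  # prints "87.50 ±17.68"
```

The pairwise loop is quadratic in the number of frames, which is fine for a sketch; for full-length recordings a rank-based or library implementation would be the more practical choice.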
