Table 5 AUC (%) comparison between the proposed method and other methods (Ramirez et al. [8], Kinnunen et al. [11], Sohn et al. [12], and Segbroeck et al. [44])

From: Enhancement of speech dynamics for voice activity detection using DNN

  

AUC (%), mean ± standard deviation

| Noise | SNR (dB) | Proposed | Ramirez | Kinnunen | Sohn | Segbroeck |
|---|---|---|---|---|---|---|
| Clean | | *99.06 ± 0.13* | 71.03 ± 1.02 | 95.65 ± 0.43 | 88.48 ± 1.45 | 84.57 ± 1.21 |
| White | 10 | *97.91 ± 0.28* | 74.09 ± 1.50 | 93.65 ± 1.05 | 93.32 ± 0.50 | 79.63 ± 0.24 |
| | 5 | *97.44 ± 0.43* | 73.57 ± 1.36 | 93.02 ± 1.31 | 87.84 ± 0.50 | 77.75 ± 0.34 |
| | 0 | *96.59 ± 0.50* | 72.33 ± 1.24 | 91.06 ± 1.59 | 77.34 ± 1.14 | 75.21 ± 0.67 |
| | −5 | *94.69 ± 0.66* | 68.97 ± 1.07 | 83.85 ± 1.28 | 66.79 ± 1.85 | 71.89 ± 0.77 |
| Babble | 10 | *96.84 ± 0.60* | 68.84 ± 1.32 | 87.71 ± 0.91 | 87.56 ± 0.87 | 81.25 ± 0.35 |
| | 5 | *95.19 ± 0.71* | 67.28 ± 0.71 | 84.19 ± 0.80 | 79.97 ± 0.70 | 79.05 ± 0.69 |
| | 0 | *91.30 ± 0.74* | 63.62 ± 0.91 | 76.59 ± 0.99 | 70.05 ± 0.94 | 72.99 ± 1.13 |
| | −5 | *83.20 ± 0.87* | 59.37 ± 1.01 | 66.73 ± 1.52 | 60.33 ± 0.93 | 62.71 ± 1.25 |
| Factory | 10 | *97.25 ± 0.39* | 70.35 ± 1.85 | 88.12 ± 1.73 | 88.04 ± 0.67 | 81.19 ± 0.96 |
| | 5 | *95.96 ± 0.43* | 67.54 ± 1.57 | 84.42 ± 1.67 | 79.55 ± 0.85 | 78.99 ± 1.06 |
| | 0 | *93.18 ± 0.46* | 62.78 ± 1.53 | 77.70 ± 1.07 | 67.15 ± 0.82 | 74.67 ± 0.87 |
| | −5 | *85.91 ± 0.29* | 57.81 ± 1.71 | 66.38 ± 0.76 | 56.28 ± 0.56 | 67.23 ± 0.52 |
| Car | 10 | *99.02 ± 0.11* | 69.06 ± 1.60 | 94.62 ± 0.36 | 91.56 ± 1.29 | 84.46 ± 0.96 |
| | 5 | *98.94 ± 0.11* | 68.27 ± 1.32 | 93.64 ± 0.80 | 92.15 ± 0.83 | 84.38 ± 0.96 |
| | 0 | *98.79 ± 0.09* | 68.50 ± 1.02 | 92.42 ± 0.40 | 92.41 ± 0.34 | 84.08 ± 0.91 |
| | −5 | *98.40 ± 0.05* | 68.77 ± 1.87 | 90.16 ± 0.17 | 91.81 ± 0.10 | 83.49 ± 1.06 |
| Pink | 10 | *97.79 ± 0.39* | 73.11 ± 1.67 | 90.37 ± 1.33 | 90.54 ± 0.40 | 81.00 ± 0.94 |
| | 5 | *96.82 ± 0.59* | 72.36 ± 1.63 | 88.87 ± 1.85 | 82.51 ± 1.39 | 78.96 ± 1.21 |
| | 0 | *95.26 ± 0.70* | 70.32 ± 1.53 | 84.13 ± 1.76 | 71.70 ± 1.70 | 76.10 ± 1.31 |
| | −5 | *91.56 ± 1.03* | 65.69 ± 1.40 | 74.46 ± 1.36 | 62.81 ± 1.82 | 71.78 ± 1.04 |

  1. The numbers in italics indicate the best results
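
For readers unfamiliar with the metric, the following is a minimal sketch of how an "AUC (%), mean ± standard deviation" entry could be produced from frame-level voice activity scores. It is not the authors' evaluation code. The function names (`roc_auc`, `summarize_auc`, `format_cell`) are hypothetical, and the sketch assumes that each table entry aggregates AUC values over several repeated runs or test subsets, which this page does not state. AUC is computed with the standard pairwise (Mann-Whitney) identity: the fraction of speech/non-speech frame pairs in which the speech frame receives the higher score.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 0/1 per frame (1 = speech); scores: higher means more speech-like.
    A tie between a speech and a non-speech score counts as half a win.
    Quadratic in the number of frames, which is fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one speech and one non-speech frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize_auc(runs):
    """Return (mean, sample standard deviation) of AUC in percent.

    runs: list of (labels, scores) pairs, one per run.
    Assumption: the spread is taken across runs; the page does not say so.
    """
    aucs = [100.0 * roc_auc(y, s) for y, s in runs]
    sd = stdev(aucs) if len(aucs) > 1 else 0.0
    return mean(aucs), sd


def format_cell(m, sd):
    """Format a table cell in the same style as the table."""
    return f"{m:.2f} ±{sd:.2f}"
```

Calling `format_cell(*summarize_auc(runs))` on a list of runs yields a string in the same "mean ±std" layout as the cells above; the toy inputs one would pass here are illustrative only and unrelated to the reported results.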