Table 4 AUC results of the comparison VADs with the BLSTM model and STFT acoustic feature on the Chinese Noisy-THCHS-30 test dataset

From: AUC optimization for deep learning-based voice activity detection

| Noise type | SNR | MCE | MMSE | MaxAUCsigm | MaxAUChinge |
|---|---|---|---|---|---|
| Babble | −10 dB | 0.5226 | 0.5268 | 0.5315 | 0.5308 |
| | −5 dB | 0.5826 | 0.5918 | 0.5944 | 0.5944 |
| | 0 dB | 0.6800 | 0.6901 | 0.6897 | 0.6943 |
| | 5 dB | 0.7787 | 0.7834 | 0.7853 | 0.7915 |
| | 10 dB | 0.8484 | 0.8481 | 0.8520 | 0.8563 |
| | 15 dB | 0.8870 | 0.8864 | 0.8893 | 0.8928 |
| | 20 dB | 0.9096 | 0.9099 | 0.9116 | 0.9140 |
| Factory | −10 dB | 0.6247 | 0.6238 | 0.6300 | 0.6420 |
| | −5 dB | 0.7177 | 0.7168 | 0.7216 | 0.7314 |
| | 0 dB | 0.7962 | 0.7948 | 0.7976 | 0.8030 |
| | 5 dB | 0.8483 | 0.8471 | 0.8483 | 0.8511 |
| | 10 dB | 0.8805 | 0.8798 | 0.8810 | 0.8828 |
| | 15 dB | 0.9031 | 0.9020 | 0.9033 | 0.9053 |
| | 20 dB | 0.9198 | 0.9175 | 0.9188 | 0.9223 |
| Volvo | −10 dB | 0.8851 | 0.8753 | 0.8848 | 0.8845 |
| | −5 dB | 0.9077 | 0.8984 | 0.9095 | 0.9101 |
| | 0 dB | 0.9208 | 0.9145 | 0.9234 | 0.9252 |
| | 5 dB | 0.9292 | 0.9257 | 0.9313 | 0.9332 |
| | 10 dB | 0.9352 | 0.9328 | 0.9353 | 0.9382 |
| | 15 dB | 0.9384 | 0.9361 | 0.9368 | 0.9412 |
| | 20 dB | 0.9398 | 0.9374 | 0.9375 | 0.9425 |