Table 3 AUC results of the comparison VADs with the BLSTM model and STFT acoustic feature on the English Noisy-CHiME-4 test dataset

From: AUC optimization for deep learning-based voice activity detection

| Noise type | SNR | MCE | MMSE | MaxAUCsigm | MaxAUChinge |
|---|---|---|---|---|---|
| Babble | −10 dB | 0.5163 | 0.5270 | 0.5428 | 0.5383 |
| | −5 dB | 0.5636 | 0.5761 | 0.6010 | 0.5940 |
| | 0 dB | 0.6491 | 0.6567 | 0.6867 | 0.6787 |
| | 5 dB | 0.7466 | 0.7499 | 0.7716 | 0.7641 |
| | 10 dB | 0.8227 | 0.8241 | 0.8362 | 0.8283 |
| | 15 dB | 0.8703 | 0.8696 | 0.8765 | 0.8699 |
| | 20 dB | 0.8977 | 0.8974 | 0.9003 | 0.8978 |
| Factory | −10 dB | 0.6024 | 0.6031 | 0.6066 | 0.6089 |
| | −5 dB | 0.6864 | 0.6830 | 0.6898 | 0.6923 |
| | 0 dB | 0.7659 | 0.7610 | 0.7653 | 0.7685 |
| | 5 dB | 0.8243 | 0.8196 | 0.8204 | 0.8240 |
| | 10 dB | 0.8617 | 0.8580 | 0.8573 | 0.8599 |
| | 15 dB | 0.8862 | 0.8826 | 0.8811 | 0.8824 |
| | 20 dB | 0.9033 | 0.8995 | 0.8977 | 0.8984 |
| Volvo | −10 dB | 0.8562 | 0.8432 | 0.8752 | 0.8702 |
| | −5 dB | 0.8871 | 0.8780 | 0.8996 | 0.8961 |
| | 0 dB | 0.9062 | 0.9010 | 0.9137 | 0.9107 |
| | 5 dB | 0.9166 | 0.9136 | 0.9223 | 0.9182 |
| | 10 dB | 0.9220 | 0.9194 | 0.9261 | 0.9214 |
| | 15 dB | 0.9248 | 0.9227 | 0.9277 | 0.9229 |
| | 20 dB | 0.9264 | 0.9248 | 0.9287 | 0.9241 |
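
The values reported here are areas under the ROC curve (AUC). For readers unfamiliar with the metric, the following is a minimal sketch of how an AUC can be computed from frame-level VAD scores. It is not the paper's evaluation code. It assumes each frame has a real-valued speech score and a binary label (1 for speech, 0 for non-speech), and the function name `auc_from_scores` and the toy data are hypothetical. It uses the pairwise-ranking definition of AUC: the fraction of (speech, non-speech) frame pairs in which the speech frame receives the higher score, with ties counted as one half.

```python
def auc_from_scores(scores, labels):
    """Return the AUC as the fraction of (speech, non-speech) frame pairs
    that are ranked correctly; tied scores count as half a correct pair."""
    # Split the scores by ground-truth label (assumed: 1 = speech, 0 = non-speech).
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one speech and one non-speech frame")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0   # speech frame scored above non-speech frame
            elif p == n:
                correct += 0.5   # tie: count as half
    return correct / (len(pos) * len(neg))


# Hypothetical toy data: five frames with scores and labels.
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 0, 0]
auc = auc_from_scores(scores, labels)
print(round(auc, 4))  # 0.8333 (5 of 6 pairs ranked correctly)
```

The double loop is quadratic in the number of frames, which is fine for illustration; a sort-based rank computation would be the usual choice for full test sets.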