Table 6. AUC results of the compared VADs with the BLSTM model and STFT acoustic feature on the Noisy-CHiME-4 test dataset, when the Chinese Noisy-THCHS-30 dataset was used as the training set

From: AUC optimization for deep learning-based voice activity detection

| Noise type | SNR | MCE | MMSE | MaxAUCsigm | MaxAUChinge |
|---|---|---|---|---|---|
| Babble | −10 dB | 0.5752 | 0.5664 | 0.5833 | 0.5799 |
| | −5 dB | 0.6442 | 0.6391 | 0.6588 | 0.6565 |
| | 0 dB | 0.7272 | 0.7222 | 0.7462 | 0.7441 |
| | 5 dB | 0.7900 | 0.7867 | 0.8076 | 0.8057 |
| | 10 dB | 0.8246 | 0.8289 | 0.8387 | 0.8390 |
| | 15 dB | 0.8420 | 0.8430 | 0.8529 | 0.8467 |
| | 20 dB | 0.8487 | 0.8624 | 0.8579 | 0.8628 |
| Factory | −10 dB | 0.5992 | 0.5938 | 0.6115 | 0.6011 |
| | −5 dB | 0.6743 | 0.6694 | 0.6897 | 0.6822 |
| | 0 dB | 0.7340 | 0.7294 | 0.7536 | 0.7474 |
| | 5 dB | 0.7791 | 0.7769 | 0.7994 | 0.7929 |
| | 10 dB | 0.8142 | 0.8166 | 0.8299 | 0.8285 |
| | 15 dB | 0.8373 | 0.8458 | 0.8485 | 0.8503 |
| | 20 dB | 0.8474 | 0.8603 | 0.8569 | 0.8646 |
| Volvo | −10 dB | 0.7571 | 0.7551 | 0.7790 | 0.7858 |
| | −5 dB | 0.7933 | 0.7905 | 0.8195 | 0.8270 |
| | 0 dB | 0.8244 | 0.8229 | 0.8443 | 0.8534 |
| | 5 dB | 0.8350 | 0.8343 | 0.8530 | 0.8602 |
| | 10 dB | 0.8374 | 0.8295 | 0.8560 | 0.8602 |
| | 15 dB | 0.8423 | 0.8518 | 0.8600 | 0.8589 |
| | 20 dB | 0.8476 | 0.8595 | 0.8645 | 0.8593 |