Table 4 Performance of the Kaldi baseline and the MESR systems in terms of WER (%), based on three mask estimation methods and different values of α, in clean conditions

From: Robust emotional speech recognition based on binaural model and emotional auditory mask in noisy environments

Emotional states  Kaldi baseline  MESR baseline
                                  PNCC-mask           MFCC-mask
                                  α = 0     α = 0.5   α = 0     α = 0.5
Anger             11.31           11.97     4.10      19.34     5.08
Disgust           13.92           13.92     13.19     21.25     15.02
Fear              13.00           14.10     8.97      27.11     11.36
Happiness         12.52           12.34     9.48      29.86     8.94
Sadness           11.60           8.94      9.70      17.30     9.70
Average           12.47           12.25     9.08      22.97     10.02