EURASIP Journal on Audio, Speech, and Music Processing

Table 4 Performance of the Kaldi baseline and the MESR systems in terms of WER (%), based on three mask estimation methods and different values of α, in the clean condition

From: Robust emotional speech recognition based on binaural model and emotional auditory mask in noisy environments

Emotional state	Kaldi baseline	MESR baseline, PNCC-mask, α = 0	MESR baseline, PNCC-mask, α = 0.5	MESR baseline, MFCC-mask, α = 0	MESR baseline, MFCC-mask, α = 0.5
Anger	11.31	11.97	4.10	19.34	5.08
Disgust	13.92	13.92	13.19	21.25	15.02
Fear	13.00	14.10	8.97	27.11	11.36
Happiness	12.52	12.34	9.48	29.86	8.94
Sadness	11.60	8.94	9.70	17.30	9.70
Average	12.47	12.25	9.08	22.97	10.02
