Table 2 The effect of modifying the normalized reference frequency, λ0, on the recognition performance of the proposed GMM-HMM EASR system (in terms of WER (%)) for Persian ESD. The values of WER are obtained by applying different warping methods to various acoustic features extracted from different emotional utterances

From: Feature compensation based on the normalization of vocal tract length for the improvement of emotion-affected speech recognition

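Columns Anger through Sad are the emotional states; all values are WER (%).

| Feature type | Warping type | λ0 | Anger | Disgust | Fear | Happy | Sad | Average WER |
|---|---|---|---|---|---|---|---|---|
| MFCC | DCT Warping | 0 | 42.30 | 24.54 | 36.08 | 29.70 | 26.43 | 31.81 |
| MFCC | DCT Warping | 0.4 | 28.20 | 22.34 | 31.32 | 18.78 | 18.25 | 23.78 |
| MFCC | DCT Warping | 0.7 | 38.36 | 21.79 | 31.87 | 19.14 | 21.48 | 26.53 |
| MFCC | Filterbank & DCT Warping | 0 | 42.13 | 23.81 | 34.25 | 30.05 | 21.48 | 30.34 |
| MFCC | Filterbank & DCT Warping | 0.4 | 27.05 | 21.61 | 32.42 | 20.21 | 15.97 | 23.45 |
| MFCC | Filterbank & DCT Warping | 0.7 | 40.82 | 22.71 | 33.52 | 22.72 | 21.29 | 28.21 |
| M-MFCC | DCT Warping | 0 | 36.07 | 19.41 | 27.11 | 24.15 | 20.15 | 25.38 |
| M-MFCC | DCT Warping | 0.4 | 17.21 | 16.30 | 21.61 | 16.10 | 19.01 | 18.05 |
| M-MFCC | DCT Warping | 0.7 | 29.67 | 15.93 | 21.98 | 17.35 | 17.49 | 20.48 |
| M-MFCC | Filterbank & DCT Warping | 0 | 34.26 | 17.95 | 25.82 | 23.79 | 20.72 | 24.51 |
| M-MFCC | Filterbank & DCT Warping | 0.4 | 22.79 | 15.93 | 24.73 | 17.35 | 20.53 | 20.27 |
| M-MFCC | Filterbank & DCT Warping | 0.7 | 31.48 | 15.38 | 27.84 | 18.96 | 18.25 | 22.38 |
| ExpoLog | DCT Warping | 0 | 37.87 | 16.12 | 26.01 | 26.83 | 20.34 | 25.43 |
| ExpoLog | DCT Warping | 0.4 | 37.54 | 14.65 | 25.64 | 27.55 | 20.34 | 25.13 |
| ExpoLog | DCT Warping | 0.7 | 30.00 | 13.00 | 23.26 | 20.04 | 16.54 | 20.57 |
| ExpoLog | Filterbank & DCT Warping | 0 | 30.49 | 13.55 | 15.02 | 22.72 | 15.78 | 19.51 |
| ExpoLog | Filterbank & DCT Warping | 0.4 | 32.46 | 13.92 | 16.12 | 28.98 | 19.01 | 22.10 |
| ExpoLog | Filterbank & DCT Warping | 0.7 | 26.89 | 12.82 | 17.22 | 22.18 | 20.53 | 19.92 |
| GFCC | DCT Warping | 0 | 40.66 | 44.51 | 48.72 | 44.90 | 27.38 | 41.23 |
| GFCC | DCT Warping | 0.4 | 26.89 | 39.74 | 43.77 | 39.53 | 26.81 | 35.35 |
| GFCC | DCT Warping | 0.7 | 28.52 | 40.66 | 43.96 | 42.58 | 31.94 | 37.53 |
| GFCC | Filterbank & DCT Warping | 0 | 39.67 | 43.77 | 47.44 | 44.01 | 27.19 | 40.42 |
| GFCC | Filterbank & DCT Warping | 0.4 | 28.36 | 40.84 | 45.60 | 40.79 | 29.09 | 36.94 |
| GFCC | Filterbank & DCT Warping | 0.7 | 30.82 | 40.84 | 44.51 | 43.83 | 34.79 | 38.96 |
| PNCC | DCT Warping | 0 | 3.93 | 5.13 | 5.31 | 6.80 | 5.70 | 5.37 |
| PNCC | DCT Warping | 0.4 | 4.10 | 5.13 | 4.40 | 6.26 | 4.56 | 4.89 |
| PNCC | DCT Warping | 0.7 | 4.10 | 5.13 | 4.40 | 6.26 | 4.56 | 4.89 |
| PNCC | Filterbank & DCT Warping | 0 | 3.93 | 5.13 | 4.58 | 6.62 | 5.70 | 5.19 |
| PNCC | Filterbank & DCT Warping | 0.4 | 3.93 | 5.31 | 4.95 | 6.44 | 5.32 | 5.19 |
| PNCC | Filterbank & DCT Warping | 0.7 | 3.93 | 5.31 | 4.95 | 6.44 | 5.32 | 5.19 |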
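
The Average WER column appears to be the unweighted mean of the five per-emotion WERs. The table does not state this, so it is an assumption here. The following minimal sketch checks it on the MFCC / DCT Warping rows, with values copied from the table, and also computes the relative change in average WER between λ0 = 0 and λ0 = 0.4. The helper names `mean_wer` and `relative_reduction` are hypothetical and introduced only for illustration.

```python
# Per-emotion WERs (%) copied from the MFCC / DCT Warping rows of Table 2.
# Order: Anger, Disgust, Fear, Happy, Sad. Second element: reported Average WER.
rows = {
    0.0: ([42.30, 24.54, 36.08, 29.70, 26.43], 31.81),
    0.4: ([28.20, 22.34, 31.32, 18.78, 18.25], 23.78),
    0.7: ([38.36, 21.79, 31.87, 19.14, 21.48], 26.53),
}


def mean_wer(values):
    """Unweighted mean of per-emotion WERs (assumed definition of the average column)."""
    return sum(values) / len(values)


def relative_reduction(baseline, improved):
    """Relative WER reduction in percent, measured against the baseline value."""
    return 100.0 * (baseline - improved) / baseline


for lam0, (per_emotion, reported) in rows.items():
    computed = mean_wer(per_emotion)
    print(f"lambda0={lam0}: computed {computed:.2f}, reported {reported:.2f}")

# Relative change in average WER when moving from lambda0 = 0 to lambda0 = 0.4.
reduction = relative_reduction(rows[0.0][1], rows[0.4][1])
print(f"relative reduction (0 -> 0.4): {reduction:.1f}%")
```

The same check can be repeated for any other feature and warping combination by swapping in the corresponding rows.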