Table 3. Effect of modifying the normalized reference frequency, λ0, on the recognition performance (WER, %) of the proposed GMM-HMM EASR system on CREMA-D. The WER values are obtained by applying different warping methods to various acoustic features extracted from different emotional utterances.

From: Feature compensation based on the normalization of vocal tract length for the improvement of emotion-affected speech recognition

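All values are WER (%). The columns Anger through Sad are the emotional states.

| Feature type | Warping type | λ0 | Anger | Disgust | Fear | Happy | Sad | Average WER |
|---|---|---|---|---|---|---|---|---|
| MFCC | DCT Warping | 0 | 8.11 | 5.54 | 11.20 | 6.39 | 6.50 | 7.55 |
| MFCC | DCT Warping | 0.4 | 7.78 | 5.47 | 9.20 | 4.92 | 5.71 | 6.62 |
| MFCC | DCT Warping | 0.7 | 7.19 | 5.74 | 10.06 | 4.98 | 6.18 | 6.83 |
| MFCC | Filterbank & DCT Warping | 0 | 7.80 | 5.09 | 9.69 | 5.40 | 6.06 | 6.81 |
| MFCC | Filterbank & DCT Warping | 0.4 | 7.78 | 5.44 | 9.25 | 4.67 | 5.65 | 6.56 |
| MFCC | Filterbank & DCT Warping | 0.7 | 7.60 | 6.70 | 10.91 | 5.60 | 7.20 | 7.60 |
| M-MFCC | DCT Warping | 0 | 8.11 | 6.53 | 10.33 | 6.42 | 6.97 | 7.67 |
| M-MFCC | DCT Warping | 0.4 | 7.28 | 6.35 | 9.50 | 5.79 | 6.09 | 7.00 |
| M-MFCC | DCT Warping | 0.7 | 7.67 | 6.25 | 9.32 | 5.84 | 6.85 | 7.19 |
| M-MFCC | Filterbank & DCT Warping | 0 | 7.38 | 6.26 | 9.32 | 6.06 | 6.77 | 7.16 |
| M-MFCC | Filterbank & DCT Warping | 0.4 | 7.42 | 6.16 | 9.32 | 5.90 | 6.50 | 7.06 |
| M-MFCC | Filterbank & DCT Warping | 0.7 | 8.33 | 6.85 | 10.19 | 6.19 | 7.88 | 7.89 |
| ExpoLog | DCT Warping | 0 | 8.25 | 7.25 | 12.32 | 7.88 | 9.63 | 9.07 |
| ExpoLog | DCT Warping | 0.4 | 7.47 | 6.87 | 11.48 | 6.41 | 8.73 | 8.19 |
| ExpoLog | DCT Warping | 0.7 | 7.52 | 7.17 | 11.38 | 5.85 | 8.53 | 8.09 |
| ExpoLog | Filterbank & DCT Warping | 0 | 7.17 | 6.63 | 11.22 | 6.49 | 8.75 | 8.05 |
| ExpoLog | Filterbank & DCT Warping | 0.4 | 6.97 | 7.00 | 10.56 | 6.11 | 8.52 | 7.83 |
| ExpoLog | Filterbank & DCT Warping | 0.7 | 7.22 | 7.47 | 10.33 | 5.94 | 9.10 | 8.01 |
| GFCC | DCT Warping | 0 | 55.73 | 23.07 | 44.39 | 39.48 | 27.97 | 38.13 |
| GFCC | DCT Warping | 0.4 | 55.33 | 22.68 | 43.67 | 35.85 | 27.97 | 37.10 |
| GFCC | DCT Warping | 0.7 | 56.64 | 24.46 | 45.28 | 37.09 | 30.07 | 38.71 |
| GFCC | Filterbank & DCT Warping | 0 | 54.95 | 22.33 | 43.28 | 38.36 | 26.89 | 37.16 |
| GFCC | Filterbank & DCT Warping | 0.4 | 55.99 | 23.02 | 44.31 | 36.82 | 29.32 | 37.89 |
| GFCC | Filterbank & DCT Warping | 0.7 | 57.76 | 25.49 | 46.39 | 38.69 | 32.09 | 40.08 |
| PNCC | DCT Warping | 0 | 4.49 | 1.96 | 6.25 | 2.17 | 2.26 | 3.43 |
| PNCC | DCT Warping | 0.4 | 4.52 | 2.03 | 6.75 | 3.08 | 2.37 | 3.75 |
| PNCC | DCT Warping | 0.7 | 4.52 | 2.03 | 6.75 | 3.08 | 2.37 | 3.75 |
| PNCC | Filterbank & DCT Warping | 0 | 3.97 | 1.83 | 5.96 | 1.71 | 1.99 | 3.09 |
| PNCC | Filterbank & DCT Warping | 0.4 | 4.44 | 1.98 | 6.21 | 2.56 | 2.09 | 3.46 |
| PNCC | Filterbank & DCT Warping | 0.7 | 4.44 | 1.98 | 6.21 | 2.56 | 2.09 | 3.46 |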
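The Average WER column appears to be the unweighted arithmetic mean of the five per-emotion WERs. The source does not state this; it is an inference from the reported numbers. The following minimal Python sketch checks that reading against the three MFCC, DCT Warping rows and picks the λ0 with the lowest reported average. The names `rows`, `mean_wer`, and `best_lambda` are illustrative and not from the paper.

```python
# Per-emotion WER (%) and reported average, copied from the table
# (feature type: MFCC, warping type: DCT Warping).
# Column order: Anger, Disgust, Fear, Happy, Sad.
rows = {
    0.0: ([8.11, 5.54, 11.20, 6.39, 6.50], 7.55),
    0.4: ([7.78, 5.47, 9.20, 4.92, 5.71], 6.62),
    0.7: ([7.19, 5.74, 10.06, 4.98, 6.18], 6.83),
}


def mean_wer(values):
    """Unweighted mean of per-emotion WERs (assumed averaging rule)."""
    return sum(values) / len(values)


for lam, (wers, reported) in rows.items():
    computed = mean_wer(wers)
    print(f"lambda0 = {lam}: computed {computed:.2f}, reported {reported:.2f}")

# The lambda0 with the lowest reported average WER for this feature/warping pair.
best_lambda = min(rows, key=lambda lam: rows[lam][1])
print("best lambda0:", best_lambda)
```

For these rows the computed means agree with the reported averages to two decimal places, and λ0 = 0.4 gives the lowest average WER. The same check can be repeated for any other feature and warping combination by swapping in its rows.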