Table 2. Effect of modifying the normalized reference frequency, λ0, on the recognition performance (WER, %) of the proposed GMM-HMM EASR system for Persian ESD. The WER values are obtained by applying different warping methods to various acoustic features extracted from different emotional utterances.

From: Feature compensation based on the normalization of vocal tract length for the improvement of emotion-affected speech recognition

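Columns Anger through Sad are the emotional states; all values are WER (%).

| Feature type | Warping type | λ0 | Anger | Disgust | Fear | Happy | Sad | Average WER |
|---|---|---|---|---|---|---|---|---|
| MFCC | DCT Warping | 0 | 42.30 | 24.54 | 36.08 | 29.70 | 26.43 | 31.81 |
| MFCC | DCT Warping | 0.4 | 28.20 | 22.34 | 31.32 | 18.78 | 18.25 | 23.78 |
| MFCC | DCT Warping | 0.7 | 38.36 | 21.79 | 31.87 | 19.14 | 21.48 | 26.53 |
| MFCC | Filterbank & DCT Warping | 0 | 42.13 | 23.81 | 34.25 | 30.05 | 21.48 | 30.34 |
| MFCC | Filterbank & DCT Warping | 0.4 | 27.05 | 21.61 | 32.42 | 20.21 | 15.97 | 23.45 |
| MFCC | Filterbank & DCT Warping | 0.7 | 40.82 | 22.71 | 33.52 | 22.72 | 21.29 | 28.21 |
| M-MFCC | DCT Warping | 0 | 36.07 | 19.41 | 27.11 | 24.15 | 20.15 | 25.38 |
| M-MFCC | DCT Warping | 0.4 | 17.21 | 16.30 | 21.61 | 16.10 | 19.01 | 18.05 |
| M-MFCC | DCT Warping | 0.7 | 29.67 | 15.93 | 21.98 | 17.35 | 17.49 | 20.48 |
| M-MFCC | Filterbank & DCT Warping | 0 | 34.26 | 17.95 | 25.82 | 23.79 | 20.72 | 24.51 |
| M-MFCC | Filterbank & DCT Warping | 0.4 | 22.79 | 15.93 | 24.73 | 17.35 | 20.53 | 20.27 |
| M-MFCC | Filterbank & DCT Warping | 0.7 | 31.48 | 15.38 | 27.84 | 18.96 | 18.25 | 22.38 |
| ExpoLog | DCT Warping | 0 | 37.87 | 16.12 | 26.01 | 26.83 | 20.34 | 25.43 |
| ExpoLog | DCT Warping | 0.4 | 37.54 | 14.65 | 25.64 | 27.55 | 20.34 | 25.13 |
| ExpoLog | DCT Warping | 0.7 | 30.00 | 13.00 | 23.26 | 20.04 | 16.54 | 20.57 |
| ExpoLog | Filterbank & DCT Warping | 0 | 30.49 | 13.55 | 15.02 | 22.72 | 15.78 | 19.51 |
| ExpoLog | Filterbank & DCT Warping | 0.4 | 32.46 | 13.92 | 16.12 | 28.98 | 19.01 | 22.10 |
| ExpoLog | Filterbank & DCT Warping | 0.7 | 26.89 | 12.82 | 17.22 | 22.18 | 20.53 | 19.92 |
| GFCC | DCT Warping | 0 | 40.66 | 44.51 | 48.72 | 44.90 | 27.38 | 41.23 |
| GFCC | DCT Warping | 0.4 | 26.89 | 39.74 | 43.77 | 39.53 | 26.81 | 35.35 |
| GFCC | DCT Warping | 0.7 | 28.52 | 40.66 | 43.96 | 42.58 | 31.94 | 37.53 |
| GFCC | Filterbank & DCT Warping | 0 | 39.67 | 43.77 | 47.44 | 44.01 | 27.19 | 40.42 |
| GFCC | Filterbank & DCT Warping | 0.4 | 28.36 | 40.84 | 45.60 | 40.79 | 29.09 | 36.94 |
| GFCC | Filterbank & DCT Warping | 0.7 | 30.82 | 40.84 | 44.51 | 43.83 | 34.79 | 30.96 |
| PNCC | DCT Warping | 0 | 3.93 | 5.13 | 5.31 | 6.80 | 5.70 | 5.37 |
| PNCC | DCT Warping | 0.4 | 4.10 | 5.13 | 4.40 | 6.26 | 4.56 | 4.89 |
| PNCC | DCT Warping | 0.7 | 4.10 | 5.13 | 4.40 | 6.26 | 4.56 | 4.89 |
| PNCC | Filterbank & DCT Warping | 0 | 3.93 | 5.13 | 4.58 | 6.62 | 5.70 | 5.19 |
| PNCC | Filterbank & DCT Warping | 0.4 | 3.93 | 5.31 | 4.95 | 6.44 | 5.32 | 5.19 |
| PNCC | Filterbank & DCT Warping | 0.7 | 3.93 | 5.31 | 4.95 | 6.44 | 5.32 | 5.19 |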
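
The Average WER column can be cross-checked against the five per-emotion values. The sketch below assumes that the average is the unweighted mean over the emotional states; the table does not state this, and the names `rows`, `mean_wer` and `mismatches` are hypothetical helpers introduced only for illustration. Two rows are copied from the table as sample input.

```python
# Assumption: "Average WER" is the unweighted mean of the five emotion columns
# (Anger, Disgust, Fear, Happy, Sad). The table itself does not say so.

# Sample rows copied from Table 2: key -> (per-emotion WERs, reported average)
rows = {
    ("MFCC", "DCT Warping", 0.0): ([42.30, 24.54, 36.08, 29.70, 26.43], 31.81),
    ("GFCC", "Filterbank & DCT Warping", 0.7): ([30.82, 40.84, 44.51, 43.83, 34.79], 30.96),
}


def mean_wer(values):
    """Unweighted mean of the per-emotion WERs, rounded to two decimals."""
    return round(sum(values) / len(values), 2)


def mismatches(table, tol=0.02):
    """Return the keys whose reported average differs from the recomputed mean by more than tol."""
    return [
        key
        for key, (per_emotion, reported) in table.items()
        if abs(mean_wer(per_emotion) - reported) > tol
    ]


for key, (per_emotion, reported) in rows.items():
    print(key, "recomputed:", mean_wer(per_emotion), "reported:", reported)
```

Under this assumption the MFCC row reproduces its reported 31.81, whereas the GFCC row with Filterbank & DCT Warping at λ0 = 0.7 gives 38.96 against a reported 30.96. The table alone does not show whether the average or one of the per-emotion cells is the mistyped value, so that row is best read with caution.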