Table 4 The confusion matrix for the evaluation of the emotional expressiveness of synthesized speech using the proposed interpolation method

From: Speaker-dependent model interpolation for statistical emotional speech synthesis

 

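|  | Happy | Angry | Sad | Neutral | Non-synthetic | Accuracy |
| --- | --- | --- | --- | --- | --- | --- |
| Happy | 34 | 6 | 1 | 3 | 1 | 34/45 |
| Angry | 13 | 21 | 1 | 7 | 3 | 21/45 |
| Sad | 1 | 0 | 38 | 2 | 5 | 38/45 |
| Neutral | 0 | 0 | 9 | 20 | 15 | 20/45 |
| Non-synthetic | 1 | 0 | 0 | 4 | 40 | 40/45 |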

  1. There are five listener groups, three listeners per group, and three utterances per utterance set. The emotional expressiveness result for each emotion is therefore based on answers to 5 × 3 × 3 = 45 test samples.
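
The Accuracy column of Table 4 can be reproduced with a short calculation. The following is a minimal sketch, not the authors' evaluation code: it assumes each row is the intended category and each column the category chosen by listeners, and the names `LABELS`, `CONFUSION`, `SAMPLES_PER_EMOTION`, and `per_class_accuracy` are hypothetical, introduced only for illustration.

```python
# Category order used for both rows and columns of Table 4.
LABELS = ["Happy", "Angry", "Sad", "Neutral", "Non-synthetic"]

# Counts copied from Table 4. Assumption: rows are the intended category,
# columns are the category selected by the listeners.
CONFUSION = [
    [34, 6, 1, 3, 1],
    [13, 21, 1, 7, 3],
    [1, 0, 38, 2, 5],
    [0, 0, 9, 20, 15],
    [1, 0, 0, 4, 40],
]

# 5 listener groups x 3 listeners x 3 utterances, as stated in the footnote.
SAMPLES_PER_EMOTION = 5 * 3 * 3


def per_class_accuracy(matrix, samples_per_class):
    """Return (correct_count, accuracy) per row, using the diagonal entry."""
    return [(row[i], row[i] / samples_per_class) for i, row in enumerate(matrix)]


results = per_class_accuracy(CONFUSION, SAMPLES_PER_EMOTION)
for label, (correct, accuracy) in zip(LABELS, results):
    print(f"{label}: {correct}/{SAMPLES_PER_EMOTION} = {accuracy:.3f}")
```

The denominator is taken from the footnote rather than from the row sums, because the Sad and Neutral rows as printed here sum to 46 and 44 rather than 45.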