Table 4. The confusion matrix for the evaluation of the emotional expressiveness of synthesized speech using the proposed interpolation method

From: Speaker-dependent model interpolation for statistical emotional speech synthesis

               Happy  Angry  Sad  Neutral  Non-synthetic  Accuracy
Happy             34      6    1        3              1     34/45
Angry             13     21    1        7              3     21/45
Sad                1      0   38        2              5     38/45
Neutral            0      0    9       20             15     20/45
Non-synthetic      1      0    0        4             40     40/45
Note: There are five listener groups with three listeners each, and each utterance set contains three utterances. The emotional-expressiveness result for each emotion is therefore based on 5 × 3 × 3 = 45 test samples.
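
The Accuracy column can be reproduced with a short calculation. The following is a minimal sketch, not code from the article: the names `LABELS`, `COUNTS`, `SAMPLES_PER_ROW`, and `per_class_accuracy` are hypothetical, and it assumes that each accuracy is the diagonal count divided by the fixed 45 samples stated in the note.

```python
# Row and column order as printed in Table 4.
LABELS = ["Happy", "Angry", "Sad", "Neutral", "Non-synthetic"]

# Counts copied from Table 4, one inner list per table row.
COUNTS = [
    [34, 6, 1, 3, 1],
    [13, 21, 1, 7, 3],
    [1, 0, 38, 2, 5],
    [0, 0, 9, 20, 15],
    [1, 0, 0, 4, 40],
]

# From the table note: 5 listener groups x 3 listeners x 3 utterances.
SAMPLES_PER_ROW = 5 * 3 * 3


def per_class_accuracy(counts, total):
    """Divide each row's diagonal count by a fixed sample total (assumption)."""
    return [row[i] / total for i, row in enumerate(counts)]


accuracies = per_class_accuracy(COUNTS, SAMPLES_PER_ROW)
for i, label in enumerate(LABELS):
    print(f"{label:<14} {COUNTS[i][i]}/{SAMPLES_PER_ROW} = {accuracies[i]:.3f}")
```

The sketch divides by the fixed total rather than by each row's sum because, as printed, the Sad and Neutral rows sum to 46 and 44, so a row-sum denominator would not match the Accuracy column.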