EURASIP Journal on Audio, Speech, and Music Processing

Table 4 The confusion matrix for the evaluation of the emotional expressiveness of synthesized speech using the proposed interpolation method

From: Speaker-dependent model interpolation for statistical emotional speech synthesis

	Happy	Angry	Sad	Neutral	Non-synthetic	Accuracy
Happy	34	6	1	3	1	34/45
Angry	13	21	1	7	3	21/45
Sad	1	0	38	2	5	38/45
Neutral	0	0	9	20	15	20/45
Non-synthetic	1	0	0	4	40	40/45

There are five listener groups, with three listeners per group and three utterances per utterance set. The emotional expressiveness result for each emotion is therefore based on answers to 5 × 3 × 3 = 45 test samples.
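The Accuracy column can be reproduced from the counts with a short calculation. The following is only a minimal sketch, assuming that rows give the intended emotion and columns give the listeners' responses (the table does not label its axes). The names `LABELS`, `COUNTS`, `SAMPLES_PER_EMOTION`, and `per_class_accuracy` are illustrative and do not come from the article. The denominator is taken from the note above (45 samples per emotion) rather than from the row sums, because the Sad and Neutral rows as printed sum to 46 and 44.

```python
# Counts copied from Table 4; row/column meaning is an assumption (see lead-in).
LABELS = ["Happy", "Angry", "Sad", "Neutral", "Non-synthetic"]
COUNTS = [
    [34, 6, 1, 3, 1],    # Happy
    [13, 21, 1, 7, 3],   # Angry
    [1, 0, 38, 2, 5],    # Sad
    [0, 0, 9, 20, 15],   # Neutral
    [1, 0, 0, 4, 40],    # Non-synthetic
]

# 5 listener groups x 3 listeners per group x 3 utterances per set.
SAMPLES_PER_EMOTION = 5 * 3 * 3


def per_class_accuracy(labels, counts, total):
    """Map each label to (diagonal count, diagonal count / total)."""
    result = {}
    for i, label in enumerate(labels):
        correct = counts[i][i]  # diagonal entry: response matches the row label
        result[label] = (correct, correct / total)
    return result


accuracy = per_class_accuracy(LABELS, COUNTS, SAMPLES_PER_EMOTION)
for label, (correct, ratio) in accuracy.items():
    print(f"{label}: {correct}/{SAMPLES_PER_EMOTION} = {ratio:.1%}")
```

The sketch prints each accuracy as a fraction of 45, matching the form used in the table, alongside the equivalent percentage.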
