Table 5 Naturalness MUSHRA average scores and 95% confidence interval

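From: A unit selection text-to-speech-and-singing synthesis framework from neutral speech: proof of concept

|    | Configuration | T=100 bpm | T=50 bpm |
|----|---------------|-----------|----------|
| S7 | V             | 74 ± 6    | 70 ± 6   |
|    | S7pdLC        | 41 ± 5    | *44 ± 6* |
|    | S7pdC         | 39 ± 6    | 43 ± 6   |
|    | S7p           | 36 ± 5    | 40 ± 6   |
|    | MLC           | *42 ± 5*  | *44 ± 5* |
| S4 | V             | 69 ± 7    | 67 ± 7   |
|    | S4pdLC        | *42 ± 6*  | 38 ± 6   |
|    | S4pdC         | 39 ± 6    | *41 ± 6* |
|    | S4p           | 35 ± 6    | 38 ± 6   |
|    | MLC           | 38 ± 6    | 37 ± 6   |
| S0 | V             | 66 ± 7    | 70 ± 6   |
|    | S0pdLC        | *44 ± 5*  | 38 ± 4   |
|    | S0pdC         | *44 ± 5*  | *42 ± 6* |
|    | S0p           | 41 ± 6    | 35 ± 6   |
|    | MLC           | *44 ± 6*  | 39 ± 5   |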

Best values achieved by the proposed system in each scenario are in italics.
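
Each cell in the table reports a mean score with a 95% confidence interval half-width. The table does not say how the intervals were constructed, so the following is only a minimal sketch, assuming a normal approximation (z = 1.96) over independent listener ratings. The function name `mean_with_ci95` and the ratings are hypothetical and are not taken from the paper.

```python
import math
import statistics


def mean_with_ci95(ratings):
    """Return (mean, half_width) for a 95% confidence interval.

    Assumption: normal approximation with z = 1.96. The source table
    does not state which interval construction was actually used.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings")
    mean = statistics.fmean(ratings)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(ratings) / math.sqrt(n)
    return mean, 1.96 * sem


# Hypothetical MUSHRA ratings on the 0-100 scale (illustrative only).
ratings = [70, 80, 65, 75, 85, 60, 90, 72]
mean, half_width = mean_with_ci95(ratings)
print(f"{mean:.0f} ± {half_width:.0f}")
```

With small listener panels, a Student's t quantile in place of 1.96 would give a slightly wider interval; which of the two applies here is not stated in the table.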