
Table 6. RMSE of syllable logF0 contour (spn), syllable duration (sdn), syllable energy level (sen), and syllable-juncture pause duration (pdn) for the proposed approach and the baseline systems with various logF0 codebook sizes

From: A parametric prosody coding approach for Mandarin speech using a hierarchical prosodic model

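(a) SD case

| System | Inside (TrainTB): spn (logHz) | Inside (TrainTB): sdn (ms) | Inside (TrainTB): sen (dB) | Inside (TrainTB): pdn (ms) | Outside (TestTB): spn (logHz) | Outside (TestTB): sdn (ms) | Outside (TestTB): sen (dB) | Outside (TestTB): pdn (ms) |
|---|---|---|---|---|---|---|---|---|
| SD-HPM | .070 | 4.8 | .68 | 41.4 | .064 | 4.7 | .70 | 34.3 |
| SD-BSL-24 | .069 | 5.0 | .64 | 33.5 | .066 | 4.7 | .59 | 32.9 |
| SD-BSL-32 | .065 | 5.0 | .64 | 33.5 | .061 | 4.7 | .59 | 32.9 |
| SD-BSL-64 | .053 | 5.0 | .64 | 33.5 | .050 | 4.7 | .59 | 32.9 |
| SD-BSL-128 | .044 | 5.0 | .64 | 33.5 | .042 | 4.7 | .59 | 32.9 |
| SD-BSL-256 | .037 | 5.0 | .64 | 33.5 | .042 | 4.7 | .59 | 32.9 |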

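(b) SI case

| System | Inside (TrainTC2): spn (logHz) | Inside (TrainTC2): sdn (ms) | Inside (TrainTC2): sen (dB) | Inside (TrainTC2): pdn (ms) | Outside (TestTC): spn (logHz) | Outside (TestTC): sdn (ms) | Outside (TestTC): sen (dB) | Outside (TestTC): pdn (ms) |
|---|---|---|---|---|---|---|---|---|
| SI-HPM | .065 | 9.3 | .80 | 44.8 | .056 | 7.5 | .66 | 44.9 |
| SI-BSL-10 | .063 | 9.1 | .78 | 42.0 | .060 | 10.9 | .88 | 39.4 |
| SI-BSL-16 | .056 | 9.1 | .78 | 42.0 | .054 | 10.9 | .88 | 39.4 |
| SI-BSL-32 | .047 | 9.1 | .78 | 42.0 | .046 | 10.9 | .88 | 39.4 |
| SI-BSL-64 | .040 | 9.1 | .78 | 42.0 | .039 | 10.9 | .88 | 39.4 |
| SI-BSL-128 | .034 | 9.1 | .78 | 42.0 | .033 | 10.9 | .88 | 39.4 |
| SI-BSL-256 | .029 | 9.1 | .78 | 42.0 | .029 | 10.9 | .88 | 39.4 |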
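
Each entry in the table is a root-mean-square error (RMSE) between an original prosodic feature and its reconstructed value. The following is a minimal sketch of that computation for a scalar per-syllable feature such as duration. The function name `rmse` and the sample values are hypothetical and do not come from the paper, and how the paper aggregates the error over a whole logF0 contour is not stated in this table.

```python
import math


def rmse(reference, reconstructed):
    """Root-mean-square error between two equal-length sequences."""
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    # Mean of the squared differences, then the square root.
    squared_errors = [(r - c) ** 2 for r, c in zip(reference, reconstructed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical syllable durations in ms (illustrative only, not paper data).
original_ms = [180.0, 220.0, 150.0, 200.0]
decoded_ms = [184.0, 217.0, 150.0, 205.0]

print(f"sdn RMSE: {rmse(original_ms, decoded_ms):.2f} ms")
```

The same function would apply to energy levels in dB or pause durations in ms, because RMSE carries the unit of the feature being compared.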