Table 4 The usages of the subsets for each speech corpus and their statistics

From: A parametric prosody coding approach for Mandarin speech using a hierarchical prosodic model

| Corpus | Subset | Usage | Spk# (speakers) | Utt# (utterances) | Syl# (syllables) | Hours | Remark |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Treebank | TrainTB | Training of the HPM, the AM for forced alignment, and the models for the HMM-based speech synthesizer | 1 | 376 | 51,868 | 3.9 | |
| Treebank | TestTB | Evaluation of prosody coding | 1 | 44 | 3,898 | 0.3 | |
| TCC300 | TrainTC1 | Training of the AM for forced alignment | 274 | 8,036 | 300,728 | 23.9 | Includes all of Set A and 90% of Set B |
| TCC300 | TrainTC2 | Training of the HPM | 164 | 962 | 106,955 | 8.3 | Subset of TrainTC1 |
| TCC300 | TestTC | Evaluation of prosody coding and adaptation of the HMM model for speech synthesis | 19 | 226 | 26,357 | 2.4 | Selected from Set B of TCC300 |