Table 4 The usages of the subsets for each speech corpus and their statistics

From: A parametric prosody coding approach for Mandarin speech using a hierarchical prosodic model

| Corpus | Subset | Usage | Spk# | Utt# | Syl# | Hours | Remark |
|---|---|---|---|---|---|---|---|
| Treebank | TrainTB | Training of the HPM, the AM for forced alignment, and the models for the HMM-based speech synthesizer | 1 | 376 | 51,868 | 3.9 | |
| Treebank | TestTB | Evaluation of prosody coding | 1 | 44 | 3,898 | 0.3 | |
| TCC300 | TrainTC1 | Training of the AM for forced alignment | 274 | 8,036 | 300,728 | 23.9 | Includes all of Set A and 90% of Set B |
| TCC300 | TrainTC2 | Training of the HPM | 164 | 962 | 106,955 | 8.3 | Subset of TrainTC1 |
| TCC300 | TestTC | Evaluation of prosody coding and adaptation of the HMM model for speech synthesis | 19 | 226 | 26,357 | 2.4 | Selected from Set B of TCC300 |