Table 6 Differences in the text processing and language modeling during the recent time periods

From: Classification of heterogeneous text data for robust domain-specific language modeling

Period                          Dec 2011   Jul 2012   Dec 2012   Apr 2013   May 2013
No. of pronunciation variants   475,156    475,357    474,456    474,453    474,453
No. of unique word forms        326,299    326,295    325,555    325,555    325,555
No. of words under classes      97,471     97,680     97,678     97,678     97,678
No. of classes of words         20         22         22         22         22
No. of transparent words        4          5          5          5          5
Vocabulary extension -
Word classes extension - - -
Adding new text data - -
Additional text processing -
Filled pause modeling -
New text classification - - -
• Change was performed.