Table 7 Language model perplexity and performance of Slovak LVCSR system with different acoustic models

From: Classification of heterogeneous text data for robust domain-specific language modeling

All acoustic models: speaker adaptation: no; evaluation set: gender-balanced. The Weighting and Similarity columns describe the text classification used.

| PPL | Weighting | Similarity | APD1+APD2, 250 h (table mic.) Acc % | APD1+APD2, 250 h (table mic.) Corr % | APD1+APD2, 250 h (close-talk mic.) Acc % | APD1+APD2, 250 h (close-talk mic.) Corr % | APD1+APD2+PAR, 340 h Acc % | APD1+APD2+PAR, 340 h Corr % | APD1+APD2+PAR+BN, 520 h Acc % | APD1+APD2+PAR+BN, 520 h Corr % |
|---|---|---|---|---|---|---|---|---|---|---|
| 40.4302 | Reference language model | | 91.84 | 93.08 | 93.61 | 94.51 | 94.36 | 95.13 | 94.06 | 94.89 |
| 36.0428 | tf-idf | Bhattacharyya | 92.44 | 93.64 | 93.99 | 94.85 | 94.70 | 95.46 | 94.36 | 95.13 |
| 35.9444 | tf-idf | Jaccard index | 92.46 | 93.65 | 93.97 | 94.85 | 94.72 | 95.47 | 94.37 | 95.16 |
| 38.1756 | tf-idf | Jensen-Shannon | 92.23 | 93.39 | 93.78 | 94.70 | 94.50 | 95.25 | 94.21 | 94.99 |
| 38.1289 | Okapi | Bhattacharyya | 92.17 | 93.34 | 93.77 | 94.65 | 94.61 | 95.34 | 94.27 | 95.02 |
| 39.9782 | Okapi | Jaccard index | 92.10 | 93.31 | 93.60 | 94.54 | 94.48 | 95.21 | 94.11 | 94.89 |
| 39.2267 | Okapi | Jensen-Shannon | 92.27 | 93.42 | 93.77 | 94.67 | 94.61 | 95.36 | 94.18 | 94.95 |
| 40.1325 | Ltu | Bhattacharyya | 91.86 | 93.12 | 93.57 | 94.51 | 94.42 | 95.16 | 94.05 | 94.87 |
| 40.1439 | Ltu | Jaccard index | 91.87 | 93.12 | 93.56 | 94.50 | 94.40 | 95.16 | 94.04 | 94.87 |
| 40.1319 | Ltu | Jensen-Shannon | 91.87 | 93.12 | 93.57 | 94.51 | 94.42 | 95.16 | 94.05 | 94.87 |