Table 7 Language model perplexity and performance of Slovak LVCSR system with different acoustic models

From: Classification of heterogeneous text data for robust domain-specific language modeling

  

Speaker adaptation: none; evaluation set: gender-balanced (all four acoustic models).

| PPL | Text classification: weighting | Text classification: similarity | APD1+APD2 250 h (table mic.) Acc % | APD1+APD2 250 h (table mic.) Corr % | APD1+APD2 250 h (close-talk mic.) Acc % | APD1+APD2 250 h (close-talk mic.) Corr % | APD1+APD2+PAR 340 h Acc % | APD1+APD2+PAR 340 h Corr % | APD1+APD2+PAR+BN 520 h Acc % | APD1+APD2+PAR+BN 520 h Corr % |
|---|---|---|---|---|---|---|---|---|---|---|
| 40.4302 | Reference language model | | 91.84 | 93.08 | 93.61 | 94.51 | 94.36 | 95.13 | 94.06 | 94.89 |
| 36.0428 | tf-idf | Bhattacharyya | 92.44 | 93.64 | 93.99 | 94.85 | 94.70 | 95.46 | 94.36 | 95.13 |
| 35.9444 | tf-idf | Jaccard index | 92.46 | 93.65 | 93.97 | 94.85 | 94.72 | 95.47 | 94.37 | 95.16 |
| 38.1756 | tf-idf | Jensen-Shannon | 92.23 | 93.39 | 93.78 | 94.70 | 94.50 | 95.25 | 94.21 | 94.99 |
| 38.1289 | Okapi | Bhattacharyya | 92.17 | 93.34 | 93.77 | 94.65 | 94.61 | 95.34 | 94.27 | 95.02 |
| 39.9782 | Okapi | Jaccard index | 92.10 | 93.31 | 93.60 | 94.54 | 94.48 | 95.21 | 94.11 | 94.89 |
| 39.2267 | Okapi | Jensen-Shannon | 92.27 | 93.42 | 93.77 | 94.67 | 94.61 | 95.36 | 94.18 | 94.95 |
| 40.1325 | Ltu | Bhattacharyya | 91.86 | 93.12 | 93.57 | 94.51 | 94.42 | 95.16 | 94.05 | 94.87 |
| 40.1439 | Ltu | Jaccard index | 91.87 | 93.12 | 93.56 | 94.50 | 94.40 | 95.16 | 94.04 | 94.87 |
| 40.1319 | Ltu | Jensen-Shannon | 91.87 | 93.12 | 93.57 | 94.51 | 94.42 | 95.16 | 94.05 | 94.87 |
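
To put the differences in Table 7 on a common scale, the short sketch below computes the relative perplexity reduction and the accuracy gain of the lowest-perplexity configuration (tf-idf weighting with the Jaccard index) over the reference language model, using the APD1+APD2 250 h (table mic.) column. The variable and function names are illustrative and do not come from the article. The word error rate step assumes that Acc % is word accuracy in the usual sense, so that WER = 100 - Acc; the table itself does not state this.

```python
# Values copied from Table 7: PPL and Acc % for the
# APD1+APD2 250 h (table mic.) acoustic model.
reference = {"ppl": 40.4302, "acc": 91.84}   # reference language model
best = {"ppl": 35.9444, "acc": 92.46}        # tf-idf weighting, Jaccard index


def relative_reduction(baseline: float, value: float) -> float:
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


# Relative perplexity reduction of the adapted model over the reference.
ppl_reduction = relative_reduction(reference["ppl"], best["ppl"])

# Absolute gain in accuracy, in percentage points.
acc_gain = best["acc"] - reference["acc"]

# ASSUMPTION: Acc % is word accuracy, so WER = 100 - Acc.
wer_reduction = relative_reduction(100.0 - reference["acc"], 100.0 - best["acc"])

print(f"Relative PPL reduction: {ppl_reduction:.2f} %")
print(f"Absolute accuracy gain: {acc_gain:.2f} points")
print(f"Relative WER reduction: {wer_reduction:.2f} % (under the WER assumption)")
```

Under these assumptions the perplexity falls by roughly 11 % relative, while accuracy rises by 0.62 percentage points. The same helper can be applied to any other row or acoustic-model column of the table.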