Table 8 Language model perplexity and performance of the Slovak LVCSR system with gender-dependent acoustic models

From: Classification of heterogeneous text data for robust domain-specific language modeling

  

All four result column groups use APD1+APD2 +PAR 340 h:

(1) sp. adapt.: female, eval. set: gender-bal.
(2) sp. adapt.: male, eval. set: gender-bal.
(3) sp. adapt.: female, eval. set: female sp.
(4) sp. adapt.: male, eval. set: male sp.

The Weighting and Similarity columns describe the text classification.

| PPL | Weighting | Similarity | (1) Acc % | (1) Corr % | (2) Acc % | (2) Corr % | (3) Acc % | (3) Corr % | (4) Acc % | (4) Corr % |
|---|---|---|---|---|---|---|---|---|---|---|
| 40.4302 | Reference language model | | 90.15 | 91.68 | 92.72 | 93.80 | 95.72 | 96.48 | 94.10 | 94.87 |
| 36.0428 | tf-idf | Bhattacharyya | 91.23 | 92.50 | 93.23 | 94.18 | 95.97 | 96.68 | 94.34 | 95.06 |
| 35.9444 | tf-idf | Jaccard index | 91.26 | 92.55 | 93.24 | 94.22 | 95.98 | 96.68 | 94.73 | 95.11 |
| 38.1756 | tf-idf | Jensen-Shannon | 90.71 | 92.10 | 92.92 | 93.94 | 95.81 | 96.54 | 94.23 | 94.94 |
| 38.1289 | Okapi | Bhattacharyya | 90.95 | 92.23 | 93.03 | 94.01 | 95.88 | 96.59 | 94.25 | 94.96 |
| 39.9782 | Okapi | Jaccard index | 90.59 | 91.99 | 92.82 | 93.84 | 95.81 | 96.53 | 94.17 | 94.90 |
| 39.2267 | Okapi | Jensen-Shannon | 90.93 | 92.27 | 93.00 | 93.97 | 95.94 | 96.65 | 94.17 | 94.89 |
| 40.1325 | Ltu | Bhattacharyya | 90.19 | 91.70 | 92.72 | 93.78 | 95.73 | 96.49 | 94.10 | 94.85 |
| 40.1439 | Ltu | Jaccard index | 90.18 | 91.70 | 92.73 | 93.78 | 95.76 | 96.51 | 94.11 | 94.86 |
| 40.1319 | Ltu | Jensen-Shannon | 90.18 | 91.70 | 92.72 | 93.78 | 95.73 | 96.49 | 94.10 | 94.85 |