Table 5 Average tonal syllable recognition rate (%) after eigenphone-based speaker adaptation using sparse group lasso

From: Speaker adaptation based on regularized speaker-dependent eigenphone matrix estimation

| λ3 \ Number of adaptation sentences | 1 | 2 | 4 | 6 | 8 | 10 |
|---|---|---|---|---|---|---|
| 10 | 53.78 (0.61, 0.01) | 56.57 (0.47, 0.0) | 58.14 (0.31, 0.0) | 59.06 (0.22, 0.0) | 60.05 (0.18, 0.0) | 60.91 (0.15, 0.0) |
| 20 | 54.76 (0.62, 0.01) | 56.74 (0.45, 0.0) | 58.29 (0.31, 0.0) | 59.21 (0.22, 0.0) | 60.18 (0.18, 0.0) | 60.93 (0.15, 0.0) |
| 30 | 54.55 (0.63, 0.02) | 56.86 (0.44, 0.0) | 58.55 (0.32, 0.0) | 59.53 (0.23, 0.0) | 60.20 (0.18, 0.0) | 61.25 (0.15, 0.0) |
| 40 | 54.49 (0.63, 0.05) | 56.65 (0.43, 0.0) | 58.35 (0.31, 0.0) | 59.32 (0.23, 0.0) | 60.11 (0.18, 0.0) | 60.93 (0.16, 0.0) |
| 80 | 54.13 (0.78, 0.37) | 56.04 (0.45, 0.02) | 57.72 (0.33, 0.0) | 58.92 (0.23, 0.0) | 59.90 (0.19, 0.0) | 60.43 (0.16, 0.0) |
| 120 | 54.05 (0.91, 0.76) | 54.95 (0.58, 0.21) | 57.01 (0.35, 0.01) | 58.35 (0.23, 0.0) | 59.38 (0.19, 0.0) | 60.24 (0.16, 0.0) |

  1. The number of eigenphones (N) was fixed at 100, with λ1 = 10.0 and λ2 = 0; λ3 was varied between 10 and 150. The average overall sparsity and column sparsity of the eigenphone matrix are shown in parentheses as pairs.
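
The pairs in parentheses can be illustrated with a short sketch. It assumes that "overall sparsity" means the fraction of zero entries in the eigenphone matrix and that "column sparsity" means the fraction of columns that are entirely zero; the table itself does not spell out these definitions. The function name `sparsity_stats`, the `tol` parameter, and the toy matrix are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def sparsity_stats(matrix: List[List[float]], tol: float = 0.0) -> Tuple[float, float]:
    """Return (overall sparsity, column sparsity) of a row-major matrix.

    Assumed definitions (not confirmed by the source):
      overall sparsity = zero entries / all entries
      column sparsity  = all-zero columns / all columns
    An entry counts as zero when its magnitude is <= tol.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    zero_entries = sum(1 for row in matrix for x in row if abs(x) <= tol)
    zero_columns = sum(
        1 for j in range(n_cols) if all(abs(row[j]) <= tol for row in matrix)
    )
    return zero_entries / (n_rows * n_cols), zero_columns / n_cols


# Toy 3 x 4 matrix standing in for an eigenphone matrix (illustrative values only).
toy = [
    [0.0, 1.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 2.0],
    [0.0, -0.3, 0.0, 0.0],
]

overall, column = sparsity_stats(toy)
print(f"({overall:.2f}, {column:.2f})")  # prints "(0.75, 0.50)"
```

Under these assumed definitions, a pair such as (0.91, 0.76) would mean that 91% of the entries are zero and 76% of the columns are zeroed out completely, which is consistent with a larger λ3 pushing whole columns to zero when little adaptation data is available.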