Table 5 Average tonal syllable recognition rate (%) after eigenphone-based speaker adaptation using sparse group lasso

From: Speaker adaptation based on regularized speaker-dependent eigenphone matrix estimation

λ3    Number of adaptation sentences
      1             2             4             6             8             10
10    53.78         56.57         58.14         59.06         60.05         60.91
      (0.61, 0.01)  (0.47, 0.0)   (0.31, 0.0)   (0.22, 0.0)   (0.18, 0.0)   (0.15, 0.0)
20    54.76         56.74         58.29         59.21         60.18         60.93
      (0.62, 0.01)  (0.45, 0.0)   (0.31, 0.0)   (0.22, 0.0)   (0.18, 0.0)   (0.15, 0.0)
30    54.55         56.86         58.55         59.53         60.20         61.25
      (0.63, 0.02)  (0.44, 0.0)   (0.32, 0.0)   (0.23, 0.0)   (0.18, 0.0)   (0.15, 0.0)
40    54.49         56.65         58.35         59.32         60.11         60.93
      (0.63, 0.05)  (0.43, 0.0)   (0.31, 0.0)   (0.23, 0.0)   (0.18, 0.0)   (0.16, 0.0)
80    54.13         56.04         57.72         58.92         59.90         60.43
      (0.78, 0.37)  (0.45, 0.02)  (0.33, 0.0)   (0.23, 0.0)   (0.19, 0.0)   (0.16, 0.0)
120   54.05         54.95         57.01         58.35         59.38         60.24
      (0.91, 0.76)  (0.58, 0.21)  (0.35, 0.01)  (0.23, 0.0)   (0.19, 0.0)   (0.16, 0.0)
  1. The number of eigenphones (N) was fixed to 100. λ1 = 10.0, λ2 = 0, and λ3 was varied between 10 and 150. The pair in parentheses below each recognition rate gives the average overall sparsity and the average column sparsity of the eigenphone matrix, in that order.
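
The two sparsity figures reported in parentheses can be illustrated with a minimal sketch. It assumes, since the table itself does not define them, that overall sparsity is the fraction of zero entries in the matrix and that column sparsity is the fraction of columns that are entirely zero. The function name sparsity_stats, the tol parameter, and the toy matrix are hypothetical and are not taken from the paper.

```python
def sparsity_stats(matrix, tol=0.0):
    """Return (overall_sparsity, column_sparsity) for a list-of-rows matrix.

    Assumed definitions (not confirmed by the source table):
      overall_sparsity = fraction of entries with |x| <= tol
      column_sparsity  = fraction of columns whose entries all have |x| <= tol
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])

    # Count individual entries that are (numerically) zero.
    zero_entries = sum(1 for row in matrix for x in row if abs(x) <= tol)

    # Count columns in which every entry is (numerically) zero.
    zero_columns = sum(
        1
        for j in range(n_cols)
        if all(abs(matrix[i][j]) <= tol for i in range(n_rows))
    )

    return zero_entries / (n_rows * n_cols), zero_columns / n_cols


# Toy 3x4 matrix with made-up values (not data from the paper).
V = [
    [0.0, 1.2, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0],
    [0.0, -0.7, 0.0, 0.0],
]

overall, column = sparsity_stats(V)
print(f"({overall:.2f}, {column:.2f})")  # prints "(0.75, 0.50)"
```

Under these assumed definitions, column sparsity can never exceed overall sparsity, which is consistent with every pair in the table: for example, (0.91, 0.76) at λ3 = 120 with one adaptation sentence.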