Speaker adaptation based on regularized speaker-dependent eigenphone matrix estimation

EURASIP Journal on Audio, Speech, and Music Processing

Table 3 Average tonal syllable recognition rate (%) after eigenphone-based speaker adaptation using elastic net

λ₂	Number of adaptation sentences
	1	2	4	6	8	10
10	52.27	55.98	58.10	59.19	60.22	61.08
	(0.67)	(0.48)	(0.33)	(0.24)	(0.20)	(0.16)
40	52.27	55.98	58.14	59.17	60.18	61.08
	(0.67)	(0.48)	(0.33)	(0.24)	(0.20)	(0.16)
80	52.22	55.96	58.12	59.17	60.20	61.04
	(0.67)	(0.48)	(0.33)	(0.24)	(0.20)	(0.16)
120	52.22	55.98	58.16	59.17	60.16	61.08
	(0.67)	(0.48)	(0.33)	(0.24)	(0.20)	(0.16)
1,000	52.31	55.98	58.02	59.13	60.13	60.97
	(0.67)	(0.48)	(0.33)	(0.24)	(0.20)	(0.16)
2,000	52.35	55.98	58.02	59.13	60.16	60.97
	(0.67)	(0.48)	(0.33)	(0.24)	(0.20)	(0.16)

The number of eigenphones (N) was fixed at 100, with λ₁ = 10 and λ₃ = 0; λ₂ was varied between 10 and 2,000. The average overall sparsity is shown in parentheses below each recognition rate.
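
To make the quantities in the table note concrete, the following is a minimal sketch of an elastic net penalty and a sparsity measure. It rests on assumptions that this excerpt does not state: that λ₁ weights the l1 norm and λ₂ the squared l2 norm of the eigenphone matrix, and that "overall sparsity" means the fraction of zero entries. The function names and the toy matrix `V` are hypothetical and are not taken from the paper.

```python
def elastic_net_penalty(matrix, lam1, lam2):
    """Assumed elastic net form: lam1 * sum|v| + lam2 * sum v^2 over all entries."""
    l1 = sum(abs(v) for row in matrix for v in row)
    l2_squared = sum(v * v for row in matrix for v in row)
    return lam1 * l1 + lam2 * l2_squared


def sparsity(matrix, tol=0.0):
    """Fraction of entries whose magnitude is at most tol (assumed definition)."""
    entries = [v for row in matrix for v in row]
    zeros = sum(1 for v in entries if abs(v) <= tol)
    return zeros / len(entries)


# Toy 2 x 3 matrix standing in for an eigenphone matrix (illustrative values only).
V = [[0.0, 0.5, 0.0],
     [-1.0, 0.0, 0.0]]

penalty = elastic_net_penalty(V, lam1=10, lam2=40)
zero_fraction = sparsity(V)
```

Under these assumptions only the l1 term drives entries exactly to zero, which would be consistent with the sparsity values in the table staying the same across all settings of λ₂.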