Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification

Table 15 Distant-talking speaker identification rates for evaluation data (%)

Method	RT60 of test data (s) (RWCP data)					Ave.
	0.38	0.47	0.60	0.78	1.30
(a) Conventional methods
CMN	79.70	76.05	75.55	74.40	75.75	76.29
MCLMS-SS	82.25	79.70	78.75	78.05	81.30	80.01
MSLP-SS	82.85	78.60	78.50	78.00	75.70	78.73
BF-MLP	72.35	69.30	64.05	64.90	63.25	66.70
(b) DNN-based feature transformation methods
BF-DNN	87.90	84.95	82.45	84.00	82.15	84.29
DAE	92.10	89.70	87.60	89.45	88.10	89.39
DAE + BF-DNN	94.20	92.20	90.65	91.95	90.70	91.94
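
The final column can be cross-checked against the per-condition rates. The following is a minimal sketch, assuming that "Ave." is the unweighted arithmetic mean over the five RT60 conditions (the table does not state this explicitly). The names TABLE_15, recomputed_average, and mismatched_rows are hypothetical helpers introduced here for illustration only.

```python
# Rates copied from Table 15; list order follows RT60 = 0.38, 0.47, 0.60, 0.78, 1.30 s.
# Each entry is (per-condition rates in %, average as listed in the table).
# Assumption: "Ave." is the unweighted mean of the five conditions.
TABLE_15 = {
    "CMN":          ([79.70, 76.05, 75.55, 74.40, 75.75], 76.29),
    "MCLMS-SS":     ([82.25, 79.70, 78.75, 78.05, 81.30], 80.01),
    "MSLP-SS":      ([82.85, 78.60, 78.50, 78.00, 75.70], 78.73),
    "BF-MLP":       ([72.35, 69.30, 64.05, 64.90, 63.25], 66.70),
    "BF-DNN":       ([87.90, 84.95, 82.45, 84.00, 82.15], 84.29),
    "DAE":          ([92.10, 89.70, 87.60, 89.45, 88.10], 89.39),
    "DAE + BF-DNN": ([94.20, 92.20, 90.65, 91.95, 90.70], 91.94),
}


def recomputed_average(rates):
    """Unweighted mean of the per-condition rates, rounded to two decimals."""
    return round(sum(rates) / len(rates), 2)


def mismatched_rows(table, tol=0.005):
    """Return the methods whose listed average differs from the recomputed one."""
    return [
        method
        for method, (rates, listed) in table.items()
        if abs(recomputed_average(rates) - listed) > tol
    ]


if __name__ == "__main__":
    for method, (rates, listed) in TABLE_15.items():
        mean = recomputed_average(rates)
        print(f"{method:12s} recomputed={mean:.2f} listed={listed:.2f}")
    print("rows to double-check:", mismatched_rows(TABLE_15))
```

Under that assumption, six of the seven rows reproduce the listed average exactly. The BF-MLP row recomputes to 66.77 against the listed 66.70, so that row may be worth checking against the original publication.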