Nonlinear residual echo suppression based on dual-stream DPRNN

EURASIP Journal on Audio, Speech, and Music Processing

Table 2 Performance of our proposed methods and several typical RES methods

Echo type	Model	PESQ	SDR (dB)	STOI
Artificial speech	LAEC	1.48	−2.60	0.622
	LSTM	2.14	6.33	0.780
	MSTasNet	2.54	11.6	0.857
	DSDPRNN_ty	2.61	12.3	0.866
	DSDPRNN_tx	2.66	12.8	0.876
	DSDPRNN_fy	2.75	12.4	0.880
	DSDPRNN_fx	2.74	12.5	0.882
Artificial music	LAEC	1.48	−2.90	0.634
	LSTM	2.08	5.46	0.755
	MSTasNet	2.43	10.7	0.830
	DSDPRNN_ty	2.50	11.5	0.842
	DSDPRNN_tx	2.61	12.6	0.865
	DSDPRNN_fy	2.62	11.4	0.857
	DSDPRNN_fx	2.64	11.6	0.863
ER speech	LAEC	1.61	−2.05	0.697
	LSTM	2.13	4.85	0.799
	MSTasNet	2.66	11.6	0.890
	DSDPRNN_ty	2.68	11.7	0.892
	DSDPRNN_tx	2.62	11.5	0.887
	DSDPRNN_fy	2.77	11.3	0.904
	DSDPRNN_fx	2.66	10.6	0.895
ER music	LAEC	1.70	−1.12	0.730
	LSTM	2.25	5.95	0.826
	MSTasNet	2.72	12.2	0.898
	DSDPRNN_ty	2.75	12.6	0.900
	DSDPRNN_tx	2.68	12.3	0.897
	DSDPRNN_fy	2.79	11.9	0.907
	DSDPRNN_fx	2.76	11.7	0.907
LL speech	LAEC	1.95	1.67	0.806
	LSTM	2.55	9.23	0.884
	MSTasNet	2.99	15.0	0.932
	DSDPRNN_ty	3.00	15.6	0.932
	DSDPRNN_tx	2.87	14.9	0.920
	DSDPRNN_fy	3.02	15.3	0.938
	DSDPRNN_fx	3.04	15.7	0.938
LL music	LAEC	1.97	2.16	0.820
	LSTM	2.60	9.07	0.889
	MSTasNet	3.04	15.6	0.934
	DSDPRNN_ty	3.07	16.0	0.935
	DSDPRNN_tx	2.89	14.8	0.921
	DSDPRNN_fy	3.12	15.8	0.944
	DSDPRNN_fx	3.13	16.0	0.943
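
To make the SDR column easier to read, the following is a minimal sketch of a plain energy-ratio SDR, 10·log10(‖s‖² / ‖s − ŝ‖²), in dB. It is an illustration only: the function name `sdr_db` and the toy signals are hypothetical, and the evaluation behind Table 2 may use a different SDR variant (for example, one with an allowed scaling or filtering of the reference).

```python
import math


def sdr_db(reference, estimate):
    """Plain energy-ratio SDR in dB (assumed definition, not taken from the paper)."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    # Energy of the clean target signal.
    signal_energy = sum(s * s for s in reference)
    # Energy of the residual error between target and estimate.
    error_energy = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    if error_energy == 0.0:
        return math.inf   # perfect reconstruction
    if signal_energy == 0.0:
        return -math.inf  # silent target, non-zero error
    return 10.0 * math.log10(signal_energy / error_energy)


# Toy signals (hypothetical): a 10 % amplitude error versus a sign-flipped estimate.
target = [1.0, -1.0, 1.0, -1.0]
close_estimate = [0.9, -0.9, 0.9, -0.9]
poor_estimate = [-1.0, 1.0, -1.0, 1.0]

print(round(sdr_db(target, close_estimate), 2))
print(round(sdr_db(target, poor_estimate), 2))
```

Under this definition a negative value, such as the LAEC rows for the artificial and ER echoes, means the residual error carries more energy than the target signal itself.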