Table 3 Performance of the pre-trained model and the fine-tuned models with ER recorded echo

From: Nonlinear residual echo suppression based on dual-stream DPRNN

Echo               Model    PESQ    SDR     STOI
Artificial speech  LAEC     1.48    −2.60   0.622
                   Time     2.61    12.3    0.866
                   Time_1   2.56    12.2    0.865
                   Time_2   2.57    12.1    0.864
                   TF       2.75    12.4    0.880
                   TF_1     2.70    12.4    0.875
                   TF_2     2.69    12.3    0.875
Artificial music   LAEC     1.48    −2.90   0.634
                   Time     2.50    11.5    0.842
                   Time_1   2.44    11.4    0.841
                   Time_2   2.46    11.3    0.841
                   TF       2.62    11.4    0.857
                   TF_1     2.58    11.3    0.853
                   TF_2     2.57    11.3    0.852
ER speech          LAEC     1.61    −2.05   0.697
                   Time     2.68    11.7    0.892
                   Time_1   2.70*   12.0*   0.894*
                   Time_2   2.75*   12.5*   0.899*
                   TF       2.77    11.3    0.904
                   TF_1     2.80*   11.9*   0.905*
                   TF_2     2.88*   12.4*   0.912*
ER music           LAEC     1.70    −1.12   0.730
                   Time     2.75    12.6    0.900
                   Time_1   2.76*   12.8*   0.901*
                   Time_2   2.80*   13.0*   0.906*
                   TF       2.79    11.9    0.907
                   TF_1     2.83*   12.3*   0.908*
                   TF_2     2.91*   12.6*   0.914*
LL speech          LAEC     1.95    1.67    0.806
                   Time     3.00    15.6    0.932
                   Time_1   3.00    15.8    0.933
                   Time_2   3.03    16.1    0.935
                   TF       3.02    15.3    0.938
                   TF_1     3.08    15.8    0.939
                   TF_2     3.13    16.1    0.943
LL music           LAEC     1.97    2.16    0.820
                   Time     3.07    16.0    0.935
                   Time_1   3.03    16.1    0.936
                   Time_2   3.04    16.2    0.937
                   TF       3.12    15.8    0.944
                   TF_1     3.17    16.1    0.944
                   TF_2     3.18    16.2    0.946

* Value shown in blue in the original table.