
Table 4 Performance of the pre-trained model and the fine-tuned models with LL recorded echo

From: Nonlinear residual echo suppression based on dual-stream DPRNN

Echo               Model    PESQ   SDR     STOI
Artificial speech  LAEC     1.48   −2.60   0.622
                   Time     2.61   12.3    0.866
                   Time_1   2.59   12.1    0.866
                   Time_2   2.60   12.0    0.864
                   TF       2.75   12.4    0.880
                   TF_1     2.73   12.4    0.879
                   TF_2     2.70   12.2    0.875
Artificial music   LAEC     1.48   −2.90   0.634
                   Time     2.50   11.5    0.842
                   Time_1   2.47   11.4    0.842
                   Time_2   2.47   11.2    0.840
                   TF       2.62   11.4    0.857
                   TF_1     2.61   11.4    0.856
                   TF_2     2.58   11.2    0.852
ER speech          LAEC     1.61   −2.05   0.697
                   Time     2.68   11.7    0.892
                   Time_1   2.70   11.9    0.894
                   Time_2   2.72   11.8    0.894
                   TF       2.77   11.3    0.904
                   TF_1     2.81   11.7    0.906
                   TF_2     2.83   11.6    0.908
ER music           LAEC     1.70   −1.12   0.730
                   Time     2.75   12.6    0.900
                   Time_1   2.75   12.7    0.900
                   Time_2   2.77   12.8    0.902
                   TF       2.79   11.9    0.907
                   TF_1     2.83   12.2    0.909
                   TF_2     2.86   12.3    0.911
LL speech          LAEC     1.95   1.67    0.806
                   Time     3.00   15.6    0.932
                   Time_1   3.02   15.9    0.934
                   Time_2   3.07   16.3    0.936
                   TF       3.02   15.3    0.938
                   TF_1     3.08   15.8    0.940
                   TF_2     3.22   16.5    0.947
LL music           LAEC     1.97   2.16    0.820
                   Time     3.07   16.0    0.935
                   Time_1   3.07   16.2    0.936
                   Time_2   3.10   16.4    0.939
                   TF       3.12   15.8    0.944
                   TF_1     3.17   16.1    0.945
                   TF_2     3.24   16.5    0.948