
Table 3 Comparison of average PESQ, STOI, and SDR for test datasets of different durations

From: Supervised Attention Multi-Scale Temporal Convolutional Network for monaural speech enhancement

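| Metrics | PESQ | | | | STOI (%) | | | | SDR | | | | OUTE | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Duration | 100 | 500 | 1000 | 1500 | 100 | 500 | 1000 | 1500 | 100 | 500 | 1000 | 1500 | 100 | 500 | 1000 | 1500 |
| Unprocessed | 2.08 | 2.08 | 2.08 | 2.08 | 91.7 | 91.7 | 91.7 | 91.7 | 14.81 | 14.81 | 14.81 | 14.81 | - | - | - | - |
| CRN | 2.13 | 2.55 | 2.59 | 2.64 | 91.3 | 93.8 | 94.0 | 94.1 | 17.69 | 19.29 | 19.38 | 19.58 | 75 | 335 | 340 | 330 |
| MSTCN | 2.67 | 2.77 | 2.78 | 2.80 | 93.7 | 94.3 | 94.5 | 94.4 | 15.873 | 16.82 | 17.16 | 17.07 | 59 | 230 | 540 | 690 |
| LSTM-IRM | 2.57 | 2.90 | 2.93 | 3.01 | 93.8 | 95.2 | 95.2 | 95.5 | 18.42 | 19.90 | 20.03 | 20.22 | 34 | 120 | 150 | 285 |
| GCRN | 2.55 | 2.85 | 2.91 | 2.96 | 92.9 | 94.4 | 94.6 | 94.9 | 18.84 | 20.83 | 20.99 | 21.40 | 93 | 355 | 400 | 525 |
| GaGNet | 2.67 | 2.98 | 2.98 | 3.02 | 93.4 | 94.9 | 95.0 | 95.1 | 19.51 | 21.04 | 21.14 | 21.44 | 50 | 230 | 260 | 300 |
| Conv-TasNet | 2.62 | 2.99 | 3.12 | 3.09 | 93.4 | 95.0 | 95.6 | 95.4 | 19.58 | 21.50 | 22.15 | 22.02 | 78 | 200 | 260 | 315 |
| DCCRN | 3.06 | 3.22 | 3.28 | 3.25 | 95.1 | 95.7 | 95.8 | 95.8 | 20.73 | 21.48 | 21.75 | 21.56 | 66 | 145 | 210 | 300 |
| DPCRN | 3.15 | 3.19 | 3.27 | 3.24 | 95.4 | 95.6 | 95.9 | 95.7 | 21.23 | 21.53 | 21.84 | 21.74 | 45 | 130 | 210 | 345 |
| SA-MSTCN\(^{1}\) | 3.16 | 3.38 | 3.44 | 3.44 | 95.4 | 96.1 | 96.3 | 96.3 | 20.53 | 21.45 | 21.70 | 21.74 | 58 | 190 | 340 | 420 |
| SA-MSTCN\(^{2}\) | 3.16 | 3.41 | 3.50 | 3.48 | 95.4 | 96.2 | 96.6 | 96.4 | 20.53 | 21.95 | 22.31 | 22.15 | 87 | 355 | 640 | 720 |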