Table 1 Reconstruction error of audio source separation using frequency filter banks as input

From: Learning long-term filter banks for audio source separation and audio scene classification

| Init | Method | Re_toep (M/V = 0.1) | Re_toep (M/V = 1) | Re_toep (M/V = 10) | Re_inv (M/V = 0.1) | Re_inv (M/V = 1) | Re_inv (M/V = 10) |
|---|---|---|---|---|---|---|---|
| – | TriFB-Null | 3.49 | 1.51 | 0.55 | 3.49 | 1.51 | 0.55 |
| – | GaussFB-Null | 3.28 | 1.47 | 0.58 | 3.28 | 1.47 | 0.58 |
| – | TriFB-CNN-1layer | 2.85 | 1.51 | 0.61 | 2.85 | 1.51 | 0.61 |
| – | GaussFB-CNN-1layer | 2.91 | 1.50 | 0.64 | 2.91 | 1.50 | 0.64 |
| – | TriFB-GaussLTFB | 2.66 | 1.38 | 0.50 | 3.65 | 1.80 | 0.74 |
| – | GaussFB-GaussLTFB | 2.60 | 1.39 | 0.56 | 3.91 | 1.67 | 0.67 |
| Random | TriFB-FullLTFB | 3.90 | 41.37 | 2.28 | 3.84 | 1.83 | 0.78 |
| Random | GaussFB-FullLTFB | 3.55 | 1.99 | 0.86 | 3.85 | 1.64 | 0.66 |
| Identity | TriFB-FullLTFB | 2.69 | 1.39 | 0.52 | 3.92 | 1.63 | 0.62 |
| Identity | GaussFB-FullLTFB | 2.62 | 1.39 | 0.56 | 3.85 | 1.51 | 0.59 |

  1. M/V represents the energy ratio between music and voice
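
As an illustration of the M/V ratio in the note above, the following is a minimal sketch of how a music signal could be rescaled so that its energy relative to a voice signal matches a target ratio such as 0.1, 1, or 10. It assumes that energy means the sum of squared samples and that M/V is a linear ratio rather than a value in decibels; the source does not state either point. The function names and the random test signals are hypothetical and are not taken from the paper.

```python
import math
import random


def energy(signal):
    """Signal energy, assumed here to be the sum of squared samples."""
    return sum(x * x for x in signal)


def energy_ratio(music, voice):
    """Linear energy ratio M/V between a music and a voice signal."""
    return energy(music) / energy(voice)


def scale_music_to_ratio(music, voice, target_ratio):
    """Return a copy of music rescaled so that M/V equals target_ratio."""
    # Energy scales with the square of the amplitude gain, hence the sqrt.
    gain = math.sqrt(target_ratio / energy_ratio(music, voice))
    return [gain * x for x in music]


# Hypothetical stand-ins for real recordings: two white-noise signals.
random.seed(0)
music = [random.gauss(0.0, 1.0) for _ in range(1000)]
voice = [random.gauss(0.0, 1.0) for _ in range(1000)]

# Build one mixture per M/V setting used in the table columns.
mixtures = {}
for target in (0.1, 1.0, 10.0):
    scaled = scale_music_to_ratio(music, voice, target)
    mixtures[target] = [m + v for m, v in zip(scaled, voice)]
```

Under these assumptions, M/V = 0.1 corresponds to a mixture dominated by voice and M/V = 10 to a mixture dominated by music.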