Table 1 Proposed architecture for discriminator

From: Predominant audio source separation in polyphonic music

| Output size | Description |
| --- | --- |
| 3 × 256 × 256 | Input spectrogram |
| 64 × 128 × 128 | 4 × 4 Conv, 64 filters, stride 2, pad 1 |
| 64 × 128 × 128 | Leaky ReLU (\(\alpha = 0.2\)) |
| 128 × 64 × 64 | 4 × 4 Conv, 128 filters, stride 2, pad 1 |
| 128 × 64 × 64 | Instance normalization |
| 128 × 64 × 64 | Leaky ReLU (\(\alpha = 0.2\)) |
| 256 × 32 × 32 | 4 × 4 Conv, 256 filters, stride 2, pad 1 |
| 256 × 32 × 32 | Instance normalization |
| 256 × 32 × 32 | Leaky ReLU (\(\alpha = 0.2\)) |
| 512 × 31 × 31 | 4 × 4 Conv, 512 filters, stride 1, pad 1 |
| 1 × 4 × 4 | 4 × 4 Conv, stride 1, pad 1 |
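
The sizes in the table can be checked with the standard convolution output-size formula, \(\lfloor (n + 2p - k)/s \rfloor + 1\). The following is a minimal sketch, not the authors' code. The names `conv_out`, `LAYERS`, and `trace_shapes` are hypothetical, and the filter count of each convolution is assumed to equal the channel count listed for its output (1 for the final layer). Instance normalization and Leaky ReLU are omitted because they do not change the tensor shape.

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a convolution (floor formula, no dilation)."""
    return (size + 2 * pad - kernel) // stride + 1


# (filters, kernel, stride, pad) for the five convolutions in Table 1.
# Assumption: each filter count equals the channel count listed for the
# layer's output; the final layer is assumed to have a single filter.
LAYERS = [
    (64, 4, 2, 1),
    (128, 4, 2, 1),
    (256, 4, 2, 1),
    (512, 4, 1, 1),
    (1, 4, 1, 1),
]


def trace_shapes(channels=3, size=256):
    """Return the (channels, height, width) shape after each convolution."""
    shapes = [(channels, size, size)]
    for filters, kernel, stride, pad in LAYERS:
        size = conv_out(size, kernel, stride, pad)
        shapes.append((filters, size, size))
    return shapes


if __name__ == "__main__":
    for shape in trace_shapes():
        print(" x ".join(str(d) for d in shape))
```

Under these assumptions the trace reproduces the listed sizes through 512 × 31 × 31. For the final layer the formula gives 1 × 30 × 30 rather than the 1 × 4 × 4 listed, so that last entry may reflect a detail not captured in the table.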