Table 1 Various hyperparameters chosen for Vi-T and Swin-T

From: Transformer-based ensemble method for multiple predominant instruments recognition in polyphonic music

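| Hyperparameter      | Vi-T       | Swin-T   |
|---------------------|------------|----------|
| Image size          | 72 × 72    | 72 × 72  |
| Patch dimension     | 6 × 6      | 4 × 4    |
| Hyperparameter (C)  | 64         | 96       |
| Number of heads     | 8          | 8        |
| Number of windows   | NA         | 4        |
| Number of MLP nodes | 2048, 1048 | 256, 256 |
| Mini-batch size     | 256        | 32       |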
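
The image size and patch dimension in Table 1 determine how many patches each model sees per input. The following is a minimal sketch of that arithmetic, assuming square inputs split into non-overlapping square patches. The class name `PatchConfig` and the variables `VIT` and `SWIN` are hypothetical names introduced for illustration; they are not taken from the article's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchConfig:
    """Hypothetical container for the two patch-related values in Table 1."""

    image_size: int  # side length of the square input, in pixels
    patch_size: int  # side length of one square patch, in pixels

    def num_patches(self) -> int:
        # Assumption: patches do not overlap and tile the image exactly.
        if self.image_size % self.patch_size != 0:
            raise ValueError("image_size must be divisible by patch_size")
        per_side = self.image_size // self.patch_size
        return per_side * per_side


# Values copied from Table 1.
VIT = PatchConfig(image_size=72, patch_size=6)
SWIN = PatchConfig(image_size=72, patch_size=4)

print(VIT.num_patches())   # 12 x 12 = 144 patches
print(SWIN.num_patches())  # 18 x 18 = 324 patches
```

Under these assumptions, the smaller 4 × 4 patches give Swin-T a grid of 324 initial patches, compared with 144 for Vi-T on the same 72 × 72 input.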