
Table 5 Architecture used for siamese and prototypical networks

From: Deep neural networks for automatic speech processing: a survey from large corpora to limited data

| Layer number | Layer type | Parameters |
|---|---|---|
| 0 | Input data | MFCC with a windowing of 25 ms and a 10 ms stride |
| 1 | Stacked bidirectional GRUs | 5 GRUs of 256 cells each |
| 2 | Dropout | Of 0.2 |
| 3 | Batch normalization | For each direction |
| 4 | Linear layer | 128 filters |
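To make the sizes in the table concrete, the following sketch traces the feature dimension through each layer and counts learnable parameters. It is an illustration under stated assumptions, not the authors' implementation. The table does not give the number of MFCC coefficients, so `n_mfcc = 13` is a hypothetical default. The sketch also assumes that "256 cells" is per direction, that "128 filters" means 128 output units, and that each GRU gate has two bias vectors (the convention used by PyTorch's `nn.GRU`). The names `EncoderConfig`, `num_frames`, `gru_params`, and `trace` are introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class EncoderConfig:
    n_mfcc: int = 13          # ASSUMPTION: coefficient count is not given in the table
    window_ms: int = 25       # layer 0: analysis window
    stride_ms: int = 10       # layer 0: hop between windows
    gru_layers: int = 5       # layer 1: number of stacked GRUs
    gru_cells: int = 256      # layer 1: ASSUMED to be per direction
    dropout: float = 0.2      # layer 2: no learnable parameters
    embedding_dim: int = 128  # layer 4: "128 filters" read as output units


def num_frames(duration_ms: int, cfg: EncoderConfig) -> int:
    """Number of full analysis windows in a signal, assuming no padding."""
    if duration_ms < cfg.window_ms:
        return 0
    return 1 + (duration_ms - cfg.window_ms) // cfg.stride_ms


def gru_params(input_dim: int, hidden: int) -> int:
    """Parameters of one unidirectional GRU layer: three gates, each with
    input weights, recurrent weights, and two bias vectors (assumed)."""
    return 3 * (input_dim * hidden + hidden * hidden + 2 * hidden)


def trace(cfg: EncoderConfig) -> list:
    """Return (layer type, output feature size, parameter count) per layer."""
    rows = [("Input data", cfg.n_mfcc, 0)]
    dim, gru_total = cfg.n_mfcc, 0
    for _ in range(cfg.gru_layers):
        gru_total += 2 * gru_params(dim, cfg.gru_cells)  # forward + backward
        dim = 2 * cfg.gru_cells  # both directions concatenated
    rows.append(("Stacked bidirectional GRUs", dim, gru_total))
    rows.append(("Dropout", dim, 0))
    # One scale and one shift per feature; the count is the same whether the
    # two directions are normalized separately or jointly.
    rows.append(("Batch normalization", dim, 2 * dim))
    linear = dim * cfg.embedding_dim + cfg.embedding_dim  # weights + bias
    rows.append(("Linear layer", cfg.embedding_dim, linear))
    return rows


if __name__ == "__main__":
    cfg = EncoderConfig()
    print("frames in 1 s of audio:", num_frames(1000, cfg))
    for name, size, params in trace(cfg):
        print(f"{name:28s} out={size:4d} params={params}")
```

Under these assumptions almost all parameters sit in the recurrent stack, while the linear layer only projects the 512 concatenated features down to the 128-dimensional embedding that the siamese or prototypical objective compares. Changing `n_mfcc` affects only the first GRU layer's input weights.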