From: Deep neural networks for automatic speech processing: a survey from large corpora to limited data
Layer number | Layer type | Parameters |
---|---|---|
0 | Input data | MFCCs computed over 25 ms windows with a 10 ms stride |
1 | Stacked bidirectional GRUs | 5 GRUs of 256 cells each |
2 | Dropout | Rate of 0.2 |
3 | Batch normalization | For each direction |
4 | Linear layer | 128 output units |
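
To make the sizes in the table concrete, the following is a minimal sketch that walks through the frame count and layer widths using plain Python arithmetic. Several values are assumptions not stated in the table: the number of MFCC coefficients (`N_MFCC = 13`), the reading of "256 cells" as 256 per direction, framing without padding of the final window, and the parameter layout of a GRU (input weights, recurrent weights, and two bias vectors per gate set, as in PyTorch's `nn.GRU`). The function names are hypothetical helpers, not part of the surveyed work.

```python
# Assumed values (not given in the table): 13 MFCC coefficients per frame,
# and 256 GRU cells per direction.
N_MFCC = 13
HIDDEN = 256
NUM_GRU_LAYERS = 5
LINEAR_OUT = 128


def num_frames(duration_ms, window_ms=25, stride_ms=10):
    """Number of full analysis windows that fit in the signal (no padding)."""
    if duration_ms < window_ms:
        return 0
    return 1 + (duration_ms - window_ms) // stride_ms


def gru_param_count(input_size, hidden_size, num_layers, bidirectional=True):
    """Trainable parameters of a stacked GRU.

    Assumes three gate blocks per direction, each with input weights,
    recurrent weights, and two bias vectors.
    """
    directions = 2 if bidirectional else 1
    total = 0
    for layer in range(num_layers):
        # Layers after the first receive the concatenated outputs
        # of both directions from the layer below.
        in_size = input_size if layer == 0 else hidden_size * directions
        weights = 3 * hidden_size * (in_size + hidden_size)
        biases = 2 * 3 * hidden_size
        total += directions * (weights + biases)
    return total


def linear_param_count(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


# Width of the features leaving the bidirectional GRU stack:
# forward and backward states are concatenated.
gru_out_width = 2 * HIDDEN

if __name__ == "__main__":
    frames = num_frames(1000)  # one second of audio
    print("frames per second of audio:", frames)
    print("input shape  (frames, features):", (frames, N_MFCC))
    print("GRU output   (frames, features):", (frames, gru_out_width))
    print("final output (frames, features):", (frames, LINEAR_OUT))
    print("GRU parameters:", gru_param_count(N_MFCC, HIDDEN, NUM_GRU_LAYERS))
    print("linear parameters:", linear_param_count(gru_out_width, LINEAR_OUT))
```

Dropout adds no trainable parameters, and batch normalization adds only a scale and shift per normalized feature, so under these assumptions the recurrent stack dominates the model size.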