From: Accent modification for speech recognition of non-native speakers using neural style transfer
Layer | Output shape | Parameters |
---|---|---|
Conv2D (F=32, K=3) | B, X, T, 32 | 417344 |
Conv2D (F=64, K=3) | B, X, T, 64 | 18496 |
Conv2D (F=128, K=3) | B, X, T, 128 | 73856 |
Conv2D (F=128, K=3) | B, X, T, 128 | 147584 |
Conv2D (F=128, K=3) | B, X, T, 128 | 147584 |
Conv2D (F=128, K=3) | B, X, T, 128 | 147584 |
Conv2D (F=128, K=3) | B, X, T, 128 | 147584 |
Conv2D (F=128, K=3) | B, X, T, 128 | 147584 |
Conv2D (F=128, K=3) | B, X, T, 128 | 147584 |
Conv2D (F=128, K=3) | B, X, T, 128 | 147584 |
Conv2D (F=128, K=3) | B, X, T, 128 | 147584 |
Conv2D (F=128, K=3) | B, X, T, 128 | 147584 |
Conv2D (F=128, K=3) | B, X, T, 128 | 147584 |
Conv2DTr (F=64, K=3) | B, X, T, 64 | 73792 |
Conv2DTr (F=32, K=3) | B, X, T, 32 | 18464 |
Conv2D (F=32, K=3) | B, X, T, 32 | 82976 |
 | Total params: | 2,160,768 |
 | Trainable params: | 2,160,768 |
 | Non-trainable params: | 0 |
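
The parameter counts in the table can be cross-checked with a short sketch. It assumes a standard 2D convolution with a square kernel and one bias per filter, so that a layer holds (K·K·C_in + 1)·F parameters, and that the transposed convolutions are counted the same way. The helper name `conv2d_params` is hypothetical and not taken from the paper. Under these assumptions every row from the second to the second-to-last follows from K=3 and the channel count of the preceding row. The first and last rows do not follow from K=3 and the visible channel counts alone. Their values are numerically consistent with, for example, K=9 with 161 and 32 input channels respectively, but this is only an observation about the arithmetic, not something the table states, so those two counts are copied verbatim below.

```python
def conv2d_params(c_in, filters, kernel, bias=True):
    """Parameter count of a 2D convolution with a square kernel.

    Assumes kernel*kernel*c_in weights per filter plus one optional bias.
    """
    return (kernel * kernel * c_in + (1 if bias else 0)) * filters


# Rows that follow from K=3 and the previous row's output channels.
middle_rows = (
    [conv2d_params(32, 64, 3)]            # Conv2D (F=64)
    + [conv2d_params(64, 128, 3)]         # first Conv2D (F=128)
    + [conv2d_params(128, 128, 3)] * 10   # remaining ten Conv2D (F=128) rows
    + [conv2d_params(128, 64, 3)]         # Conv2DTr (F=64), same formula assumed
    + [conv2d_params(64, 32, 3)]          # Conv2DTr (F=32), same formula assumed
)

# First and last rows: counts copied from the table, since the input
# channel count (or effective kernel size) is not given there.
first_row = 417344
last_row = 82976

total = first_row + sum(middle_rows) + last_row
print(middle_rows)
print(total)
```

Summing the sixteen rows in this way reproduces the reported total of 2,160,768 parameters, so the table is internally consistent.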