From: Grid-based approximation for voice conversion in low resource environments
Input: a sequence of feature vectors related to the source speaker, \(\mathbf{x}_{1:T}\).
Initialization: set the initial weights, \(\left\{w_{0|0}^{k}\right\}_{k=1}^{N_{y}}\).
Main iteration: for \(t=1,\ldots,T\), perform the following steps:
1. Evaluate the prior weights, \(\left\{w_{t|t-1}^{k}\right\}_{k=1}^{N_{y}}\), using Eqs. (10) and (16).
2. Evaluate the posterior weights, \(\left\{w_{t|t}^{k}\right\}_{k=1}^{N_{y}}\), using Eqs. (11) and (14).
3. Evaluate \(\tilde{\mathbf{y}}_{t}=\mathcal{F}\{\mathbf{x}_{t}\}\), using Eq. (22).
Output: a sequence of converted vectors, \(\tilde{\mathbf{y}}_{1:T}\).
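
The loop above can be sketched in Python as a generic grid-based Bayesian filter. This is only an illustration under stated assumptions, not the paper's implementation: Eqs. (10), (11), (14), (16) and (22) are not reproduced in this table, so the sketch assumes that the prior weights come from a transition matrix over the \(N_{y}\) grid points, that the posterior weights are the prior weights rescaled by a likelihood and renormalized, and that the conversion function \(\mathcal{F}\) is the posterior-weighted mean of the grid points. The names `grid_filter_convert`, `grid`, `trans` and `likelihood` are hypothetical.

```python
import numpy as np


def grid_filter_convert(x_seq, grid, trans, likelihood, w0=None):
    """Sketch of a grid-based filter for frame-wise conversion.

    x_seq      : (T, Dx) source feature vectors x_{1:T}
    grid       : (Ny, Dy) target-space grid points y^k (assumed fixed)
    trans      : (Ny, Ny) assumed transition matrix, trans[j, k] = p(y^k | y^j),
                 with each row summing to 1
    likelihood : assumed callable (x_t, grid) -> (Ny,) values of p(x_t | y^k)
    w0         : optional (Ny,) initial weights w_{0|0}; uniform if omitted
    """
    n_y = grid.shape[0]
    # Initialization: set the initial weights w_{0|0}^k.
    w = np.full(n_y, 1.0 / n_y) if w0 is None else np.asarray(w0, dtype=float)
    converted = np.empty((len(x_seq), grid.shape[1]))

    for t, x_t in enumerate(x_seq):
        # Step 1: prior weights w_{t|t-1}^k = sum_j w_{t-1|t-1}^j p(y^k | y^j).
        w_prior = w @ trans
        # Step 2: posterior weights, proportional to prior times likelihood.
        w_post = w_prior * likelihood(x_t, grid)
        total = w_post.sum()
        # Fall back to the prior if every likelihood value is zero.
        w = w_post / total if total > 0 else w_prior
        # Step 3: converted vector, here assumed to be the posterior mean.
        converted[t] = w @ grid

    return converted


def gaussian_likelihood(x_t, grid, sigma=1.0):
    """Toy likelihood for the demo only: isotropic Gaussian around each grid point."""
    sq_dist = np.sum((grid - x_t) ** 2, axis=1)
    return np.exp(-0.5 * sq_dist / sigma**2)


if __name__ == "__main__":
    grid = np.array([[0.0], [1.0], [2.0]])
    trans = np.array([[0.8, 0.2, 0.0],
                      [0.1, 0.8, 0.1],
                      [0.0, 0.2, 0.8]])
    x_seq = np.array([[0.1], [0.9], [1.8]])
    print(grid_filter_convert(x_seq, grid, trans, gaussian_likelihood))
```

Each frame costs one \(N_{y}\times N_{y}\) matrix-vector product plus \(N_{y}\) likelihood evaluations, so the grid size is the main lever on run time in this sketch.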