Figure 4
From: Lip-Synching Using Speaker-Specific Articulation, Shape and Appearance Models
The phasing model of the PHMM predicts phasing relations between the acoustic onsets of the phones (bottom) and the onsets of the context-dependent phone HMMs that generate the frames of the gestural score (top). In this example, the onsets of the gestures characterizing the last two sounds occur ahead of the corresponding acoustic onsets. For each context-dependent phone HMM, an average delay between the observed gestural and acoustic onsets is computed and stored. This delay is optimized with an iterative procedure described in Section 4.3 and illustrated in Figure 5.
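
As a rough illustration of the per-HMM average delay mentioned in the caption, the sketch below averages the difference between gestural and acoustic onsets for each context-dependent phone HMM label. It covers only the simple averaging step, not the iterative optimization of Section 4.3. The function names, the label format, and the sign convention (delay = gestural onset minus acoustic onset, so a negative value means the gesture starts ahead of the acoustics) are assumptions made for this example and do not come from the article.

```python
from collections import defaultdict


def average_delays(observations):
    """Return the mean (gestural - acoustic) onset delay per HMM label.

    `observations` is an iterable of hypothetical tuples
    (hmm_label, gestural_onset_s, acoustic_onset_s), with onsets in seconds.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for label, gestural_onset, acoustic_onset in observations:
        # Assumed convention: negative delay = gesture leads the acoustics.
        sums[label] += gestural_onset - acoustic_onset
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_gestural_onset(label, acoustic_onset, delays, default=0.0):
    """Shift an acoustic onset by the stored delay for this HMM label.

    Labels with no stored delay fall back to `default` (no shift).
    """
    return acoustic_onset + delays.get(label, default)
```

With two observations of the same label whose gestures each start 0.25 s before the acoustic onset, `average_delays` stores a delay of -0.25 s for that label, and `predict_gestural_onset` then places the gesture 0.25 s ahead of any new acoustic onset for it.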