Table 2 Word recognition accuracy [%] for each method on the source domain

From: Unsupervised domain adaptation for lip reading based on cross-modal knowledge distillation

 

| Model | #Utterance/word: 250 | #Utterance/word: 500 |
|---|---|---|
| Baseline | 48.21 | 54.62 |
| Proposed | 50.06 (86.65) | 55.07 (90.51) |

  1. #Utterance/word indicates the number of utterances per word used to train the model
  2. The value in parentheses shows the accuracy of the audio model