EURASIP Journal on Audio, Speech, and Music Processing

Table 2 Word recognition accuracy [%] for each method on the source domain

From: Unsupervised domain adaptation for lip reading based on cross-modal knowledge distillation

Model	#Utterance/word = 250	#Utterance/word = 500
Baseline	48.21	54.62
Proposed	50.06 (86.65)	55.07 (90.51)

#Utterance/word indicates the number of utterances per word used to train the model
The value in parentheses shows the accuracy of the audio model
