Table 3 Comparison between the SpeechDat telephone quality (TF), SpeeCon narrow-band (NB) and SpeeCon wide-band (WB) recognisers. Results are given as the percentage of correct frames for phonemes (ph) and visemes (vi), and as phoneme accuracy.

From: SynFace—Speech-Driven Facial Animation for Virtual Speech-Reading Support

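| Database | SpeechDat | SpeeCon | SpeeCon |
| --- | --- | --- | --- |
| Data size (ca. hours) | 200 | 40 | 40 |
| Speakers (#) | 5000 | 550 | 550 |
| Speech quality | TF | NB | WB |
| Sampling (kHz) | 8 | 8 | 16 |
| Correct frames (% ph) | 54.2 | 65.2 | 68.7 |
| Correct frames (% vi) | 59.3 | 69.0 | 74.5 |
| Accuracy (% ph) | 56.5 | 62.2 | 63.2 |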