EURASIP Journal on Audio, Speech, and Music Processing

Table 4 Results for synthesized speech and voice conversion

From: DeepDet: YAMNet with BottleNeck Attention Module (BAM) for TTS synthesis detection

Set category	Spoofing system	Accuracy (%)	Precision (%)	Recall (%)	EER (%)	Min-tDCF
Eval	VC	93.3	97	95	0.90	0.06
Eval	TTS	99.8	99.6	99.4	0.50	0.005
Eval	LA (overall)	99.92	99.2	99.76	0.042	0.0015
Dev	VC	94.1	95	95	0.50	0.11
Dev	TTS	98.8	98.6	99.1	0.010	0.015
Dev	LA (overall)	98.9	98.2	98.9	0.015	0.002
