Fig. 6
From: Multimodal voice conversion based on non-negative matrix factorization
Mel-cepstrum distortion in white noise environments (β is the weight of image feature)