Table 1. DER of the test set using the proposed long-term speech features and proposed speaker diarization systems

From: The use of long-term features for GMM- and i-vector-based speaker diarization systems

| Features for segmentation | Features for clustering | System: GMM/BIC | System: i-Vector/CD | System: i-Vector/PLDA |
|---|---|---|---|---|
| MFCC | MFCC | 23.97 | 22.96 | 21.05 |
| MFCC | MFCC + delta | 21.57 | 19.34 | 19.47 |
| MFCC + JS | MFCC + JS | 22.83 | – | – |
| MFCC + formants | MFCC + formants | 21.11 | 20.26 | 20.71 |
| MFCC + (formants + pitch + intensity) | MFCC + (formants + pitch + intensity) | 23.45 | – | – |
| MFCC + (JS + formants + pitch + intensity) | MFCC + (JS + formants + pitch + intensity) | 21.68 | 20.13 | 20.03 |
| MFCC + (JS + formants + pitch + intensity + GNE) | MFCC + (JS + formants + pitch + intensity + GNE) | 21.91 | 20.44 | 19.46 |
| MFCC + (JS + formants + pitch + intensity) | MFCC + delta + (JS + formants + pitch + intensity) | 21.76 | 18.2 | 19.37 |
| MFCC + (JS + formants + GNE) | MFCC + delta + (JS + formants + GNE) | 21.52 | 18.87 | 19.2 |
| MFCC + (JS + formants + pitch + intensity + GNE) | MFCC + delta + (JS + formants + pitch + intensity + GNE) | 22.68 | 18.68 | 18.95 |

JS: jitter and shimmer; CD: cosine distance