
Table 5 Conditions for speaker recognition

From: Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification

 

Condition             Values
Sampling frequency    16 kHz
Frame length          25 ms
Frame shift           10 ms
Feature space         25 dimensions with CMN (12 MFCCs + Δ + Δpower)
Acoustic model        GMMs with 128 diagonal covariance matrices
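
As a rough check on the conditions in Table 5, the short Python sketch below converts the frame settings into sample counts and confirms that 12 MFCCs plus their Δ coefficients plus one Δpower term give 25 dimensions. All names (`ms_to_samples`, `num_frames`, and the constants) are hypothetical and are not taken from the paper. The frame-count formula assumes no padding at the signal edges, which the table does not specify. The GMM figure counts stored values per model (means, variances, weights), not free parameters.

```python
# Values taken from Table 5.
SAMPLE_RATE_HZ = 16_000
FRAME_LENGTH_MS = 25
FRAME_SHIFT_MS = 10
NUM_MFCC = 12
NUM_GMM_COMPONENTS = 128


def ms_to_samples(duration_ms: int, sample_rate_hz: int) -> int:
    """Convert a duration in milliseconds to a whole number of samples."""
    return sample_rate_hz * duration_ms // 1000


frame_length = ms_to_samples(FRAME_LENGTH_MS, SAMPLE_RATE_HZ)  # 400 samples
frame_shift = ms_to_samples(FRAME_SHIFT_MS, SAMPLE_RATE_HZ)    # 160 samples

# 12 MFCCs + 12 delta-MFCCs + 1 delta-power = 25 dimensions.
feature_dim = NUM_MFCC + NUM_MFCC + 1


def num_frames(num_samples: int) -> int:
    """Number of full frames in a signal, assuming no edge padding."""
    if num_samples < frame_length:
        return 0
    return 1 + (num_samples - frame_length) // frame_shift


# Stored values for one diagonal-covariance GMM:
# per component, one mean vector, one variance vector, and one weight.
gmm_values = NUM_GMM_COMPONENTS * (2 * feature_dim + 1)

if __name__ == "__main__":
    print(frame_length, frame_shift, feature_dim)
    print(num_frames(SAMPLE_RATE_HZ))  # frames in one second of audio
    print(gmm_values)
```

Under these assumptions, one second of 16 kHz audio yields 98 full frames, and each speaker GMM stores 6528 values.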