Table 5 Conditions for speaker recognition

From: Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification

Sampling frequency    16 kHz
Frame length          25 ms
Frame shift           10 ms
Feature space         25 dimensions with CMN (12 MFCCs + Δ + Δpower)
Acoustic model        GMMs with 128 diagonal covariance matrices
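
As a rough check on how these conditions fit together, the following minimal sketch converts the frame settings into sample counts and adds up the feature dimensions. All variable and function names are illustrative and do not come from the paper. The sketch assumes that Δ means one delta coefficient per MFCC (which gives the stated total of 25), that "128 diagonal covariance matrices" means 128 mixture components per GMM, and that frames are taken without padding, which the table does not specify.

```python
# Conditions taken from Table 5; names are illustrative, not from the paper.
SAMPLE_RATE_HZ = 16_000
FRAME_LENGTH_MS = 25
FRAME_SHIFT_MS = 10
NUM_MFCC = 12
NUM_COMPONENTS = 128  # assumed: one diagonal covariance matrix per mixture component

# Frame length and shift expressed in samples.
frame_length_samples = SAMPLE_RATE_HZ * FRAME_LENGTH_MS // 1000
frame_shift_samples = SAMPLE_RATE_HZ * FRAME_SHIFT_MS // 1000

# 12 static MFCCs + 12 delta MFCCs (assumed) + 1 delta power term.
feature_dim = NUM_MFCC + NUM_MFCC + 1


def num_frames(num_samples: int) -> int:
    """Count full frames in a signal, assuming no padding at the edges."""
    if num_samples < frame_length_samples:
        return 0
    return 1 + (num_samples - frame_length_samples) // frame_shift_samples


# Stored parameters per GMM: for each component, one weight,
# one mean vector, and one vector of diagonal variances.
gmm_param_count = NUM_COMPONENTS * (1 + 2 * feature_dim)

if __name__ == "__main__":
    print(frame_length_samples, frame_shift_samples, feature_dim)
    print(num_frames(SAMPLE_RATE_HZ))  # frames in one second of audio
    print(gmm_param_count)
```

Under these assumptions, a 25 ms frame covers 400 samples, consecutive frames start 160 samples apart, and each frame yields a 25-dimensional vector before CMN is applied.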