Method | Type | Characteristics | Advantages | Disadvantages |
---|---|---|---|---|
Basic PCM | Waveform | Uniform quantization (ADC) | Very simple | High bit rate |
Logarithmic PCM | Waveform | Log amplitude compression | No latency | Medium-high bit rate |
Adaptive PCM | Waveform | Quantizer follows energy changes | Simple | Medium-high bit rate |
Differential PCM | Waveform | Short-time spectral predictor | Exploits speech spectral envelope detail | Medium bit rate |
Linear predictive coding | Vocoder | All-pole spectral model | Low bit rate; standard model for cellular telephony | Loss of phase in basic model |
Adaptive transform coding | Waveform | Transmits much spectral detail | Good speech quality | High complexity |
Sub-band coding | Waveform | Band-pass filters | Good speech quality | High complexity |
Sinusoidal (harmonic) coding | Waveform | Codes individual harmonics | Good speech quality | Requires F0 estimator |
Channel vocoder | Vocoder | Flat spectrum in each channel | Low rate | Reverberation; loss of phase |
Formant vocoder | Vocoder | Direct formant model | Low rate | Requires estimates of formant frequencies |
Variational autoencoder | Neural network | Encoder/decoder | Basic neural model | Computationally costly |
Flow neural model | Neural network | Transforms Gaussian noise sequences | Can use parallel processing | More difficult to train |
Generative adversarial network | Neural network | Adversarial discriminator and generator | Fast processing | Lower quality than other neural methods |
Autoregressive neural model | Neural network | Exploits long-range conditional PDFs | Very high quality | High latency; computationally costly |
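
The first two rows of the table can be made concrete with a small sketch comparing uniform (basic) PCM with logarithmic PCM on a low-amplitude signal. This is an illustration under stated assumptions, not a reference codec: the companding law is taken to be μ-law with μ = 255, the word length is 8 bits, and the test signal is a 440 Hz sine at 1% of full scale sampled at 8 kHz. None of these values come from the table, and the function names are hypothetical.

```python
import math

MU = 255  # assumption: mu-law constant; the table only says "log amplitude compression"


def uniform_quantize(x, bits):
    """Quantize x in [-1, 1] with a uniform mid-rise quantizer (basic PCM)."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(int((x + 1.0) / step), levels - 1)  # clamp x == 1.0 into the top cell
    return -1.0 + (index + 0.5) * step


def mu_compress(x, mu=MU):
    """Map x in [-1, 1] to a log-compressed value in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_expand(y, mu=MU):
    """Invert mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def log_pcm(x, bits, mu=MU):
    """Logarithmic PCM: compress, quantize uniformly, then expand."""
    return mu_expand(uniform_quantize(mu_compress(x, mu), bits), mu)


def snr_db(signal, decoded):
    """Signal-to-quantization-noise ratio in dB."""
    signal_power = sum(s * s for s in signal)
    noise_power = sum((s - d) ** 2 for s, d in zip(signal, decoded))
    return 10.0 * math.log10(signal_power / noise_power)


# Assumed test signal: one second of a quiet 440 Hz tone at an 8 kHz sampling rate.
quiet = [0.01 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(8000)]

snr_uniform = snr_db(quiet, [uniform_quantize(s, 8) for s in quiet])
snr_log = snr_db(quiet, [log_pcm(s, 8) for s in quiet])

print(f"uniform PCM, 8 bits: {snr_uniform:.1f} dB")
print(f"log PCM, 8 bits:     {snr_log:.1f} dB")
```

Because the compressor is memoryless and works sample by sample, it adds no buffering delay, which is consistent with the "No latency" entry. At the same word length, the logarithmic quantizer spends its levels more densely near zero, so quiet passages keep a usable signal-to-noise ratio. That property is what lets log PCM use fewer bits per sample than uniform PCM for comparable perceived quality.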