Open Access

Linear Prediction Using Refined Autocorrelation Function

EURASIP Journal on Audio, Speech, and Music Processing 2007, 2007:045962

DOI: 10.1155/2007/45962

Received: 16 October 2006

Accepted: 14 June 2007

Published: 25 July 2007

Abstract

This paper proposes a new technique for improving the performance of linear prediction analysis by utilizing a refined version of the autocorrelation function. Problems in analyzing voiced speech with linear prediction often arise from the harmonic structure of the excitation source, which causes the autocorrelation function to be an aliased version of that of the vocal tract impulse response. To estimate the vocal tract characteristics accurately, the effect of this aliasing must be eliminated. In this paper, we employ a homomorphic deconvolution technique in the autocorrelation domain to eliminate the aliasing effect caused by periodicity. The resulting autocorrelation function of the vocal tract impulse response is found to yield a significant improvement in estimating formant frequencies. The accuracy of formant estimation is verified on synthetic vowels over a wide range of pitch frequencies typical of male and female speakers. The validity of the proposed method is also illustrated by inspecting the spectral envelopes of natural speech spoken by a high-pitched female speaker. The synthesis filter obtained by the proposed method is guaranteed to be stable, which makes it superior to many of its alternatives.
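As a concrete illustration of the idea summarized above, the sketch below performs linear prediction on a cepstrally liftered autocorrelation. It is a minimal reconstruction under stated assumptions, not the authors' implementation: the Hamming window, the FFT size, the 2 ms lifter cutoff, the sampling rate, the model order, and the function name `refined_lp_coefficients` are all illustrative choices rather than values taken from the paper.

```python
import numpy as np

def refined_lp_coefficients(frame, order=12, fs=10000, cutoff_ms=2.0):
    """Sketch: linear prediction on an autocorrelation "refined" by
    homomorphic (cepstral) liftering.  Parameter values are illustrative."""
    # Windowed frame and FFT size (zero-padded to reduce circular wrap-around)
    x = frame * np.hamming(len(frame))
    nfft = 2 ** int(np.ceil(np.log2(2 * len(x))))

    # Power spectrum = Fourier transform of the (circular) autocorrelation
    P = np.abs(np.fft.rfft(x, nfft)) ** 2

    # Cepstrum of the autocorrelation: the periodic excitation appears as
    # peaks at multiples of the pitch period, while the vocal-tract
    # contribution is concentrated at low quefrencies.
    c = np.fft.irfft(np.log(P + 1e-12), nfft)

    # Symmetric low-quefrency lifter; the cutoff is assumed to lie below
    # the shortest expected pitch period.
    nc = max(2, int(cutoff_ms * 1e-3 * fs))
    lift = np.zeros(nfft)
    lift[:nc] = 1.0
    lift[-(nc - 1):] = 1.0
    c_vt = c * lift

    # Back to the autocorrelation domain: a refined autocorrelation
    # attributed to the vocal-tract impulse response.
    r = np.fft.irfft(np.exp(np.fft.rfft(c_vt, nfft)), nfft)[: order + 1]

    # Levinson-Durbin recursion on the refined autocorrelation; applied to
    # a valid autocorrelation sequence, it yields a minimum-phase (stable)
    # all-pole filter.
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        a[1:i] += k * a[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err
```

Formant estimates could then be read off, for example, from the angles of the roots of the resulting prediction polynomial; because the recursion operates on a valid autocorrelation sequence, the synthesis filter is stable, consistent with the claim in the abstract.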


Authors’ Affiliations

(1)
Department of Computer Science and Engineering, Shah Jalal University of Science and Technology
(2)
Department of Information and Computer Sciences, Saitama University


Copyright

© M. S. Rahman and T. Shimamura. 2007

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.