Linear Prediction Using Refined Autocorrelation Function

Rahman, M Shahidur; Shimamura, Tetsuya

doi:10.1155/2007/45962

Research Article
Open access
Published: 25 July 2007

EURASIP Journal on Audio, Speech, and Music Processing volume 2007, Article number: 045962 (2007)

Abstract

This paper proposes a new technique for improving the performance of linear prediction analysis by using a refined version of the autocorrelation function. Problems in analyzing voiced speech with linear prediction often arise from the harmonic structure of the excitation source, which causes the autocorrelation function to be an aliased version of that of the vocal tract impulse response. To estimate the vocal tract characteristics accurately, the effect of this aliasing must be eliminated. In this paper, we employ a homomorphic deconvolution technique in the autocorrelation domain to eliminate the aliasing caused by periodicity. The resulting autocorrelation function of the vocal tract impulse response is found to yield a significant improvement in formant frequency estimation. The accuracy of formant estimation is verified on synthetic vowels over a wide range of pitch frequencies typical of male and female speakers. The validity of the proposed method is also illustrated by inspecting the spectral envelopes of natural speech spoken by a high-pitched female speaker. The synthesis filter obtained by the proposed method is guaranteed to be stable, which makes the method superior to many of its alternatives.
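The idea summarized in the abstract can be illustrated with a minimal sketch. It is one plausible reading of "homomorphic deconvolution in the autocorrelation domain", not the authors' algorithm, whose details are not given here. The sketch takes the log of the power spectrum (the transform of the autocorrelation), removes the high-quefrency part of its cepstrum where the pitch contribution sits, and maps the smoothed spectrum back to an autocorrelation sequence that is then passed to the standard Levinson-Durbin recursion. The function names, the lifter length `cutoff`, the FFT size, and the synthetic test frame are all assumptions made for illustration.

```python
import cmath
import math

def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def ifft(a):
    """Inverse FFT computed with the conjugate trick."""
    n = len(a)
    return [z.conjugate() / n for z in fft([complex(v).conjugate() for v in a])]

def refined_autocorrelation(frame, n_fft, cutoff):
    """Lifter the cepstrum of the power spectrum; return a smoothed autocorrelation.

    cutoff (hypothetical parameter) should be shorter than the pitch period in samples.
    """
    padded = list(frame) + [0.0] * (n_fft - len(frame))
    # The power spectrum is the DFT of the frame's autocorrelation.
    power = [abs(z) ** 2 + 1e-12 for z in fft(padded)]
    ceps = [z.real for z in ifft([math.log(p) for p in power])]
    # Keep only low quefrencies (and their mirror image) to drop the pitch peaks.
    liftered = [c if (n < cutoff or n > n_fft - cutoff) else 0.0
                for n, c in enumerate(ceps)]
    smooth = [math.exp(z.real) for z in fft(liftered)]  # strictly positive spectrum
    return [z.real for z in ifft(smooth)]

def levinson(r, order):
    """Levinson-Durbin recursion; returns A(z) coefficients, reflection coeffs, error."""
    a = [1.0] + [0.0] * order
    err = r[0]
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, ks, err

# Synthetic voiced frame (assumed values): 200 Hz impulse train at 8 kHz
# driving a two-pole resonator near 700 Hz, followed by a Hamming window.
fs, f0, n = 8000, 200, 256
period = fs // f0
rad, theta = 0.97, 2 * math.pi * 700 / fs
b1, b2 = 2 * rad * math.cos(theta), -rad * rad
x = [0.0] * n
for t in range(n):
    exc = 1.0 if t % period == 0 else 0.0
    x[t] = exc + b1 * (x[t - 1] if t >= 1 else 0.0) + b2 * (x[t - 2] if t >= 2 else 0.0)
frame = [v * (0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1))) for t, v in enumerate(x)]

r = refined_autocorrelation(frame, n_fft=512, cutoff=30)
a, ks, err = levinson(r, order=10)
```

In this sketch the smoothed spectrum is strictly positive, so the refined sequence `r` is a valid (positive-definite) autocorrelation and every reflection coefficient in `ks` has magnitude below one. That is consistent with the stability property the abstract claims, although the paper's own argument may differ.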

[1-29]

References

Atal BS, Hanauer SL: Speech analysis and synthesis by linear prediction of the speech wave. The Journal of the Acoustical Society of America 1971, 50(2B):637-655. doi:10.1121/1.1912679
Makhoul J: Linear prediction: a tutorial review. Proceedings of the IEEE 1975, 63(4):561-580.
El-Jaroudi A, Makhoul J: Discrete all-pole modeling. IEEE Transactions on Signal Processing 1991, 39(2):411-423. doi:10.1109/78.80824
Vallabha GK, Tuller B: Systematic errors in the formant analysis of steady-state vowels. Speech Communication 2002, 38(1-2):141-160. doi:10.1016/S0167-6393(01)00049-8
Wong DY, Markel JD, Gray AH Jr.: Least squares glottal inverse filtering from the acoustic speech waveform. IEEE Transactions on Acoustics, Speech, and Signal Processing 1979, 27(4):350-355. doi:10.1109/TASSP.1979.1163260
Krishnamurthy A, Childers DG: Two-channel speech analysis. IEEE Transactions on Acoustics, Speech, and Signal Processing 1986, 34(4):730-743. doi:10.1109/TASSP.1986.1164909
Miyoshi Y, Yamato K, Mizoguchi R, Yanagida M, Kakusho O: Analysis of speech signals of short pitch period by a sample-selective linear prediction. IEEE Transactions on Acoustics, Speech, and Signal Processing 1987, 35(9):1233-1240. doi:10.1109/TASSP.1987.1165282
Pinto NB, Childers DG, Lalwani AL: Formant speech synthesis: improving production quality. IEEE Transactions on Acoustics, Speech, and Signal Processing 1989, 37(12):1870-1887. doi:10.1109/29.45534
Lee C-H: On robust linear prediction of speech. IEEE Transactions on Acoustics, Speech, and Signal Processing 1988, 36(5):642-650. doi:10.1109/29.1574
Yanagida M, Kakusho O: A weighted linear prediction analysis of speech signals by using the Givens reduction. Proceedings of the IASTED International Symposium on Applied Signal Processing and Digital Filtering, June 1985, Paris, France 129-132.
Miyanaga Y, Miki N, Nagai N, Hatori K: A speech analysis algorithm which eliminates the influence of pitch using the model reference adaptive system. IEEE Transactions on Acoustics, Speech, and Signal Processing 1982, 30(1):88-96. doi:10.1109/TASSP.1982.1163856
Fujisaki H, Ljungqvist M: Estimation of voice source and vocal tract parameters based on ARMA analysis and a model for the glottal source waveform. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '87), April 1987, Dallas, Tex, USA 637-640.
Ding W, Kasuya H: A novel approach to the estimation of voice source and vocal tract parameters from speech signals. Proceedings of the 4th International Conference on Spoken Language Processing (ICSLP '96), October 1996, Philadelphia, Pa, USA 2: 1257-1260.
Rahman MS, Shimamura T: Speech analysis based on modeling the effective voice source. IEICE Transactions on Information and Systems 2006, E89-D(3):1107-1115. doi:10.1093/ietisy/e89-d.3.1107
Deng H, Ward RK, Beddoes MP, Hodgson M: A new method for obtaining accurate estimates of vocal-tract filters and glottal waves from vowel sounds. IEEE Transactions on Audio, Speech, and Language Processing 2006, 14(2):445-455.
Hermansky H, Fujisaki H, Sato Y: Spectral envelope sampling and interpolation in linear predictive analysis of speech. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '84), 1984, San Diego, Calif, USA 9: 53-56.
Hermansky H: Perceptual linear predictive (PLP) analysis of speech. Journal of the Acoustical Society of America 1990, 87(4):1738-1752. doi:10.1121/1.399423
Varho S, Alku P: Separated linear prediction - a new all-pole modelling technique for speech analysis. Speech Communication 1998, 24(2):111-121. doi:10.1016/S0167-6393(98)00003-X
Kabal P, Kleijn B: All-pole modelling of mixed excitation signals. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '01), May 2001, Salt Lake City, Utah, USA 1: 97-100.
Oppenheim A, Schafer R: Homomorphic analysis of speech. IEEE Transactions on Audio and Electroacoustics 1968, 16(2):221-226. doi:10.1109/TAU.1968.1161965
Rahman MS, Shimamura T: Linear prediction using homomorphic deconvolution in the autocorrelation domain. Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS '05), May 2005, Kobe, Japan 3: 2855-2858.
Quatieri TF: Discrete-Time Speech Signal Processing: Principles and Practice. Prentice-Hall, Upper Saddle River, NJ, USA; 2002.
Lim JS: Spectral root homomorphic deconvolution system. IEEE Transactions on Acoustics, Speech, and Signal Processing 1979, 27(3):223-233. doi:10.1109/TASSP.1979.1163234
Kobayashi T, Imai S: Spectral analysis using generalised cepstrum. IEEE Transactions on Acoustics, Speech, and Signal Processing 1984, 32(6):1235-1238. doi:10.1109/TASSP.1984.1164454
Verhelst W, Steenhaut O: A new model for the short-time complex cepstrum of voiced speech. IEEE Transactions on Acoustics, Speech, and Signal Processing 1986, 34(1):43-51. doi:10.1109/TASSP.1986.1164787
Kay SM: Modern Spectral Estimation: Theory and Application. Prentice-Hall, Upper Saddle River, NJ, USA; 1988.
Stoica P, Moses RL: Introduction to Spectral Analysis. Prentice-Hall, Upper Saddle River, NJ, USA; 1997.
Fant G, Liljencrants J, Lin QG: A four parameter model of glottal flow. In Quarterly Progress and Status. Speech Transmission Laboratory, Royal Institute of Technology, Stockholm, Sweden; 1985:1-13.
Klatt DH: Software for a cascade/parallel formant synthesizer. Journal of the Acoustical Society of America 1980, 67(3):971-995. doi:10.1121/1.383940

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Shah Jalal University of Science and Technology, Sylhet, 3114, Bangladesh
M Shahidur Rahman
Department of Information and Computer Sciences, Saitama University, Saitama, 338-8570, Japan
Tetsuya Shimamura

Corresponding author

Correspondence to M Shahidur Rahman.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Rahman, M.S., Shimamura, T. Linear Prediction Using Refined Autocorrelation Function. J AUDIO SPEECH MUSIC PROC. 2007, 045962 (2007). https://doi.org/10.1155/2007/45962

Received: 16 October 2006
Revised: 07 March 2007
Accepted: 14 June 2007
Published: 25 July 2007
DOI: https://doi.org/10.1155/2007/45962
