On the Characterization of Slowly Varying Sinusoids
© Xue Wen and Mark Sandler. 2010
Received: 4 March 2010
Accepted: 6 July 2010
Published: 26 July 2010
We give a brief discussion on the amplitude and frequency variation rates of the sinusoid representation of signals. In particular, we derive three inequalities showing that these rates are upper bounded by the 2nd and 4th spectral moments, which, in a loose sense, indicates that every complex signal with narrow short-time bandwidths is a slowly varying sinusoid. Further discussion shows how this result helps provide extra insight into relevant signal processing techniques.
where the two real variables are the amplitude and the phase angle (or phase for short) of the signal. By this definition every nonzero complex variable has a unique sinusoid representation, up to the polarization (sign) of the amplitude and a shift of the phase by an integer multiple of 2π. In practice, these ambiguities are relieved by assuming that the amplitude and the phase are continuous and smooth.
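As a minimal sketch of this representation, the following NumPy snippet recovers an amplitude and a continuous phase from a sampled complex signal. The function name `sinusoid_representation` and the test signal are illustrative assumptions, not part of the paper; the non-negative amplitude and the unwrapped phase are one particular way of resolving the polarization and phase-shift ambiguities.

```python
import numpy as np

def sinusoid_representation(x):
    """Return (amplitude, phase) such that x = amplitude * exp(1j * phase)."""
    x = np.asarray(x, dtype=complex)
    amplitude = np.abs(x)            # pick the non-negative polarization
    phase = np.unwrap(np.angle(x))   # remove 2*pi jumps between neighbouring samples
    return amplitude, phase

# Hypothetical test signal: slowly modulated amplitude, slowly rising frequency.
t = np.linspace(0.0, 1.0, 2000)
true_amplitude = 1.0 + 0.2 * np.cos(2 * np.pi * t)
true_phase = 2 * np.pi * (50.0 * t + 5.0 * t ** 2)
x = true_amplitude * np.exp(1j * true_phase)

amplitude, phase = sinusoid_representation(x)
```

Phase unwrapping only yields a continuous phase when the phase advances by less than π between samples, which holds here because the sampling rate is well above the signal frequency.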
The parameter variations considered in this paper are the first and second derivatives of the amplitude, and the second derivative of the phase, which, respectively, characterize amplitude and frequency modulations. We say that a sinusoid representation is slowly varying if these derivatives have small absolute values. Slowly varying sinusoids include many signals in speech, music, and telecommunications that we technically handle as sinusoids [4, 6, 7], and are the central elements of sinusoid modelling.
where we have omitted the subscript from . At , the term reaches its minimum whose square root defines the -bandwidth of (likewise we call the -bandwidth of at , as it is the -weighted -norm of ). According to (5), the -bandwidth of is upper-bounded if and the range of are. However, as the latter may grow very large over a long time span, (5) does not imply that slowly varying sinusoids have concentrated spectra, as stationary sinusoids do.
where is the Fourier transform of . Since is concentrated at , (7) shows that is also concentrated at as long as and remain small. Mallat [9] used the term windowed Fourier ridge to describe this time-frequency distribution as it involves a spectral peak that evolves in time. We call it a spectral ridge for short.
Despite all these studies on sinusoid representations, one question has been overlooked: what type of signals can be modelled as slowly varying sinusoids? From the results above, it is clear that signals with wide short-time bandwidths, such as wide-band noise, cannot be slowly varying sinusoids. In this paper, we consider the converse: do narrow short-time bandwidths always imply a slowly varying sinusoid? In other words, does a concentrated short-time spectrum necessarily set certain upper bounds on and ? The concentration of a spectrum is measured by the moments of the spectral energy distribution (i.e., normalized energy spectrum), or spectral moments for short. The spectral moment of x with centre is given by and can be interpreted as the biased -bandwidth, as it becomes the -bandwidth if . From (5), it is obvious that upper bounds the average amplitude derivative and average frequency departure from . However, the 2nd moment is not enough to set an upper bound on . In what follows, we provide a new result that employs higher spectral moments to upper bound as well as and .
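To make the notion of a spectral moment concrete, here is a hedged numerical sketch. It assumes the common definition in which the k-th moment about a centre frequency is the sum of (ω − centre)^k weighted by the normalized energy spectrum; the function name `spectral_moment`, the discrete-Fourier approximation, and the Gaussian-windowed test sinusoid are illustrative choices rather than the paper's own formulation.

```python
import numpy as np

def spectral_moment(x, dt, order, centre):
    """Moment of the normalized energy spectrum of x about `centre` (rad/s)."""
    X = np.fft.fft(x)
    omega = 2.0 * np.pi * np.fft.fftfreq(len(x), d=dt)  # angular frequency grid
    energy = np.abs(X) ** 2
    density = energy / energy.sum()                      # spectral energy distribution
    return float(np.sum((omega - centre) ** order * density))

# Hypothetical example: Gaussian-windowed complex sinusoid at 100 Hz.
dt = 1e-3
t = np.arange(-500, 500) * dt
sigma = 0.05                        # time-domain width of the Gaussian envelope (s)
omega0 = 2.0 * np.pi * 100.0
x = np.exp(-t ** 2 / (2.0 * sigma ** 2)) * np.exp(1j * omega0 * t)

m2 = spectral_moment(x, dt, 2, omega0)  # analytically 1 / (2 * sigma**2)
m4 = spectral_moment(x, dt, 4, omega0)  # analytically 3 * m2**2 for a Gaussian spectrum
```

For this signal the energy spectrum is a Gaussian centred at `omega0`, so the 2nd moment shrinks as the envelope widens, which matches the reading of the 2nd moment as a squared bandwidth.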
2. Parameter Variation Rate Upper Bounds in Terms of 2nd and 4th Spectral Moments
The kurtosis is generally understood to represent the "peakedness" of a distribution: a small kurtosis indicates a bulky peak and sharp tails; a large kurtosis indicates a narrow peak and heavy tails. In the context of (13), (9)–(11) state that, for the same 2nd moment, larger kurtoses allow more modulation.
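The effect of spectral shape on kurtosis can be illustrated numerically. The sketch below assumes the standard definition of kurtosis as the 4th central moment divided by the squared 2nd central moment, and evaluates it for three hypothetical energy distributions sampled on a frequency grid; the function name `kurtosis` and the chosen shapes are assumptions made for illustration.

```python
import numpy as np

def kurtosis(omega, density, centre=0.0):
    """4th moment over squared 2nd moment of a sampled distribution."""
    weights = density / np.sum(density)          # normalize to unit total
    m2 = np.sum((omega - centre) ** 2 * weights)
    m4 = np.sum((omega - centre) ** 4 * weights)
    return m4 / m2 ** 2

omega = np.linspace(-50.0, 50.0, 200001)

flat = (np.abs(omega) <= 1.0).astype(float)      # bulky peak, abrupt tails
gaussian = np.exp(-0.5 * omega ** 2)             # reference shape
laplacian = np.exp(-np.abs(omega))               # narrow peak, heavy tails

k_flat = kurtosis(omega, flat)
k_gaussian = kurtosis(omega, gaussian)
k_laplacian = kurtosis(omega, laplacian)
```

The three values increase from the flat shape to the Laplacian one, in line with the qualitative description above.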
Inequalities (9)–(11) can also be directly applied to windowed Fourier transforms by replacing with where is the window function. As , if and are upper bounded, then so is . Inequalities (9)–(11) indicate that the sinusoid representation of a signal whose STFT forms a spectral ridge is necessarily slowly varying in terms of the short-time average of parameter variation rates. This, together with our comments in the introduction, completes the following statement.
A complex signal has a slowly varying sinusoid representation if and only if it has narrow short-time bandwidths. (*)
We notice that is measured differently in (14) and (16), giving a double meaning to "slowly varying frequency" in (*). For this reason, (*) does not actually give a pair of strictly converse statements, and the equivalence between slowly varying sinusoids and spectral ridges, as established by (14)–(16), should only be considered qualitatively. Nevertheless, by these results we have partially answered the question of what kind of signals can be modelled as slowly varying sinusoids. In the rest of this paper, we focus on (*) as a guideline and see how it relates to various sinusoid modelling practices.
3. Discussions and Conclusion
3.1. Combining Sinusoids with Close Frequencies
Beating [11] is a well-known effect observed when adding two sinusoids with similar frequencies, in which they "melt" into a single tone with additional modulation. This phenomenon can be easily explained by (*): as slowly varying sinusoids have short-time spectral energy concentrated near their angular frequencies, if the frequencies are close, then their sum also has concentrated short-time spectral energy, and is therefore also a slowly varying sinusoid. Additional modulation may be introduced as the result of a wider bandwidth contributed by the small interval between the participant frequencies. A quantitative proof of this argument is given in [12], which leads to an additive re-estimation algorithm for measuring parameters of slowly varying sinusoids.
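For two unit-amplitude stationary complex sinusoids, the beating effect follows from an elementary identity: their sum equals a single sinusoid at the mean frequency whose real amplitude is modulated at half the frequency difference. The sketch below checks this numerically; the frequencies 440 Hz and 444 Hz and the variable names are arbitrary illustrative choices.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 8000, endpoint=False)
w1 = 2.0 * np.pi * 440.0   # angular frequency of the first sinusoid
w2 = 2.0 * np.pi * 444.0   # angular frequency of the second sinusoid

x = np.exp(1j * w1 * t) + np.exp(1j * w2 * t)

mean_w = 0.5 * (w1 + w2)
half_diff = 0.5 * (w2 - w1)

# Single-sinusoid form: a real amplitude (which changes sign at the beat
# nulls) multiplying one complex exponential at the mean frequency.
envelope = 2.0 * np.cos(half_diff * t)
single_tone = envelope * np.exp(1j * mean_w * t)
```

The closer the two frequencies, the slower the envelope varies, so the combined signal becomes a more slowly varying sinusoid.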
Statement (*) also reveals the difficulty in separating close sinusoids by the slow variation criterion alone. Since there are infinitely many ways to divide a spectral ridge into two or more subridges, and since all narrow ridges are necessarily slowly varying sinusoids, there are infinitely many separations that are slowly varying.
3.2. Atomic Decomposition
where is a slowly varying sinusoid, , and are constants for given , and is the overlap-add window centred at the ith reference point, say . Adjacent windows are arranged to have considerable overlap.
The use of overlapping stationary sinusoids to approximate time-varying sinusoids is partially justified by (*). Since windowed sinusoid atoms have concentrated spectral energy at the sinusoid frequencies, their sum will form a narrow spectral ridge as long as frequencies of adjacent atoms are close enough so that the result represents a slowly varying sinusoid. It is also apparent that if there is a large frequency jump between any adjacent atoms, then the sum is no longer a slowly varying sinusoid, indicating that (17) is not a suitable representation of .
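The following sketch illustrates this argument under stated assumptions: a unit-amplitude linear chirp is approximated by overlap-added stationary atoms whose constant frequency and phase are sampled from the chirp at each reference point. The Hann windows with 50% overlap, the helper names, and the chirp rates are hypothetical choices, not the paper's formulation.

```python
import numpy as np

def hann_window(d, hop):
    """Hann window of half-width `hop`; copies shifted by `hop` sum to one."""
    return np.where(np.abs(d) < hop, np.cos(0.5 * np.pi * d / hop) ** 2, 0.0)

def overlap_add_atoms(t, ref_times, hop, freq, phase):
    """Sum of windowed stationary sinusoids, one per reference point."""
    y = np.zeros_like(t, dtype=complex)
    for ti in ref_times:
        w_i, p_i = freq(ti), phase(ti)              # constants of atom i
        atom = np.exp(1j * (p_i + w_i * (t - ti)))  # stationary sinusoid
        y += hann_window(t - ti, hop) * atom
    return y

def chirp_error(rate_hz_per_s, hop=0.01, duration=0.5, fs=8000.0):
    """Relative RMS error of the atomic approximation of a linear chirp."""
    t = np.arange(int(duration * fs) + 1) / fs
    ref_times = np.arange(int(round(duration / hop)) + 1) * hop
    phase = lambda u: 2.0 * np.pi * (200.0 * u + 0.5 * rate_hz_per_s * u ** 2)
    freq = lambda u: 2.0 * np.pi * (200.0 + rate_hz_per_s * u)
    x = np.exp(1j * phase(t))
    y = overlap_add_atoms(t, ref_times, hop, freq, phase)
    return np.sqrt(np.mean(np.abs(y - x) ** 2) / np.mean(np.abs(x) ** 2))

slow_error = chirp_error(50.0)     # adjacent atoms differ by 0.5 Hz
fast_error = chirp_error(5000.0)   # adjacent atoms differ by 50 Hz
```

With these settings the approximation error grows with the frequency step between adjacent atoms, which is the behaviour the paragraph above describes.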
3.3. Real Sinusoids and Analytic Signals
A real slowly varying sinusoid can always be written as the sum of two conjugate complex slowly varying sinusoids. According to (*), its spectrogram is made up of two spectral ridges. To find a slowly varying double-sinusoid representation for a real sinusoid, one only needs to separate the spectrogram into two parts, each containing one ridge. This separation is generally not unique. If the two parts are conjugate to each other, then the real part of the corresponding complex sinusoids equals half of the real sinusoid.
Most of the real sinusoids we encounter in practice have always-positive frequencies, so that each spectral ridge lies in a half plane on either side of the time axis. In this case, the most natural separation is obtained by splitting the spectrum along the time axis, which leads to analytic complex sinusoids. We observe by (*) that the analytic representation is slowly varying by design, provided the sinusoid concerned has a slowly varying representation at all.
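A minimal sketch of this separation for a sampled real signal is given below, assuming the usual discrete construction that keeps the non-negative-frequency half of the spectrum. The function name `analytic_signal` and the test signal are illustrative assumptions. The sketch doubles the positive-frequency bins, so the real part of its output equals the real signal itself; the two conjugate halves discussed above correspond to half of this output and its conjugate.

```python
import numpy as np

def analytic_signal(x_real):
    """Keep the non-negative-frequency half of the spectrum of a real signal."""
    n = len(x_real)
    X = np.fft.fft(x_real)
    gain = np.zeros(n)
    gain[0] = 1.0                       # DC is shared by both halves
    if n % 2 == 0:
        gain[n // 2] = 1.0              # Nyquist bin is shared as well
        gain[1:n // 2] = 2.0            # strictly positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(X * gain)

# Hypothetical real sinusoid with slow amplitude modulation.
n = 1024
t = np.arange(n) / n
a = 1.0 + 0.3 * np.cos(2.0 * np.pi * t)
phi = 2.0 * np.pi * 100.0 * t
x = a * np.cos(phi)

z = analytic_signal(x)   # complex sinusoid a * exp(1j * phi) for this signal
```

In this example the amplitude modulation is much slower than the carrier, so the positive-frequency half of the spectrum contains exactly the complex sinusoid with the same amplitude and phase.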
3.4. More on Slowly Varying Real Sinusoids
Nonunique representations of real sinusoids may cause problems in evaluating sinusoid estimators. For example, while a complex linear chirp defines a linear frequency for its corresponding real chirp, the latter's analytic counterpart defines a nonlinear frequency which is no less convincing. Fortunately, in  we have shown that the difference between various sinusoid representations of the same real signal is bounded by their parameter variation rates. Consequently, if a signal has multiple slowly varying sinusoid representations, then they are close to each other.
In this paper, we have given three inequalities that bound the parameter variation rates of the sinusoid representation of a complex signal by its 2nd and 4th spectral moments, indicating that every complex signal with narrow short-time bandwidths is necessarily a slowly varying sinusoid. This, together with several previous results, supports the equivalence between slowly varying sinusoids and signals with narrow short-time bandwidths, which, in turn, provides extra insight into various aspects of sinusoid modelling.
This paper was supported by the EPSRC EP/E017614/1 project OMRAS2 (Online Music Recognition and Searching).
- McAulay RJ, Quatieri TF: Speech analysis/synthesis based on a sinusoidal representation. IEEE Transactions on Acoustics, Speech, and Signal Processing 1986, 34(4):744-754. 10.1109/TASSP.1986.1164910
- Serra X: Musical sound modeling with sinusoids plus noise. Musical Signal Processing 1997, 91-122.
- Peeters G, Rodet X: SINOLA: a new analysis/synthesis method using spectrum peak shape distortion, phase and reassigned spectrum. Proceedings of the International Computer Music Conference (ICMC '99), 1999, Beijing, China 153-156.
- Carlson AB: Communication Systems. 2nd edition. McGraw-Hill, New York, NY, USA; 1981.
- Cohen L, Loughlin P, Vakman D: On an ambiguity in the definition of the amplitude and phase of a signal. Signal Processing 1999, 79(3):301-307. 10.1016/S0165-1684(99)00103-6
- Fant G: The acoustics of speech. Proceedings of the 3rd International Conference Solar Air-Conditioning, 1959, Stuttgart, Germany 188-201.
- Fletcher NH, Rossing TD: The Physics of Musical Instruments. 2nd edition. Springer, New York, NY, USA; 1998.
- Cohen L, Lee C: Standard deviation of instantaneous frequency. Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), May 1989 4: 2238-2241.
- Mallat S: A Wavelet Tour of Signal Processing. 2nd edition. Academic Press; 1999.
- Davidson KL, Loughlin PJ: Instantaneous spectral moments. Journal of the Franklin Institute 2000, 337(4):421-436. 10.1016/S0016-0032(00)00034-X
- Jeffress LA: Beating sinusoids and pitch changes. Journal of the Acoustical Society of America 1968, 43(6):1464. 10.1121/1.1911027
- Wen X, Sandler M: Additive and multiplicative reestimation schemes for the sinusoid modeling of audio. Proceedings of 17th European Signal Processing Conference (EUSIPCO '09), 2009, Glasgow, UK.
- Gabor D: Theory of communication. Journal of the Institute of Electronics Engineers 1946, 3: 429-459.
- Mallat SG, Zhang Z: Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing 1993, 41(12):3397-3415. 10.1109/78.258082
- Brown GJ, Cooke M: Computational auditory scene analysis. Computer Speech and Language 1994, 8(4):297-336. 10.1006/csla.1994.1016
- George EB, Smith MJT: Analysis-by-synthesis/overlap-add sinusoidal modeling applied to the analysis and synthesis of musical tones. Journal of the Audio Engineering Society 1992, 40(6):497-516.
- Davy M, Godsill SJ: Bayesian harmonic models for musical signal analysis. In Bayesian Statistics 7. Oxford University Press, Oxford, UK; 2003.
- Wen X: Harmonic sinusoid modelling of tonal music events, Ph.D. thesis. University of London, London, UK; 2007.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.