# On the Characterization of Slowly Varying Sinusoids

- Xue Wen^{1}
- Mark Sandler^{1}

**2010**:941732

https://doi.org/10.1155/2010/941732

© Xue Wen and Mark Sandler. 2010

**Received:** 4 March 2010

**Accepted:** 6 July 2010

**Published:** 26 July 2010

## Abstract

We give a brief discussion of the amplitude and frequency variation rates of the sinusoid representation of signals. In particular, we derive three inequalities showing that these rates are upper bounded by the 2nd and 4th spectral moments, which, in a loose sense, indicates that every complex signal with narrow short-time bandwidths is a slowly varying sinusoid. Further discussions show how this result helps provide extra insights into relevant signal processing techniques.

## 1. Introduction

In the sinusoid representation of a complex variable *x*, the real variables *a* and *φ* are the *amplitude* and *phase angle* (or *phase* for short) of *x*. By this definition every nonzero complex variable has a unique sinusoid representation, up to the polarization of *a* and 2π shift of *φ*. In practice, these ambiguities are relieved by assuming that *a* and *φ* are continuous and smooth [5].
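
As a short worked illustration of the two ambiguities, write the representation as amplitude times a complex exponential of the phase. The symbols $a$, $\varphi$ and the integer $k$ below are notational assumptions made for this sketch; the identity itself is elementary:

$$
a\,e^{j\varphi} \;=\; (-a)\,e^{j(\varphi+\pi)} \;=\; a\,e^{j(\varphi+2k\pi)}, \qquad k \in \mathbb{Z}.
$$

Flipping the sign of the amplitude is compensated by a phase shift of $\pi$, and any multiple of $2\pi$ added to the phase leaves the value unchanged. Requiring both parameters to be continuous fixes these choices up to one global sign and one global multiple of $2\pi$.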

The parameter variations considered in this paper are the first and second derivatives of *a*, and the second derivative of *φ*, which, respectively, characterize amplitude and frequency modulations. We say that a sinusoid representation is slowly varying if these derivatives have small absolute values. Slowly varying sinusoids include many signals in speech, music, and telecommunications that we technically handle as sinusoids [4, 6, 7], and are the central elements of sinusoid modelling.
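
The variation rates named above can be estimated numerically from a sampled complex signal. The following is a minimal sketch, not the authors' procedure: the function name `variation_rates`, the sampling rate and the test signal are all assumptions chosen for illustration, and derivatives are approximated by finite differences.

```python
import numpy as np

def variation_rates(x, fs):
    """Finite-difference estimates of a', a'' and phi'' for a sampled complex signal.

    x  : complex samples of the signal
    fs : sampling rate in Hz (assumed high enough that the phase
         advances by less than pi per sample, so unwrapping is valid)
    """
    dt = 1.0 / fs
    a = np.abs(x)                    # amplitude (taken non-negative here)
    phi = np.unwrap(np.angle(x))     # continuous phase
    a_dot = np.gradient(a, dt)       # first derivative of amplitude
    a_ddot = np.gradient(a_dot, dt)  # second derivative of amplitude
    phi_ddot = np.gradient(np.gradient(phi, dt), dt)  # frequency derivative
    return a_dot, a_ddot, phi_ddot

# Hypothetical test signal: exponentially decaying linear chirp.
fs = 1000.0
t = np.arange(0.0, 1.0, 1.0 / fs)
decay = 0.5                          # 1/s
chirp_rate = 2 * np.pi * 20.0        # rad/s^2
x = np.exp(-decay * t) * np.exp(1j * (2 * np.pi * 50.0 * t + 0.5 * chirp_rate * t ** 2))

a_dot, a_ddot, phi_ddot = variation_rates(x, fs)
```

For this signal the relative amplitude derivative is constant at `-decay` and the phase second derivative is constant at `chirp_rate`, so both are small whenever the decay and chirp rate are small.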

*t* or *ω* to a real number, by

where we have omitted the subscript from . At , the term reaches its minimum whose square root defines the -*bandwidth* of (likewise we call the -*bandwidth* of at , as it is the -weighted -norm of ). According to (5), the -bandwidth of is upper-bounded if and the range of are. However, as the latter may grow very large over a long time span, (5) does not imply that slowly varying sinusoids have concentrated spectra, as stationary sinusoids do.

*short-time bandwidth* of with window . Figure 1 illustrates the concept of short-time bandwidth applied to a linear chirp in the time-frequency plane. The solid line depicts the angular frequency of the chirp as a function of time. Its short-time spectra are evaluated using two rectangular windows, and , whose durations are marked along the time axis. Although the linear chirp is not band limited, each window captures a band-limited portion of it. The frequency content captured by window is distributed uniformly over ( , ) while that captured by window is distributed uniformly over ( , ). If both windows contain plenty of periods of the sinusoid, then the bandwidths of the two spectra, and , are roughly proportional to and , which are in turn proportional to the length of and and the chirp rate of the sinusoid.
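
The proportionality between short-time bandwidth, window length and chirp rate can be checked numerically. The sketch below is an illustration under stated assumptions rather than a reproduction of Figure 1: it swaps the rectangular windows for Gaussian ones (a rectangular window has a poorly behaved second spectral moment because of its abrupt edges), and the function names, sampling rate and chirp parameters are all hypothetical.

```python
import numpy as np

def short_time_bandwidth(x, window, fs):
    """Standard deviation (rad/s) of the normalized energy spectrum of x * window."""
    spectrum = np.fft.fft(x * window)
    omega = 2 * np.pi * np.fft.fftfreq(len(x), d=1.0 / fs)  # angular frequencies
    energy = np.abs(spectrum) ** 2
    energy = energy / energy.sum()          # spectral energy distribution
    centre = np.sum(omega * energy)         # spectral centroid
    return np.sqrt(np.sum((omega - centre) ** 2 * energy))

def gaussian_window(t, sigma):
    """Gaussian window centred at t = 0 with width parameter sigma (seconds)."""
    return np.exp(-t ** 2 / (2 * sigma ** 2))

fs = 1000.0
t = np.arange(-2.0, 2.0, 1.0 / fs)
omega0 = 2 * np.pi * 100.0                  # centre frequency, rad/s
chirp_rate = 2 * np.pi * 50.0               # rad/s^2
chirp = np.exp(1j * (omega0 * t + 0.5 * chirp_rate * t ** 2))

sigma_short, sigma_long = 0.1, 0.2
bw_short = short_time_bandwidth(chirp, gaussian_window(t, sigma_short), fs)
bw_long = short_time_bandwidth(chirp, gaussian_window(t, sigma_long), fs)
```

For a Gaussian window the squared bandwidth works out to `1/(2*sigma**2) + (chirp_rate*sigma)**2/2`, so once the window spans many periods the second term dominates and doubling the window length roughly doubles the bandwidth.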

where is the Fourier transform of . Since is concentrated at , (7) shows that is also concentrated at as long as and remain small. Mallat [9] used the term *windowed Fourier ridge* to describe this time-frequency distribution as it involves a spectral peak that evolves in time. We call it a *spectral ridge* for short.
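
A spectral ridge can be made visible with a few lines of code. The following sketch, with hypothetical frame length, hop size and chirp parameters, takes the peak of each Hann-windowed short-time spectrum of a complex linear chirp and compares it with the instantaneous frequency at the frame centre. It is a plain peak-picking illustration, not the estimator of [9].

```python
import numpy as np

fs = 1000.0
t = np.arange(0.0, 2.0, 1.0 / fs)
f0, rate_hz = 100.0, 50.0                    # start frequency (Hz), chirp rate (Hz/s)
x = np.exp(2j * np.pi * (f0 * t + 0.5 * rate_hz * t ** 2))

frame_len, hop, nfft = 200, 100, 1000        # zero-padded so one bin is 1 Hz
window = np.hanning(frame_len)

ridge_hz, true_hz = [], []
for start in range(0, len(x) - frame_len + 1, hop):
    frame = x[start:start + frame_len] * window
    magnitude = np.abs(np.fft.fft(frame, n=nfft))
    # The chirp stays below fs/2, so the peak bin maps directly to Hz.
    ridge_hz.append(np.argmax(magnitude) * fs / nfft)
    centre_time = (start + (frame_len - 1) / 2) / fs
    true_hz.append(f0 + rate_hz * centre_time)

ridge_hz = np.array(ridge_hz)
true_hz = np.array(true_hz)
```

Because the window is symmetric and the chirp is linear, each short-time spectrum is symmetric about the frequency at the frame centre, so the peak track follows the instantaneous frequency to within the bin spacing.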

Despite all these studies on sinusoid representations, one question has been overlooked: what type of signals can be modelled as slowly varying sinusoids? From the results above, it is clear that signals with wide short-time bandwidths, such as wide-band noise, *cannot* be slowly varying sinusoids. In this paper, we consider the converse: do narrow short-time bandwidths always imply a slowly varying sinusoid? In other words, does a concentrated short-time spectrum necessarily set certain upper bounds on and ? The concentration of a spectrum is measured by the moments of the spectral energy distribution (i.e., normalized energy spectrum), or *spectral moments* for short. The spectral moment of *x* with centre is given by and can be interpreted as the *biased* -*bandwidth*, as it becomes the -bandwidth if . From (5), it is clear that upper bounds the average amplitude derivative and average frequency departure from . However, the 2nd moment is not enough to set an upper bound on . In what follows, we provide a new result that employs higher spectral moments to upper bound as well as and .

## 2. Parameter Variation Rate Upper Bounds in Terms of 2nd and 4th Spectral Moments

*biased kurtosis* at , defined as

The kurtosis is generally understood to represent the "peakedness" of : a small kurtosis indicates a bulky peak and sharp tails; a large kurtosis indicates a narrow peak and heavy tails. In the context of (13), (9)-(11) state that, for the same 2nd moment, a larger kurtosis allows more modulation.
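
Spectral moments and a kurtosis-like ratio are straightforward to compute from a discrete spectrum. The sketch below assumes the conventional definitions: the *p*th moment is the energy-weighted mean of the *p*th power of the absolute frequency offset from a chosen centre, with no root taken, and the kurtosis is the 4th moment divided by the squared 2nd moment. These may differ in normalization from the definition in (13), and the function names are hypothetical.

```python
import numpy as np

def spectral_moment(x, fs, p, centre):
    """p-th moment of the normalized energy spectrum of x about `centre` (rad/s)."""
    spectrum = np.fft.fft(x)
    omega = 2 * np.pi * np.fft.fftfreq(len(x), d=1.0 / fs)
    energy = np.abs(spectrum) ** 2
    energy = energy / energy.sum()
    return np.sum(np.abs(omega - centre) ** p * energy)

def kurtosis_about(x, fs, centre):
    """Ratio of the 4th moment to the squared 2nd moment, both about `centre`."""
    m2 = spectral_moment(x, fs, 2, centre)
    m4 = spectral_moment(x, fs, 4, centre)
    return m4 / m2 ** 2

# Hypothetical test signal: Gaussian-windowed stationary complex sinusoid.
fs = 1000.0
t = np.arange(-1.0, 1.0, 1.0 / fs)
sigma = 0.05                                  # window width, seconds
omega0 = 2 * np.pi * 100.0                    # sinusoid frequency, rad/s
x = np.exp(-t ** 2 / (2 * sigma ** 2)) * np.exp(1j * omega0 * t)

m2 = spectral_moment(x, fs, 2, omega0)
kurt = kurtosis_about(x, fs, omega0)
```

The energy spectrum of this signal is Gaussian, so the ratio comes out close to 3, the familiar reference value: spectra with heavier tails give larger values and flatter spectra give smaller ones.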

Inequalities (9)-(11) can also be directly applied to windowed Fourier transforms by replacing with where is the window function. As , if and are upper bounded, then so is . Inequalities (9)-(11) indicate that the sinusoid representation of a signal whose STFT forms a spectral ridge is necessarily slowly varying in terms of the short-time average of parameter variation rates. This, together with our comments in the introduction, completes the following statement.

*A complex signal has a slowly varying sinusoid representation if and only if it has narrow short-time bandwidths.* (*)

-*bandwidth* is simply the *p*th spectral moment computed with . In (*) the "only if" part comes from the previous studies we summarized by (6) in the introduction; the "if" part comes from our results (9)-(11). The plural form in "bandwidths" refers to the values evaluated in both - and -norms at different points over the whole duration. A quantitative presentation of (*) is given by rewriting (6) and (10), (11) employing a sliding window

where is the window function centred at .

We notice that is measured differently in (14) and (16), giving a double meaning to "slowly varying frequency" in (*). For this reason, (*) does not actually give a pair of strictly converse statements, and the equivalence between slowly varying sinusoids and spectral ridges, as established by (14)-(16), should only be considered qualitatively. Nevertheless, these results partially answer the question of what kind of signals can be modelled as slowly varying sinusoids. In the rest of this paper, we use (*) as a guideline and see how it relates to various sinusoid modelling practices.

## 3. Discussions and Conclusion

### 3.1. Combining Sinusoids with Close Frequencies

*Beating* [11] is a well-known effect observed when adding two sinusoids with similar frequencies, in which they "melt" into a single tone with additional modulation. This phenomenon can be easily explained by (*): as slowly varying sinusoids have short-time spectral energy concentrated near their angular frequencies, if the frequencies are close, then their sum also has concentrated short-time spectral energy and is therefore also a slowly varying sinusoid. Additional modulation may be introduced as the result of a wider bandwidth contributed by the small interval between the participant frequencies. A quantitative proof of this argument is given in [12], which leads to an additive re-estimation algorithm for measuring parameters of slowly varying sinusoids.
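
The simplest instance of beating can be worked out in closed form. Assume two stationary complex sinusoids of unit amplitude with angular frequencies $\omega_1$ and $\omega_2$ (symbols introduced here for illustration only). Their sum factorizes as

$$
e^{j\omega_1 t} + e^{j\omega_2 t}
= 2\cos\!\left(\frac{\omega_1-\omega_2}{2}\,t\right)\,
e^{\,j\frac{\omega_1+\omega_2}{2}\,t},
$$

which is a single sinusoid at the mean frequency with a signed, smooth amplitude $a(t) = 2\cos\!\big(\tfrac{\omega_1-\omega_2}{2}t\big)$. Its derivatives satisfy $|a'(t)| \le |\omega_1-\omega_2|$ and $|a''(t)| \le \tfrac{1}{2}(\omega_1-\omega_2)^2$, so the closer the two frequencies, the more slowly the combined sinusoid varies.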

Statement (*) also reveals the difficulty in separating close sinusoids by the slow variation criterion alone. Since there are infinitely many ways to divide a spectral ridge into two or more subridges, and since all narrow ridges are necessarily slowly varying sinusoids, there are infinitely many separations that are slowly varying.

### 3.2. Atomic Decomposition

The term *atom* refers to basic waveforms with concentrated time and frequency localization into which a signal is decomposed. Windowed sinusoid atoms have been used in short-time Fourier and Gabor transforms [13], matching pursuits [14], auditory scene analysis [15], and methods for approximating time-varying sinusoids [16, 17]. An *overlap-add sinusoidal model* was proposed [16] in the typical form of atomic decomposition

where is a slowly varying sinusoid, , and are constants for given , and is the overlap-add window centred at the *i*th reference point, say . Adjacent windows are arranged to have considerable overlap.

The use of overlapping stationary sinusoids to approximate time-varying sinusoids is partially justified by (*). Since windowed sinusoid atoms have concentrated spectral energy at the sinusoid frequencies, their sum will form a narrow spectral ridge as long as frequencies of adjacent atoms are close enough so that the result represents a slowly varying sinusoid. It is also apparent that if there is a large frequency jump between any adjacent atoms, then the sum is no longer a slowly varying sinusoid, indicating that (17) is not a suitable representation of .
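
The approximation of a time-varying sinusoid by overlapping stationary atoms can be tried out directly. The sketch below is an assumption-laden illustration and not the analysis-by-synthesis method of [16]: each atom is a constant-frequency sinusoid whose phase and frequency match a linear chirp at the window centre, the windows are periodic Hann windows with 50% overlap (which sum to one), and all parameter values are hypothetical.

```python
import numpy as np

fs = 8000.0
t = np.arange(8000) / fs
omega0 = 2 * np.pi * 200.0            # start frequency, rad/s
chirp_rate = 2 * np.pi * 50.0         # rad/s^2

def phase(time):
    """Phase of the target linear chirp."""
    return omega0 * time + 0.5 * chirp_rate * time ** 2

def frequency(time):
    """Angular frequency of the target linear chirp."""
    return omega0 + chirp_rate * time

target = np.exp(1j * phase(t))

hop = 80                              # 10 ms between reference points
# Periodic Hann window of length 2 * hop; shifted copies add up to 1.
window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(2 * hop) / (2 * hop))

approx = np.zeros(len(t), dtype=complex)
for start in range(0, len(t) - 2 * hop + 1, hop):
    idx = np.arange(start, start + 2 * hop)
    t_i = t[start + hop]              # reference point at the window centre
    # Stationary atom: constant frequency, phase matched at t_i.
    atom = np.exp(1j * (phase(t_i) + frequency(t_i) * (t[idx] - t_i)))
    approx[idx] += window * atom

interior = slice(hop, len(t) - hop)   # region covered by two windows
max_error = np.max(np.abs(approx[interior] - target[interior]))
```

With these values the phase error of each atom at the edge of its window is at most `chirp_rate * (hop / fs) ** 2 / 2`, about 0.016 rad, so the overlap-add sum stays close to the chirp. Increasing the hop or the chirp rate makes adjacent atoms differ more in frequency and the error grows accordingly.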

*two* sinusoids will allow much slower modulation rates than a single-sinusoid representation.

### 3.3. Real Sinusoids and Analytic Signals

A real slowly varying sinusoid can always be written as the sum of two conjugate complex slowly varying sinusoids. According to (*), its spectrogram is made up of two spectral ridges. To find a slowly varying *double*-sinusoid representation for a real sinusoid, one only needs to separate the spectrogram into two parts, each containing one ridge. This separation is generally not unique. If the two parts are conjugate to each other, then the real part of the corresponding complex sinusoids equals half of the real sinusoid.

Most of the real sinusoids we encounter in practice have always-positive frequencies so that each spectral ridge lies in a half plane on either side of the time axis. In this case, the most natural separation is obtained by splitting the spectrum along the time axis, which leads to analytic complex sinusoids [5]. We observe by (*) that the analytic representation is slowly varying by design, provided the sinusoid concerned has a slowly varying representation at all.
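
Splitting the spectrum along the time axis can be sketched with a discrete Fourier transform. The helper below, whose name and test signal are assumptions for illustration, keeps the positive-frequency half of a real signal's spectrum and shares the DC and Nyquist bins equally, so that the result and its conjugate add back to the original signal.

```python
import numpy as np

def positive_frequency_half(x):
    """Complex signal built from the positive-frequency half of a real signal.

    Positive-frequency bins are kept, negative ones are zeroed, and the DC and
    Nyquist bins are halved, so the real part of the result equals x / 2.
    """
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 0.5                         # share DC between the two halves
    if n % 2 == 0:
        gain[1:n // 2] = 1.0              # strictly positive frequencies
        gain[n // 2] = 0.5                # share the Nyquist bin
    else:
        gain[1:(n + 1) // 2] = 1.0
    return np.fft.ifft(spectrum * gain)

# Hypothetical real slowly varying sinusoid: Gaussian-enveloped linear chirp.
fs = 1000.0
t = np.arange(-1.0, 1.0, 1.0 / fs)
envelope = np.exp(-t ** 2 / (2 * 0.1 ** 2))
real_chirp = envelope * np.cos(2 * np.pi * (100.0 * t + 25.0 * t ** 2))

half = positive_frequency_half(real_chirp)
```

Here `half.real` reproduces `real_chirp / 2`, and because the chirp's frequency stays well above zero, `np.abs(half)` follows `envelope / 2` closely: the amplitude of the complex half varies as slowly as the envelope itself.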

*nearly* analytic sinusoid obtained by setting the spectrogram in Figure 3(a), rather than the spectrum, to zero over negative frequencies. Although Figure 3(b) and Figure 3(c) look different, both are slowly varying complex sinusoids with the real part equalling half the linear chirp in Figure 3(a).

### 3.4. More on Slowly Varying Real Sinusoids

*a* and *φ* to give the slowest varying representation, in the sense of minimizing

*I* in (18), is given as

where is the 4th-order derivative of . This condition is automatically satisfied regardless of if is exponential and is trinomial, but can be more constraining in other cases.

Nonunique representations of real sinusoids may cause problems in evaluating sinusoid estimators. For example, while a complex linear chirp defines a linear frequency for its corresponding real chirp, the latter's analytic counterpart defines a nonlinear frequency which is no less convincing. Fortunately, in [18] we have shown that the difference between various sinusoid representations of the same real signal is bounded by their parameter variation rates. Consequently, if a signal has multiple slowly varying sinusoid representations, then they are close to each other.

### 3.5. Conclusion

In this paper, we have given three inequalities that bound the parameter variation rates of the sinusoid representation of a complex signal by its 2nd and 4th spectral moments, indicating that every complex signal with narrow short-time bandwidths is necessarily a slowly varying sinusoid. This, together with several previous results, argues for the equivalence between slowly varying sinusoids and signals with narrow short-time bandwidths, which, in turn, provides extra insights into various aspects of sinusoid modelling.

## Declarations

### Acknowledgment

This paper was supported by the EPSRC EP/E017614/1 project OMRAS2 (Online Music Recognition and Searching).

## Authors’ Affiliations

## References

1. McAulay RJ, Quatieri TF: **Speech analysis/synthesis based on a sinusoidal representation.** *IEEE Transactions on Acoustics, Speech, and Signal Processing* 1986, **34**(4):744-754. 10.1109/TASSP.1986.1164910
2. Serra X: **Musical sound modeling with sinusoids plus noise.** *Musical Signal Processing* 1997, 91-122.
3. Peeters G, Rodet X: **SINOLA: a new analysis/synthesis method using spectrum peak shape distortion, phase and reassigned spectrum.** *Proceedings of the International Computer Music Conference (ICMC '99), 1999, Beijing, China* 153-156.
4. Carlson AB: *Communication Systems*. 2nd edition. McGraw-Hill, New York, NY, USA; 1981.
5. Cohen L, Loughlin P, Vakman D: **On an ambiguity in the definition of the amplitude and phase of a signal.** *Signal Processing* 1999, **79**(3):301-307. 10.1016/S0165-1684(99)00103-6
6. Fant G: **The acoustics of speech.** *Proceedings of the 3rd International Conference Solar Air-Conditioning, 1959, Stuttgart, Germany* 188-201.
7. Fletcher NH, Rossing TD: *The Physics of Musical Instruments*. 2nd edition. Springer, New York, NY, USA; 1998.
8. Cohen L, Lee C: **Standard deviation of instantaneous frequency.** *Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), May 1989* **4:** 2238-2241.
9. Mallat S: *A Wavelet Tour of Signal Processing*. 2nd edition. Academic Press; 1999.
10. Davidson KL, Loughlin PJ: **Instantaneous spectral moments.** *Journal of the Franklin Institute* 2000, **337**(4):421-436. 10.1016/S0016-0032(00)00034-X
11. Jeffress LA: **Beating sinusoids and pitch changes.** *Journal of the Acoustical Society of America* 1968, **43**(6):1464. 10.1121/1.1911027
12. Wen X, Sandler M: **Additive and multiplicative reestimation schemes for the sinusoid modeling of audio.** *Proceedings of 17th European Signal Processing Conference (EUSIPCO '09), 2009, Glasgow, UK*
13. Gabor D: **Theory of communication.** *Journal of the Institute of Electronics Engineers* 1946, **3:** 429-459.
14. Mallat SG, Zhang Z: **Matching pursuits with time-frequency dictionaries.** *IEEE Transactions on Signal Processing* 1993, **41**(12):3397-3415. 10.1109/78.258082
15. Brown GJ, Cooke M: **Computational auditory scene analysis.** *Computer Speech and Language* 1994, **8**(4):297-336. 10.1006/csla.1994.1016
16. George EB, Smith MJT: **Analysis-by-synthesis/overlap-add sinusoidal modeling applied to the analysis and synthesis of musical tones.** *Journal of the Audio Engineering Society* 1992, **40**(6):497-516.
17. Davy M, Godsill SJ: **Bayesian harmonic models for musical signal analysis.** In *Bayesian Statistics 7*. Oxford University Press, Oxford, UK; 2003.
18. Wen X: *Harmonic sinusoid modelling of tonal music events, Ph.D. thesis*. University of London, London, UK; 2007.

## Copyright

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.