A simplified and controllable model of mode coupling for addressing nonlinear phenomena in sound synthesis processes
EURASIP Journal on Audio, Speech, and Music Processing volume 2024, Article number: 38 (2024)
Abstract
This paper introduces a simplified and controllable model for mode coupling in the context of modal synthesis. The model employs efficient coupled filters for sound synthesis purposes, intended to emulate the generation of sounds radiated by sources under strongly nonlinear conditions. Such filters generate tonal components in an interdependent way and are intended to emulate realistic perceptually salient effects in musical instruments in an efficient manner. The control of energy transfer between the filters is realized through a coupling matrix. The generation of prototypical sounds corresponding to nonlinear sources with the filter bank is presented. In particular, examples are proposed to generate sounds corresponding to impacts on thin structures and to the perturbation of the vibration of an object when it collides with another object. The sound examples presented in the paper and available for listening on the accompanying site illustrate that a simple control of the input parameters allows the generation of sounds whose evocation is coherent and that the addition of random processes yields a significant improvement to the realism of the generated sounds.
1 Introduction
Modal synthesis operates according to the decomposition of the complex dynamic behavior of a vibrating object into contributions from modes, each oscillating independently at a single frequency. This approach, applicable to linear and time-invariant systems, is widely used and forms the basis for various physical modeling synthesis software packages [1, 2] and is closely related to sound synthesis methodologies employing filter banks [3,4,5].
For vibrating objects incorporating nonlinear effects, the modal interpretation must be generalized to include energy transfer between different modes and other effects such as, e.g., frequency shifting of modes over time. Such energy transfer may cause the delayed and sustained appearance of tonal components that cannot be generated by a linear model. This complex phenomenon, widely studied for the typical case of thin plates and shells [6, 7], can be modeled and solved under certain conditions. The numerical solution of the Föppl-von Kármán system [8, 9] that governs the underlying dynamics of nonlinear thin plates at moderate vibration amplitudes yields realistic and convincing sound synthesis [10] but at heavy computational cost. Ducceschi and Touzé [11] propose the modal resolution of the system with the offline calculation of coupling coefficients. They manage under certain approximations to significantly reduce the computation time without being able to achieve real-time sound synthesis (about 8 times real-time on a CPU) [12]. As of 2023, real-time performance is available for limited plate sizes [13]. Another typical case of coupling between modes induced by nonlinear phenomena results from collisions in musical instruments [14] and has been the subject of various studies, including on modal interactions [15]. Computational cost for synthesis can also be heavy in such cases.
For synthesis purposes, and particularly if real-time performance is the ultimate aim, it can be useful to depart from strict physical models and examine modal interactions from a perceptual point of view—closer in spirit to so-called “procedural audio” approaches [16]. Skare and Abel [17] perform real-time modal synthesis of crash cymbals with a GPU-accelerated modal filterbank. Their method consists in identifying the modal parameters (including a rough approximation of the couplings) on recorded sounds, although the energy transfer mechanism is unspecified.
In this paper, we propose a simple model for energy transfers between modes. Then, we design coupled filters based on the design proposed by Mathews and Smith [18] and adapted by Skare and Abel [17] to incorporate energy transfer. This study is a direct extension of previously conducted work [19], incorporating additional effort to ensure that the design of coupled filters is more coherent with the underlying physical system. In particular, we propose an equivalence between the power of the signal of the filters and the energy of a vibration mode from an equivalent physical system to ensure energy conservation during transfers. Inter-modal energy transfer is encoded in a matrix containing all the coupling coefficients. The aim of this paper is not to propose a synthesis model performing an accurate simulation of a physical system. Instead, we seek to develop a framework allowing direct modeling of sounds targeted to the way they are perceived. This results in an efficient way to generate sounds evoking nonlinear sources and can yield real-time event-driven synthesis of sounds in virtual or augmented reality environments, a particularly active field of research [20, 21].
Some background on modal synthesis is given in Section 2, and the energy transfer model is presented in Section 3. Then, the design of the coupled filters is detailed in Section 4, the definition of the matrix containing the coupling terms is proposed in Section 5, and methods to enhance the efficiency and randomize the process are presented in Section 6. Various example systems used to generate prototypical sounds are presented in Section 7. Sound examples are available online [22].
2 Modal synthesis for the linear case
The modal resolution of a linear partial differential equation (PDE) system describing the vibrations of a resonant object is well-described in various texts [23]. Solutions are of the following form for the displacement w depending on a spatial coordinate \(\textbf{r}\) and time t:
where
Here, * represents a convolution operation, and the impulse response \(h_i(t)\) has the following form:
the function \(\phi _i(\textbf{r})\) is the \(i^\text {th}\) mode’s shape or basis function, and \(\omega _{d_i}\) and \(\alpha _i\) are the angular frequency and the damping coefficient of the \(i^\text {th}\) mode, respectively. One can note that the angular frequency differs from the angular natural frequency \(\omega _i\):
The constants \(A_i\) and \(\varphi _i\) derive from the initial conditions, and \(g_i(t)\) is the modal excitation (formally derived from a PDE system by the projection of an excitation source term \(g(\textbf{r},t)\) onto the modal basis functions \(\phi _i(\textbf{r})\)).
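As an illustration of the linear case, the following minimal Python sketch sums exponentially damped sinusoids, one per mode. It assumes the standard form \(h_i(t)=A_i e^{-\alpha_i t}\sin(\omega_{d_i}t+\varphi_i)\) with \(\omega_{d_i}=\sqrt{\omega_i^2-\alpha_i^2}\); the function name, the example modal values, and the folding of the mode shapes \(\phi_i(\textbf{r})\) into the amplitudes are assumptions made for the example only.

```python
import numpy as np

def modal_impulse_response(freqs_hz, alphas, amps, phases, fs, duration):
    """Sum of exponentially damped sinusoids, one per mode (linear case)."""
    t = np.arange(int(duration * fs)) / fs
    omega = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float)  # natural angular frequencies
    alphas = np.asarray(alphas, dtype=float)
    # Damped angular frequency; assumes alpha_i < omega_i (underdamped modes)
    omega_d = np.sqrt(omega**2 - alphas**2)
    envelope = np.exp(-np.outer(alphas, t))
    carrier = np.sin(np.outer(omega_d, t) + np.asarray(phases, dtype=float)[:, None])
    return (np.asarray(amps, dtype=float)[:, None] * envelope * carrier).sum(axis=0)

# Hypothetical three-mode object observed at a single point
y = modal_impulse_response(
    freqs_hz=[220.0, 563.0, 1015.0],
    alphas=[3.0, 5.0, 9.0],
    amps=[1.0, 0.5, 0.25],
    phases=[0.0, 0.0, 0.0],
    fs=44100,
    duration=1.0,
)
```

Each mode evolves independently here, which is precisely the property that the coupled model of the following sections relaxes.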
3 Inter-modal energy transfer
3.1 Definitions and approximations
For a linear system, the mechanical energy of the \(i^\text {th}\) mode \(E_m^i(t)\) can be calculated by adding its potential energy \(E_p^i(t)\), its kinetic energy \(E_i^i(t)\), and the accumulated energy \(E_s^i(t)\) supplied by the source up to time t:
Here, \(K_i\) is the modal stiffness, \(M_i\) is the modal mass and \(q_i(t)\) is the modal displacement, defined as follows:
The function \(m(\textbf{r})\) is the density, and \(\Omega\) is the closed space containing the vibrating object.
A simple approximation to the mechanical energy follows from the assumption that it is proportional to the square of the modal displacement amplitude (see Fig. 1). Thus, we can approximate the mechanical energy of the \(i^\text {th}\) mode by computing the power of the modal displacement signal denoted as \(P_i(t)\) (where the power of a sinusoidal signal is equal to its squared amplitude divided by 2):
with \(A_i\) the initial amplitude of the tonal component.
It is important to note that the power referred to here is not mechanical power expressed in watts but rather signal power (this will be useful for the design of the coupled filters).
Additionally, assuming identical modal masses for all modes (this follows from a uniform density and modal orthogonality) allows us to establish that the mechanical energy of a given mode is also proportional to the square of the mode’s angular frequency. Indeed, the potential energy is proportional to the square of the angular frequency and the squared modal displacement:
Moreover, considering that mechanical energy constitutes the sum of potential and kinetic energy, and that kinetic energy is zero when the potential energy reaches its maximum, it follows that the mechanical energy of a mode is directly proportional to its squared angular frequency and the square of the amplitude of the modal displacement (proportionate to the signal power):
To establish a simple and controllable model, we neglect the influence of the phase on energy transfers. We introduce a term \(\Pi _T^i(t)\) to induce transfers of energy between distinct modes. In the absence of an external source, the energy of a mode is expressed as its initial value, with a decrease over time due to the cumulative losses and modified by the transferred energy with other modes:
One can note that \(E_{m}^i\) appears in the loss term because we assumed an exponential decay for the modes, as is commonly done in modal models.
3.2 Energy transfer model
The challenge is to arrive at a model simple enough to be controllable (i.e., to be able to predict the sound outcome of a manipulation of the parameters) and complete enough to allow the matching of modal trajectories to a range of nonlinear phenomena. We define the transfer term as follows:
Here, \([\cdot ]_{+}\) indicates the “positive part of”, i.e., \([\zeta ]_{+} = \frac{1}{2}(\zeta + |\zeta |)\), \(\tau _i\) is the threshold beyond which the energy of mode i is transferred to other modes (\(\tau _i\ge 0\)), \(\lambda\) is the redistribution rate (\(\lambda \ge 0\)), and \(c_{ik}\) is a positive coupling coefficient (\(c_{ik}\ge 0\) and \(\sum \nolimits _i c_{ik}\le 1\)).
Thus, the transfer terms are proportional to the excess energy above a threshold and the terms \(c_{ik}\) define the proportions distributed and received by each other component. Note that this relation is not an immediate consequence of a physical model but is a heuristic means of capturing salient phenomena in a physical system. Our focus is on the design of a synthesis process with a predictable sound outcome rather than on the simulation of a physical system. Nevertheless, our model remains physically informed and consistent with the conservation of energy in the associated mechanical system.
This transfer process is nonlinear due to the introduction of energy transfer between modes, a characteristic nonlinear phenomenon in physical systems. Moreover, the emergence of couplings itself is nonlinear owing to the incorporation of a threshold effect (i.e., there is no coupling below \(\tau _i\)). This threshold effect is easily justified from a physical standpoint when considering collision phenomena (i.e., interaction occurring above a certain threshold corresponding to the contact between two objects). However, it is less aligned with reality concerning geometric nonlinearities. In such cases, it might be conceivable to introduce other nonlinear transfers (e.g., \(\propto \left( E_m^i(t)\right) ^\beta\) with \(\beta \ne 1\)), but this type of transfer would be challenging to control.
We can define the following differential equation that governs the energy variations of the modes, excluding the effect of the source:
This equation can be analytically solved by considering initial conditions \(E_m^i(0)\) and distinguishing between cases where the energy of each mode is either above or below the corresponding threshold \(\tau _i\). For example, if \(E_m^i(0)>\tau _i\) and \(E_m^k(t)<\tau _k\ \forall k\ne i\) the solution takes the following form (see Fig. 2):
with \(t_{0}=-\frac{1}{\lambda +2\alpha _i} \left[ \ln {\left( \tau _i \left( 1- \frac{\lambda }{\lambda +2 \alpha _i}\right) \right) - \ln {\left( E_m^i(0) - \frac{\lambda \tau _i}{\lambda +2 \alpha _i}\right) }} \right]\).
We can rewrite Eq. (12) in terms of signal power (see Eq. (9)), which will be useful for the implementation of the coupled filters (Section 4):
We define here \(T_i(t)\) as the power-related transfer terms (equivalent to \(\Pi _T^i(t)\) for energy calculation).
In the next section, we present sufficient conditions on energy transfers to ensure the stability of the system.
3.3 Energy and stability
A sufficient condition for the stability of the system which is consistent with the physics (energy conservation) is to impose a non-positivity constraint for the sum of energy transfers:
This condition impedes the creation of energy during transfer between modes. It is ensured by the following condition on the coupling coefficients:
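The constraint stated in Section 3.2, \(\sum \nolimits _i c_{ik}\le 1\), can be checked numerically. The sketch below assumes one plausible reading of the transfer term, in which each mode loses \(\lambda [E_m^i-\tau_i]_+\) and receives \(\lambda \sum_k c_{ik}[E_m^k-\tau_k]_+\); this form, the function name, and the numerical values are assumptions for illustration, not a transcription of the paper's equations.

```python
import numpy as np

def energy_transfer(E, tau, lam, C):
    """Assumed transfer term: gain from other modes minus own excess energy."""
    excess = np.maximum(E - tau, 0.0)  # positive part [E - tau]_+
    return lam * (C @ excess - excess)

E = np.array([4.0, 0.5, 0.1])    # modal energies (hypothetical values)
tau = np.array([1.0, 1.0, 1.0])  # thresholds
# C[i, k]: share of the excess of mode k received by mode i; column sums <= 1
C = np.array([[0.0, 0.0, 0.0],
              [0.6, 0.0, 0.0],
              [0.3, 0.5, 0.0]])
Pi = energy_transfer(E, tau, lam=2.0, C=C)
```

Only the first mode exceeds its threshold in this example, so it loses energy while the other two gain. Since the first column of `C` sums to 0.9, a tenth of the transferred energy is dissipated and the sum of the transfer terms is negative.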
4 Design of the coupled filters
4.1 Linear filtering for modal synthesis
In the linear case, a straightforward approach to numerical solution at a sample rate \(f_{s}\) in Hz is to use recursive filters with an exponentially-damped sinusoidal impulse response. The filter proposed by Mathews and Smith [18] has this property. The implementation of this filter consists in calculating, for each time step n, the imaginary part of a complex number z(n) whose rotation speed in the complex plane is constant and corresponds to the angular frequency \(\omega\) of the exponentially damped sinusoid:
with u(n) the source of the filter, and Z the constant modification of the phase and modulus per time step:
with \(X = e^{-\alpha /f_s}\cos (\omega /f_s)\) and \(Y = e^{-\alpha /f_s}\sin (\omega /f_s)\).
The recurrence equation on the complex sequence z(n) is computed by the following system including a recurrence equation for the real part \(x(n) = \text {Re}(z(n))\) and a recurrence equation for the imaginary part \(y(n) = \text {Im}(z(n))\), which is the output of the filter:
for a real source \(u(n)\in \mathbb {R}\).
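A minimal Python sketch of such a filter is given below, using the coefficients \(X\) and \(Y\) defined above. The indexing convention (the source is added to the real part at each step) and the function name are assumptions made for the example.

```python
import numpy as np

def damped_sine_filter(u, freq_hz, alpha, fs):
    """Complex one-pole resonator; the output is the imaginary part y(n)."""
    omega = 2.0 * np.pi * freq_hz
    r = np.exp(-alpha / fs)       # modulus change per time step
    X = r * np.cos(omega / fs)
    Y = r * np.sin(omega / fs)
    x = y = 0.0
    out = np.empty(len(u))
    for n, u_n in enumerate(u):
        # Tuple assignment evaluates both right-hand sides with the old x, y
        x, y = X * x - Y * y + u_n, Y * x + X * y
        out[n] = y
    return out

fs = 44100
u = np.zeros(fs)
u[0] = 1.0  # unit impulse
y = damped_sine_filter(u, freq_hz=440.0, alpha=5.0, fs=fs)
```

With a unit impulse as the source, the output follows \(e^{-\alpha n/f_s}\sin(\omega n/f_s)\), which is the exponentially damped sinusoid expected from the modal model.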
4.2 Principle and implementation of the coupling
Consider N filters defined as in Section 4.1 in parallel to be coupled through the methodology outlined in Section 3.2. We note \(z_i(n)\) the complex sequence corresponding to the \(i^{\text {th}}\) filter, with \(x_i(n)\) its real part and \(y_i(n)\) its imaginary part (corresponding to the output signal of the filter). The source for the \(i^{\text {th}}\) filter, corresponding to the projection of the source of the filter bank u(n) onto the \(i^{\text {th}}\) modal basis function, is noted \(u_i(n)\).
The continuous Eq. (14) can be solved in discrete time using the following recurrence relation (see the Appendix):
with \(P_i(n)\) the power of the tonal component defined as follows:
\(x_i(n), y_i(n)\in \mathbb {R}\).
Thus, we can express the variation of the modulus of \(z_i(n)\) due to energy transfer between two time steps:
We can define an amplitude ratio between the modulus for two consecutive time steps if \(|z_i(n)|\ne 0\):
Thus, we can modify the recurrence equation defined in the previous section (see Eq. (17)) by incorporating the modulus variations due to energy transfers. It gives the following recurrence relation for \(z_i\), including the source and phase variations:
Finally, we can write the system of equations for the implementation of the filters:
with \(X_i = \sqrt{e^{-2\alpha _i /f_s}+\frac{2T_i(n)}{|z_i(n)|^2}} \cos (\omega _i/f_s)\) and \(Y_i = \sqrt{e^{-2\alpha _i /f_s}+\frac{2T_i(n)}{|z_i(n)|^2}} \sin (\omega _i/f_s)\)
In this way, power can be transferred among the different filters without affecting the phases. The coupling intervenes in the calculation of the transfer terms \(T_i(n)\) which ultimately involve the other filters.
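One time step of the coupled bank can be sketched in Python as follows, with the transfer terms \(T_i(n)\) taken as given. The sketch assumes \(P_i(n)=|z_i(n)|^2/2\), which is consistent with the factor \(2T_i(n)/|z_i(n)|^2\) in \(X_i\) and \(Y_i\). The clamping of a negative radicand to zero and the absence of transfer towards a silent filter are guards added for the example, not part of the described method.

```python
import numpy as np

def coupled_step(z, u, T, omega, alpha, fs):
    """Advance the complex filter states z by one sample, given transfers T."""
    mod2 = np.abs(z) ** 2
    active = mod2 > 0.0
    safe_mod2 = np.where(active, mod2, 1.0)  # avoids division by zero
    # Squared amplitude ratio: damping plus power transfer (none if |z| = 0)
    ratio2 = np.exp(-2.0 * alpha / fs) + np.where(active, 2.0 * T / safe_mod2, 0.0)
    gain = np.sqrt(np.maximum(ratio2, 0.0))  # clamp: guard added for this sketch
    Z = gain * np.exp(1j * omega / fs)       # equals X_i + j Y_i
    return Z * z + u                         # a real source feeds the real part

fs = 44100.0
z0 = np.array([1.0 + 0.0j, 0.5j])
omega = 2.0 * np.pi * np.array([200.0, 300.0])
alpha = np.array([2.0, 3.0])
T = np.array([-0.1, 0.1])  # hypothetical transfers: filter 0 gives, filter 1 receives
z1 = coupled_step(z0, np.zeros(2), T, omega, alpha, fs)
```

After the step, the power of each filter equals its damped previous power plus its transfer term, and the phase advances by \(\omega_i/f_s\) as in the uncoupled filter.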
5 Distribution matrix
This section presents a formalism for the calculation and control of the coupling between filters.
Now, define the column vectors \(\textbf{p}(n) = [P_{1}(n),\ldots ,P_{N}(n)]^{T}\) and \(\textbf{t}(n) = [T_{1}(n),\ldots ,T_{N}(n)]^{T}\). The power transfers between the tonal components \(\textbf{t}(n)\) at a given time step n are defined as:
An \(N\times 1\) column vector \(\varvec{\tau }\) containing the thresholds \(\tau _i\), \(i=1,\ldots ,N\) at which transfers are activated for each tonal component has also been introduced here.
Thus, the calculation of the transfer terms is performed by the matrix product of an \(N\times N\) redistribution matrix \(\textbf{M}\) with the vector resulting from the positive part of the difference between the power of each frequency component \(\textbf{p}(n)\) and the associated threshold \(\varvec{\tau }\). In other words, the transfer terms \(T_i(n)\) are proportional to the excess power above the corresponding threshold and the terms of the matrix \(\textbf{M}\) define the proportions distributed to and received by each component.
The diagonal entries \(M_{ii}\) of the matrix \(\textbf{M}\) define the proportion of power of the \(i^\text {th}\) mode that will be redistributed to other modes, and the off-diagonal entries \(M_{ik}\) (\(i\ne k\)) define the quantity that the \(i^\text {th}\) mode will receive from the \(k^\text {th}\) mode.
An efficient way to define the coefficients of the matrix is to use the following expression:
where \(a_{ik}\) is a coefficient weighting the redistribution from the \(k^\text {th}\) mode to the \(i^\text {th}\) mode (\(a_{ik}\ge 0\)), and \(\delta _{ik}\) is Kronecker’s \(\delta\) (\(\delta _{ii}=1\) and \(\delta _{ik}=0\) if \(i\ne k\)). While enforcing \(a_{ii}=0\) is not imperative, defining non-zero coefficients on the diagonal bears little significance, as it merely redistributes a mode onto itself, resulting in negligible effects.
In this formulation, the stability of the filter bank is ensured for arbitrary \(a_{ik}\), provided that at least one value per column is non-zero, that \(\lambda \le f_s\) and that \(0 \le \eta \le 1\). \(\eta\) corresponds to the efficiency of the transfers (\(\sum _{i=1}^N c_{ik}= \eta\)). If \(\eta =1\), there is no loss during the transfer process. If \(\eta =0\), all the transferred energy is lost. \(\lambda /f_s\) is the proportion of power above the threshold transferred to other modes at each time step (\(0 \le \lambda \le f_s\)). The values of the off-diagonal elements \(M_{ik}\) of the matrix \(\textbf{M}\) are the proportion of energy transferred by the mode k that will be received by the mode i.
The \(i^\text {th}\) transfer term \(T_i(n)\) can be expressed as follows:
Here, we can differentiate between the positive contributions \(T_{i+}(n)\), which represent energy moving to the \(i^\text {th}\) mode from the other modes, and the negative contribution \(T_{i-}(n)\), which denotes energy leaving the \(i^\text {th}\) mode to other modes.
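The construction of the redistribution matrix can be sketched in Python as follows. The sketch assumes \(M_{ik}=\frac{\lambda}{f_s}\left(\eta \frac{a_{ik}}{\sum_j a_{jk}}-\delta_{ik}\right)\), a form inferred from the properties stated above (columns of \(c_{ik}\) summing to \(\eta\), a fraction \(\lambda/f_s\) of the excess power moved per time step); the function names and numerical values are hypothetical.

```python
import numpy as np

def redistribution_matrix(a, lam, eta, fs):
    """Build M from weights a[i, k] (redistribution from mode k to mode i)."""
    a = np.asarray(a, dtype=float)
    col_sums = a.sum(axis=0)
    if np.any(col_sums <= 0.0):
        raise ValueError("each column of a needs at least one non-zero weight")
    c = eta * a / col_sums  # coupling coefficients; each column sums to eta
    return (lam / fs) * (c - np.eye(a.shape[0]))

def transfer_terms(M, p, tau):
    """t(n) = M [p(n) - tau]_+"""
    return M @ np.maximum(p - tau, 0.0)

a = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
M = redistribution_matrix(a, lam=100.0, eta=1.0, fs=44100.0)
p = np.array([1.0, 0.2, 0.0])    # current powers
tau = np.array([0.5, 0.5, 0.5])  # thresholds
t = transfer_terms(M, p, tau)
```

With \(\eta=1\) every column of `M` sums to zero, so the power leaving the first filter is exactly the power received by the two others.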
6 Efficiency and randomization of the process
It is possible to perform the energy transfer process at a lower rate than the sample rate without degrading the audio quality. If we perform the transfers every \(N_0\) samples, the expression of the transfers (28) becomes:
As this step constitutes the heaviest computation in the algorithm as a whole, avoiding the need to calculate it at each time step can significantly enhance the efficiency of the code.
On the other hand, since energy transfers are highly regular and predictable (which is precisely why this type of model was chosen), one might wish to introduce randomness into the transfer processes to induce more chaotic variations. A simple and effective method involves introducing a random variable \(\chi (n)\) following a uniform distribution between 0 and \(2\pi\). This randomness can be incorporated to vary the phases of the positive contributions of transfers in the implementation of the filters (see Eq. (25)):
with \(\tilde{X}_i = \sqrt{e^{-2\alpha _i /f_s}+\frac{2T_{i-}(n)}{|z_i(n)|^2}} \cos (\omega _i/f_s)\) and \(\tilde{Y}_i = \sqrt{e^{-2\alpha _i /f_s}+\frac{2T_{i-}(n)}{|z_i(n)|^2}} \sin (\omega _i/f_s)\)
One can note that this method tends to dissipate more energy due to the introduction of a component with a randomized phase, which has the potential to diminish the amplitude of a filter if their phases are opposed. Also, the modulations created with this approach are affected by \(N_0\).
7 Examples
Nonlinear vibration leads to complex phenomena that can produce subtle and chaotic variations in radiated sound. We can reduce the complexity of the model and propose a heuristic that attempts to maintain the essential perceptual attributes of an object vibrating under nonlinear conditions. The resulting synthetic sound is nevertheless less realistic and versatile than sounds generated by the direct resolution of physical models (such as, e.g., the Föppl-von Kármán system) although the synthesis quality can be improved by using random processes in the implementation of the algorithms.
The coupled filter bank proposed here is dependent on various parameters: the number of filters N, the oscillation frequencies \(\omega _i\) and damping \(\alpha _i\) for each filter, the coefficients \(a_{ik}\) and the parameters \(\lambda\) and \(\eta\) for the definition of the redistribution matrix \(\textbf{M}\), and the thresholds \(\tau _i\). Strategies for setting these parameters are presented in two cases of musical interest. In the case of nonlinear plate vibration, energy is transferred to filters of nearby frequency in order to generate a gradual cascade of energy towards the high-frequency range. In the case of a string colliding with a rigid object, in contrast, there is simultaneous transfer of energy to many frequency components.
7.1 Energy cascade in thin plates
Consider a thin rectangular plate (according to the Kirchhoff model [24]), with mass density \(\rho\) in kg\(\cdot\)m\(^{-3}\), thickness H in m, flexural rigidity D in kg\(\cdot\)m\(^{2}\)\(\cdot\)s\(^{-2}\), and side lengths \(L_{x}\) and \(L_{y}\) in m. If the plate is simply supported on all its edges, the modal frequencies \(\omega _{lm}\) and modal shapes \(\phi _{lm}(x,y)\) can be expressed as follows [25]:
Here, \(\nu = L_{x}/L_{y}\) is the plate aspect ratio or the ratio between the length and width of the plate. The integer indices \(l, m \ge 1\) correspond to the number of vibration extrema (\(l-1\), \(m-1\) correspond to the number of vibration nodes) in the main directions of the rectangular plate (Cartesian coordinates (x, y)) with x and y being normalized by the length of the plate in the corresponding direction (so that \(0\le x,y \le 1\)).
For a point excitation force located at \((x_e,y_e)\), we can compute the modal forces using the mode shapes evaluated at the excitation point as \(\phi _{lm}(x_e,y_e)\). We define the source of the \((l,m)^\text {th}\) filter as follows:
where u(n) is the global excitation function.
We use a raised sinusoid for the excitation force (as proposed in [4, 26]) to simulate an impact:
For typical plate strikes, the strike duration \(N_{ex}/f_{s}\) in seconds is on the order of 1-4 ms.
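Such an excitation can be sketched in Python as a single period of a raised cosine. The exact expression, the unit peak amplitude, and the function name are assumptions made for the example.

```python
import numpy as np

def raised_cosine_pulse(duration_s, fs, amplitude=1.0):
    """One period of a raised cosine lasting about duration_s seconds."""
    n_ex = max(int(round(duration_s * fs)), 1)  # strike duration in samples
    n = np.arange(n_ex + 1)
    return 0.5 * amplitude * (1.0 - np.cos(2.0 * np.pi * n / n_ex))

# A 2 ms strike, within the 1-4 ms range quoted above
u = raised_cosine_pulse(0.002, 44100)
```

A shorter strike gives a wider excitation spectrum, which drives more high-frequency modes from the outset.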
The damping coefficients are chosen according to an exponential law, as proposed by Aramaki et al. [27], with parameters that are set to evoke a metallic object:
with \(\alpha _R=4\times 10^{-5}\) and \(\alpha _G=0.33220\). This set of parameters permits direct modal synthesis for linear plate vibration. To each pair of indices (l, m) we associate an index i (chosen in terms of increasing modal frequency) corresponding to the filter number used to generate the corresponding tonal component.
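The parameter set for the linear plate can be assembled as in the following Python sketch. It assumes the classical simply supported Kirchhoff plate frequencies \(\omega_{lm}=\pi^2\sqrt{D/(\rho H)}\,(l^2/L_x^2+m^2/L_y^2)\), mode shapes \(\sin(l\pi x)\sin(m\pi y)\) in normalized coordinates, and a damping law of the form \(\alpha(\omega)=e^{\alpha_G+\alpha_R\omega}\). The function names and the plate dimensions are hypothetical.

```python
import numpy as np

def plate_modes(Lx, Ly, D, rho, H, n_modes, max_index=30):
    """Lowest n_modes frequencies (rad/s) of a simply supported plate, sorted."""
    idx = np.arange(1, max_index + 1)
    l, m = np.meshgrid(idx, idx, indexing="ij")
    omega = np.pi**2 * np.sqrt(D / (rho * H)) * ((l / Lx) ** 2 + (m / Ly) ** 2)
    # Filter index i follows increasing modal frequency; max_index must be
    # large enough for the lowest n_modes to lie inside the searched grid
    order = np.argsort(omega, axis=None)[:n_modes]
    return l.ravel()[order], m.ravel()[order], omega.ravel()[order]

def damping(omega, alpha_R=4e-5, alpha_G=0.33220):
    """Assumed exponential damping law, increasing with frequency."""
    return np.exp(alpha_G + alpha_R * omega)

def modal_weights(l, m, xe, ye):
    """Mode shapes evaluated at the normalized excitation point (xe, ye)."""
    return np.sin(l * np.pi * xe) * np.sin(m * np.pi * ye)

# Hypothetical 0.6 m x 0.4 m steel plate, 1 mm thick
l, m, omega = plate_modes(Lx=0.6, Ly=0.4, D=18.3, rho=7850.0, H=1e-3, n_modes=50)
alpha = damping(omega)
weights = modal_weights(l, m, xe=0.3, ye=0.2)
```

The arrays `omega`, `alpha` and `weights` then give, for each filter, its angular frequency, its damping coefficient and the gain applied to the global excitation.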
In order to produce the cascade of energy towards higher frequencies, we carry out transfers between filters whose frequencies are close. Indeed, the energy supplied by the impact is localized at low frequencies and transfers directed towards the neighboring modes allow the progressive appearance of higher frequency components. The weighting coefficients \(a_{ik}\) can be set as follows:
with \(f_i=\frac{\omega _i}{2 \pi }\)
We set \(\eta =1\) (ensuring conservation of energy during the redistribution). The cascade can be mainly controlled by \(\lambda\), or by the definition of thresholds \(\tau _i\) (see Figs. 3 and 4).
In the case of wave turbulence in plates [28], couplings between modes can lead to rapid variations in amplitude and frequency leading to a chaotic regime. In the chaotic regime, the resulting signal is noisy and difficult to reproduce by a set of tonal components. One way to reproduce this phenomenon with the coupled filters presented in this paper is to randomize the phases of the positive contributions of the transfer term in the source, as proposed in Section 6. In this way, the tonal components are subject to rapid random amplitude modulations that can evoke the chaotic phenomenon occurring during wave turbulence in the plates (see Fig. 5).
7.2 Collisions in sound production
The perturbation of the vibrations of an object when colliding with an obstacle can lead to different types of sound events. In the typical case of a guitar, the player can choke the string, mute it, or play a natural harmonic. The string can also interact with the soundboard (slap, string buzz).
The model of an ideal vibrating string with simply supported boundary conditions gives the following modal frequencies and shapes:
where, here, the spatial coordinate x is normalized by the length of the string (\(0\le x \le 1\)). For a point excitation force located at \(x_e\), the source of the \(k^{th}\) filter can be defined as:
We use the same excitation force and damping model as previously (see Eqs. (33) and (34)).
The evocation of an obstacle disturbing the vibrations of the string requires the definition of thresholds that correspond to the location of the obstacle. We propose thresholds corresponding to the maximum amplitude of modal displacements without colliding with a virtual obstacle positioned at \(x_c,y_c\), where \(y_{c}\) is the vertical displacement of the obstacle relative to the string. The amplitude of a modal displacement that goes through the obstacle is \(y_c/|\sin {(k \pi x_c)}|\); thus, the threshold for the power is as follows:
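Since the power of a sinusoid is its squared amplitude divided by 2, the limiting amplitude \(y_c/|\sin {(k \pi x_c)}|\) corresponds to a power threshold of \(y_c^2/(2\sin^2(k\pi x_c))\). The Python sketch below computes these thresholds; the function name and the tolerance used to detect modes with a node at the obstacle are assumptions made for the example.

```python
import numpy as np

def collision_thresholds(n_modes, x_c, y_c, tol=1e-9):
    """Power thresholds for string modes k = 1..n_modes, obstacle at (x_c, y_c)."""
    k = np.arange(1, n_modes + 1)
    shape = np.abs(np.sin(k * np.pi * x_c))
    tau = np.full(n_modes, np.inf)  # a mode with a node at x_c never collides
    hit = shape > tol
    tau[hit] = 0.5 * (y_c / shape[hit]) ** 2
    return tau

# Obstacle at mid-string: even harmonics have a node there
tau = collision_thresholds(8, x_c=0.5, y_c=0.01)
```

For \(x_c=1/2\) the even harmonics receive an infinite threshold and therefore never give up energy, which matches the behavior of a natural harmonic played at the middle of the string.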
We define a redistribution matrix with all columns being identical in order to cause a simultaneous redistribution to a set of tonal components. The coefficients \(a_{ik}\) are defined as follows:
with \(\xi _i(f_i)\) a parameter depending on the frequency allowing weighting of the redistribution according to the filter frequency. We define \(\xi _i(f_i)\) as the Fourier transform of the raised cosine, an approximation of the force profile caused by a collision (as defined for the source, Eq. (33)):
with \(f_i\) the frequency of the \(i^{th}\) filter and \(\gamma\) a parameter corresponding to the duration of the raised cosine. This results in a cutoff frequency beyond which there is no more transfer (see Fig. 6). Various examples of sound outputs for different configurations are presented. We can observe that the transfer does not affect even harmonics (resp. multiples of 3 and 4) for \(x_c = 1/2\) (resp. \(x_c = 1/3\) and \(x_c = 1/4\)), which allows the reproduction of a natural harmonic played on a guitar (see Fig. 7). As the efficiency \(\eta\) decreases, the high-frequency components grow less and all the tonal components involved in the redistribution dissipate faster. As \(\gamma\) increases, there is also less energy distributed to the high-frequency components, but this energy is not dissipated and remains in the low-frequency components (see Fig. 8).
Collisions in musical instruments may be the source of more subtle phenomena than a simultaneous appearance of various frequency components. The cases of string buzz and tanpura can be approached by introducing random processes into the redistribution, as has been done for chaotic phenomena in plates (see Fig. 9).
It is possible to apply the same principle for the generation of sounds corresponding to collisions with 2D objects. For example, we can generate muted plate sounds (see Fig. 10).
8 Discussion
We conducted a comparative analysis of computational time in MATLAB (running on an AMD Ryzen 5 PRO 6650U - 2.90 GHz CPU) between uncoupled and coupled filter banks across various conditions (see Table 1).
For reference, the numerical resolution of the Föppl-von Kármán system typically consumes about one hour per second of sound without optimization/simplification, while the fastest simplified algorithm to date [13] still demands around 6 s of computation per second of sound on the same machine. Our model demonstrates real-time feasibility for handling 200 modes under any condition and potentially many more if the transfers are not calculated at each iteration (\(N_0>1\)). Despite a matrix featuring entirely non-zero elements, we achieve slightly superior efficiency in collision processing due to the uniformity of all columns, substantially simplifying the matrix calculation of \(\textbf{t}(n)\). Notably, the computation of couplings remains below 15% of the total computational time for \(N_0\ge 100\), thereby enabling real-time execution for over 1000 modes.
Regarding sound quality assessment, the provided sound samples offer valuable insights. While an absolute perceptual evaluation of the model proves challenging, various subjective assessments for specific case studies form the basis of supplementary investigations, one of which has been recently published [29]. The generated sounds and the findings of this study demonstrate promising outcomes, with sounds effectively evoking the modeled phenomena and exhibiting satisfactory reproduction quality.
Another important aspect of nonlinear phenomena, the frequency variation of modes, was not addressed in this paper. However, the employed filters have the advantage of being stable even when \(\omega\) varies with each sample. A relatively straightforward approach to integrating frequency variations of the tonal components consists of identifying them directly from recorded sounds or through physical modeling and integrating these characteristics into the filter design.
9 Conclusions
In this paper, we have presented a model for mode coupling and the design of coupled resonant filters geared towards the emulation of mode coupling effects in nonlinear vibrating structures. This filter bank allows efficient and real-time sound synthesis even for a large number of filters. The coupling, performed without modifying the phase, introduces predictable and controllable effects on the output signal. The terms controlling the coupling between the different filters are grouped in a matrix whose definition is the main challenge. The setting of the parameters of the sound synthesis process is presented through various examples corresponding to vibrating objects with a nonlinear behavior. A simple setting allows the generation of typical sounds, though sometimes with an unnatural character—the introduction of random processes in the energy redistribution aids a great deal with both realism and plausibility.
Further work will be concerned with determining which sound morphologies are important from a perceptual point of view for the recognition of sound events [30] corresponding to nonlinear phenomena in order to reproduce them with this coupled filter bank. This could lead to the development of environmental sound synthesizers and virtual musical instruments (e.g., tanpura, cymbal ...) or to non-linear audio effects (such as the nonlinear reverberation of a snare drum due to the wires held under tension against the lower drumskin). The filter bank presented in this paper can also be used as an abstract sound generation tool. In this context, the challenge would be to design intuitive control for use in a musical or sound design context.
Availability of data and materials
No associated data.
Abbreviations
- CPU: Central processing unit
- GPU: Graphics processing unit
- PDE: Partial differential equation
References
J.M. Adrien, in Representations of musical signals. The missing link: Modal synthesis (MIT Press, 55 Hayward St., Cambridge, MA, United States, 1991), pp. 269–298
J.D. Morrison, J.M. Adrien, Mosaic: A framework for modal synthesis. Comput. Music. J. 17(1), 45–56 (1993)
D. Rocchesso, The ball within the box: A sound-processing metaphor. Comput. Music. J. 19(4), 47–57 (1995)
K. Van Den Doel, P.G. Kry, D.K. Pai, in Proceedings of the 28th annual conference on Computer graphics and interactive techniques. FoleyAutomatic: Physically-based sound effects for interactive simulation and animation (ACM Press, 1601 Broadway, 10th Floor, New York, NY, United States, 2001), pp. 537–544
S. Conan, E. Thoret, M. Aramaki, O. Derrien, C. Gondre, S. Ystad, R. Kronland-Martinet, An intuitive synthesizer of continuous-interaction sounds: Rubbing, scratching, and rolling. Comput. Music. J. 38(4), 24–37 (2014)
K. Legge, N.H. Fletcher, Nonlinearity, chaos, and the sound of shallow gongs. J. Acoust. Soc. Am. 86(6), 2439–2443 (1989)
A. Chaigne, C. Touzé, O. Thomas, Nonlinear vibrations and chaos in gongs and cymbals. Acoust. Sci. Technol. 26(5), 403–409 (2005)
A. Föppl, Vorlesungen über technische Mechanik (Druck und Verlag von B.G. Teubner, Leipzig, 1907)
T. von Kármán, Festigkeitsprobleme im Maschinenbau. Encyklopädie Mathematischen Wiss. 4(4), 311–385 (1910)
S. Bilbao, A family of conservative finite difference schemes for the dynamical von Karman plate equations. Num. Meth. PDE 24(1), 193–216 (2008)
M. Ducceschi, C. Touzé, Modal approach for nonlinear vibrations of damped impacted plates: Application to sound synthesis of gongs and cymbals. J. Sound Vib. 344, 313–331 (2015)
M. Ducceschi, C. Touzé, in 18th International Conference on Digital Audio Effects (DAFx-15). Simulations of nonlinear plate dynamics: An accurate and efficient modal algorithm (proceedings of the DAFX conferences 2015)
S. Bilbao, Z. Wang, C. Webb, M. Ducceschi, in Proceedings of the 26th International Conference on Digital Audio Effects. Real-time gong synthesis (Copenhagen, 2023)
S. Bilbao, A. Torin, V. Chatziioannou, Numerical modeling of collisions in musical instruments. Acta Acustica U. Acustica 101(1), 155–173 (2015)
C. Issanchou, S. Bilbao, J.L. Le Carrou, C. Touzé, O. Doaré, A modal-based approach to the nonlinear vibration of strings against a unilateral obstacle: Simulations and experiments in the pointwise case. J. Sound Vib. 393, 229–251 (2017)
A. Farnell, Designing Sound (MIT Press, Cambridge Massachusetts, 2010)
T. Skare, J. Abel, in Proc. 22nd Int. Conf. Dig. Audio Effects. Real-time modal synthesis of crash cymbals with nonlinear approximations, using a GPU (proceedings of the DAFX conferences, 2019)
M. Mathews, J.O. Smith, in Proceedings of the Stockholm Musical Acoustics Conference (SMAC 2003) (Stockholm), Royal Swedish Academy of Music (August 2003). Methods for synthesizing very high Q parametrically well behaved two pole filters (2003)
S. Poirot, R. Kronland-Martinet, S. Bilbao, in 26th International Conference on Digital Audio Effects (DAFx23). A coupled resonant filter bank for the sound synthesis of nonlinear sources (proceedings of the DAFX conferences, 2023)
L. Pruvost, B. Scherrer, M. Aramaki, S. Ystad, R. Kronland-Martinet, in SIGGRAPH Asia 2015 Technical Briefs. Perception-based interactive sound synthesis of morphing solids’ interactions (2015), pp. 1–4
C. Gan, J. Schwartz, S. Alter, D. Mrowca, M. Schrimpf, J. Traer, J. De Freitas, J. Kubilius, A. Bhandwaldar, N. Haber et al., ThreeDWorld: A platform for interactive multi-modal physical simulation. (2021). arXiv preprint arXiv:2007.04954. https://arxiv.org/abs/2007.04954
S. Poirot, Sound examples. https://www.prism.cnrs.fr/publications-media/EURASIPPoirot/. Accessed 17 Dec 2023
Z.F. Fu, J. He, Modal analysis (Elsevier, Oxford, 2001)
K.F. Graff, Wave motion in elastic solids (Dover Publications, New York, 1991)
N.H. Fletcher, T.D. Rossing, The physics of musical instruments (Springer Science & Business Media, New York, 2012)
S. Bilbao, Numerical sound synthesis: Finite difference schemes and simulation in musical acoustics (Wiley, Chichester, 2009)
M. Aramaki, M. Besson, R. Kronland-Martinet, S. Ystad, Controlling the perceived material in an impact sound synthesizer. IEEE Trans. Audio Speech Lang. Process. 19(2), 301–314 (2010)
M. Ducceschi, C. Touzé, O. Thomas, S. Bilbao, Dynamics of the wave turbulence spectrum in vibrating plates: A numerical investigation using a conservative finite difference scheme. Phys. D 280, 73–85 (2014)
S. Poirot, S. Bilbao, M. Aramaki, S. Ystad, R. Kronland-Martinet, A perceptually evaluated signal model: Collisions between a vibrating object and an obstacle. IEEE/ACM Trans. Audio Speech Lang. Process. 31, 2338–2350 (2023)
R. Kronland-Martinet, S. Ystad, M. Aramaki, High-level control of sound synthesis for sonification processes. AI Soc. 27(2), 245–255 (2012)
Funding
Not applicable.
Author information
Contributions
SP: main contribution to the conception of the work and drafted the paper, SB: design of the work and substantial revision, RKM: project initiator, design and work revision.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix section : equivalence between the continuous model and the discrete-time algorithm
In this section, we denote by \(P_i\left[ n \right]\) the numerically computed value of \(P_i(t)\) at \(t=n/f_s\).
• Case without transfer for any mode: \(P_i(t)<\tau _i \ \forall i \in \{ 1,2,...,N \},\ t \in \mathbb {R}_+\)
Continuous model for power transfers Eq. (14):
In this case, the definition of a recurrence equation is straightforward:
• Case with transfers: \(P_i(t)\ge \tau _i\)
Continuous model for power transfers Eq. (14):
Using the forward Euler scheme to approximate \(\frac{dP_i(t)}{dt}\approx \left( P_i\left[ n +1\right] -P_i\left[ n \right] \right) f_s\) gives the following equation:
Using \(e^{x}=1+x+\mathcal {O}(x^2)\), we find the discrete-time model Eqs. (20) and (28):
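The role of the first-order expansion \(e^{x}=1+x+\mathcal {O}(x^2)\) in this argument can be illustrated numerically. The sketch below does not reproduce Eq. (14), (20), or (28). As a stand-in, it assumes a plain exponential power decay \(dP/dt=-\alpha P\) with a hypothetical decay rate `alpha`, and compares the forward Euler recurrence with the exactly sampled solution. All function and variable names are illustrative assumptions.

```python
import math


def euler_decay(p0, alpha, fs, n_steps):
    """Forward Euler recurrence: P[n+1] = P[n] + (dP/dt) / fs, with dP/dt = -alpha * P."""
    p = p0
    for _ in range(n_steps):
        p += (-alpha * p) / fs
    return p


def exact_decay(p0, alpha, fs, n_steps):
    """Exactly sampled solution: P[n] = P[0] * exp(-alpha * n / fs)."""
    return p0 * math.exp(-alpha * n_steps / fs)


fs = 44100.0   # sampling rate in Hz
alpha = 10.0   # hypothetical decay rate in 1/s (assumption, not taken from the paper)
n_steps = 44100  # one second of signal

p_euler = euler_decay(1.0, alpha, fs, n_steps)
p_exact = exact_decay(1.0, alpha, fs, n_steps)

# Per-sample factors: (1 - alpha/fs) versus exp(-alpha/fs); they differ at order (alpha/fs)^2.
step_gap = abs((1.0 - alpha / fs) - math.exp(-alpha / fs))
# Relative error accumulated over the whole second.
rel_err = abs(p_euler - p_exact) / p_exact

print(f"per-step gap: {step_gap:.3e}, relative error after 1 s: {rel_err:.3e}")
```

Under these assumptions, the two per-sample factors agree up to a term of order \((\alpha /f_s)^2\). The Euler recurrence and the exponential form are therefore interchangeable whenever the decay rate is small compared with the sampling rate.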
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Poirot, S., Bilbao, S. & Kronland-Martinet, R. A simplified and controllable model of mode coupling for addressing nonlinear phenomena in sound synthesis processes. J AUDIO SPEECH MUSIC PROC. 2024, 38 (2024). https://doi.org/10.1186/s13636-024-00358-2
DOI: https://doi.org/10.1186/s13636-024-00358-2