A simplified and controllable model of mode coupling for addressing nonlinear phenomena in sound synthesis processes

Abstract

This paper introduces a simplified and controllable model for mode coupling in the context of modal synthesis. The model employs efficient coupled filters for sound synthesis purposes, intended to emulate the generation of sounds radiated by sources under strongly nonlinear conditions. Such filters generate tonal components in an interdependent way and are intended to emulate realistic perceptually salient effects in musical instruments in an efficient manner. The control of energy transfer between the filters is realized through a coupling matrix. The generation of prototypical sounds corresponding to nonlinear sources with the filter bank is presented. In particular, examples are proposed to generate sounds corresponding to impacts on thin structures and to the perturbation of the vibration of an object when it collides with another object. The sound examples presented in the paper and available for listening on the accompanying site illustrate that a simple control of the input parameters allows the generation of sounds whose evocation is coherent and that the addition of random processes yields a significant improvement to the realism of the generated sounds.

1 Introduction

Modal synthesis operates according to the decomposition of the complex dynamic behavior of a vibrating object into contributions from modes, each oscillating independently at a single frequency. This approach, applicable to linear and time-invariant systems, is widely used and forms the basis for various physical modeling synthesis software packages [1, 2] and is closely related to sound synthesis methodologies employing filter banks [3,4,5].

For vibrating objects incorporating nonlinear effects, the modal interpretation must be generalized to include energy transfer between different modes and other effects such as, e.g., frequency shifting of modes over time. Such transfers may cause the delayed and sustained appearance of tonal components that cannot be generated by a linear model. This complex phenomenon, widely studied for the typical case of thin plates and shells [6, 7], can be modeled and solved under certain conditions. The numerical solution of the Föppl-von Kármán system [8, 9] that governs the underlying dynamics of nonlinear thin plates at moderate vibration amplitudes yields realistic and convincing sound synthesis [10] but at heavy computational cost. Ducceschi and Touzé [11] propose the modal resolution of the system with the offline calculation of coupling coefficients. They manage under certain approximations to significantly reduce the computation time without being able to achieve real-time sound synthesis (about 8 times real-time on a CPU) [12]. As of 2023, real-time performance is available for limited plate sizes [13]. Another typical case of coupling between modes induced by nonlinear phenomena results from collisions in musical instruments [14] and has been the subject of various studies, including on modal interactions [15]. Computational cost for synthesis can also be heavy in such cases.

For synthesis purposes, and particularly if real-time performance is the ultimate aim, it can be useful to depart from strict physical models and examine modal interactions from a perceptual point of view, closer in spirit to so-called “procedural audio” approaches [16]. Skare and Abel [17] perform real-time modal synthesis of crash cymbals with a GPU-accelerated modal filterbank. Their method consists in identifying the modal parameters (including a rough approximation of the couplings) from recorded sounds, although the energy transfer mechanism is unspecified.

In this paper, we propose a simple model for energy transfers between modes. Then, we design coupled filters based on the design proposed by Mathews and Smith [18] and adapted by Skare and Abel [17] to incorporate energy transfer. This study is a direct extension of previously conducted work [19], incorporating additional effort to ensure that the design of coupled filters is more coherent with the underlying physical system. In particular, we propose an equivalence between the power of the signal of the filters and the energy of a vibration mode from an equivalent physical system to ensure energy conservation during transfers. Inter-modal energy transfer is encoded in a matrix containing all the coupling coefficients. The aim of this paper is not to propose a synthesis model performing an accurate simulation of a physical system. Instead, we seek to develop a framework allowing direct modeling of sounds targeted to the way they are perceived. This results in an efficient way to generate sounds evoking nonlinear sources and can yield real-time event-driven synthesis of sounds in virtual or augmented reality environments, a particularly active field of research [20, 21].

Some background on modal synthesis is given in Section 2, and the energy transfer model is presented in Section 3. Then, the design of the coupled filters is detailed in Section 4, the definition of the matrix containing the coupling terms is proposed in Section 5, and methods to enhance the efficiency and randomize the process are presented in Section 6. Various example systems used to generate prototypical sounds are presented in Section 7. Sound examples are available online [22].

2 Modal synthesis for the linear case

The modal resolution of a linear partial differential equation (PDE) system describing the vibrations of a resonant object is well-described in various texts [23]. Solutions are of the following form for the displacement w depending on a spatial coordinate \(\textbf{r}\) and time t:

$$\begin{aligned} w(\textbf{r},t)=\underbrace{w_h(\textbf{r},t)}_{\text {homogeneous solution}} + \underbrace{w_p(\textbf{r},t)}_{\text {particular solution}}, \end{aligned}$$
(1)

where

$$\begin{aligned} w_h(\textbf{r},t) & =\sum \limits _{i=1}^{\infty } e^{-\alpha _i t}\left[ A_i \cos (\omega _{d_i} t + \varphi _i)\right] \phi _i(\textbf{r})\end{aligned}$$
(2a)
$$\begin{aligned} w_p(\textbf{r},t) & = \sum \limits _{i=1}^{\infty } \left( g_i(t) * h_i(t) \right) \phi _i(\textbf{r}), \end{aligned}$$
(2b)

Here, * represents a convolution operation, and the impulse response \(h_i(t)\) is of the following form:

$$\begin{aligned} h_i(t)=\frac{1}{\omega _{d_i}} e^{-\alpha _i t} \sin (\omega _{d_i}t) \end{aligned}$$
(3)

the function \(\phi _i(\textbf{r})\) is the \(i^\text {th}\) mode’s shape or basis function, and \(\omega _{d_i}\) and \(\alpha _i\) are the angular frequency and the damping coefficient of the \(i^\text {th}\) mode, respectively. One can note that the angular frequency differs from the angular natural frequency \(\omega _i\):

$$\begin{aligned} \omega _{d_i}^2 = \omega _i^2 - \alpha _i^2. \end{aligned}$$
(4)

The constants \(A_i\) and \(\varphi _i\) derive from the initial conditions, and \(g_i(t)\) is the modal excitation (formally derived from a PDE system by the projection of an excitation source term \(g(\textbf{r},t)\) onto the modal basis functions \(\phi _i(\textbf{r})\)).

3 Inter-modal energy transfer

3.1 Definitions and approximations

For a linear system, the mechanical energy of the \(i^\text {th}\) mode \(E_m^i(t)\) can be calculated by adding its potential energy \(E_p^i(t)\), its kinetic energy \(E_i^i(t)\), and the accumulated energy \(E_s^i(t)\) supplied by the source up to time t:

$$\begin{aligned} E_p^i(t)= & \frac{1}{2} K_i {q_i}(t)^2\end{aligned}$$
(5a)
$$\begin{aligned} E_i^i(t)= & \frac{1}{2} M_i \left( \frac{dq_i(t)}{dt}\right) ^2\end{aligned}$$
(5b)
$$\begin{aligned} E_s^i(t)= & \int _0^t g_i(\Xi )q'_i(\Xi )d\Xi \end{aligned}$$
(5c)
$$\begin{aligned} E_m^i(t)= & E_p^i(t) + E_i^i(t) + E_s^i(t), \end{aligned}$$
(5d)

Here, \(K_i\) is the modal stiffness, \(M_i\) is the modal mass and \(q_i(t)\) is the modal displacement, defined as follows:

$$\begin{aligned} M_i & = \int _{\textbf{r} \in \Omega } m(\textbf{r}) \phi _i^2(\textbf{r}) d\textbf{r}\end{aligned}$$
(6a)
$$\begin{aligned} K_i & = \omega _i^2 M_i\end{aligned}$$
(6b)
$$\begin{aligned} q_i(t) & = e^{-\alpha _i t}\left[ A_i \cos (\omega _{d_i} t + \varphi _i)\right] + g_i(t) * h_i(t). \end{aligned}$$
(6c)

The function \(m(\textbf{r})\) is the density, and \(\Omega\) is the closed space containing the vibrating object.

A simple approximation to the mechanical energy follows from the assumption that it is proportional to the square of the modal displacement amplitude (see Fig. 1). Thus, we can approximate the mechanical energy of the \(i^\text {th}\) mode by computing the power of the modal displacement signal denoted as \(P_i(t)\) (where the power of a sinusoidal signal is equal to its squared amplitude divided by 2):

$$\begin{aligned} P_i(t)=(e^{-\alpha _i t}A_i)^2/2 \end{aligned}$$
(7)

with \(A_i\) the initial amplitude of the tonal component.

Fig. 1
Displacement and energies of a mode with modal mass \(M_i=1\), modal stiffness \(K_i=1\), for a damping coefficient \(\alpha _i=0.1\), an initial amplitude \(A_i=1\), and an initial phase \(\varphi _i=0\). This example is without the source (\(g_i(t) = 0\)). We can see that the power of the signal, proportional to the square of the modal displacement amplitude, is a rough approximation of the mechanical energy \(E_m^i(t)\)

It is important to note that the power referred to here is not mechanical power expressed in watts but rather signal power (this will be useful for the design of the coupled filters).
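The quality of this approximation can be checked numerically. The following minimal sketch, which is an illustration and not part of the original study, evaluates the mechanical energy of Eqs. (5a) and (5b) and the signal power of Eq. (7) for the parameters of Fig. 1. The function name and the time grid are assumptions.

```python
import math

def mode_energy_and_power(t, mass=1.0, stiffness=1.0, alpha=0.1, amp=1.0, phase=0.0):
    """Free decay of one mode: return (mechanical energy, signal power P_i)."""
    omega = math.sqrt(stiffness / mass)         # natural angular frequency, Eq. (6b)
    omega_d = math.sqrt(omega**2 - alpha**2)    # damped angular frequency, Eq. (4)
    env = amp * math.exp(-alpha * t)            # amplitude envelope
    theta = omega_d * t + phase
    q = env * math.cos(theta)                   # Eq. (6c) without source
    dq = -env * (alpha * math.cos(theta) + omega_d * math.sin(theta))  # dq/dt
    energy = 0.5 * stiffness * q**2 + 0.5 * mass * dq**2  # Eqs. (5a) + (5b)
    power = 0.5 * env**2                        # Eq. (7)
    return energy, power

# Ratio E_m / P over the first 20 s (here M = K = 1, so M * omega^2 = 1).
ratios = []
for k in range(2001):
    e, p = mode_energy_and_power(0.01 * k)
    ratios.append(e / p)
```

For these parameters the ratio oscillates around 1 and stays within about 10 % of it, which is consistent with the "rough approximation" shown in Fig. 1.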

Additionally, assuming identical modal masses for all modes (this follows from a uniform density and modal orthogonality) allows us to establish that the mechanical energy of a given mode is also proportional to the square of the mode’s angular frequency. Indeed, the potential energy is proportional to the square of the angular frequency and the squared modal displacement:

$$\begin{aligned} E_p^i(t) = \frac{1}{2} K_i q_i^2(t) \propto \omega _i^2 M_i q_i^2(t) \end{aligned}$$
(8)

Moreover, considering that mechanical energy constitutes the sum of potential and kinetic energy, and that kinetic energy is zero when the potential energy reaches its maximum, it follows that the mechanical energy of a mode is directly proportional to its squared angular frequency and the square of the amplitude of the modal displacement (proportionate to the signal power):

$$\begin{aligned} E_m^i(t)\propto \omega _i^2 P_i(t) \end{aligned}$$
(9)

To establish a simple and controllable model, we neglect the influence of the phase on energy transfers. We introduce a term \(\Pi _T^i(t)\) to induce transfers of energy between distinct modes. In the absence of an external source, the energy of a mode is expressed as its initial value, decreased over time by the cumulative losses and modified by the energy exchanged with other modes:

$$\begin{aligned} E_m^i(t) = \underbrace{E_0^i}_{\text {initial energy}} - \underbrace{\int _0^t 2 \alpha _i E_m^i(\Xi ) d\Xi }_{\text {losses}} + \underbrace{\int _0^t \Pi _T^i(\Xi ) d\Xi }_{\text {energy transfers}}. \end{aligned}$$
(10)

One can note that \(E_{m}^i\) appears in the loss term because we assumed an exponential decay for the modes, as is commonly done in modal models.

3.2 Energy transfer model

The challenge is to arrive at a model simple enough to be controllable (i.e., to be able to predict the sound outcome of a manipulation of the parameters) and complete enough to allow the matching of modal trajectories to a range of nonlinear phenomena. We define the transfer term as follows:

$$\begin{aligned} \Pi _T^i(t) = \underbrace{- \lambda \left[ E_m^i(t) - \tau _i \right] _+}_{\text {energy transferred to other modes}} + \underbrace{\sum \limits _k \lambda c_{ik} \left[ E_m^k(t) - \tau _k \right] _+}_{\text {energy received by mode i from other modes}}, \end{aligned}$$
(11)

Here, \([\cdot ]_{+}\) indicates the “positive part of”, i.e., \([\zeta ]_{+} = \frac{1}{2}(\zeta + |\zeta |)\), \(\tau _i\) is the threshold beyond which the energy of mode i is transferred to other modes (\(\tau _i\ge 0\)), \(\lambda\) is the redistribution rate (\(\lambda \ge 0\)), and \(c_{ik}\) is a positive coupling coefficient (\(c_{ik}\ge 0\) and \(\sum \nolimits _i c_{ik}\le 1\)).

Thus, the transfer terms are proportional to the excess energy above a threshold and the terms \(c_{ik}\) define the proportions distributed and received by each other component. Note that this relation is not an immediate consequence of a physical model but is a heuristic means of capturing salient phenomena in a physical system. Our focus is on the design of a synthesis process with a predictable sound outcome rather than on the simulation of a physical system. Nevertheless, our model remains physically informed and consistent with the conservation of energy in the associated mechanical system.

This transfer process is nonlinear due to the introduction of energy transfer between modes, a characteristic nonlinear phenomenon in physical systems. Moreover, the emergence of couplings itself is nonlinear owing to the incorporation of a threshold effect (i.e., there is no coupling below \(\tau _i\)). This threshold effect is easily justified from a physical standpoint when considering collision phenomena (i.e., interaction occurring above a certain threshold corresponding to the contact between two objects). However, it is less aligned with reality concerning geometric nonlinearities. In such cases, it might be conceivable to introduce other nonlinear transfers (e.g., \(\propto \left( E_m^i(t)\right) ^\beta\) with \(\beta \ne 1\)), but this type of transfer would be challenging to control.

We can define the following differential equation that governs the energy variations of the modes, excluding the effect of the source:

$$\begin{aligned} \frac{dE_m^i(t)}{dt} = -2\alpha _i E_m^i(t) \underbrace{- \lambda \left[ E_m^i(t) - \tau _i \right] _+ + \sum _k \lambda c_{ik} \left[ E_m^k(t) - \tau _k \right] _+}_{\Pi _T^i(t)}. \end{aligned}$$
(12)

This equation can be analytically solved by considering initial conditions \(E_m^i(0)\) and distinguishing between cases where the energy of each mode is either above or below the corresponding threshold \(\tau _i\). For example, if \(E_m^i(0)>\tau _i\) and \(E_m^k(t)<\tau _k\ \forall k\ne i\) the solution takes the following form (see Fig. 2):

$$\begin{aligned} E_m^i(t) = \left\{ \begin{array}{ll} \left( E_m^i(0) - \frac{\lambda \tau _i}{\lambda + 2 \alpha _i} \right) e^{ -\left( \lambda + 2\alpha _i \right) t} + \frac{\lambda \tau _i}{\lambda + 2 \alpha _i} & \ \text {if} \ t<t_{0} \ \ (\text {i.e.}\ E_m^i(t)>\tau _i)\\ \frac{\tau _i}{e^{-2\alpha _i t_0}} e^{-2\alpha _i t} & \ \text {else} \end{array} \right. \end{aligned}$$
(13)

with \(t_{0}=-\frac{1}{\lambda +2\alpha _i} \left[ \ln {\left( \tau _i \left( 1- \frac{\lambda }{\lambda +2 \alpha _i}\right) \right) - \ln {\left( E_m^i(0) - \frac{\lambda \tau _i}{\lambda +2 \alpha _i}\right) }} \right]\).
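As a sanity check, the closed-form solution of Eq. (13) can be compared with a direct numerical integration of Eq. (12) for a single isolated mode. The sketch below is only illustrative: the function names and the forward-Euler scheme are assumptions, and it requires \(\tau _i>0\) and \(\alpha _i>0\) so that the logarithms are defined.

```python
import math

def transfer_stop_time(e0, tau, lam, alpha):
    """Time t_0 at which the energy reaches the threshold tau."""
    rate = lam + 2.0 * alpha
    floor = lam * tau / rate                  # constant term of Eq. (13)
    return -(math.log(tau - floor) - math.log(e0 - floor)) / rate

def energy_closed_form(t, e0, tau, lam, alpha):
    """Closed-form energy of Eq. (13) for one mode starting above its threshold."""
    rate = lam + 2.0 * alpha
    floor = lam * tau / rate
    t0 = transfer_stop_time(e0, tau, lam, alpha)
    if t < t0:
        return (e0 - floor) * math.exp(-rate * t) + floor
    return tau * math.exp(-2.0 * alpha * (t - t0))

def energy_euler(t_end, e0, tau, lam, alpha, dt=1e-5):
    """Forward-Euler integration of Eq. (12) with no incoming transfers."""
    e = e0
    for _ in range(int(round(t_end / dt))):
        e += dt * (-2.0 * alpha * e - lam * max(e - tau, 0.0))
    return e

# Parameters of Fig. 2: tau = 0.3 E(0), lambda = 1, alpha = 1.
t0 = transfer_stop_time(1.0, 0.3, 1.0, 1.0)
e_exact = energy_closed_form(1.0, 1.0, 0.3, 1.0, 1.0)
e_num = energy_euler(1.0, 1.0, 0.3, 1.0, 1.0)
```

With the parameters of Fig. 2, \(t_0=\ln (4.5)/3\), and the two evaluations at \(t=1\) agree up to the discretization error of the Euler scheme.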

Fig. 2
Evolution of mechanical energy (in blue) for an initial value \(E_m^i(0)\) exceeding the transfer threshold \(\tau _i = 0.3 E_m^i(0)\) and in the absence of source or transfer contributions from other modes for \(\lambda = 1\), \(\alpha _i = 1\) (see Eq. (13)). The energy lost due to dissipation is shown in red, and the energy to be transferred to other modes is presented in yellow. One can note that the cumulative energy transferred remains constant for \(t\ge t_0\) (\(t\ge t_0 \Rightarrow E_m^i\le \tau _i\))

We can rewrite Eq. (12) in terms of signal power (see Eq. (9)), which will be useful for the implementation of the coupled filters (Section 4):

$$\begin{aligned} \frac{dP_i(t)}{dt} = -2\alpha _i P_i(t) \underbrace{- \lambda \left[ P_i(t) - \tau _i \right] _+ + \sum _k \lambda c_{ik} \frac{\omega _k^2}{\omega _i^2} \left[ P_{k}(t) - \tau _k \right] _+}_{T_i(t)}. \end{aligned}$$
(14)

We define here \(T_i(t)\) as the power-related transfer terms (equivalent to \(\Pi _T^i(t)\) for energy calculation).

In the next section, we present sufficient conditions on energy transfers to ensure the stability of the system.

3.3 Energy and stability

A sufficient condition for the stability of the system which is consistent with the physics (energy conservation) is to impose a non-positivity constraint for the sum of energy transfers:

$$\begin{aligned} \sum _i \Pi _T^i(t)\le 0 \end{aligned}$$
(15)

This condition impedes the creation of energy during transfer between modes. It is ensured by the following condition on the coupling coefficients:

$$\begin{aligned} \sum _i c_{ik}\le 1 \ \ \forall \ k \end{aligned}$$
(16)
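The link between conditions (15) and (16) can be illustrated numerically. In the following sketch (hypothetical function name and example values), the transfer terms of Eq. (11) are evaluated for three modes whose coupling coefficients satisfy condition (16), and their sum is checked to be non-positive.

```python
def energy_transfers(energies, thresholds, coupling, lam):
    """Transfer terms Pi_T^i of Eq. (11); coupling[i][k] holds c_ik."""
    excess = [max(e - tau, 0.0) for e, tau in zip(energies, thresholds)]
    n = len(energies)
    return [
        -lam * excess[i] + sum(lam * coupling[i][k] * excess[k] for k in range(n))
        for i in range(n)
    ]

# Example values (assumed): the columns of c sum to 1, 1 and 0.5.
c = [[0.0, 0.5, 0.2],
     [0.6, 0.0, 0.3],
     [0.4, 0.5, 0.0]]
pi_t = energy_transfers([1.0, 0.2, 0.8], [0.1, 0.1, 0.1], c, lam=2.0)
total = sum(pi_t)   # equals lam * sum_k (sum_i c_ik - 1) * [E_k - tau_k]_+
```

Only the third column sums to less than one, so the total is negative by exactly the share of the third mode's excess energy that is not redistributed.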

4 Design of the coupled filters

4.1 Linear filtering for modal synthesis

In the linear case, a straightforward approach to numerical solution at a sample rate \(f_{s}\) in Hz is to use recursive filters with an exponentially-damped sinusoidal impulse response. The filter proposed by Mathews and Smith [18] has this property. The implementation of this filter consists in calculating, for each time step n, the imaginary part of a complex number z(n) whose rotation speed in the complex plane is constant and corresponds to the angular frequency \(\omega\) of the exponentially damped sinusoid:

$$\begin{aligned} y(n) = \text {Im}(z(n))\qquad \text {where}\qquad z(n+1)=Z z(n) + u(n) \end{aligned}$$
(17)

with u(n) the source of the filter, and Z the constant modification of the phase and modulus per time step:

$$\begin{aligned} Z = e^{-\alpha /f_s}e^{j\omega /f_s} = X + j Y \end{aligned}$$
(18)

with \(X = e^{-\alpha /f_s}\cos (\omega /f_s)\) and \(Y = e^{-\alpha /f_s}\sin (\omega /f_s)\).

The recurrence equation on the complex sequence z(n) is computed by the following system including a recurrence equation for the real part \(x(n) = \text {Re}(z(n))\) and a recurrence equation for the imaginary part \(y(n) = \text {Im}(z(n))\), which is the output of the filter:

$$\begin{aligned} x(n+1) & = \text {Re}(z(n+1)) = X x(n) - Y y(n) + u(n)\nonumber \\ y(n+1) & = \text {Im}(z(n+1)) = Y x(n) + X y(n) \end{aligned}$$
(19)

for a real source \(u(n)\in \mathbb {R}\).
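A minimal sketch of the recursion of Eq. (19) is given below for illustration. The function name and the zero initial state are assumptions.

```python
import math

def modal_filter(u, omega, alpha, fs):
    """Run the recursion of Eq. (19) on the input samples u and return y(n)."""
    decay = math.exp(-alpha / fs)
    X = decay * math.cos(omega / fs)
    Y = decay * math.sin(omega / fs)
    x, y = 0.0, 0.0                      # assumed zero initial state, z(0) = 0
    out = []
    for sample in u:
        out.append(y)                    # y(n) = Im(z(n))
        # Both right-hand sides use the old (x, y) before assignment.
        x, y = X * x - Y * y + sample, Y * x + X * y
    return out

# Impulse response of a 440 Hz mode at an assumed sample rate of 8 kHz.
fs = 8000.0
omega = 2.0 * math.pi * 440.0
alpha = 5.0
y = modal_filter([1.0] + [0.0] * 9, omega, alpha, fs)
```

With a unit impulse at \(n=0\), the state becomes \(z(1)=1\) and the output for \(n\ge 1\) is \(e^{-\alpha (n-1)/f_s}\sin (\omega (n-1)/f_s)\), i.e., an exponentially damped sinusoid.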

4.2 Principle and implementation of the coupling

Consider N filters defined as in Section 4.1 in parallel to be coupled through the methodology outlined in Section 3.2. We note \(z_i(n)\) the complex sequence corresponding to the \(i^{\text {th}}\) filter, with \(x_i(n)\) its real part and \(y_i(n)\) its imaginary part (corresponding to the output signal of the filter). The source for the \(i^{\text {th}}\) filter, corresponding to the projection of the source of the filter bank u(n) onto the \(i^{\text {th}}\) modal basis function, is noted \(u_i(n)\).

The continuous Eq. (14) can be solved in discrete time using the following recurrence relation (see the Appendix):

$$\begin{aligned} P_i(n+1)=P_i(n)\underbrace{e^{-2\alpha _i /f_s}}_{\text {losses}}+\underbrace{T_i(n)}_{\text {transfer}} \end{aligned}$$
(20)

with \(P_i(n)\) the power of the tonal component defined as follows:

$$\begin{aligned} P_i(n)=\frac{|z_i(n)|^2}{2}=\frac{1}{2}(x_i(n)^2+y_i(n)^2) \end{aligned}$$
(21)

\(x_i(n), y_i(n)\in \mathbb {R}\).

Thus, we can express the variation of the modulus of \(z_i(n)\) due to energy transfer between two time steps:

$$\begin{aligned} |z_i(n+1)|=\sqrt{|z_i(n)|^2 e^{-2\alpha _i /f_s}+2T_i(n)} \end{aligned}$$
(22)

We can define an amplitude ratio between the moduli at two consecutive time steps if \(|z_i(n)|\ne 0\):

$$\begin{aligned} \frac{|z_i(n+1)|}{|z_i(n)|}=\sqrt{e^{-2\alpha _i /f_s}+\frac{2T_i(n)}{|z_i(n)|^2}} \end{aligned}$$
(23)

Thus, we can modify the recurrence equation defined in the previous section (see Eq. (17)) by incorporating the modulus variations due to energy transfers. It gives the following recurrence relation for \(z_i\), including the source and phase variations:

$$\begin{aligned} z_i(n+1) = \left\{ \begin{array}{ll} \sqrt{2T_i(n)} + u_i(n) & \ \text {if} \ z_i(n)=0\\ \sqrt{e^{-2\alpha _i /f_s}+\frac{2T_i(n)}{|z_i(n)|^2}} \ e^{j\omega _i/f_s} z_i(n)+ u_i(n) & \ \text {else} \\ \end{array} \right. \end{aligned}$$
(24)

Finally, we can write the system of equations for the implementation of the filters:

$$\begin{aligned} x_i(n+1) & = \left\{ \begin{array}{ll} \sqrt{2T_i(n)} + u_i(n) & \ \text {if} \ z_i(n)=0\\ X_i x_i(n) - Y_i y_i(n) + u_i(n) & \ \text {else} \end{array} \right. \nonumber \\ y_i(n+1) & = \left\{ \begin{array}{ll} 0 & \ \text {if} \ z_i(n)=0\\ Y_i x_i(n) + X_i y_i(n) \ \quad \qquad & \ \text {else} \end{array} \right. \end{aligned}$$
(25)

with \(X_i = \sqrt{e^{-2\alpha _i /f_s}+\frac{2T_i(n)}{|z_i(n)|^2}} \cos (\omega _i/f_s)\) and \(Y_i = \sqrt{e^{-2\alpha _i /f_s}+\frac{2T_i(n)}{|z_i(n)|^2}} \sin (\omega _i/f_s)\)

In this way, power can be transferred among the different filters without affecting the phases. The coupling intervenes in the calculation of the transfer terms \(T_i(n)\) which ultimately involve the other filters.
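One update of Eq. (25) for a single filter can be sketched as follows, taking the transfer term \(T_i(n)\) as given. The function name is hypothetical, and the clamping of the squared amplitude ratio at zero is a safeguard added here as an assumption, not something specified in the text.

```python
import math

def coupled_filter_step(x, y, t, u, omega, alpha, fs):
    """One update of Eq. (25) for one filter; returns (x(n+1), y(n+1))."""
    mod2 = x * x + y * y                  # |z_i(n)|^2
    if mod2 == 0.0:
        # Silent filter: the received power sets the new modulus (zero phase).
        return math.sqrt(2.0 * max(t, 0.0)) + u, 0.0
    # Squared amplitude ratio of Eq. (23), clamped at 0 (assumed safeguard).
    gain = math.sqrt(max(math.exp(-2.0 * alpha / fs) + 2.0 * t / mod2, 0.0))
    X = gain * math.cos(omega / fs)
    Y = gain * math.sin(omega / fs)
    return X * x - Y * y + u, Y * x + X * y

# Example values (assumed): a 440 Hz filter receiving T = 0.05 with no source.
fs, omega, alpha = 44100.0, 2.0 * math.pi * 440.0, 3.0
x1, y1 = coupled_filter_step(0.3, 0.4, 0.05, 0.0, omega, alpha, fs)
p_next = 0.5 * (x1 * x1 + y1 * y1)
```

Without a source, the power after the update satisfies Eq. (20), i.e., `p_next` equals \(P_i(n)e^{-2\alpha _i/f_s}+T_i(n)\), while the phase advances by \(\omega _i/f_s\) as in the linear filter.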

5 Distribution matrix

This section presents a formalism for the calculation and control of the coupling between filters.

Now, define the column vectors \(\textbf{p}(n) = [P_{1}(n),\ldots ,P_{N}(n)]^{T}\) and \(\textbf{t}(n) = [T_{1}(n),\ldots ,T_{N}(n)]^{T}\). The power transfers between the tonal components \(\textbf{t}(n)\) at a given time step n are defined as:

$$\begin{aligned} \textbf{t}(n)=\textbf{M} \left[ \textbf{p}(n) - \varvec{\tau } \right] _+. \end{aligned}$$
(26)

An \(N\times 1\) column vector \(\varvec{\tau }\) containing the thresholds \(\tau _i\), \(i=1,\ldots ,N\) at which transfers are activated for each tonal component has also been introduced here.

Thus, the calculation of the transfer terms is performed by the matrix product of an \(N\times N\) redistribution matrix \(\textbf{M}\) with the vector resulting from the positive part of the difference between the power of each frequency component \(\textbf{p}(n)\) and the associated threshold \(\varvec{\tau }\). In other words, the transfer terms \(T_i(n)\) are proportional to the excess power above the corresponding threshold and the terms of the matrix \(\textbf{M}\) define the proportions distributed to and received by each component.

The diagonal entries \(M_{ii}\) of the matrix \(\textbf{M}\) define the proportion of power of the \(i^\text {th}\) mode that will be redistributed to other modes, and the other terms of the row, \(M_{ik}\) with \(k\ne i\), define the quantity that the \(i^\text {th}\) mode will receive from the \(k^\text {th}\) mode.

An efficient way to define the coefficients of the matrix is to use the following expression:

$$\begin{aligned} M_{ik}= \frac{\lambda }{f_s} \underbrace{\left( \eta \frac{a_{ik}}{\sum _{i=1}^N a_{ik}}\right) }_{c_{ik}} \frac{\omega _k^2}{\omega _i^2} - \frac{\lambda }{f_s} \delta _{ik} \end{aligned}$$
(27)

where \(a_{ik}\) is a coefficient weighting the redistribution from the \(k^\text {th}\) mode to the \(i^\text {th}\) mode (\(a_{ik}\ge 0\)), and \(\delta _{ik}\) is Kronecker’s \(\delta\) (\(\delta _{ii}=1\) and \(\delta _{ik}=0\) if \(i\ne k\)). While enforcing \(a_{ii}=0\) is not imperative, defining non-zero coefficients on the diagonal bears little significance, as it merely redistributes a mode onto itself, resulting in negligible effects.

In this formulation, the stability of the filter bank is ensured for arbitrary \(a_{ik}\), provided that at least one value per column is non-zero, that \(\lambda \le f_s\) and that \(0 \le \eta \le 1\). The parameter \(\eta\) corresponds to the efficiency of the transfers (\(\sum _{i=1}^N c_{ik}= \eta\)). If \(\eta =1\), there is no loss during the transfer process. If \(\eta =0\), all the transferred energy is lost. The ratio \(\lambda /f_s\) is the proportion of power above the threshold transferred to other modes at each time step (\(0 \le \lambda \le f_s\)). The values of the off-diagonal elements \(M_{ik}\) of the matrix \(\textbf{M}\) are the proportion of energy transferred by the mode k that will be received by the mode i.

The \(i^\text {th}\) transfer term \(T_i(n)\) can be expressed as follows:

$$\begin{aligned} T_i(n) = \underbrace{\eta \frac{\lambda }{f_s} \sum _{k=1}^{N}\left[ \frac{a_{ik}}{\sum _{i=1}^N a_{ik}} \frac{\omega _k^2}{\omega _i^2} \left[ P_k(n)-\tau _k\right] _+\right] }_{\text {positive contribution }T_{i+}(n)} - \underbrace{\frac{\lambda }{f_s} \left[ P_i(n)-\tau _i\right] _+}_{\text {negative contribution }T_{i-}(n)} \end{aligned}$$
(28)

Here, we can differentiate between the positive contributions \(T_{i+}(n)\), which represent energy moving to the \(i^\text {th}\) mode from the other modes, and the negative contribution \(T_{i-}(n)\), which denotes energy leaving the \(i^\text {th}\) mode to other modes.
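The construction of \(\textbf{M}\) from Eq. (27) and the evaluation of Eq. (26) can be sketched as follows. Function names and example values are assumptions; plain nested lists are used to keep the example dependency-free.

```python
def distribution_matrix(a, omegas, lam, eta, fs):
    """Build M of Eq. (27); a[i][k] weights the transfer from mode k to mode i."""
    n = len(omegas)
    col_sums = [sum(a[i][k] for i in range(n)) for k in range(n)]
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            c_ik = eta * a[i][k] / col_sums[k]   # requires a non-zero column sum
            M[i][k] = (lam / fs) * c_ik * (omegas[k] / omegas[i]) ** 2
            if i == k:
                M[i][k] -= lam / fs              # Kronecker delta term
    return M

def transfer_terms(M, p, tau):
    """t(n) = M [p(n) - tau]_+ of Eq. (26)."""
    excess = [max(pk - tk, 0.0) for pk, tk in zip(p, tau)]
    return [sum(row[k] * excess[k] for k in range(len(p))) for row in M]

# Example values (assumed): three modes, all-to-all coupling, eta = 1.
a = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
omegas = [100.0, 200.0, 300.0]
M = distribution_matrix(a, omegas, lam=10.0, eta=1.0, fs=1000.0)
t = transfer_terms(M, [1.0, 0.5, 0.2], [0.1, 0.1, 0.1])
balance = sum(w * w * ti for w, ti in zip(omegas, t))   # sum_i omega_i^2 T_i
```

Since the energy of a mode is proportional to \(\omega _i^2 P_i\) (Eq. (9)), the weighted sum `balance` measures the energy created or lost by the transfers. It vanishes for \(\eta =1\) and is negative for \(\eta <1\).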

6 Efficiency and randomization of the process

It is possible to perform the energy transfer process at a lower rate than the sample rate without degrading audio quality. If we perform the transfers every \(N_0\) samples, the expression of the transfers (28) becomes:

$$\begin{aligned} T_i(n) = \left\{ \begin{array}{ll} \frac{\lambda N_0}{f_s} \left( \eta \sum _{k=1}^{N}\left[ \frac{a_{ik}}{\sum _{i=1}^N a_{ik}} \frac{\omega _k^2}{\omega _i^2} \left[ P_k(n)-\tau _k\right] _+\right] - \left[ P_i(n)-\tau _i\right] _+\right) & \ \text {if} \ n \ \text {mod}\ N_0 = 0\\ 0 & \ \text {else} \end{array} \right. \end{aligned}$$
(29)

As this step constitutes the heaviest computation in the algorithm as a whole, avoiding the need to calculate it at each time step can significantly enhance the efficiency of the code.
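Since Eq. (29) is simply \(N_0\) times the per-sample transfer of Eq. (28) on every \(N_0\)-th sample, the decimated update can be sketched as below. The function name is hypothetical, and `M` is assumed to be the per-sample matrix of Eq. (27).

```python
def decimated_transfer_terms(M, p, tau, n, n0):
    """Eq. (29): N_0 times the per-sample transfer every N_0 samples, else 0."""
    if n % n0 != 0:
        return [0.0] * len(p)            # no matrix product on skipped samples
    excess = [max(pk - tk, 0.0) for pk, tk in zip(p, tau)]
    return [n0 * sum(row[k] * excess[k] for k in range(len(p))) for row in M]

# Example values (assumed): two modes, transfers every N_0 = 4 samples.
M = [[-0.01, 0.0025],
     [0.04, -0.01]]
t_on = decimated_transfer_terms(M, [1.0, 0.5], [0.0, 0.0], n=8, n0=4)
t_off = decimated_transfer_terms(M, [1.0, 0.5], [0.0, 0.0], n=9, n0=4)
```

The matrix product is then evaluated only once every \(N_0\) samples. By analogy with the condition \(\lambda \le f_s\), one would presumably also want \(\lambda N_0/f_s\le 1\), although this is an inference and not a condition stated in the text.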

On the other hand, since energy transfers are highly regular and predictable (which is precisely why this type of model was chosen), one might wish to introduce randomness into the transfer processes to induce more chaotic variations. A simple and effective method involves introducing a random variable \(\chi (n)\) following a uniform distribution between 0 and \(2\pi\). This randomness can be incorporated to vary the phases of the positive contributions of transfers in the implementation of the filters (see Eq. (25)):

$$\begin{aligned} x_i(n+1) & = \left\{ \begin{array}{ll} \text {Re}(e^{j \chi (n)}) \sqrt{2T_{i+}(n)} + u_i(n) & \ \text {if} \ z_i(n)=0\\ \tilde{X}_i x_i(n) - \tilde{Y}_i y_i(n) + \text {Re}(e^{j \chi (n)}) \sqrt{2T_{i+}(n)} + u_i(n) & \text {else} \end{array} \right. \nonumber \\ y_i(n+1) & = \left\{ \begin{array}{ll} \text {Im}(e^{j \chi (n)}) \sqrt{2T_{i+}(n)} & \text {if} \ z_i(n)=0\\ \tilde{Y}_i x_i(n) + \tilde{X}_i y_i(n) + \text {Im}(e^{j \chi (n)}) \sqrt{2T_{i+}(n)} \ \quad \qquad & \ \text {else} \end{array} \right. \end{aligned}$$
(30)

with \(\tilde{X}_i = \sqrt{e^{-2\alpha _i /f_s}+\frac{2T_{i-}(n)}{|z_i(n)|^2}} \cos (\omega _i/f_s)\) and \(\tilde{Y}_i = \sqrt{e^{-2\alpha _i /f_s}+\frac{2T_{i-}(n)}{|z_i(n)|^2}} \sin (\omega _i/f_s)\)

One can note that this method tends to dissipate more energy due to the introduction of a component with a randomized phase, which has the potential to diminish the amplitude of a filter if their phases are opposed. Also, the modulations created with this approach are affected by \(N_0\).
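A single update of Eq. (30) can be sketched as follows. The sign convention is an assumption made for this illustration: `t_pos` is the non-negative incoming contribution \(T_{i+}(n)\), and `t_neg` is passed as a signed, non-positive quantity so that the expressions of \(\tilde{X}_i\) and \(\tilde{Y}_i\) apply as printed. The function name and the clamping at zero are also assumptions.

```python
import math
import random

def randomized_step(x, y, t_pos, t_neg, u, omega, alpha, fs, rng=random):
    """One update of Eq. (30) for one filter; returns (x(n+1), y(n+1))."""
    chi = rng.uniform(0.0, 2.0 * math.pi)        # random phase chi(n)
    inj = math.sqrt(2.0 * max(t_pos, 0.0))       # modulus of the injected part
    inj_x, inj_y = inj * math.cos(chi), inj * math.sin(chi)
    mod2 = x * x + y * y
    if mod2 == 0.0:
        return inj_x + u, inj_y
    # Outgoing power only scales the modulus (clamped at 0 as a safeguard).
    gain = math.sqrt(max(math.exp(-2.0 * alpha / fs) + 2.0 * t_neg / mod2, 0.0))
    X = gain * math.cos(omega / fs)
    Y = gain * math.sin(omega / fs)
    return X * x - Y * y + inj_x + u, Y * x + X * y + inj_y

# Example values (assumed): a silent filter receiving T_{i+} = 0.02.
rng = random.Random(0)
fs, omega, alpha = 44100.0, 2.0 * math.pi * 440.0, 3.0
x1, y1 = randomized_step(0.0, 0.0, 0.02, 0.0, 0.0, omega, alpha, fs, rng)
```

For a silent filter the injected component has modulus \(\sqrt{2T_{i+}(n)}\) whatever the drawn phase. For an active filter the resulting power depends on the relative phase of the injected component, which is the source of the additional dissipation noted above.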

7 Examples

Nonlinear vibration leads to complex phenomena that can produce subtle and chaotic variations in radiated sound. We can reduce the complexity of the model and propose a heuristic that attempts to maintain the essential perceptual attributes of an object vibrating under nonlinear conditions. The resulting synthetic sound is nevertheless less realistic and versatile than sounds generated by the direct resolution of physical models (such as, e.g., the Föppl-von Kármán system) although the synthesis quality can be improved by using random processes in the implementation of the algorithms.

The coupled filter bank proposed here is dependent on various parameters: the number of filters N, the oscillation frequencies \(\omega _i\) and damping \(\alpha _i\) for each filter, the coefficients \(a_{ik}\) and the parameters \(\lambda\) and \(\eta\) for the definition of the redistribution matrix \(\textbf{M}\), and the thresholds \(\tau _i\). Strategies for setting these parameters are presented in two cases of musical interest. In the case of nonlinear plate vibration, energy is transferred to filters of nearby frequency in order to generate a gradual cascade of energy towards the high-frequency range. In the case of a string colliding with a rigid object, in contrast, there is simultaneous transfer of energy to many frequency components.

7.1 Energy cascade in thin plates

Consider a thin rectangular plate (according to the Kirchhoff model [24]), with mass density \(\rho\) in kg\(\cdot\)m\(^{-3}\), thickness H in m, flexural rigidity D in kg\(\cdot\)m\(^{2}\)\(\cdot\)s\(^{-2}\), and side lengths \(L_{x}\) and \(L_{y}\) in m. If the plate is simply supported on all its edges, the modal frequencies \(\omega _{lm}\) and modal shapes \(\phi _{lm}(x,y)\) can be expressed as follows [25]:

$$\begin{aligned} \omega _{lm}=\frac{\pi ^2}{L_{x}^2}\sqrt{\frac{D}{\rho H}} \left( l^2 +\nu ^2 m^2 \right) \quad \phi _{lm}(x,y)=\sin (l \pi x) \sin (m \pi y) \end{aligned}$$
(31)

Here, \(\nu = L_{x}/L_{y}\) is the plate aspect ratio, i.e., the ratio between the length and width of the plate. The integer indices \(l, m \ge 1\) correspond to the number of vibration extrema (\(l-1\), \(m-1\) correspond to the number of vibration nodes) in the main directions of the rectangular plate (Cartesian coordinates (x, y)), with x and y being normalized by the length of the plate in the corresponding direction (so that \(0\le x,y \le 1\)).

For a point excitation force located at \((x_e,y_e)\), we can compute the modal forces using the mode shapes evaluated at the excitation point as \(\phi _{lm}(x_e,y_e)\). We define the source of the lmth filter as follows:

$$\begin{aligned} u_{lm}(n)= \sin (l \pi x_e) \sin (m \pi y_e) u(n) \end{aligned}$$
(32)

where u(n) is the global excitation function.
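The modal frequencies of Eq. (31) and the excitation weights of Eq. (32) can be tabulated with a short sketch such as the following. The function name, the truncation to `max_index` indices per direction, and the example values are assumptions.

```python
import math

def plate_modes(lx, ly, rigidity, rho, thickness, max_index, xe, ye):
    """Return (omega_lm, weight_lm, l, m) tuples sorted by increasing frequency.

    omega_lm follows Eq. (31); weight_lm = sin(l pi xe) sin(m pi ye) is the
    factor of Eq. (32), with xe and ye normalized to [0, 1].
    """
    nu = lx / ly                                   # aspect ratio
    scale = (math.pi / lx) ** 2 * math.sqrt(rigidity / (rho * thickness))
    modes = []
    for l in range(1, max_index + 1):
        for m in range(1, max_index + 1):
            omega = scale * (l ** 2 + nu ** 2 * m ** 2)
            weight = math.sin(l * math.pi * xe) * math.sin(m * math.pi * ye)
            modes.append((omega, weight, l, m))
    modes.sort()                                   # increasing modal frequency
    return modes

# Example (assumed unit parameters): square plate struck at its center.
modes = plate_modes(1.0, 1.0, 1.0, 1.0, 1.0, max_index=4, xe=0.5, ye=0.5)
```

For a strike at the center of the plate, the weights of all modes with an even index l or m vanish up to rounding, so only odd-odd modes are directly excited by the source.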

We use a raised sinusoid for the excitation force (as proposed in [4, 26]) to simulate an impact:

$$\begin{aligned} u(n)=\left\{ \begin{array}{ll} A\sin ^2(\pi n/N_{ex}) & \ \text {if} \ n\le N_{ex}\\ 0 & \ \text {else} \\ \end{array} \right. \end{aligned}$$
(33)

For typical plate strikes, the strike duration \(N_{ex}/f_{s}\) in seconds is on the order of 1-4 ms.
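The excitation of Eq. (33) can be generated as in the sketch below. The function name, the sample rate and the 2 ms strike duration are assumptions chosen for illustration.

```python
import math

def raised_sine_excitation(amplitude, n_ex, length):
    """Samples u(0), ..., u(length - 1) of Eq. (33)."""
    return [
        amplitude * math.sin(math.pi * n / n_ex) ** 2 if n <= n_ex else 0.0
        for n in range(length)
    ]

fs = 44100                      # assumed sample rate in Hz
n_ex = round(0.002 * fs)        # 2 ms strike, i.e. 88 samples at this rate
u = raised_sine_excitation(1.0, n_ex, 200)
```

The pulse starts at zero, reaches its maximum A at \(n=N_{ex}/2\) and returns to zero at \(n=N_{ex}\), which avoids discontinuities in the source.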

The damping coefficients are chosen according to an exponential law, as proposed by Aramaki et al. [27], with parameters that are set to evoke a metallic object:

$$\begin{aligned} \alpha _{lm} = e^{(\alpha _G+\omega _{lm} \alpha _R)} \end{aligned}$$
(34)

with \(\alpha _R=4\times 10^{-5}\) and \(\alpha _G=0.33220\). This set of parameters permits direct modal synthesis for linear plate vibration. To each pair of indices (lm) we associate an index i (chosen in terms of increasing modal frequency) corresponding to the filter number used to generate the corresponding tonal component.
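For illustration, the damping law of Eq. (34) can be evaluated as follows. The helper `decay_time_60db`, which converts a damping coefficient into the time needed for the amplitude envelope to drop by 60 dB, is an addition made here for readability and is not part of the original text.

```python
import math

ALPHA_G = 0.33220     # global damping parameter given in the text
ALPHA_R = 4e-5        # frequency-dependent damping parameter given in the text

def damping(omega, alpha_g=ALPHA_G, alpha_r=ALPHA_R):
    """Damping coefficient alpha = exp(alpha_G + omega * alpha_R), Eq. (34)."""
    return math.exp(alpha_g + omega * alpha_r)

def decay_time_60db(alpha):
    """Time for the envelope exp(-alpha t) to fall by 60 dB (factor 1000)."""
    return math.log(1000.0) / alpha

# Damping and decay time at 200 Hz and 5 kHz (example frequencies, assumed).
low = damping(2.0 * math.pi * 200.0)
high = damping(2.0 * math.pi * 5000.0)
t_low, t_high = decay_time_60db(low), decay_time_60db(high)
```

With these parameters, high-frequency components are damped faster than low-frequency ones, so `t_high` is shorter than `t_low`.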

In order to produce the cascade of energy towards higher frequencies, we carry out transfers between filters whose frequencies are close. Indeed, the energy supplied by the impact is localized at low frequencies and transfers directed towards the neighboring modes allow the progressive appearance of higher frequency components. The weighting coefficients \(a_{ik}\) can be set as follows:

$$\begin{aligned} a_{ik} = \left[ 1 -\frac{|f_k - f_i|}{\Delta f} \right] _+ \end{aligned}$$
(35)

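with \(f_i=\frac{\omega _i}{2 \pi }\) the frequency of the \(i^\text {th}\) filter in Hz and \(\Delta f\) the frequency distance beyond which \(a_{ik}\) vanishes.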

We set \(\eta =1\) (ensuring conservation of energy during the redistribution). The cascade can be mainly controlled by \(\lambda\), or by the definition of thresholds \(\tau _i\) (see Figs. 3 and 4).
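The weighting coefficients of Eq. (35) can be computed with a sketch such as the following, where the function name and the example frequencies are assumptions.

```python
def neighbor_weights(freqs, delta_f):
    """a[i][k] = [1 - |f_k - f_i| / delta_f]_+ of Eq. (35), frequencies in Hz."""
    return [
        [max(1.0 - abs(fk - fi) / delta_f, 0.0) for fk in freqs]
        for fi in freqs
    ]

# Example (assumed): three filters, transfers limited to neighbors within 100 Hz.
a = neighbor_weights([100.0, 150.0, 400.0], delta_f=100.0)
```

The weights decrease linearly with the frequency distance and vanish beyond \(\Delta f\), so each filter only feeds its spectral neighbors. Note that Eq. (35) gives \(a_{ii}=1\) on the diagonal; the diagonal can be set to zero if redistribution of a mode onto itself is not wanted.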

Fig. 3
Spectrograms of output for filters whose frequency corresponds to the modal frequency of a thin plate for different values of \(\lambda\) (\(\tau _i = 0\)). From left to right: \(\lambda = 0.001 f_s\), \(\lambda = 0.01 f_s\), \(\lambda = 0.1 f_s\), \(\lambda = f_s\). We can observe that the energy cascade spreads faster and higher in frequency with the increase of \(\lambda\). Transfers are performed at each time step

Fig. 4

Spectrograms of output for filters whose frequency corresponds to the modal frequency of a thin plate for different thresholds \(\tau _i\). Left: \(\tau _i = 0\); middle: \(\tau _i = 0\) except for \(i=10\) where \(\tau _{10}=1\); right: \(\tau _i\) is half the excitation amplitude. All tonal components decay simultaneously when the thresholds are zero (left). A component emerges and decays more slowly when its threshold is non-zero (middle). When all thresholds are different from zero, we observe a usual exponential decay after the delayed appearance of the high frequency component (right)

In the case of wave turbulence in plates [28], couplings between modes can lead to rapid variations in amplitude and frequency leading to a chaotic regime. In the chaotic regime, the resulting signal is noisy and difficult to reproduce by a set of tonal components. One way to reproduce this phenomenon with the coupled filters presented in this paper is to randomize the phases of the positive contributions of the transfer term in the source, as proposed in Section 6. In this way, the tonal components are subject to rapid random amplitude modulations that can evoke the chaotic phenomenon occurring during wave turbulence in the plates (see Fig. 5).

Fig. 5

Spectrogram of output for filters whose frequency corresponds to the modal frequency of a thin plate. The random modulation of the redistributions induces rapid variations in the amplitude of the tonal components, which generate noise and beating in the signal. Transfers are performed every \(N_0=4\) time steps

7.2 Collisions in sound production

The perturbation of the vibrations of an object when colliding with an obstacle can lead to different types of sound events. In the typical case of a guitar, the player can choke the string, mute it, or play a natural harmonic. The string can also interact with the soundboard (slap, string buzz).

The model of an ideal vibrating string with simply supported boundary conditions gives the following modal frequencies and shapes:

$$\begin{aligned} \omega _i=i \omega _1\qquad \qquad \phi _i(x) = \sin {(i \pi x)} \end{aligned}$$
(36)

where the spatial coordinate x is normalized by the length of the string (\(0\le x \le 1\)). For a point excitation force located at \(x_e\), the source of the \(i^{th}\) filter can be defined as:

$$\begin{aligned} u_i(n)=\sin {(i \pi x_e)} u(n) \end{aligned}$$
(37)

We use the same excitation force and damping model as previously (see Eqs. (33) and (34)).
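Eqs. (36) and (37) can be sketched as follows. The fundamental frequency, the number of modes, and the excitation position are hypothetical values chosen for the example.

```python
import math

def string_modes(f1, n_modes):
    """Harmonic modal frequencies f_i = i * f1 for i = 1..n_modes (Eq. 36)."""
    return [i * f1 for i in range(1, n_modes + 1)]

def excitation_weights(x_e, n_modes):
    """Per-filter source gains sin(i * pi * x_e) for i = 1..n_modes (Eq. 37)."""
    return [math.sin(i * math.pi * x_e) for i in range(1, n_modes + 1)]

freqs = string_modes(f1=110.0, n_modes=6)         # hypothetical fundamental (Hz)
gains = excitation_weights(x_e=0.5, n_modes=6)    # excitation at the midpoint
```

The input of filter \(i\) is then `gains[i - 1] * u(n)`. For an excitation at the midpoint, the even modes receive (numerically almost) no energy, since they have a node there.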

The evocation of an obstacle disturbing the vibrations of the string requires the definition of thresholds that correspond to the location of the obstacle. We propose thresholds corresponding to the maximum amplitude of modal displacements without colliding with a virtual obstacle positioned at \((x_c,y_c)\), where \(y_{c}\) is the vertical displacement of the obstacle relative to the string. The amplitude at which the \(i^{th}\) modal displacement reaches the obstacle is \(y_c/|\sin {(i \pi x_c)}|\); thus, the threshold for the power is as follows:

$$\begin{aligned} \tau _i = \frac{1}{2}\left( \frac{y_c}{\sin {(i \pi x_c)}} \right) ^2 \end{aligned}$$
(38)
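The thresholds of Eq. (38) can be computed as in the sketch below. One point needs a choice that the equation leaves open: when \(\sin (i \pi x_c)\) vanishes, the mode has a node at the obstacle position and Eq. (38) is undefined. Here such modes are given an infinite threshold, on the assumption that they never reach the obstacle; the tolerance `eps` and the parameter values are also assumptions.

```python
import math

def collision_thresholds(x_c, y_c, n_modes, eps=1e-9):
    """Power thresholds tau_i of Eq. (38) for modes i = 1..n_modes."""
    taus = []
    for i in range(1, n_modes + 1):
        shape = abs(math.sin(i * math.pi * x_c))
        if shape < eps:
            # Assumption: a mode with a node at x_c never touches the obstacle.
            taus.append(math.inf)
        else:
            taus.append(0.5 * (y_c / shape) ** 2)
    return taus

taus = collision_thresholds(x_c=0.5, y_c=0.001, n_modes=4)
```

For an obstacle at the midpoint, the odd modes share the same finite threshold and the even modes are excluded from the transfer.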

We define a redistribution matrix with all columns being identical in order to cause a simultaneous redistribution to a set of tonal components. The coefficients \(a_{ik}\) are defined as follows:

$$\begin{aligned} a_{ik}=|\sin {(i \pi x_c)}|\xi _i(f_i) \quad \forall \ (i,k) \end{aligned}$$
(39)

with \(\xi _i(f_i)\) a parameter depending on the frequency allowing weighting of the redistribution according to the filter frequency. We define \(\xi _i(f_i)\) as the Fourier transform of the raised cosine, an approximation of the force profile caused by a collision (as defined for the source, Eq. (33)):

$$\begin{aligned} \xi _i = |\text {sinc}(f_i \gamma ) + \frac{1}{2} \left( \text {sinc}(f_i \gamma - 1) + \text {sinc}(f_i \gamma + 1) \right) | \end{aligned}$$
(40)

with \(f_i\) the frequency of the \(i^{th}\) filter and \(\gamma\) a parameter corresponding to the duration of the raised cosine. This results in a cutoff frequency beyond which there is no more transfer (see Fig. 6). Various examples of sound outputs for different configurations are presented. We can observe that the transfer does not affect even harmonics (resp. multiples of 3 and 4) for \(x_c = 1/2\) (resp. \(x_c = 1/3\) and \(x_c = 1/4\)), which allows the reproduction of a natural harmonic played on a guitar (see Fig. 7). As the efficiency \(\eta\) decreases, the high-frequency components increase less and all the tonal components involved in the redistribution dissipate faster. As \(\gamma\) increases, there is also less energy distributed to the high-frequency components, but this energy is not dissipated and remains in the low-frequency components (see Fig. 8).
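The weights of Eqs. (39) and (40) can be sketched as follows. The sketch assumes the normalized convention \(\text {sinc}(x)=\sin (\pi x)/(\pi x)\), which is the one under which Eq. (40) matches the spectrum of a raised cosine of duration \(\gamma\); the paper does not state the convention explicitly. The harmonic frequencies are hypothetical.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1 (assumed convention)."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def xi(freq_hz, gamma):
    """Frequency weighting of Eq. (40) for a raised cosine of duration gamma."""
    x = freq_hz * gamma
    return abs(sinc(x) + 0.5 * (sinc(x - 1.0) + sinc(x + 1.0)))

def collision_weights(freqs, x_c, gamma):
    """a_i = |sin(i pi x_c)| * xi(f_i), identical for every column k (Eq. 39)."""
    return [
        abs(math.sin(i * math.pi * x_c)) * xi(f, gamma)
        for i, f in enumerate(freqs, start=1)
    ]

gamma = 2e-4                                       # value used in Figs. 7 to 9 (s)
freqs = [i * 1000.0 for i in range(1, 13)]         # hypothetical harmonic series (Hz)
weights = collision_weights(freqs, x_c=0.5, gamma=gamma)
```

Because all columns of the matrix are identical, a single vector of weights is enough. Under the assumed sinc convention, \(\xi\) equals 1 at zero frequency, 0.5 at \(f=1/\gamma\), and first vanishes at \(f=2/\gamma\) (10 kHz for \(\gamma =2\times 10^{-4}\) s).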

Fig. 6

Value of \(\xi _i\) as a function of the frequency of filter i

Fig. 7

Spectrograms of output for filters whose frequencies are harmonic for different values of \(x_c\) (\(y_c = 0\), \(\gamma =2\times 10^{-4}\) s, \(\lambda =0.25\), \(\eta =0.5\)). Transfers are performed every 294 samples for times greater than 500 ms, which corresponds to a collision every 6.67 ms (150 Hz). From left to right: \(x_c = 1/2\), \(x_c = 1/3\), \(x_c = 1/4\)

Fig. 8

Spectrograms of output for filters whose frequencies are harmonic for different values of \(\eta\) and \(\gamma\) (\(x_c = 1/2\), \(y_c = 0\)). From left to right: (\(\eta =0.5\), \(\gamma =2\times 10^{-4}\) s), (\(\eta =0.15\), \(\gamma =2\times 10^{-4}\) s), (\(\eta =0.5\), \(\gamma =2\times 10^{-3}\) s). Transfers are performed every 294 samples for times greater than 500 ms, which corresponds to a collision every 6.67 ms (150 Hz)

Collisions in musical instruments may be the source of more subtle phenomena than a simultaneous appearance of various frequency components. The cases of string buzz and tanpura can be approached by introducing random processes into the redistribution, as has been done for chaotic phenomena in plates (see Fig. 9).

Fig. 9

Spectrograms of output for filters whose frequencies are harmonic with the introduction of random processes during the redistribution (\(\lambda = 0.001\), \(\eta = 0.9\), \(x_c = 0.38\), \(y_c = 0.001\), \(\gamma =2\times 10^{-4}\) s). Transfers are performed every 294 samples for times greater than 500 ms, which corresponds to a collision every 6.67 ms (150 Hz)

It is possible to apply the same principle for the generation of sounds corresponding to collisions with 2D objects. For example, we can generate muted plate sounds (see Fig. 10).

Fig. 10

Spectrograms of output for filters whose frequency corresponds to the modal frequency of a thin plate. Here \(\eta =0\), and we observe the quick dissipation of certain tonal components for three distinct impacts, which creates a sensation of choking

8 Discussion

We conducted a comparative analysis of computational time in MATLAB (running on an AMD Ryzen 5 PRO 6650U - 2.90 GHz CPU) between uncoupled and coupled filter banks across various conditions (see Table 1).

Table 1 Comparative analysis of computational time in MATLAB on a CPU (AMD Ryzen 5 PRO 6650U - 2.90 GHz, \(f_s = 48000\) Hz). N is the number of filters, \(c_{ik}>0\) p.m. is the number of non-zero coupling coefficients per mode, comp. time is the total computational time for a synthesized sound of 1 s, % is the computational time expressed as a percentage of the computational time without coupling, and \(N_{max}\) is the maximum number of filters for which the computational time stays below 1 s per 1 s of generated sound

For reference, the numerical resolution of the Föppl-von Kármán system typically consumes about one hour per second of sound without optimization/simplification, while the fastest simplified algorithm to date [13] still demands around 6 s of computation per second of sound on the same machine. Our model demonstrates real-time feasibility for handling 200 modes under any condition and potentially many more if the transfers are not calculated at each iteration (\(N_0>1\)). Despite a matrix featuring entirely non-zero elements, we achieve slightly superior efficiency in collision processing due to the uniformity of all columns, substantially simplifying the matrix calculation of \(\mathbf {t(n)}\). Notably, the computation of couplings remains below 15% of the total computational time for \(N_0\ge 100\), thereby enabling real-time execution for over 1000 modes.

Regarding sound quality assessment, the provided sound samples offer valuable insights. While an absolute perceptual evaluation of the model proves challenging, various subjective assessments for specific case studies form the basis of supplementary investigations, one of which has been recently published [29]. The generated sounds and the findings of this study demonstrate promising outcomes, with sounds effectively evoking the modeled phenomena and exhibiting satisfactory reproduction quality.

Another important aspect of nonlinear phenomena, the frequency variation of modes, was not addressed in this paper. However, the employed filters have the advantage of being stable even when \(\omega\) varies with each sample. A relatively straightforward approach to integrating frequency variations of the tonal components consists of identifying them directly from recorded sounds or through physical modeling and then incorporating these characteristics into the filter design.

9 Conclusions

In this paper, we have presented a model for mode coupling and the design of coupled resonant filters geared towards the emulation of mode coupling effects in nonlinear vibrating structures. This filter bank allows efficient and real-time sound synthesis even for a large number of filters. The coupling, performed without modifying the phase, introduces predictable and controllable effects on the output signal. The terms controlling the coupling between the different filters are grouped in a matrix whose definition is the main challenge. The setting of the parameters of the sound synthesis process is presented through various examples corresponding to vibrating objects with a nonlinear behavior. A simple setting allows the generation of typical sounds, though sometimes with an unnatural character; introducing random processes in the energy redistribution substantially improves both realism and plausibility.

Further work will be concerned with determining which sound morphologies are important from a perceptual point of view for the recognition of sound events [30] corresponding to nonlinear phenomena in order to reproduce them with this coupled filter bank. This could lead to the development of environmental sound synthesizers and virtual musical instruments (e.g., tanpura, cymbal) or to nonlinear audio effects (such as the nonlinear reverberation of a snare drum due to the wires held under tension against the lower drumskin). The filter bank presented in this paper can also be used as an abstract sound generation tool. In this context, the challenge would be to design intuitive control for use in a musical or sound design context.

Availability of data and materials

No associated data.

Abbreviations

CPU: Central processing unit

GPU: Graphics processing unit

PDE: Partial differential equation

References

  1. J.M. Adrien, in Representations of musical signals. The missing link: Modal synthesis (MIT Press, 55 Hayward St., Cambridge, MA, United States, 1991), pp. 269–298

  2. J.D. Morrison, J.M. Adrien, Mosaic: A framework for modal synthesis. Comput. Music. J. 17(1), 45–56 (1993)

  3. D. Rocchesso, The ball within the box: A sound-processing metaphor. Comput. Music. J. 19(4), 47–57 (1995)

  4. K. Van Den Doel, P.G. Kry, D.K. Pai, in Proceedings of the 28th annual conference on Computer graphics and interactive techniques. Foleyautomatic: Physically-based sound effects for interactive simulation and animation (ACM Press, 1601 Broadway, 10th Floor, New York, NY, United States, 2001), pp. 537–544

  5. S. Conan, E. Thoret, M. Aramaki, O. Derrien, C. Gondre, S. Ystad, R. Kronland-Martinet, An intuitive synthesizer of continuous-interaction sounds: Rubbing, scratching, and rolling. Comput. Music. J. 38(4), 24–37 (2014)

  6. K. Legge, N.H. Fletcher, Nonlinearity, chaos, and the sound of shallow gongs. J. Acoust. Soc. Am. 86(6), 2439–2443 (1989)

  7. A. Chaigne, C. Touzé, O. Thomas, Nonlinear vibrations and chaos in gongs and cymbals. Acoust. Sci. Technol. 26(5), 403–409 (2005)

  8. A. Föppl, Vorlesungen über technische Mechanik (Druck und Verlag von B.G. Teubner, Leipzig, 1907)

  9. T. von Kármán, Festigkeitsprobleme im maschinenbau. Encyklopädie Mathematischen Wiss. 4(4), 311–385 (1910)

  10. S. Bilbao, A family of conservative finite difference schemes for the dynamical von karman plate equations. Num. Meth. PDE 24(1), 193–216 (2008)

  11. M. Ducceschi, C. Touzé, Modal approach for nonlinear vibrations of damped impacted plates: Application to sound synthesis of gongs and cymbals. J. Sound Vib. 344, 313–331 (2015)

  12. M. Ducceschi, C. Touzé, in 18th International Conference on Digital Audio Effects (DAFx-15). Simulations of nonlinear plate dynamics: An accurate and efficient modal algorithm (proceedings of the DAFX conferences 2015)

  13. S. Bilbao, Z. Wang, C. Webb, M. Ducceschi, in Proceedings of the 26th International Conference on Digital Audio Effects. Real-time gong synthesis (Copenhagen, 2023)

  14. S. Bilbao, A. Torin, V. Chatziioannou, Numerical modeling of collisions in musical instruments. Acta Acustica U. Acustica 101(1), 155–173 (2015)

  15. C. Issanchou, S. Bilbao, J.L. Le Carrou, C. Touzé, O. Doaré, A modal-based approach to the nonlinear vibration of strings against a unilateral obstacle: Simulations and experiments in the pointwise case. J. Sound Vib. 393, 229–251 (2017)

  16. A. Farnell, Designing Sound (MIT Press, Cambridge Massachusetts, 2010)

  17. T. Skare, J. Abel, in Proc. 22nd Int. Conf. Dig. Audio Effects. Real-time modal synthesis of crash cymbals with nonlinear approximations, using a gpu (proceedings of the DAFX conferences, 2019)

  18. M. Mathews, J.O. Smith, in Proceedings of the Stockholm Musical Acoustics Conference (SMAC 2003)(Stockholm), Royal Swedish Academy of Music (August 2003). Methods for synthesizing very high q parametrically well behaved two pole filters (2003)

  19. S. Poirot, R. Kronland-Martinet, S. Bilbao, in 26th International Conference on Digital Audio Effects (DAFx23). A coupled resonant filter bank for the sound synthesis of nonlinear sources (proceedings of the DAFX conferences, 2023)

  20. L. Pruvost, B. Scherrer, M. Aramaki, S. Ystad, R. Kronland-Martinet, in SIGGRAPH Asia 2015 Technical Briefs. Perception-based interactive sound synthesis of morphing solids’ interactions (2015), pp. 1–4

  21. C. Gan, J. Schwartz, S. Alter, D. Mrowca, M. Schrimpf, J. Traer, J. De Freitas, J. Kubilius, A. Bhandwaldar, N. Haber et al., Threedworld: A platform for interactive multi-modal physical simulation. (2021). arXiv preprint arXiv:2007.04954. https://arxiv.org/abs/2007.04954

  22. S. Poirot, Sound examples. https://www.prism.cnrs.fr/publications-media/EURASIPPoirot/. Accessed 17 Dec 2023

  23. Z.F. Fu, J. He, Modal analysis (Elsevier, Oxford, 2001)

  24. K.F. Graff, Wave motion in elastic solids (Dover Publications, New York, 1991)

  25. N.H. Fletcher, T.D. Rossing, The physics of musical instruments (Springer Science & Business Media, New York, 2012)

  26. S. Bilbao, Numerical sound synthesis: Finite difference schemes and simulation in musical acoustics (Wiley, Chichester, 2009)

  27. M. Aramaki, M. Besson, R. Kronland-Martinet, S. Ystad, Controlling the perceived material in an impact sound synthesizer. IEEE Trans. Audio Speech Lang. Process. 19(2), 301–314 (2010)

  28. M. Ducceschi, C. Touzé, O. Thomas, S. Bilbao, Dynamics of the wave turbulence spectrum in vibrating plates: A numerical investigation using a conservative finite difference scheme. Phys. D 280, 73–85 (2014)

  29. S. Poirot, S. Bilbao, M. Aramaki, S. Ystad, R. Kronland-Martinet, A perceptually evaluated signal model: Collisions between a vibrating object and an obstacle. IEEE/ACM Trans. Audio Speech Lang. Process. 31, 2338–2350 (2023)

  30. R. Kronland-Martinet, S. Ystad, M. Aramaki, High-level control of sound synthesis for sonification processes. AI Soc. 27(2), 245–255 (2012)

Funding

Not applicable.

Author information

Contributions

SP: main contribution to the conception of the work and drafted the paper, SB: design of the work and substantial revision, RKM: project initiator, design and work revision.

Corresponding author

Correspondence to Samuel Poirot.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Appendix section : equivalence between the continuous model and the discrete-time algorithm

In this section, we denote by \(P_i\left[ n \right]\) the numerical computation of \(P_i(t)\) for \(t=n/f_s\).

• Case without transfer for any mode: \(P_i(t)<\tau _i \ \forall i \in \{ 1,2,...,N \},\ t \in \mathbb {R}_+\)

Continuous model for power transfers Eq. (14):

$$\begin{aligned} \frac{dP_i(t)}{dt} = -2\alpha _i P_i(t) \quad \Leftrightarrow \quad P_i(t)=P_i(0) e^{-2 \alpha _i t} \end{aligned}$$
(41)

In this case, the definition of a recurrence equation is straightforward:

$$\begin{aligned} P_i(t+1/f_s)=P_i(t) e^{-2 \alpha _i /f_s} \quad \Rightarrow \quad P_i\left[ n +1\right] =P_i\left[ n \right] e^{-2 \alpha _i /f_s} \end{aligned}$$
(42)

• Case with transfers: \(P_i(t)\ge \tau _i\)

Continuous model for power transfers Eq. (14):

$$\begin{aligned} \frac{dP_i(t)}{dt} = -2\alpha _i P_i(t) - \lambda \left[ P_i(t) - \tau _i \right] _+ + \sum _k \lambda c_{ik} \frac{\omega _k^2}{\omega _i^2} \left[ P_k(t) - \tau _k \right] _+. \end{aligned}$$
(43)

Using the forward Euler scheme to approximate \(\frac{dP_i(t)}{dt}\approx \left( P_i\left[ n +1\right] -P_i\left[ n \right] \right) f_s\) gives the following equation:

$$\begin{aligned} \left( P_i\left[ n +1\right] -P_i\left[ n \right] \right) f_s & = -2\alpha _i P_i\left[ n \right] - \lambda \left[ P_i\left[ n \right] - \tau _i \right] _+ + \sum _k \lambda c_{ik} \frac{\omega _k^2}{\omega _i^2} \left[ P_k\left[ n \right] - \tau _k \right] _+ \\ P_i\left[ n +1\right] & = \left( 1-\frac{2\alpha _i}{f_s}\right) P_i\left[ n \right] - \frac{\lambda }{f_s} \left[ P_i\left[ n \right] - \tau _i \right] _+ + \sum _k \frac{\lambda }{f_s} c_{ik} \frac{\omega _k^2}{\omega _i^2} \left[ P_k\left[ n \right] - \tau _k \right] _+ \end{aligned}$$

Using \(e^{x}=1+x+\mathcal {O}(x^2)\), we find the discrete-time model Eqs. (20) and (28):

$$\begin{aligned} P_i\left[ n +1\right] = e^{-2\alpha _i/f_s} P_i\left[ n \right] \underbrace{- \frac{\lambda }{f_s} \left[ P_i\left[ n \right] - \tau _i \right] _+ + \sum _k \frac{\lambda }{f_s} c_{ik} \frac{\omega _k^2}{\omega _i^2} \left[ P_k\left[ n \right] - \tau _k \right] _+}_{T_i\left[ n \right] } \end{aligned}$$
(44)
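One step of the recurrence in Eq. (44) can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the coupling coefficients \(c_{ik}\) are taken as a given input (row \(i\) is the receiving filter, column \(k\) the sending filter), and the two-mode values below are hypothetical.

```python
import math

def step_power(P, alpha, omega, tau, c, lam, fs):
    """Return the list P[n+1] computed from P[n] according to Eq. (44)."""
    n = len(P)
    excess = [max(0.0, P[k] - tau[k]) for k in range(n)]    # [P_k - tau_k]_+
    P_next = []
    for i in range(n):
        received = sum(
            c[i][k] * (omega[k] / omega[i]) ** 2 * excess[k] for k in range(n)
        )
        transfer = (lam / fs) * (received - excess[i])        # T_i[n]
        P_next.append(math.exp(-2.0 * alpha[i] / fs) * P[i] + transfer)
    return P_next

# Hypothetical two-mode setup: mode 2 receives from mode 1 only.
fs = 48000.0
lam = 0.01 * fs
alpha = [1.0, 2.0]
omega = [2.0 * math.pi * 100.0, 2.0 * math.pi * 200.0]
c = [[0.0, 0.0], [1.0, 0.0]]

P1 = step_power([1.0, 0.0], alpha, omega, [0.0, 0.0], c, lam, fs)  # with transfer
P2 = step_power([1.0, 0.5], alpha, omega, [2.0, 2.0], c, lam, fs)  # below thresholds
```

When every \(P_i\) is below its threshold, the transfer term vanishes and the update reduces to the exact exponential decay of Eq. (42).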

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Poirot, S., Bilbao, S. & Kronland-Martinet, R. A simplified and controllable model of mode coupling for addressing nonlinear phenomena in sound synthesis processes. J AUDIO SPEECH MUSIC PROC. 2024, 38 (2024). https://doi.org/10.1186/s13636-024-00358-2

  • DOI: https://doi.org/10.1186/s13636-024-00358-2

Keywords