
Real-Time Perceptual Simulation of Moving Sources: Application to the Leslie Cabinet and 3D Sound Immersion

Abstract

Perception of moving sound sources obeys different brain processes from those mediating the localization of static sound events. In view of these specificities, a preprocessing model was designed, based on the main perceptual cues involved in the auditory perception of moving sound sources, such as the intensity, timbre, reverberation, and frequency shift processes. This model is the first step toward a more general moving sound source system including a system of spatialization. Two applications of this model are presented: the simulation of a system involving rotating sources, the Leslie cabinet, and a 3D sound immersion installation based on the sonification of cosmic particles, the Cosmophone.

1. Introduction

The simulation of moving sources is of great importance in many audio applications, including musical applications, where moving sources can be used to generate special effects inducing novel auditory experiences. The motion of instruments while they are being played can also subtly affect the sound, and hence the expressiveness of the performance. Wanderley et al. [1] have reported, for example, that the motion of the clarinet follows specific trajectories depending on the type of music played, independently of the player. Although the effect of this motion on the sound has not yet been clearly established, it probably contributes to the rendering and should be taken into account in attempts to synthesize musical sounds. Virtual reality is another field where moving sources play an important role. When simulating motion, the speed and trajectory of the source are crucial to creating realistic acoustical environments, and developing signal processing methods for reconstructing these contexts is a great challenge.

Many authors have previously addressed these problems. Two main approaches have been used so far for this purpose: the physical approach, where sound fields resembling real ones as closely as possible are simulated, and the perceptual approach, where the resulting perceptual effects are taken into account.

The physical approaches used so far in this context have involved modelling sound fields using physical models based on propagation equations. In this case, the distribution of the acoustical energy in 3D space requires a set of precisely and accurately controlled fixed loudspeakers. Several techniques such as ambisonics [2], surround sound [3] and, more recently, wave field synthesis [4] and VBAP [5] have been developed and used in studies along these lines. Specific systems designed for headphone listening have also been developed [6], which involve filtering signals recorded under anechoic conditions with head-related transfer functions (HRTFs). However, the specificity of individual HRTFs gives rise to robustness issues, which have not yet been solved. In addition, it is not clear that such systems of spatialization are suitable for simulating rapidly moving sound sources, since they do not take the dynamics of the source into account. Lastly, Warren et al. [7] have established that different brain processes are responsible for mediating static and moving sounds, since the perceptual cues involved were found to differ between these two categories of sounds.

The perceptual approaches to these issues have tended to focus on the attributes that convey the impression that sounds are in motion. Chowning [8], who conducted empirical studies on these lines, established the importance of specific perceptual cues for the synthesis of realistic moving sounds.

In the first part of this paper, the physical and perceptual approaches are combined to develop a real-time model for a moving source that can be applied to any sound file. This model, which was based on Chowning's studies, was calibrated using physical knowledge about sound propagation, including air absorption, reverberation processes, and the Doppler effect. The second part of this paper deals with two audio applications of this system. The first application presented is the Leslie cabinet, a rotating source system enclosed in a wooden box, which was modelled by combining several moving sound elements to simulate complex acoustic phenomena. In this application, we take the case of a listener placed far from the sound sources, which means that the acoustic environment greatly alters the original sound. The second application focuses on a virtual reality installation combined with cosmic particle detectors: the Cosmophone. Here, the listener is immersed in a 3D space simulating the sonified trajectories of the particles.

2. What is Perceptually Relevant?

Based on previous studies (see, e.g., [9] and the references therein, [8, 10–16]), four important perceptual cues can be used to draw up a generic model for a moving sound source. Most of these cues do not depend on the spatialization process involved, but they nevertheless greatly influence the perception of sounds, including those emitted by fixed sources.

Sound pressure

From the physical point of view, the sound pressure relates to the sound intensity and, in a more complex way, to the loudness. The sound pressure varies inversely with the distance between the source and the listener. This rule is of great importance from the perceptual point of view [15], and it is possibly decisive in the case of slowly moving sources. It is worth noting that only the relative changes in the sound pressure should be taken into account, since the absolute pressure has little effect on the resulting percept.

Timbre

Timbre is a perceptual attribute which makes it possible to discriminate between different sounds having the same pitch, loudness, and duration [17]. From a signal processing point of view, timbre variations are reflected in changes in both the time evolution and the spectral distribution of the sound energy. Subtle changes of timbre can also make it possible to distinguish between various sounds belonging to the same class. For example, in the class consisting of impact sounds on geometrically identical bars, it was established in a previous study that it is possible to differentiate perceptually between various wood species [18].

Changes in the timbre of moving sound sources, which are physically predictable, play an important perceptual role. Composers such as Maurice Ravel used cues of this kind, in addition to intensity variations, to create a realistic sensation of an oncoming band in his Boléro: the orchestra starts in a low-frequency register to simulate the band playing at a distance, and the brightness gradually increases to make the musicians seem to be coming closer. Schaeffer [10] also used changes of timbre in a radiophonic context to simulate auditory scenes in which the speakers occupied different positions in the virtual space.

The changes of timbre due to distance can be accounted for physically in terms of air absorption. The main perceptual effect of air absorption on sounds is a low-pass filtering process, the result of which depends on the distance between source and listener. Note that, under usual conditions, the frequency band in which most human communication occurs is hardly affected, even at large source-to-listener distances. To simulate moving sound sources covering large distances, the effects of air absorption must be taken into account.

The Doppler effect: a frequency shift

From the physical point of view, moving sound sources induce a frequency shift known as the Doppler effect. Depending on the relative speed of the source with respect to the listener, the frequency measured at the listener's position is [19]

$$ f' = f\,\frac{c + v_L}{c - v_S} \qquad (1) $$

where f is the frequency emitted by the source, v_L and v_S denote the relative speed of the listener in the direction of the source and the relative speed of the source in the direction of the listener, respectively, and c is the sound velocity. During a given sound source trajectory, the perceived frequency is time-dependent, and its specific pattern seems to be a highly relevant cue enabling the listener to construct a mental representation of the trajectory [15]. Chowning [8] used such a pattern to design efficient signal processing algorithms accounting for the perception of moving sources. It is worth noting here that the Doppler effect combines changes in intensity with the frequency shift. The perceptual result is, therefore, a complex combination of these two parameters, since an increase in intensity tends to be perceived as a pitch variation, owing to the close relationship between intensity and frequency [13]. The Doppler effect is a dynamic process, which cannot be defined by taking motion to be a series of static source positions, and this effect is robust whatever system of spatialization is used, including fixed monophonic diffusion.
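As a purely illustrative numerical example (the values below are not taken from the measurements reported here), a source emitting f = 440 Hz and approaching a fixed listener at 20 m/s, with c = 343 m/s, is perceived at

$$ f' = 440 \times \frac{343}{343 - 20} \approx 467\ \mathrm{Hz}, $$

that is, roughly a semitone higher; the perceived frequency drops below 440 Hz as soon as the source starts receding.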

Environment: the effects of reverberation

In everyday life, the quality of a sound depends on the environment. Scientists and engineers working on room acoustics (see, e.g., [11]) have studied this crucial issue intensively. The influence of the environment is a complex problem, and modelling sounds while taking architectural specificities into account is beyond the scope of this study. The effects of reverberation can nevertheless be explained by the physical laws of sound propagation: distant sound sources lead to more highly reverberated signals than nearby ones, because with distant sources the direct and reflected sound paths are of similar lengths, whereas with nearby sources the direct sound is of much greater magnitude than the reflected sounds. Moving sound sources therefore involve a time-dependent ratio of direct to reverberated sound, the value of which depends on the distance between source and listener.

2.1. A Real-Time Moving Source Model

In line with the above considerations, a generic model was drawn up simulating the motion of an acoustic source by processing a sound file corresponding to the acoustic radiation emitted by a fixed source. This model consists of a combination of the four main components described above (Figure 1). The relative speed and distance between the listener and the moving source control the parameters of the model. Efficient interfaces can, therefore, be added to simplify the modelling of the trajectories. The resulting sound is intended for monophonic listening, but it could be linked to a system of spatialization, enhancing the realism of the motion.

Figure 1: Scheme of the moving source model.

2.2. Implementation

We describe below how each elementary process can be modelled algorithmically. The global implementation scheme is shown in Figure 3. The whole model was implemented in real time in the Max/MSP [20] development environment. The implementation, which can be downloaded from the web (see Section 6), allowed us to check the perceptual accuracy of the model.

2.2.1. Intensity Variations

Intensity variations are controlled directly by the level of the sound. Assuming the sound propagation to involve spherical waves, the sound level varies as 1/d, where d is the source-to-listener distance. From the practical point of view, care must be taken to avoid divergence problems as d approaches 0.
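A minimal sketch of this gain law (in Python; the guard distance d_min is an assumption introduced here to avoid the divergence, not a value from the paper):

```python
def distance_gain(d: float, d_min: float = 0.1) -> float:
    """Amplitude gain for a spherical wave, proportional to 1/d.
    The guard distance d_min (assumed) avoids the divergence at d -> 0."""
    return 1.0 / max(d, d_min)
```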

2.2.2. Timbre Variations

As mentioned above, the timbre variations due to air absorption mainly affect the high-frequency components. Since this factor is probably of lesser perceptual importance than other motion cues, its treatment can be simplified in the implementation. Huopaniemi et al. [12] have established that the magnitude response accounting for air absorption can be modelled using low-pass IIR filters, the frequency response of which must vary with the listener-to-source distance. However, no information seems to be available in the literature on how accurately these filters have to be designed to ensure realistic distance perception. We therefore designed a model based on a compromise between perceptual accuracy and real-time performance. This constraint requires the number of control parameters (the so-called "mapping") as well as the algorithmic complexity to be minimized. A classical second-order high-shelving IIR filter, as described in [21], was used to model the timbre variations due to air absorption. This kind of filter, originally designed for parametric equalizers, makes it possible to either boost or cut the high-frequency part of the audio spectrum. To simulate air absorption, the control parameters (cutoff frequency and gain) have to be linked to the listener-to-source distance. At a given listener-to-source distance d, an "air transfer function" H_air can be computed using the formulae given in [22]. An optimization procedure, based on a least-squares minimization method, then gives the gain G and cutoff frequency f_c minimizing the difference between H_air and H_s, where H_s is the transfer function of the high-shelving filter. Since the cutoff frequency was found to depend only weakly on the distance, it was set to 10 kHz. This led to a single control parameter: the gain G. Furthermore, this gain in dB can be related to the distance d in meters via the simple relation:

$$ G(d) = -\alpha\, d \qquad (2) $$

The computed air transfer functions and the simulated filter magnitude responses are compared in Figure 2 for distances up to 50 meters, with the parameters given above. Although the simulation differs from reality (especially in the high-frequency range), it yielded perceptually satisfactory results. In addition, the factor α relating the filter gain to the source-to-listener distance can be changed, so that the effects of the timbre variations can easily be adjusted (increased or decreased).
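The control mapping can be sketched as follows. This is a simplified first-order high-shelf (a one-pole low-pass plus a scaled high band) rather than the second-order IIR filter used in the model, and the value of alpha and the loop-based implementation are illustrative assumptions:

```python
import numpy as np

def air_absorption_shelf(x: np.ndarray, d: float, fs: float = 44100.0,
                         fc: float = 10000.0, alpha: float = 0.2) -> np.ndarray:
    """Approximate air absorption as a high-shelf cut whose gain in dB
    decreases linearly with the source-to-listener distance d (see (2));
    alpha is an assumed dB-per-meter factor."""
    g = 10.0 ** (-alpha * d / 20.0)             # linear shelf gain, G = -alpha*d dB
    b = 1.0 - np.exp(-2.0 * np.pi * fc / fs)    # one-pole low-pass coefficient
    lp = np.empty_like(x, dtype=float)
    state = 0.0
    for n, xn in enumerate(x):                  # y[n] = y[n-1] + b*(x[n] - y[n-1])
        state += b * (xn - state)
        lp[n] = state
    return lp + g * (x - lp)                    # low band kept, high band attenuated
```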

Figure 2: Air transfer functions (solid lines) and simulated filter transfer function moduli (dotted lines) obtained by optimization for various source-to-listener distances. The air transfer functions were computed for a temperature of 20°C, an atmospheric pressure of 1013 hPa, and a fixed hygrometry. The cutoff frequency of the simulated filter was set at 10 kHz, and the filter gain was computed using (2).

Figure 3: Implementation of the moving source model.

2.2.3. Doppler Frequency Shift

The Doppler frequency shift is due to changes in the path length between source and listener, and hence to changes in the propagation time τ(t). The Doppler frequency shift (1) can therefore be produced by a variable delay line. In the case of a sound source emitting a monochromatic signal and moving with respect to a fixed listener, Smith et al. [23] obtained the following expression:

$$ f' = f\left(1 - \frac{d\tau(t)}{dt}\right) \qquad (3) $$

For a given trajectory (e.g., a source moving along a straight line and passing in front of the observer), the source velocity projected onto the source-to-listener line can be precalculated at each time sample. The delay value can then be computed as a function of time. However, when the source trajectory is unpredictable, the derivative of the delay can be used, as in (3). Strauss [24] suggested approximating complex trajectories by piecewise linear curves in order to obtain an analytical solution for the delay τ(t).

Here, we adopted the approach proposed by Tsingos [25], who gave the following expression for τ(t):

$$ \tau(t) = \frac{1}{c}\,\bigl\| x_L(t) - x_S\bigl(t - \tau(t)\bigr) \bigr\| \qquad (4) $$

where x_L(t) and x_S(t) are the respective positions of the listener and the source at time t, and ||·|| denotes the Euclidean distance. This expression was simplified in our implementation, since similar perceptual effects were still obtained even at source speeds of 100 km/h:

$$ \tau(t) \simeq \frac{1}{c}\,\bigl\| x_L(t) - x_S(t) \bigr\| \qquad (5) $$

Note that the delay line must deal with fractional values of the delay. This problem has been addressed previously (see, e.g., [26]).
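A minimal Python sketch of such a variable delay line, computing the per-sample delay from (5) and using linear interpolation for the fractional part (the trajectory in the usage example, the buffer handling, and the sampling rate are illustrative assumptions):

```python
import numpy as np

def doppler_delay(x: np.ndarray, dist: np.ndarray, fs: float = 44100.0,
                  c: float = 340.0) -> np.ndarray:
    """Variable delay line implementing the Doppler shift via (5):
    tau[n] = dist[n] / c, read with linear interpolation to handle
    fractional-sample delays. dist gives the source-to-listener
    distance (in meters) at every output sample."""
    y = np.zeros_like(x)
    for n in range(len(x)):
        tau = dist[n] / c * fs              # delay in samples (fractional)
        pos = n - tau                        # read position in the input signal
        i = int(np.floor(pos))
        frac = pos - i
        if 0 <= i < len(x) - 1:
            y[n] = (1.0 - frac) * x[i] + frac * x[i + 1]
    return y

# Usage example: a source passing the listener in a straight line at 20 m/s,
# with a closest approach of 5 m (values chosen for illustration only).
fs, T = 44100, 4.0
t = np.arange(int(fs * T)) / fs
x = np.sin(2 * np.pi * 440.0 * t)                       # fixed-source signal
dist = np.sqrt(5.0**2 + (20.0 * (t - T / 2))**2)        # source-to-listener distance
y = doppler_delay(x, dist, fs) / np.maximum(dist, 0.1)  # add the 1/d level cue
```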

2.2.4. Reverberation Effect

Reverberation depends on the local environment, and its treatment is usually left to the user. However, a few reverberation archetypes can be defined. In line with Chowning [8], we split the reverberation into its global and local components. The global reverberation originates from the whole space, whereas the local reverberation originates from the direction of the source. As Chowning stated, this corresponds to a fair approximation of a real acoustical situation, where an increase in the distance between the listener and the sound source leads to a decrease in the distance between the source and the reflecting surfaces, giving the reverberation some directional emphasis. The global reverberation level can be defined as 1/(d√d), and the local reverberation level as (1 − 1/d)/√d, where d is the source-to-listener distance. This ensures the following:

(i) the sum of the global and local reverberation levels varies as 1/√d;

(ii) the ratio between the global reverberation level and the direct sound level varies as 1/√d.

The modelling of the effects of reverberation can be enhanced with specific systems of spatialization. Actually, in the case of multiple speaker arrays, the global reverberation should be equally distributed to all the speakers, while the local reverberation follows the moving source. This method has been found to greatly improve the realism of the perceptual effects simulated.
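The sketch below illustrates this routing over a speaker array under the distance laws given above (the Chowning-style level formulas, the guard distance, and the nearest-speaker panning of the local reverberation are assumptions of this illustration):

```python
import numpy as np

def reverb_sends(d: float):
    """Distance-dependent levels (Chowning-style scheme as used above):
    direct ~ 1/d, global reverberation ~ 1/(d*sqrt(d)),
    local reverberation ~ (1 - 1/d)/sqrt(d)."""
    d = max(d, 1.0)                           # assumed guard to keep levels bounded
    direct = 1.0 / d
    global_rev = 1.0 / (d * np.sqrt(d))
    local_rev = (1.0 - 1.0 / d) / np.sqrt(d)
    return direct, global_rev, local_rev

def route_reverb(global_rev: float, local_rev: float,
                 source_azimuth: float, speaker_azimuths: np.ndarray) -> np.ndarray:
    """Global reverberation is shared equally by all speakers; the local
    reverberation follows the source (sent to the nearest speaker)."""
    gains = np.full(len(speaker_azimuths), global_rev / len(speaker_azimuths))
    nearest = np.argmin(np.abs(np.angle(np.exp(1j * (speaker_azimuths - source_azimuth)))))
    gains[nearest] += local_rev
    return gains
```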

3. A Leslie Cabinet Simulator

3.1. The Leslie Cabinet

The Leslie cabinet is an interesting application of the moving sound source model. Originally designed to add a choral effect to Hammond organs, Leslie cabinets have been successfully used as effect processors for many other musical instruments [27]. A Leslie cabinet is a wooden box containing a rotating horn radiating the high frequencies and a rotating speaker port coupled to a woofer radiating the low frequencies. Each rotating source is driven by its own motor and mechanical assembly, so the rotating speeds of the two sources differ. The crossover frequency of this two-way speaker system is about 800 Hz. A diffuser is mounted at the end of the horn to approximate an omnidirectional pattern of radiation. The box is almost completely closed and contains only the vents from which the sound radiates. The rotating speed of the horn is fast enough to produce pitch and amplitude modulations due to the Doppler effect. For the woofer port, the frequency modulation is assumed not to be perceptible [27]; the main perceptual effect is the amplitude modulation. In addition to these effects, the rotation of both the low- and high-frequency sources results in a time-dependent coupling with the room, creating a particular spatial modulation effect.

Smith et al. [23] investigated the Leslie effect, focusing mainly on the simulation of the sound radiated by the rotating horn. In this study, the authors concluded that under free field conditions, without the box, far from the rotating source, both the Doppler frequency shift and the amplitude modulation are likely to be almost sinusoidal. They also stated that the reflections occurring inside the wooden cabinet should be taken into account when simulating Leslie effects.

3.2. Measurements

To assess the perceptual effects of these factors, measurements were performed on a model 122A Leslie cabinet (Figure 4). The cabinet was placed in an anechoic room and driven by a sinusoidal generator. The acoustic pressure was measured using a microphone placed 1.2 m from the cabinet, at the same height from the floor as the rotating plane of the horns.

Figure 4: View of the 122A Leslie cabinet (open and closed) used for our measurements.

From the recorded signal s(t), the analytic signal [28], given by z(t) = s(t) + i·H[s(t)] (where H denotes the Hilbert transform operator), was calculated in order to deduce both the amplitude and the instantaneous frequency modulation laws.
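This analysis can be reproduced with standard tools; a sketch using SciPy's Hilbert-transform routine (the sampling-rate handling is the only assumption):

```python
import numpy as np
from scipy.signal import hilbert

def am_fm_laws(s: np.ndarray, fs: float):
    """Amplitude and instantaneous-frequency modulation laws of a recorded
    signal s, via the analytic signal z = s + j*H[s]."""
    z = hilbert(s)                                   # analytic signal
    amplitude = np.abs(z)                            # AM law
    phase = np.unwrap(np.angle(z))
    inst_freq = np.diff(phase) * fs / (2 * np.pi)    # FM law (Hz)
    return amplitude, inst_freq
```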

The middle panel in Figure 5 shows the amplitude modulation law of the signal obtained with an 800 Hz input signal. The bottom panel shows the frequency modulation law of this signal. The instantaneous frequency showed a typical pattern in which large positive and negative peaks occur at the instants when the signal amplitude is almost zero. Patterns of this kind have been observed in situations where, for example, the vibrato of a singing voice is perturbed by the room acoustics [29]. To determine the origin of these components, additional measurements were performed using sinusoidal input signals driving the horn alone. In this case, the interference was still observed, which means that radiation interference between the woofer and the horn alone does not account for the complexity of the modulations. Other sound sources due to the enclosure therefore have to be taken into account when modelling the Leslie cabinet.

Figure 5: Analysis of the acoustical output signal from the Leslie cabinet driven with an 800 Hz sinusoidal input signal; both the woofer and the horn were activated. (a) Microphone signal, (b) amplitude modulation, (c) frequency modulation.

3.3. Implementation

The moving sound source model makes it easy to use the well-known image method [30] to account for the reflections from the box walls in the simulation procedure. The coordinates of the image sources can easily be deduced from the geometry of the cabinet, that is, from the coordinates of the directly radiating source and those of the reflecting planes. Since the computational complexity of the image method increases exponentially with the number of reflections taken into account, perceptual assessments were performed to estimate the minimum number of image sources required. It was concluded that one image source per reflecting plane (first order) sufficed to obtain satisfactory perceptual results.
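A minimal sketch of this first-order image-source computation for an axis-aligned box (the wall list, box dimensions, and source position in the example are placeholders, not measurements of the cabinet):

```python
import numpy as np

def first_order_images(src: np.ndarray, walls) -> list:
    """First-order image sources for an axis-aligned box.
    `walls` is a list of (axis, coordinate) pairs, one per reflecting plane;
    each image is the source mirrored across that plane."""
    images = []
    for axis, coord in walls:
        img = src.copy()
        img[axis] = 2.0 * coord - src[axis]   # reflection: x' = 2*c - x
        images.append(img)
    return images

# Example: five reflecting planes (one wall removed, as for the cabinet used here);
# the box dimensions below are placeholders, in meters.
walls = [(0, 0.0), (0, 0.7), (1, 0.0), (2, 0.0), (2, 0.4)]
images = first_order_images(np.array([0.35, 0.2, 0.2]), walls)
```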

The implementation of the Leslie horn simulator is shown in Figure 6. The sound produced by the horn is composed of the sum of the direct sound source and five image sources (the back wall of the horn part of our cabinet had been removed). Each source was processed using the moving source model. In addition, the signals injected into the moving image source models were filtered to account for the frequency-dependent sound absorption of the wood material. The wood absorption filter was an FIR filter whose impulse response was based on wood absorption data available in the literature [31]. The same procedure was used for the woofer simulator. As in the real Leslie cabinet, crossover filtering of the input signal provides the inputs to both the woofer and the horn simulators. It is worth noting that, to obtain a more realistic simulation of the Leslie cabinet, the distortion due to the nonlinear response of the Leslie tube amplifier would also have to be taken into account.

Figure 6: Overview of the Leslie horn simulator with five image sources.

3.4. Results

To assess the perceptual quality of the model, listening tests will have to be run. In addition, these tests should be entrusted to musicians experienced in the use of the Leslie cabinet. Nevertheless, to check the accuracy of the model, the main characteristics of the simulated signal can be compared with those of the recorded one. For this purpose, we fed the model with a sinusoidal input signal at a frequency of 800 Hz (the crossover frequency) in order to include the effects of both the horn and the woofer. When the image source part was not active, the output signal showed periodic amplitude and frequency modulations, the extent of which was comparable to the data given in [23]. This can be seen in Figure 7, which gives both the signal and its amplitude and frequency modulation laws. In this case, the resulting audible effect (which can also be obtained as described in [32]) is a combination of the so-called vibrato and tremolo effects and does not correspond at all to the typical Leslie effect. When the image sources were active, the signal characteristics were much more complex, as shown in Figure 8, where the aperiodic behavior of the modulation laws, which we believe to be responsible for the particular "Leslie effect," can clearly be seen. These features can also be seen in Figure 5, which shows the output signal recorded from a real Leslie cabinet driven by an 800 Hz monochromatic signal. Using musical signals, the sounds obtained with the Leslie cabinet and the simulator output were described by professional musicians as being of similar quality. A Max/MSP implementation of the Leslie cabinet simulator can be downloaded from the web (see Section 6).

Figure 7: Analysis of the output signal from the horn simulator driven with an 800 Hz sinusoidal input signal; the part simulating the image sources was disconnected. (a) Microphone signal, (b) amplitude modulation, (c) frequency modulation.

Figure 8: Analysis of the output signal from the complete Leslie simulator driven with an 800 Hz sinusoidal input signal. (a) Microphone signal, (b) amplitude modulation, (c) frequency modulation.

3.5. Spatialization

Another important feature of the Leslie cabinet effect is the spatial modulation resulting from the time-dependent coupling between the cabinet and the listening room. To simulate this effect, a time-dependent directivity system was used. The directivity of this system should ideally be the same as that of the Leslie cabinet. A generic approach to directivity simulation such as that described in [33] could be used here, which involves measuring both the simulating system and the target directivity; a set of filters is then obtained by optimization. In the case of the Leslie cabinet simulation, the rotation of the sources increases the complexity of the problem. As a first step, we designed a simplified, easy-to-control system of spatialization preserving the concept of a rotating source. Our system of spatialization consisted of four loudspeakers placed back to back (Figure 9) to cover the whole 360-degree range. This set of loudspeakers can be viewed as two orthogonal dipoles, which together can generate a variable pattern of directivity. The input signal fed to each speaker satisfies the following expressions:

$$ s_1(t) = x(t)\,\bigl[(1-a) + a\cos(\Omega t)\bigr], \qquad s_3(t) = x(t)\,\bigl[(1-a) - a\cos(\Omega t)\bigr], $$
$$ s_2(t) = x(t)\,\bigl[(1-a) + a\sin(\Omega t)\bigr], \qquad s_4(t) = x(t)\,\bigl[(1-a) - a\sin(\Omega t)\bigr]. \qquad (6) $$

Here, x(t) is the input signal, speakers (1, 3) and (2, 4) form the two dipoles, and Ω is the rotation angular speed. The parameter a can be set to any value between 0 and 1, so that the pattern of directivity can be adjusted from omnidirectional to bidirectional. When a = 0, each speaker receives the same signal, and the system is therefore omnidirectional. When a = 1, the speakers of each dipole receive signals with opposite phases. Each dipole then distributes the energy with a "figure of eight" pattern of directivity. Since the two dipoles are in phase quadrature, the resulting directivity of the whole system corresponds approximately to that produced by a dipole rotating at the angular speed Ω. With a = 0.5, which corresponds theoretically to a rotating cardioid pattern, satisfactory perceptual results were obtained.
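A sketch of the per-speaker gains implied by (6) as written here (the rotation rate and the value of a in the usage example are illustrative):

```python
import numpy as np

def speaker_gains(a: float, omega: float, t: float) -> np.ndarray:
    """Gains for the four back-to-back speakers (two orthogonal dipoles).
    a = 0 gives an omnidirectional pattern, a = 1 two quadrature dipoles,
    a = 0.5 approximately a rotating cardioid."""
    m = omega * t
    return np.array([
        (1.0 - a) + a * np.cos(m),   # s1, dipole (1, 3)
        (1.0 - a) + a * np.sin(m),   # s2, dipole (2, 4)
        (1.0 - a) - a * np.cos(m),   # s3, dipole (1, 3)
        (1.0 - a) - a * np.sin(m),   # s4, dipole (2, 4)
    ])

# Usage example: cardioid-like pattern rotating at an assumed rate of 6.7 rev/s.
gains = speaker_gains(a=0.5, omega=2 * np.pi * 6.7, t=0.01)
```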

Figure 9: Scheme of the system of spatialization used for the Leslie cabinet simulations.

In the real Leslie cabinet, the woofer port and the horns rotate at different angular frequencies. Two identical systems of spatialization can thus be used to control the woofer and horn simulations separately, each system being driven by a different angular rotation speed.

4. Cosmophone

Sound is an interesting way of making invisible events perceptible. Actually, sounds produced by invisible or hidden sources can provide information about both the motion and the location of the sources. The cosmophone is a 3D sound immersion installation designed to sonify invisible cosmic particles, using synthetic sounds eliciting physically relevant sensations. The design of the cosmophone as a sound and music interface has been described in [34, 35]. We will describe below how the moving sound model was used in this framework to generate sounds evoking the trajectories of cosmic particles.

4.1. The Cosmic Rays

Interstellar space contains a permanent flux of high-energy elementary particles called "cosmic rays." These particles are created by violent events, such as the explosion of a massive, aged star as a supernova. The particles then remain confined in the galaxy for millions of years by the galactic magnetic fields before reaching our planet. When colliding with the Earth's atmosphere, cosmic rays create showers of secondary particles. Although they are partly absorbed by the atmosphere, these showers have many measurable effects, including a flux of muons. Muons, which resemble heavy electrons but are usually absent from ordinary matter because of their short lifetime, are present in large numbers in cosmic showers. Thanks to their outstanding penetrating properties, they are able to reach the ground. At sea level, they arrive at a rate of about one hundred muons per second per square meter. High-energy cosmic rays produce bunches of muons, or multimuons, which share the same direction and fall a few meters apart from each other.

4.2. The Cosmophone Installation

Human beings are unaware of the particles passing through their body. The cosmophone is a device designed to make the flux and properties of cosmic rays directly perceptible within a three-dimensional space. This is done by coupling a set of elementary particle detectors with an array of loudspeakers via a real-time data acquisition system and a real-time sound synthesis system (Figure 10). In this device, the information received from the detectors triggers the onset of sounds. Depending on the parameters of the particles detected, various types of sounds are generated. These parameters and the rate of occurrence of the various cosmic phenomena give rise to a large variety of sound effects. Many strategies for generating sounds from random events of this kind are currently being explored.

Figure 10: Scheme of the cosmophone device.

The system of synthesis has to generate sounds in response to signals emitted by the particle detection system. To simulate a rain of particles in which listeners are immersed, the loudspeakers were placed in two arrays: one above the listeners (above a ceiling) and the other below them (under a specially built floor). The arrays of loudspeakers were arranged so that the ears of the listeners (who were assumed to be standing up and moving about inside the installation) were approximately equidistant from the two groups. Both the ceiling and the floor were acoustically transparent, but the speakers were invisible to the listeners. A particle detector was placed near each loudspeaker. When a particle passed first through a detector in the top group and then through a detector in the bottom group, a sound event was triggered. This sound event consisted of a sound moving from the ceiling to the floor, thus "materializing" the trajectory of the particle.

4.3. Sound Generation and Spatialization

The sound generator system was based on the moving sound source model described above. It also includes a synthesis engine allowing for the design of various sounds and a sampler triggering the use of natural sounds. Because of the morphology of the human ears, one can accurately localize sources moving in the horizontal plane, but far less accurately those moving in the vertical plane [36]. Accordingly, initial experiments showed that the use of a panpot to distribute the signal energy between two loudspeakers does not suffice to create the illusion of a vertically moving sound source. In particular, listeners were unable to clearly identify the starting and final positions of the moving source in the 3D space. To improve the localization of the extreme points of the particle trajectory, we therefore added two short cues (called localization indices) to the sound event. The first cue is emitted by the upper loudspeaker at the beginning of the sound event, and the second by the lower loudspeaker at the end of the event. Since these two cues were chosen so as to be very precisely localizable, they greatly improved the subjects' perception of the vertical trajectory by giving the impression of a sound crossing the ceiling before hitting the floor.
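A schematic rendering of one such event, as described above: a short localization cue on the upper loudspeaker at the onset, a crossfade of the moving-source sound from the upper to the lower loudspeaker, and the second cue on the lower loudspeaker at the end (the linear crossfade and the two-channel layout are assumptions of this sketch; the real installation drives a full speaker array):

```python
import numpy as np

def vertical_event(moving_snd: np.ndarray, cue: np.ndarray) -> np.ndarray:
    """Two-channel rendering of one particle event: row 0 = upper speaker,
    row 1 = lower speaker. A short cue marks the start of the trajectory
    at the ceiling and its end at the floor."""
    n = len(moving_snd)
    fade = np.linspace(1.0, 0.0, n)           # assumed linear top-to-bottom crossfade
    top = moving_snd * fade
    bottom = moving_snd * (1.0 - fade)
    top[:len(cue)] += cue                      # localization index at the onset (ceiling)
    bottom[-len(cue):] += cue                  # localization index at the end (floor)
    return np.stack([top, bottom])
```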

A 24-channel cosmophone device was built for the Cité des Sciences et de l'Industrie in Paris, as part of a particle physics exhibition stand: the Théâtre des Muons (Figure 11). It was recently updated for the exhibition called Le Grand Récit de l'Univers. In this installation, two arrays of twelve speakers and detectors were placed in two concentric circles: the inner one comprises four speakers and detectors and the outer one, eight others. The outer circle was about five meters in diameter, which is wide enough to allow several listeners to stand in the installation.

Figure 11: The cosmophone installed at the Cité des Sciences et de l'Industrie (Paris).

In practice, three different types of events could be distinguished: a single muon reaching a pair of detectors (successively hitting a detector placed above the ceiling and then one located under the floor), a "small bunch," where more than one but fewer than four pairs of detectors are hit simultaneously, and a "large bunch," where at least four pairs are hit. The three cases correspond to different sound sequences (sound examples can be found at http://cosmophone.in2p3.fr/).
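The classification described above reduces to counting the detector pairs hit in coincidence; a minimal sketch:

```python
def classify_event(n_pairs_hit: int) -> str:
    """Map the number of detector pairs hit simultaneously to an event type:
    1 pair -> single muon, 2 or 3 pairs -> small bunch, 4 or more -> large bunch."""
    if n_pairs_hit >= 4:
        return "large bunch"
    if n_pairs_hit > 1:
        return "small bunch"
    return "single muon"
```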

5. Conclusion

To make virtual moving sound events realistic, some important features of the physical processes at work in real moving sources can be modeled. When dealing with synthesized sounds or sounds recorded from fixed sources, a preprocessing step is required to induce in listeners a coherent mental representation of the motion. The real-time preprocessing model designed for this purpose accurately accounts for four main perceptual cues, namely the intensity, timbre, and reverberation, as well as the Doppler effect. This model renders moving sound sources convincingly, even in the case of monophonic diffusion systems, which shows the relative independence between sound motion and sound localization. The model parameters can be based on physical considerations. By simplifying the process while keeping the most fundamental aspects of the situation, an accurate method of implementing and controlling the model in real time was developed.

The moving sound model could now be used as the basis of more complex systems involving, for example, the influence of room acoustics. The Leslie cabinet is a good example of such a system, since the perceptual effects produced by the cabinet result from both the rotating sources and the sound enclosure. We have also described here how a combination of several elementary moving sound source models can be used to simulate this special choral effect accurately, and how the realism can be enhanced by connecting these models to a multiple-speaker system. Likewise, the moving source model has been used to construct a 3D sound immersion system driven by the detection of cosmic particles. The cosmophone, which is based on a combination of moving source effects and spatialization techniques, is a good example of an application where only a few features, such as localization indices improving our ability to localize vertically moving events, needed to be added to our generic model.

The simulation of moving sound sources is an exciting field of research that is constantly opening up new domains of application. Various techniques can be combined to generate novel audio effects, such as those obtained by incorporating the Leslie cabinet simulator into the cosmophone installation. As far as the musical applications of this approach are concerned, we are currently developing an interface including a motion sensor for controlling a clarinet synthesis model in which the motion of the instrument is taken into account. Simulating the motion of sound sources is undoubtedly one of the keys to realistic sound modelling.

6. Methods

Cosmophone: http://cosmophone.in2p3.fr/.

Java atmospheric sound absorption calculators: http://www.csgnetwork.com/atmossndabsorbcalc.html. http://www.me.metu.edu.tr/me432/soft15.html.

Moving Sound Max/MSP patches downloadable from: http://www.lma.cnrs-mrs.fr/~kronland/MovingSources.

References

  1. Wanderley MM, Vines BW, Middleton N, McKay C, Hatch W: The musical significance of clarinetists' ancillary gestures: an exploration of the field. Journal of New Music Research 2005,34(1):97-113. 10.1080/09298210500124208

  2. Gerzon MA: Periphony: with-height sound reproduction. Journal of the Audio Engineering Society 1973,21(1):2-10.

  3. ITU-R Recommendation BS.775-1: Multichannel stereophonic sound system with and without accompanying picture. 1994.

  4. Berkhout AJ, de Vries D, Vogel P: Acoustic control by wave field synthesis. The Journal of the Acoustical Society of America 1993,93(5):2764-2778. 10.1121/1.405852

  5. Pulkki V: Virtual sound source positioning using vector base amplitude panning. Journal of the Audio Engineering Society 1997,45(6):456-466.

  6. Schroeter J, Poesselt C, Opitz H, Divenyi PL, Blauert J: Generation of binaural signals for research and home entertainment. Proceedings of the 12th International Congress on Acoustics (ICA '86), July 1986, Toronto, Canada B1–6:

  7. Warren JD, Zielinski BA, Green GGR, Rauschecker JP, Griffiths TD: Perception of sound-source motion by the human brain. Neuron 2002,34(1):139-148. 10.1016/S0896-6273(02)00637-2

  8. Chowning JM: The simulation of moving sound sources. Journal of the Audio Engineering Society 1971,19(1):2-6.

  9. Väljamäe A, Larsson P, Västfjäll D, Kleiner M: Travelling without moving: auditory scene cues for translational self-motion. Proceedings of the 11th International Conference on Auditory Display (ICAD '05), July 2005, Limerick, Ireland

  10. Schaeffer P: Traité des Objets Musicaux. Seuil, Paris, France; 1966.

  11. Jot J-M, Warusfel O: A real-time spatial sound processor for music and virtual reality applications. Proceedings of the International Computer Music Conference (ICMC '95), September 1995, Banff, Canada 294-295.

  12. Huopaniemi J, Savioja L, Karjalainen M: Modeling of reflections and air absorption in acoustical spaces: a digital filter design approach. Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA '97), October 1997, New Paltz, NY, USA 4.

  13. Stevens SS: The relation of pitch to intensity. The Journal of the Acoustical Society of America 1935,6(3):150-154. 10.1121/1.1915715

  14. Neuhoff JG, McBeath MK: The Doppler illusion: the influence of dynamic intensity change on perceived pitch. Journal of Experimental Psychology: Human Perception and Performance 1996,22(4):970-985.

  15. Rosenblum LD, Carello C, Pastore RE: Relative effectiveness of three stimulus variables for locating a moving sound source. Perception 1987,16(2):175-186. 10.1068/p160175

  16. Merer A, Ystad S, Kronland-Martinet R, Aramaki M, Besson M, Velay J-L: Perceptual categorization of moving sounds for synthesis applications. Proceedings of the International Computer Music Conference (ICMC '07), August 2007, Copenhagen, Denmark 69-72.

  17. McAdams S, Bigand E: Thinking in Sound: The Cognitive Psychology of Human Audition. Oxford University Press, Oxford, UK; 1993.

  18. Aramaki M, Baillères H, Brancheriau L, Kronland-Martinet R, Ystad S: Sound quality assessment of wood for xylophone bars. The Journal of the Acoustical Society of America 2007,121(4):2407-2420. 10.1121/1.2697154

  19. Morse PM, Ingard KU: Theoretical Acoustics. McGraw-Hill, New York, NY, USA; 1968.

  20. Zicarelli D: An extensible real-time signal processing environment for max. In Proceedings of the International Computer Music Conference (ICMC '98), October 1998, Ann Arbor, Mich, USA. International Computer Music Association; 463-466.

  21. Zölzer U: Digital Audio Signal Processing. John Wiley & Sons, New York, NY, USA; 1997.

  22. ANSI-S1.26 : Method for calculation of the absorption of sound by the atmosphere. American National Standards Institute, New York, NY, USA, 1995

  23. Smith J, Serafin S, Abel J, Berners D: Doppler simulation and the leslie. Proceeding of the 5th International Conference on Digital Audio Effects (DAFx '02), September 2002, Hamburg, Germany

  24. Strauss H: Implementing Doppler shifts for virtual auditory environments. In Proceedings of the 104th Audio Engineering Society Convention (AES '98), May 1998, Amsterdam, The Netherlands. Audio Engineering Society; paper no. 4687

  25. Tsingos N: Simulation de champs sonores de haute qualité pour des applications graphiques interactives, Ph.D. thesis. Université de Grenoble 1, Saint-Martin-d'Hères, France; 1998.

  26. Laakso TI, Välimäki V, Karjalainen M, Laine UK: Splitting the unit delay: tools for fractional delay filter design. IEEE Signal Processing Magazine 1996,13(1):30-60. 10.1109/79.482137

  27. Henricksen CA: Unearthing the mysteries of the leslie cabinet. Recording Engineer/Producer Magazine 1981, 130-134.

  28. Ville J: Théorie et applications de la notion de signal analytique. Cables et Transmission 1948,2(1):61-74.

  29. Arroabarren I, Rodet X, Carlosena A: On the measurement of the instantaneous frequency and amplitude of partials in vocal vibrato. IEEE Transactions on Audio, Speech and Language Processing 2006,14(4):1413-1421.

  30. Allen JB, Berkley DA: Image method for efficiently simulating small-room acoustics. The Journal of the Acoustical Society of America 1979,65(4):943-950. 10.1121/1.382599

  31. Ballou G: Handbook for Sound Engineers. Focal Press, Woburn, Mass, USA; 1991.

  32. Disch S, Zölzer U: Modulation and delay line based digital audio effects. Proceedings of the 2nd COST-G6 Workshop on Digital Audio Effects (DAFx '99), December 1999, Trondheim, Norway 5-8.

  33. Warusfel O, Misdariis N: Directivity synthesis with a 3D array of loudspeakers-application for stage performance. Proceedings of the COST-G6 Conference on Digital Audio Effects (DAFx '01), December 2001, Limerick, Ireland

  34. Gobin P, Kronland-Martinet R, Lagesse G-A, Voinier T, Ystad S: Designing musical interfaces with composition in mind. In Computer Music Modeling and Retrieval, Lecture Notes in Computer Science. Volume 2771. Springer, Berlin, Germany; 2003:225-246.

  35. Vallée C: The cosmophone: towards a sensuous insight into hidden reality. Leonardo 2002,35(2):129.

  36. Blauert J: Spatial Hearing. The MIT Press, Cambridge, Mass, USA; 1983.


Acknowledgments

Part of this work has been supported by the French National Research Agency (A.N.R.) in the framework of the "senSons" project (JC05-41996), headed by S. Ystad (see http://www.sensons.cnrs-mrs.fr). The cosmophone was developed by D. Calvet, R. Kronland-Martinet, C. Vallée, and T. Voinier, based on an original idea by C. Vallée. The authors thank T. Guimezanes for his participation in the Leslie cabinet measurements.

Author information


Correspondence to R Kronland-Martinet.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Cite this article

Kronland-Martinet, R., Voinier, T. Real-Time Perceptual Simulation of Moving Sources: Application to the Leslie Cabinet and 3D Sound Immersion. J AUDIO SPEECH MUSIC PROC. 2008, 849696 (2008). https://doi.org/10.1155/2008/849696