N-dimensional N-microphone sound source localization
 Ali Pourmohammad^{1} and
 Seyed Mohammad Ahadi^{1}
DOI: 10.1186/1687-4722-2013-27
© Pourmohammad and Ahadi; licensee Springer. 2013
Received: 24 June 2013
Accepted: 21 November 2013
Published: 6 December 2013
Abstract
This paper investigates real-time N-dimensional wideband sound source localization in outdoor (far-field) and low-degree reverberation cases, using a simple N-microphone arrangement. Outdoor sound source localization in different climates needs highly sensitive and high-performance microphones, which are very expensive, so reducing the microphone count is our goal. Time delay estimation (TDE)-based methods are common for N-dimensional wideband sound source localization in outdoor cases using at least N + 1 microphones. These methods need numerical analysis to solve closed-form nonlinear equations, which leads to large computational overheads, and a good initial guess to avoid local minima. Combined TDE and intensity level difference or interaural level difference (ILD) methods can reduce the microphone count to two for indoor two-dimensional cases. However, ILD-based methods require that only one dominant source be active for accurate localization. Also, using a linear array, two mirror points are produced simultaneously (half-plane localization). We apply this method to outdoor cases and propose a novel approach for N-dimensional entire-space outdoor far-field and low-reverberation localization of a dominant wideband sound source using TDE, ILD, and head-related transfer function (HRTF) simultaneously and only N microphones. Our proposed TDE-ILD-HRTF method tries to solve the mentioned problems using source counting, noise reduction using spectral subtraction, and HRTF. A special reflector is designed to avoid mirror points, and source counting is used to make sure that only one dominant source is active in the localization area. The simple microphone arrangement used linearizes the nonlinear closed-form equations and removes the need for an initial guess. Experimental results indicate that our implemented method features less than 0.2 degree error for angle of arrival and less than 10% error for three-dimensional location finding, as well as less than 150-ms processing time, for localization of a typical wideband sound source such as a flying object (helicopter).
Keywords
Sound source localization, ITD, TDE, TDOA, PHAT, ILD, HRTF
1. Introduction
Source localization has been one of the fundamental problems in sonar [1], radar [2], teleconferencing or videoconferencing [3], mobile phone location [4], navigation and global positioning systems (GPS) [5], localization of earthquake epicenters and underground explosions [6], microphone arrays [7], robots [8], microseismic events in mines [9], sensor networks [10, 11], tactile interaction in novel tangible human-computer interfaces [12], speaker tracking [13], surveillance [14], and sound source tracking [15]. Our goal is real-time sound source localization in outdoor environments, which requires several points to be considered. For localizing such sound sources, a far-field assumption is usual. Furthermore, our experiments confirm that placing the localization system at a suitable height often reduces the degree of reverberation, especially for flying objects. Also, many such sound source signals are wideband. Moreover, outdoor high-accuracy sound source localization in different climates needs highly sensitive and high-performance microphones, which are very expensive. Reducing the number of microphones is therefore very important, but with conventional methods it reduces localization accuracy. Here, we introduce a real-time, accurate wideband sound source localization system for low-degree reverberation far-field outdoor cases using fewer microphones.
The structure of this paper is as follows. After a literature review, we explain HRTF-, ILD-, and TDE-based methods and discuss the TDE-based phase transform (PHAT). In Section 4, we explain sound source angle of arrival and location calculations using ILD and PHAT. Section 5 introduces the TDE-ILD-based method for two-dimensional (2D) half-plane sound source localization using only two microphones. Section 6 covers simulation of this method for 2D cases, where, according to the simulation results and due to the use of ILD, we introduce source counting. In Section 7, we propose, and in Section 8, we implement, our TDE-ILD-HRTF-based method for 2D whole-plane and three-dimensional (3D) entire-space sound source localization. Section 9 covers the implementations. Finally, conclusions are drawn in Section 10.
2. Literature review
Passive sound source localization methods, in general, can be divided into direction of arrival (DOA) [16], time difference of arrival (TDOA) or TDE or interaural time difference (ITD) [17–20], ILD [21–24], and HRTF-based methods [25–30]. DOA-based beamforming and subspace methods typically need a large number of microphones for estimation of narrowband source locations in far-field cases and wideband source locations in near-field cases. Also, they have higher processing needs in comparison to other methods. Many localization methods for near-field cases have been proposed in the literature, such as maximum likelihood (ML), covariance approximation, multiple signal classification (MUSIC), and estimation of signal parameters via rotational invariance techniques (ESPRIT) [16]. However, these methods are not applicable to the localization of wideband signal sources in far-field cases with a small number of microphones. On the other hand, ILD-based methods are mostly applicable to the case of a single dominant sound source (high signal-to-noise ratio (SNR)) [21–24]. TDE-based methods with high sampling rates are commonly used for 2D and 3D high-accuracy wideband near-field and far-field sound source localization. In the case of ILD- or TDE-based methods, the minimum number of microphones required is three for 2D positioning and four for the 3D case [17–20, 31–39]. Finally, HRTF-based methods are applicable only to calculating the arrival angle in azimuth or elevation [30].
TDE- and ILD-based outdoor far-field accurate wideband sound source localization in different climates needs highly sensitive and high-performance microphones, which are very expensive. In the last decade, some papers were published which introduce 2D sound source localization methods using just two microphones in indoor cases using TDE- and ILD-based methods simultaneously. However, it is noticeable that using ILD-based methods requires only one dominant source to be active, and it is known that, by using a linear array in the proposed TDE-ILD-based method, two mirror points will be produced simultaneously (half-plane localization) [40]. In this paper, we apply this method in outdoor (low-degree reverberation) cases for a dominant sound source. We also propose a novel method to achieve 2D whole-plane (without producing two mirror points) and 3D entire-space dominant sound source localization using TDE-, ILD-, and HRTF-based methods simultaneously (TDE-ILD-HRTF method). Based on the proposed method, a special reflector for the implemented simple microphone arrangement is designed, and a source counting method is used to verify that only one dominant sound source is active in the localization area.
In TDE and ILD localization approaches, calculations are carried out in two stages: estimation of time delay or intensity level differences, and location calculation. Correlation is most widely used for time delay estimation [17–20, 31–39]. The most important issue in this approach is high-accuracy time delay estimation between microphone pairs. Meanwhile, the most important issue in the ILD-based approach is high-accuracy level difference measurement between microphone pairs [21–24]. Also, numerous results were published in the last decades for the second stage, i.e., location calculation. Equation complexities and large processing times are the important obstacles faced at this stage. In this paper, we propose a simple microphone arrangement that solves both these problems simultaneously.
In the above-mentioned first stage, the classic methods of source localization from time delay estimates by detecting radio waves were Loran and Decca [31]. However, generalized cross-correlation (GCC) using an ML estimator, proposed by Knapp and Carter [32], is the most widely used method for TDE. Later, a number of techniques were proposed to improve GCC in the presence of noise [3, 33–36]. As GCC is based on an ideal signal propagation model, it is believed to have a fundamental weakness: an inability to cope well with reverberant environments. Some improvements were obtained by cepstral prefiltering by Stephenne and Champagne [37]. Even though more sophisticated techniques exist, they tend to be computationally expensive and are thus not well suited for real-time applications. Later, PHAT was proposed by Omologo and Svaizer [38]. More recently, a new PHAT-based method was proposed for high-accuracy robust speaker localization, known as steered response pattern-phase or power-phase transform (SRP-PHAT) [39]. Its disadvantage is higher processing time in comparison to PHAT, as it requires a search over a large number of candidate locations. Because the number of candidate locations in our application is much higher, owing to joint direction and distance estimation, this disadvantage rules it out for real-time use.
In the last decade, because the received signal energy is inversely proportional to the squared distance between the source and the receiving sensor, there has been some interest in using the received signal level at different sensors for source localization. Due to the spatial separation of the sensors, the source signal arrives at the sensors with different levels, so the level differences can be utilized for source localization. Sheng and Hu [41], followed by Blatt and Hero [42], have proposed different algorithms for locating sources using a sensor network based on energy measurements. Birchfield and Gangishetty [22] applied ILD to sound source localization. While these works used only ILD-based methods to locate a source, Cui et al. [40] tried 2D sound source localization by a pair of microphones using TDE- and ILD-based methods simultaneously. When the source signal is captured at the sensors, both time delay and signal level information are used for source localization. This technique is applicable for 2D localization with two sensors only. Also, due to the use of a linear array, it generates two mirror points simultaneously (half-plane localization). Ho and Sun [24] addressed a more common scenario of 3D localization using more than four sensors to improve the source location accuracy.
The human hearing system can find the direction of arrival of sound sources in 3D with just two ears. The pinnae, shoulders, and head diffract the incoming sound waves [43]. These propagation effects are collectively termed the HRTF. Batteau reported that the external ears play an important role in estimating the elevation angle of arrival [44]. Roffler and Butler [45], Oldfield and Parker [46], and Hofman et al. [47] tried to find experimental evidence for this claim by using a Plexiglas headband to flatten the pinna against the head [45]. Based on HRTF measurements, Brown and Duda made an extensive experimental study and provided empirical formulas for the multipath delays produced by the pinna [48]. Although more sophisticated HRTF models have been proposed [49], the Brown-Duda model has the advantage that it provides an analytical relationship between the multipath delays and the azimuth and elevation angles. Recently, Sen and Nehorai considered the Brown-Duda HRTF model as an example to model the frequency-dependent head-shadow effects and the multipath delays close to the sensors for analyzing a 3D direction finding system with only two sensors inspired by the human auditory system [43]. However, they did not consider white noise gain error or spatial aliasing error in their model. They computed the asymptotic frequency-domain Cramér-Rao bound (CRB) on the error of the 3D direction estimate for zero-mean wide-sense stationary Gaussian source signals. It should be noted that HRTF-based works are only able to estimate the azimuth and elevation angles [50]. In the last decades, some papers were published which tried to apply HRTF along with TDE for azimuth and elevation angle of arrival estimation [30]. Exploiting this ability, we apply HRTF in our TDE-ILD-based localization system to resolve the ambiguity between the two mirror location points. We name this the TDE-ILD-HRTF method.
Given a set of TDEs and ILDs from a small set of microphones using PHAT- and ILD-based methods, respectively, the second stage of a two-stage algorithm determines the best point for the source location. The measurement equations are nonlinear. The most straightforward way is to perform an exhaustive search in the solution space. However, this is computationally expensive and inefficient. If the sensor array is known to be linear, the position measurement equations are simplified. Carter focused on a simple beamforming technique [1]. However, it requires a search in the range and bearing space. Also, beamforming methods need many more microphones for high-accuracy source localization. The linearization solution based on Taylor-series expansion by Foy [51] involves iterative processing, typically incurs high computational complexity, and, for convergence, requires a tolerable initial estimate of the position. Hahn proposed an approach [20] that assumes a distant source. Abel and Smith proposed an explicit solution that can achieve the Cramér-Rao lower bound (CRLB) in the small error region [52]. The situation is more complex when sensors are distributed arbitrarily. In this case, emitter position is determined from the intersection of a set of hyperbolic curves defined by the TDOA estimates using non-Euclidean geometry [53, 54]. Finding the solution is not easy as the equations are nonlinear. Schmidt has proposed a formulation [18] in which the source location is found as the focus of a conic passing through three sensors. This method can be extended to an optimal closed-form localization technique [55]. Delosme [56] proposed a gradient method for search in a localization procedure leading to computation of optimal source locations from noisy TDOAs. Fang [57] gave an exact solution when the number of TDOA measurements is equal to the number of unknowns (coordinates of the transmitter). This solution, however, cannot make use of extra measurements, available when there are extra sensors, to improve position accuracy. The more general situation with extra measurements was considered by Friedlander [58], Schau and Robinson [59], and Smith and Abel [55]. These methods are not optimum in the least-squares sense and perform worse in comparison to the Taylor-series method.
Although closed-form solutions have been developed, their estimators are not optimum. The divide and conquer (DAC) method [60] from Abel can achieve optimum performance, but it requires that the Fisher information be sufficiently large. To obtain a precise position estimate at reasonable noise levels, the Taylor-series method [51] is commonly employed. It is an iterative method that starts with an initial guess and improves the estimate at each step by determining the local linear least-squares (LS) solution. An initial guess close to the true solution is needed to avoid local minima, and selecting such a starting point is not simple in practice. Moreover, convergence of the iterative process is not assured, and the repeated LS solutions impose a large computational burden. Within the last few years, some papers were published on improving LS and closed-form methods [16, 23, 61–65]. Based on the closed-form hyperbolic-intersection method, we will explain our proposed method, which, using a simple arrangement of two microphones for 2D cases and three microphones for 3D cases, reduces the nonlinear equations of this method to a linear equation. Although there have been attempts to linearize closed-form nonlinear equations through algebraic means, such as [7, 16, 56, 63], our proposed method, with its simple, purely geometrical linearization, needs fewer microphones and features accurate localization and less processing time.
3. Basic methods
3.1. HRTF
3.2. ILD-based localization
Now using (12), (13), and (14), we can localize the sound source (Section 4.1.).
3.3. TDE-based localization
In an overall view, the time delay estimation methods are as follows [17–20, 31–39]:

Correlation-based methods: cross-correlation (CC), ML, PHAT, average square difference function (ASDF)

Adaptive filter-based methods: Sync filter, LMS.
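As an illustration of the correlation-based family, the following is a minimal sketch of PHAT-weighted cross-correlation for estimating the delay between two frames. It is not the authors' implementation: the function names `dft` and `gcc_phat_delay` are hypothetical, the naive O(N^2) transform is chosen only to keep the sketch dependency-free (a real system would use an FFT), and the delay is assumed to be a whole number of samples within a circularly shifted frame.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short demo frames)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(s1, s2):
    """Return the lag in samples by which s2 trails s1 (negative if s2 leads)."""
    n = len(s1)
    spec1, spec2 = dft(s1), dft(s2)
    weighted = []
    for a, b in zip(spec1, spec2):
        cross = b * a.conjugate()                 # cross-power spectrum bin
        mag = abs(cross)
        # PHAT weighting: discard magnitude, keep only the phase
        weighted.append(cross / mag if mag > 1e-12 else 0j)
    corr = [v.real for v in dft(weighted, inverse=True)]   # circular correlation
    peak = max(range(n), key=lambda i: corr[i])
    return peak if peak <= n // 2 else peak - n            # map index to signed lag
```

Dividing the lag by the sampling rate gives the time delay in seconds; because the PHAT weighting whitens the spectrum, the correlation peak is sharp for wideband signals.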
4. ILD- and PHAT-based angle of arrival and location calculation methods
4.1. Using ILD method
Therefore, the source location is on a circle with center coordinates (k, 0) and radius $\sqrt{l}$. Adding a third microphone and pairing it with either the first or the second microphone gives a second circle with a different center and a different radius. The intersection of the first and second circles gives the source location x and y [22, 40].
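The circle can be made concrete with a short sketch. The equations defining k and l are not reproduced on this page, so the code below makes its own assumptions: microphone 1 at (R, 0), microphone 2 at (-R, 0), received energy proportional to 1/r^2, and m = E_1/E_2. Under those assumptions the locus of constant m is a circle with center abscissa k = R(m + 1)/(m - 1) and squared radius l = 4mR^2/(m - 1)^2; the paper's own k and l may be defined for a different geometry. The function name `ild_circle` is hypothetical.

```python
def ild_circle(m, R):
    """Circle of candidate source positions for an energy ratio m = E1/E2.

    Assumes mic1 at (R, 0), mic2 at (-R, 0) and E ~ 1/r^2, so m = (r2/r1)^2.
    Returns (k, l): the centre abscissa and the squared radius.
    """
    if m <= 0 or m == 1:
        # m == 1 means equal energies: the locus is the line x = 0, not a circle
        raise ValueError("m must be positive and different from 1")
    k = R * (m + 1) / (m - 1)
    l = 4 * m * R**2 / (m - 1) ** 2
    return k, l
```

For example, a source at (3, 4) with R = 1 gives m = 32/20 = 1.6, and the point satisfies (x - k)^2 + y^2 = l for the returned values.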
4.2. Using PHAT method
Note that these are nonlinear equations (hyperbolic-intersection closed-form method), so numerical analysis must be used to calculate x and y, which increases localization processing time. Also, in this case, the solution may not converge.
5. TDE-ILD-based 2D sound source localization
The positive root gives the square of the distance from the source to the origin. Substituting Rs into (51), the final source coordinate will be obtained [40].
Recall that with a linear array, two mirror points are produced simultaneously, so a 2D sound source can be localized only in a half-plane.
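The closed-form nature of the two-microphone solution can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes microphone 1 at (R, 0), microphone 2 at (-R, 0), a delay defined as tau21 = (r_1 - r_2)/v_sound, and m = E_1/E_2 = (r_2/r_1)^2, which together give r_1 = tau21 * v_sound/(1 - sqrt(m)), r_2 = sqrt(m) * r_1, x = (r_2^2 - r_1^2)/(4R), and y = ±sqrt(r_1^2 - (x - R)^2). The function name `tde_ild_2d` and the default speed of sound are hypothetical.

```python
import math


def tde_ild_2d(tau21, m, R, v_sound=343.0):
    """Solve for the source position from one delay and one energy ratio.

    Assumed geometry: mic1 at (R, 0), mic2 at (-R, 0),
    tau21 = (r1 - r2) / v_sound and m = E1/E2 = (r2/r1)^2.
    Returns (x, y_abs); the true y is +y_abs or -y_abs (mirror ambiguity).
    """
    root_m = math.sqrt(m)
    if abs(1.0 - root_m) < 1e-12:
        raise ValueError("m = 1: equidistant source, range cannot be resolved")
    r1 = tau21 * v_sound / (1.0 - root_m)      # distance to mic1
    r2 = root_m * r1                           # distance to mic2
    x = (r2**2 - r1**2) / (4.0 * R)
    y_abs = math.sqrt(max(r1**2 - (x - R) ** 2, 0.0))   # guard against tiny negatives
    return x, y_abs
```

No iteration or initial guess is involved; the remaining sign ambiguity in y is exactly the mirror-point problem of the linear array.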
6. Simulations of TDE-ILD-based method and discussion
Then, we calculated d_{1} and d_{2} using (24) and (25), and using (37), we calculated the time delay between the signals received by the two microphones. For positive time delay values, i.e., a sound source nearer to the first microphone (mic1 in Figure 1), we delayed the second microphone signal with respect to the first microphone signal, and for negative values, i.e., a sound source nearer to the second microphone (mic2 in Figure 1), we did the opposite. Then, using (6) and (7), we divided the first microphone signal by d_{1} and the second microphone signal by d_{2} so that each signal is attenuated according to the source's distance from that microphone. Finally, we calculated the source location using the proposed TDE-ILD method (Section 5) at a variety of SNRs for several environmental noises.
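The signal-generation part of this procedure can be sketched as below. The microphone layout of Figure 1 is not reproduced here, so the sketch assumes microphone 1 at (R, 0) and microphone 2 at (-R, 0); the function name `simulate_two_mics`, the rounding of the delay to whole samples, and the default speed of sound are likewise assumptions made for illustration.

```python
import math


def simulate_two_mics(source, src_xy, R, fs, v_sound=343.0):
    """Return (s1, s2): the source as received at mic1 (R, 0) and mic2 (-R, 0).

    The channel farther from the source is delayed by the rounded
    inter-microphone delay in samples, and each channel is divided by its
    own source-to-microphone distance.
    """
    x, y = src_xy
    d1 = math.hypot(x - R, y)
    d2 = math.hypot(x + R, y)
    lag = round(abs(d2 - d1) / v_sound * fs)   # delay in whole samples
    pad = [0.0] * lag
    near = list(source) + pad                  # undelayed copy, padded at the end
    far = pad + list(source)                   # delayed copy
    if d1 <= d2:                               # source nearer to mic1
        s1, s2 = near, far
    else:                                      # source nearer to mic2
        s1, s2 = far, near
    return [v / d1 for v in s1], [v / d2 for v in s2]
```

Noise at a chosen SNR would then be added to both channels before running the localization step.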
7. Our proposed TDE-ILD-HRTF method
Using the TDE-ILD-based method, dual-microphone 2D sound source localization is applicable. However, it is known that, by using a linear array in the TDE-ILD-based method, two mirror points will be produced simultaneously (half-plane localization) [40]. Also, according to the TDE-ILD-based simulation results (Section 6), the ILD-based method requires that only one dominant high-SNR source be active in the localization area. Our proposed TDE-ILD-HRTF method tries to solve these problems using source counting, noise reduction using spectral subtraction, and HRTF.
7.1. Source counting method
Using (34), $\tau_{1} = T_{2} - T_{1}$ gives a maximum value for ${R1}_{s_{1}s_{2}}(\tau)$, $\tau_{2} = T'_{2} - T_{1}$ gives a maximum value for ${R2}_{s_{1}s_{2}}(\tau)$, $\tau_{3} = T_{2} - T'_{1}$ gives a maximum value for ${R3}_{s_{1}s_{2}}(\tau)$, and $\tau_{4} = T'_{2} - T'_{1}$ gives a maximum value for ${R4}_{s_{1}s_{2}}(\tau)$. Therefore, we will have four peak values in the cross-correlation vector. However, because (67) and (70) are cross-correlation functions of a signal with a delayed version of itself, whereas (68) and (69) are cross-correlation functions of two different signals, the maxima at τ_{1} and τ_{4} dominate those at τ_{2} and τ_{3}. We conclude that with two dominant sound sources in the area, the cross-correlation vector has two dominant values; likewise, for more than two dominant sound sources, the number of dominant values equals the number of sources, analogous to the multiple power spectrum peaks in DOA-based multiple sound source beamforming methods [16]. Therefore, by counting the dominant values of the cross-correlation vector, we can find the number of active dominant sound sources in the localization area.
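Counting dominant values can be sketched as a simple peak count on the cross-correlation vector. The dominance criterion below (a local maximum reaching a fixed fraction of the global maximum) and the default `ratio=0.5` are assumptions chosen for illustration; the paper does not state a threshold on this page, and the function name `count_dominant_peaks` is hypothetical.

```python
def count_dominant_peaks(corr, ratio=0.5):
    """Count local maxima of corr that reach at least ratio * global maximum.

    ratio is a hypothetical tuning threshold, not a value from the paper.
    """
    top = max(corr)
    if top <= 0:
        return 0
    count = 0
    for i, value in enumerate(corr):
        left = corr[i - 1] if i > 0 else float("-inf")
        right = corr[i + 1] if i < len(corr) - 1 else float("-inf")
        # strict on the left, non-strict on the right: a flat top counts once
        if value >= ratio * top and value > left and value >= right:
            count += 1
    return count
```

A result of 1 indicates a single dominant source, in which case localization can proceed; a larger count suggests skipping the current frame.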
7.2. Noise reduction using spectral subtraction
During the silent periods, i.e., periods without the target sound, the background noise spectrum can be estimated, assuming the noise is stationary. The noise magnitude spectrum can then be subtracted from the noisy input magnitude spectrum. For nonstationary noise, an adaptive background noise spectrum estimation procedure can be used [67].
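A minimal sketch of magnitude spectral subtraction for the stationary case follows. The averaging over silent frames and the clamping of negative results at a floor are common choices assumed here rather than details taken from the paper, and the function names are hypothetical. Inputs are magnitude spectra given as lists of non-negative numbers, one value per frequency bin.

```python
def average_noise_spectrum(silent_frames):
    """Estimate a stationary noise magnitude spectrum as the mean over silent frames."""
    n_frames = len(silent_frames)
    return [sum(bins) / n_frames for bins in zip(*silent_frames)]


def spectral_subtract(noisy_mag, noise_mag, floor=0.0):
    """Subtract the noise magnitude bin by bin, clamping the result at floor."""
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    return [max(s - n, floor) for s, n in zip(noisy_mag, noise_mag)]
```

In a typical pipeline the cleaned magnitude is recombined with the phase of the noisy frame before further processing.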
7.3. Using HRTF method
High ΔH(f) values indicate that the sound source is in front, while negligible values indicate that the sound source is at the back. One important point is that careful design of the slits is necessary so that both microphones have the same spectrum when the sound source is at the back.
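The front/back decision can be sketched as a threshold test on the level difference between the two microphone spectra, using |ΔH(f)| = |10 log10(H_1(f)/H_3(f))| as in the algorithm of Section 8. Averaging over frequency bins and the 3 dB default threshold are assumptions made for this sketch, not values from the paper, and both function names are hypothetical. All magnitudes must be strictly positive.

```python
import math


def delta_h_db(h1_mag, h3_mag):
    """Mean of |10 * log10(H1 / H3)| over the supplied frequency bins, in dB."""
    diffs = [abs(10.0 * math.log10(a / b)) for a, b in zip(h1_mag, h3_mag)]
    return sum(diffs) / len(diffs)


def source_is_in_front(h1_mag, h3_mag, threshold_db=3.0):
    """True when the spectral difference is large; threshold_db is a guess."""
    return delta_h_db(h1_mag, h3_mag) > threshold_db
```

The outcome selects the sign of y and so removes the mirror-point ambiguity of the two-microphone solution.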
7.4. Extension of dimensions to three
The reasons for choosing the shape of sphere for the reflector are as follows [69]. The simplest type of reflector is a plane reflector introduced to direct signal in a desired direction. Clearly, using this type of reflector, the distance between reflector and microphone, d in (73), varies with respect to source position in 3D cases leading to a change in notch position within spectrum. The change in notch position may not be suitable as it might occur out of the spectral band of interest. To better adjust the energy in the forward direction, the geometrical shape of the plane reflector must be changed so as not to allow radiation in the back and side directions.
8. TDE-ILD-HRTF approach algorithm
 1. Set up the microphones and hardware.
 2. Calculate the amplification normalizing factor of the sound recording hardware set (microphone, preamplifier, and sound card).
 3. Apply voice activity detection. Is there a valuable signal? Yes → go to 4; no → go to 3.
 4. Obtain s_{1}(t), s_{2}(t), and s_{3}(t) → m = E_{1}/E_{2}.
 5. Remove DC from the signals. Then normalize them with respect to the sound intensity.
 6. Apply a Hamming window to the signals over their stationary parts (at least about 100 ms for wideband quasi-periodic helicopter sound, or twice that).
 7. Apply FFT to the signals.
 8. Cancel noise using spectral subtraction.
 9. Apply PHAT to the signals in order to calculate τ_{21} and τ_{31} in the frequency domain (index of the first maximum value). Then find the second maximum value of the cross-correlation vector.
 10. If the first maximum value is not dominant enough with respect to the second maximum value, go to the next windows of the signals and do not calculate the sound source location; otherwise:
  a. $\Phi = \cos^{-1}\left(\frac{v_{\mathrm{sound}}\,\tau_{21}}{2R}\right)$ and $\theta = \cos^{-1}\left(\frac{v_{\mathrm{sound}}\,\tau_{31}}{R}\right)$
  b. $v_{\mathrm{sound}} = 20.05\sqrt{273.15 + \mathrm{Temperature}\,(\mathrm{Centigrade})}$
  c. $r_{1} = \frac{\tau_{21}\,v_{\mathrm{sound}}}{1 - \sqrt{m}}$ and $r_{2} = \frac{\tau_{21}\,v_{\mathrm{sound}}\,\sqrt{m}}{1 - \sqrt{m}}$
  d. $x = \left(r_{2}^{2} - r_{1}^{2}\right)/4R$ and $y = \pm\sqrt{r_{1}^{2} - (x - R)^{2}}$
  e. $\left|\Delta H(f)\right| = \left|10\log_{10}\frac{H_{1}(f)}{H_{3}(f)}\right|$
  f. if $\left|\Delta H(f)\right| \approx 0$, then $y = -\sqrt{r_{1}^{2} - (x - R)^{2}}$; else $y = \sqrt{r_{1}^{2} - (x - R)^{2}}$ $\Rightarrow \left\{\begin{array}{l} x_{s} = r\cos(\Phi)\sin(\theta) \\ y_{s} = r\sin(\Phi)\sin(\theta) \\ z_{s} = r\cos(\theta) \end{array}\right.$
  g. Go to 3.
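The angle and coordinate computations of step 10 can be sketched as follows. The formulas are those listed in the algorithm (speed of sound from temperature, arc-cosine angle estimates from the two delays, and spherical-to-Cartesian conversion); the function names are hypothetical, and the clamping of the arc-cosine argument to [-1, 1] is an added safeguard against rounding and measurement noise, not part of the paper.

```python
import math


def speed_of_sound(temp_celsius):
    """Speed of sound in m/s: 20.05 * sqrt(273.15 + T[deg C])."""
    return 20.05 * math.sqrt(273.15 + temp_celsius)


def arrival_angles(tau21, tau31, R, v_sound):
    """Azimuth and elevation in degrees from the delays tau21 and tau31."""
    def safe_acos(c):
        # clamp so that noisy delays cannot push the argument outside [-1, 1]
        return math.acos(max(-1.0, min(1.0, c)))
    phi = safe_acos(v_sound * tau21 / (2.0 * R))
    theta = safe_acos(v_sound * tau31 / R)
    return math.degrees(phi), math.degrees(theta)


def to_cartesian(r, phi_deg, theta_deg):
    """Convert range r, azimuth phi and elevation theta to (x_s, y_s, z_s)."""
    phi, theta = math.radians(phi_deg), math.radians(theta_deg)
    return (r * math.cos(phi) * math.sin(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(theta))
```

At 20 degrees Centigrade the first function gives roughly 343.3 m/s, and zero delays on both pairs correspond to azimuth and elevation of 90 degrees.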
9. Hardware and software implementations and results
Table 1 Results of the azimuth angle of arrival based on hardware implementation
Real angle φ (degrees)  Proposed method result for φ (degrees)  Absolute error (degrees) 

0  0.18  0.18 
15  15.13  0.13 
30  29.85  0.15 
45  45.18  0.18 
60  60.14  0.14 
75  74.91  0.09 
90  89.83  0.17 
105  104.82  0.18 
120  120.14  0.14 
135  135.11  0.11 
150  150.14  0.14 
165  165.09  0.09 
180  179.93  0.07 
Table 2 Results of the elevation angle of arrival based on hardware implementation
Real angle θ (degrees)  Proposed method result for θ (degrees)  Absolute error (degrees) 

−10  −10.18  0.18 
−5  −5.16  0.16 
0  0.05  0.05 
15  14.15  0.15 
30  29.91  0.09 
45  45.19  0.19 
60  60.13  0.13 
75  75.11  0.11 
90  89.82  0.18 
Table 3 Results for proposed 3D sound source localization method (Section 7) using noise reduction procedure
Real location (m)  Proposed method results (m)  Absolute error (m)  

x  y  z  x  y  z  x  y  z 
1  10  5  1.9  9.2  4.1  0.9  0.8  0.9 
2  10  5  2.9  9.2  4.3  0.9  0.8  0.7 
3  10  5  2.2  10.7  5.6  0.8  0.7  0.6 
4  10  5  3.2  10.7  5.8  0.8  0.7  0.8 
5  10  5  5.5  9.5  5.8  0.5  0.5  0.8 
6  10  5  6.5  10.5  4.5  0.5  0.5  0.5 
7  10  5  7.7  10.8  4.4  0.7  0.8  0.6 
8  10  5  8.6  10.8  4.4  0.6  0.8  0.6 
9  10  5  9.8  9.1  4.2  0.8  0.9  0.8 
10  10  5  9.5  9.4  5.9  0.5  0.6  0.9 
10  15  5  10.6  15.7  5.5  0.6  0.7  0.5 
10  20  5  10.7  19.2  5.6  0.7  0.8  0.6 
10  25  5  9.2  23.9  5.7  0.8  1.1  0.7 
10  30  5  10.5  31.7  4.3  0.5  1.7  0.7 
10  9  5  10.8  9.7  5.8  0.8  0.7  0.8 
10  8  5  9.3  8.7  5.8  0.7  0.7  0.8 
10  7  5  10.6  6.5  5.9  0.6  0.5  0.9 
10  6  5  9.5  6.6  4.1  0.5  0.6  0.9 
10  5  5  10.5  4.5  4.2  0.5  0.5  0.8 
10  4  5  10.6  3.4  5.7  0.6  0.6  0.7 
10  3  5  9.3  3.6  5.5  0.7  0.6  0.5 
10  2  5  10.8  ±2.8  5.8  0.8  0.8  0.8 
10  1  5  10.1  ±0.2  4.3  0.1  0.8  0.7 
10  −1  5  10.6  ±0.4  4.1  0.6  0.6  0.9 
10  −2  5  10.5  ±2.7  5.6  0.5  0.7  0.6 
10  −3  5  10.4  −2.8  5.5  0.4  0.2  0.5 
10  −4  5  9.7  −3.7  5.5  0.3  0.3  0.5 
10  −5  5  9.4  −5.2  5.9  0.6  0.2  0.9 
10. Conclusion
In this paper, we reported on the simulation of TDE-ILD-based 2D half-plane sound source localization using only two microphones. Reduction of the microphone count was our goal. Therefore, we also proposed and implemented TDE-ILD-HRTF-based 3D entire-space sound source localization using only three microphones. Also, we used spectral subtraction and source counting methods in low-degree reverberation outdoor cases to increase localization accuracy. According to Table 3, implementation results show that the proposed method has led to less than 10% error for 3D location finding. This is a higher accuracy in source location measurement in comparison with similar studies which did not use spectral subtraction and source counting. Also, we showed that partly covering one of the microphones with a half-sphere reflector leads to entire-space N-dimensional sound source localization using only N microphones.
Authors’ information
AP was born in Azerbaijan. He has a Ph.D. in Electrical Engineering (Signal Processing) from the Electrical Engineering Department, Amirkabir University of Technology, Tehran, Iran. He is now an invited lecturer at the Electrical Engineering Department, Amirkabir University of Technology and has been teaching several courses (C++ programming, multimedia systems, digital signal processing, digital audio processing, and digital image processing). His research interests include statistical signal processing and applications, digital signal processing and applications, digital audio processing and applications (acoustic modeling, speech coding, text to speech, audio watermarking and steganography, sound source localization, determined and underdetermined blind source separation and scene analyzing), digital image processing and applications (image and video coding, image watermarking and steganography, video watermarking and steganography, object tracking and scene matching) as well as BCI.
SMA received his B.S. and M.S. degrees in Electronics from the Electrical Engineering Department, Amirkabir University of Technology, Tehran, Iran, in 1984 and 1987, respectively, and his Ph.D. in Engineering from the University of Cambridge, Cambridge, U.K., in 1996, in the field of speech processing. Since 1988, he has been a Faculty Member at the Electrical Engineering Department, Amirkabir University of Technology, where he is currently an associate professor and teaches several courses and conducts research in electronics and communications. His research interests include speech processing, acoustic modeling, robust speech recognition, speaker adaptation, speech enhancement as well as audio and speech watermarking.
References
 Carter GC: Time delay estimation for passive sonar signal processing. IEEE T Acoust S. 1981, ASSP29: 462470.Google Scholar
 Weinstein E: Optimal source localization and tracking from passive array measurements. IEEE T. Acoust. S. 1982, ASSP30: 6976.View ArticleGoogle Scholar
 Wang H, Chu P: Voice source localization for automatic camera pointing system in videoconferencing, in Proceedings of the ICASSP, New Paltz, NY, 19–22. 1997.Google Scholar
 Caffery J Jr, Stüber G: Subscriber location in CDMA cellular networks. IEEE Trans. Veh. Technol. 1998, 47: 406416.View ArticleGoogle Scholar
 Tsui JB: Fundamentals of Global Positioning System Receivers. New York: Wiley; 2000.View ArticleGoogle Scholar
 Klein F: Finding an earthquake's location with modern seismic networks. Northern California: Earthquake Hazards Program; 2000.Google Scholar
 Huang Y, Benesty J, Elko GW, Mersereau RM: Realtime passive source localization: A practical linearcorrection leastsquares approach. IEEE T. Audio P. 2001, 9: 943956.View ArticleGoogle Scholar
 Michaud VJM, Rouat F, Letourneau J: Robust Sound Source Localization Using a Microphone Array on a Mobile Robot. Las Vegas: in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 2; 12281233.Google Scholar
 Daku BLF, Salt JE, Sha L, Prugger AF: An algorithm for locating microseismic events, in Proceedings of the IEEE Canadian Conference on Electrical and Computer Engineering . Niagara Falls. May 2004, 2–5: 23112314.Google Scholar
 Patwari N, Ash JN, Kyperountas S, Hero AO III, Moses RL, Correal NS: Locating the nodes. IEEE Signal Process. Mag. 2005, 22: 54–69.
 Gezici S, Zhi T, Giannakis G, Kobayashi H, Molisch A, Poor H, Sahinoglu Z: Localization via ultra-wideband radios: a look at positioning aspects for future sensor networks. IEEE Signal Process. Mag. 2005, 22: 70–84.
 Lee JY, Ji SY, Hahn M, Young-Jo C: Real-Time Sound Localization Using Time Difference for Human-Robot Interaction. Prague: in Proceedings of the 16th IFAC World Congress; 2005.
 Ma WK, Vo BN, Singh SS, Baddeley A: Tracking an unknown time-varying number of speakers using TDOA measurements: a random finite set approach. IEEE Trans. Signal. Process. 2006, 54(9):3291–3304.
 Ho KC, Lu X, Kovavisaruch L: Source localization using TDOA and FDOA measurements in the presence of receiver location errors: analysis and solution. IEEE Trans Signal. Process. 2007, 55(2):684–696.
 Cevher V, Sankaranarayanan AC, McClellan JH, Chellappa R: Target tracking using a joint acoustic video system. IEEE Trans. Multimedia. 2007, 9(4):715–727.
 Zhengyuan X, Liu N, Sadler BM: A Simple Closed-Form Linear Source Localization Algorithm. Orlando, FL: in IEEE MILCOM; 2007:1–7.
 Etten JPV: Navigation systems: fundamentals of low and very-low frequency hyperbolic techniques. Elec. Comm. 1970, 45(3):192–212.
 Schmidt RO: A new approach to geometry of range difference location. IEEE Trans. Aerosp. Electron Syst. 1972, AES-8(6):821–835.
 Lee HB: A novel procedure for assessing the accuracy of hyperbolic multilateration systems. IEEE Trans. Aerosp. Electron Syst. 1975, 110(1):2–15.
 Hahn WR: Optimum signal processing for passive sonar range and bearing estimation. J. Acoust. Soc. Am. 1975, 58: 201–207.
 Julian P: A Comparative Study of Sound Localization Algorithms for Energy Aware Sensor Network Nodes. IEEE Trans. Circuits and Systems I. 2004, 51(4).
 Birchfield ST, Gangishetty R: Acoustic Localization by Interaural Level Difference. Philadelphia, PA: in Proceedings of the ICASSP; 2005:1109–1112.
 Ho KC, Sun M: An accurate algebraic Closed-Form solution for energy-based source localization. IEEE Trans. Audio, Speech. Lang. Process 2007, 15: 2542–2550.
 Ho KC, Sun M: Passive Source Localization Using Time Differences of Arrival and Gain Ratios of Arrival. IEEE Trans. Signal. Process 2008, 56: 2.
 Hebrank JH, Wright D: Spectral cues used in the localization of sound sources on the median plane. J. Acoust. Soc. Am. 1974, 56: 1829–1834.
 Middlebrooks JC, Makous JC, Green DM: Directional sensitivity of sound-pressure level in the human ear canal. J. Acoust. Soc. Am. 1989, 1: 86.
 Asano F, Suzuki Y, Sone T: Role of spectral cues in median plane localization. J. Acoust. Soc. Am. 1990, 88: 159–168.
 Duda RO, Martens WL: Range dependence of the response of a spherical head model. J. Acoust. Soc. Am. 1998, 104(5):3048–3058.
 Cheng CI, Wakefield GH: Introduction to head-related transfer functions (HRTFs): representations of HRTFs in time, frequency, and space. J. of the Audio Engin. Soc. 2001, 49(4):231–248.
 Kulaib A, Al-Mualla M, Vernon D: 2D Binaural Sound Localization for Urban Search and Rescue Robotic. Istanbul: in Proceedings of the 12th Int. Conference on Climbing and Walking Robots; 2009:9–11.
 Razin S: Explicit (noniterative) Loran solution. J. Inst. Navigation 1967, 14(3).
 Knapp C, Carter G: The generalized correlation method for estimation of time delay. IEEE Trans. Acoust., Speech. Signal Process. 1976, ASSP-24(4):320–327.
 Brandstein MS, Silverman H: A practical methodology for speech localization with microphone arrays. Comput Speech Lang. 1997, 11(2):91–126.
 Brandstein MS, Adcock JE, Silverman HF: A Closed-Form location estimator for use with room environment microphone arrays. IEEE Trans. Speech Audio Process. 1997, 5(1):45–50.
 Svaizer P, Matassoni M, Omologo M: Acoustic Source Location in a Three-Dimensional Space Using Crosspower Spectrum Phase. Munich: in Proceedings of the ICASSP; 1997:231.
 Lleida E, Fernandez J, Masgrau E: Robust continuous speech recog. sys. based on a microphone array. Seattle, WA: in Proceedings of the ICASSP; 1998:241–244.
 Stephenne A, Champagne B: Cepstral prefiltering for time delay estimation in reverberant environments. Detroit, MI: in Proceedings of the ICASSP; 1995:3055–3058.
 Omologo M, Svaizer P: Acoustic Event Localization using a Crosspower-Spectrum Phase based Technique. Adelaide: in Proceedings of the ICASSP, vol. 2; 1994:273–276.
 Zhang C, Florencio D, Zhang Z: Why Does PHAT Work Well in Low Noise, Reverberative Environments. Las Vegas, NV: in Proceedings of the ICASSP; 2008:2565–2568.
 Cui W, Cao Z, Wei J: Dual-Microphone Source Location Method in 2D Space. Toulouse: in Proceedings of the ICASSP; 2006:845–848.
 Sheng X, Hu YH: Maximum likelihood multiple-source localization using acoustic energy measurements with wireless sensor networks. IEEE Trans Signal Process 2005, 53(1):44–53.
 Blatt D, Hero AO III: Energy-based sensor network source localization via projection onto convex sets. IEEE Trans Signal Process 2006, 54(9):3614–3619.
 Sen S, Nehorai A: Performance analysis of 3D direction estimation based on head-related transfer function. IEEE Trans. Audio Speech Lang. Process 2009, 17(4).
 Batteau DW: The role of the pinna in human localization. Series B, Biol. Sci 1967, 168: 158–180.
 Roffler SK, Butler RA: Factors that influence the localization of sound in the vertical plane. J. Acoust. Soc. Am. 1968, 43(6):1255–1259.
 Oldfield SR, Parker SPA: Acuity of sound localization: a topography of auditory space. Perception 1984, 13(5):601–617.
 Hofman PM, Riswick JGAV, Opstal AJV: Relearning sound localization with new ears. Nature Neurosci. 1998, 1(5):417–421.
 Brown CP: Modeling the Elevation Characteristics of the Head-Related Impulse Response. M.S. Thesis: San Jose State University, San Jose, CA; 1996.
 Satarzadeh P, Algazi VR, Duda RO: Physical and filter pinna models based on anthropometry. Audio Eng. Soc, Vienna: in Proceedings of the 122nd Conv; 2007:7098.
 Hwang S, Park Y, Park Y: Sound Source Localization using HRTF database. Gyeonggi-Do: in ICCAS2005; 2005.
 Foy WH: Position-location solution by Taylor-series estimation. IEEE Trans. Aerosp. Electron Syst. 1976, AES-12: 187–194.
 Abel JS, Smith JO: Source range and depth estimation from multipath range difference measurements. IEEE Trans. Acoust. Speech. Signal. Process. 1989, 37: 1157–1165.
 Sommerville DMY: Elements of Non-Euclidean Geometry. New York: Dover; 1958:260.
 Eisenhart LP: A Treatise on the Differential Geometry of Curves and Surfaces. New York: Dover; 1960:270–271. Originally Ginn, 1909.
 Smith JO, Abel JS: Closed-Form least-squares source location estimation from range-difference measurements. IEEE Trans. Acoust. Speech Signal Process. 1987, ASSP-35: 1661–1669.
 Delosme JM, Morf M, Friedlander B: A linear equation approach to locating sources from time-difference-of-arrival measurements. Denver, CO: in Proceedings of the ICASSP; 1980:09–11.
 Fang BT: Simple solutions for hyperbolic and related position fixes. IEEE Trans. Aerosp. Electron Syst. 1990, 26: 748–753.
 Friedlander B: A passive localization algorithm and its accuracy analysis. IEEE J. Oceanic Eng. 1987, OE-12: 234–245.
 Schau HC, Robinson AZ: Passive source localization employing intersecting spherical surfaces from time-of-arrival differences. IEEE Trans. Acoust. Speech Signal Process. 1987, ASSP-35: 1223–1225.
 Abel JS: A divide and conquer approach to least-squares estimation. IEEE Trans. Aerosp. Electron. Syst. 1990, 26: 423–427.
 Beck A, Stoica P, Li J: Exact and approximate solutions of source localization problems. IEEE Trans. Signal. Process. 2008, 56(5):1770–1778.
 So HC, Chan YT, Chan FKW: Closed-Form Formulae for Time-Difference-of-Arrival Estimation. IEEE Trans Signal Process. 2008, 56: 6.
 Gillette MD, Silverman HF: A linear closed-form algorithm for source localization from time-differences of arrival. IEEE Signal Process. Lett. 2008, 15: 1–4.
 Larsson EG, Danev D: Accuracy comparison of LS and squared-range LS for source localization. IEEE Trans. Signal. Process 2010, 58: 2.
 Ono N, Sagayama S: R-means localization: a simple iterative algorithm for range-difference-based source localization. in Proceedings of the ICASSP. March 2010, 14–19: 2718–2721.
 Chami ZE, Guerin A, Pham A, Servière C: A Phase-based Dual Microphone Method to Count and Locate Audio Sources in Reverberant Rooms. New Paltz, NY: in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics; 2000:18–21.
 Vaseghi SV: Advanced Digital Signal Processing and Noise Reduction. 3rd edn. Hoboken, NJ: John Wiley & Sons, Ltd; 2006.
 Pourmohammad A, Ahadi SM: TDE-ILD-HRTF-based 2D whole-plane sound source localization using only two microphones and source counting. Int. J. Info. Eng. 2012, 2(3):307–313.
 Balanis CA: Antenna Theory: Analysis and Design. Hoboken, NJ: John Wiley & Sons Inc.; 2005.
 Giancoli DC: Physics for Scientists and Engineers, Vol. I. Upper Saddle River: Prentice Hall; 2000.
 Friedlander B: On the Cramer-Rao bound for time delay and Doppler estimation. IEEE Trans. Inf. Theory. 1984, 30(3):575–580.
 Gore A, Fazel A, Chakrabartty S: Far-Field Acoustic Source Localization and Bearing Estimation Using ∑∆ Learners. IEEE Trans. Circuits and Systems I: Regular Papers 2010, 57(4).
 Chen SH, Wang JF, Chen MH, Sun ZW, Liao MJ, Lin SC, Chang SJ: A Design of Far-Field Speaker Localization System using Independent Component Analysis with Subspace Speech Enhancement. San Diego, CA: in Proc. 11th IEEE Int. Symp. Mult.; 2009:14–16.
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.