 Research
 Open Access
Steerable differential beamformers with planar microphone arrays
EURASIP Journal on Audio, Speech, and Music Processing volume 2020, Article number: 15 (2020)
Abstract
Humanoid robots need microphone arrays to acquire speech signals from the human communication partner while suppressing noise, reverberation, and interference. Unlike in many other applications, microphone arrays in humanoid robots face restrictions in size and geometry. To address these challenges, this paper presents an approach to differential beamforming with arbitrary planar array geometries. The major contributions of this work are as follows: (1) a method is presented to design differential beamformers, which works for regular geometries such as linear, circular, and concentric circular ones, as well as irregular geometries, as long as the sensors’ positions are given or can be measured; (2) fundamental requirements for the design of different orders of linear differential microphone arrays (DMAs), partially steerable DMAs, fully steerable DMAs, and robust DMAs are discussed; (3) the validity and limitations of the Jacobi-Anger expansion approximation are analyzed, where we discuss how to achieve an optimal approximation by properly choosing the reference point; and (4) we show how to design an Nth-order DMA with 2N microphones using the Jacobi-Anger expansion.
Introduction
It has long been a dream of researchers and engineers to create humanoid robots that can communicate naturally with humans through speech and language. A prerequisite for this is the ability to acquire speech from the human communication partner with high fidelity and, at the same time, to mitigate or even eliminate the effects of background noise, acoustic feedback, interference, reverberation, and robot ego noise. This requires sensor arrays with multiple microphones arranged in a certain geometry. Unlike many well-studied applications such as teleconferencing, microphone arrays for robot audition are limited in size and geometry [1–5]. Therefore, how to design small arrays with a flexible geometry, together with the associated beamforming algorithms that can process broadband speech signals, is a critical problem [6–13]. Among the different types of available arrays, differential microphone arrays (DMAs), which are designed to measure the differentials of the sound pressure field, are more appropriate for robot audition since they are small in size and can achieve high directivity and frequency-invariant beampatterns [14–23]. From the early efforts of designing linear DMAs in a multistage manner [24, 25] to the recently developed null-constraint-based linear DMAs in the short-time Fourier transform (STFT) domain [26], the flexibility in forming different beampatterns and the robustness of differential beamformers have been significantly improved [27–31].
An important issue in applications of DMAs is steering flexibility. Linear DMAs do not have much flexibility in terms of beam steering: the beampattern varies with the steering angle, and the optimal performance in terms of directivity factor (DF) occurs only at the endfire directions, i.e., the directions along the line that connects all sensors [26, 32]. A number of efforts have been devoted to improving the steering flexibility of DMAs. In [33, 34], two-dimensional arrays are used to form multiple linear DMAs, and the resulting beampatterns can be steered to a certain number of directions. In [35, 36], first-order steerable DMAs were constructed by a linear combination of a monopole and two orthogonal dipoles using a four-element square array. In [37], uniform circular DMAs (CDMAs) were designed whose beampatterns can be perfectly steered to M different directions, i.e., the M angular positions of the array elements. In [38], the authors proposed a steerable second-order DMA as a linear combination of a monopole and dipoles with seven or nine microphones. In [39], a method was proposed to design a nearly constant beampattern, which can be continuously steered between two directions of the reference beams. In [40], an approach was developed for the design of CDMAs based on an approximation of the beampattern from a least-squares error (LSE) perspective, where the designed beampattern is almost frequency invariant and can be steered to any look direction in the sensor plane. In [41], concentric circular DMAs (CCDMAs) were developed, which can achieve full flexibility in beam steering in the sensor plane and have a flexible array structure (a smaller ring can have fewer microphones than a larger one).
While great progress has been achieved in the design of DMAs with high directivity, frequency-invariant beampatterns, high robustness, and good steering flexibility, only a few efforts in the literature deal with flexibility in array geometry. In [42, 43], a method was presented to design a frequency-invariant beamformer based on spherical harmonic decompositions. While it is adaptable to an arbitrary array and is steerable, the solution does not guarantee a perfectly frequency-invariant beampattern. In [44], a broadband beamformer was proposed for spherical arrays with arbitrary sensor configurations, where the sensors’ positions do not have to satisfy the orthonormality criterion, but the shape of the array is limited to spherical. In [45], a general model was developed to design superdirective beamformers for arbitrary sensor arrays. In [46], the authors extended the work in [43] to develop a steerable beamformer for arbitrary planar arrays, but the beampatterns are not frequency invariant. While they have led to many interesting results, the aforementioned efforts did not address the general problem of differential beamforming. Therefore, further work is needed to study how to design differential beamformers with flexibility in sensor configurations.
In a recent work [7], we studied the problem of differential beamforming with microphone arrays of arbitrary planar geometry, but many important issues such as beampattern steering, the influence of array geometry on beamforming performance, and the requirements for designing different beampatterns were not addressed. This work extends the study in [7]. In comparison with [7], the major contributions of this paper are as follows. First, a detailed analysis is presented on the design of DMAs, addressing the basic requirements for the design of different orders of linear DMAs (LDMAs), limited steerable DMAs (LSDMAs), continuously steerable DMAs (CSDMAs), and robust DMAs. Second, we discuss the validity and limitations of the Jacobi-Anger expansion approximation, where we propose to achieve an optimal approximation with a fixed array geometry and number of microphones by choosing an appropriate reference point. Generally, the value of r_{m}, i.e., the distance from the reference point to the mth sensor, determines the accuracy of the approximation. Consequently, with a fixed array geometry and number of microphones, we can improve DMA performance by choosing an appropriate reference point, i.e., making the values of r_{m} as small as possible. Third, we present the case of designing an Nth-order DMA with 2N microphones. In previous studies of designing differential beamformers with the Jacobi-Anger expansion, at least 2N+1 microphones are needed to design an Nth-order DMA. We prove that with the Jacobi-Anger expansion, we can also design an Nth-order DMA with 2N microphones, but the designed beampattern can be perfectly steered only to M different directions, i.e., the angular positions of the M array elements. This is consistent with the previous conclusion for null-constraint circular differential microphone arrays [37].
The organization of this paper is as follows. Section 2 presents the signal model, problem formulation, and performance measures. Section 3 describes the desired target frequency-invariant beampattern. Section 4 discusses how to design differential beamformers with arbitrary array geometries and presents some special cases. Section 5 demonstrates the design of first-, second-, and third-order DMAs and analyzes the steering flexibility. Section 6 presents some simulation results to validate the theoretical derivations, and conclusions are given in Section 7.
Signal model, problem formulation, and performance measures
We consider an array consisting of M sensors, which are distributed in a specified area on a plane. Assume that the center of the array coincides with the origin of the two-dimensional Cartesian coordinate system and that the azimuthal angles are measured anticlockwise from the x axis. The coordinates of the microphones are then written as \(\mathbf{r}_{m} = r_{m} [\cos\psi_{m} \ \sin\psi_{m}]^{T}\), with m=1,2,…,M, where the superscript ^{T} is the transpose operator, r_{m} is the distance from the mth microphone to the origin, and ψ_{m} is the angular position of the mth array element. The distance between microphones i and j (for i,j=1,2,…,M) is then
\(\delta_{ij} = \left\| \mathbf{r}_{i} - \mathbf{r}_{j} \right\|_{2}, \quad (1)\)
where ∥·∥_{2} denotes the Euclidean norm. In this paper, we consider small-size arrays and assume that δ_{max}≪λ_{min}, where δ_{max}= max{δ_{ij}, i,j=1,2,…,M} and λ_{min} is the smallest acoustic wavelength in the frequency band of interest. This assumption ensures that the true acoustic pressure differentials can be approximated by finite differences between the microphones’ outputs in the design of DMAs.
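As a quick illustration of the small-spacing assumption, the following sketch computes δ_{max} for a three-microphone layout and compares it with λ_{min}. The coordinates are those of the equilateral triangle used later in the simulations; the speed of sound (340 m/s), the 4 kHz upper band edge, and the function name `max_spacing` are assumptions made only for this example.

```python
import itertools
import math

SPEED_OF_SOUND = 340.0  # m/s, assumed value

def max_spacing(positions):
    """Largest pairwise Euclidean distance between sensor positions (meters)."""
    return max(math.dist(p, q) for p, q in itertools.combinations(positions, 2))

# Equilateral triangle from the simulation section, converted from cm to m.
mics = [(0.0, 0.010), (-0.0087, -0.005), (0.0087, -0.005)]
f_max = 4000.0  # Hz, assumed upper edge of the band of interest

delta_max = max_spacing(mics)
lambda_min = SPEED_OF_SOUND / f_max
ratio = delta_max / lambda_min
print(f"delta_max = {delta_max * 100:.2f} cm, lambda_min = {lambda_min * 100:.2f} cm")
print(f"delta_max / lambda_min = {ratio:.3f}")
```

The smaller this ratio, the better finite differences between microphone outputs approximate the pressure differentials; how small is small enough depends on the DMA order and the accuracy required.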
With the small spacing assumption, it is natural to consider the far-field scenario. Assume that the incidence angle is characterized by the azimuthal angle θ. If we define the wave vector as \(\mathbf{k} = -(\omega/c)[\cos\theta \ \sin\theta]^{T}\), the steering vector of length M corresponding to the array is written as [47]
where ȷ is the imaginary unit, ω=2πf is the angular frequency, f>0 is the temporal frequency, and c is the speed of sound in air.
The objective of beamforming is to recover the source signal of interest that is corrupted by spatial acoustic noise. For that, the signal received at each microphone is multiplied by a complex weight, \(H_{m}^{*} \left (\omega \right), \ m=1,2,\ldots,M\), where the superscript ^{∗} stands for complex conjugation. The weighted outputs are then summed together to form the beamformer’s output [8]. Stacking all the weights together in a vector of length M, we get
\(\mathbf{h}(\omega) = [H_{1}(\omega) \ H_{2}(\omega) \ \cdots \ H_{M}(\omega)]^{T}. \quad (3)\)
Without loss of generality, the distortionless constraint at the look direction (where the desired source is located, θ_{s}) is desired, i.e.,
where the superscript ^{H} is the conjugate-transpose operator. Then, the problem of beamforming is to find the optimal filter under the constraint in (4) so that the beamformer’s output is a good estimate of the source signal of interest. One way of finding such a filter is by making its beampattern as close as possible to a desired target beampattern, which is the approach taken in this work.
In order to evaluate the performance of the designed beamformers, we will use the three commonly used metrics, i.e., the white noise gain (WNG), the directivity factor (DF), and the beampattern.
The WNG evaluates the robustness of a beamformer with respect to the presence of array imperfections as well as other uncertainties. It is defined as [25]
The DF quantifies the ability of the beamformer in suppressing spatial noise from directions other than the look direction and it can be written as [48]
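where Γ_{d}(ω) is the pseudo-coherence matrix of the noise signal in a diffuse (spherically isotropic) noise field, whose (i,j)th element is
\([\boldsymbol{\Gamma}_{\mathrm{d}}(\omega)]_{ij} = \mathrm{sinc}(\omega\delta_{ij}/c) = \frac{\sin(\omega\delta_{ij}/c)}{\omega\delta_{ij}/c}, \quad (7)\)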
with δ_{ij} being defined in (1).
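The coherence matrix of a diffuse field is easy to build once the pairwise distances are known. The sketch below assumes the standard sinc form of the spherically isotropic coherence, sin(ωδ_{ij}/c)/(ωδ_{ij}/c), a speed of sound of 340 m/s, and an illustrative three-microphone layout; the function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, assumed value

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def diffuse_coherence(positions, freq_hz):
    """Pseudo-coherence matrix of spherically isotropic noise for the given sensors."""
    omega = 2.0 * math.pi * freq_hz
    return [[sinc(omega * math.dist(p, q) / SPEED_OF_SOUND) for q in positions]
            for p in positions]

mics = [(0.0, 0.010), (-0.0087, -0.005), (0.0087, -0.005)]  # meters
gamma = diffuse_coherence(mics, 1000.0)
for row in gamma:
    print(["%.4f" % value for value in row])
```

For centimeter-scale arrays at 1 kHz, the off-diagonal entries are close to 1, so the matrix is nearly singular. This is one way to see why high directivity with small arrays tends to come at the cost of white noise amplification.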
The beampattern describes the sensitivity of the beamformer to a plane wave impinging on the array from the direction θ. Mathematically, it is defined as
Desired target beampattern for DMAs
DMAs refer to arrays that combine closely spaced sensors to respond to the spatial derivatives of the acoustic pressure field. Early DMAs are based on the uniform linear geometry, where differential beamformers are designed in a multistage manner and measure the differentials of the acoustic pressure field by combining the outputs of a number of omnidirectional sensors [25, 49]. A different DMA design method, developed in the STFT domain, involves solving a system of linear equations to make the designed beampattern equal to a target beampattern [26], which provides a better way to deal with white noise amplification. In this paper, we follow this framework to design differential beamformers with their beampatterns as close as possible to a target beampattern.
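Conventionally, in the design of linear DMAs, the best DF is at the endfire direction, i.e., θ=0^{∘} (or 180^{∘}). The Nth-order frequency-invariant beampattern with its main beam pointing to the direction of 0^{∘} is given by [26, 49]
\(\mathcal{B}(\mathbf{a}_{N}, \theta) = \sum_{n=0}^{N} a_{N,n} \cos(n\theta), \quad (9)\)
where a_{N,n}, n=0,1,…,N are real-valued coefficients and
\(\mathbf{a}_{N} = [a_{N,0} \ a_{N,1} \ \cdots \ a_{N,N}]^{T}. \quad (10)\)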
The values of the coefficients a_{N,n}, n=0,1,…,N in (9) affect the shape of the beampattern of the Nth-order DMA as well as its DF and WNG [25, 49]. One can determine the values of those coefficients either from a priori information or from some optimization criterion. For example, maximizing the directivity factor gives the hypercardioid beampattern, and maximizing the front-to-back ratio leads to the supercardioid beampattern.
In the direction of the main beam, which is assumed to be θ=0^{∘} for linear DMAs, the directivity pattern should be equal to 1, i.e., \({\mathcal {B}} \left (\mathbf {a}_{N}, 0^{\circ } \right) = 1\). Therefore, we have
\(\sum_{n=0}^{N} a_{N,n} = 1. \quad (11)\)
In this paper, we attempt to design DMAs with arbitrary planar geometries, whose main beam is no longer limited to the direction of 0^{∘}. Let us assume that we want to steer the beampattern to the angle θ_{s}. Using the fact that cos(nθ)=(e^{ȷnθ}+e^{−ȷnθ})/2, one can write the frequency-invariant beampattern as [40]
\(\mathcal{B}(\mathbf{b}_{2N}, \theta - \theta_{\mathrm{s}}) = \sum_{n=-N}^{N} b_{2N,n} e^{\jmath n (\theta - \theta_{\mathrm{s}})}, \quad (12)\)
where \(b_{2N,0} = a_{N,0}, b_{2N,i} = \frac {1}{2} a_{N,i}, \ i=\pm 1, \ldots,\pm N\). It is more convenient to write (12) into the following vector form:
where
is a (2N+1)×(2N+1) diagonal matrix and
are vectors of length 2N+1. Clearly, the main beam of (13) points in the direction θ_{s} and \( {\mathcal {B}} \left [ \mathbf {c}_{2N}\left (\theta _{\mathrm {s}} \right), \theta \right ]\) is symmetric with respect to the axis θ_{s} ⇔ θ_{s} + π. The values of the coefficients of widely used beampatterns, i.e., dipole, cardioid, hypercardioid, and supercardioid, are summarized in Table 1 (the interested reader can find plots of those beampatterns in [37]). In this work, the beampatterns given in (13) are used as the target beampatterns to design DMAs.
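The mapping from the cosine coefficients a_{N,n} to the exponential coefficients b_{2N,n} and the effect of the steering angle can be checked numerically. The sketch below evaluates the steered target pattern as a sum of b_{2N,n} e^{ȷn(θ−θ_{s})}; the first-order cardioid coefficients (0.5, 0.5) are the commonly quoted values rather than values copied from Table 1, and the function names are hypothetical.

```python
import cmath
import math

def exponential_coeffs(a):
    """Map a_{N,n} (n = 0..N) to b_{2N,n} (n = -N..N), returned as a dict keyed by n."""
    b = {0: a[0]}
    for n in range(1, len(a)):
        b[n] = b[-n] = 0.5 * a[n]
    return b

def steered_pattern(a, theta, theta_s):
    """Evaluate sum_n b_{2N,n} exp(j n (theta - theta_s)) for angles in radians."""
    b = exponential_coeffs(a)
    return sum(bn * cmath.exp(1j * n * (theta - theta_s)) for n, bn in b.items())

a = [0.5, 0.5]                 # first-order cardioid, commonly quoted values
theta_s = math.radians(60.0)   # arbitrary look direction
for offset_deg in (0, 90, 180):
    value = steered_pattern(a, theta_s + math.radians(offset_deg), theta_s)
    print(offset_deg, round(abs(value), 3))
```

Because b_{2N,−n}=b_{2N,n}, the imaginary parts cancel and the pattern is real and symmetric about the look direction.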
Design of differential beamformers
In the design of differential beamformers with an arbitrary planar array geometry, the objective is to find a proper beamforming filter, h(ω), so that the designed beampattern, \({\mathcal {B}} \left [ \mathbf {h}(\omega),\theta \right ]\), is as close as possible to the target frequency-invariant beampattern, \( {\mathcal {B}} \left [ \mathbf {c}_{2N}\left (\theta _{\mathrm {s}} \right), \theta \right ] \), i.e.,
In what follows, we show how to design such a beamformer.
To make \({\mathcal {B}} \left [ \mathbf {h}(\omega),\theta \right ]\) close to \({\mathcal {B}} \left [ \mathbf {c}_{2N}\left (\theta _{\mathrm {s}} \right), \theta \right ]\), we need to approximate the exponential function that appears in (8) in terms of e^{ȷnθ}. In our previous work [40], we found that the optimal approximation of the exponential function that appears in the beamformer’s beampattern, \({\mathcal {B}} \left [ \mathbf {h}(\omega),\theta \right ]\), from a least-squares error perspective is the Jacobi-Anger expansion [50, 51], i.e.,
where J_{n}(x) is the nth-order Bessel function of the first kind with J_{−n}(x)=(−1)^{n}J_{n}(x). By limiting the expansion to the order ±N, \({\mathcal {B}} \left [ \mathbf {h}(\omega), \theta \right ]\) can be approximated by
Generally, the intersensor spacing should be small enough to make the Jacobi-Anger series a good approximation of the exponential function. More precisely, the values of J_{n}(ωr_{m}/c) for n>N determine the accuracy of the approximation. Figure 1 plots J_{n}(ωr_{m}/c) for different values of n. As seen, as ωr_{m}/c increases, the truncation error caused by the neglected higher orders increases. When ωr_{m}/c is large, the zeros of the Bessel functions lead to serious performance degradation [52]. With a fixed array geometry and number of microphones, the reference point should be chosen to make the values of r_{m} as small as possible for an optimal approximation.
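The effect of the argument x=ωr_{m}/c on the truncation error can be checked numerically. The sketch below uses the standard form of the Jacobi-Anger expansion, e^{ȷx cos φ} = Σ_n ȷ^{n} J_{n}(x) e^{ȷnφ}, and a plain power-series Bessel routine that is adequate for the small arguments considered here; the function names and test values are illustrative assumptions.

```python
import cmath
import math

def bessel_j(n, x, terms=30):
    """Integer-order Bessel function of the first kind via its power series."""
    if n < 0:
        return (-1) ** (-n) * bessel_j(-n, x, terms)
    return sum((-1) ** k / (math.factorial(k) * math.factorial(n + k))
               * (x / 2.0) ** (2 * k + n) for k in range(terms))

def jacobi_anger(x, phi, order):
    """Truncated expansion of exp(j x cos(phi)) using orders -order..order."""
    return sum(1j ** n * bessel_j(n, x) * cmath.exp(1j * n * phi)
               for n in range(-order, order + 1))

def max_truncation_error(x, order, points=360):
    """Largest absolute error of the truncated series over a grid of angles."""
    return max(abs(cmath.exp(1j * x * math.cos(phi)) - jacobi_anger(x, phi, order))
               for phi in (2.0 * math.pi * i / points for i in range(points)))

for x in (0.2, 1.0, 2.0):
    print(x, [round(max_truncation_error(x, order), 5) for order in (1, 2, 3)])
```

For a fixed order, the error grows quickly with x, which is the numerical counterpart of the advice to keep r_{m} small through the choice of the reference point.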
Substituting (17) into (8), we obtain
where
is a vector of length M.
Comparing (13) with (18), one can see the following relation:
with \(n=\pm 1,\pm 2,\dots,\pm N\). It follows immediately that
where
is a (2N+1)×M matrix.
Now, it is clear that the beamforming filter, h(ω), can be obtained by solving the linear system in (21). As a matter of fact, if M=2N+1, the solution of (21) is
But this beamformer is generally sensitive to sensors’ self noise and array imperfections at low frequencies.
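To make the M=2N+1 case concrete, the sketch below designs a first-order beamformer for three microphones by matching the Jacobi-Anger coefficients of the response to those of the steered target pattern, then evaluates the exact response. Several conventions are assumptions of this example and may differ from the equations of the paper: the weights g_{m} play the role of H_{m}^{*}(ω), the response is taken as Σ_m g_{m} e^{ȷ(ω r_{m}/c) cos(θ−ψ_{m})}, the speed of sound is 340 m/s, and the cardioid coefficients (0.5, 0.5) are the commonly quoted values. All function names are hypothetical.

```python
import cmath
import math

C = 340.0  # speed of sound in m/s (assumed)

def bessel_j(n, x, terms=30):
    """Integer-order Bessel function of the first kind via its power series."""
    if n < 0:
        return (-1) ** (-n) * bessel_j(-n, x, terms)
    return sum((-1) ** k / (math.factorial(k) * math.factorial(n + k))
               * (x / 2.0) ** (2 * k + n) for k in range(terms))

def solve(a, b):
    """Solve the square complex system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def design_filter(mics, freq_hz, a, theta_s):
    """Weights g_m (one per microphone) for len(mics) == 2N + 1 sensors."""
    order = len(a) - 1
    omega = 2.0 * math.pi * freq_hz
    polar = [(math.hypot(x, y), math.atan2(y, x)) for x, y in mics]
    rows, rhs = [], []
    for n in range(-order, order + 1):
        b_n = a[0] if n == 0 else 0.5 * a[abs(n)]
        rows.append([1j ** n * bessel_j(n, omega * r / C) * cmath.exp(-1j * n * psi)
                     for r, psi in polar])
        rhs.append(b_n * cmath.exp(-1j * n * theta_s))
    return solve(rows, rhs)

def beampattern(mics, freq_hz, g, theta):
    """Exact response sum_m g_m exp(j (omega / c) r_m cos(theta - psi_m))."""
    omega = 2.0 * math.pi * freq_hz
    return sum(gm * cmath.exp(1j * omega / C * (x * math.cos(theta) + y * math.sin(theta)))
               for gm, (x, y) in zip(g, mics))

mics = [(0.0, 0.010), (-0.0087, -0.005), (0.0087, -0.005)]  # meters
cardioid = [0.5, 0.5]          # first-order cardioid, commonly quoted values
theta_s = math.radians(40.0)   # arbitrary look direction
g = design_filter(mics, 1000.0, cardioid, theta_s)
for deg in (40, 130, 220):
    print(deg, round(abs(beampattern(mics, 1000.0, g, math.radians(deg))), 3))
```

The response is close to 1 in the look direction and close to 0 in the opposite direction; the residual deviation comes from the neglected orders |n|>N. The weights grow as the frequency decreases, which reflects the sensitivity to sensor self noise noted above.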
To improve the robustness of the designed beamformer, we now consider the case of M>2N+1 and derive a beamforming filter by minimizing the norm of h(ω), i.e., h^{H}(ω)h(ω), subject to the equality constraints given in (20):
whose solution is
This optimization process is equivalent to the maximization of the WNG if the array aperture is small and the approximation error in the desired direction is negligible.
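The minimum-norm solution of an underdetermined set of equality constraints has the well-known closed form x = A^{H}(AA^{H})^{−1}b, which is the structure used when M>2N+1. The sketch below applies it to a small toy system; the matrix and right-hand side are illustrative numbers and not the constraint matrix of the paper, and the helper names are hypothetical.

```python
def solve(a, b):
    """Solve the square complex system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def min_norm_solution(a, b):
    """Minimum-norm x satisfying a x = b, computed as a^H (a a^H)^{-1} b."""
    rows, cols = len(a), len(a[0])
    a_h = [[complex(a[r][c]).conjugate() for r in range(rows)] for c in range(cols)]
    gram = [[sum(a[i][k] * a_h[k][j] for k in range(cols)) for j in range(rows)]
            for i in range(rows)]
    y = solve(gram, b)
    return [sum(a_h[i][k] * y[k] for k in range(rows)) for i in range(cols)]

a = [[1, 1, 1], [1, 2, 3]]   # two constraints, three unknowns (toy numbers)
b = [1, 0]
x = min_norm_solution(a, b)
other = [2, -1, 0]           # another vector that satisfies the same constraints
print([round(v.real, 4) for v in x])
print(sum(abs(v) ** 2 for v in x), sum(v * v for v in other))
```

Both vectors satisfy the constraints, but the minimum-norm one has the smaller squared norm, which is what limits white noise amplification in the beamforming context.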
A special case is when the M microphones form a uniform linear array. If the first sensor is chosen as the reference point, we have
where δ denotes the interelement spacing. Substituting it into the definition of ψ_{n}(ω) in (19) and using J_{−n}(x)=(−1)^{n}J_{n}(x), it can be checked that \(\jmath ^{-n} \boldsymbol {\psi }_{-n}^{T}(\omega) = \jmath ^{n} \boldsymbol {\psi }_{n}^{T}(\omega)\). Considering the fact that b_{2N,−n}=b_{2N,n}, n=1,2,…,N, one can check that the first N constraints (corresponding to \(n=-N,\dots,-2,-1\)) and the last N constraints (corresponding to \(n=1,2,\dots,N\)) are the same, so N of the constraints are redundant and can be omitted. Meanwhile, for linear DMAs, the beampattern is generally steered to 0^{∘} (or 180^{∘}), where the steering matrix Υ(θ_{s}) is equal to the identity matrix (or differs from it only by factors of −1). Now, (20) can be written as the following system of linear equations (here we omit the first N constraints):
where
is now an (N+1)×M matrix and b_{N} is an (N+1)×1 vector consisting of the last N+1 elements of b_{2N}. In this case, the proposed beamformer is equivalent to the linear DMA (LDMA) presented in [26, 51].
As discussed previously, a large value of ωr_{m}/c in the Bessel function, J_{n}(ωr_{m}/c), may lead to performance degradation. For uniform LDMAs, an appropriate choice of the reference point is the middle point of the array line (assuming that M is even), i.e.,
Clearly, if microphones are nonuniformly distributed on a line, this beamformer degenerates to the nonuniform LDMA design method presented in [53].
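The benefit of moving the reference point to the middle of a uniform linear array can be seen from the largest sensor distance r_{m}, which sets the largest Bessel argument ωr_{m}/c. The sketch below compares the two choices for illustrative values (four sensors, 1 cm spacing); the function name is hypothetical.

```python
def max_reference_distance(num_mics, spacing, reference):
    """Largest |r_m| for a uniform linear array whose first sensor sits at 0."""
    return max(abs(m * spacing - reference) for m in range(num_mics))

num_mics, spacing = 4, 0.01  # four sensors, 1 cm apart (illustrative values)
r_first = max_reference_distance(num_mics, spacing, reference=0.0)
r_middle = max_reference_distance(num_mics, spacing,
                                  reference=0.5 * (num_mics - 1) * spacing)
print(f"first-sensor reference: {r_first * 100:.1f} cm")
print(f"midpoint reference:     {r_middle * 100:.1f} cm")
```

Halving the largest r_{m} halves the largest argument of the Bessel functions at every frequency, which extends the frequency range over which the truncated expansion remains accurate.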
Another particular case is when the M microphones are distributed as a uniform circular array, i.e.,
The proposed beamformer then degenerates to the circular DMA (CDMA) in [40]. Generally, if the array aperture is small, a uniform circular array has the best steering ability, but this geometry may not be applicable in many scenarios, especially for irregularly shaped devices. Therefore, microphone arrays with triangular, rectangular, or arbitrary geometries (as long as the sensors’ positions are known) also have tremendous application potential.
Analysis of steerable DMAs
In this section, we study the basic requirements for the design of first-, second-, and third-order DMAs with arbitrary planar microphone array geometries.
To design a first-order LDMA, at least two microphones are needed. The optimal spatial gain of the designed LDMA occurs at the endfire directions, i.e., 0^{∘} and 180^{∘}. With the proposed method, a first-order continuously steerable DMA (CSDMA) can be designed by adding one more microphone to form a triangular array with three sensors. To improve robustness, more microphones on a planar array can be used, i.e., M≥4.
To design a second-order LDMA, at least three microphones are needed (distributed as a linear array), i.e., M=3. Similarly, the maximum DF is achieved only at the angles 0^{∘} and 180^{∘}.
According to (22), to design a second-order CSDMA, at least five microphones are needed, i.e., M=5. As discussed, at least three microphones are needed to design a second-order LDMA, and five microphones are needed to design a second-order CSDMA. A legitimate question is what can be designed with four microphones, i.e., M=4. Before we answer this question, we first discuss the more general case of designing Nth-order DMAs with 2N microphones. In this case, the microphone array geometry is restricted to a uniform circular array.
As shown in Appendix A, the column rank of Ψ^{H}(ω) is
where \({\mathcal {R}}(\mathbf {A})\) denotes the column rank of A. This shows that, when designing an Nth-order DMA with 2N microphones, only 2N of the constraints in (21) are linearly independent.
However, as shown in (21) and (22), it usually requires at least 2N+1 microphones to design an Nth-order DMA. Considering the fact that \(\boldsymbol{\psi}'_{-N} = \boldsymbol{\psi}'_{N}\) (see Appendix A), we have to relax the constraint on b_{2N,−N} or b_{2N,N}, i.e., drop one of the following constraints:
In this case, the matrix Ψ(ω) becomes
Now, the problem without one of the constraints in (32) is how to ensure that the designed beampattern is equal to the desired directivity pattern. In the special case of θ_{s}=0, it is written
This means that the two constraints in (32) are the same. In other words, if one constraint is imposed, the other is satisfied at the same time. Similarly, if we want the designed beampattern to be fully steered to the direction θ_{s}≠0, we should have
Substituting (13) into (35), we get
Since b_{2N,−N}=b_{2N,N}≠0, it is easy to verify that
If we limit the steering to the range [0,2π], the solution of (37) is
Using M=2N, we get
which means that the designed beampattern can be perfectly steered to M different directions, i.e., the angular positions of the M array elements [37].
So, with four microphones uniformly distributed, the designed second-order DMA can be perfectly steered to four different directions, i.e., with θ_{s}∈{0^{∘},90^{∘},180^{∘},270^{∘}}. Similarly, increasing the number of microphones while fixing the DMA order can improve the robustness.
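The admissible steering directions for M=2N microphones on a uniform circle can be enumerated directly. The sketch below assumes that the first microphone sits at 0^{∘} and that the steering condition reduces to e^{ȷNθ_{s}}=e^{−ȷNθ_{s}}, which is one reading of the derivation above given that b_{2N,−N}=b_{2N,N}; the function names are hypothetical.

```python
import cmath
import math

def steering_directions_deg(num_mics):
    """Angular positions (degrees) of num_mics sensors on a uniform circle, first at 0."""
    return [360.0 * m / num_mics for m in range(num_mics)]

def satisfies_condition(order, theta_s, tol=1e-9):
    """Check exp(j N theta_s) == exp(-j N theta_s), i.e. sin(N theta_s) == 0."""
    return abs(cmath.exp(1j * order * theta_s) - cmath.exp(-1j * order * theta_s)) < tol

for num_mics in (4, 6):
    order = num_mics // 2
    directions = steering_directions_deg(num_mics)
    ok = all(satisfies_condition(order, math.radians(d)) for d in directions)
    print(num_mics, directions, ok)
```

An intermediate direction such as 45^{∘} with N=2 fails the same check, which is why the beampattern cannot be steered perfectly between the sensor positions in this configuration.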
To design a third-order LDMA, at least four microphones (distributed as a linear array) are needed, i.e., M=4. A special case is to design a third-order DMA with five microphones, i.e., M=5. In this case, the additional microphone can be used to improve robustness. Another special case is to design a third-order DMA with six microphones distributed as a uniform circular array. In this scenario, as shown in Appendix A, the designed third-order DMA can be perfectly steered to six different directions, i.e., with θ_{s}∈{0^{∘},60^{∘},120^{∘},180^{∘},240^{∘},300^{∘}}.
To design a third-order CSDMA, at least seven microphones are needed, i.e., M=7. Again, the microphones should not all be placed on a straight line (experiments show that at least three microphones should be off the x axis). One can improve the WNG by using more than seven microphones, i.e., M≥8.
Finally, the feasibility of first-, second-, and third-order DMAs for different numbers of microphones is summarized in Table 2.
Simulations
In this section, we study the performance of the presented method for the design of differential beamformers, where the performance is evaluated with three widely used metrics, i.e., the beampattern, the WNG, and the directivity index (DI), which is the DF in decibels [25], i.e.,
\(\mathrm{DI} = 10 \log_{10} \mathrm{DF}. \quad (40)\)
First-order DMAs
We first study the performance of the presented method for the design of first-order differential beamformers. The desired target beampattern is the first-order supercardioid, whose coefficients are given in Table 1. In the simulations, we use three microphones and consider the following four different array geometries (coordinate values measured in centimeters).

Array-I: an equilateral triangle array; the coordinates of the three microphones are (0,1.0), (−0.87,−0.5), and (0.87,−0.5), respectively; see Fig. 2(a.1).
Array-II: an obtuse isosceles triangle array; the coordinates of the three microphones are (0,1.5), (−0.87,−0.5), and (0.87,−0.5), respectively; see Fig. 2(b.1).
Array-III: an acute isosceles triangle array; the coordinates of the three microphones are (0,0.6), (−0.87,−0.5), and (0.87,−0.5), respectively; see Fig. 2(c.1).
Array-IV: a scalene triangle array with the top microphone off the central axis; the coordinates of the three microphones are (0,1.04), (−0.87,−0.5), and (0.87,−0.5), respectively; see Fig. 2(d.1).
Figure 2 plots the four geometries, the corresponding beampatterns, and the broadband beampatterns versus frequency designed with the presented algorithm, where the desired look direction is 0^{∘}. As seen, the presented method successfully forms the first-order supercardioid for all four geometries, and the designed beampatterns are almost frequency invariant. Figure 3 plots the DIs and the WNGs of the designed differential beamformers with the aforementioned four array geometries. The DI does not change much with frequency for any of the four geometries, indicating that the designed beampatterns are frequency invariant.
Figure 4 plots the DIs and the WNGs for different look directions, θ_{s}, at f=1000 Hz. The DIs stay constant (approximately 5 dB), but the WNGs fluctuate with θ_{s} (except for Array-I). The results confirm our previous conclusion that continuously steerable DMAs can be designed with three microphones if they are not arranged in a line. It is also observed from Fig. 4 that the WNGs for the four geometries are slightly different from each other. This result is interesting from a practical perspective: with the same number of microphones, we can form the same beampattern with the same DI while optimizing the geometry to improve the WNG.
Second-order DMAs
We then study the performance of the presented method for the design of second-order DMAs, where the desired target beampattern is the second-order supercardioid, and its coefficients are chosen according to Table 1. In the simulations, we use five microphones and consider the following four different array geometries (coordinate values measured in centimeters).

Array-I: a regular pentagon array (which can also be regarded as a uniform circular microphone array); the coordinates of the five microphones are (1.0,0), (0.31,0.95), (−0.81,0.59), (−0.81,−0.59), and (0.31,−0.95), respectively; see Fig. 5(a.1).
Array-II: a pentagon array where the radii of the microphones are the same but the angles are nonuniform; the coordinates of the five microphones are (1.0,0), (0.47,0.88), (−0.38,0.93), (−0.96,−0.28), and (0.31,−0.95), respectively; see Fig. 5(b.1).
Array-III: a pentagon array where the angles of the microphones are uniform but the radii are nonuniform; the coordinates of the five microphones are (1.0,0), (0.19,0.57), (−0.81,0.59), (−0.97,−0.71), and (0.31,−0.95), respectively; see Fig. 5(c.1).
Array-IV: an array with randomly placed microphones; the coordinates of the five microphones are (0.5,0), (0.92,0.77), (1.0,0), (−0.84,1.0), and (0.31,−0.95), respectively; see Fig. 5(d.1).
Figure 5 plots the array geometries, the corresponding beampatterns, and the broadband beampatterns versus frequency. The presented method successfully forms the second-order supercardioid for all four geometries, and the designed beampatterns are almost frequency invariant. Figure 6 plots the DIs and the WNGs as a function of frequency, where the DIs, again, stay constant but the WNGs increase with frequency. It is also seen that both the DIs and the WNGs for the four geometries are different. This is easily explained: the DI is defined as the ratio between the magnitude-squared beampattern in the look direction and the magnitude-squared beampattern averaged over the entire space, whereas the beampattern is specified only in the sensor plane. As a result, the DIs differ even though the in-plane beampatterns are the same. In our study, we focus on the two-dimensional space with the assumption that the sound sources and the sensors are in the same plane. The designed beamformer achieves good performance if the steering angles are within or near the sensor plane, but it becomes less and less effective as the beamformer is steered away from this plane. This problem has been fully studied with circular differential beamformers [54], and the conclusion also applies to other kinds of planar arrays.
Figure 7 plots the DIs and WNGs as a function of θ_{s} at f=1000 Hz. It is seen that, for some geometries, the DI and WNG change with θ_{s}, which indicates that the geometry plays an important role in the steering capability of the beamformer. However, it should be noted that the beamformer is continuously steerable regardless of geometry.
Third-order DMAs
We then study the performance of the presented method for the design of a third-order supercardioid with M=7 and consider the following four different array geometries (coordinate values measured in centimeters).

Array-I: a regular heptagon microphone array (which can also be regarded as a uniform circular microphone array); the coordinates of the seven microphones are (1.5,0), (0.94,1.17), (−0.34,1.46), (−1.35,0.66), (−1.35,−0.66), (−0.34,−1.46), and (0.94,−1.17), respectively; see Fig. 8(a.1).
Array-II: a heptagon microphone array where the angular positions of the microphones are nonuniform but the radii are the same; the coordinates of the seven microphones are (1.5,0), (1.40,0.54), (0.68,1.34), (−1.35,0.66), (−0.61,−1.37), (0.18,−1.49), and (1.29,−0.77), respectively; see Fig. 8(b.1).
Array-III: a heptagon microphone array where the radii of the microphones are nonuniform but the angular positions are the same; the coordinates of the seven microphones are (1.2,0), (0.94,1.17), (−0.27,1.17), (−1.35,0.66), (−1.08,−0.53), (−0.34,−1.46), and (0.76,−0.93), respectively; see Fig. 8(c.1).
Array-IV: an array with randomly placed microphones; the coordinates of the seven microphones are (0.5,0), (0.5,0.87), (−0.96,1.15), (−1.48,0.26), (−0.77,−0.64), (0.05,−0.30), and (1.29,−0.77), respectively; see Fig. 8(d.1).
Figure 8 plots the four geometries, the corresponding beampatterns, and the broadband beampatterns versus frequency. Figure 9 plots the DIs and the WNGs as a function of frequency, f. Again, with all four array geometries, the presented method successfully forms the third-order supercardioid, and the designed beampatterns are almost frequency invariant. The designed third-order DMA is continuously steerable with a constant DI, but the DIs and WNGs are different for the four geometries.
Robust DMAs
As seen, the DI and the WNG fluctuate with the steering direction. This can be improved by increasing the number of microphones and designing the beamformer with the minimum-norm solution. To demonstrate this, we show an example of the design of robust DMAs with M=25, where the coordinates of the microphones are random numbers generated with the uniform distribution under the constraints 1≤r_{m}≤2 cm and −π<ψ_{m}≤π. Figure 10 plots the DIs and the WNGs of the designed second- and third-order supercardioids as functions of the steering direction, θ_{s}, and the frequency, f. Now, both the DIs and the WNGs are almost constant. The performance can be further improved if more microphones are used.
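A random planar layout of the kind described above can be generated in a few lines. The sketch below draws the radius and the angle of each microphone independently from uniform distributions over 1 to 2 cm and −π to π; the seed and the function name are assumptions added for reproducibility of this example, and the exact positions used in the simulations are not reproduced.

```python
import math
import random

def random_planar_array(num_mics, r_min, r_max, seed=0):
    """Draw sensor positions with uniformly distributed radius and angle (meters)."""
    rng = random.Random(seed)
    mics = []
    for _ in range(num_mics):
        r = rng.uniform(r_min, r_max)
        psi = rng.uniform(-math.pi, math.pi)
        mics.append((r * math.cos(psi), r * math.sin(psi)))
    return mics

array = random_planar_array(25, 0.01, 0.02)
radii_cm = [100.0 * math.hypot(x, y) for x, y in array]
print(f"{len(array)} microphones, radii between {min(radii_cm):.2f} and {max(radii_cm):.2f} cm")
```

Since only the positions enter the design, any such layout can be used as long as the sensor coordinates are known or measured.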
Note that there may exist some errors when implementing the beamformer with limited precision in practice. This effect can be modeled by WNG, which evaluates the performance of a beamformer with respect to the presence of array imperfection as well as other uncertainties. So, one can deal with such errors by improving the WNG.
Experiments in real environment
In this subsection, we present an example of using the proposed DMA beamformer in a real office room environment. The size of the office room is approximately 3×4×3 m, and the reverberation time, T_{60}, computed from a measured room impulse response, is approximately 600 ms. A loudspeaker is placed 1 m away from the array with an elevation angle of 30^{∘} and plays back a prerecorded speech signal to simulate a sound source of interest. Note that this source position is arbitrarily selected. The acoustic scenario for the experiments also includes air conditioning noise, environmental noise from outside the windows, and some babble noise from another working area, which is over 6 m away. The overall signal-to-noise ratio (SNR), evaluated at the reference microphone, is approximately 4.5 dB. We first built a concentric circular microphone array consisting of 7 sensors, a photo of which is shown in Fig. 11. We then chose 4 microphones from the array, each marked with a circle in the photo, to form an irregular-geometry array. We choose the center of the four microphones as the reference point; the coordinates (measured in centimeters) of the four microphones are (1.9,0),(0,3.29),(−1.9,0), and (0,−3.29), respectively. The DMA is expected to be mounted in the head of a robot (however, due to logistic issues, in this experiment, the DMA is mounted on a tripod).
The microphone outputs are first passed through a preamplifier and then fed to a 24-bit analog-to-digital converter with a sampling rate of 8 kHz. The digitized signals are then processed with a TI floating-point processor. The beamformers are implemented in the STFT domain with a frame size of 32 ms (256 points) and an overlap factor of 75% (a Kaiser window is applied to each frame). The beamforming filters are computed according to (25) with the first-order cardioid as the target beampattern. Figure 12 plots the observed time-domain signal and its spectrogram, together with the output of the developed DMA beamformer and its spectrogram. The DMA output enhances the desired signal and suppresses reverberation and noise, which indicates the effectiveness of the developed beamformer.
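The STFT-domain processing described above can be sketched in Python as follows. This is a minimal illustration, not the code run on the processor: the filter matrix `H` is a placeholder for the per-bin filters obtained from (25), the Kaiser parameter `beta` is an assumed value (it is not specified in the text), and the synthesis by normalized overlap-add is one possible choice.

```python
import numpy as np

FS = 8000          # sampling rate in Hz
N_FFT = 256        # 32 ms frames at 8 kHz
HOP = N_FFT // 4   # 75% overlap

def stft_beamform(x, H, beta=8.0):
    """Apply per-bin filters H (bins x M) to a multichannel signal x (M x T)."""
    M, T = x.shape
    win = np.kaiser(N_FFT, beta)                  # beta is an assumed value
    xp = np.pad(x, ((0, 0), (N_FFT, N_FFT)))      # zero-pad so every sample is fully overlapped
    out = np.zeros(xp.shape[1])
    norm = np.zeros(xp.shape[1])
    for start in range(0, T + N_FFT, HOP):
        frame = xp[:, start:start + N_FFT] * win  # windowed frame, shape (M, N_FFT)
        Y = np.fft.rfft(frame, axis=1)            # shape (M, bins)
        Z = np.sum(H.conj().T * Y, axis=0)        # Z(w) = h^H(w) y(w) in every bin
        out[start:start + N_FFT] += np.fft.irfft(Z, n=N_FFT)
        norm[start:start + N_FFT] += win          # accumulated window for normalization
    return out[N_FFT:N_FFT + T] / norm[N_FFT:N_FFT + T]

# Usage with a trivial placeholder filter that simply selects the first microphone.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, FS))                  # 1 s of 4-channel noise
H = np.zeros((N_FFT // 2 + 1, 4), dtype=complex)
H[:, 0] = 1.0
y = stft_beamform(x, H)
```

With the channel-selecting placeholder filter, the output reproduces the first microphone signal, which is a convenient check of the framing and overlap-add before the actual beamforming filters are plugged in.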
Comparison
As discussed previously, with a fixed array geometry and number of microphones, we can improve the DMA performance by choosing an appropriate reference point, i.e., one that makes the values of r_{m} as small as possible. Figure 13 presents two examples of the design of supercardioids using the conventional method in [51], i.e., (26), and the proposed method, i.e., (29), respectively, where we use a uniform linear microphone array consisting of four closely spaced microphones with an interelement spacing of 1 cm. As seen, the beampattern of the linear DMA designed by the proposed method matches the target beampattern better than the one designed with the conventional method [51].
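The role of r_{m} can be made concrete by measuring how well a truncated Jacobi–Anger expansion approximates a single steering-vector element. The Python sketch below compares two hypothetical reference points for the four-microphone linear array with 1 cm spacing: one at the first microphone (largest radius 3 cm) and one at the array center (largest radius 1.5 cm). The frequency, the truncation order N=2, and the speed of sound are assumptions made for illustration; the Bessel functions are evaluated numerically from Bessel's integral so that only NumPy is needed.

```python
import numpy as np

C = 340.0  # speed of sound in m/s (assumed value)

def bessel_j(n, x, K=256):
    """Integer-order Bessel function J_n(x) via Bessel's integral (periodic trapezoid rule)."""
    tau = 2 * np.pi * np.arange(K) / K
    return np.mean(np.cos(n * tau - x * np.sin(tau)))

def truncation_error(x, N, n_angles=360):
    """Max over phi of |exp(j x cos(phi)) - sum_{|n|<=N} j^n J_n(x) exp(j n phi)|."""
    phi = 2 * np.pi * np.arange(n_angles) / n_angles
    approx = sum((1j ** n) * bessel_j(n, x) * np.exp(1j * n * phi)
                 for n in range(-N, N + 1))
    return np.max(np.abs(np.exp(1j * x * np.cos(phi)) - approx))

f = 3000.0  # illustrative frequency in Hz
N = 2       # illustrative truncation order
k = 2 * np.pi * f / C

pos = 0.01 * np.arange(4)                  # microphone positions on a line, in meters
r_end = np.abs(pos - pos[0])               # reference at the first microphone
r_center = np.abs(pos - pos.mean())        # reference at the array center

err_end = truncation_error(k * r_end.max(), N)
err_center = truncation_error(k * r_center.max(), N)
```

Halving the largest radius reduces the worst-case truncation error considerably, since the leading neglected term behaves like J_{N+1}(ϖr_{m}), which decays rapidly as its argument shrinks.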
The proposed method is also compared to the conventional null-constrained differential beamformer in [37], where we use ArrayI as shown in Fig. 2(a,1), and the desired beampattern is the first-order supercardioid with θ_{s}=130^{∘}. The results are plotted in Fig. 14. For reference, the performance of the delay-and-sum (DS) beamformer is also plotted. As seen, while the DS beamformer has a large WNG, its directivity is very low. In comparison, the differential beamformers have much higher DIs, which are almost frequency invariant. It is also seen that the proposed method achieves higher DIs than the conventional null-constrained differential beamformer.
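The behavior of the DS beamformer noted above (large WNG, very low directivity for a small array) is easy to reproduce numerically. The Python sketch below is only an illustration under several assumptions: the ArrayI coordinates are not listed in this part of the text, so the ArrayIII coordinates (assumed to be in centimeters) are used as a stand-in geometry; the DI is computed with the usual spherically isotropic (diffuse) noise coherence, which is assumed to match the definition used in the paper; and the frequency and speed of sound are illustrative values.

```python
import numpy as np

C = 340.0  # speed of sound in m/s (assumed value)

# Stand-in geometry: ArrayIII coordinates from the simulations, converted to meters.
xy = 0.01 * np.array([
    (1.20, 0.00), (0.94, 1.17), (-0.27, 1.17), (-1.35, 0.66),
    (-1.08, -0.53), (-0.34, -1.46), (0.76, -0.93),
])

def steering(f, theta, xy):
    """Far-field steering vector for a planar array and a source in the array plane."""
    k = 2 * np.pi * f / C
    return np.exp(1j * k * (xy @ np.array([np.cos(theta), np.sin(theta)])))

def diffuse_coherence(f, xy):
    """Coherence matrix of spherically isotropic noise: sin(k*dist) / (k*dist)."""
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    return np.sinc(2 * f * dist / C)  # np.sinc(t) = sin(pi t) / (pi t)

def wng_db(h, d):
    """White noise gain |h^H d|^2 / (h^H h) in dB."""
    return 10 * np.log10(np.abs(np.vdot(h, d)) ** 2 / np.vdot(h, h).real)

def di_db(h, d, gamma):
    """Directivity index |h^H d|^2 / (h^H Gamma h) in dB."""
    noise = np.real(np.vdot(h, gamma @ h))
    return 10 * np.log10(np.abs(np.vdot(h, d)) ** 2 / noise)

f, theta_s = 1000.0, np.radians(130.0)
d = steering(f, theta_s, xy)
h_ds = d / len(d)                         # delay-and-sum filter
wng_ds = wng_db(h_ds, d)                  # equals 10*log10(M)
di_ds = di_db(h_ds, d, diffuse_coherence(f, xy))
```

For this centimeter-scale aperture at 1 kHz, the DS beamformer attains the maximum WNG of 10 log10(M) dB, while its DI stays below 1 dB, in line with the qualitative observation in the comparison.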
Conclusions
Towards robotic applications, where microphone arrays face restrictions in size and geometry, we presented in this paper an approach to the design of differential beamformers with arbitrary planar array geometries. By approximating the beampattern with the Jacobi–Anger expansion, we developed an algorithm that can form beampatterns close to a prespecified frequency-invariant target beampattern. This method is rather general: it can be used to design differential beamformers with linear, circular, and concentric circular arrays, as well as arrays whose sensors are placed at any specified positions. Based on the proposed method, some basic requirements for the design of first-, second-, and third-order LDMAs, LSDMAs, and CSDMAs were discussed. This study also summarized the fundamental requirements, i.e., the number of microphones and array geometries, for the design of different kinds and orders of DMAs.
Appendix A
Derivation of the column rank of Ψ^{H}(ω)
In the case where 2N microphones are uniformly distributed on a circular array, the vector ψ_{n}(ω) defined in (19) can be written as
with
Since M=2N, it is clear that the mth elements of the vectors ψ′_{−N} and ψ′_{N} are
From (43), it is clearly seen that
So, we have
where
is a (2N+1)×(2N+1) diagonal matrix, and
with I_{2N} being the 2N×2N identity matrix and i_{1} being the first column of I_{2N}. It is clearly seen that rank(P)=2N. Then,
where \({\mathcal {R}}\) denotes the column rank of a matrix. As a consequence, we have
According to (44), it is also clear that
According to (49) and (50), we get (31), i.e., the column rank of Ψ^{H}(ω) is 2N.
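The rank result can be checked numerically with a short Python sketch. It rests on an assumption about the structure of (19), which is not reproduced here: the mth element of ψ_{n}(ω) is taken to be proportional to e^{jnψ_{m}}, with the order-dependent (Bessel-type) factors collected in the diagonal matrix mentioned above and assumed to be nonzero. Under that assumption, only the phase part needs to be examined; the function name is illustrative.

```python
import numpy as np

def phase_matrix(N, M):
    """M x (2N+1) matrix with entries exp(j*n*psi_m), n = -N..N, psi_m = 2*pi*(m-1)/M."""
    psi = 2 * np.pi * np.arange(M) / M
    n = np.arange(-N, N + 1)
    return np.exp(1j * np.outer(psi, n))

ranks = {}
first_equals_last = {}
for N in (1, 2, 3):
    A = phase_matrix(N, 2 * N)           # 2N microphones on a uniform circle
    ranks[N] = int(np.linalg.matrix_rank(A))
    # With M = 2N, the columns for n = -N and n = N both reduce to (-1)^(m-1).
    first_equals_last[N] = bool(np.allclose(A[:, 0], A[:, -1]))
```

For each tested order, the first and last columns coincide and the numerical rank equals 2N, which agrees with the statement that the column rank of Ψ^{H}(ω) is 2N when the diagonal factors are nonzero.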
References
H. W. Löllmann, A. Moore, P. A. Naylor, B. Rafaely, R. Horaud, A. Mazel, W. Kellermann, in HSCMA. Microphone array signal processing for robot audition (IEEE, San Francisco, 2017), pp. 51–55.
K. Sekiguchi, Y. Bando, K. Nakamura, K. Nakadai, K. Itoyama, K. Yoshii, in IEEE/RSJ IROS. Online simultaneous localization and mapping of multiple sound sources and asynchronous microphone arrays (IEEE, Daejeon, 2016), pp. 1973–1979.
C. Evers, Y. Dorfan, S. Gannot, P. A. Naylor, in Proc. IEEE ICASSP. Source tracking using moving microphone arrays for robot audition (IEEE, New Orleans, 2017), pp. 6145–6149.
H. Barfuss, M. Bachmann, M. Buerger, M. Schneider, W. Kellermann, in Proc. IEEE WASPAA. Design of robust two-dimensional polynomial beamformers as a convex optimization problem with application to robot audition (IEEE, New Paltz, 2017), pp. 106–110.
H. Barfuss, M. Buerger, J. Podschus, W. Kellermann, in HSCMA. HRTF-based two-dimensional robust least-squares frequency-invariant beamformer design for robot audition (IEEE, San Francisco, 2017), pp. 56–60.
J. Benesty, I. Cohen, J. Chen, Fundamentals of signal enhancement and array signal processing (John Wiley & Sons, 2018).
G. Huang, J. Chen, J. Benesty, On the design of differential beamformers with arbitrary planar microphone array. J. Acoust. Soc. Am. 144(1), 66–70 (2018).
J. Benesty, J. Chen, Y. Huang, Microphone array signal processing (Springer-Verlag, Berlin, Germany, 2008).
S. Gannot, D. Burshtein, E. Weinstein, Analysis of the power spectral deviation of the general transfer function GSC. IEEE Trans. Signal Process. 52, 1115–1120 (2004).
S. Yan, Y. Ma, C. Hou, Optimal array pattern synthesis for broadband arrays. J. Acoust. Soc. Am. 122(5), 2686–2696 (2007).
B. Rafaely, D. Khaykin, Optimal model-based beamforming and independent steering for spherical loudspeaker arrays. IEEE Trans. Audio, Speech, Lang. Process. 19(7), 2234–2238 (2011).
S. Yan, Optimal design of modal beamformers for circular arrays. J. Acoust. Soc. Am. 138(4), 2140–2151 (2015).
B. Rafaely, Fundamentals of spherical array processing (Springer-Verlag, Berlin, Germany, 2015).
E. D. Sena, H. Hacihabiboglu, Z. Cvetkovic, On the design and implementation of higher-order differential microphones. IEEE Trans. Audio, Speech, Lang. Process. 20, 162–174 (2012).
M. Buck, Aspects of first-order differential microphone arrays in the presence of sensor imperfections. European Trans. Telecomm. 13(2), 115–122 (2002).
G. Huang, J. Benesty, I. Cohen, J. Chen, Differential beamforming on graphs. IEEE/ACM Trans. Audio, Speech, Lang. Process. 28(1), 901–913 (2020).
A. Bernardini, F. Antonacci, A. Sarti, Wave digital implementation of robust first-order differential microphone arrays. IEEE Signal Process. Lett. 25(2), 253–257 (2017).
F. Borra, A. Bernardini, F. Antonacci, A. Sarti, Uniform linear arrays of first-order steerable differential microphones. IEEE/ACM Trans. Audio, Speech, Lang. Process. 27(12), 1906–1918 (2019).
F. Borra, A. Bernardini, F. Antonacci, A. Sarti, Efficient implementations of first-order steerable differential microphone arrays with arbitrary planar geometry. IEEE/ACM Trans. Audio, Speech, Lang. Process. (2020).
G. Huang, J. Chen, J. Benesty, in Proc. IEEE ICASSP. On the design of robust steerable frequency-invariant beampatterns with concentric circular microphone arrays (IEEE, Calgary, 2018), pp. 506–510.
E. Tiana-Roig, F. Jacobsen, E. F. Grande, Beamforming with a circular microphone array for localization of environmental noise sources. J. Acoust. Soc. Am. 128(6), 3535–3542 (2010).
E. Tiana-Roig, F. Jacobsen, E. Fernandez-Grande, Beamforming with a circular array of microphones mounted on a rigid sphere (L). J. Acoust. Soc. Am. 130(3), 1095–1098 (2011).
A. M. Torres, M. Cobos, B. Pueo, J. J. Lopez, Robust acoustic source localization based on modal beamforming and time–frequency processing using circular microphone arrays. J. Acoust. Soc. Am. 132(3), 1511–1520 (2012).
G. W. Elko, Microphone array systems for hands-free telecommunication. Speech Commun. 20(3), 229–240 (1996).
G. W. Elko, J. Meyer, in Springer Handbook of Speech Processing, ed. by J. Benesty, M. M. Sondhi, and Y. Huang. Microphone arrays (Springer-Verlag, Berlin, Germany, 2008), pp. 1021–1041. Chap. 48.
J. Benesty, J. Chen, Study and design of differential microphone arrays (Springer-Verlag, Berlin, Germany, 2012).
G. Huang, J. Chen, J. Benesty, Design of planar differential microphone arrays with fractional orders. IEEE/ACM Trans. Audio, Speech, Lang. Process. 28, 116–130 (2019).
J. Benesty, I. Cohen, J. Chen, Array processing: Kronecker product beamforming (Springer-Verlag, Berlin, Germany, 2019).
I. Cohen, J. Benesty, J. Chen, Differential Kronecker product beamforming. IEEE/ACM Trans. Audio, Speech, Lang. Process. 27(5), 892–902 (2019).
Y. Buchris, I. Cohen, J. Benesty, A. Amar, Joint sparse concentric array design for frequency and rotationally invariant beampattern. IEEE/ACM Trans. Audio, Speech, Lang. Process. 28, 1143–1158 (2020).
Y. Buchris, I. Cohen, J. Benesty, Frequency-domain design of asymmetric circular differential microphone arrays. IEEE/ACM Trans. Audio, Speech, Lang. Process. 26(4), 760–773 (2018).
G. Huang, J. Benesty, I. Cohen, J. Chen, A simple theory and new method of differential beamforming with uniform linear microphone arrays. IEEE/ACM Trans. Audio, Speech, Lang. Process. 28(1), 1079–1093 (2020).
G. W. Elko, A. T. N. Pong, in Proc. IEEE ICASSP, vol. 1. A steerable and variable first-order differential microphone array (IEEE, Munich, 1997), pp. 223–226.
R. M. Derkx, K. Janse, Theoretical analysis of a first-order azimuth-steerable superdirective microphone array. IEEE Trans. Audio, Speech, Lang. Process. 17(1), 150–162 (2009).
X. Wu, H. Chen, J. Zhou, T. Guo, Study of the mainlobe misorientation of the first-order steerable differential array in the presence of microphone gain and phase errors. IEEE Signal Process. Lett. 21(6), 667–671 (2014).
X. Wu, H. Chen, Directivity factors of the first-order steerable differential array with microphone mismatches: deterministic and worst-case analysis. IEEE/ACM Trans. Audio, Speech, Lang. Process. 24(2), 300–315 (2016).
J. Benesty, J. Chen, I. Cohen, Design of circular differential microphone arrays (Springer-Verlag, Berlin, Germany, 2015).
J. Byun, Y. C. Park, S. W. Park, Continuously steerable second-order differential microphone arrays. J. Acoust. Soc. Am. 143(3), 225–230 (2018).
A. Bernardini, M. D. Aria, R. Sannino, A. Sarti, Efficient continuous beam steering for planar arrays of differential microphones. IEEE Signal Process. Lett. 24(6), 794–798 (2017).
G. Huang, J. Benesty, J. Chen, On the design of frequency-invariant beampatterns with uniform circular microphone arrays. IEEE/ACM Trans. Audio, Speech, Lang. Process. 25(5), 1140–1153 (2017).
G. Huang, J. Chen, J. Benesty, Insights into frequency-invariant beamforming with concentric circular microphone arrays. IEEE/ACM Trans. Audio, Speech, Lang. Process. 26(12), 2305–2318 (2018).
L. C. Parra, in Proc. IEEE WASPAA. Least squares frequency-invariant beamforming (IEEE, New Paltz, 2005), pp. 102–105.
L. C. Parra, Steerable frequency-invariant beamforming for arbitrary arrays. J. Acoust. Soc. Am. 119(6), 3839–3847 (2006).
C. C. Lai, S. Nordholm, Y. H. Leung, Design of steerable spherical broadband beamformers with flexible sensor configurations. IEEE Trans. Audio, Speech, Lang. Process. 21(2), 427–438 (2013).
Y. Wang, Y. Yang, Z. He, Y. Han, Y. Ma, A general superdirectivity model for arbitrary sensor arrays. EURASIP J. Adv. Signal Process. 2015(1), 68 (2015).
A. Medda, A. Patel, in 2017 51st Asilomar Conference on Signals, Systems, and Computers. Frequency invariant beamforming for arbitrary planar arrays (Pacific Grove, 2017), pp. 1133–1136.
H. V. Trees, Optimum array processing: part IV of detection, estimation, and modulation theory (John Wiley & Sons, Inc., New York, 2002).
M. Brandstein, D. Ward, Microphone arrays: signal processing techniques and applications (Springer, 2001).
G. W. Elko, in Audio signal processing for next-generation multimedia communication systems. Differential microphone arrays (Springer, 2004), pp. 11–65.
M. Abramowitz, I. A. Stegun, Handbook of mathematical functions: with formulas, graphs, and mathematical tables, vol. 55 (Dover Publications, New York, 1965).
L. Zhao, J. Benesty, J. Chen, Design of robust differential microphone arrays with the Jacobi–Anger expansion. Appl. Acoust. 110, 194–206 (2016).
G. Huang, J. Benesty, J. Chen, Design of robust concentric circular differential microphone arrays. J. Acoust. Soc. Am. 141(5), 3236–3249 (2017).
H. Zhang, J. Chen, J. Benesty, Study of nonuniform linear differential microphone arrays with the minimum-norm filter. Appl. Acoust. 98, 62–69 (2015).
G. Huang, X. Zhao, J. Chen, J. Benesty, in Proc. IEEE ICASSP. Properties and limits of the minimum-norm differential beamformers with circular microphone arrays (IEEE, 2019), pp. 426–430.
Acknowledgements
This work was supported in part by the NSFC Distinguished Young Scientists Fund (grant no. 61425005), the Israel Science Foundation (grant no. 576/16), and the ISF-NSFC joint research program (grant nos. 2514/17 and 61761146001).
Contributions
The authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Huang, G., Chen, J., Benesty, J. et al. Steerable differential beamformers with planar microphone arrays. J AUDIO SPEECH MUSIC PROC. 2020, 15 (2020). https://doi.org/10.1186/s13636020001851
DOI: https://doi.org/10.1186/s13636020001851
Keywords
 Robot audition
 Microphone arrays
 Beamforming
 Differential beamforming
 Frequency-invariant beampattern