A plethora of interpolation techniques exists for real-valued scattered data, each making different assumptions about the distribution of the discrete set of known data points [24]. Because the quality of the interpolation depends on how well these assumptions are fulfilled, the performance of an interpolation method depends considerably on the specific application. Simple techniques include discontinuous nearest-neighbor interpolation, as well as continuous linear and natural neighbor interpolation. More commonly used are advanced concepts such as deterministic inverse distance weighted or spline interpolation [25], as well as kriging [26], a stochastic technique from the field of geostatistics that minimizes the spatial variance between the value to be estimated and the ambient measurements. In the field of computer aided geometric design (CAGD), barycentric coordinates defined on spherical triangles are an essential tool for data fitting and interpolation; they can be used to define the associated spherical Bernstein-Bézier polynomials for constructing piece-wise functional and parametric surfaces [27]. For acoustical sound sources, a decomposition into spherical harmonics (SH) basis functions has become particularly popular [28–30], since it not only allows for a synthesis of the radiation pattern in virtual acoustic reality [31], but also for a decomposition of the room impulse response into SH-based spatial components [32]. In the case of an order-limited directivity, SH interpolation is physically correct.
Based on the above review, we selected three interpolation approaches for the detailed evaluation. SH interpolation was included because of its widespread use in musical acoustics. Spline interpolation was chosen because it is superior to inverse distance weighting and kriging if only a small number of sample points are available [33, 34]. The spherical triangular interpolation technique corresponds to a piece-wise degree-1 barycentric spherical Bernstein-Bézier polynomial interpolation; in audio technology it is commonly employed in three-dimensional vector based amplitude panning (VBAP) as introduced by Pulkki [35] for robust virtual sound source positioning [22].
2.1 Spherical harmonics interpolation
If the sound pressure on the surface of a sphere is sampled with a finite number of microphones, spherical Fourier coefficients can be calculated from the measured values, which can then be used to estimate the sound pressure function on the entire measuring surface [36]. The limited number of sample points results in an order-limited sound pressure function on the measurement surface. Thus, the spherical function f(θ,ϕ) (θ = colatitude, ϕ = azimuth) is represented by a weighted sum of a finite set of orthogonal basis functions:
$$ f(\theta,\phi)=\sum^{N}_{n=0}\sum^{n}_{m=-n}f_{{nm}}Y^{m}_{n}(\theta,\phi), $$
(1)
where \(N\in \mathbb {N}\) indicates the maximum spherical harmonics order and \(f_{nm}\) are the weights of the corresponding spherical harmonics
$$ Y^{m}_{n}(\theta,\phi)=\sqrt{\frac{2n+1}{4\pi} \frac{(n-m)!}{(n+m)!}}P^{m}_{n}(\cos\theta)e^{im\phi}, $$
(2)
where \(P^{m}_{n}(\cdot)\) are the associated Legendre functions, (·)! represents the factorial function, \(m\in \mathbb {Z}\) specifies the function degree, and \(n\in \mathbb {N}\) the order of the function. Consequently, the Fourier coefficients \(f_{nm}\) completely describe the order-limited function f(θ,ϕ) on the entire sphere, and determining them is sufficient for a correct SH interpolation.
By sampling the sound pressure function f(θ,ϕ) with a Q-channel spherical microphone array, the samples \(p_{q}=f(\theta_{q},\phi_{q})\) are given at the positions \((\theta_{q},\phi_{q})\) of the respective microphones for \(q\in \{1,2,...,Q\}=\mathbb {N}_{Q}\). In matrix form, Eq. 1 can be written as
$$ \mathbf{f} =\mathbf{Y} \mathbf{f}_{{nm}}, $$
(3)
where the matrix Y of dimensions \(Q\times (N+1)^{2}\) is given by
$$ \mathbf{Y} = \left[\begin{array}{cccc} Y_{0}^{0}(\theta_{1},\phi_{1}) & Y_{1}^{-1}(\theta_{1},\phi_{1}) & \cdots & Y_{N}^{N}(\theta_{1},\phi_{1})\\ Y_{0}^{0}(\theta_{2},\phi_{2}) & Y_{1}^{-1}(\theta_{2},\phi_{2}) & \cdots & Y_{N}^{N}(\theta_{2},\phi_{2})\\ \vdots & \vdots & \ddots & \vdots\\ Y_{0}^{0}(\theta_{Q},\phi_{Q}) & Y_{1}^{-1}(\theta_{Q},\phi_{Q}) & \cdots & Y_{N}^{N}(\theta_{Q},\phi_{Q})\\ \end{array}\right] $$
(4)
and the vector \(\mathbf {f} = [p_{1},\dots,p_{Q}]^{T}\) contains the Q sound pressure measurements at the positions \((\theta_{q},\phi_{q})\) for \(q \in \mathbb {N}_{Q}\).
For the rare scenario in which the number of microphones Q matches the number of SH coefficients, i.e. \(Q=(N+1)^{2}\), and the measuring points are perfectly distributed [37], so that Y is a well-conditioned full-rank matrix, Eq. 3 can be solved with the inverse of matrix Y:
$$ \mathbf{f}_{{nm}} =\mathbf{Y}^{-1}\mathbf{f}. $$
(5)
For \(Q>(N+1)^{2}\), an over-determined system of linear equations results, which can be solved in the least-squares sense by taking the Moore-Penrose inverse of Y and thus seeking a solution \(\mathbf{f}_{nm}\) that minimizes the energy of the error:
$$ \min_{\mathbf{f}_{{nm}}} \|\mathbf{f} - \mathbf{Y} \mathbf{f}_{{nm}}\|^{2} \quad \Longrightarrow \quad \mathbf{f}_{{nm}} =\mathbf{Y}^{\dagger} \mathbf{f}, $$
(6)
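with \(\mathbf{Y}^{\dagger}=(\mathbf{Y}^{H}\mathbf{Y})^{-1}\mathbf{Y}^{H}\) and ∥·∥ denoting the Euclidean norm. For functions that are not order-limited, errors occur due to spatial aliasing, so that \(\mathbf{f}\neq \mathbf{Y}\mathbf{f}_{nm}\) and consequently \(f(\theta_{q},\phi_{q})\neq p_{q}\) [38].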
For \(Q<(N+1)^{2}\), the system of equations is under-determined and Eq. 3 has infinitely many solutions. In this case, the Moore-Penrose inverse of the matrix Y yields the solution \(\mathbf{f}_{nm}\) with minimum Euclidean norm, i.e. with minimal wave-spectral power \(\|\mathbf{f}_{nm}\|^{2}\) ([29], p. 79):
$$ \min_{\mathbf{f}_{{nm}}} \| \mathbf{f}_{{nm}} \|^{2} \quad \textrm{s.t.} \quad \mathbf{f} = \mathbf{Y} \mathbf{f}_{{nm}} \quad \Longrightarrow \quad \mathbf{f}_{{nm}} =\mathbf{Y}^{\dagger} \mathbf{f}. $$
(7)
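The closed form given after Eq. 6 requires \(\mathbf{Y}^{H}\mathbf{Y}\) to be invertible, which cannot hold for \(Q<(N+1)^{2}\). As a supplementary note from standard linear algebra (not taken from the cited references), and assuming that Y has full row rank, the pseudo-inverse in Eq. 7 can be written as
$$ \mathbf{Y}^{\dagger} = \mathbf{Y}^{H}\left(\mathbf{Y}\mathbf{Y}^{H}\right)^{-1}, \qquad \text{so that} \qquad \mathbf{Y}\mathbf{f}_{{nm}} = \mathbf{Y}\mathbf{Y}^{H}\left(\mathbf{Y}\mathbf{Y}^{H}\right)^{-1}\mathbf{f} = \mathbf{f}, $$
which shows that the minimum-norm solution reproduces the measured samples exactly.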
To interpolate samples of the sound pressure measurements on a sphere, the calculated weights of the spherical harmonics can be used in the inverse spherical Fourier transform from Eq. 1 and arbitrary points between the samples can be estimated. The values at the sampling positions (θq,ϕq) for \(q \in \mathbb {N}_{Q}\) can be reproduced exactly if the order N is sufficiently high. In the case of under-determined systems, however, notches occur between the sample points due to the chosen constraint of minimum wave-spectral power and therefore even order-limited functions can no longer be represented accurately.
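The procedure of Eqs. 1 to 7 can be illustrated with a minimal NumPy sketch. It is not the implementation used in this study: the function names, the random sampling directions, the order N=3, and the synthetic order-limited test function are assumptions made purely for illustration, and the associated Legendre functions are evaluated with the standard three-term recursion.

```python
import numpy as np
from math import factorial

def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n, Condon-Shortley phase included."""
    p_prev = np.ones_like(x)                      # becomes P_m^m
    s = np.sqrt(np.clip(1.0 - x**2, 0.0, None))
    for i in range(1, m + 1):
        p_prev = -(2 * i - 1) * s * p_prev
    if n == m:
        return p_prev
    p_curr = (2 * m + 1) * x * p_prev             # P_{m+1}^m
    for l in range(m + 2, n + 1):                 # upward recursion in the order
        p_prev, p_curr = p_curr, ((2 * l - 1) * x * p_curr
                                  - (l + m - 1) * p_prev) / (l - m)
    return p_curr

def sh(n, m, theta, phi):
    """Complex SH as in Eq. 2; theta enters via cos(theta), phi via exp(i m phi)."""
    ma = abs(m)
    norm = np.sqrt((2 * n + 1) / (4 * np.pi) * factorial(n - ma) / factorial(n + ma))
    y = norm * assoc_legendre(n, ma, np.cos(theta)) * np.exp(1j * ma * phi)
    return y if m >= 0 else (-1) ** ma * np.conj(y)

def sh_matrix(N, theta, phi):
    """Matrix Y of Eq. 4 with shape (Q, (N+1)^2)."""
    cols = [sh(n, m, theta, phi) for n in range(N + 1) for m in range(-n, n + 1)]
    return np.stack(cols, axis=1)

def random_directions(Q, rng):
    """Hypothetical sampling scheme: Q random directions, uniform on the sphere."""
    return np.arccos(rng.uniform(-1.0, 1.0, Q)), rng.uniform(0.0, 2 * np.pi, Q)

rng = np.random.default_rng(0)
N = 3
theta_q, phi_q = random_directions(40, rng)       # Q = 40 > (N+1)^2 = 16
f_true = rng.standard_normal((N + 1) ** 2) + 1j * rng.standard_normal((N + 1) ** 2)
Y = sh_matrix(N, theta_q, phi_q)
p = Y @ f_true                                    # samples of an order-limited function

f_nm = np.linalg.pinv(Y) @ p                      # least-squares solution, Eq. 6
theta_i, phi_i = random_directions(200, rng)      # interpolation points
Y_i = sh_matrix(N, theta_i, phi_i)
err_over = np.max(np.abs(Y_i @ f_nm - Y_i @ f_true))

Y_u = Y[:10]                                      # keep Q = 10 < 16: under-determined
f_min = np.linalg.pinv(Y_u) @ p[:10]              # minimum-norm solution, Eq. 7
```

In this sketch the over-determined fit recovers the order-limited function between the samples, whereas the minimum-norm solution reproduces the ten retained samples but has no more wave-spectral power than `f_true`, so it generally differs from the true function between them.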
An indication of the numerical accuracy of SH interpolation based on matrix inversion (Eq. 5) is the condition number κ of the matrix Y for the chosen order N. A large condition number indicates that small changes in the measured sound pressures f could lead to large changes in the Fourier coefficient vector \(\mathbf{f}_{nm}\). The solution of the linear system of equations is then highly sensitive to errors and noise in the input data. While κ=1 is ideal, a system with κ>3.5 is considered ill-conditioned [39]. The condition number depends on the chosen spatial sampling scheme and the SH order N.
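The dependence of κ on the sampling scheme can be made concrete with a small sketch. To keep it short, it assumes real-valued SH up to order N=1 (rather than the complex SH of Eq. 2) and two hypothetical four-microphone layouts: the vertices of a regular tetrahedron and a cluster of directions near one pole.

```python
import numpy as np

def real_sh_order1(U):
    """Q x 4 matrix of real-valued SH up to order N = 1 for unit vectors U (Q x 3)."""
    x, y, z = U.T
    c0 = np.full(len(U), 1.0 / np.sqrt(4 * np.pi))
    c1 = np.sqrt(3.0 / (4 * np.pi))
    return np.column_stack([c0, c1 * y, c1 * z, c1 * x])

# Evenly distributed layout: vertices of a regular tetrahedron
tetra = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], float) / np.sqrt(3)

# Poorly distributed layout: four directions clustered around the north pole
cluster = np.array([[0.2, 0, 1], [0, 0.3, 1], [-0.1, 0.1, 1], [0.1, -0.2, 1]], float)
cluster /= np.linalg.norm(cluster, axis=1, keepdims=True)

kappa_tetra = np.linalg.cond(real_sh_order1(tetra))      # 2-norm condition number
kappa_cluster = np.linalg.cond(real_sh_order1(cluster))
```

For the tetrahedron the columns of the matrix are mutually orthogonal with equal norm, so κ equals 1 up to rounding; the clustered layout yields a condition number well above the threshold of 3.5 quoted above.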
2.2 Thin plate pseudo-spline interpolation
The thin plate pseudo-spline solution [40, 41] allows the regularized interpolation of sparsely distributed measurements on the sphere with closed-form expressions that make this approach well suited for numerical computation. The aim is to find a smooth function f(θ,ϕ) whose values \(f(\theta_{q},\phi_{q})\) are as close as possible to the measured values \(p_{q}\) while containing minimum bending energy on the surface of the sphere S. An interpolating (A) or smoothing (B) thin plate pseudo-spline can therefore be obtained by seeking the solution to one of the following problems:
$$ \min_{f} J_{k}(f) \quad \text{s.t.} \quad f(\theta_{q},\phi_{q}) = p_{q} $$
(8)
for (A) or with the option of regularization
$$ \min_{f} \frac{1}{Q}\sum_{q=1}^{Q}(p_{q}-f(\theta_{q},\phi_{q}))^{2} + \lambda J_{k}(f) $$
(9)
for (B), where λ≥0 denotes the tuning parameter and \(J_{k}(f)\) is defined by
$$ J_{k}(f) = \sum_{n=1 }^{\infty}\sum_{m=-n}^{n} \frac{\check f^{2}_{{nm}}}{\xi_{{nm}}}, $$
(10)
with
$$ \check f_{{nm}}=\int_{S} f(\theta,\phi)Y_{n}^{m}(\theta,\phi) d\theta d\phi $$
(11)
and
$$ \xi_{{nm}}=\left[ (n+\frac{1}{2})(n+1)(n+2)\dotsm(n+2k-1) \right]^{-1}. $$
(12)
A solution of the two problems given by Eqs. 8 and 9 is obtained with
$$ f_{Q,k,\lambda}(\theta,\phi)=\sum_{q=1}^{Q}c_{q} R(\theta,\phi;\theta_{q},\phi_{q})+d. $$
(13)
Here, \(R(\theta,\phi;\theta_{q},\phi_{q})\) is the reproducing kernel for the Hilbert space \(\mathscr {H}_{k}^{0} (S)\) with norm \(J_{k}^{1/2} (\cdot)\):
$$ \begin{aligned} R(\theta,\phi;\theta_{q},\phi_{q})&=\sum_{n=1}^{\infty}\sum_{m=-n}^{n}\xi_{{nm}}Y_{n}^{m}(\theta,\phi)Y_{n}^{m}(\theta_{q},\phi_{q})\\ &=\frac{1}{2\pi}\sum_{n=1}^{\infty} \frac{1}{(n+1)(n+2) \dotsm (n+2k-1)}P_{n}(z), \end{aligned} $$
(14)
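The step between the two lines of Eq. 14 is not spelled out. As a brief sketch, assuming the SH addition theorem in its usual form (for complex-valued SH the second factor is conjugated) together with Eq. 12, one obtains
$$ \sum_{m=-n}^{n}Y_{n}^{m}(\theta,\phi)Y_{n}^{m}(\theta_{q},\phi_{q})=\frac{2n+1}{4\pi}P_{n}(z), \qquad \xi_{{nm}}\,\frac{2n+1}{4\pi}=\frac{1}{2\pi}\,\frac{1}{(n+1)(n+2)\dotsm(n+2k-1)}, $$
since the factor \((n+\frac{1}{2})\) in \(\xi_{{nm}}^{-1}\) cancels against \(2n+1\).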
where \(P_{n}\) are the Legendre polynomials and z denotes the cosine of the spherical angle γ between the two arguments of the kernel function with
$$ z = \cos\gamma = \cos(\theta) \cos(\theta_{q}) + \sin(\theta) \sin(\theta_{q}) \cos(\phi-\phi_{q}). $$
(15)
The spline order \(M \in \mathbb {N}\) determines the differentiability of the solution from Eq. 13. We define the spline order as M=2k−2; the corresponding splines are continuous up to the (M−1)th derivative and are therefore called \(C^{M-1}\) smooth.
A closed-form expression for the reproducing kernel \(R(\theta,\phi;\theta_{q},\phi_{q})\), suitable for numerical computation, is given by
$$ R(\theta,\phi;\theta_{q},\phi_{q}) = \frac{1}{2\pi}\left[ \frac{1}{(2k-2)!}q_{2k-2}(z)-\frac{1}{(2k-1)!} \right], $$
(16)
with
$$ q_{2k-2}(z)=\int_{0}^{1}(1-h)^{2k-2}(1-2hz+h^{2})^{-1/2} dh $$
(17)
and 2k−2=M.
A recursive evaluation of \(q_{2k-2}(z)\) for \(k \in \left \{\frac {3}{2}, 2,\frac {5}{2},...,6\right \}\) can be found in ([40, 41], Tab. 1), as well as the determination of the coefficients c and d from Eq. 13 in matrix form (see Footnote 1):
$$ \left[\begin{array}{c} \mathbf{c} \\ d \end{array}\right] = \left[\begin{array}{cc} \mathbf{R}_{Q} + Q\lambda \mathbf{I}\ & \mathbf{T}\\ \mathbf{T}^{T}& 0 \end{array}\right]^{-1} \left[\begin{array}{c} \mathbf{f} \\ 0 \end{array}\right], $$
(18)
where \(\mathbf{R}_{Q}\) is the Q×Q matrix with the element i,j defined as \((\mathbf{R}_{Q})_{i,j}=R(\theta_{i},\phi_{i};\theta_{j},\phi_{j})\), I is the Q×Q identity matrix, the vector \(\mathbf {f} = [p_{1},\dots,p_{Q}]^{T}\) contains the Q sound pressure measurements at the positions \((\theta_{q},\phi_{q})\) for \(q \in \mathbb {N}_{Q}\), and \(\mathbf {T} = [1,\dots,1]^{T}\).
If the measured values are noisy, it can be advantageous to regularize the interpolation in order to suppress outliers; a tuning parameter λ>0 smooths the estimated function f on the surface of the sphere. Because the measurement data used for this study are low in noise, smoothing of the estimated function did not improve the quality of the interpolation (cf. Section 5); the thin plate pseudo-splines were therefore computed without regularization (λ=0).
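Eqs. 13 and 16 to 18 can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the implementation used in this study: the integral of Eq. 17 is evaluated by Gauss-Legendre quadrature instead of the recursive closed forms of ([40, 41], Tab. 1), z is computed as the dot product of unit direction vectors, the data are taken to be real-valued, and all function names and the synthetic data are hypothetical.

```python
import numpy as np
from math import factorial

# Gauss-Legendre nodes and weights, mapped from [-1, 1] to [0, 1]
H, W = np.polynomial.legendre.leggauss(200)
H, W = 0.5 * (H + 1.0), 0.5 * W

def kernel(z, M=2):
    """Reproducing kernel of Eq. 16 for spline order M = 2k - 2."""
    z = np.clip(np.asarray(z, dtype=float), -1.0, 1.0)[..., None]
    integrand = (1.0 - H) ** M / np.sqrt(1.0 - 2.0 * H * z + H ** 2)
    q = integrand @ W                                   # q_{2k-2}(z), Eq. 17
    return (q / factorial(M) - 1.0 / factorial(M + 1)) / (2.0 * np.pi)

def fit_spline(U, p, M=2, lam=0.0):
    """Solve the block system of Eq. 18 for c and d; U holds unit vectors (Q x 3)."""
    Q = len(p)
    A = np.zeros((Q + 1, Q + 1))
    A[:Q, :Q] = kernel(U @ U.T, M) + Q * lam * np.eye(Q)
    A[:Q, Q] = 1.0                                      # column T
    A[Q, :Q] = 1.0                                      # row T^T
    sol = np.linalg.solve(A, np.append(p, 0.0))
    return sol[:Q], sol[Q]

def eval_spline(U_new, U, c, d, M=2):
    """Evaluate Eq. 13 at the unit vectors U_new (P x 3)."""
    return kernel(U_new @ U.T, M) @ c + d

rng = np.random.default_rng(0)
U = rng.standard_normal((12, 3))                        # 12 hypothetical microphones
U /= np.linalg.norm(U, axis=1, keepdims=True)
p = 1.0 + 0.5 * U[:, 2]                                 # synthetic measured values

c, d = fit_spline(U, p)                                 # lambda = 0: interpolating spline
p_at_nodes = eval_spline(U, U, c, d)
```

With λ=0 the first block row of Eq. 18 enforces that the spline passes through the samples, while the last row enforces that the coefficients c sum to zero, so that a constant input is carried entirely by d.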
2.3 Piece-wise linear, spherical triangular interpolation
The entire set of Q microphone positions (θq,ϕq) can be equivalently expressed as a 3×Q matrix containing its three-dimensional unit direction vectors
$$ \mathbf{U}= \left[\begin{array}{c} \mathbf{u}_{1},\dots,\mathbf{u}_{Q} \end{array}\right],\qquad \mathbf{u}_{q}= \left[\begin{array}{c} \cos\phi_{q}\,\sin\theta_{q}\\ \sin\phi_{q}\,\sin\theta_{q}\\ \cos\theta_{q} \end{array}\right]. $$
(19)
Using the Quickhull algorithm [42], vertex index triplets \(\mathbf{v}_{l}=[v_{1l},v_{2l},v_{3l}]\) are obtained that describe a set of triangular facets spanning the convex hull of the vertices stored in U.
Any arbitrary unit direction vector u can be represented by the non-negative spherical barycentric/area coordinates \(\mathbf{g}=[g_{1},g_{2},g_{3}]^{T}\) of the vertices \(\mathbf{U}_{l}\) of the lth triangle,
$$\begin{array}{*{20}l} \mathbf{u}&=\mathbf{U}_{l}\;\mathbf{g}, & \mathbf{U}_{l}&= [\mathbf{u}_{v_{1l}},\mathbf{u}_{v_{2l}},\mathbf{u}_{v_{3l}}], \end{array} $$
(20)
$$\begin{array}{*{20}l} \mathbf{g}&=\mathbf{U}_{l}^{-1}\,\mathbf{u}, \end{array} $$
(21)
where \(g_{i}\geq 0\) and \(\sum _{i} g_{i}\geq 1\). Note that the required non-negative spherical barycentric coordinates are only found if a suitable spherical triangle l is selected from the convex hull, namely the one that contains u. While the spherical barycentric coordinates g reproduce the direction u, spherical triangular interpolation uses the corresponding planar barycentric coordinates \(\tilde g_{i}=\frac {g_{i}}{\sum _{j} g_{j}}\) [27] to linearly interpolate the values measured at the microphones of the triangle l by their weighted average,
$$\begin{array}{*{20}l} f(\mathbf{u})=\tilde g_{1}\,p_{v_{1l}}+\tilde g_{2}\, p_{v_{2l}}+\tilde g_{3}\,p_{v_{3l}}. \end{array} $$
(22)
At the boundaries, this interpolation exactly reproduces the values at the triangle vertices and linearly interpolates the value pairs along any edge of the lth triangle. Because neighboring triangles share edges and vertices, interpolation across triangles is continuous. There is no condition on the first-order derivatives; this interpolation is therefore \(C^{0}\) smooth.
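Eqs. 20 to 22 can be sketched as follows. To keep the example free of external dependencies, it assumes six hypothetical microphones at the vertices of an octahedron and writes out the eight hull facets by hand; in practice a Quickhull implementation such as `scipy.spatial.ConvexHull` could supply the index triplets. The measured values and the function name `tri_interp` are made up for illustration.

```python
import numpy as np

# Octahedron vertices as columns of U (3 x Q), Q = 6 hypothetical microphones
U = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
              [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float).T
# The 8 hull facets, hard-coded here instead of being computed by Quickhull
faces = [(a, b, c) for a in (0, 1) for b in (2, 3) for c in (4, 5)]
p = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])      # made-up measured values

def tri_interp(u, U, faces, p, tol=1e-12):
    """Piece-wise linear spherical triangular interpolation at direction u."""
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)
    for v in faces:
        idx = list(v)
        g = np.linalg.solve(U[:, idx], u)          # spherical coordinates, Eq. 21
        if np.all(g >= -tol):                      # triangle l contains u
            g_planar = g / g.sum()                 # planar barycentric coordinates
            return float(g_planar @ p[idx])        # weighted average, Eq. 22
    raise ValueError("no enclosing triangle found")

f_vertex = tri_interp([1, 0, 0], U, faces, p)      # at microphone 0
f_edge = tri_interp([1, 1, 0], U, faces, p)        # midway between microphones 0 and 2
f_inside = tri_interp([1, 1, 1], U, faces, p)      # centre of the facet (0, 2, 4)
```

The vertex query returns the measured value itself, the edge query the mean of the two adjacent microphones, and the facet-centre query the mean of all three, in line with the continuity argument above.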
2.4 Robustness and bias
Robustness is often measured by observing the range of amplifications that stochastic perturbations linearly superimposed with the input data can undergo. Due to linearity, it is insightful and common practice to observe changes that uncorrelated Gaussian noise as the only input \(\mathbf {f}=\mathcal {N}\) undergoes, which we adopt to analyze the robustness of the three above-mentioned interpolation methods. We consider the 32 nodes of a pentakis dodecahedron as directional sampling for the input data f, which is interpolated using the 2520 nodes of a Chebyshev-type quadrature [43], yielding the 2520 output values \(\tilde {\mathbf {f}}\).
Figure 2 shows a statistical analysis of the two ratios \(\frac {\text {RMS}\{\tilde {\mathbf {f}}\}}{\text {RMS}\{\mathbf {f}\}}\) and \(\frac {\max \{\tilde {\mathbf {f}}\}}{\max \{\mathbf {f}\}}\), analyzing these ratios for 1000 independent instances of a random input vector f.
Regarding changes between RMS values from input to output, we observe that SH methods for N={7,8}, Spline for all M={1,2,3}, and TrI produce a bias towards smaller output RMS values of 2 dB or more for stochastic input. For TrI, this is understandable: within any triangle, three uncorrelated inputs get averaged linearly, so the output RMS is reduced by stochastic instead of additive interference. For SH interpolation with N=4, this reduction only happens sparsely. The implicit minimization of the Euclidean norm for N≥5 minimizes the output RMS value, and therefore causes the observable bias towards lower RMS. This minimization might be optimistically regarded as an increase in robustness between the sampling nodes for N≤6, but it also implies a general decrease in magnitude there, a bias causing dips between the observed samples when interpolating omnidirectional directivities. Spline methods process constant inputs separately; it is therefore reasonable to assume that the observed reduction in output RMS rather reflects increased robustness to stochastic perturbation. For the chosen spatial sampling scheme, all methods appear robust enough to avoid enlarged output RMS values.
As a more critical test, SH-based interpolation exhibits the largest differences between maxima in the interpolated output compared to those in the input, with around ±3 dB for N={4,5,6,7}, while the settings SH 8 and Spline 3 behave reasonably. Strictly speaking, TrI, as a linear interpolation with normalized non-negative weights, cannot produce output maxima that exceed the input maxima, and the same benefit is observed for Spline 1.
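The two ratios analyzed above can be computed for any linear interpolator once it is written as a matrix A that maps the 32 inputs to the 2520 outputs. The sketch below does not reproduce Figure 2: as a loudly labeled assumption, A is a toy stand-in for TrI in which every output is a random convex combination of three randomly chosen inputs, without any actual sampling geometry, and the maxima are taken over absolute values.

```python
import numpy as np

rng = np.random.default_rng(1)
Q, P, trials = 32, 2520, 1000          # inputs, outputs, noise instances

# Toy stand-in for a TrI matrix: three non-negative weights per row, summing to 1
A = np.zeros((P, Q))
for i in range(P):
    idx = rng.choice(Q, size=3, replace=False)
    A[i, idx] = rng.dirichlet(np.ones(3))

def rms(x):
    """Root mean square over the first axis."""
    return np.sqrt(np.mean(np.abs(x) ** 2, axis=0))

F = rng.standard_normal((Q, trials))   # each column is one Gaussian noise input f
F_out = A @ F                          # interpolated outputs, one column per instance

rms_db = 20 * np.log10(rms(F_out) / rms(F))
max_db = 20 * np.log10(np.max(np.abs(F_out), axis=0) / np.max(np.abs(F), axis=0))
```

Because every row of A holds normalized non-negative weights, `max_db` cannot exceed 0 dB, and the averaging of uncorrelated inputs pushes the typical `rms_db` below 0 dB, which mirrors the qualitative behavior described for TrI.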