
Table 5 Haversine distance error performance comparison (task 3—flying UAV, speech sound source)

From: Noise power spectral density scaled SNR response estimation with restricted range search for sound source localisation using unmanned aerial vehicles

Columns Mean to RMSE report the Haversine distance error D (rad). The p value is from a paired-sample t test (ref. best case).

| Method | Mean | Median | Min | Max | 0.25 quartile | 0.75 quartile | RMSE | p value |
|---|---|---|---|---|---|---|---|---|
| Baseline | | | | | | | | |
| GCC-PHAT (max) | 1.143 | 0.9486 | 0.00440 | 3.039 | 0.5444 | 1.680 | 1.362 | 1.81 × 10⁻¹⁷ |
| GCC-PHAT (sum) | 1.375 | 1.2730 | 0.02862 | 3.039 | 0.7456 | 1.920 | 1.582 | 2.11 × 10⁻²⁹ |
| GCC-NONLIN (max) | 1.022 | 0.8480 | 0.00440 | 2.912 | 0.4434 | 1.638 | 1.233 | 2.13 × 10⁻¹¹ |
| GCC-NONLIN (sum) | 1.206 | 1.0604 | 0.02862 | 2.971 | 0.7088 | 1.665 | 1.390 | 8.98 × 10⁻²¹ |
| MVDR (max) | 0.913 | 0.8164 | 0.00708 | 2.993 | 0.3166 | 1.515 | 1.138 | 5.83 × 10⁻⁷ |
| MVDR (sum) | 1.045 | 0.8897 | 0.01019 | 2.501 | 0.5146 | 1.680 | 1.232 | 2.39 × 10⁻¹⁸ |
| DS (max) | 0.881 | 0.7378 | 0.00800 | 3.015 | 0.2789 | 1.361 | 1.141 | 1.27 × 10⁻⁴ |
| DS (sum) | 1.083 | 0.8598 | 0.01492 | 3.024 | 0.4927 | 1.668 | 1.328 | 2.47 × 10⁻¹⁶ |
| DNM (max) | 1.088 | 0.9123 | 0.00437 | 3.015 | 0.5015 | 1.674 | 1.300 | 2.31 × 10⁻¹⁶ |
| DNM (sum) | 1.160 | 0.9798 | 0.03962 | 3.056 | 0.6410 | 1.698 | 1.356 | 1.19 × 10⁻²⁰ |
| w/ [28] T-F mask | | | | | | | | |
| GCC-PHAT (max) | 1.277 | 1.2783 | 0.07181 | 2.776 | 0.6593 | 1.824 | 1.441 | 1.18 × 10⁻³¹ |
| GCC-PHAT (sum) | 1.247 | 1.2153 | 0.07979 | 2.827 | 0.7090 | 1.735 | 1.398 | 1.06 × 10⁻²⁸ |
| w/ [30] T-F mask | | | | | | | | |
| DS (max) | 1.052 | 0.9775 | 0.01399 | 2.540 | 0.4983 | 1.602 | 1.243 | 3.05 × 10⁻¹⁷ |
| DS (sum) | 1.097 | 1.0249 | 0.01038 | 2.540 | 0.5042 | 1.644 | 1.281 | 7.08 × 10⁻²⁰ |
| w/ SNR response scaling | | | | | | | | |
| GCC-PHAT (max) | 1.207 | 1.1717 | 0.07321 | 2.669 | 0.6054 | 1.811 | 1.390 | 5.77 × 10⁻²² |
| GCC-PHAT (sum) | 1.325 | 1.2988 | 0.11448 | 2.989 | 0.7246 | 1.866 | 1.501 | 3.25 × 10⁻³⁰ |
| GCC-NONLIN (max) | 1.170 | 1.1192 | 0.07740 | 2.633 | 0.5953 | 1.711 | 1.347 | 2.85 × 10⁻²¹ |
| GCC-NONLIN (sum) | 1.250 | 1.2045 | 0.06879 | 2.956 | 0.7501 | 1.761 | 1.404 | 1.75 × 10⁻²⁹ |
| MVDR (max) | 1.045 | 0.9847 | 0.02859 | 2.472 | 0.5349 | 1.596 | 1.204 | 2.97 × 10⁻¹⁶ |
| MVDR (sum) | 1.103 | 1.0282 | 0.02711 | 2.729 | 0.6316 | 1.594 | 1.253 | 1.22 × 10⁻²² |
| DS (max) | 0.999 | 0.8668 | 0.03225 | 2.583 | 0.4278 | 1.626 | 1.198 | 1.15 × 10⁻¹¹ |
| DS (sum) | 1.115 | 1.0375 | 0.01969 | 2.991 | 0.5803 | 1.722 | 1.292 | 4.33 × 10⁻²¹ |
| DNM (max) | 1.174 | 1.2215 | 0.02945 | 2.613 | 0.6012 | 1.728 | 1.345 | 2.37 × 10⁻²³ |
| DNM (sum) | 1.160 | 1.0620 | 0.04191 | 2.740 | 0.6189 | 1.724 | 1.323 | 5.48 × 10⁻²⁴ |
| w/ RPSL post-processing | | | | | | | | |
| GCC-PHAT (max) | 1.129 | 1.0451 | 0.03077 | 2.691 | 0.6323 | 1.547 | 1.282 | 3.22 × 10⁻²² |
| GCC-PHAT (sum) | 1.294 | 1.1847 | 0.03292 | 2.877 | 0.7323 | 1.769 | 1.458 | 1.44 × 10⁻³¹ |
| GCC-NONLIN (max) | 1.083 | 1.0325 | 0.00474 | 2.543 | 0.5527 | 1.568 | 1.265 | 1.66 × 10⁻¹⁷ |
| GCC-NONLIN (sum) | 1.093 | 1.0342 | 0.00901 | 2.644 | 0.6126 | 1.505 | 1.251 | 4.22 × 10⁻²⁰ |
| MVDR (max) | 0.826 | 0.6226 | 0.02069 | 2.382 | 0.2362 | 1.476 | 1.063 | 3.07 × 10⁻⁵ |
| MVDR (sum) | 0.864 | 0.6437 | 0.02214 | 2.250 | 0.3550 | 1.583 | 1.078 | 4.67 × 10⁻⁸ |
| DS (max) | 0.706 | 0.4435 | 0.02207 | 2.424 | 0.1956 | 1.176 | 0.962 | n.s. |
| DS (sum) | 0.850 | 0.6395 | 0.02145 | 2.501 | 0.3336 | 1.325 | 1.071 | 1.28 × 10⁻⁶ |
| DNM (max) | 0.982 | 0.8109 | 0.01968 | 2.444 | 0.4646 | 1.441 | 1.167 | 1.33 × 10⁻¹⁴ |
| DNM (sum) | 0.980 | 0.7641 | 0.03962 | 2.551 | 0.4765 | 1.434 | 1.169 | 1.81 × 10⁻¹⁵ |
| w/ [28] T-F mask + RPSL post-processing | | | | | | | | |
| GCC-PHAT (max) | 1.167 | 1.0996 | 0.09304 | 2.617 | 0.6643 | 1.607 | 1.316 | 8.90 × 10⁻²⁶ |
| GCC-PHAT (sum) | 1.285 | 1.1122 | 0.04744 | 2.912 | 0.6868 | 1.751 | 1.473 | 4.50 × 10⁻³⁰ |
| w/ [30] T-F mask + RPSL post-processing | | | | | | | | |
| DS (max) (best case) | 0.684 | 0.4362 | 0.00038 | 2.593 | 0.1827 | 0.937 | 0.951 | N/A |
| DS (sum) | 0.786 | 0.5264 | 0.02560 | 2.452 | 0.2546 | 1.288 | 1.015 | 2.66 × 10⁻³ |
| w/ SNR response scaling + RPSL post-processing | | | | | | | | |
| GCC-PHAT (max) | 1.067 | 0.9980 | 0.07649 | 2.852 | 0.5852 | 1.515 | 1.241 | 5.90 × 10⁻¹⁸ |
| GCC-PHAT (sum) | 1.202 | 1.1322 | 0.06775 | 2.945 | 0.6837 | 1.696 | 1.373 | 4.18 × 10⁻²³ |
| GCC-NONLIN (max) | 0.893 | 0.6941 | 0.01799 | 2.408 | 0.4562 | 1.208 | 1.082 | 1.40 × 10⁻⁷ |
| GCC-NONLIN (sum) | 1.066 | 0.9414 | 0.07722 | 2.510 | 0.5493 | 1.565 | 1.236 | 2.76 × 10⁻¹⁹ |
| MVDR (max) | 0.770 | 0.5080 | 0.01199 | 2.530 | 0.2716 | 1.207 | 0.996 | n.s. |
| MVDR (sum) | 0.984 | 0.8222 | 0.03901 | 2.297 | 0.4840 | 1.558 | 1.167 | 2.44 × 10⁻¹⁵ |
| DS (max) | 0.759 | 0.5406 | 0.01738 | 2.344 | 0.2379 | 1.268 | 0.996 | n.s. |
| DS (sum) | 0.753 | 0.5300 | 0.02859 | 2.311 | 0.2693 | 1.094 | 0.978 | n.s. |
| DNM (max) | 0.996 | 0.8491 | 0.04045 | 2.451 | 0.4684 | 1.520 | 1.189 | 9.52 × 10⁻¹⁵ |
| DNM (sum) | 0.957 | 0.8381 | 0.05366 | 2.303 | 0.4342 | 1.371 | 1.142 | 1.19 × 10⁻¹² |
  1. Results for the baseline methods are presented first, followed by results using the T-F masks from [28] and [30] and the proposed method (SNR response scaling and RPSL). The best-performing values in each category are highlighted in bold.
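
For readers who want to reproduce summary columns of this kind, the following is a minimal sketch of how a Haversine distance error in radians and its descriptive statistics might be computed. It assumes that directions are given as azimuth and elevation angles in radians and that D is the central angle between the estimated and true directions. The table itself does not state the article's exact definition of D or its quartile convention, so both are assumptions here, and the function names `haversine_angle` and `summarise` are hypothetical.

```python
import math
import statistics


def haversine_angle(az1, el1, az2, el2):
    """Central angle in radians between two directions (azimuth, elevation in radians)."""
    d_az = az2 - az1
    d_el = el2 - el1
    h = math.sin(d_el / 2) ** 2 + math.cos(el1) * math.cos(el2) * math.sin(d_az / 2) ** 2
    # Clamp against rounding so asin never receives a value slightly above 1.
    return 2 * math.asin(min(1.0, math.sqrt(h)))


def summarise(errors):
    """Descriptive statistics matching the table's columns (quartile method is an assumption)."""
    errors = list(errors)
    q25, _, q75 = statistics.quantiles(errors, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(errors),
        "median": statistics.median(errors),
        "min": min(errors),
        "max": max(errors),
        "q25": q25,
        "q75": q75,
        "rmse": math.sqrt(statistics.fmean([e * e for e in errors])),
    }


if __name__ == "__main__":
    # Toy example: true source straight ahead, three hypothetical estimates.
    estimates = [(0.1, 0.0), (0.5, 0.2), (math.pi / 2, 0.0)]
    errors = [haversine_angle(az, el, 0.0, 0.0) for az, el in estimates]
    print(summarise(errors))
```

Under this definition the error is bounded by π rad, which is consistent with the maxima of roughly 3 rad reported in the table. The paired-sample t test in the last column is not sketched here.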