
Table 5 Haversine distance error performance comparison (task 3—flying UAV, speech sound source)

From: Noise power spectral density scaled SNR response estimation with restricted range search for sound source localisation using unmanned aerial vehicles

 

Columns Mean through RMSE report the Haversine distance error D (rad). The p value is from a paired-sample t test against the best case.

| Method | Mean | Median | Min | Max | 0.25 quartile | 0.75 quartile | RMSE | p value |
|---|---|---|---|---|---|---|---|---|
| Baseline | | | | | | | | |
| GCC-PHAT (max) | 1.143 | 0.9486 | 0.00440 | 3.039 | 0.5444 | 1.680 | 1.362 | 1.81 × 10^-17 |
| GCC-PHAT (sum) | 1.375 | 1.2730 | 0.02862 | 3.039 | 0.7456 | 1.920 | 1.582 | 2.11 × 10^-29 |
| GCC-NONLIN (max) | 1.022 | 0.8480 | 0.00440 | 2.912 | 0.4434 | 1.638 | 1.233 | 2.13 × 10^-11 |
| GCC-NONLIN (sum) | 1.206 | 1.0604 | 0.02862 | 2.971 | 0.7088 | 1.665 | 1.390 | 8.98 × 10^-21 |
| MVDR (max) | 0.913 | 0.8164 | 0.00708 | 2.993 | 0.3166 | 1.515 | 1.138 | 5.83 × 10^-7 |
| MVDR (sum) | 1.045 | 0.8897 | 0.01019 | 2.501 | 0.5146 | 1.680 | 1.232 | 2.39 × 10^-18 |
| DS (max) | 0.881 | 0.7378 | 0.00800 | 3.015 | 0.2789 | 1.361 | 1.141 | 1.27 × 10^-4 |
| DS (sum) | 1.083 | 0.8598 | 0.01492 | 3.024 | 0.4927 | 1.668 | 1.328 | 2.47 × 10^-16 |
| DNM (max) | 1.088 | 0.9123 | 0.00437 | 3.015 | 0.5015 | 1.674 | 1.300 | 2.31 × 10^-16 |
| DNM (sum) | 1.160 | 0.9798 | 0.03962 | 3.056 | 0.6410 | 1.698 | 1.356 | 1.19 × 10^-20 |
| w/ [28] T-F mask | | | | | | | | |
| GCC-PHAT (max) | 1.277 | 1.2783 | 0.07181 | 2.776 | 0.6593 | 1.824 | 1.441 | 1.18 × 10^-31 |
| GCC-PHAT (sum) | 1.247 | 1.2153 | 0.07979 | 2.827 | 0.7090 | 1.735 | 1.398 | 1.06 × 10^-28 |
| w/ [30] T-F mask | | | | | | | | |
| DS (max) | 1.052 | 0.9775 | 0.01399 | 2.540 | 0.4983 | 1.602 | 1.243 | 3.05 × 10^-17 |
| DS (sum) | 1.097 | 1.0249 | 0.01038 | 2.540 | 0.5042 | 1.644 | 1.281 | 7.08 × 10^-20 |
| w/ SNR response scaling | | | | | | | | |
| GCC-PHAT (max) | 1.207 | 1.1717 | 0.07321 | 2.669 | 0.6054 | 1.811 | 1.390 | 5.77 × 10^-22 |
| GCC-PHAT (sum) | 1.325 | 1.2988 | 0.11448 | 2.989 | 0.7246 | 1.866 | 1.501 | 3.25 × 10^-30 |
| GCC-NONLIN (max) | 1.170 | 1.1192 | 0.07740 | 2.633 | 0.5953 | 1.711 | 1.347 | 2.85 × 10^-21 |
| GCC-NONLIN (sum) | 1.250 | 1.2045 | 0.06879 | 2.956 | 0.7501 | 1.761 | 1.404 | 1.75 × 10^-29 |
| MVDR (max) | 1.045 | 0.9847 | 0.02859 | 2.472 | 0.5349 | 1.596 | 1.204 | 2.97 × 10^-16 |
| MVDR (sum) | 1.103 | 1.0282 | 0.02711 | 2.729 | 0.6316 | 1.594 | 1.253 | 1.22 × 10^-22 |
| DS (max) | 0.999 | 0.8668 | 0.03225 | 2.583 | 0.4278 | 1.626 | 1.198 | 1.15 × 10^-11 |
| DS (sum) | 1.115 | 1.0375 | 0.01969 | 2.991 | 0.5803 | 1.722 | 1.292 | 4.33 × 10^-21 |
| DNM (max) | 1.174 | 1.2215 | 0.02945 | 2.613 | 0.6012 | 1.728 | 1.345 | 2.37 × 10^-23 |
| DNM (sum) | 1.160 | 1.0620 | 0.04191 | 2.740 | 0.6189 | 1.724 | 1.323 | 5.48 × 10^-24 |
| w/ RPSL post-processing | | | | | | | | |
| GCC-PHAT (max) | 1.129 | 1.0451 | 0.03077 | 2.691 | 0.6323 | 1.547 | 1.282 | 3.22 × 10^-22 |
| GCC-PHAT (sum) | 1.294 | 1.1847 | 0.03292 | 2.877 | 0.7323 | 1.769 | 1.458 | 1.44 × 10^-31 |
| GCC-NONLIN (max) | 1.083 | 1.0325 | 0.00474 | 2.543 | 0.5527 | 1.568 | 1.265 | 1.66 × 10^-17 |
| GCC-NONLIN (sum) | 1.093 | 1.0342 | 0.00901 | 2.644 | 0.6126 | 1.505 | 1.251 | 4.22 × 10^-20 |
| MVDR (max) | 0.826 | 0.6226 | 0.02069 | 2.382 | 0.2362 | 1.476 | 1.063 | 3.07 × 10^-5 |
| MVDR (sum) | 0.864 | 0.6437 | 0.02214 | 2.250 | 0.3550 | 1.583 | 1.078 | 4.67 × 10^-8 |
| DS (max) | 0.706 | 0.4435 | 0.02207 | 2.424 | 0.1956 | 1.176 | 0.962 | n.s. |
| DS (sum) | 0.850 | 0.6395 | 0.02145 | 2.501 | 0.3336 | 1.325 | 1.071 | 1.28 × 10^-6 |
| DNM (max) | 0.982 | 0.8109 | 0.01968 | 2.444 | 0.4646 | 1.441 | 1.167 | 1.33 × 10^-14 |
| DNM (sum) | 0.980 | 0.7641 | 0.03962 | 2.551 | 0.4765 | 1.434 | 1.169 | 1.81 × 10^-15 |
| w/ [28] T-F mask + RPSL post-processing | | | | | | | | |
| GCC-PHAT (max) | 1.167 | 1.0996 | 0.09304 | 2.617 | 0.6643 | 1.607 | 1.316 | 8.90 × 10^-26 |
| GCC-PHAT (sum) | 1.285 | 1.1122 | 0.04744 | 2.912 | 0.6868 | 1.751 | 1.473 | 4.50 × 10^-30 |
| w/ [30] T-F mask + RPSL post-processing | | | | | | | | |
| DS (max) (best case) | 0.684 | 0.4362 | 0.00038 | 2.593 | 0.1827 | 0.937 | 0.951 | N/A |
| DS (sum) | 0.786 | 0.5264 | 0.02560 | 2.452 | 0.2546 | 1.288 | 1.015 | 2.66 × 10^-3 |
| w/ SNR response scaling + RPSL post-processing | | | | | | | | |
| GCC-PHAT (max) | 1.067 | 0.9980 | 0.07649 | 2.852 | 0.5852 | 1.515 | 1.241 | 5.90 × 10^-18 |
| GCC-PHAT (sum) | 1.202 | 1.1322 | 0.06775 | 2.945 | 0.6837 | 1.696 | 1.373 | 4.18 × 10^-23 |
| GCC-NONLIN (max) | 0.893 | 0.6941 | 0.01799 | 2.408 | 0.4562 | 1.208 | 1.082 | 1.40 × 10^-7 |
| GCC-NONLIN (sum) | 1.066 | 0.9414 | 0.07722 | 2.510 | 0.5493 | 1.565 | 1.236 | 2.76 × 10^-19 |
| MVDR (max) | 0.770 | 0.5080 | 0.01199 | 2.530 | 0.2716 | 1.207 | 0.996 | n.s. |
| MVDR (sum) | 0.984 | 0.8222 | 0.03901 | 2.297 | 0.4840 | 1.558 | 1.167 | 2.44 × 10^-15 |
| DS (max) | 0.759 | 0.5406 | 0.01738 | 2.344 | 0.2379 | 1.268 | 0.996 | n.s. |
| DS (sum) | 0.753 | 0.5300 | 0.02859 | 2.311 | 0.2693 | 1.094 | 0.978 | n.s. |
| DNM (max) | 0.996 | 0.8491 | 0.04045 | 2.451 | 0.4684 | 1.520 | 1.189 | 9.52 × 10^-15 |
| DNM (sum) | 0.957 | 0.8381 | 0.05366 | 2.303 | 0.4342 | 1.371 | 1.142 | 1.19 × 10^-12 |

Note: Results from the baseline method are presented first, followed by results using the T-F masks from [28] and [30] and the proposed method (SNR response scaling and RPSL). The best-performing values for each category are highlighted in bold in the original table.
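
For readers unfamiliar with the error metric, the following is a minimal Python sketch of how a Haversine distance error in radians and the summary statistics listed in the table could be computed. It is an illustration under stated assumptions, not the authors' implementation: the function names `haversine_error` and `summarise` are hypothetical, directions are assumed to be given as azimuth and elevation angles in radians, and the quartile convention (linear interpolation, endpoints included) is an assumption because the table does not state which one was used.

```python
import math
import statistics


def haversine_error(az_est, el_est, az_true, el_true):
    """Great-circle angle (rad) between two directions on the unit sphere.

    Assumption: azimuth plays the role of longitude and elevation the
    role of latitude, both in radians.
    """
    d_el = el_est - el_true
    d_az = az_est - az_true
    h = (math.sin(d_el / 2) ** 2
         + math.cos(el_est) * math.cos(el_true) * math.sin(d_az / 2) ** 2)
    # Clamp against floating-point rounding before sqrt and asin.
    h = min(1.0, max(0.0, h))
    return 2 * math.asin(math.sqrt(h))


def summarise(errors):
    """Summary statistics matching the table's column names."""
    errors = list(errors)
    # Assumed quartile convention: inclusive linear interpolation.
    q25, _, q75 = statistics.quantiles(errors, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(errors),
        "median": statistics.median(errors),
        "min": min(errors),
        "max": max(errors),
        "q25": q25,
        "q75": q75,
        "rmse": math.sqrt(statistics.fmean([e * e for e in errors])),
    }
```

The error is bounded by π rad (opposite directions), which is consistent with the maxima in the table staying just above 3 rad. A per-frame list of such errors passed to `summarise` would yield one row of statistics. The p value column would additionally require a paired test against the best-case errors, which this sketch does not cover.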