Estimating the first and second derivatives of discrete audio data

Lewandowski, Marcin

doi:10.1186/s13636-024-00355-5

Methodology
Open access
Published: 18 June 2024

Marcin Lewandowski ORCID: orcid.org/0000-0002-8106-1213¹

EURASIP Journal on Audio, Speech, and Music Processing volume 2024, Article number: 31 (2024)

Abstract

This paper presents a new method for estimating the first and second derivatives of discrete audio signals, intended to achieve higher computational precision in analyzing the performance and characteristics of digital audio systems. The method could find applications in modeling nonlinear audio circuit systems (e.g., for audio synthesis and audio effects), music recognition and classification, time-frequency analysis based on nonstationary audio signal decomposition, audio steganalysis, digital audio authentication, and audio feature extraction. The proposed algorithm employs the ordinary 7-point stencil central-difference formulas with improvements that minimize the round-off and truncation errors. This is achieved by treating the step size of numerical differentiation as a regularization parameter, which acts as a decision threshold in all calculations. This approach requires shifting discrete audio data by fractions of the original sampling period, which is done with fractional delay FIR filters designed with modified 11-term cosine-sum windows for interpolation and shifting of audio signals. The maximum relative errors in estimating the first and second derivatives of discrete audio signals are on the order of $10^{-13}$ and $10^{-10}$, respectively, over the entire audio band, which is close to double-precision floating-point accuracy for the first derivative and better than single-precision floating-point accuracy for the second. Numerical testing showed that the performance of the proposed method is not influenced by the type of signal being differentiated (either stationary or nonstationary), and that the method provides better results than other known differentiation methods in the audio band up to 21 kHz.

1 Introduction

The infinitesimal differential calculus with $h \rightarrow {} 0$ can be obtained only mathematically, for continuous-time and analytically obtainable derivatives. In the case of data measured by digital equipment, infinitesimal differential calculus is no longer valid because of the intrinsic discretization of the data being processed, i.e., sampling the data in time and quantizing its amplitude values. Discrete audio signals are sampled at equal-spaced time intervals, which usually span from $T_s=\frac{1}{8000}$ to $T_s=\frac{1}{384000}$ seconds with fixed-point precision varying from 8 to 24 bits or generated with 32- or 64-bit floating-point precision. Thus, the derivative obtained for a discrete signal is only an estimation of its true value at some point in time. In this paper, a numerical differentiation method is proposed to estimate the first and second derivatives of discrete audio data with no assumptions about the data to be analyzed. The method is intended to minimize numerical errors down to the truncation and round-off errors. Details of the method are presented in Sect. 2, and the method is evaluated and tested in Sect. 3.

1.1 Numerical differentiation of experimental data

Numerical methods for estimating first and second derivatives of discrete data can be classified into two main groups. The first group aims to develop formulas for estimating derivatives numerically without knowledge about the function which generates data points. It includes finite-difference calculation usually obtained by polynomial interpolation, such as quadrature, Lagrange, Legendre, Newton, Chebyshev, Gauss, Hermite, Stirling polynomials [1,2,3,4,5,6,7,8,9,10,11], Taylor series expansion, or method of undetermined coefficients [12,13,14,15,16,17,18,19,20,21,22,23]. All these methods provide the same form of finite-difference coefficients, but their computational time, complexity, and memory storage requirements differ. Moreover, they are very prone to errors due to the noisy, non-exact, and experimental nature of the analyzed signals. Therefore, several methods that use a regularization parameter to be optimized have been proposed. They form the second group of methods and do not offer any explicit differential formula that could be used to calculate the derivative but aim to evaluate data using a function fitted to the data points. Regularization methods provide stable approximation of derivatives [1, 24,25,26,27,28,29], and include Richardson’s extrapolation [1, 3, 9, 30,31,32,33], automatic differentiation [34,35,36,37,38], optimization approach (such as Tikhonov, variational, mollification, and heuristic regularization) [24, 26, 27, 32, 39,40,41,42,43,44,45,46,47,48,49,50,51,52], or smoothing approximations [42, 51, 53,54,55,56,57]. Regularization methods are especially useful in estimating trends in the data but require additional effort to choose regularization or fitting parameters, for example by the L-curve, GCV, or heuristic methods [43, 52, 58,59,60,61]. Therefore, regularization methods require some assumptions about the analyzed signal.
The volume of work focused on numerical differentiation, in general, is quite considerable and growing each year [62, 63]. Numerical differentiation is an elementary and essential tool in applied sciences used for numerical analysis and in system modeling [2, 3, 30, 64,65,66,67,68,69]. Regardless of the increasing number of publications, there is no generally accepted method for carrying out numerical differentiation for all kinds of data. A major reason for this is that numerical differentiation is an ill-posed problem, i.e., small perturbations in the signal may lead to large errors in the computed derivative [42, 68, 70, 71]. It is a problem known for years [24, 72], especially when dealing with experimental data typically corrupted with some kind of noise due to measurement, rounding, truncation, or other processing errors. Until now, no systematic strategy for the selection of the optimum differentiation method for a given practical problem has been proposed [73, 74].

1.2 Numerical differentiation of discrete audio data

Numerical differentiation of discrete audio data finds numerous applications in solving ordinary and partial differential equations (ODEs and PDEs) as a numerical framework for modeling nonlinear audio circuit systems. It is used, for example, in audio synthesis and creating audio effects [75,76,77,78,79] and music recognition and classification [34, 80,81,82]. It is specifically used for time-frequency analysis based on nonstationary audio signal decomposition (methods derived from empirical mode decomposition [83,84,85,86,87]), enhancement of spectral precision in Fourier-based methods [85, 88,89,90,91], audio steganalysis [92], digital audio authentication [93,94,95], acoustic event detection [96,97,98,99], feature extraction based on Mel-frequency cepstral coefficients (MFCC) [100,101,102,103,104,105,106,107,108], speaker and speech identification and recognition, and sound source tracking [107, 109,110,111,112,113,114,115,116,117,118,119,120]. Audio data, such as music or speech, which are nonstationary over time, cannot be described by a mathematical expression. This is particularly a problem with the use of numerical differentiation which employs approximation to the analyzed data. Approximation may be treated as a smoothing operation or searching for a trend line in the data. Selecting a trend line is achieved by specifying a regularization parameter. The selection of algorithms and methods for finding the regularization parameter depends on the given requirements, but finally every regularization procedure compromises between “smoothness” and “roughness” of data estimate [27, 42, 43, 45, 52, 59, 73, 121,122,123,124,125]. Averaging approximation of audio signal in the time domain translates directly into the frequency domain in the form of the attenuation of higher frequencies in the signal. 
Averaging corresponds to changes in the shape, cutoff frequency, and cutoff slope of the low-pass filter resulting from the chosen regularization method and selected parameter. A similar process of removing high-frequency content in the signal occurs in the group of numerical differentiation methods based on finite-difference calculation, which can be regarded as a special case of FIR filters known as differentiators. All these approximation methods try to find the best compromise between cut-off frequency, frequency transition region, and stop-band attenuation with an equivalent filter approach. Although averaging is appropriate in applications in which high-frequency content is regarded as noise, it is not acceptable for many kinds of digital audio signals which hold essential information in rapid changes in the amplitude over time. Consequently, there have been some attempts to make a regularization parameter variable based on the actual form of the signal [124, 126]. Nevertheless, no method can successfully separate audio signal from noise without distorting the signal. The numerical differentiation method proposed in this paper estimates the first and second derivatives of discrete audio data using central-difference formulas for calculations and makes no assumptions about the data being analyzed. The method does not incorporate smoothing of input data and employs additional procedures to minimize numerical errors. The main contribution of the proposed method, presented in detail in Sect. 2, can be summarized as follows:

Instead of smoothing or filtering the high-frequency content of the input signal, the step size h (see Sect. 2 and 3) is used as the regularization parameter to be optimized.
Additional procedures to minimize numerical errors employ fractional delay FIR filters designed with modified cosine-sum windows [127] for shifting and interpolation of audio signals and enhance numerical accuracy in estimation of the derivative.
As it will be shown, the maximum relative errors in the estimation of the first and second derivatives are small for discrete audio signals, respectively, of order $10^{-13}$ and $10^{-10}$ in the entire audio band, which is close either to double-precision or single-precision floating-point accuracy of calculations.

Section 3 shows the experiments designed to verify the performance of the proposed method and gives a comparison to other known methods of numerical differentiation, both by analysis of random input samples and through the differences in resulting transfer function.

2 Proposed method

The derivative of a discrete signal is only an estimation of its true value at some point in time because all calculations are performed with non-exact finite-precision arithmetic. Possible errors may result from simplified assumptions in the mathematical model, discretization error, convergence error, and round-off error (due to the finite-precision of numerical calculations).

The first-order derivative of a general signal f being a function of a variable x calculated at $x_0$ point can be expressed as the discrete finite central-difference formula [128]:

$$\begin{aligned} \hat{f}^{(1)}(x_0) \approx \frac{f(x_0 + h) - f(x_0 - h)}{2h}, \end{aligned}$$

(1)

where the step size h should be kept sufficiently small for an accurate approximation (i.e., to keep the truncation error small). However, decreasing the step size h leads to subtractive cancellation, which increases round-off errors. The challenge is to identify the optimal step size $h_0$ and so avoid conditions in which the round-off error dominates the decreasing truncation error (Fig. 1). The optimal $h_0$ can be found by estimating the round-off and truncation errors associated with (1). Following [9] and truncating all terms of the Taylor series expansion of order greater than 3, these errors can be estimated as:

$$\begin{aligned} \hat{f}^{(1)}(x_0)=\underbrace{\frac{f(x_0 + h) - f(x_0 - h)}{2h}}_{\text {finite central-difference approx.}}+\underbrace{\frac{\text {err}_{x_0+h} - \text {err}_{x_0-h}}{2h}}_{\text {round-off error}}-\underbrace{\frac{f^{(3)}(x_0)}{6}h^2}_{\text {truncation error}}, \end{aligned}$$

(2)

where true values of $f(x_0 \pm h)$ from (1) are represented as a sum of approximations $\hat{f}(x_0 \pm h)$ and round-off errors defined as $\text {err}_{x_0 \pm h}$.

An absolute value of the upper bound of total error can be represented as:

$$\begin{aligned} \left| \hat{f}^{(1)}(x_0) - \frac{f(x_0 + h) - f(x_0 - h)}{2h} \right| \le \frac{2 \cdot \text {eps}}{2h} + \frac{Ph^2}{6}, \end{aligned}$$

(3)

where eps (the precision of floating-point numbers) is taken as the upper bound of the round-off error, and P denotes the maximum value of $f^{(3)}(\xi )$ in (2). The optimal step size $h_0$ can then be determined by differentiating the right-hand side of (3) with respect to h, setting the resulting derivative to zero, and solving for h:

$$\begin{aligned} h_0 \le \root 3 \of {\frac{3 \cdot \text {eps}}{P}}. \end{aligned}$$

(4)

Figure 1 shows that, for the step size $h<h_0$, the relative error of finite central-difference approximation is determined by round-off errors. Otherwise, for $h>h_0$, the truncation error dominates.
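The trade-off described by (1) to (4) can be illustrated with a minimal Python sketch. It assumes a test function $f=\sin$ evaluated at $x_0=1$, for which $P=1$ bounds the third derivative; the function names `central_diff` and `optimal_step` are illustrative and do not come from the paper.

```python
import math
import sys

def central_diff(f, x0, h):
    """Two-point central difference of Eq. (1)."""
    return (f(x0 + h) - f(x0 - h)) / (2.0 * h)

def optimal_step(eps, p):
    """Step size of Eq. (4): cube root of 3*eps/P."""
    return (3.0 * eps / p) ** (1.0 / 3.0)

# Assumed test case: f = sin, so |f'''| <= 1 and P = 1.
eps = sys.float_info.epsilon
x0 = 1.0
exact = math.cos(x0)
h0 = optimal_step(eps, 1.0)

# Large h: truncation error dominates; tiny h: round-off dominates.
for h in (1e-1, 1e-3, h0, 1e-8, 1e-11):
    err = abs(central_diff(math.sin, x0, h) - exact)
    print(f"h = {h:.3e}  abs. error = {err:.3e}")
```

For double precision and $P=1$ the step from (4) is roughly $10^{-5}$, and the printed errors should grow on both sides of it.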

2.1 Derivation and analysis of proposed method

For estimation of the first and second derivatives, the 7-point stencil central-differences approximation following [5] was used. The approximation of the first derivative is given by the formula:

$$\begin{aligned} \hat{f}^{(1)}(x) = \frac{1}{60h} \sum \limits _{k=1}^{N-1 \over 2} c_k \left[ f(x + kh) - f(x - kh) \right] \end{aligned}$$

(5)

for $N=7$ and coefficients $c_k=[45,-9,1]$. The approximation of the second derivative is given by:

$$\begin{aligned} \hat{f}^{(2)}(x) = \frac{c_0 f(x)}{180h^2} + \frac{1}{180h^2} \sum \limits _{k=1}^{N-1 \over 2} c_k \left[ f(x + kh) + f(x - kh) \right] \end{aligned}$$

(6)

for $N=7$, $c_k=[270,-27,2]$, $c_0=-490$, where h is the step size, N is the order, and $c_k$ for $(k=0,...,(N-1)/2)$ are the coefficients of central-difference approximation ($c_0$ occurs only in the case of the second-order derivative).
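A minimal Python sketch of the two 7-point stencils follows. It assumes the signal is available as a callable `f` that can be evaluated at arbitrary points, which sidesteps the interpolation step the paper needs for sampled data. The first-derivative stencil is antisymmetric (paired samples are subtracted), while the second-derivative stencil is symmetric (paired samples are added), which is the standard form for these coefficients. Function names are illustrative.

```python
import math

C1 = (45.0, -9.0, 1.0)     # c_k of Eq. (5), k = 1..3
C2 = (270.0, -27.0, 2.0)   # c_k of Eq. (6), k = 1..3
C2_0 = -490.0              # c_0 of Eq. (6)

def first_derivative(f, x, h):
    """7-point central-difference estimate of f'(x)."""
    s = sum(c * (f(x + k * h) - f(x - k * h))
            for k, c in enumerate(C1, start=1))
    return s / (60.0 * h)

def second_derivative(f, x, h):
    """7-point central-difference estimate of f''(x)."""
    s = sum(c * (f(x + k * h) + f(x - k * h))
            for k, c in enumerate(C2, start=1))
    return (C2_0 * f(x) + s) / (180.0 * h * h)

# Usage: derivatives of sin at x = 0.7 with an assumed step h = 0.01.
print(first_derivative(math.sin, 0.7, 0.01), math.cos(0.7))
print(second_derivative(math.sin, 0.7, 0.01), -math.sin(0.7))
```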

As depicted in Fig. 1, the choice of the step size h is critical for the accuracy of the derivative estimation. For this reason, the derivative estimation defined as $\hat{F}$ was computed for several h values as:

$$\begin{aligned} \hat{F}^{(1,2)}(x) = \left. \hat{f}^{(1,2)}(x) \right| _{h=10^{m} \cdot T_{s}} \end{aligned}$$

(7)

for $m=[2,1,0,-1,-2,-3]$, where $\hat{f}^{(1,2)}(x)$ was calculated for h decreased from $100 \cdot T_{s}$ to $0.001 \cdot T_{s}$ (m decreased from 2 to −3).

Since discrete audio data are sampled at equally spaced intervals $h=T_{s}=1/f_{s}$, the value of h is fixed over time. Calculations performed for $h=0.1 \cdot T_{s}$, $h=0.01 \cdot T_{s}$, and $h=0.001 \cdot T_{s}$ require the components in (5) and (6) to be shifted by the corresponding fractions of the sampling period. The required shifting operation was performed by fractional delay FIR filters designed with the modified 11-term cosine-sum window proposed in [127] and given in (8):

$$\begin{aligned} w_{j} = \sum \limits _{k=0}^{K-1} A_k \cdot \cos \left[ \frac{\pi k \cdot (2j - N + 1 - 2\cdot \delta )}{(N - 1)} \right] \end{aligned}$$

(8)

for $j = 0, ..., N-1$, where N is the filter’s order (number of coefficients), $K = 11$ is the number of cosine-sum terms from [127], and $\delta$ is the fractional shift of the filter. Filter coefficients are calculated by multiplying this window function by the sinc function given in (9):

$$\begin{aligned} sinc_{j} = \left\{ \begin{array}{ll} 2 \cdot \frac{f_c}{f_s} &{} \text {if} \ \ \ \frac{2j - N + 1}{2} - \delta = 0 \\ \frac{\sin \left( \pi \cdot \frac{f_c}{f_s} \left[ (2j - N + 1) - 2\cdot \delta \right] \right) }{\pi \left( \frac{2j - N + 1}{2} - \delta \right) } &{} \text {otherwise} \end{array}\right. \end{aligned}$$

(9)

Figure 2 shows the frequency and phase responses of one of the designed filters. It shifts the input signal by a fraction $\delta = 0.06$ of $T_s$, has 8001 coefficients, and works with $f_s = 44100 \cdot 64\ \text {Hz}$ and a cutoff frequency $f_c = 22050\ \text {Hz}$. The phase-delay plot in the left panel shows the shift of 0.06 of the sampling period, and the right panel shows passband ripples on the order of $10^{-14}$.
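The filter design in (8) and (9) can be sketched in Python as follows. The 11 window coefficients $A_k$ of [127] are not listed in this paper, so the sketch substitutes the two-term Hann coefficients as a stand-in; this is an assumption, and the resulting filter will not reach the ripple level reported for Fig. 2. All function names are illustrative.

```python
import math

# Stand-in A_k (Hann window); the 11-term coefficients of [127] are NOT reproduced here.
HANN = (0.5, 0.5)

def cosine_sum_window(n_taps, delta, a=HANN):
    """Shifted cosine-sum window of Eq. (8)."""
    return [
        sum(a_k * math.cos(math.pi * k * (2 * j - n_taps + 1 - 2.0 * delta)
                           / (n_taps - 1))
            for k, a_k in enumerate(a))
        for j in range(n_taps)
    ]

def shifted_sinc(n_taps, delta, fc, fs):
    """Shifted ideal low-pass impulse response of Eq. (9)."""
    taps = []
    for j in range(n_taps):
        t = (2 * j - n_taps + 1) / 2.0 - delta   # distance from the shifted center
        if t == 0.0:
            taps.append(2.0 * fc / fs)
        else:
            taps.append(math.sin(2.0 * math.pi * fc / fs * t) / (math.pi * t))
    return taps

def fractional_delay_fir(n_taps, delta, fc, fs, a=HANN):
    """Filter coefficients: window of Eq. (8) times sinc of Eq. (9)."""
    w = cosine_sum_window(n_taps, delta, a)
    s = shifted_sinc(n_taps, delta, fc, fs)
    return [wj * sj for wj, sj in zip(w, s)]

# Usage: a short filter shifting by 0.06 of a sample at the oversampled rate.
h = fractional_delay_fir(n_taps=101, delta=0.06, fc=22050.0, fs=44100.0 * 64)
print(len(h), max(h))
```

With $\delta=0$ and $f_c=f_s/2$ the design collapses to a unit impulse at the center tap, which is a convenient sanity check.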

To increase the accuracy in derivative estimation, input data were 64 times oversampled. Maximum relative errors in the estimation of the first and second derivatives by $\hat{F}^{(1)}(x)$ and $\hat{F}^{(2)}(x)$ in (7) with different step sizes ($h=10^{m}\cdot T_{s}$ $\forall$ $m=[2,1,0,-1,-2,-3]$) and 300 sinusoidal input signals of frequencies randomly varying from 1 to 21000 Hz (sampling rate $f_{s}=44100$ Hz) are shown in Figs. 3 and 4. Maximum relative errors were calculated as $max \left| \left( \hat{F}^{(1,2)}(x) - f^{(1,2)}(x)\right) /max \left|f^{(1,2)}(x)\right| \right|$, where $f^{(1,2)} (x)$ were exact derivatives computed analytically.

The results shown in Figs. 3 and 4 reveal that there is no optimal step size h for estimating the derivatives over the whole audio frequency band. The step size h should be increased at lower frequencies and decreased at higher frequencies to achieve the lowest relative error of calculations. There are, however, characteristic points where the maximum relative errors for different step sizes intersect with each other. Knowing the data-dependent optimum points (marked as circles in Figs. 3 and 4), it is possible to derive formulas for estimating the derivatives with the highest possible accuracy (minimum error).
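The frequency dependence of the best step size can be reproduced qualitatively with a small Python sketch. It evaluates an analytic sinusoid at the shifted points directly, so no oversampling or fractional-delay filtering is involved; this simplification is an assumption, and the numbers will therefore differ from Figs. 3 and 4. The helper names are illustrative.

```python
import math

C1 = (45.0, -9.0, 1.0)  # c_k of the 7-point first-derivative stencil, Eq. (5)

def first_derivative(f, x, h):
    s = sum(c * (f(x + k * h) - f(x - k * h))
            for k, c in enumerate(C1, start=1))
    return s / (60.0 * h)

def max_relative_error(freq_hz, m, fs=44100.0, n_points=64):
    """Maximum relative error of the estimate for h = 10**m * Ts."""
    ts = 1.0 / fs
    h = (10.0 ** m) * ts
    w = 2.0 * math.pi * freq_hz
    f = lambda x: math.sin(w * x)
    xs = [n * ts for n in range(n_points)]
    exact = [w * math.cos(w * x) for x in xs]
    num = max(abs(first_derivative(f, x, h) - e) for x, e in zip(xs, exact))
    return num / max(abs(e) for e in exact)

def best_exponent(freq_hz, exponents=(2, 1, 0, -1, -2, -3)):
    """Exponent m with the smallest maximum relative error."""
    return min(exponents, key=lambda m: max_relative_error(freq_hz, m))

# Usage: the preferred exponent drops as the test frequency rises.
for freq in (10.0, 1000.0, 20000.0):
    print(freq, best_exponent(freq))
```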

Considering the step size h as a regularization parameter, the optimum step sizes $h_{07}$ and $h_{08}$ (where 07 and 08 indicate the order of central-difference approximation for step sizes from $10^2 \cdot T_s$ to $T_s$ needed to get the optimum step size) calculated for 9th order derivative approximation were obtained and used as the threshold values to select ranges in derivative estimations by (7) which provide the lowest maximum relative error. For the first derivative estimation, threshold values using the 9-point stencil central-differences approximation were obtained as:

$$\begin{aligned} h_{07}^{10^{m}}(x) = \root 7 \of {\frac{385 \cdot \text {eps}}{9 \cdot \left. \hat{f}^{(7)}(x) \right| _{h=10^{m} \cdot T_{s}}}} \end{aligned}$$

(10)

for $m=[0,1,2]$, where

$$\begin{aligned} \left. \hat{f}^{(7)}(x) \right| _{h=10^{m}\cdot T_{s}} = \frac{1}{2h^7} \sum \limits _{k=1}^{N-1 \over 2} c_k \cdot \left[ f(x + kh) - f(x - kh) \right] \end{aligned}$$

(11)

for $m=[0,1,2]$, $N=9$, $c_{k}=[-14,14,-6,1]$ and for the second derivative estimation as:

$$\begin{aligned} h_{08}^{10^{m}}(x) = \root 10 \of {\frac{77112 \cdot \text {eps}}{99 \cdot \left. \hat{f}^{(8)}(x) \right| _{h=10^{m} \cdot T_{s}}}} \end{aligned}$$

(12)

for $m=[0,1,2]$, where

$$\begin{aligned} \left. \hat{f}^{(8)}(x) \right| _{h=10^{m}\cdot T_{s}} = \frac{c_{0}f(x)}{h^{8}} + \frac{1}{h^8} \sum \limits _{k=1}^{N-1 \over 2} c_k \cdot \left[ f(x + kh) + f(x - kh) \right] \end{aligned}$$

(13)

for $m=[0,1,2]$, $N=9$, $c_{k}=[-56,28,-8,1]$, $c_{0}=70$.
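A minimal Python sketch of the first-derivative threshold in (10) and (11) is given below. It again assumes the signal is a callable that can be evaluated at any point. Taking the magnitude of the seventh-derivative estimate before the root, and returning infinity when the estimate is zero, are assumptions made here so that the root is always defined; the paper does not state how these cases are handled.

```python
import math
import sys

C7 = (-14.0, 14.0, -6.0, 1.0)  # c_k of Eq. (11), k = 1..4

def seventh_derivative(f, x, h):
    """9-point central-difference estimate of the 7th derivative, Eq. (11)."""
    s = sum(c * (f(x + k * h) - f(x - k * h))
            for k, c in enumerate(C7, start=1))
    return s / (2.0 * h ** 7)

def h07(f, x, h, eps=sys.float_info.epsilon):
    """Threshold of Eq. (10); the absolute value is an assumption."""
    d7 = abs(seventh_derivative(f, x, h))
    if d7 == 0.0:
        return math.inf   # assumed convention for a vanishing estimate
    return (385.0 * eps / (9.0 * d7)) ** (1.0 / 7.0)

# Usage: threshold for a 1 kHz sinusoid with h = Ts at 44.1 kHz.
ts = 1.0 / 44100.0
w = 2.0 * math.pi * 1000.0
print(h07(lambda x: math.sin(w * x), 0.0, ts))
```

Because the seventh derivative of a sinusoid grows as $\omega^7$, the threshold shrinks as the signal frequency rises, which is what lets it steer the choice of step size.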

Finally, the first and second derivative estimations $\hat{F}^{(1)}(x)$ and $\hat{F}^{(2)}(x)$ given in (7) were modified and derived as vectors comprising the derivatives $\hat{f}^{(1)}(x)$ and $\hat{f}^{(2)}(x)$ computed with (5) and (6) at specific sample indexes [$x_{1}, ..., x_6$] to provide the lowest maximum relative error. These vectors correspond to the derivatives estimated for step sizes of $h=10^{m} \cdot T_{s}$ $\forall$ $m=[2,1,0,-1,-2,-3]$ and occur in the ranges specified by the thresholds calculated through (10) and (12). Estimations of the first and the second derivatives are formulated by (14) and (15), respectively, and described in Algorithm 1.

Threshold values of the $h_{07}$, $h_{08}$ and the div parameter used in (14) and (15) (empirically determined constant dependent on sampling frequency) allow for switching between derivatives calculated with different step sizes h which results in $10^{-13}$ and $10^{-10}$ accuracy (maximum relative errors) of the approximation in the whole audio band. The maximum relative error of estimating the first and second derivatives using the proposed method through (14) and (15) is shown by ‘x’ markers in Figs. 5 and 6 (which at higher frequencies form a thick bottom line).

$$\begin{aligned} \hat{F}^{(1)}(x) = \left\{ \begin{array}{ll} \left. \hat{f}^{(1)}(x_{1})\right| _{h=T_s} = \left. f^{(1)}(x_{1})\right| _{h=100T_s} &{} \text {for}\ x_{1} = \left\{ x \in \mathbb {N}^+ : h_{07}^{100}(x)> \frac{100h - 10h}{div} \right\}\\ \left. \hat{f}^{(1)}(x_{2})\right| _{h=T_s} = \left. f^{(1)}(x_{2})\right| _{h=10T_s} &{} \text {for}\ x_{2} = \left\{ x \in \mathbb {N}^+ : h_{07}^{100}(x) \le \frac{100h - 10h}{div} \wedge h_{07}^{10}(x)> \frac{10h - h}{div} \right\}\\ \left. \hat{f}^{(1)}(x_{3})\right| _{h=T_s} = \left. f^{(1)}(x_{3})\right| _{h=T_s} &{} \text {for}\ x_{3} = \left\{ x \in \mathbb {N}^+ : h_{07}^{10}(x) \le \frac{10h - h}{div} \wedge h_{07}^{1}(x)> \frac{h - 0.1h}{div} \right\}\\ \left. \hat{f}^{(1)}(x_{4})\right| _{h=T_s} = \left. f^{(1)}(x_{4})\right| _{h=0.1T_s} &{} \text {for}\ x_{4} = \left\{ x \in \mathbb {N}^+ : h_{07}^{1}(x) \le \frac{h - 0.1h}{div} \wedge h_{07}^{0.1}(x)> \frac{0.1h - 0.01h}{div} \right\}\\ \left. \hat{f}^{(1)}(x_{5})\right| _{h=T_s} = \left. f^{(1)}(x_{5})\right| _{h=0.01T_s} &{} \text {for}\ x_{5} = \left\{ x \in \mathbb {N}^+ : h_{07}^{0.1}(x) \le \frac{0.1h - 0.01h}{div} \wedge h_{07}^{0.01}(x) > \frac{0.01h - 0.001h}{div} \right\}\\ \left. \hat{f}^{(1)}(x_{6})\right| _{h=T_s} = \left. f^{(1)}(x_{6})\right| _{h=0.001T_s} &{} \text {for}\ x_{6} = \left\{ x \in \mathbb {N}^+ : h_{07}^{0.01}(x) \le \frac{0.01h - 0.001h}{div} \right\} \end{array}\right. \end{aligned}$$

(14)

$$\begin{aligned} \hat{F}^{(2)}(x) = \left\{ \begin{array}{ll} \left. \hat{f}^{(2)}(x_1)\right| _{h=T_s} = \left. f^{(2)}(x_1)\right| _{h=100T_s} &{} \text {for}\ x_1 = \left\{ x \in \mathbb {N}^+ : h_{08}^{100}(x)> \frac{100h - 10h}{div} \right\}\\ \left. \hat{f}^{(2)}(x_2)\right| _{h=T_s} = \left. f^{(2)}(x_2)\right| _{h=10T_s} &{} \text {for}\ x_2 = \left\{ x \in \mathbb {N}^+ : h_{08}^{100}(x) \le \frac{100h - 10h}{div} \wedge h_{08}^{10}(x)> \frac{10h - h}{div} \right\}\\ \left. \hat{f}^{(2)}(x_3)\right| _{h=T_s} = \left. f^{(2)}(x_3)\right| _{h=T_s} &{} \text {for}\ x_3 = \left\{ x \in \mathbb {N}^+ : h_{08}^{10}(x) \le \frac{10h - h}{div} \wedge h_{08}^{1}(x)> \frac{h - 0.1h}{div} \right\}\\ \left. \hat{f}^{(2)}(x_4)\right| _{h=T_s} = \left. f^{(2)}(x_4)\right| _{h=0.1T_s} &{} \text {for}\ x_4 = \left\{ x \in \mathbb {N}^+ : h_{08}^{1}(x) \le \frac{h - 0.1h}{div} \wedge h_{08}^{0.1}(x)> \frac{0.1h - 0.01h}{div} \right\}\\ \left. \hat{f}^{(2)}(x_5)\right| _{h=T_s} = \left. f^{(2)}(x_5)\right| _{h=0.01T_s} &{} \text {for}\ x_5 = \left\{ x \in \mathbb {N}^+ : h_{08}^{0.1}(x) \le \frac{0.1h - 0.01h}{div} \wedge h_{08}^{0.01}(x) > \frac{0.01h - 0.001h}{div} \right\}\\ \left. \hat{f}^{(2)}(x_6)\right| _{h=T_s} = \left. f^{(2)}(x_6)\right| _{h=0.001T_s} &{} \text {for}\ x_6 = \left\{ x \in \mathbb {N}^+ : h_{08}^{0.01}(x) \le \frac{0.01h - 0.001h}{div} \right\} \end{array}\right. \end{aligned}$$

(15)
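The switching rule in (14) and (15) can be read as a cascade over the step-size exponents, sketched below in Python. Several points are assumptions: the thresholds are passed in as a dictionary keyed by the exponent m, the value of `div` is left to the caller because the paper gives it only as an empirically determined constant, and overlapping conditions are resolved in favor of the largest step size, which the piecewise definition does not state explicitly. Function names are illustrative.

```python
def select_exponent(thresholds, ts, div):
    """Pick the exponent m of h = 10**m * ts for one sample.

    thresholds maps m in {2, 1, 0, -1, -2} to the value of h_07 (or h_08)
    at that sample; ts is the sampling period; div is the empirical constant.
    """
    for m in (2, 1, 0, -1, -2):
        upper = (10.0 ** m) * ts
        lower = (10.0 ** (m - 1)) * ts
        # Row test of Eqs. (14)/(15): threshold above the scaled step gap.
        if thresholds[m] > (upper - lower) / div:
            return m
    return -3  # last row: no threshold exceeded

def switched_estimate(estimates, thresholds, ts, div):
    """Return the derivative value computed with the selected step size.

    estimates maps m in {2, 1, 0, -1, -2, -3} to the value from Eq. (7).
    """
    return estimates[select_exponent(thresholds, ts, div)]

# Usage with made-up numbers (ts = 1, div = 1) purely to show the control flow.
demo_thresholds = {2: 50.0, 1: 10.0, 0: 0.5, -1: 0.05, -2: 0.005}
print(select_exponent(demo_thresholds, 1.0, 1.0))
```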

3 Comparison with other numerical differentiation methods

Since the exact derivatives of real-world audio signals cannot be calculated, experiments based on synthetic data were conducted. Two kinds of experiments with the use of, respectively, stationary and nonstationary signals, were performed. The testing was intended to compare the proposed method with other numerical differentiation methods.

3.1 Experiments with stationary synthetic data

Stationary synthetic data have been generated as a sum of four harmonic signals with randomly selected frequencies and arbitrarily chosen amplitudes as defined by the following formula:

$$\begin{aligned} f_n(x) = 0.5 \cdot \sin (\omega _{1,n}x) + 0.15 \cdot \cos (\omega _{2,n}x) + 0.2 \cdot \sin (\omega _{3,n}x) + 0.15 \cdot \cos (\omega _{4,n}x) \end{aligned}$$

(16)

for $n=[1, ... ,300]$, where $\omega _{1,n}, ... ,\omega _{4,n}$ are randomly selected frequencies from the ranges [1–100], [101–3500], [3501–10000], and [10001–22050] Hz, respectively. The test data consisted of $n=300$ sets of synthetic data generated in this way. To simulate actual signal conditions more realistically, noise was added to the harmonic signal (16). Two ($M=1,2$) realizations of normally distributed noise $\Delta _M$ with standard deviations $\sigma _M=[2.2204 \cdot 10^{-16},0.001]$ were used. The noise was added to the harmonic signal $f_n (x)$ as shown in (17):

$$\begin{aligned} \hat{f}_{M,n}(x) = f_n(x) + \Delta _M, \end{aligned}$$

(17)

where M denotes actual noise used, and n represents the four-sine frequencies set. In all calculations, the sampling frequency $f_s=44100$ Hz was used. The differentiation methods were compared as the error ratio of $\text {SNR}_{1,2,M,n}/\text {SNR}_{0,M,n}$ (where SNR is the signal-to-noise ratio) for $n=300$ data sets and $M=1,2$ noise distributions where $\text {SNR}_{1,2,M,n}$ are signal-to-noise ratios of the estimated first and second derivatives, and $\text {SNR}_{0,M,n}$ is the signal-to-noise ratio of the input signal. The level of errors $\text {SNR}_{0,M,n}$ in the input data has been characterized by the M signal-to-noise ratios, defined as:

$$\begin{aligned} \text {SNR}_{0,M,n}=10 \cdot log{\left( \frac{\sum \nolimits _{l=1}^{L} \left( f_n(x+l)\right) ^2}{\sum \nolimits _{l=1}^{L} \left( \hat{f}_{M,n}(x+l) - f_n(x+l)\right) ^2}\right) }, \end{aligned}$$

(18)

where $L=6000$ is the length of the input data sequence, $f_n$ is the input harmonic data sequence, and $\hat{f}_{M,n}$ is the input harmonic data sequence with $M=1,2$ noise distributions. Signal-to-noise ratios $\text {SNR}_{1,2,M,n}$ of the estimated first and second derivatives have been derived as follows:

$$\begin{aligned} \text {SNR}_{1,2,M,n}=10 \cdot log{\left( \frac{\sum \nolimits _{l=1}^{L} \left( f_{M,n}^{(1,2)}(x+l)\right) ^2}{\sum \nolimits _{l=1}^{L} \left( \hat{F}_{M,n}^{(1,2)}(x+l) - f_{M,n}^{(1,2)}(x+l)\right) ^2}\right) }, \end{aligned}$$

(19)

where $\hat{F}_{M,n}^{(1,2)}$ are estimates of the first and second derivatives and $f_{M,n}^{(1,2)}$ are true derivatives calculated analytically.
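The test-signal construction of (16) and (17) and the signal-to-noise ratio of (18) can be sketched in Python as follows. The sketch assumes that the frequencies are given in Hz and converted to angular frequencies, that the logarithm is base 10 (a ratio in dB), and that a seeded generator from the standard library stands in for the noise source; the function names are illustrative. The same `snr_db` helper applies to (19) when the reference is the analytic derivative and the estimate is the numerical one.

```python
import math
import random

AMPS = (0.5, 0.15, 0.2, 0.15)  # amplitudes of Eq. (16)

def harmonic_signal(freqs_hz, n_samples=6000, fs=44100.0):
    """Four-component harmonic signal of Eq. (16), sampled at fs."""
    w = [2.0 * math.pi * f for f in freqs_hz]   # Hz to rad/s (assumed)
    out = []
    for l in range(n_samples):
        x = l / fs
        out.append(AMPS[0] * math.sin(w[0] * x) + AMPS[1] * math.cos(w[1] * x)
                   + AMPS[2] * math.sin(w[2] * x) + AMPS[3] * math.cos(w[3] * x))
    return out

def add_noise(signal, sigma, rng):
    """Additive zero-mean Gaussian noise, Eq. (17)."""
    return [s + rng.gauss(0.0, sigma) for s in signal]

def snr_db(reference, estimate):
    """Ratio of reference power to error power in dB, as in Eqs. (18) and (19)."""
    p_sig = sum(r * r for r in reference)
    p_err = sum((e - r) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(p_sig / p_err)

# Usage: input SNR for one assumed frequency set and sigma = 0.001.
rng = random.Random(0)
clean = harmonic_signal((50.0, 1000.0, 5000.0, 15000.0))
noisy = add_noise(clean, 0.001, rng)
print(snr_db(clean, noisy))
```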

3.2 Experiments with nonstationary synthetic data

Nonstationary synthetic data were generated by adding the FM signal v(x) to the signal $f_n (x)$ defined by (16) in the following way:

$$\begin{aligned} g_n(x)=f_n(x)+v_n(x) \end{aligned}$$

(20)

for $n=[1, ... , 300]$ in which

$$\begin{aligned} v(x)=\sin \left( \omega _Ax - \omega _Bx \cdot \cos (x)\right) \end{aligned}$$

(21)

where $\omega _A$ and $\omega _B$ were set so that v(x) changes its frequency from 1 to 21000 Hz during one sinusoidal cycle in a data sequence of length $L=6000$. As in the previous experiment, the noise was added to input data in the same way as shown in (17). The differentiation methods were compared as the ratio of $\text {SNR}_{1,2,M,n}/\text {SNR}_{0,M,n}$ for $n=300$ input harmonic data with FM modulation sets and $M=1,2$ noise distributions.
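Since the reference derivatives in these experiments are computed analytically, it may help to write them out for the FM component. The following is a sketch that takes (21) as printed and introduces the shorthand $\varphi (x)$ for the phase, which is not used in the original:

$$\begin{aligned} \varphi (x) &= \omega _A x - \omega _B x \cos (x), \\ \varphi ^{(1)}(x) &= \omega _A - \omega _B \cos (x) + \omega _B x \sin (x), \\ \varphi ^{(2)}(x) &= \omega _B \left[ 2\sin (x) + x\cos (x) \right] , \\ v^{(1)}(x) &= \varphi ^{(1)}(x) \cos \varphi (x), \\ v^{(2)}(x) &= \varphi ^{(2)}(x) \cos \varphi (x) - \left[ \varphi ^{(1)}(x)\right] ^2 \sin \varphi (x). \end{aligned}$$

By linearity, the derivatives of $g_n(x)$ are the sums of these expressions and the term-by-term derivatives of (16).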

3.3 Comparison material

The method proposed in this paper and described in Sect. 2 was compared with the following numerical differentiation methods:

Algorithm for numerical differentiation of discrete functions with an arbitrary degree and order of accuracy with the use of the closed explicit formula presented by H. Z. Hassan et al. in [13]. The algorithm is based on the method of undetermined coefficients and the closed form of the Vandermonde inverse matrix (the method labeled later as “Hassan”). The “Hassan” numerical differentiation has been performed with the order of 8.
MaxPol package written in MATLAB (labeled later as “MaxPol”) which is a comprehensive tool for numerical differentiation. The MaxPol is based on the method of undetermined coefficients to render a variety of FIR kernels in a closed form that can be used to approximate the full-band or low-pass derivatives of discrete single or multidimensional signals (images) [14, 15]. Numerical differentiation was performed with a centralized FIR derivative kernel for the full-band operation.
Ordinary 9-point stencil central-difference formulas (hereinafter referred to as “Central-Diff”).

3.4 Results of experiments

The dependence of error ratio $\text {SNR}_{1,2,M,n}/\text {SNR}_{0,M,n}$ for $n=300$ stationary and nonstationary sets of experimental data, and $M=1,2$ noise distributions, obtained for each of the compared methods are presented in Figs. 7, 8, 9 and 10. Figures 7 and 8 show the results for the stationary signals, for the first and second derivatives, respectively. Correspondingly, Figs. 9 and 10 show the results for the nonstationary signals. The error ratio for the method proposed in this work is shown by the solid line. The “Hassan,” “MaxPol,” and “Central-Diff” methods used for the comparison are shown with the dash-dotted, dotted, and dashed lines, respectively. The sets of experimental data are shown along the abscissa and represent the random selection of input data. The error ratio is shown on the linear scale on the ordinate. The noise level shown for each condition represents the noise added to the input data. The results presented in Figs. 7, 8, 9 and 10 reveal that the proposed method gives stable results which are not prone to changes in the input data sequence, both for stationary and nonstationary cases. Among other methods, only the “MaxPol” method performance is close to that of the proposed method in the estimation of the second derivative for the stationary data. In all other cases, the proposed method results are consistently better in performing the numerical differentiation of either stationary or nonstationary data. The presented results also reveal some general characteristics of the compared methods in the estimation of the first and second derivatives in discrete audio signals:

The “Central-Diff” difference formulas and the “Hassan” method are more computationally efficient than “MaxPol” and the proposed method. Estimation of the first derivative for stationary data gives comparable results for the “Central-Diff” formula and “Hassan” method both for no noise and −80 dB noise conditions (Fig. 7). For the estimation of the second derivative, the “Hassan” method provides slightly better results than the “Central-Diff” formulas (Fig. 8). Similar differences in the performance of these two methods are apparent in the results for nonstationary data (Figs. 9 and 10).
Full-band FIR kernel performance in the MaxPol package is very sensitive to the random selection of samples (Figs. 7 and 8) and thus to the frequency content of the input data. It was found that when the stationary input data contained frequencies above 15000 Hz (see (16)), the performance of the method drastically decreased. The experiments with nonstationary data revealed that the “MaxPol” method produces lower error ratios than the “Central-Diff” and “Hassan” methods. The performance of the “MaxPol” method is also lower as compared to the proposed method, which is especially evident for nonstationary signals (Figs. 9 and 10).
The noise added to the input data improves to some extent the performance of numerical differentiation of all compared methods. The likely reason is that the random error introduced by the added noise to the input harmonic signal decorrelates consecutive samples. This in turn decreases the computational round-off error during the numerical differentiation process.

3.5 Transfer functions

The advantage of the proposed method is seen in differences between transfer functions showing the impact of the frequency content of input signal on the performance of the numerical differentiation. Transfer functions for the proposed method and the three other methods were calculated as a relationship between Fourier transforms of the estimated derivatives and the input signals, for 300 frequencies varying from 20 to 20000 Hz (the resolution was set to 8192 with 512-point Hann window). The left panels in Figs. 11 and 12 show the transfer functions of selected methods, respectively for the first and second derivatives. The performance of the proposed method (shown with the thick solid line) is nearly identical with the ideal differentiator response over the whole audio frequency band for the first and second-order numerical differentiation, which is not the case for other methods. The right panels in Figs. 11 and 12 show the relative error between the transfer function of the ideal differentiator, proposed method, and “MaxPol” method (as the best out of the other methods). In nearly all cases the performance of the proposed method is better than that of the “MaxPol” method. The only exception is the estimation of the second derivative with input signals below 15000 Hz (Fig. 12, right panel) but as it is seen in Fig. 8 only for stationary input signals.
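For orientation, the deviation of a fixed-step stencil from the ideal differentiator can be computed in closed form. The Python sketch below evaluates the magnitude response of the plain 7-point first-derivative stencil of (5) with $h=T_s$ and compares it with the ideal gain $\omega$. This is an analytic response of the unmodified stencil, not the FFT-based measurement used for Figs. 11 and 12, and the function names are illustrative.

```python
import math

C1 = (45.0, -9.0, 1.0)  # c_k of the 7-point first-derivative stencil, Eq. (5)

def stencil_gain(freq_hz, fs=44100.0):
    """Magnitude response of the stencil with h = 1/fs, in 1/s.

    Applying Eq. (5) to exp(j*w*x) gives (j / (30 h)) * sum_k c_k sin(k w h).
    """
    theta = 2.0 * math.pi * freq_hz / fs
    s = sum(c * math.sin(k * theta) for k, c in enumerate(C1, start=1))
    return abs(s) * fs / 30.0

def relative_gain_error(freq_hz, fs=44100.0):
    """Relative deviation from the ideal differentiator gain w = 2*pi*f."""
    ideal = 2.0 * math.pi * freq_hz
    return abs(stencil_gain(freq_hz, fs) - ideal) / ideal

# Usage: the error is negligible at 1 kHz and large near the top of the band.
for freq in (100.0, 1000.0, 10000.0, 20000.0):
    print(freq, relative_gain_error(freq))
```

The growth of this error with frequency is the behavior that a frequency-dependent step size is meant to counteract.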

4 Conclusions

The paper addresses the problem of estimating the first and second derivatives of discrete audio data. Numerical differentiation of discrete audio data has several applications. For example, it is particularly important in the development of numerical solvers for ordinary and partial differential equations (ODEs and PDEs), which are fundamental in modeling audio circuit systems for digital audio effects and synthesizers.

An audio signal is always a complex combination of components spanning four decades in frequency, from a few Hz to tens of thousands of Hz, which are processed by both linear and nonlinear systems with parameters varying over time. Thus, it is not possible to derive an analytical mathematical expression for music or speech signals that could be used to evaluate numerical differentiation methods.

Discrete audio data consist of a sequence of samples that occur in a specific, fixed time order; therefore, every preprocessing operation (such as smoothing or approximation) disrupts the audio data in some way. For this reason, it is important that the proposed method for estimating the first and second derivatives makes no assumptions about the data being analyzed and does not incorporate smoothing or filtering during preprocessing. To achieve the best possible numerical accuracy over the whole audio band, the step size h is treated as a regularization parameter and is made variable based on the input signal frequency range. This was achieved with very precise fractional-delay FIR filters designed for interpolation and shifting of the processed audio data.
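For orientation, the ordinary 7-point central-difference formulas on which the method builds can be sketched in Python as follows. This is only a baseline under stated assumptions: the weights are the standard textbook coefficients, the step h is fixed at one sample period, and the sample rate `fs` and test tone `f0` are hypothetical values. The fractional-delay FIR interpolation that makes h variable in the proposed method is not reproduced here.

```python
import numpy as np

# Standard 7-point central-difference weights for sample offsets -3..+3.
C1 = np.array([-1, 9, -45, 0, 45, -9, 1]) / 60.0          # first derivative
C2 = np.array([2, -27, 270, -490, 270, -27, 2]) / 180.0   # second derivative


def stencil_derivatives(x, h):
    """Return (d1, d2) at the interior samples x[3:-3] for uniform spacing h."""
    d1 = np.correlate(x, C1, mode="valid") / h
    d2 = np.correlate(x, C2, mode="valid") / h**2
    return d1, d2


fs = 48_000.0                       # assumed sample rate in Hz
f0 = 1_000.0                        # hypothetical test-tone frequency in Hz
w = 2.0 * np.pi * f0
t = np.arange(4096) / fs
x = np.sin(w * t)

d1, d2 = stencil_derivatives(x, 1.0 / fs)
ti = t[3:-3]                        # times of the interior samples
err1 = np.max(np.abs(d1 - w * np.cos(w * ti))) / w
err2 = np.max(np.abs(d2 + w**2 * np.sin(w * ti))) / w**2
print(f"first derivative:  max relative error {err1:.2e}")
print(f"second derivative: max relative error {err2:.2e}")
```

With a fixed h, the truncation error of these formulas grows with the signal frequency, while reducing h increases round-off error. This trade-off is the reason for treating h as a frequency-dependent regularization parameter.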

The comparison with three existing numerical differentiation methods showed that the performance of the proposed method is consistently better than that of the other methods, especially in the case of nonstationary discrete audio data. Future research on employing the proposed method for time-domain analysis and modeling of digital audio systems should further investigate how to increase the numerical accuracy of the second-derivative estimate.

Availability of data and materials

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

References

R.L. Burden, J.D. Faires, Numerical analysis, 9th edn. (Brooks/Cole, Cengage Learning, Boston, 2011). OCLC: ocn496962633
S.C. Chapra, R.P. Canale, Numerical methods for engineers, 7th edn. (McGraw-Hill Education, New York, 2015)
G. Dahlquist, A. Björck, Numerical methods in scientific computing (Society for Industrial and Applied Mathematics, Philadelphia, 2008)
T. Dokken, T. Lyche, A divided difference formula for the error in Hermite interpolation. BIT Numer. Math. 19(4), 539–540 (1979). https://doi.org/10.1007/bf01931270
B. Fornberg, Generation of finite difference formulas on arbitrarily spaced grids. Math. Comput. 51(184), 699–706 (1988). https://doi.org/10.1090/s0025-5718-1988-0935077-0
L.P. Grabar, Numerical differentiation by means of Chebyshev polynomials orthonormalized on a system of equidistant points. USSR Comput. Math. Math. Phys. 7(6), 215–220 (1967). https://doi.org/10.1016/0041-5553(67)90127-9
B. Kvasov, Numerical differentiation and integration on the basis of interpolation parabolic splines. Chisl. Metody Mekh. Sploshn. Sredy 14(2), 68–80 (1983)
J. Li, General explicit difference formulas for numerical differentiation. J. Comput. Appl. Math. 183(1), 29–52 (2005). https://doi.org/10.1016/j.cam.2004.12.026
J.H. Mathews, K.D. Fink, Numerical methods using MATLAB, 4th edn. (Pearson, Upper Saddle River, 2004)
I.W. Selesnick, Maximally flat low-pass digital differentiator. IEEE Trans. Circ. Syst. II Analog Digit. Signal Process. 49(3), 219–223 (2002). https://doi.org/10.1109/tcsii.2002.1013869
Y. Zhang, Y. Chou, J. Chen, Z. Zhang, L. Xiao, Presentation, error analysis and numerical experiments on a group of 1-step-ahead numerical differentiation formulas. J. Comput. Appl. Math. 239, 406–414 (2013). https://doi.org/10.1016/j.cam.2012.09.011
C.F. Gerald, P.O. Wheatley, Applied numerical analysis, 7th edn. (Pearson/Addison-Wesley, Boston, 2004)
H.Z. Hassan, A.A. Mohamad, G.E. Atteia, An algorithm for the finite difference approximation of derivatives with arbitrary degree and order of accuracy. J. Comput. Appl. Math. 236(10), 2622–2631 (2012). https://doi.org/10.1016/j.cam.2011.12.019
M.S. Hosseini, K.N. Plataniotis, Derivative kernels: Numerics and applications. IEEE Trans. Image Process. 26(10), 4596–4611 (2017). https://doi.org/10.1109/tip.2017.2713950
M.S. Hosseini, K.N. Plataniotis, Finite differences in forward and inverse imaging problems: MaxPol Design. SIAM J. Imaging Sci. 10(4), 1963–1996 (2017). https://doi.org/10.1137/17m1118452
I.R. Khan, R. Ohba, Closed-form expressions for the finite difference approximations of first and higher derivatives based on Taylor series. J. Comput. Appl. Math. 107(2), 179–193 (1999). https://doi.org/10.1016/s0377-0427(99)00088-6
I.R. Khan, R. Ohba, Digital differentiators based on Taylor series. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E82–a(12), 2822–2824 (1999)
I.R. Khan, R. Ohba, Mathematical proof of explicit formulas for tap-coefficients of Taylor series based FIR digital differentiators. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E84–a(6), 1581–1584 (2001)
I.R. Khan, R. Ohba, N. Hozumi, Mathematical proof of closed form expressions for finite difference approximations based on Taylor series. J. Comput. Appl. Math. 150(2), 303–309 (2003). https://doi.org/10.1016/S0377-0427(02)00667-2
I.R. Khan, R. Ohba, New finite difference formulas for numerical differentiation. J. Comput. Appl. Math. 126(1–2), 269–276 (2000). https://doi.org/10.1016/s0377-0427(99)00358-1
I.R. Khan, R. Ohba, Taylor series based finite difference approximations of higher-degree derivatives. J. Comput. Appl. Math. 154(1), 115–124 (2003). https://doi.org/10.1016/S0377-0427(02)00816-6
T. Moller, R. Machiraju, K. Mueller, R. Yagel, Evaluation and design of filters using a Taylor series expansion. IEEE Trans. Vis. Comput. Graph. 3(2), 184–199 (1997). https://doi.org/10.1109/2945.597800
P. Sylvester, Numerical formation of finite-difference operators (correspondence). IEEE Trans. Microw. Theory Tech. 18(10), 740–743 (1970). https://doi.org/10.1109/tmtt.1970.1127342
R.S. Anderssen, P. Bloomfield, Numerical differentiation procedures for non-exact data. Numer. Math. 22(3), 157–182 (1974). https://doi.org/10.1007/bf01436965
F.B. Hildebrand, Introduction to numerical analysis, 2nd edn. (Dover Publications, New York, 1987)
I. Knowles, R.J. Renka, Methods for numerical differentiation of noisy data. Electron. J. Differ. Equ 21, 235–246 (2014)
S. Lu, S.V. Pereverzev, Numerical differentiation from a viewpoint of regularization theory. Math. Comput. 75(256), 1853–1870 (2006)
F. Nikolovski, I. Stojkovska, Complex-step derivative approximation in noisy environment. J. Comput. Appl. Math. 327, 64–78 (2018). https://doi.org/10.1016/j.cam.2017.05.046
A.G. Ramm, A.B. Smirnova, On stable numerical differentiation. Math. Comput. 70(235), 1131–1153 (2001)
S.C. Chapra, Applied numerical methods with MATLAB for engineers and scientists, 3rd edn. (McGraw-Hill, New York, 2012). OCLC: ocn664665963
C.-C. Tseng, S.-L. Lee, in 2008 IEEE International Symposium on Circuits and Systems. Design of second order digital differentiator using Richardson extrapolation and fractional delay (2008), pp. 1120–1123. https://doi.org/10.1109/iscas.2008.4541619. Issn: 2158-1525
E. Kreyszig, H. Kreyszig, E.J. Norminton, Advanced engineering mathematics, 10th edn. (Wiley, Hoboken, 2011)
A. Sidi, in Practical Extrapolation Methods: Theory and Applications. Cambridge Monographs on Applied and Computational Mathematics (Cambridge University Press, Cambridge, 2003). https://doi.org/10.1017/cbo9780511546815
A.G. Baydin, B.A. Pearlmutter, A.A. Radul, J.M. Siskind, Automatic differentiation in machine learning: A survey. J. Mach. Learn. Res. 18(153), 1–43 (2018)
G.F. Corliss (ed.), Automatic differentiation of algorithms: From simulation to optimization (Springer, New York, 2002)
A. Griewank, A. Walther, in Evaluating Derivatives, Principles and Techniques of Algorithmic Differentiation, 2nd edn. Other Titles in Applied Mathematics (Society for Industrial and Applied Mathematics, USA, 2008). https://doi.org/10.1137/1.9780898717761
R.D. Neidinger, Introduction to automatic differentiation and MATLAB object-oriented programming. SIAM Rev. 52(3), 545–563 (2010). https://doi.org/10.1137/080743627
L.B. Rall, Automatic differentiation: Techniques and applications. Lecture notes in computer science, vol. 120 (Springer-Verlag, Berlin; New York, 1981)
R. Chartrand, Numerical differentiation of noisy, nonsmooth data. ISRN Applied Mathematics 2011 (2011). https://doi.org/10.5402/2011/164564
J. Cheng, B. Hofmann, S. Lu, The index function and Tikhonov regularization for ill-posed problems. J. Comput. Appl. Math. 265, 110–119 (2014). https://doi.org/10.1016/j.cam.2013.09.035
J. Cullum, Numerical differentiation and regularization. SIAM J. Numer. Anal. 8(2), 254–265 (1971)
M. Hanke, O. Scherzer, Inverse problems light: Numerical differentiation. Am. Math. Mon. 108(6), 512–521 (2001). https://doi.org/10.1080/00029890.2001.11919778
D.N. Hao, L.H. Chuong, D. Lesnic, Heuristic regularization methods for numerical differentiation. Comput. Math. Appl. 63(4), 816–826 (2012). https://doi.org/10.1016/j.camwa.2011.11.047
B. Hu, S. Lu, Numerical differentiation by a Tikhonov regularization method based on the discrete cosine transform. Appl. Anal. 91(4), 719–736 (2012). https://doi.org/10.1080/00036811.2011.598862
I. Knowles, R. Wallace, A variational method for numerical differentiation. Numer. Math. 70(1), 91–110 (1995). https://doi.org/10.1007/s002110050111
H. Mao, Adaptive choice of the regularization parameter in numerical differentiation. J. Comput. Math. 33(4), 415–427 (2015)
Y. Mathlouthi, A. Mitiche, I.B. Ayed, Regularised differentiation for image derivatives. IET Image Process. 11(5), 310–316 (2017). https://doi.org/10.1049/iet-ipr.2016.0369
D. Murio, The mollification method and the numerical solution of the inverse heat conduction problem by finite differences. Comput. Math. Appl. 17(10), 1385–1396 (1989). https://doi.org/10.1016/0898-1221(89)90022-9
A.G. Ramm, E. Meister, Stable solutions of some ill-posed problems. Math. Methods Appl. Sci. 3(1), 336–363 (1981). https://doi.org/10.1002/mma.1670030125
A.G. Ramm, B.A. Smirnova. On Stable Numerical Differentiation. Mathematics of Computation, vol 70 (American Mathematical Society, 2001), p. 1131-53
A. Savitzky, M.J.E. Golay, Smoothing and differentiation of data by simplified least squares procedures. Anal. Chem. 36(8), 1627–1639 (1964). https://doi.org/10.1021/ac60214a047
J.J. Stickel, Data smoothing and numerical differentiation by a regularization method. Comput. Chem. Eng. 34(4), 467–475 (2010). https://doi.org/10.1016/j.compchemeng.2009.10.007
V.I. Dmitriev, Z.G. Ingtem, Numerical differentiation using spline functions. Comput. Math. Model. 23(3), 312–318 (2012). https://doi.org/10.1007/s10598-012-9139-9
W. Gao, R. Zhang, Multiquadric trigonometric spline quasi-interpolation for numerical differentiation of noisy data: A stochastic perspective. Numer. Algoritm. 77(1), 243–259 (2018). https://doi.org/10.1007/s11075-017-0313-1
M. Li, Y. Wang, L. Ling, Numerical Caputo differentiation by radial basis functions. J. Sci. Comput. 62(1), 300–315 (2015). https://doi.org/10.1007/s10915-014-9857-6
V. Vershinin, N. Pavlov, Approximation of derivatives by smoothing splines. Vychisl. Sistemy 98, 83–91 (1983)
G. Wahba, in Spline models for observational data. CBMS-NSF Regional Conference series in applied mathematics, vol. 59 (Society for Industrial and Applied Mathematics, Philadelphia, 1990)
P. Craven, G. Wahba, Smoothing noisy data with spline functions. Numer. Math. 31(4), 377–403 (1978). https://doi.org/10.1007/bf01404567
P.H.C. Eilers, A perfect smoother. Anal. Chem. 75(14), 3631–3636 (2003). https://doi.org/10.1021/ac034173t
P.C. Hansen, Analysis of discrete ill-posed problems by means of the L-curve. SIAM Rev. 34(4), 561–580 (1992). https://doi.org/10.1137/1034115
P.C. Hansen, Regularization Tools version 4.0 for Matlab 7.3. Numer. Algoritm. 46(2), 189–194 (2007). https://doi.org/10.1007/s11075-007-9136-9
Web of Science. https://www.webofknowledge.com/. Accessed 20 Mar 2024
Scopus. https://www.scopus.com/. Accessed 20 Mar 2024
L. Chaparro, Signals and systems using matlab (Elsevier, Waltham, 2019)
S.J. Chapman, Matlab® programming for engineers, 6th edn. (Cengage, Boston, 2020). OCLC: on1048936473
J.D. Faires, R.L. Burden, Numerical methods, 4th edn. (Brooks/Cole, Cengage Learning, Boston, 2013). OCLC: ocn809689438
H. Kantz, T. Schreiber, Nonlinear time series analysis (Cambridge Univ. Press, Cambridge, 2010). https://doi.org/10.1017/CBO9780511755798. Oclc: 796208312
W.H. Press (ed.), Numerical recipes: the art of scientific computing, 3rd edn. (Cambridge University Press, Cambridge, 2007). OCLC: ocn123285342
W.y. Yang, W. Cao, J. Kim, K.W. Park, H.H. Park, J. Joung, J.S. Ro, H.L. Lee, C.H. Hong, T. Im, Applied numerical methods using MATLAB®, 2nd edn. (Wiley, Hoboken, 2020)
C.W. Groetsch, in The theory of Tikhonov regularization for Fredholm equations of the first kind. Research notes in mathematics, vol. 105 (Pitman Advanced Pub. Program, Boston, 1984)
A.N. Tikhonov, V.I. Arsenin, in Solutions of ill-posed problems. Scripta series in mathematics (Winston; distributed solely by Halsted Press, Washington: New York, 1977)
W.G. Bickley, Formulae for numerical differentiation. Math. Gaz. 25(263), 19–27 (1941). https://doi.org/10.2307/3606475
J. Wagner, P. Mazurek, A. Miekina, R.Z. Morawski, Regularised differentiation of measurement data in systems for monitoring of human movements. Biomed. Signal Process. Control 43, 265–277 (2018). https://doi.org/10.1016/j.bspc.2018.02.010
J. Wagner, Regularised differentiation of measurement data in systems for healthcare-oriented monitoring of elderly persons, Dissertation. (Warsaw University of Technology, 2020)
F. Eichas, U. Zölzer, in Novel Optical Systems Design and Optimization XIX. Modeling of an optocoupler-based audio dynamic range control circuit, vol. 9948 (International Society for Optics and Photonics, 2016), p. 99480w. https://doi.org/10.1117/12.2235686
S. Marchand, P. Depalle, in Digital Audio Effects (DAFx) Conference. Generalization of the derivative analysis method to non-stationary sinusoidal modeling (Espoo, Finland, 2008), pp. 281–288
D. Medine, Dynamical systems for audio synthesis: Embracing nonlinearities and delay-free loops. Appl. Sci. 6(5), 134 (2016). https://doi.org/10.3390/app6050134
D. Van Nort, J. Braasch, P. Oliveros, Sound texture recognition through dynamical systems modeling of empirical mode decomposition. J. Acoust. Soc. Am. 132(4), 2734–2744 (2012). https://doi.org/10.1121/1.4751535
D.T. Yeh, J.S. Abel, J.O. Smith, Automated physical modeling of nonlinear audio circuits for real-time audio effects part I: theoretical development. IEEE Trans. Audio Speech Lang. Process. 18(4), 728–737 (2010)
I. Goodfellow, Y. Bengio, A. Courville, in Deep learning. Adaptive computation and machine learning (The MIT Press, Cambridge, 2016)
A. Härmä, Classification of time-frequency regions in stereo audio. J. Audio Eng. Soc. 59(10), 707–720 (2011)
N.J. Nalini, S. Palanivel, Music emotion recognition: The combined evidence of MFCC and residual phase. Egypt. Inf. J. 17(1), 1–10 (2016). https://doi.org/10.1016/j.eij.2015.05.004
N. Attoh-Okine, K. Barner, D. Bentil, R. Zhang, The empirical mode decomposition and the Hilbert-Huang transform. EURASIP J. Adv. Signal Process. 2008(1), 251518–2008251518 (2008). https://doi.org/10.1155/2008/251518
P.C. Chu, C. Fan, N. Huang, Derivative-optimized empirical mode decomposition for the Hilbert-Huang transform. J. Comput. Appl. Math. 259, 57–64 (2014). https://doi.org/10.1016/j.cam.2013.03.046
N.E. Huang, K. Hu, A.C.C. Yang, H.C. Chang, D. Jia, W.K. Liang, J.R. Yeh, C.L. Kao, C.H. Juan, C.K. Peng, J.H. Meijer, Y.H. Wang, S.R. Long, Z. Wu, On Holo-Hilbert spectral analysis: A full informational spectral representation for nonlinear and non-stationary data. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 374(2065), 20150206 (2016). https://doi.org/10.1098/rsta.2015.0206
N.E. Huang, Z. Shen, S.R. Long, M.C. Wu, H.H. Shih, Q. Zheng, N.C. Yen, C.C. Tung, H.H. Liu, The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 454(1971), 903–995 (1998). https://doi.org/10.1098/rspa.1998.0193
M. Lewandowski, A Short-Term Analysis of a Digital Sigma-Delta Modulator with Nonstationary Audio Signals (Audio Engineering Society, Warsaw, 2015)
F. Jaillet, P. Balazs, M. Dörfler, N. Engelputzeder. On the Structure of the Phase around the Zeros of the Short-Time Fourier Transform. International Conference on Acoustics NAG/DAGA 2009 (2009), pp. 1996. https://hal.science/hal-04125120/file/2009_nagdaga_jaillet_et_al.pdf.
M. Desainte-Catherine, S. Marchand, High-precision Fourier analysis of sounds using signal derivatives. J. Audio Eng. Soc. 48(7/8), 654–667 (2000)
M.G. Frei, I. Osorio, Intrinsic time-scale decomposition: time-frequency-energy analysis and real-time filtering of non-stationary signals. Proc. R. Soc. A Math. Phys. Eng. Sci. 463(2078), 321–342 (2007). https://doi.org/10.1098/rspa.2006.1761
S. Marchand, Improving spectral analysis precision with an enhanced phase vocoder using signal derivatives. Paper presented at the 1st International Conference on Digital Audio Effects (DAFx), Digital Audio Effects (DAFx) Workshop, Barcelona, November 1998.
Q. Liu, A.H. Sung, M. Qiao, Derivative-based audio steganalysis. ACM Trans. Multimed. Comput. Commun. Appl. 7(3), 18:1–18:19 (2011). https://doi.org/10.1145/2000486.2000492
A.J. Cooper, Detecting Butt-Spliced Edits in Forensic Digital Audio Recordings (Audio Engineering Society, Denmark, 2010)
C. Grigoras, D. Rappaport, J.M. Smith, Analytical Framework for Digital Audio Authentication (Audio Engineering Society, USA, 2012)
R. Korycki, Time and spectral analysis methods with machine learning for the authentication of digital audio recordings. Forensic Sci. Int. 230(1–3), 117–126 (2013). https://doi.org/10.1016/j.forsciint.2013.02.020
C. Clavel, T. Ehrette, G. Richard, in 2005 IEEE International Conference on Multimedia and Expo. Events detection for an audio-based surveillance system (2005), pp. 1306–1309. https://doi.org/10.1109/icme.2005.1521669. Issn: 1945-788x
H. Phan, P. Koch, F. Katzberg, M. Maass, R. Mazur, I. McLoughlin, A. Mertins, in 2017 25th European Signal Processing Conference (EUSIPCO). What makes audio event detection harder than classification? (2017), pp. 2739–2743. https://doi.org/10.23919/eusipco.2017.8081709. Issn: 2076-1465
A. Temko, C. Nadeu, Acoustic event detection in meeting-room environments. Pattern Recogn. Lett. 30(14), 1281–1288 (2009). https://doi.org/10.1016/j.patrec.2009.06.009
A. Vafeiadis, K. Votis, D. Giakoumis, D. Tzovaras, L. Chen, R. Hamzaoui, in Audio-based event recognition system for smart homes. Audio-based event recognition system for smart homes (IEEE Xplore, San Francisco, 2017), pp. 1–8. https://doi.org/10.1109/uic-atc.2017.8397489
J.J. Burred, A. Lerch, in Proceedings of the 6th international conference on digital audio effects. A hierarchical approach to automatic musical genre classification (Citeseer, London, 2003), pp. 8–11
C.P. Chan, P.C. Ching, T. Lee, Noisy speech recognition using de-noised multiresolution analysis acoustic features. J. Acoust. Soc. Am. 110(5), 2567–2574 (2001). https://doi.org/10.1121/1.1398054
S. Davis, P. Mermelstein, Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Process. 28(4), 357–366 (1980). https://doi.org/10.1109/tassp.1980.1163420
J.T. Foote, Content-based retrieval of music and audio (Dallas, 1997), pp. 138–147. https://doi.org/10.1117/12.290336
S. Furui, Speaker-independent isolated word recognition using dynamic features of speech spectrum. IEEE Trans. Acoust. Speech Signal Process. 34(1), 52–59 (1986). https://doi.org/10.1109/tassp.1986.1164788
B. Logan, Mel frequency cepstral coefficients for music modeling. Int. Symp. Music Inf. Retr. (2000). https://doi.org/10.5281/zenodo.1416444
E. Pampalk, S. Dixon, G. Widmer, On the evaluation of perceptual similarity measures for music. Paper presented at the 6th International Conference on Digital Audio Effects (DAFx-03), London, 8-11 September 2003
L.R. Rabiner, B.H. Juang, in Fundamentals of speech recognition. Prentice Hall signal processing series (PTR Prentice Hall, Englewood Cliffs, 1993)
P. Ramesh, J.G. Wilpon, M.A. McGee, D.B. Roe, C.H. Lee, L.R. Rabiner, Speaker independent recognition of spontaneously spoken connected digits. Speech Commun. 11(2), 229–235 (1992). https://doi.org/10.1016/0167-6393(92)90017-2
J. Aucouturier, F. Pachet, M. Sandler, “The way it sounds”: Timbre models for analysis and retrieval of music signals. IEEE Trans. Multimedia 7(6), 1028–1035 (2005). https://doi.org/10.1109/tmm.2005.858380
A. Eronen, in Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575). Comparison of features for musical instrument recognition (2001), pp. 19–22. https://doi.org/10.1109/aspaa.2001.969532
F. Grondin, F. Michaud, Lightweight and optimized sound source localization and tracking methods for open and closed microphone array configurations. Robot. Auton. Syst. 113, 63–80 (2019). https://doi.org/10.1016/j.robot.2019.01.002
C. Joder, S. Essid, G. Richard, Temporal integration for audio classification with application to musical instrument classification. IEEE Trans. Audio Speech Lang. Process. 17(1), 174–186 (2009). https://doi.org/10.1109/tasl.2008.2007613
K. Kumatani, J. McDonough, B. Raj, Microphone array processing for distant speech recognition: From close-talking microphones to far-field sensors. IEEE Signal Process. Mag. 29(6), 127–140 (2012). Publisher: IEEE
A. Marti, M. Cobos, J.J. Lopez, J. Escolano, A steered response power iterative method for high-accuracy acoustic source localization. J. Acoust. Soc. Am. 134(4), 2627–2630 (2013). https://doi.org/10.1121/1.4820885
K. Nakadai, T. Takahashi, H.G. Okuno, H. Nakajima, Y. Hasegawa, H. Tsujino, Design and implementation of robot audition system ‘HARK’ - Open source software for listening to three simultaneous speakers. Adv. Robot. 24(5–6), 739–761 (2010). https://doi.org/10.1163/016918610x493561
F. Nesta, M. Omologo, Generalized state coherence transform for multidimensional TDOA estimation of multiple sources. IEEE Trans. Audio Speech Language Process. 20(1), 246–260 (2012). https://doi.org/10.1109/tasl.2011.2160168
L.R. Rabiner, R.W. Schafer, Theory and applications of digital speech processing, 1st edn. (Pearson, Upper Saddle River, 2011). OCLC: ocn476834107
B. Rafaely, Y. Peled, M. Agmon, D. Khaykin, E. Fisher, in Speech Processing in Modern Communication: Challenges and Perspectives, ed. by I. Cohen, J. Benesty, S. Gannot. Spherical microphone array beamforming. Springer Topics in Signal Processing (Springer, Berlin, 2010), pp. 281–305. https://doi.org/10.1007/978-3-642-11130-3_11
S.S. Tirumala, S.R. Shahamiri, A.S. Garhwal, R. Wang, Speaker identification features extraction methods: A systematic review. Expert Syst. Appl. 90, 250–271 (2017). https://doi.org/10.1016/j.eswa.2017.08.015
M. Woelfel, J. McDonough, Distant Speech Recognition (Wiley, USA, 2009)
K. Ahnert, M. Abel, Numerical differentiation of experimental data: Local versus global methods. Comput. Phys. Commun. 177(10), 764–774 (2007). https://doi.org/10.1016/j.cpc.2007.03.009. Number: 10
D. Aydın, M. Memmedli, R.E. Omay, Smoothing parameter selection for nonparametric regression using smoothing spline. European J. Pure Appl. Math. 6(2), 222–238 (2013)
S. Chountasis, V.N. Katsikis, D. Pappas, A. Perperoglou, The Whittaker smoother and the Moore-Penrose inverse in signal reconstruction. Appl. Math. Sci. 6(25), 1205–1219 (2012)
J.H. Friedman, A variable span smoother. Technical report, Stanford Univ CA Lab for Computational Statistics (1984)
G.A. Wood, Data smoothing and differentiation procedures in biomechanics. Exerc. Sport Sci. Rev. 10(1), 308–362 (1982). Number: 1
J. Feng, N. Simon, Gradient-based regularization parameter selection for problems with nonsmooth penalty functions. J. Comput. Graph. Stat. 27(2), 426–435 (2018). https://doi.org/10.1080/10618600.2017.1390470
H. Albrecht, in 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221). A family of cosine-sum windows for high-resolution measurements, vol. 5 (2001), pp. 3081–3084. https://doi.org/10.1109/icassp.2001.940309. Issn: 1520-6149
M.S. Berger, in Nonlinearity and functional analysis: lectures on nonlinear problems in mathematical analysis. Pure and applied mathematics, a series of monographs and textbooks, vol. v. 74 (Academic Press, New York, 1977)


Acknowledgements

The author wishes to thank Prof. Jan Żera for commenting on an earlier draft of this paper. His suggestions led to major modifications of the manuscript.

Funding

This work was supported by the Warsaw University of Technology, Poland, under Grant 1820/10/201/POB2/2021.

Author information

Authors and Affiliations

Institute of Radioelectronics and Multimedia Technology, Faculty of Electronics and Information Technology, Warsaw University of Technology, Nowowiejska 15/19, Warsaw, 00-665, Poland
Marcin Lewandowski

Authors

Marcin Lewandowski

Contributions

Not applicable.

Corresponding author

Correspondence to Marcin Lewandowski.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Lewandowski, M. Estimating the first and second derivatives of discrete audio data. J AUDIO SPEECH MUSIC PROC. 2024, 31 (2024). https://doi.org/10.1186/s13636-024-00355-5


Received: 01 February 2024
Accepted: 29 May 2024
Published: 18 June 2024
DOI: https://doi.org/10.1186/s13636-024-00355-5
