
Estimating the first and second derivatives of discrete audio data


A new method for estimating the first and second derivatives of discrete audio signals, intended to achieve higher computational precision in analyzing the performance and characteristics of digital audio systems, is presented. The method could find numerous applications in modeling nonlinear audio circuit systems, e.g., for audio synthesis and creating audio effects, music recognition and classification, time-frequency analysis based on nonstationary audio signal decomposition, audio steganalysis and digital audio authentication, or audio feature extraction methods. The proposed algorithm employs the ordinary 7-point stencil central-difference formulas with improvements that minimize the round-off and truncation errors. This is achieved by treating the step size of numerical differentiation as a regularization parameter, which acts as a decision threshold in all calculations. This approach requires shifting discrete audio data by fractions of the sampling interval, which is achieved by fractional delay FIR filters designed with modified 11-term cosine-sum windows for interpolation and shifting of audio signals. The maximum relative errors in estimating the first and second derivatives of discrete audio signals are of the order of \(10^{-13}\) and \(10^{-10}\), respectively, over the entire audio band, which is close to double-precision floating-point accuracy for the first and better than single-precision floating-point accuracy for the second derivative estimation. Numerical testing showed that the performance of the proposed method is not influenced by the type of signal being differentiated (either stationary or nonstationary) and that it provides better results than other known differentiation methods in the audio band up to 21 kHz.

1 Introduction

The infinitesimal differential calculus with \(h \rightarrow {} 0\) can be obtained only mathematically, for continuous-time and analytically obtainable derivatives. In the case of data measured by digital equipment, infinitesimal differential calculus is no longer valid because of the intrinsic discretization of the data being processed, i.e., sampling the data in time and quantizing its amplitude values. Discrete audio signals are sampled at equally spaced time intervals, which usually span from \(T_s=\frac{1}{8000}\) to \(T_s=\frac{1}{384000}\) seconds with fixed-point precision varying from 8 to 24 bits or generated with 32- or 64-bit floating-point precision. Thus, the derivative obtained for a discrete signal is only an estimation of its true value at some point in time. In this paper, a numerical differentiation method is proposed to estimate the first and second derivatives of discrete audio data with no assumptions about the data to be analyzed. The method is intended to minimize numerical errors down to the truncation and round-off errors. Details of the method are presented in Sect. 2, and the method is evaluated and tested in Sect. 3.

1.1 Numerical differentiation of experimental data

Numerical methods for estimating first and second derivatives of discrete data can be classified into two main groups. The first group aims to develop formulas for estimating derivatives numerically without knowledge about the function which generates the data points. It includes finite-difference calculations usually obtained by polynomial interpolation, such as quadrature, Lagrange, Legendre, Newton, Chebyshev, Gauss, Hermite, and Stirling polynomials [1,2,3,4,5,6,7,8,9,10,11], Taylor series expansion, or the method of undetermined coefficients [12,13,14,15,16,17,18,19,20,21,22,23]. All these methods provide the same form of finite-difference coefficients, but their computational time, complexity, and memory storage requirements differ. Moreover, they are very prone to errors due to the noisy, non-exact, and experimental nature of the analyzed signals. Therefore, several methods that use a regularization parameter to be optimized have been proposed. They form the second group of methods and do not offer any explicit differential formula that could be used to calculate the derivative but aim to evaluate data using a function fitted to the data points. Regularization methods provide stable approximations of derivatives [1, 24,25,26,27,28,29], and include Richardson’s extrapolation [1, 3, 9, 30,31,32,33], automatic differentiation [34,35,36,37,38], optimization approaches (such as Tikhonov, variational, mollification, and heuristic regularization) [24, 26, 27, 32, 39,40,41,42,43,44,45,46,47,48,49,50,51,52], or smoothing approximations [42, 51, 53,54,55,56,57]. Regularization methods are especially useful in estimating trends in the data but require additional effort to choose regularization or fitting parameters, such as by the L-curve, GCV, or heuristic methods [43, 52, 58,59,60,61]. Therefore, regularization methods require some assumptions about the analyzed signal.
The volume of work focused on numerical differentiation, in general, is quite considerable and growing each year [62, 63]. Numerical differentiation is an elementary and essential tool in applied sciences used for numerical analysis and in system modeling [2, 3, 30, 64,65,66,67,68,69]. Regardless of the increasing number of publications, there is no generally accepted method for carrying out numerical differentiation for all kinds of data. A major reason for this is that numerical differentiation is an ill-posed problem, i.e., small perturbations in the signal may lead to large errors in the computed derivative [42, 68, 70, 71]. It is a problem known for years [24, 72], especially when dealing with experimental data typically corrupted with some kind of noise due to measurement, rounding, truncation, or other processing errors. Until now, no systematic strategy for the selection of the optimum differentiation method for a given practical problem has been proposed [73, 74].

1.2 Numerical differentiation of discrete audio data

Numerical differentiation of discrete audio data finds numerous applications in solving ordinary and partial differential equations (ODEs and PDEs) as a numerical framework for modeling nonlinear audio circuit systems. It is used, for example, in audio synthesis and creating audio effects [75,76,77,78,79] and music recognition and classification [34, 80,81,82]. It is specifically used for time-frequency analysis based on nonstationary audio signal decomposition (methods derived from empirical mode decomposition [83,84,85,86,87]), enhancement of spectral precision in Fourier-based methods [85, 88,89,90,91], audio steganalysis [92], digital audio authentication [93,94,95], acoustic event detection [96,97,98,99], feature extraction based on Mel-frequency cepstral coefficients (MFCC) [100,101,102,103,104,105,106,107,108], speaker and speech identification and recognition, and sound source tracking [107, 109,110,111,112,113,114,115,116,117,118,119,120]. Audio data, such as music or speech, are nonstationary over time and cannot be described by a mathematical expression. This is particularly a problem for numerical differentiation methods that employ an approximation of the analyzed data. Approximation may be treated as a smoothing operation or as searching for a trend line in the data. Selecting a trend line is achieved by specifying a regularization parameter. The selection of algorithms and methods for finding the regularization parameter depends on the given requirements, but ultimately every regularization procedure compromises between “smoothness” and “roughness” of the data estimate [27, 42, 43, 45, 52, 59, 73, 121,122,123,124,125]. An averaging approximation of the audio signal in the time domain translates directly into the frequency domain in the form of attenuation of the higher frequencies in the signal.
Averaging corresponds to changes in the shape, cutoff frequency, and cutoff slope of the low-pass filter resulting from the chosen regularization method and selected parameter. A similar process of removing high-frequency content in the signal occurs in the group of numerical differentiation methods based on finite-difference calculation, which can be regarded as a special case of FIR filters known as differentiators. All these approximation methods try to find the best compromise between cutoff frequency, frequency transition region, and stopband attenuation with an equivalent filter approach. Although averaging is appropriate in applications in which high-frequency content is regarded as noise, it is not acceptable for many kinds of digital audio signals which hold essential information in rapid changes in the amplitude over time. Consequently, there have been some attempts to make the regularization parameter variable based on the actual form of the signal [124, 126]. Nevertheless, no method can successfully separate an audio signal from noise without distorting the signal. The numerical differentiation method proposed in this paper estimates the first and second derivatives of discrete audio data using central-difference formulas for calculations and makes no assumptions about the data being analyzed. The method does not incorporate smoothing of input data and employs additional procedures to minimize numerical errors. The main contribution of the proposed method, presented in detail in Sect. 2, can be summarized as follows:

  • Instead of smoothing or filtering the high-frequency content of the input signal, the step size h (see Sects. 2 and 3) is used as the regularization parameter to be optimized.

  • Additional procedures to minimize numerical errors employ fractional delay FIR filters designed with modified cosine-sum windows [127] for shifting and interpolation of audio signals, enhancing numerical accuracy in the derivative estimation.

  • As will be shown, the maximum relative errors in the estimation of the first and second derivatives of discrete audio signals are small: of the order of \(10^{-13}\) and \(10^{-10}\), respectively, over the entire audio band, which is close to double-precision floating-point accuracy for the first and single-precision accuracy for the second derivative.

Section 3 shows the experiments designed to verify the performance of the proposed method and gives a comparison to other known methods of numerical differentiation, both by analysis of random input samples and through the differences in resulting transfer function.

2 Proposed method

The derivative of a discrete signal is only an estimation of its true value at some point in time because all calculations are performed with non-exact, finite-precision arithmetic. Possible errors may result from simplified assumptions in the mathematical model, discretization error, convergence error, and round-off error (due to the finite precision of numerical calculations).

The first-order derivative of a general signal f, a function of a variable x, calculated at the point \(x_0\) can be expressed by the discrete finite central-difference formula [128]:

$$\begin{aligned} \hat{f}^{(1)}(x_0) \approx \frac{f(x_0 + h) - f(x_0 - h)}{2h}, \end{aligned}$$

where the step size h should be kept sufficiently small for an accurate approximation (i.e., to preserve a small truncation error). However, a decrease in the step size h leads to subtractive cancellation, which increases round-off errors. The challenge is to identify the optimal step size \(h_0\) that avoids conditions in which the decreasing truncation error is dominated by the round-off error (Fig. 1). The optimal \(h_0\) can be found by estimating the round-off and truncation errors associated with (1). According to [9] and truncating all terms of the Taylor series expansion of order greater than 3, these errors can be estimated as:

$$\begin{aligned} \hat{f}^{(1)}(x_0)=\underbrace{\frac{f(x_0 + h) - f(x_0 - h)}{2h}}_{\text {finite central-difference approx.}}+\underbrace{\frac{\text {err}_{x_0+h} - \text {err}_{x_0-h}}{2h}}_{\text {round-off error}}-\underbrace{\frac{f^{(3)}(x_0)}{6}h^2}_{\text {truncation error}}, \end{aligned}$$

where true values of \(f(x_0 \pm h)\) from (1) are represented as a sum of approximations \(\hat{f}(x_0 \pm h)\) and round-off errors defined as \(\text {err}_{x_0 \pm h}\).

Fig. 1

Relative error of the first derivative approximation of the function \(f(x)=\sin (\pi x)\) at the point \(x_{0}=\frac{\pi }{3}\) on a machine with 52 bits of mantissa, as a function of step size h, for different orders of approximation. The vertical dashed line represents the optimal step size \(h_{0}=5.964 \cdot 10^{-6}\) calculated with (4)

The upper bound of the absolute value of the total error can be represented as:

$$\begin{aligned} \left| \hat{f}^{(1)}(x_0) - \frac{f(x_0 + h) - f(x_0 - h)}{2h} \right| \le \frac{2 \cdot \text {eps}}{2h} + \frac{Ph^2}{6}, \end{aligned}$$

Here, eps (the precision of floating-point numbers) in (3) is taken as the upper bound of the round-off error, and the maximum value of \(f^{(3)}(\xi )\) in (2) is denoted by P. The optimal step size \(h_0\) can then be determined by differentiating (3) with respect to h, setting the resulting derivative equal to zero, and solving for h:

$$\begin{aligned} h_0 \le \root 3 \of {\frac{3 \cdot \text {eps}}{P}}. \end{aligned}$$

Figure 1 shows that, for the step size \(h<h_0\), the relative error of finite central-difference approximation is determined by round-off errors. Otherwise, for \(h>h_0\), the truncation error dominates.
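The trade-off between the round-off and truncation terms in (1)–(4) can be reproduced in a few lines of Python (a minimal sketch; the test function \(\sin\), the evaluation point, and the comparison step sizes are illustrative choices, with \(P=\max |f^{(3)}|=1\) for the sine):

```python
import math

def central_diff(f, x0, h):
    """First-derivative estimate by the 2nd-order central difference (1)."""
    return (f(x0 + h) - f(x0 - h)) / (2.0 * h)

eps = 2.0 ** -52          # double-precision machine epsilon
f, df = math.sin, math.cos  # analytic derivative used only for the error check
x0 = math.pi / 3

# Optimal step size (4): h0 = (3*eps/P)**(1/3), with P = max|f'''| = 1 for sin
h0 = (3.0 * eps / 1.0) ** (1.0 / 3.0)

err_opt = abs(central_diff(f, x0, h0) - df(x0))
err_big = abs(central_diff(f, x0, 1e-2) - df(x0))    # truncation-dominated
err_small = abs(central_diff(f, x0, 1e-12) - df(x0))  # round-off-dominated

print(err_opt < err_big)   # the optimal step beats a much larger step
```

At \(h_0\) the total error sits near the floor of the error curve sketched in Fig. 1, while steps far to either side are dominated by truncation or round-off error, respectively.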

2.1 Derivation and analysis of proposed method

For the estimation of the first and second derivatives, the 7-point stencil central-difference approximations following [5] were used. The approximation of the first derivative is given by the formula:

$$\begin{aligned} \hat{f}^{(1)}(x) = \frac{1}{60h} \sum \limits _{k=1}^{N-1 \over 2} c_k \left[ f(x + kh) - f(x - kh) \right] \end{aligned}$$

for \(N=7\) and coefficients \(c_k=[45,-9,1]\). The approximation of the second derivative is given by:

$$\begin{aligned} \hat{f}^{(2)}(x) = \frac{c_0 f(x)}{180h^2} + \frac{1}{180h^2} \sum \limits _{k=1}^{N-1 \over 2} c_k \left[ f(x + kh) + f(x - kh) \right] \end{aligned}$$

for \(N=7\), \(c_k=[270,-27,2]\), and \(c_0=-490\), where h is the step size, N is the order, and \(c_k\) for \(k=0,...,(N-1)/2\) are the coefficients of the central-difference approximation (\(c_0\) occurs only in the case of the second derivative).
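A direct implementation of the stencils (5) and (6) can be sketched as follows (the test function and step size are illustrative; note that an even-order derivative requires the symmetric sum \(f(x+kh)+f(x-kh)\)):

```python
import math

def d1_7point(f, x, h):
    """First derivative by the 7-point central difference (5)."""
    c = [45.0, -9.0, 1.0]
    return sum(ck * (f(x + k * h) - f(x - k * h))
               for k, ck in enumerate(c, start=1)) / (60.0 * h)

def d2_7point(f, x, h):
    """Second derivative by the 7-point central difference (6)."""
    c0, c = -490.0, [270.0, -27.0, 2.0]
    s = sum(ck * (f(x + k * h) + f(x - k * h))
            for k, ck in enumerate(c, start=1))
    return (c0 * f(x) + s) / (180.0 * h * h)

x, h = 0.3, 1e-2
print(abs(d1_7point(math.sin, x, h) - math.cos(x)))   # error of the estimate
print(abs(d2_7point(math.sin, x, h) + math.sin(x)))   # (sin)'' = -sin
```

Both stencils are 6th-order accurate, so even this moderate step size leaves only a tiny residual against the analytic derivatives of the sine.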

As depicted in Fig. 1, the choice of the step size h is critical for the accuracy of the derivative estimation. For this reason, the derivative estimation defined as \(\hat{F}\) was computed for several h values as:

$$\begin{aligned} \hat{F}^{(1,2)}(x) = \left. \hat{f}^{(1,2)}(x) \right| _{h=10^{m} \cdot T_{s}} \end{aligned}$$

for \(m=[2,1,0,-1,-2,-3]\), where \(\hat{f}^{(1,2)}(x)\) was calculated for h decreasing from \(100 \cdot T_{s}\) to \(0.001 \cdot T_{s}\) (m decreasing from 2 to −3).

Since discrete audio data are sampled at equally spaced intervals \(h=T_{s}=1/f_{s}\), the value of h is fixed over time. Calculations performed for \(h=0.1 \cdot T_{s}\), \(h=0.01 \cdot T_{s}\), and \(h=0.001 \cdot T_{s}\) require the components in (5) and (6) to be shifted by the corresponding fractions of the sampling interval. The required shifting operation was performed by fractional delay FIR filters designed with the modified 11-term cosine-sum window proposed in [127], given in (8) below:

$$\begin{aligned} w_{j} = \sum \limits _{k=0}^{K-1} A_k \cdot \cos \left[ \frac{\pi k \cdot (2j - N + 1 - 2\cdot \delta )}{(N - 1)} \right] \end{aligned}$$

for \(j = 0, ..., N-1\), where N is the filter length (number of coefficients), \(K = 11\) is the number of cosine-sum terms from [127], and \(\delta\) is the fractional shift of the filter. The filter coefficients are calculated by multiplying this window function with the sinc function given in (9) below:

$$\begin{aligned} sinc_{j} = \left\{ \begin{array}{ll} 2 \cdot \frac{f_c}{f_s} &{} \text {if} \ \ \ \frac{2j - N + 1}{2} - \delta = 0 \\ \frac{\sin \left( \pi \cdot \frac{f_c}{f_s} \left[ (2j - N + 1) - 2\cdot \delta \right] \right) }{\pi \left( \frac{2j - N + 1}{2} - \delta \right) } &{} \text {otherwise} \end{array}\right. \end{aligned}$$

Figure 2 shows the frequency and phase response of one of the designed filters. It shifts the input signal by a fraction \(\delta = 0.06\) of \(T_s\), has 8001 coefficients, and operates at \(f_s = 44100 \cdot 64\ \text {Hz}\) with cutoff frequency \(f_c = 22050\ \text {Hz}\). The phase-delay plot in the left panel shows the shift of 0.06 of the sampling interval; the right panel shows passband ripples of the order of \(10^{-14}\).
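The windowed-sinc construction of (8)–(9) can be sketched as follows. The 11-term coefficients \(A_k\) of [127] are not reproduced in this paper, so a standard 3-term Blackman cosine-sum window stands in for them here; the filter length, delay, and test frequencies are likewise illustrative:

```python
import math

def frac_delay_fir(N, delta, fc, fs, A=(0.42, 0.50, 0.08)):
    """Fractional-delay FIR taps in the form of (8)-(9): a delta-shifted
    cosine-sum window multiplied by a delta-shifted sinc. The 3-term
    Blackman coefficients A stand in for the 11-term window of [127]."""
    taps = []
    for j in range(N):
        m = (2 * j - N + 1) - 2.0 * delta           # doubled offset from shifted center
        w = sum(Ak * math.cos(math.pi * k * m / (N - 1))
                for k, Ak in enumerate(A))           # window (8)
        if m == 0.0:
            s = 2.0 * fc / fs                        # sinc limit at the center, per (9)
        else:
            s = math.sin(math.pi * (fc / fs) * m) / (math.pi * m / 2.0)
        taps.append(w * s)
    return taps

# Shift a low-frequency sinusoid by 0.3 of a sample and compare with the
# analytically delayed signal (total delay = (N-1)/2 + delta samples).
N, delta, fs, f0 = 81, 0.3, 8000.0, 500.0
taps = frac_delay_fir(N, delta, fc=0.45 * fs, fs=fs)
x = [math.sin(2 * math.pi * f0 * n / fs) for n in range(400)]
y = [sum(taps[k] * x[n - k] for k in range(N)) for n in range(N, 400)]
ideal = [math.sin(2 * math.pi * f0 * (n - (N - 1) / 2.0 - delta) / fs)
         for n in range(N, 400)]
err = max(abs(a - b) for a, b in zip(y, ideal))
print(err)
```

With the short stand-in window the residual is far larger than the \(10^{-14}\) ripples reported for the 8001-tap design of [127], but the structure of the computation is the same.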

Fig. 2

Example filter design with modified 11-term cosine-sum window [127] for \(F_s = 64 \cdot 44100\) Hz, \(N = 8001\) taps, \(f_c = 22050\) Hz, and delay of \(0.06 \cdot T_s\). Frequency response and phase delay are shown on the left and filter’s passband ripples are shown on the right panel

To increase the accuracy of the derivative estimation, the input data were oversampled 64 times. Maximum relative errors in the estimation of the first and second derivatives by \(\hat{F}^{(1)}(x)\) and \(\hat{F}^{(2)}(x)\) in (7) with different step sizes (\(h=10^{m}\cdot T_{s}\) \(\forall\) \(m=[2,1,0,-1,-2,-3]\)) and 300 sinusoidal input signals with frequencies randomly varying from 1 to 21000 Hz (sampling rate \(f_{s}=44100\) Hz) are shown in Figs. 3 and 4. Maximum relative errors were calculated as \(\max \left| \left( \hat{F}^{(1,2)}(x) - f^{(1,2)}(x)\right) /\max \left|f^{(1,2)}(x)\right| \right|\), where \(f^{(1,2)} (x)\) were the exact derivatives computed analytically.

Fig. 3

Maximum relative errors (dot marks) of first derivative estimation using (7) obtained for 300 sinusoidal input signals of frequencies varying randomly from 1 to 21000 Hz for different step sizes \(h=10^{m} \cdot T_{s}\) \(\forall\) \(m=[2,1,0,-1,-2,-3]\). Input signals were sampled at 44100 Hz and 64 times oversampled

Fig. 4

Maximum relative errors (dot marks) of second derivative estimation using (7). Other details as in Fig. 3

The results shown in Figs. 3 and 4 reveal that there is no optimal step size h for estimating the derivatives over the whole audio frequency band. The step size h should be increased at lower frequencies and decreased at higher frequencies to achieve the lowest relative error of calculations. There are, however, characteristic points where the maximum relative errors for different step sizes intersect with each other. Knowing the data-dependent optimum points (marked as circles in Figs. 3 and 4), it is possible to derive formulas for estimating the derivatives with the highest possible accuracy (minimum error).

Treating the step size h as a regularization parameter, the optimum step sizes \(h_{07}\) and \(h_{08}\) (where the subscripts 07 and 08 indicate the order of the derivative estimate from which each threshold is derived, evaluated for step sizes from \(10^2 \cdot T_s\) down to \(T_s\)) were calculated with 9-point stencil central-difference approximations and used as threshold values to select the ranges in the derivative estimations (7) that provide the lowest maximum relative error. For the first derivative estimation, the threshold values were obtained as:

$$\begin{aligned} h_{07}^{10^{m}}(x) = \root 7 \of {\frac{385 \cdot \text {eps}}{9 \cdot \left. \hat{f}^{(7)}(x) \right| _{h=10^{m} \cdot T_{s}}}} \end{aligned}$$

for \(m=[0,1,2]\), where

$$\begin{aligned} \left. \hat{f}^{(7)}(x) \right| _{h=10^{m}\cdot T_{s}} = \frac{1}{2h^7} \sum \limits _{k=1}^{N-1 \over 2} c_k \cdot \left[ f(x + kh) - f(x - kh) \right] \end{aligned}$$

for \(m=[0,1,2]\), \(N=9\), \(c_{k}=[-14,14,-6,1]\) and for the second derivative estimation as:

$$\begin{aligned} h_{08}^{10^{m}}(x) = \root 10 \of {\frac{77112 \cdot \text {eps}}{99 \cdot \left. \hat{f}^{(8)}(x) \right| _{h=10^{m} \cdot T_{s}}}} \end{aligned}$$

for \(m=[0,1,2]\), where

$$\begin{aligned} \left. \hat{f}^{(8)}(x) \right| _{h=10^{m}\cdot T_{s}} = \frac{c_{0}f(x)}{2h^{8}} + \frac{1}{2h^8} \sum \limits _{k=1}^{N-1 \over 2} c_k \cdot \left[ f(x + kh) + f(x - kh) \right] \end{aligned}$$

for \(m=[0,1,2]\), \(N=9\), \(c_{k}=[-56,28,-8,1]\), \(c_{0}=70\).
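The threshold computation (10)–(11) can be sketched in Python as below (the evaluation point and step size are illustrative; eps is the double-precision machine epsilon):

```python
import math

def d7_9point(f, x, h):
    """7th-derivative estimate by the 9-point central difference (11)."""
    c = [-14.0, 14.0, -6.0, 1.0]
    return sum(ck * (f(x + k * h) - f(x - k * h))
               for k, ck in enumerate(c, start=1)) / (2.0 * h ** 7)

def h07_threshold(f, x, h, eps=2.0 ** -52):
    """Optimal-step threshold (10) for the first-derivative estimator."""
    d7 = abs(d7_9point(f, x, h))
    return (385.0 * eps / (9.0 * d7)) ** (1.0 / 7.0)

# Sanity check against the analytic 7th derivative of sin (which is -cos)
x, h = 0.5, 0.05
print(d7_9point(math.sin, x, h), -math.cos(x))
print(h07_threshold(math.sin, x, h))
```

The 7th-derivative estimate is only 2nd-order accurate, but that is sufficient here: it enters (10) under a 7th root, so the threshold is insensitive to its residual error.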

Finally, the first and second derivative estimations \(\hat{F}^{(1)}(x)\) and \(\hat{F}^{(2)}(x)\) given in (7) were modified and derived as vectors comprising the derivatives \(\hat{f}^{(1)}(x)\) and \(\hat{f}^{(2)}(x)\) computed using (5) and (6) at specific sample indexes [\(x_{1}, ..., x_6\)] so as to provide the lowest maximum relative error. These vectors correspond to the derivatives estimated for step sizes \(h=10^{m} \cdot T_{s}\) \(\forall\) \(m=[2,1,0,-1,-2,-3]\) and occur in the ranges specified by the thresholds calculated through (10) and (12). The estimations of the first and second derivatives are formulated by (14) and (15), respectively, and described in Algorithm 1.


Algorithm 1 The proposed method for the first and second derivatives estimation

The threshold values \(h_{07}\), \(h_{08}\) and the div parameter used in (14) and (15) (an empirically determined constant dependent on the sampling frequency) allow switching between derivatives calculated with different step sizes h, which results in \(10^{-13}\) and \(10^{-10}\) accuracy (maximum relative error) of the approximation over the whole audio band. The maximum relative error of estimating the first and second derivatives using the proposed method through (14) and (15) is shown by ‘x’ markers in Figs. 5 and 6 (which at higher frequencies form a thick bottom line).

$$\begin{aligned} \hat{F}^{(1)}(x) = \left\{ \begin{array}{ll} \left. \hat{f}^{(1)}(x_{1})\right| _{h=T_s} = \left. f^{(1)}(x_{1})\right| _{h=100T_s} &{} \text {for}\ x_{1} = \left\{ x \in \mathbb {N}^+ : h_{07}^{100}(x)> \frac{100h - 10h}{div} \right\}\\ \left. \hat{f}^{(1)}(x_{2})\right| _{h=T_s} = \left. f^{(1)}(x_{2})\right| _{h=10T_s} &{} \text {for}\ x_{2} = \left\{ x \in \mathbb {N}^+ : h_{07}^{100}(x) \le \frac{100h - 10h}{div} \wedge h_{07}^{10}(x)> \frac{10h - h}{div} \right\}\\ \left. \hat{f}^{(1)}(x_{3})\right| _{h=T_s} = \left. f^{(1)}(x_{3})\right| _{h=T_s} &{} \text {for}\ x_{3} = \left\{ x \in \mathbb {N}^+ : h_{07}^{10}(x) \le \frac{10h - h}{div} \wedge h_{07}^{1}(x)> \frac{h - 0.1h}{div} \right\}\\ \left. \hat{f}^{(1)}(x_{4})\right| _{h=T_s} = \left. f^{(1)}(x_{4})\right| _{h=0.1T_s} &{} \text {for}\ x_{4} = \left\{ x \in \mathbb {N}^+ : h_{07}^{1}(x) \le \frac{h - 0.1h}{div} \wedge h_{07}^{0.1}(x)> \frac{0.1h - 0.01h}{div} \right\}\\ \left. \hat{f}^{(1)}(x_{5})\right| _{h=T_s} = \left. f^{(1)}(x_{5})\right| _{h=0.01T_s} &{} \text {for}\ x_{5} = \left\{ x \in \mathbb {N}^+ : h_{07}^{0.1}(x) \le \frac{0.1h - 0.01h}{div} \wedge h_{07}^{0.01}(x) > \frac{0.01h - 0.001h}{div} \right\}\\ \left. \hat{f}^{(1)}(x_{6})\right| _{h=T_s} = \left. f^{(1)}(x_{6})\right| _{h=0.001T_s} &{} \text {for}\ x_{6} = \left\{ x \in \mathbb {N}^+ : h_{07}^{0.01}(x) \le \frac{0.01h - 0.001h}{div} \right\} \end{array}\right. \end{aligned}$$
$$\begin{aligned} \hat{F}^{(2)}(x) = \left\{ \begin{array}{ll} \left. \hat{f}^{(2)}(x_1)\right| _{h=T_s} = \left. f^{(2)}(x_1)\right| _{h=100T_s} &{} \text {for}\ x_1 = \left\{ x \in \mathbb {N}^+ : h_{08}^{100}(x)> \frac{100h - 10h}{div} \right\}\\ \left. \hat{f}^{(2)}(x_2)\right| _{h=T_s} = \left. f^{(2)}(x_2)\right| _{h=10T_s} &{} \text {for}\ x_2 = \left\{ x \in \mathbb {N}^+ : h_{08}^{100}(x) \le \frac{100h - 10h}{div} \wedge h_{08}^{10}(x)> \frac{10h - h}{div} \right\}\\ \left. \hat{f}^{(2)}(x_3)\right| _{h=T_s} = \left. f^{(2)}(x_3)\right| _{h=T_s} &{} \text {for}\ x_3 = \left\{ x \in \mathbb {N}^+ : h_{08}^{10}(x) \le \frac{10h - h}{div} \wedge h_{08}^{1}(x)> \frac{h - 0.1h}{div} \right\}\\ \left. \hat{f}^{(2)}(x_4)\right| _{h=T_s} = \left. f^{(2)}(x_4)\right| _{h=0.1T_s} &{} \text {for}\ x_4 = \left\{ x \in \mathbb {N}^+ : h_{08}^{1}(x) \le \frac{h - 0.1h}{div} \wedge h_{08}^{0.1}(x)> \frac{0.1h - 0.01h}{div} \right\}\\ \left. \hat{f}^{(2)}(x_5)\right| _{h=T_s} = \left. f^{(2)}(x_5)\right| _{h=0.01T_s} &{} \text {for}\ x_5 = \left\{ x \in \mathbb {N}^+ : h_{08}^{0.1}(x) \le \frac{0.1h - 0.01h}{div} \wedge h_{08}^{0.01}(x) > \frac{0.01h - 0.001h}{div} \right\}\\ \left. \hat{f}^{(2)}(x_6)\right| _{h=T_s} = \left. f^{(2)}(x_6)\right| _{h=0.001T_s} &{} \text {for}\ x_6 = \left\{ x \in \mathbb {N}^+ : h_{08}^{0.01}(x) \le \frac{0.01h - 0.001h}{div} \right\} \end{array}\right. \end{aligned}$$
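The cascade of conditions in (14) amounts to walking down the candidate step sizes \(100\,T_s, 10\,T_s, \ldots\) until the corresponding threshold is no longer exceeded. A simplified sketch (the function name and the dictionary representation are ours; div is the empirically determined constant mentioned above and is passed as a plain parameter):

```python
def pick_step_scale(h07, Ts, div):
    """Per-sample step-size selection sketched after (14).
    h07 maps each coarse scale (100, 10, 1, 0.1, 0.01) to the threshold
    value h07^{scale}(x) from (10); returns the scale s such that h = s*Ts."""
    scales = [100.0, 10.0, 1.0, 0.1, 0.01]
    for s in scales:
        # e.g. (100h - 10h)/div for s = 100, with h = Ts
        bound = (s * Ts - s * Ts / 10.0) / div
        if h07[s] > bound:
            return s
    return 0.001   # smallest candidate step size

Ts = 1.0 / 44100.0
always_big = {s: 1.0 for s in (100.0, 10.0, 1.0, 0.1, 0.01)}
print(pick_step_scale(always_big, Ts, div=2.0))   # coarsest step selected
```

The same cascade with the \(h_{08}\) thresholds of (12)–(13) yields the second-derivative selection (15).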
Fig. 5

Maximum relative error of first derivative estimation obtained using proposed method (“x” marks). Input signals were sampled at 44100 Hz and 64 times oversampled. Grayed-out markers are the maximum relative errors as obtained in Fig. 3

Fig. 6

Maximum relative error of second derivative estimation obtained using proposed method (“x” marks). Input signals were sampled at 44100 Hz and 64 times oversampled. Grayed-out markers are the maximum relative errors as obtained in Fig. 4

3 Comparison with other numerical differentiation methods

Since the exact derivatives of real-world audio signals cannot be calculated, experiments based on synthetic data were conducted. Two kinds of experiments were performed, using stationary and nonstationary signals, respectively. The testing was intended to compare the proposed method with other numerical differentiation methods.

3.1 Experiments with stationary synthetic data

Stationary synthetic data were generated as a sum of four harmonic signals with randomly selected frequencies and arbitrarily chosen amplitudes, as defined by the following formula:

$$\begin{aligned} f_n(x) = 0.5 \cdot \sin (\omega _{1,n}x) + 0.15 \cdot \cos (\omega _{2,n}x) + 0.2 \cdot \sin (\omega _{3,n}x) + 0.15 \cdot \cos (\omega _{4,n}x) \end{aligned}$$

for \(n=[1, ... ,300]\), where \(\omega _{1,n}, ... ,\omega _{4,n}\) are randomly selected frequencies respectively from the ranges [1–100], [101–3500], [3501–10000], and [10001–22050] Hz. The test data consisted of \(n=300\) sets of such generated synthetic data. To simulate actual signal conditions more realistically, noise was added to the harmonic signal (16). Two normally distributed noise realizations \(\Delta _M\) (\(M=1,2\)) with standard deviations \(\sigma _M=[2.2204 \cdot 10^{-16},0.001]\) were used. The noise was added to the harmonic signal \(f_n (x)\) as shown in (17):

$$\begin{aligned} \hat{f}_{M,n}(x) = f_n(x) + \Delta _M, \end{aligned}$$

where M denotes the noise realization used, and n denotes the four-sine frequency set. In all calculations, the sampling frequency \(f_s=44100\) Hz was used. The differentiation methods were compared by the error ratio \(\text {SNR}_{1,2,M,n}/\text {SNR}_{0,M,n}\) (where SNR is the signal-to-noise ratio) for \(n=300\) data sets and \(M=1,2\) noise distributions, where \(\text {SNR}_{1,2,M,n}\) are the signal-to-noise ratios of the estimated first and second derivatives and \(\text {SNR}_{0,M,n}\) is the signal-to-noise ratio of the input signal. The level of errors in the input data is characterized by the M signal-to-noise ratios, defined as:

$$\begin{aligned} \text {SNR}_{0,M,n}=10 \cdot \log _{10}{\left( \frac{\sum \nolimits _{l=1}^{L} \left( f_n(x+l)\right) ^2}{\sum \nolimits _{l=1}^{L} \left( \hat{f}_{M,n}(x+l) - f_n(x+l)\right) ^2}\right) }, \end{aligned}$$

where \(L=6000\) is the length of the input data sequence, \(f_n\) is the input harmonic data sequence, and \(\hat{f}_{M,n}\) is the input harmonic data sequence with \(M=1,2\) noise distributions. Signal-to-noise ratios \(\text {SNR}_{1,2,M,n}\) of the estimated first and second derivatives have been derived as follows:

$$\begin{aligned} \text {SNR}_{1,2,M,n}=10 \cdot \log _{10}{\left( \frac{\sum \nolimits _{l=1}^{L} \left( f_{M,n}^{(1,2)}(x+l)\right) ^2}{\sum \nolimits _{l=1}^{L} \left( \hat{F}_{M,n}^{(1,2)}(x+l) - f_{M,n}^{(1,2)}(x+l)\right) ^2}\right) }, \end{aligned}$$

where \(\hat{F}_{M,n}^{(1,2)}\) are estimates of the first and second derivatives and \(f_{M,n}^{(1,2)}\) are true derivatives calculated analytically.
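The SNR definitions (18)–(19) share one form: signal power over the power of the deviation from the reference. A sketch (the toy reference signal and noise amplitude are illustrative):

```python
import math

def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB, following the form of (18)-(19):
    reference power over the power of (estimate - reference)."""
    num = sum(r * r for r in reference)
    den = sum((e - r) ** 2 for e, r in zip(estimate, reference))
    return 10.0 * math.log10(num / den)

# Toy check: a clean sinusoid plus an alternating error of amplitude 0.001,
# which should land in the vicinity of 57 dB for a unit-amplitude sine.
ref = [math.sin(0.01 * n) for n in range(6000)]
noisy = [r + 0.001 * ((-1) ** n) for n, r in enumerate(ref)]
print(snr_db(ref, noisy))
```

In the experiments the same quantity is evaluated twice, once on the noisy input against the clean signal (18) and once on each estimated derivative against the analytic derivative (19), and the two are reported as a ratio.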

3.2 Experiments with nonstationary synthetic data

Nonstationary synthetic data were generated by adding an FM signal \(v_n(x)\) to the signal \(f_n (x)\) previously defined by (16), in the following way:

$$\begin{aligned} g_n(x)=f_n(x)+v_n(x) \end{aligned}$$

for \(n=[1, ... , 300]\) in which

$$\begin{aligned} v(x)=\sin \left( \omega _Ax - \omega _Bx \cdot \cos (x)\right) \end{aligned}$$

where \(\omega _A\) and \(\omega _B\) were set so that v(x) changes its frequency from 1 to 21000 Hz during one sinusoidal cycle in a data sequence of length \(L=6000\). As in the previous experiment, the noise was added to input data in the same way as shown in (17). The differentiation methods were compared as the ratio of \(\text {SNR}_{1,2,M,n}/\text {SNR}_{0,M,n}\) for \(n=300\) input harmonic data with FM modulation sets and \(M=1,2\) noise distributions.
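Generating the nonstationary component (20) can be sketched as follows (the \(\omega_A\), \(\omega_B\) values are illustrative placeholders; the paper states only that they were chosen so that v(x) sweeps from 1 Hz to 21 kHz over the sequence):

```python
import math

def fm_test_signal(L=6000, fs=44100.0, w_a=2.0 * math.pi * 10500.0,
                   w_b=2.0 * math.pi * 10500.0):
    """FM component after (20): v(x) = sin(w_A*x - w_B*x*cos(x)), sampled at fs.
    The w_a, w_b defaults are illustrative; the paper does not state its values."""
    return [math.sin(w_a * (n / fs) - w_b * (n / fs) * math.cos(n / fs))
            for n in range(L)]

v = fm_test_signal()
print(len(v))
```

Adding this component to the four-sine mixture (16), as in (19), produces the nonstationary test material used in the comparison.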

3.3 Comparison material

The method proposed in this paper and described in Sect. 2 was compared with the following numerical differentiation methods:

  • The algorithm for numerical differentiation of discrete functions with an arbitrary degree and order of accuracy using the closed explicit formula presented by H. Z. Hassan et al. in [13]. The algorithm is based on the method of undetermined coefficients and the closed form of the Vandermonde inverse matrix (labeled later as “Hassan”). The “Hassan” numerical differentiation was performed with order 8.

  • MaxPol package written in MATLAB (labeled later as “MaxPol”) which is a comprehensive tool for numerical differentiation. The MaxPol is based on the method of undetermined coefficients to render a variety of FIR kernels in a closed form that can be used to approximate the full-band or low-pass derivatives of discrete single or multidimensional signals (images) [14, 15]. Numerical differentiation was performed with a centralized FIR derivative kernel for the full-band operation.

  • Ordinary 9-point stencil central-difference formulas (hereinafter referred to as “Central-Diff”).

3.4 Results of experiments

The dependence of the error ratio \(\text {SNR}_{1,2,M,n}/\text {SNR}_{0,M,n}\) for \(n=300\) stationary and nonstationary sets of experimental data and \(M=1,2\) noise distributions, obtained for each of the compared methods, is presented in Figs. 7, 8, 9 and 10. Figures 7 and 8 show the results for the stationary signals, for the first and second derivatives, respectively. Correspondingly, Figs. 9 and 10 show the results for the nonstationary signals. The error ratio for the method proposed in this work is shown by the solid line. The “Hassan,” “MaxPol,” and “Central-Diff” methods used for the comparison are shown with the dash-dotted, dotted, and dashed lines, respectively. The sets of experimental data are shown along the abscissa and represent the random selection of input data. The error ratio is shown on a linear scale on the ordinate. The noise level shown for each condition represents the noise added to the input data. The results presented in Figs. 7, 8, 9 and 10 reveal that the proposed method gives stable results that are not prone to changes in the input data sequence, both for stationary and nonstationary cases. Of the other methods, only the performance of the “MaxPol” method is close to that of the proposed method, in the estimation of the second derivative for stationary data. In all other cases, the proposed method consistently performs better in the numerical differentiation of either stationary or nonstationary data. The presented results also reveal some general characteristics of the compared methods in the estimation of the first and second derivatives of discrete audio signals:

  • The “Central-Diff” formulas and the “Hassan” method are more computationally efficient than “MaxPol” and the proposed method. Estimation of the first derivative for stationary data gives comparable results for the “Central-Diff” formulas and the “Hassan” method, both for the no-noise and −80 dB noise conditions (Fig. 7). For the estimation of the second derivative, the “Hassan” method provides slightly better results than the “Central-Diff” formulas (Fig. 8). Similar differences in the performance of these two methods are apparent in the results for nonstationary data (Figs. 9 and 10).

  • The performance of the full-band FIR kernel in the “MaxPol” package is highly sensitive to the random selection of samples (Figs. 7 and 8), and thus to the frequency content of the input data. It was found that when the stationary input data contained frequencies above 15000 Hz (see (16)), the performance of the method decreased drastically. The experiments with nonstationary data revealed that the “MaxPol” method produces lower error ratios than the “Central-Diff” and “Hassan” methods. The performance of the “MaxPol” method is also lower than that of the proposed method, which is especially evident for nonstationary signals (Figs. 9 and 10).

  • The noise added to the input data improves, to some extent, the performance of numerical differentiation for all of the compared methods. The likely reason is that the random error introduced by the noise added to the input harmonic signal decorrelates consecutive samples, which in turn decreases the computational round-off error during the numerical differentiation process.
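For reference, the ordinary 7-point-stencil central-difference formulas that underlie the “Central-Diff” baseline (and the starting point of the proposed method) can be sketched as follows. This is the generic textbook form with \(O(h^6)\) truncation error, not the exact code used in the comparison; the sample rate and test tone are illustrative:

```python
import numpy as np

def d1_7pt(x, h):
    """First derivative via the 7-point central-difference stencil (O(h^6)).

    Output is aligned with x[3:-3]."""
    c = np.array([-1.0, 9.0, -45.0, 0.0, 45.0, -9.0, 1.0]) / (60.0 * h)
    return np.convolve(x, c[::-1], mode='valid')

def d2_7pt(x, h):
    """Second derivative via the 7-point central-difference stencil (O(h^6)).

    Output is aligned with x[3:-3]."""
    c = np.array([2.0, -27.0, 270.0, -490.0, 270.0, -27.0, 2.0]) / (180.0 * h**2)
    return np.convolve(x, c[::-1], mode='valid')

fs = 48000.0                      # illustrative sample rate
t = np.arange(1024) / fs
x = np.sin(2 * np.pi * 1000.0 * t)

dx = d1_7pt(x, 1.0 / fs)          # compare against 2*pi*1000*cos(...) on t[3:-3]
```

With h fixed at one sampling period, the truncation error of these formulas scales as \(h^6\), which is why shrinking the effective step size pays off until round-off error takes over.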

Fig. 7

First derivative error ratio \(\text {SNR}_{1,M,n}/\text {SNR}_{0,M,n}\) for stationary synthetic data generated using (17). The method proposed in this work is shown by the solid line; the methods used for the comparison are shown with the dash-dotted, dotted, and dashed lines, respectively. Left and right panels show calculations for added noise levels of − 313 dB (\(M=1\)) and − 80 dB (\(M=2\)), respectively

Fig. 8

Second derivative error ratio \(\text {SNR}_{2,M,n}/\text {SNR}_{0,M,n}\) for stationary synthetic data. Other details as in Fig. 7

Fig. 9

First derivative error ratio \(\text {SNR}_{1,M,n}/\text {SNR}_{0,M,n}\) for nonstationary synthetic data (18). Other details as in Fig. 7

Fig. 10

Second derivative error ratio \(\text {SNR}_{2,M,n}/\text {SNR}_{0,M,n}\) for nonstationary synthetic data. Other details as in Fig. 7

3.5 Transfer functions

The advantage of the proposed method is also seen in the differences between the transfer functions, which show the impact of the frequency content of the input signal on the performance of the numerical differentiation. Transfer functions for the proposed method and the three other methods were calculated as the ratio between the Fourier transforms of the estimated derivatives and of the input signals, for 300 frequencies varying from 20 to 20000 Hz (the spectral resolution was set to 8192 points, with a 512-point Hann window). The left panels in Figs. 11 and 12 show the transfer functions of the selected methods, for the first and second derivatives, respectively. The response of the proposed method (shown with the thick solid line) is nearly identical to the ideal differentiator response over the whole audio frequency band for both first- and second-order numerical differentiation, which is not the case for the other methods. The right panels in Figs. 11 and 12 show the relative error between the transfer function of the ideal differentiator and those of the proposed method and the “MaxPol” method (the best of the other methods). In nearly all cases the performance of the proposed method is better than that of the “MaxPol” method. The only exception is the estimation of the second derivative for input signals below 15000 Hz (Fig. 12, right panel), and, as seen in Fig. 8, only for stationary input signals.
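Such an empirical transfer-function measurement can be reproduced in outline as follows. The sample rate, the full-length Hann window, and the differentiator under test are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

fs = 48000            # assumed sample rate
N = 8192              # FFT length, matching the resolution quoted in the text

def measured_gain(f0, differentiate):
    """Empirical transfer-function magnitude of a differentiator at f0 (Hz)."""
    n = np.arange(N)
    x = np.sin(2 * np.pi * f0 * n / fs)
    y = differentiate(x)
    w = np.hanning(N)                 # window suppresses edge effects
    X = np.fft.rfft(w * x)
    Y = np.fft.rfft(w * y)
    k = int(round(f0 * N / fs))       # bin nearest the test frequency
    return np.abs(Y[k] / X[k])

# A plain 2nd-order central difference, scaled to units of d/dt:
central = lambda x: np.gradient(x) * fs

for f0 in (100.0, 1000.0, 10000.0):
    ideal = 2 * np.pi * f0            # |H| = omega for the ideal differentiator
    print(f"{f0:7.0f} Hz: gain/ideal = {measured_gain(f0, central) / ideal:.4f}")
```

For this plain central difference the measured gain falls off markedly toward the top of the audio band, which is exactly the kind of deviation from the ideal response that the right panels of Figs. 11 and 12 quantify for the compared methods.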

Fig. 11

Transfer functions of numerical differentiation methods for estimation of the first derivative in the audio band (left panel) and relative error for the proposed method and the “MaxPol” method (right panel)

Fig. 12

Transfer functions of numerical differentiation methods for estimation of the second derivative in the audio band (left panel) and relative error for the proposed method and the “MaxPol” method (right panel)

4 Conclusions

The paper addresses the problem of estimating the first and second derivatives of discrete audio data. Numerical differentiation of discrete audio data has several applications; for example, it is particularly important in the development of numerical solvers for ordinary and partial differential equations (ODEs and PDEs), which are fundamental in modeling audio circuit systems for digital audio effects and synthesizers.

An audio signal is always a complex combination of components spanning four decades in frequency, from a few hertz to tens of kilohertz, which are processed by both linear and nonlinear systems with parameters varying over time. Thus, it is not possible to derive an analytical mathematical expression for music or speech signals to be applied in the evaluation of numerical differentiation methods.

Discrete audio data consist of a sequence of samples that occur in a specific fixed time order; therefore, every preprocessing operation (such as smoothing or approximation) disrupts the audio data in some way. For this reason, it is important that the proposed method for estimating the first and second derivatives makes no assumptions about the data being analyzed and does not incorporate smoothing or filtering as preprocessing. To achieve the best possible numerical accuracy over the whole audio band, the step size h is treated as a regularization parameter and made variable based on the frequency range of the input signal. This was achieved with very precise fractional-delay FIR filters designed for interpolation and shifting of the processed audio data.
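The step-size idea can be illustrated with a minimal sketch: shift the data by a fraction of a sample with a fractional-delay FIR filter, then apply a central difference with a step h smaller than the sampling period. The Hann-windowed sinc filter, its length, and the half-sample shift below are simplified stand-ins for the paper's far more accurate filters designed with modified 11-term cosine-sum windows:

```python
import numpy as np

def frac_delay_fir(d, taps=41):
    # Windowed-sinc fractional-delay FIR: shifts the signal by d samples
    # (the integer latency is removed by 'same'-mode convolution below).
    # Hann window and 41 taps are illustrative assumptions.
    n = np.arange(taps) - (taps - 1) / 2
    coef = np.sinc(n - d) * np.hanning(taps)
    return coef / np.sum(coef)            # normalize DC gain to 1

fs = 48000.0
d = 0.5                                   # half-sample shift -> h = Ts / 2
h = d / fs
t = np.arange(2048) / fs
x = np.sin(2 * np.pi * 1000.0 * t)

x_fwd = np.convolve(x, frac_delay_fir(-d), mode='same')   # ~ x(t + h)
x_bwd = np.convolve(x, frac_delay_fir(+d), mode='same')   # ~ x(t - h)
dx = (x_fwd - x_bwd) / (2 * h)            # central difference, sub-sample step
```

Here the effective step is \(h = T_s/2\); in practice the usable h is limited by the accuracy of the fractional-delay filters, since any interpolation error is amplified by the division by 2h, which is why the filter design is central to the method.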

The comparison with three existing numerical differentiation methods showed that the performance of the proposed method is consistently better than that of the other methods, especially in the case of nonstationary discrete audio data. Future research on employing the proposed method for time-domain analysis and modeling of digital audio systems should consider further increasing the numerical accuracy of the second-order derivative estimation.

Availability of data and materials

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.


  1. R.L. Burden, J.D. Faires, Numerical analysis, 9th edn. (Brooks/Cole, Cengage Learning, Boston, 2011). OCLC: ocn496962633

  2. S.C. Chapra, R.P. Canale, Numerical methods for engineers, 7th edn. (McGraw-Hill Education, New York, 2015)


  3. G. Dahlquist, A. Björck, Numerical methods in scientific computing (Society for Industrial and Applied Mathematics, Philadelphia, 2008)


  4. T. Dokken, T. Lyche, A divided difference formula for the error in Hermite interpolation. BIT Numer. Math. 19(4), 539–540 (1979).


  5. B. Fornberg, Generation of finite difference formulas on arbitrarily spaced grids. Math. Comput. 51(184), 699–706 (1988).


  6. L.P. Grabar, Numerical differentiation by means of Chebyshev polynomials orthonormalized on a system of equidistant points. USSR Comput. Math. Math. Phys. 7(6), 215–220 (1967).


  7. B. Kvasov, Numerical differentiation and integration on the basis of interpolation parabolic splines. Chisl. Metody Mekh. Sploshn. Sredy 14(2), 68–80 (1983)


  8. J. Li, General explicit difference formulas for numerical differentiation. J. Comput. Appl. Math. 183(1), 29–52 (2005).


  9. J.H. Mathews, K.D. Fink, Numerical methods using MATLAB, 4th edn. (Pearson, Upper Saddle River, 2004)


  10. I.W. Selesnick, Maximally flat low-pass digital differentiator. IEEE Trans. Circ. Syst. II Analog Digit. Signal Process. 49(3), 219–223 (2002).


  11. Y. Zhang, Y. Chou, J. Chen, Z. Zhang, L. Xiao, Presentation, error analysis and numerical experiments on a group of 1-step-ahead numerical differentiation formulas. J. Comput. Appl. Math. 239, 406–414 (2013).


  12. C.F. Gerald, P.O. Wheatley, Applied numerical analysis, 7th edn. (Pearson/Addison-Wesley, Boston, 2004)


  13. H.Z. Hassan, A.A. Mohamad, G.E. Atteia, An algorithm for the finite difference approximation of derivatives with arbitrary degree and order of accuracy. J. Comput. Appl. Math. 236(10), 2622–2631 (2012).


  14. M.S. Hosseini, K.N. Plataniotis, Derivative kernels: Numerics and applications. IEEE Trans. Image Process. 26(10), 4596–4611 (2017).


  15. M.S. Hosseini, K.N. Plataniotis, Finite differences in forward and inverse imaging problems: MaxPol Design. SIAM J. Imaging Sci. 10(4), 1963–1996 (2017).


  16. I.R. Khan, R. Ohba, Closed-form expressions for the finite difference approximations of first and higher derivatives based on Taylor series. J. Comput. Appl. Math. 107(2), 179–193 (1999).


  17. I.R. Khan, R. Ohba, Digital differentiators based on Taylor series. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E82–a(12), 2822–2824 (1999)


  18. I.R. Khan, R. Ohba, Mathematical proof of explicit formulas for tap-coefficients of Taylor series based FIR digital differentiators. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E84–a(6), 1581–1584 (2001)


  19. I.R. Khan, R. Ohba, N. Hozumi, Mathematical proof of closed form expressions for finite difference approximations based on Taylor series. J. Comput. Appl. Math. 150(2), 303–309 (2003).


  20. I.R. Khan, R. Ohba, New finite difference formulas for numerical differentiation. J. Comput. Appl. Math. 126(1–2), 269–276 (2000).


  21. I.R. Khan, R. Ohba, Taylor series based finite difference approximations of higher-degree derivatives. J. Comput. Appl. Math. 154(1), 115–124 (2003).


  22. T. Moller, R. Machiraju, K. Mueller, R. Yagel, Evaluation and design of filters using a Taylor series expansion. IEEE Trans. Vis. Comput. Graph. 3(2), 184–199 (1997).


  23. P. Sylvester, Numerical formation of finite-difference operators (correspondence). IEEE Trans. Microw. Theory Tech. 18(10), 740–743 (1970).


  24. R.S. Anderssen, P. Bloomfield, Numerical differentiation procedures for non-exact data. Numer. Math. 22(3), 157–182 (1974).


  25. F.B. Hildebrand, Introduction to numerical analysis, 2nd edn. (Dover Publications, New York, 1987)


  26. I. Knowles, R.J. Renka, Methods for numerical differentiation of noisy data. Electron. J. Differ. Equ 21, 235–246 (2014)


  27. S. Lu, S.V. Pereverzev, Numerical differentiation from a viewpoint of regularization theory. Math. Comput. 75(256), 1853–1870 (2006)


  28. F. Nikolovski, I. Stojkovska, Complex-step derivative approximation in noisy environment. J. Comput. Appl. Math. 327, 64–78 (2018).


  29. A.G. Ramm, A.B. Smirnova, On stable numerical differentiation. Math. Comput. 70(235), 1131–1153 (2001)


  30. S.C. Chapra, Applied numerical methods with MATLAB for engineers and scientists, 3rd edn. (McGraw-Hill, New York, 2012). OCLC: ocn664665963

  31. C.-C. Tseng, S.-L. Lee, in 2008 IEEE International Symposium on Circuits and Systems. Design of second order digital differentiator using Richardson extrapolation and fractional delay (2008), pp. 1120–1123. Issn: 2158-1525

  32. E. Kreyszig, H. Kreyszig, E.J. Norminton, Advanced engineering mathematics, 10th edn. (Wiley, Hoboken, 2011)


  33. A. Sidi, in Practical Extrapolation Methods: Theory and Applications. Cambridge Monographs on Applied and Computational Mathematics (Cambridge University Press, Cambridge, 2003).

  34. A.G. Baydin, B.A. Pearlmutter, A.A. Radul, J.M. Siskind. Automatic Differentiation in Machine Learning: a Survey. J Mach Learn Res. 18(153), 1-43 (2018).

  35. G.F. Corliss (ed.), Automatic differentiation of algorithms: From simulation to optimization (Springer, New York, 2002)


  36. A. Griewank, A. Walther, in Evaluating Derivatives, Principles and Techniques of Algorithmic Differentiation, 2nd edn. Other Titles in Applied Mathematics (Society for Industrial and Applied Mathematics, USA, 2008).

  37. R.D. Neidinger, Introduction to automatic differentiation and MATLAB object-oriented programming. SIAM Rev. 52(3), 545–563 (2010).


  38. L.B. Rall, Automatic differentiation: Techniques and applications. Lecture notes in computer science, vol. 120 (Springer-Verlag, Berlin; New York, 1981)

  39. R. Chartrand, Numerical differentiation of noisy, nonsmooth data. ISRN Applied Mathematics 2011 (2011).

  40. J. Cheng, B. Hofmann, S. Lu, The index function and Tikhonov regularization for ill-posed problems. J. Comput. Appl. Math. 265, 110–119 (2014).


  41. J. Cullum, Numerical differentiation and regularization. SIAM J. Numer. Anal. 8(2), 254–265 (1971)


  42. M. Hanke, O. Scherzer, Inverse problems light: Numerical differentiation. Am. Math. Mon. 108(6), 512–521 (2001).


  43. D.N. Hao, L.H. Chuong, D. Lesnic, Heuristic regularization methods for numerical differentiation. Comput. Math. Appl. 63(4), 816–826 (2012).


  44. B. Hu, S. Lu, Numerical differentiation by a Tikhonov regularization method based on the discrete cosine transform. Appl. Anal. 91(4), 719–736 (2012).


  45. I. Knowles, R. Wallace, A variational method for numerical differentiation. Numer. Math. 70(1), 91–110 (1995).


  46. H. Mao, Adaptive choice of the regularization parameter in numerical differentiation. J. Comput. Math. 33(4), 415–427 (2015)


  47. Y. Mathlouthi, A. Mitiche, I.B. Ayed, Regularised differentiation for image derivatives. IET Image Process. 11(5), 310–316 (2017).


  48. D. Murio, The mollification method and the numerical solution of the inverse heat conduction problem by finite differences. Comput. Math. Appl. 17(10), 1385–1396 (1989).


  49. A.G. Ramm, E. Meister, Stable solutions of some ill-posed problems. Math. Methods Appl. Sci. 3(1), 336–363 (1981).


  50. A.G. Ramm, B.A. Smirnova. On Stable Numerical Differentiation. Mathematics of Computation, vol 70 (American Mathematical Society, 2001), p. 1131-53

  51. A. Savitzky, M.J.E. Golay, Smoothing and differentiation of data by simplified least squares procedures. Anal. Chem. 36(8), 1627–1639 (1964).


  52. J.J. Stickel, Data smoothing and numerical differentiation by a regularization method. Comput. Chem. Eng. 34(4), 467–475 (2010).


  53. V.I. Dmitriev, Z.G. Ingtem, Numerical differentiation using spline functions. Comput. Math. Model. 23(3), 312–318 (2012).


  54. W. Gao, R. Zhang, Multiquadric trigonometric spline quasi-interpolation for numerical differentiation of noisy data: A stochastic perspective. Numer. Algoritm. 77(1), 243–259 (2018).


  55. M. Li, Y. Wang, L. Ling, Numerical Caputo differentiation by radial basis functions. J. Sci. Comput. 62(1), 300–315 (2015).


  56. V. Vershinin, N. Pavlov, Approximation of derivatives by smoothing splines. Vychisl. Sistemy 98, 83–91 (1983)


  57. G. Wahba, in Spline models for observational data. CBMS-NSF Regional Conference series in applied mathematics, vol. 59 (Society for Industrial and Applied Mathematics, Philadelphia, 1990)

  58. P. Craven, G. Wahba, Smoothing noisy data with spline functions. Numer. Math. 31(4), 377–403 (1978).


  59. P.H.C. Eilers, A perfect smoother. Anal. Chem. 75(14), 3631–3636 (2003).


  60. P.C. Hansen, Analysis of discrete ill-posed problems by means of the L-curve. SIAM Rev. 34(4), 561–580 (1992).


  61. P.C. Hansen, Regularization Tools version 4.0 for Matlab 7.3. Numer. Algoritm. 46(2), 189–194 (2007).


  62. Web of Science. Accessed 20 Mar 2024

  63. Scopus. Accessed 20 Mar 2024

  64. L. Chaparro, Signals and systems using matlab (Elsevier, Waltham, 2019)


  65. S.J. Chapman, Matlab® programming for engineers, 6th edn. (Cengage, Boston, 2020). OCLC: on1048936473

  66. J.D. Faires, R.L. Burden, Numerical methods, 4th edn. (Brooks/Cole, Cengage Learning, Boston, 2013). OCLC: ocn809689438

  67. H. Kantz, T. Schreiber, Nonlinear time series analysis (Cambridge Univ. Press, Cambridge, 2010). Oclc: 796208312

  68. W.H. Press (ed.), Numerical recipes: the art of scientific computing, 3rd edn. (Cambridge University Press, Cambridge, 2007). OCLC: ocn123285342

  69. W.y. Yang, W. Cao, J. Kim, K.W. Park, H.H. Park, J. Joung, J.S. Ro, H.L. Lee, C.H. Hong, T. Im, Applied numerical methods using MATLAB®, 2nd edn. (Wiley, Hoboken, 2020)

  70. C.W. Groetsch, in The theory of Tikhonov regularization for Fredholm equations of the first kind. Research notes in mathematics, vol. 105 (Pitman Advanced Pub. Program, Boston, 1984)

  71. A.N. Tikhonov, V.I. Arsenin, in Solutions of ill-posed problems. Scripta series in mathematics (Winston; distributed solely by Halsted Press, Washington: New York, 1977)

  72. W.G. Bickley, Formulae for numerical differentiation. Math. Gaz. 25(263), 19–27 (1941).


  73. J. Wagner, P. Mazurek, A. Miekina, R.Z. Morawski, Regularised differentiation of measurement data in systems for monitoring of human movements. Biomed. Signal Process. Control 43, 265–277 (2018).


  74. J. Wagner, Regularised differentiation of measurement data in systems for healthcare-oriented monitoring of elderly persons, Dissertation. (Warsaw University of Technology, 2020)

  75. F. Eichas, U. Zölzer, in Novel Optical Systems Design and Optimization XIX. Modeling of an optocoupler-based audio dynamic range control circuit, vol. 9948 (International Society for Optics and Photonics, 2016), p. 99480w.

  76. S. Marchand, P. Depalle, in Digital Audio Effects (DAFx) Conference. Generalization of the derivative analysis method to non-stationary sinusoidal modeling (Espoo, Finland, 2008), pp. 281–288

  77. D. Medine, Dynamical systems for audio synthesis: Embracing nonlinearities and delay-free loops. Appl. Sci. 6(5), 134 (2016).


  78. D. Van Nort, J. Braasch, P. Oliveros, Sound texture recognition through dynamical systems modeling of empirical mode decomposition. J. Acoust. Soc. Am. 132(4), 2734–2744 (2012).


  79. D.T. Yeh, J.S. Abel, J.O. Smith, Automated physical modeling of nonlinear audio circuits for real-time audio effects part I: theoretical development. IEEE Trans. Audio Speech Lang. Process. 18(4), 728–737 (2010)


  80. I. Goodfellow, Y. Bengio, A. Courville, in Deep learning. Adaptive computation and machine learning (The MIT Press, Cambridge, 2016)

  81. A. Härmä, Classification of time-frequency regions in stereo audio. J. Audio Eng. Soc. 59(10), 707–720 (2011)


  82. N.J. Nalini, S. Palanivel, Music emotion recognition: The combined evidence of MFCC and residual phase. Egypt. Inf. J. 17(1), 1–10 (2016).


  83. N. Attoh-Okine, K. Barner, D. Bentil, R. Zhang, The empirical mode decomposition and the Hilbert-Huang transform. EURASIP J. Adv. Signal Process. 2008(1), 251518–2008251518 (2008).


  84. P.C. Chu, C. Fan, N. Huang, Derivative-optimized empirical mode decomposition for the Hilbert-Huang transform. J. Comput. Appl. Math. 259, 57–64 (2014).


  85. N.E. Huang, K. Hu, A.C.C. Yang, H.C. Chang, D. Jia, W.K. Liang, J.R. Yeh, C.L. Kao, C.H. Juan, C.K. Peng, J.H. Meijer, Y.H. Wang, S.R. Long, Z. Wu, On Holo-Hilbert spectral analysis: A full informational spectral representation for nonlinear and non-stationary data. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 374(2065), 20150206 (2016).

  86. N.E. Huang, Z. Shen, S.R. Long, M.C. Wu, H.H. Shih, Q. Zheng, N.C. Yen, C.C. Tung, H.H. Liu, The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 454(1971), 903–995 (1998).

  87. M. Lewandowski, A Short-Term Analysis of a Digital Sigma-Delta Modulator with Nonstationary Audio Signals (Audio Engineering Society, Warsaw, 2015)


  88. F. Jaillet, P. Balazs, M. Dörfler, N. Engelputzeder. On the Structure of the Phase around the Zeros of the Short-Time Fourier Transform. International Conference on Acoustics NAG/DAGA 2009 (2009), pp. 1996.

  89. M. Desainte-Catherine, S. Marchand, High-precision Fourier analysis of sounds using signal derivatives. J. Audio Eng. Soc. 48(7/8), 654–667 (2000)


  90. M.G. Frei, I. Osorio, Intrinsic time-scale decomposition: time-frequency-energy analysis and real-time filtering of non-stationary signals. Proc. R. Soc. A Math. Phys. Eng. Sci. 463(2078), 321–342 (2007).

  91. S. Marchand, Improving spectral analysis precision with an enhanced phase vocoder using signal derivatives. Paper presented at the 1st International Conference on Digital Audio Effects (DAFx), Digital Audio Effects (DAFx) Workshop, Barcelona, November 1998.

  92. Q. Liu, A.H. Sung, M. Qiao, Derivative-based audio steganalysis. ACM Trans. Multimed. Comput. Commun. Appl. 7(3), 18:1–18:19 (2011).

  93. A.J. Cooper, Detecting Butt-Spliced Edits in Forensic Digital Audio Recordings (Audio Engineering Society, Denmark, 2010)


  94. C. Grigoras, D. Rappaport, J.M. Smith, Analytical Framework for Digital Audio Authentication (Audio Engineering Society, USA, 2012)


  95. R. Korycki, Time and spectral analysis methods with machine learning for the authentication of digital audio recordings. Forensic Sci. Int. 230(1–3), 117–126 (2013).


  96. C. Clavel, T. Ehrette, G. Richard, in 2005 IEEE International Conference on Multimedia and Expo. Events detection for an audio-based surveillance system (2005), pp. 1306–1309. Issn: 1945-788x

  97. H. Phan, P. Koch, F. Katzberg, M. Maass, R. Mazur, I. McLoughlin, A. Mertins, in 2017 25th European Signal Processing Conference (EUSIPCO). What makes audio event detection harder than classification? (2017), pp. 2739–2743. Issn: 2076-1465

  98. A. Temko, C. Nadeu, Acoustic event detection in meeting-room environments. Pattern Recogn. Lett. 30(14), 1281–1288 (2009).


  99. A. Vafeiadis, K. Votis, D. Giakoumis, D. Tzovaras, L. Chen, R. Hamzaoui, in Audio-based event recognition system for smart homes. Audio-based event recognition system for smart homes (IEEE Xplore, San Francisco, 2017), pp. 1–8.

  100. J.J. Burred, A. Lerch, in Proceedings of the 6th international conference on digital audio effects. A hierarchical approach to automatic musical genre classification (Citeseer, London, 2003), pp. 8–11

  101. C.P. Chan, P.C. Ching, T. Lee, Noisy speech recognition using de-noised multiresolution analysis acoustic features. J. Acoust. Soc. Am. 110(5), 2567–2574 (2001).


  102. S. Davis, P. Mermelstein, Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Process. 28(4), 357–366 (1980).


  103. J.T. Foote, Content-based retrieval of music and audio (Dallas, 1997), pp. 138–147.

  104. S. Furui, Speaker-independent isolated word recognition using dynamic features of speech spectrum. IEEE Trans. Acoust. Speech Signal Process. 34(1), 52–59 (1986).


  105. B. Logan, Mel frequency cepstral coefficients for music modeling. Int. Symp. Music Inf. Retr. (2000).

  106. E. Pampalk, S. Dixon, G. Widmer, On the evaluation of perceptual similarity measures for music. Paper presented at the 6th International Conference on Digital Audio Effects (DAFx-03), London, 8-11 September 2003

  107. L.R. Rabiner, B.H. Juang, in Fundamentals of speech recognition. Prentice Hall signal processing series (PTR Prentice Hall, Englewood Cliffs, 1993)

  108. P. Ramesh, J.G. Wilpon, M.A. McGee, D.B. Roe, C.H. Lee, L.R. Rabiner, Speaker independent recognition of spontaneously spoken connected digits. Speech Commun. 11(2), 229–235 (1992).


  109. J. Aucouturier, F. Pachet, M. Sandler, “The way it Sounds’’: Timbre models for analysis and retrieval of music signals. IEEE Trans. Multimedia 7(6), 1028–1035 (2005).


  110. A. Eronen, in Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575). Comparison of features for musical instrument recognition (2001), pp. 19–22.

  111. F. Grondin, F. Michaud, Lightweight and optimized sound source localization and tracking methods for open and closed microphone array configurations. Robot. Auton. Syst. 113, 63–80 (2019).


  112. C. Joder, S. Essid, G. Richard, Temporal integration for audio classification with application to musical instrument classification. IEEE Trans. Audio Speech Lang. Process. 17(1), 174–186 (2009).


  113. K. Kumatani, J. McDonough, B. Raj, Microphone array processing for distant speech recognition: From close-talking microphones to far-field sensors. IEEE Signal Process. Mag. 29(6), 127–140 (2012). Publisher: IEEE

  114. A. Marti, M. Cobos, J.J. Lopez, J. Escolano, A steered response power iterative method for high-accuracy acoustic source localization. J. Acoust. Soc. Am. 134(4), 2627–2630 (2013).


  115. K. Nakadai, T. Takahashi, H.G. Okuno, H. Nakajima, Y. Hasegawa, H. Tsujino, Design and implementation of robot audition system ‘HARK’ - Open source software for listening to three simultaneous speakers. Adv. Robot. 24(5–6), 739–761 (2010).


  116. F. Nesta, M. Omologo, Generalized state coherence transform for multidimensional TDOA estimation of multiple sources. IEEE Trans. Audio Speech Language Process. 20(1), 246–260 (2012).


  117. L.R. Rabiner, R.W. Schafer, Theory and applications of digital speech processing, 1st edn. (Pearson, Upper Saddle River, 2011). OCLC: ocn476834107

  118. B. Rafaely, Y. Peled, M. Agmon, D. Khaykin, E. Fisher, in Speech Processing in Modern Communication: Challenges and Perspectives, ed. by I. Cohen, J. Benesty, S. Gannot. Spherical microphone array beamforming. Springer Topics in Signal Processing (Springer, Berlin, 2010), pp. 281–305.

  119. S.S. Tirumala, S.R. Shahamiri, A.S. Garhwal, R. Wang, Speaker identification features extraction methods: A systematic review. Expert Syst. Appl. 90, 250–271 (2017).


  120. M. Woelfel, J. McDonough, Distant Speech Recognition (Wiley, USA, 2009)


  121. K. Ahnert, M. Abel, Numerical differentiation of experimental data: Local versus global methods. Comput. Phys. Commun. 177(10), 764–774 (2007). Number: 10

  122. D. Aydın, M. Memmedli, R.E. Omay, Smoothing parameter selection for nonparametric regression using smoothing spline. European J. Pure Appl. Math. 6(2), 222–238 (2013)


  123. S. Chountasis, V.N. Katsikis, D. Pappas, A. Perperoglou, The Whittaker smoother and the Moore-Penrose inverse in signal reconstruction. Appl. Math. Sci. 6(25), 1205–1219 (2012)


  124. J.H. Friedman, A variable span smoother. Technical report, Stanford Univ CA Lab for Computational Statistics (1984)

  125. G.A. Wood, Data smoothing and differentiation procedures in biomechanics. Exerc. Sport Sci. Rev. 10(1), 308–362 (1982). Number: 1

  126. J. Feng, N. Simon, Gradient-based regularization parameter selection for problems with nonsmooth penalty functions. J. Comput. Graph. Stat. 27(2), 426–435 (2018).


  127. H. Albrecht, in 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221). A family of cosine-sum windows for high-resolution measurements, vol. 5 (2001), pp. 3081–3084. Issn: 1520-6149

  128. M.S. Berger, in Nonlinearity and functional analysis: lectures on nonlinear problems in mathematical analysis. Pure and applied mathematics, a series of monographs and textbooks, vol. v. 74 (Academic Press, New York, 1977)



Acknowledgements

The author wishes to thank Prof. Jan Żera for his comments on an earlier draft of this paper. His suggestions led to major modifications of the manuscript.


Funding

This work was supported by the Warsaw University of Technology, Poland, under Grant 1820/10/201/POB2/2021.

Author information

Authors and Affiliations



Not applicable.

Corresponding author

Correspondence to Marcin Lewandowski.

Ethics declarations

Competing interests

The author declares no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Lewandowski, M. Estimating the first and second derivatives of discrete audio data. J AUDIO SPEECH MUSIC PROC. 2024, 31 (2024).
