Articles

  1. In 2003 and 2004, the ISO/IEC MPEG standardization committee added two amendments to their MPEG-4 audio coding standard. These amendments concern parametric coding techniques and encompass Spectral Band Replic...

    Authors: AC den Brinker, J Breebaart, P Ekstrand, J Engdegård, F Henn, K Kjörling, W Oomen and H Purnhagen
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:468971
  2. Performance of speech recognition systems strongly degrades in the presence of background noise, like the driving noise inside a car. In contrast to existing works, we aim to improve noise robustness focusing ...

    Authors: Björn Schuller, Martin Wöllmer, Tobias Moosmayr and Gerhard Rigoll
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:942617
  3. While linear prediction (LP) has become immensely popular in speech modeling, it does not seem to provide a good approach for modeling audio signals. This is somewhat surprising, since a tonal signal consistin...

    Authors: Toon van Waterschoot and Marc Moonen
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2008:706935
  4. Text corpus size is an important issue when building a language model (LM). This is a particularly important issue for languages where little data is available. This paper introduces an LM adaptation technique...

    Authors: Arnar Thor Jensson, Koji Iwano and Sadaoki Furui
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2008:573832
  5. Robust automatic language identification (LID) is a task of identifying the language from a short utterance spoken by an unknown speaker. One of the mainstream approaches named parallel phone recognition langu...

    Authors: Hongbin Suo, Ming Li, Ping Lu and Yonghong Yan
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:674859
  6. This paper investigates the problem of speaker recognition in noisy conditions. A new approach called nonnegative tensor principal component analysis (NTPCA) with sparse constraint is proposed for speech featu...

    Authors: Qiang Wu and Liqing Zhang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:578612
  7. Improving the intelligibility of speech in different environments is one of the main objectives of hearing aid signal processing algorithms. Hearing aids typically employ beamforming techniques using multiple ...

    Authors: Sriram Srinivasan, Ashish Pandharipande and Kees Janse
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:824797
  8. Online personalization of hearing instruments refers to learning preferred tuning parameter values from user feedback through a control wheel (or remote control), during normal operation of the hearing aid. We...

    Authors: Alexander Ypma, Job Geurts, Serkan Özer, Erik van der Werf and Bert de Vries
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:183456
  9. A proven method for achieving effective automatic speech recognition (ASR) due to speaker differences is to perform acoustic feature speaker normalization. More effective speaker normalization methods are needed ...

    Authors: Umit H. Yapanel and John H.L. Hansen
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:148967
  10. Perception of moving sound sources obeys different brain processes from those mediating the localization of static sound events. In view of these specificities, a preprocessing model was designed, based on the...

    Authors: R Kronland-Martinet and T Voinier
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:849696
  11. The present paper proposes a new approach for detecting music boundaries, such as the boundary between music pieces or the boundary between a music piece and a speech section for automatic segmentation of musi...

    Authors: Yoshiaki Itoh, Akira Iwabuchi, Kazunori Kojima, Masaaki Ishigame, Kazuyo Tanaka and Shi-Wook Lee
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:480786
  12. Most audio compression formats are based on the idea of low bit rate transparent encoding. As these types of audio signals are starting to migrate from portable players with inexpensive headphones to higher qu...

    Authors: Demetrios Cantzos, Athanasios Mouchtaris and Chris Kyriakakis
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:462830
  13. We propose a novel approach to improve adaptive decorrelation filtering- (ADF-) based speech source separation in diffuse noise. The effects of noise on system adaptation and separation outputs are handled sep...

    Authors: Rong Hu and Yunxin Zhao
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:349214
  14. This paper proposes a new algorithm for a directional aid with hearing defenders. Users of existing hearing defenders experience distorted information, or in worst cases, directional information may not be per...

    Authors: Benny Sällberg, Farook Sattar and Ingvar Claesson
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:274684
  15. We propose a new low complexity, low delay, and fast converging frequency-domain adaptive algorithm for network echo cancellation in VoIP exploiting MMax and sparse partial (SP) tap-selection criteria in the f...

    Authors: Xiang (Shawn) Lin, Andy W.H. Khong, Miloš Doroslovački and Patrick A. Naylor
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:156960
  16. Binaural cue coding (BCC) is an efficient technique for spatial audio rendering by using the side information such as interchannel level difference (ICLD), interchannel time difference (ICTD), and interchannel...

    Authors: Bo Qiu, Yong Xu, Yadong Lu and Jun Yang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:618104
  17. The behavior of time delay estimation (TDE) is well understood and therefore attractive to apply in acoustic source localization (ASL). A time delay between microphones maps into a hyperbola. Furthermore, the ...

    Authors: Pasi Pertilä, Teemu Korhonen and Ari Visa
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:278185
  18. Rhythmic information plays an important role in Music Information Retrieval. Example applications include automatically annotating large databases by genre, meter, ballroom dance style or tempo, fully automate...

    Authors: Björn Schuller, Florian Eyben and Gerhard Rigoll
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:846135
  19. The phasor representation is introduced to identify the characteristic of the active noise control (ANC) systems. The conventional representation, transfer function, cannot explain the fact that the performanc...

    Authors: Fu-Kun Chen, Ding-Horng Chen and Yue-Dar Jou
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:126859
  20. A multiresolution source/filter model for coding of audio source signals (spot recordings) is proposed. Spot recordings are a subset of the multimicrophone recordings of a music performance, before the mixing ...

    Authors: Athanasios Mouchtaris, Kiki Karadimou and Panagiotis Tsakalides
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:624321
  21. The automatic recognition of foreign-accented Arabic speech is a challenging task since it involves a large number of nonnative accents. As well, the nonnative speech data available for training are generally ...

    Authors: Yousef Ajami Alotaibi, Sid-Ahmed Selouani and Douglas O'Shaughnessy
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:679831
  22. This paper deals with continuous-time filter transfer functions that resemble tuning curves at particular set of places on the basilar membrane of the biological cochlea and that are suitable for practical VLS...

    Authors: AG Katsiamis, EM Drakakis and RF Lyon
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:063685
  23. This work is the result of an interdisciplinary collaboration between scientists from the fields of audio signal processing, phonetics and cognitive neuroscience aiming at studying the perception of modificati...

    Authors: Sølvi Ystad, Cyrille Magne, Snorre Farner, Gregory Pallone, Mitsuko Aramaki, Mireille Besson and Richard Kronland-Martinet
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:030194
  24. Multistage vector quantization (MSVQ) is a technique for low complexity implementation of high-dimensional quantizers, which has found applications within speech, audio, and image coding. In this paper, a mult...

    Authors: Pradeepa Yahampath
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:067146
  25. Variability of speaker accent is a challenge for effective human communication as well as speech technology including automatic speech recognition and accent identification. The motivation of this study is to ...

    Authors: Ayako Ikeno and John HL Hansen
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:076030
  26. A noise suppression algorithm is proposed based on filtering the spectrotemporal modulations of noisy signals. The modulations are estimated from a multiscale representation of the signal spectrogram generated...

    Authors: Nima Mesgarani and Shihab Shamma
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:042357
  27. We describe two voice-to-phoneme conversion algorithms for speaker-independent voice-tag creation specifically targeted at applications on embedded platforms. These algorithms (batch mode and sequential) are comp...

    Authors: Yan Ming Cheng, Changxue Ma and Lynette Melnar
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2008:568737
  28. This paper experimentally shows the importance of perceptual continuity of the expressive strength in vocal timbre for natural change in vocal expression. In order to synthesize various and continuous expressi...

    Authors: Tomoko Yonezawa, Noriko Suzuki, Shinji Abe, Kenji Mase and Kiyoshi Kogure
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:023807
  29. Many modern speech bandwidth extension techniques predict the high-frequency band based on features extracted from the lower band. While this method works for certain types of speech, problems arise when the c...

    Authors: Visar Berisha and Andreas Spanias
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:016816
  30. Wide band digital audio signals have a very high data-rate associated with them due to their complex nature and demand for high-quality reproduction. Although recent technological advancements have significant...

    Authors: Karthikeyan Umapathy and Sridhar Krishnan
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:051563
  31. This paper proposes a new technique for improving the performance of linear prediction analysis by utilizing a refined version of the autocorrelation function. Problems in analyzing voiced speech using linear ...

    Authors: M Shahidur Rahman and Tetsuya Shimamura
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:045962
  32. Recent research on the TIMIT corpus suggests that longer-length acoustic models are more appropriate for pronunciation variation modelling than the context-dependent phones that conventional automatic speech r...

    Authors: Annika Hämäläinen, Lou Boves, Johan de Veth and Louis ten Bosch
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:046460
  33. When applying automatic speech recognition (ASR) to meeting recordings including spontaneous speech, the performance of ASR is greatly reduced by the overlap of speech events. In this paper, a method of separa...

    Authors: Futoshi Asano, Kiyoshi Yamamoto, Jun Ogata, Miichi Yamada and Masami Nakamura
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:027616
  34. We describe an FFT-based companding algorithm for preprocessing speech before recognition. The algorithm mimics tone-to-tone suppression and masking in the auditory system to improve automatic speech recogniti...

    Authors: Bhiksha Raj, Lorenzo Turicchia, Bent Schmidt-Nielsen and Rahul Sarpeshkar
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:065420
  35. Dereverberation is required in various speech processing applications such as handsfree telephony and voice-controlled systems, especially when signals are applied that are recorded in a moderately or highly r...

    Authors: Koen Eneman and Marc Moonen
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:051831
  36. In various adaptive estimation applications, such as acoustic echo cancellation within teleconferencing systems, the input signal is a highly correlated speech. This, in general, leads to extremely slow conver...

    Authors: Yan Wu Jennifer, John Homer, Geert Rombouts and Marc Moonen
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:071495
  37. We investigate novel algorithms to improve the convergence and reduce the complexity of time-domain convolutive blind source separation (BSS) algorithms. First, we propose MMax partial update time-domain convo...

    Authors: Qiongfeng Pan and Tyseer Aboulnasr
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:092528
  38. A sparse system identification algorithm for network echo cancellation is presented. This new approach exploits both the fast convergence of the improved proportionate normalized least mean square (IPNLMS) alg...

    Authors: Andy W.H. Khong, Patrick A. Naylor and Jacob Benesty
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:084376
  39. The μ-law proportionate normalized least mean square (MPNLMS) algorithm has been proposed recently to solve the slow convergence problem of the proportionate normalized least mean square (PNLMS) algorithm afte...

    Authors: Hongyang Deng and Miloš Doroslovački
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:096101
  40. This paper proposes an audio-visual speech recognition method using lip information extracted from side-face images as an attempt to increase noise robustness in mobile environments. Our proposed method assume...

    Authors: Koji Iwano, Tomoaki Yoshinaga, Satoshi Tamura and Sadaoki Furui
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:064506
  41. This paper introduces new algorithms for the blind separation of audio sources using modal decomposition. Indeed, audio signals and, in particular, musical signals can be well approximated by a sum of damped s...

    Authors: Abdeldjalil Aïssa-El-Bey, Karim Abed-Meraim and Yves Grenier
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:085438
  42. An acoustic echo cancellation structure with a single loudspeaker and multiple microphones is, from a system identification perspective, generally modelled as a single-input multiple-output system. Such a syst...

    Authors: Fredric Lindstrom, Christian Schüldt and Ingvar Claesson
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:078439
  43. Proportionate adaptive filters can improve the convergence speed for the identification of sparse systems as compared to their conventional counterparts. In this paper, the idea of proportionate adaptation is ...

    Authors: Stefan Werner, José A Apolinário Jr. and Paulo S R Diniz
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:034242
  44. The paper provides an analysis of the transient and the steady-state behavior of a filtered-x partial-error affine projection algorithm suitable for multichannel active noise control. The analysis relies on energ...

    Authors: Alberto Carini and Giovanni L Sicuranza
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:031314
  45. This paper investigates the significance of combining cepstral features derived from the modified group delay function and from the short-time spectral magnitude like the MFCC. The conventional group delay fun...

    Authors: Rajesh M. Hegde, Hema A. Murthy and V. R. R. Gadde
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2006 2007:079032

Annual Journal Metrics

  • Citation Impact 2023
    Journal Impact Factor: 1.7
    5-year Journal Impact Factor: 1.6
    Source Normalized Impact per Paper (SNIP): 1.051
    SCImago Journal Rank (SJR): 0.414

    Speed 2023
    Submission to first editorial decision (median days): 17
    Submission to acceptance (median days): 154

    Usage 2023
    Downloads: 368,607
    Altmetric mentions: 70

Funding your APC

Open access funding and policy support by SpringerOpen

We offer a free open access support service to make it easier for you to discover and apply for article-processing charge (APC) funding.