
Articles

Page 7 of 8

  1. Speech feature extraction has been a key focus in robust speech recognition research. In this work, we discuss data-driven linear feature transformations applied to feature vectors in the logarithmic mel-frequ...

    Authors: Hyunsin Park, Tetsuya Takiguchi and Yasuo Ariki

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:690451

    Content type: Research Article


  2. There are many ways of synthesizing sound on a computer. The method that we consider, called a mass-spring system, synthesizes sound by simulating the vibrations of a network of interconnected masses, springs, an...

    Authors: Don Morgan and Sanzheng Qiao

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:947823

    Content type: Research Article

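Entry 2 above describes mass-spring sound synthesis only in outline. As a generic illustration of the underlying idea (not the authors' method — all parameter names and values here are hypothetical), even a single damped mass-spring oscillator with unit mass produces a decaying tone when integrated sample by sample:

```python
import math

def mass_spring_tone(freq_hz=440.0, damping=20.0, sr=44100, n=1000):
    """Synthesize a decaying tone by integrating one damped mass-spring
    oscillator (unit mass) with semi-implicit (symplectic) Euler steps."""
    k = (2.0 * math.pi * freq_hz) ** 2   # spring constant giving the target pitch
    dt = 1.0 / sr
    x, v = 1.0, 0.0                      # initial displacement and velocity
    out = []
    for _ in range(n):
        v += (-k * x - damping * v) * dt # acceleration = -k*x - c*v (unit mass)
        x += v * dt                      # position update uses the new velocity
        out.append(x)
    return out
```

A network of such oscillators, coupled by springs and dampers as the abstract sketches, generalizes this loop to a vector of displacements.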

  3. In 2003 and 2004, the ISO/IEC MPEG standardization committee added two amendments to their MPEG-4 audio coding standard. These amendments concern parametric coding techniques and encompass Spectral Band Replic...

    Authors: AC den Brinker, J Breebaart, P Ekstrand, J Engdegård, F Henn, K Kjörling, W Oomen and H Purnhagen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:468971

    Content type: Review Article


  4. Performance of speech recognition systems strongly degrades in the presence of background noise, like the driving noise inside a car. In contrast to existing works, we aim to improve noise robustness focusing ...

    Authors: Björn Schuller, Martin Wöllmer, Tobias Moosmayr and Gerhard Rigoll

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:942617

    Content type: Research Article


  5. While linear prediction (LP) has become immensely popular in speech modeling, it does not seem to provide a good approach for modeling audio signals. This is somewhat surprising, since a tonal signal consistin...

    Authors: Toon van Waterschoot and Marc Moonen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2008:706935

    Content type: Research Article


  6. Text corpus size is an important issue when building a language model (LM). This is a particularly important issue for languages where little data is available. This paper introduces an LM adaptation technique...

Authors: Arnar Thor Jensson, Koji Iwano and Sadaoki Furui

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2008:573832

    Content type: Research Article


  7. Robust automatic language identification (LID) is a task of identifying the language from a short utterance spoken by an unknown speaker. One of the mainstream approaches named parallel phone recognition langu...

    Authors: Hongbin Suo, Ming Li, Ping Lu and Yonghong Yan

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:674859

    Content type: Research Article


  8. This paper investigates the problem of speaker recognition in noisy conditions. A new approach called nonnegative tensor principal component analysis (NTPCA) with sparse constraint is proposed for speech featu...

    Authors: Qiang Wu and Liqing Zhang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:578612

    Content type: Research Article


  9. Improving the intelligibility of speech in different environments is one of the main objectives of hearing aid signal processing algorithms. Hearing aids typically employ beamforming techniques using multiple ...

    Authors: Sriram Srinivasan, Ashish Pandharipande and Kees Janse

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:824797

    Content type: Research Article


  10. Online personalization of hearing instruments refers to learning preferred tuning parameter values from user feedback through a control wheel (or remote control), during normal operation of the hearing aid. We...

    Authors: Alexander Ypma, Job Geurts, Serkan Özer, Erik van der Werf and Bert de Vries

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:183456

    Content type: Research Article


  11. A proven method for achieving effective automatic speech recognition (ASR) due to speaker differences is to perform acoustic feature speaker normalization. More effective speaker normalization methods are needed ...

    Authors: Umit H. Yapanel and John H.L. Hansen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:148967

    Content type: Research Article


  12. Perception of moving sound sources obeys different brain processes from those mediating the localization of static sound events. In view of these specificities, a preprocessing model was designed, based on the...

    Authors: R Kronland-Martinet and T Voinier

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:849696

    Content type: Research Article


  13. The present paper proposes a new approach for detecting music boundaries, such as the boundary between music pieces or the boundary between a music piece and a speech section for automatic segmentation of musi...

    Authors: Yoshiaki Itoh, Akira Iwabuchi, Kazunori Kojima, Masaaki Ishigame, Kazuyo Tanaka and Shi-Wook Lee

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:480786

    Content type: Research Article


  14. Most audio compression formats are based on the idea of low bit rate transparent encoding. As these types of audio signals are starting to migrate from portable players with inexpensive headphones to higher qu...

    Authors: Demetrios Cantzos, Athanasios Mouchtaris and Chris Kyriakakis

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:462830

    Content type: Research Article


  15. We propose a novel approach to improve adaptive decorrelation filtering- (ADF-) based speech source separation in diffuse noise. The effects of noise on system adaptation and separation outputs are handled sep...

    Authors: Rong Hu and Yunxin Zhao

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:349214

    Content type: Research Article


  16. This paper proposes a new algorithm for a directional aid with hearing defenders. Users of existing hearing defenders experience distorted information, or in worst cases, directional information may not be per...

    Authors: Benny Sällberg, Farook Sattar and Ingvar Claesson

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:274684

    Content type: Research Article


  17. We propose a new low complexity, low delay, and fast converging frequency-domain adaptive algorithm for network echo cancellation in VoIP exploiting MMax and sparse partial (SP) tap-selection criteria in the f...

Authors: Xiang (Shawn) Lin, Andy W.H. Khong, Miloš Doroslovački and Patrick A. Naylor

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:156960

    Content type: Research Article


  18. Binaural cue coding (BCC) is an efficient technique for spatial audio rendering by using the side information such as interchannel level difference (ICLD), interchannel time difference (ICTD), and interchannel...

    Authors: Bo Qiu, Yong Xu, Yadong Lu and Jun Yang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:618104

    Content type: Research Article


  19. The behavior of time delay estimation (TDE) is well understood and therefore attractive to apply in acoustic source localization (ASL). A time delay between microphones maps into a hyperbola. Furthermore, the ...

    Authors: Pasi Pertilä, Teemu Korhonen and Ari Visa

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:278185

    Content type: Research Article


  20. Rhythmic information plays an important role in Music Information Retrieval. Example applications include automatically annotating large databases by genre, meter, ballroom dance style or tempo, fully automate...

    Authors: Björn Schuller, Florian Eyben and Gerhard Rigoll

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:846135

    Content type: Research Article


  21. The phasor representation is introduced to identify the characteristic of the active noise control (ANC) systems. The conventional representation, transfer function, cannot explain the fact that the performanc...

    Authors: Fu-Kun Chen, Ding-Horng Chen and Yue-Dar Jou

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:126859

    Content type: Research Article


  22. A multiresolution source/filter model for coding of audio source signals (spot recordings) is proposed. Spot recordings are a subset of the multimicrophone recordings of a music performance, before the mixing ...

    Authors: Athanasios Mouchtaris, Kiki Karadimou and Panagiotis Tsakalides

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:624321

    Content type: Research Article


  23. The automatic recognition of foreign-accented Arabic speech is a challenging task since it involves a large number of nonnative accents. As well, the nonnative speech data available for training are generally ...

Authors: Yousef Ajami Alotaibi, Sid-Ahmed Selouani and Douglas O'Shaughnessy

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2008 2008:679831

    Content type: Research Article


  24. This paper deals with continuous-time filter transfer functions that resemble tuning curves at particular set of places on the basilar membrane of the biological cochlea and that are suitable for practical VLS...

    Authors: AG Katsiamis, EM Drakakis and RF Lyon

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:063685

    Content type: Research Article


  25. This work is the result of an interdisciplinary collaboration between scientists from the fields of audio signal processing, phonetics and cognitive neuroscience aiming at studying the perception of modificati...

    Authors: Sølvi Ystad, Cyrille Magne, Snorre Farner, Gregory Pallone, Mitsuko Aramaki, Mireille Besson and Richard Kronland-Martinet

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:030194

    Content type: Research Article


  26. Multistage vector quantization (MSVQ) is a technique for low complexity implementation of high-dimensional quantizers, which has found applications within speech, audio, and image coding. In this paper, a mult...

    Authors: Pradeepa Yahampath

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:067146

    Content type: Research Article


  27. Variability of speaker accent is a challenge for effective human communication as well as speech technology including automatic speech recognition and accent identification. The motivation of this study is to ...

    Authors: Ayako Ikeno and John HL Hansen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:076030

    Content type: Research Article


  28. A noise suppression algorithm is proposed based on filtering the spectrotemporal modulations of noisy signals. The modulations are estimated from a multiscale representation of the signal spectrogram generated...

    Authors: Nima Mesgarani and Shihab Shamma

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:042357

    Content type: Research Article


  29. We describe two voice-to-phoneme conversion algorithms for speaker-independent voice-tag creation specifically targeted at applications on embedded platforms. These algorithms (batch mode and sequential) are comp...

Authors: Yan Ming Cheng, Changxue Ma and Lynette Melnar

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2008:568737

    Content type: Research Article


  30. This paper experimentally shows the importance of perceptual continuity of the expressive strength in vocal timbre for natural change in vocal expression. In order to synthesize various and continuous expressi...

    Authors: Tomoko Yonezawa, Noriko Suzuki, Shinji Abe, Kenji Mase and Kiyoshi Kogure

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:023807

    Content type: Research Article


  31. Many modern speech bandwidth extension techniques predict the high-frequency band based on features extracted from the lower band. While this method works for certain types of speech, problems arise when the c...

    Authors: Visar Berisha and Andreas Spanias

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:016816

    Content type: Research Article


  32. Wide band digital audio signals have a very high data-rate associated with them due to their complex nature and demand for high-quality reproduction. Although recent technological advancements have significant...

    Authors: Karthikeyan Umapathy and Sridhar Krishnan

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:051563

    Content type: Research Article


  33. This paper proposes a new technique for improving the performance of linear prediction analysis by utilizing a refined version of the autocorrelation function. Problems in analyzing voiced speech using linear ...

    Authors: M Shahidur Rahman and Tetsuya Shimamura

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:045962

    Content type: Research Article


  34. Recent research on the TIMIT corpus suggests that longer-length acoustic models are more appropriate for pronunciation variation modelling than the context-dependent phones that conventional automatic speech r...

    Authors: Annika Hämäläinen, Lou Boves, Johan de Veth and Louis ten Bosch

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:046460

    Content type: Research Article


  35. When applying automatic speech recognition (ASR) to meeting recordings including spontaneous speech, the performance of ASR is greatly reduced by the overlap of speech events. In this paper, a method of separa...

    Authors: Futoshi Asano, Kiyoshi Yamamoto, Jun Ogata, Miichi Yamada and Masami Nakamura

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:027616

    Content type: Research Article


  36. We describe an FFT-based companding algorithm for preprocessing speech before recognition. The algorithm mimics tone-to-tone suppression and masking in the auditory system to improve automatic speech recogniti...

    Authors: Bhiksha Raj, Lorenzo Turicchia, Bent Schmidt-Nielsen and Rahul Sarpeshkar

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:065420

    Content type: Research Article


  37. Dereverberation is required in various speech processing applications such as handsfree telephony and voice-controlled systems, especially when signals are applied that are recorded in a moderately or highly r...

    Authors: Koen Eneman and Marc Moonen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:051831

    Content type: Research Article


  38. In various adaptive estimation applications, such as acoustic echo cancellation within teleconferencing systems, the input signal is a highly correlated speech. This, in general, leads to extremely slow conver...

    Authors: Yan Wu Jennifer, John Homer, Geert Rombouts and Marc Moonen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:071495

    Content type: Research Article


  39. We investigate novel algorithms to improve the convergence and reduce the complexity of time-domain convolutive blind source separation (BSS) algorithms. First, we propose MMax partial update time-domain convo...

    Authors: Qiongfeng Pan and Tyseer Aboulnasr

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:092528

    Content type: Research Article


  40. A sparse system identification algorithm for network echo cancellation is presented. This new approach exploits both the fast convergence of the improved proportionate normalized least mean square (IPNLMS) alg...

    Authors: Andy W.H. Khong, Patrick A. Naylor and Jacob Benesty

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:084376

    Content type: Research Article


  41. The μ-law proportionate normalized least mean square (MPNLMS) algorithm has been proposed recently to solve the slow convergence problem of the proportionate normalized least mean square (PNLMS) algorithm afte...

    Authors: Hongyang Deng and Miloš Doroslovački

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:096101

    Content type: Research Article


  42. This paper proposes an audio-visual speech recognition method using lip information extracted from side-face images as an attempt to increase noise robustness in mobile environments. Our proposed method assume...

    Authors: Koji Iwano, Tomoaki Yoshinaga, Satoshi Tamura and Sadaoki Furui

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:064506

    Content type: Research Article


  43. This paper introduces new algorithms for the blind separation of audio sources using modal decomposition. Indeed, audio signals and, in particular, musical signals can be well approximated by a sum of damped s...

    Authors: Abdeldjalil Aïssa-El-Bey, Karim Abed-Meraim and Yves Grenier

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:085438

    Content type: Research Article


  44. An acoustic echo cancellation structure with a single loudspeaker and multiple microphones is, from a system identification perspective, generally modelled as a single-input multiple-output system. Such a syst...

    Authors: Fredric Lindstrom, Christian Schüldt and Ingvar Claesson

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2007 2007:078439

    Content type: Research Article

