Articles

Page 4 of 8

  1. To enhance search performance, this paper presents an improved version of the reduced candidate mechanism (RCM), an algebraic codebook search conducted on an algebraic code-excited linear prediction (...

    Authors: Ning-Yun Ku, Cheng-Yu Yeh and Shaw-Hwa Hwang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:30

    Content type: Research

  2. This paper proposes two novel approaches for parameter estimation of a superpositional intonation model. These approaches present linguistic and paralinguistic assumptions for initializing a pre-existing stand...

    Authors: Humberto M Torres, Jorge A Gurlekian, Hansjörg Mixdorff and Hartmut Pfitzinger

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:28

    Content type: Research

  3. In this paper, unsupervised learning is used to separate percussive and harmonic sounds from monaural non-vocal polyphonic signals. Our algorithm is based on a modified non-negative matrix factorization (NMF) ...

    Authors: Francisco Jesus Canadas-Quesada, Pedro Vera-Candeas, Nicolas Ruiz-Reyes, Julio Carabias-Orti and Pablo Cabanas-Molero

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:26

    Content type: Research

  4. Composers may not provide instructions for playing their works, especially for instrument solos, and therefore, different musicians may give very different interpretations of the same work. Such differences us...

    Authors: Yi-Ju Lin, Tien-Ming Wang, Ta-Chun Chen, Yin-Lin Chen, Wei-Chen Chang and Alvin WY Su

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:25

    Content type: Research

  5. This paper investigates the estimation of underlying articulatory targets of Thai vowels as invariant representation of vocal tract shapes by means of analysis-by-synthesis based on acoustic data. The basic id...

    Authors: Santitham Prom-on, Peter Birkholz and Yi Xu

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:23

    Content type: Research

  6. The paper describes an auditory processing-based feature extraction strategy for robust speech recognition in environments where conventional automatic speech recognition (ASR) approaches are not successful. ...

    Authors: Hari Krishna Maganti and Marco Matassoni

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:21

    Content type: Research

  7. In this paper, a two-stage scheme is proposed to deal with the difficult problem of acoustic echo cancellation (AEC) in a single-channel scenario in the presence of noise. In order to overcome the major challeng...

    Authors: Upal Mahbub, Shaikh Anowarul Fattah, Wei-Ping Zhu and M Omair Ahmad

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:20

    Content type: Research

  8. Neural network language models (NNLM) have proven to be quite powerful for sequence modeling, including feed-forward NNLM (FNNLM), recurrent NNLM (RNNLM), etc. One main issue of concern for NNLM is the hea...

    Authors: Yongzhe Shi, Wei-Qiang Zhang, Meng Cai and Jia Liu

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:19

    Content type: Research

  9. When several acoustic sources are simultaneously active in a meeting room scenario, and both the position of the sources and the identity of the time-overlapped sound classes have been estimated, the problem o...

    Authors: Rupayan Chakraborty, Climent Nadeu and Taras Butko

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:18

    Content type: Research

  10. Speech enhancement is in increasing demand in mobile communications and faces great challenges in real, noisy ambient environments. This paper develops an effective spatial-frequency domain speech enhancemen...

    Authors: Yue Xian Zou, Peng Wang, Yong Qing Wang, Christian H Ritz and Jiangtao Xi

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:17

    Content type: Research

  11. It was recently shown that delta-sigma quantization (DSQ) can be used for optimal multiple description (MD) coding of Gaussian sources. The DSQ scheme combined oversampling, prediction, and noise-shaping in or...

    Authors: Jack Leegaard, Jan Østergaard, Søren Holdt Jensen and Ram Zamir

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:16

    Content type: Research

  12. A dereverberation method based on generalized spectral subtraction (GSS) using multi-channel least mean-squares (MCLMS) was previously proposed. The results of speech recognition experiments showed that ...

    Authors: Zhaofeng Zhang, Longbiao Wang and Atsuhiko Kai

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:15

    Content type: Research

  13. The robustness of n-gram language models depends on the quality of text data on which they have been trained. The text corpora collected from various resources such as web pages or electronic documents are charac...

    Authors: Ján Staš, Jozef Juhár and Daniel Hládek

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:14

    Content type: Research

  14. We present a feature enhancement method that uses neural networks (NNs) to map the reverberant feature in a log-melspectral domain to its corresponding anechoic feature. The mapping is done by cascade NNs trai...

    Authors: Aditya Arie Nugraha, Kazumasa Yamamoto and Seiichi Nakagawa

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:13

    Content type: Research

  15. Decision tree-clustered context-dependent hidden semi-Markov models (HSMMs) are typically used in statistical parametric speech synthesis to represent probability densities of acoustic features given contextua...

    Authors: Soheil Khorram, Hossein Sameti, Fahimeh Bahmaninezhad, Simon King and Thomas Drugman

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:12

    Content type: Research

  16. Eigenphone-based speaker adaptation outperforms conventional maximum likelihood linear regression (MLLR) and eigenvoice methods when there is sufficient adaptation data. However, it suffers from severe over-fi...

    Authors: Wen-Lin Zhang, Wei-Qiang Zhang, Dan Qu and Bi-Cheng Li

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:11

    Content type: Research

  17. Three-dimensional (3D) audio technologies are booming with the success of 3D video technology. The surge in audio channels makes the resulting data too large for the available transmission bandwidth and storage media, and the...

    Authors: Shi Dong, Ruimin Hu, Xiaochen Wang, Yuhong Yang and Weiping Tu

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:10

    Content type: Research

  18. An approach is proposed for creating location-specific audio textures for virtual location-exploration services. The presented approach creates audio textures by processing a small amount of audio recorded at ...

    Authors: Toni Heittola, Annamaria Mesaros, Dani Korpi, Antti Eronen and Tuomas Virtanen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:9

    Content type: Research

  19. In this paper, an analytical approach to estimate the instantaneous frequencies of a multicomponent signal is presented. A non-stationary signal composed of oscillation modes or resonances is described by a mu...

    Authors: Mohammadali Sebghati, Hamidreza Amindavar and James A Ritcey

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:8

    Content type: Research

  20. This paper presents an optical music recognition (OMR) system to process the handwritten musical scores of Kunqu Opera written in Gong-Che Notation (GCN). First, it introduces the background of Kunqu Opera and GC...

    Authors: Gen-Fang Chen and Jia-Shing Sheu

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:7

    Content type: Research

  21. We present in this paper a voice conversion (VC) method for a person with an articulation disorder resulting from athetoid cerebral palsy. The movement of such speakers is limited by their athetoid symptoms, a...

    Authors: Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi and Yasuo Ariki

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:5

    Content type: Research

  22. We propose a novel approach of integrating exemplar-based template matching with statistical modeling to improve continuous speech recognition. We choose the template unit to be context-dependent phone segment...

    Authors: Xie Sun and Yunxin Zhao

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:4

    Content type: Research

  23. This paper proposes a new aliasing cancelation algorithm for the transition between non-aliased coding and transform coding with time domain aliasing cancelation (TDAC). It is effectively utilized for unified ...

    Authors: Jeongook Song and Hong-Goo Kang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:3

    Content type: Research

  24. We propose an integrative method for recognizing gestures, such as pointing, that accompany speech. Speech generated simultaneously with gestures can assist in their recognition, and since this occurs in...

    Authors: Madoka Miki, Norihide Kitaoka, Chiyomi Miyajima, Takanori Nishino and Kazuya Takeda

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:2

    Content type: Research

  25. Bandwidth extension is an effective technique for enhancing the quality of audio signals by reconstructing their high-frequency components. In this paper, a novel blind bandwidth extension method is proposed b...

    Authors: Chang-Chun Bao, Xin Liu, Yong-Tao Sha and Xing-Tao Zhang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:1

    Content type: Research

  26. Prosody and prosodic boundaries carry significant information regarding linguistics and paralinguistics and are important aspects of speech. In the field of prosodic event detection, many local acoustic featur...

    Authors: Junhong Zhao, Wei-Qiang Zhang, Hua Yuan, Michael T Johnson, Jia Liu and Shanhong Xia

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:30

    Content type: Research

  27. In this paper, we propose a novel noise-robustness method known as weighted sub-band histogram equalization (WS-HEQ) to improve speech recognition accuracy in noise-corrupted environments. Considering the obse...

    Authors: Jeih-weih Hung and Hao-teng Fan

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:29

    Content type: Research

  28. The framework of a voice conversion system is expected to emphasize both the static and dynamic characteristics of the speech signal. Conventional approaches like Mel-frequency cepstral coefficients and line...

    Authors: Jagannath H Nirmal, Mukesh A Zaveri, Suprava Patnaik and Pramod H Kachare

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:28

    Content type: Research

  29. This paper investigates real-time N-dimensional wideband sound source localization in outdoor (far-field) and low-degree reverberation cases, using a simple N-microphone arrangement. Outdoor sound source localiza...

    Authors: Ali Pourmohammad and Seyed Mohammad Ahadi

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:27

    Content type: Research

  30. Affective computing, especially from speech, is one of the key steps toward building more natural and effective human-machine interaction. In recent years, several emotional speech corpora in different languag...

    Authors: Caglar Oflazoglu and Serdar Yildirim

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:26

    Content type: Research

  31. The performance of thresholding-based methods for speech enhancement largely depends upon the estimation of the exact threshold value. In this paper, a new thresholding-based speech enhancement approach, where...

    Authors: Tahsina Farah Sanam and Celia Shahnaz

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:25

    Content type: Research

  32. This paper investigates multi-modal aspects of audiovisual quality assessment for interactive communication services. It shows how perceived auditory and visual qualities integrate into an overall audiovisual qu...

    Authors: Benjamin Belmudez and Sebastian Möller

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:24

    Content type: Research

  33. Query-by-Example Spoken Term Detection (QbE STD) aims at retrieving data from a speech data repository given an acoustic query containing the term of interest as input. Nowadays, it has been receiving much int...

    Authors: Javier Tejedor, Doroteo T Toledano, Xavier Anguera, Amparo Varona, Lluís F Hurtado, Antonio Miguel and José Colás

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:23

    Content type: Research

  34. The recurrent neural network language model (RNNLM) has shown significant promise for statistical language modeling. In this work, a new class-based output layer method is introduced to further improve the RNN...

    Authors: Yongzhe Shi, Wei-Qiang Zhang, Jia Liu and Michael T Johnson

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:22

    Content type: Research

  35. This paper proposes a novel and robust voice activity detection (VAD) algorithm utilizing a long-term spectral flatness measure (LSFM), which is capable of working at 10 dB and lower signal-to-noise ratios (SNRs)....

    Authors: Yanna Ma and Akinori Nishihara

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:87

    Content type: Research

    The Erratum to this article has been published in EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:30

  36. Many features have been proposed for speech-based emotion recognition, and a majority of them are frame based or statistics estimated from frame-based features. Temporal information is typically modelled on a ...

    Authors: Vidhyasaharan Sethu, Eliathamby Ambikairajah and Julien Epps

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:19

    Content type: Research

  37. Nonnegative matrix factorization (NMF) is developed for parts-based representation of nonnegative signals with the sparseness constraint. The signals are adequately represented by a set of basis vectors and th...

    Authors: Jen-Tzung Chien and Hsin-Lung Hsieh

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:18

    Content type: Research

  38. In this study, we focus on the classification of neutral and stressed speech based on a physical model. In order to represent the characteristics of the vocal folds and vocal tract during the process of speech...

    Authors: Xiao Yao, Takatoshi Jitsuhiro, Chiyomi Miyajima, Norihide Kitaoka and Kazuya Takeda

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:17

    Content type: Research

  39. This paper presents a bimodal acoustic-visual synthesis technique that concurrently generates the acoustic speech signal and a 3D animation of the speaker’s outer face. This is done by concatenating bimodal di...

    Authors: Slim Ouni, Vincent Colotte, Utpala Musti, Asterios Toutios, Brigitte Wrobel-Dautcourt, Marie-Odile Berger and Caroline Lavecchia

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:16

    Content type: Research

  40. Most existing automatic chord recognition systems use a chromagram in front-end processing and some sort of classifier (e.g., hidden Markov model, Gaussian mixture model (GMM), support vector machine, or other...

    Authors: Maksim Khadkevich and Maurizio Omologo

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:15

    Content type: Research

  41. Cochannel speech separation aims to separate two speech signals from a single mixture. In a supervised scenario, the identities of two speakers are given, and current methods use pre-trained speaker models for...

    Authors: Ke Hu and DeLiang Wang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:14

    Content type: Research

  42. A challenging open question in music classification is which music representation (i.e., audio features) and which machine learning algorithm is appropriate for a specific music classification task. To address...

    Authors: Yannis Panagakis and Constantine Kotropoulos

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:13

    Content type: Research

  43. Multiple-model based speech recognition (MMSR) has been shown to be quite successful in noisy speech recognition. Since it employs multiple hidden Markov model (HMM) sets that correspond to various noise types...

    Authors: Yongjoo Chung and John HL Hansen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:12

    Content type: Research

  44. A novel speech bandwidth extension method based on audio watermark is presented in this paper. The time-domain and frequency-domain envelope parameters are extracted from the high-frequency components of speec...

    Authors: Zhe Chen, Chengyong Zhao, Guosheng Geng and Fuliang Yin

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:10

    Content type: Research

  45. This paper presents a novel lossless compression technique based on context-based adaptive arithmetic coding, which can be used to further compress the quantized parameters in an audio codec. The key feature of the ...

    Authors: Jing Wang, Xuan Ji, Shenghui Zhao, Xiang Xie and Jingming Kuang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:9

    Content type: Research
