Articles

Page 4 of 7

  1. An approach is proposed for creating location-specific audio textures for virtual location-exploration services. The presented approach creates audio textures by processing a small amount of audio recorded at ...

    Authors: Toni Heittola, Annamaria Mesaros, Dani Korpi, Antti Eronen and Tuomas Virtanen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:9

    Content type: Research

  2. In this paper, an analytical approach to estimate the instantaneous frequencies of a multicomponent signal is presented. A non-stationary signal composed of oscillation modes or resonances is described by a mu...

    Authors: Mohammadali Sebghati, Hamidreza Amindavar and James A Ritcey

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:8

    Content type: Research

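Setting the paper's analytical estimator aside, the basic quantity involved — instantaneous frequency as the discrete derivative of the phase of an analytic (complex) signal — can be sketched in a few lines. This is a generic baseline, not the authors' method, and `inst_freq` is an illustrative name:

```python
import cmath
import math

def inst_freq(z, fs):
    """Instantaneous frequency (Hz) from an analytic (complex) signal,
    estimated as the first difference of the phase: the phase of
    z[n+1] * conj(z[n]) is the per-sample phase increment."""
    return [fs / (2 * math.pi) * cmath.phase(z[n + 1] * z[n].conjugate())
            for n in range(len(z) - 1)]
```

For a single complex exponential at 100 Hz sampled at 8 kHz, every estimate comes out at 100 Hz; for a genuinely multicomponent signal the modes must first be separated, which is the harder problem the paper addresses.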

  3. This paper presents an optical music recognition (OMR) system to process the handwritten musical scores of Kunqu Opera written in Gong-Che Notation (GCN). First, it introduces the background of Kunqu Opera and GC...

    Authors: Gen-Fang Chen and Jia-Shing Sheu

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:7

    Content type: Research

  4. We present in this paper a voice conversion (VC) method for a person with an articulation disorder resulting from athetoid cerebral palsy. The movement of such speakers is limited by their athetoid symptoms, a...

    Authors: Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi and Yasuo Ariki

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:5

    Content type: Research

  5. We propose a novel approach of integrating exemplar-based template matching with statistical modeling to improve continuous speech recognition. We choose the template unit to be context-dependent phone segment...

    Authors: Xie Sun and Yunxin Zhao

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:4

    Content type: Research

  6. This paper proposes a new aliasing cancelation algorithm for the transition between non-aliased coding and transform coding with time domain aliasing cancelation (TDAC). It is effectively utilized for unified ...

    Authors: Jeongook Song and Hong-Goo Kang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:3

    Content type: Research

  7. We propose an integrative method of recognizing gestures, such as pointing, that accompany speech. Speech generated simultaneously with gestures can assist in their recognition, and since this occurs in...

    Authors: Madoka Miki, Norihide Kitaoka, Chiyomi Miyajima, Takanori Nishino and Kazuya Takeda

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:2

    Content type: Research

  8. Bandwidth extension is an effective technique for enhancing the quality of audio signals by reconstructing their high-frequency components. In this paper, a novel blind bandwidth extension method is proposed b...

    Authors: Chang-Chun Bao, Xin Liu, Yong-Tao Sha and Xing-Tao Zhang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2014 2014:1

    Content type: Research

  9. Prosody and prosodic boundaries carry significant information regarding linguistics and paralinguistics and are important aspects of speech. In the field of prosodic event detection, many local acoustic featur...

    Authors: Junhong Zhao, Wei-Qiang Zhang, Hua Yuan, Michael T Johnson, Jia Liu and Shanhong Xia

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:30

    Content type: Research

  10. In this paper, we propose a novel noise-robustness method known as weighted sub-band histogram equalization (WS-HEQ) to improve speech recognition accuracy in noise-corrupted environments. Considering the obse...

    Authors: Jeih-weih Hung and Hao-teng Fan

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:29

    Content type: Research

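As context for the sub-band variant studied here, plain histogram equalization of a feature stream can be sketched as a rank-based mapping onto standard normal quantiles. This is a common baseline, not the paper's WS-HEQ (which additionally weights sub-bands), and `histogram_equalize` is an illustrative name:

```python
from statistics import NormalDist

def histogram_equalize(x):
    """Map each value to the standard normal quantile at its empirical
    CDF position, so the equalized features follow N(0, 1) regardless
    of how noise has distorted the original distribution."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    ref = NormalDist()
    out = [0.0] * n
    for rank, i in enumerate(order):
        out[i] = ref.inv_cdf((rank + 0.5) / n)  # mid-rank CDF estimate
    return out
```

Because the mapping is monotone, the ordering of feature values is preserved while their distribution is normalized.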

  11. The framework of a voice conversion system is expected to emphasize both the static and dynamic characteristics of the speech signal. Conventional approaches such as Mel frequency cepstrum coefficients and line...

    Authors: Jagannath H Nirmal, Mukesh A Zaveri, Suprava Patnaik and Pramod H Kachare

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:28

    Content type: Research

  12. This paper investigates real-time N-dimensional wideband sound source localization in outdoor (far-field) and low-degree reverberation cases, using a simple N-microphone arrangement. Outdoor sound source localiza...

    Authors: Ali Pourmohammad and Seyed Mohammad Ahadi

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:27

    Content type: Research

  13. Affective computing, especially from speech, is one of the key steps toward building more natural and effective human-machine interaction. In recent years, several emotional speech corpora in different languag...

    Authors: Caglar Oflazoglu and Serdar Yildirim

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:26

    Content type: Research

  14. The performance of thresholding-based methods for speech enhancement largely depends upon the estimation of the exact threshold value. In this paper, a new thresholding-based speech enhancement approach, where...

    Authors: Tahsina Farah Sanam and Celia Shahnaz

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:25

    Content type: Research

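The generic rule underlying thresholding-based enhancement is simple: shrink transform-domain (e.g., wavelet) coefficients toward zero by a threshold and discard what falls below it. A minimal soft-thresholding sketch follows; the paper's contribution lies in estimating the threshold, not in this rule, and `soft_threshold` is an illustrative name:

```python
def soft_threshold(coeffs, thr):
    """Soft thresholding: zero out coefficients with magnitude below
    thr and shrink the rest toward zero by thr, keeping their sign."""
    return [max(abs(c) - thr, 0.0) * (1 if c >= 0 else -1) for c in coeffs]
```

Small (presumably noise-dominated) coefficients vanish, while large (presumably speech-dominated) ones survive with reduced magnitude.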

  15. This paper investigates multi-modal aspects of audiovisual quality assessment for interactive communication services. It shows how perceived auditory and visual qualities integrate into an overall audiovisual qu...

    Authors: Benjamin Belmudez and Sebastian Möller

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:24

    Content type: Research

  16. Query-by-Example Spoken Term Detection (QbE STD) aims at retrieving data from a speech data repository given an acoustic query containing the term of interest as input. Nowadays, it has been receiving much int...

    Authors: Javier Tejedor, Doroteo T Toledano, Xavier Anguera, Amparo Varona, Lluís F Hurtado, Antonio Miguel and José Colás

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:23

    Content type: Research

  17. The recurrent neural network language model (RNNLM) has shown significant promise for statistical language modeling. In this work, a new class-based output layer method is introduced to further improve the RNN...

    Authors: Yongzhe Shi, Wei-Qiang Zhang, Jia Liu and Michael T Johnson

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:22

    Content type: Research

  18. This paper proposes a novel and robust voice activity detection (VAD) algorithm utilizing a long-term spectral flatness measure (LSFM) which is capable of working at 10 dB and lower signal-to-noise ratios (SNRs)...

    Authors: Yanna Ma and Akinori Nishihara

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:87

    Content type: Research

    The Erratum to this article has been published in EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:30
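Spectral flatness — the ratio of the geometric to the arithmetic mean of the power spectrum — is the core statistic here: it sits near 1 for noise-like frames and near 0 for the tonal structure of voiced speech. A per-frame sketch is below; the paper's LSFM additionally averages over a long-term window, which this omits, and `spectral_flatness` is an illustrative name:

```python
import math

def spectral_flatness(power):
    """Spectral flatness of one frame's power spectrum: geometric mean
    over arithmetic mean, in (0, 1]; flat (noisy) spectra score near 1,
    peaky (tonal) spectra near 0."""
    gm = math.exp(sum(math.log(p) for p in power) / len(power))
    am = sum(power) / len(power)
    return gm / am
```

A VAD built on this would compare the (long-term averaged) flatness against a threshold, labeling low-flatness frames as speech.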

  19. Many features have been proposed for speech-based emotion recognition, and a majority of them are frame based or statistics estimated from frame-based features. Temporal information is typically modelled on a ...

    Authors: Vidhyasaharan Sethu, Eliathamby Ambikairajah and Julien Epps

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:19

    Content type: Research

  20. Nonnegative matrix factorization (NMF) is developed for parts-based representation of nonnegative signals with the sparseness constraint. The signals are adequately represented by a set of basis vectors and th...

    Authors: Jen-Tzung Chien and Hsin-Lung Hsieh

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:18

    Content type: Research

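The baseline behind this line of work — NMF with Lee-Seung multiplicative updates for the Euclidean cost — can be sketched in pure Python. The sparseness constraint discussed in the paper is not included, and `nmf` is an illustrative name:

```python
import random

def nmf(V, r, iters=200, seed=0):
    """Factor a nonnegative matrix V (list of lists) as V ~= W @ H with
    inner dimension r, using Lee-Seung multiplicative updates that keep
    W and H elementwise nonnegative throughout."""
    rnd = random.Random(seed)
    m, n = len(V), len(V[0])
    W = [[rnd.random() + 0.1 for _ in range(r)] for _ in range(m)]
    H = [[rnd.random() + 0.1 for _ in range(n)] for _ in range(r)]

    def mat(A, B):  # plain matrix product on nested lists
        return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
                for row in A]

    def T(A):  # transpose
        return [list(col) for col in zip(*A)]

    eps = 1e-9  # guards against division by zero
    for _ in range(iters):
        WH = mat(W, H)
        num, den = mat(T(W), V), mat(T(W), WH)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(r)]
        WH = mat(W, H)
        num, den = mat(V, T(H)), mat(WH, T(H))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(r)]
             for i in range(m)]
    return W, H
```

On a rank-1 nonnegative matrix the updates recover an exact factorization (up to scaling between W and H); sparseness penalties modify the update rules on top of this scheme.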

  21. In this study, we focus on the classification of neutral and stressed speech based on a physical model. In order to represent the characteristics of the vocal folds and vocal tract during the process of speech...

    Authors: Xiao Yao, Takatoshi Jitsuhiro, Chiyomi Miyajima, Norihide Kitaoka and Kazuya Takeda

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:17

    Content type: Research

  22. This paper presents a bimodal acoustic-visual synthesis technique that concurrently generates the acoustic speech signal and a 3D animation of the speaker’s outer face. This is done by concatenating bimodal di...

    Authors: Slim Ouni, Vincent Colotte, Utpala Musti, Asterios Toutios, Brigitte Wrobel-Dautcourt, Marie-Odile Berger and Caroline Lavecchia

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:16

    Content type: Research

  23. Most existing automatic chord recognition systems use a chromagram in front-end processing and some sort of classifier (e.g., hidden Markov model, Gaussian mixture model (GMM), support vector machine, or other...

    Authors: Maksim Khadkevich and Maurizio Omologo

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:15

    Content type: Research

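The chromagram front end common to chord recognition folds spectral energy onto the 12 pitch classes, discarding octave information. A minimal sketch over detected spectral peaks is below (the paper's actual front end works on full spectra with tuning compensation; `chroma` and the A-referenced bin layout are illustrative choices):

```python
import math

def chroma(peaks):
    """Fold (frequency_hz, energy) pairs onto 12 pitch-class bins,
    with bin 0 = A (440 Hz reference); octaves map to the same bin."""
    bins = [0.0] * 12
    for f, e in peaks:
        pc = round(12 * math.log2(f / 440.0)) % 12  # semitones from A, mod octave
        bins[pc] += e
    return bins
```

A classifier (HMM, GMM, SVM, ...) then labels each frame's 12-dimensional chroma vector with a chord.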

  24. Cochannel speech separation aims to separate two speech signals from a single mixture. In a supervised scenario, the identities of two speakers are given, and current methods use pre-trained speaker models for...

    Authors: Ke Hu and DeLiang Wang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:14

    Content type: Research

  25. A challenging open question in music classification is which music representation (i.e., audio features) and which machine learning algorithm is appropriate for a specific music classification task. To address...

    Authors: Yannis Panagakis and Constantine Kotropoulos

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:13

    Content type: Research

  26. Multiple-model based speech recognition (MMSR) has been shown to be quite successful in noisy speech recognition. Since it employs multiple hidden Markov model (HMM) sets that correspond to various noise types...

    Authors: Yongjoo Chung and John HL Hansen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:12

    Content type: Research

  27. A novel speech bandwidth extension method based on audio watermark is presented in this paper. The time-domain and frequency-domain envelope parameters are extracted from the high-frequency components of speec...

    Authors: Zhe Chen, Chengyong Zhao, Guosheng Geng and Fuliang Yin

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:10

    Content type: Research

  28. This paper presents a novel lossless compression technique of the context-based adaptive arithmetic coding which can be used to further compress the quantized parameters in audio codec. The key feature of the ...

    Authors: Jing Wang, Xuan Ji, Shenghui Zhao, Xiang Xie and Jingming Kuang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:9

    Content type: Research

  29. This article analyzes and compares the influence of different types of spectral and prosodic features for Czech and Slovak emotional speech classification based on Gaussian mixture models (GMM). The influence of initi...

    Authors: Jiří Přibil and Anna Přibilová

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:8

    Content type: Research

  30. In this article, we describe a speaker adaptation method based on the probabilistic 2-mode analysis of training models. Probabilistic 2-mode analysis is a probabilistic extension of multilinear analysis. We ap...

    Authors: Yongwon Jeong

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:7

    Content type: Research

  31. Availability of large amounts of raw unlabeled data has sparked the recent surge in semi-supervised learning research. In most works, however, it is assumed that labeled and unlabeled data come from the same d...

    Authors: Konstantin Markov and Tomoko Matsui

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:6

    Content type: Research

  32. A comprehensive system for facial animation of generic 3D head models driven by speech is presented in this article. In the training stage, audio-visual information is extracted from audio-visual training data...

    Authors: Lucas D Terissi, Mauricio Cerda, Juan C Gómez, Nancy Hitschfeld-Kahler and Bernard Girau

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:5

    Content type: Research

  33. Blind source separation (BSS) and sound activity detection (SAD) from a sound source mixture with minimum prior information are two major requirements for computational auditory scene analysis that recognizes ...

    Authors: Kohei Nagira, Takuma Otsuka and Hiroshi G Okuno

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:4

    Content type: Research

  34. We propose an efficient solution to the problem of sparse linear prediction analysis of the speech signal. Our method is based on minimization of a weighted l2-norm of the prediction error. The weighting function...

    Authors: Vahid Khanagha and Khalid Daoudi

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:3

    Content type: Research

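The reweighting idea can be seen with a toy first-order predictor: solve a weighted least squares for the coefficient, then set each weight to 1/|error| and repeat, so the weighted l2 cost approaches the sparsity-promoting l1 cost (an iteratively reweighted least squares reading). This is a sketch under that assumption — the paper uses full-order prediction and its own weighting function — and `sparse_lp_order1` is an illustrative name:

```python
def sparse_lp_order1(x, iters=10, eps=1e-6):
    """Fit x[n] ~= a * x[n-1] by iteratively reweighted least squares:
    each pass solves the weighted l2 problem in closed form, then
    reweights by 1 / |residual| to emphasize small-error samples."""
    idx = range(1, len(x))
    w = [1.0] * (len(x) - 1)
    a = 0.0
    for _ in range(iters):
        num = sum(wi * x[n] * x[n - 1] for n, wi in zip(idx, w))
        den = sum(wi * x[n - 1] ** 2 for n, wi in zip(idx, w))
        a = num / den
        w = [1.0 / (abs(x[n] - a * x[n - 1]) + eps) for n in idx]
    return a
```

On an exact AR(1) sequence the closed-form solve already recovers the coefficient; the reweighting matters when the residual should be sparse rather than small everywhere.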

  35. A lot of effort has been made in Computational Auditory Scene Analysis (CASA) to segregate target speech from monaural mixtures. Based on the principle of CASA, this article proposes an improved algorithm for ...

    Authors: Wang Yu, Lin Jiajun, Chen Ning and Yuan Wenhao

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:2

    Content type: Research

  36. The work presented in this article studies how the context information can be used in the automatic sound event detection process, and how the detection system can benefit from such information. Humans are usi...

    Authors: Toni Heittola, Annamaria Mesaros, Antti Eronen and Tuomas Virtanen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2013 2013:1

    Content type: Research

  37. This article describes a modified technique for enhancing noisy speech to improve automatic speech recognition (ASR) performance. The proposed approach improves the widely used spectral subtraction which inher...

    Authors: Hari Krishna Maganti and Marco Matassoni

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2012 2012:29

    Content type: Research

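The widely used scheme the authors start from is ordinary power spectral subtraction with an over-subtraction factor and a spectral floor. A minimal per-frame sketch follows (parameter values are illustrative, and `spectral_subtract` is a name chosen here, not the paper's):

```python
def spectral_subtract(power, noise, alpha=2.0, beta=0.01):
    """Subtract alpha times the noise power estimate from each bin of
    the frame's power spectrum, flooring at beta times the noisy power
    to limit the 'musical noise' caused by negative results."""
    return [max(p - alpha * n, beta * p) for p, n in zip(power, noise)]
```

The modified techniques in this line of work typically refine the noise estimate or make alpha and beta adaptive per band.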

  38. Conventional parametric stereo (PS) audio coding employs inter-channel phase difference and overall phase difference as phase parameters. In this article, it is shown that those parameters cannot correctly rep...

    Authors: Dong-il Hyun, Young-cheol Park and Dae Hee Youn

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2012 2012:27

    Content type: Research

  39. In this article, the authors propose an optimally designed fixed beamformer (BF) for stereophonic acoustic echo cancelation (SAEC) in real hands-free communication applications. Several contributions related t...

    Authors: Matteo Pirro, Stefano Squartini, Laura Romoli and Francesco Piazza

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2012 2012:26

    Content type: Research

  40. The rapid spread of digital data usage in many real-life applications has urged new and effective ways to ensure their security. Efficient secrecy can be achieved, at least in part, by implementing steganogra...

    Authors: Fatiha Djebbar, Beghdad Ayad, Karim Abed Meraim and Habib Hamam

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2012 2012:25

    Content type: Review

  41. Mood is an important aspect of music and knowledge of mood can be used as a basic feature in music recommender and retrieval systems. A listening experiment was carried out establishing ratings for various moo...

    Authors: Bert den Brinker, Ralph van Dinther and Janto Skowronek

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2012 2012:24

    Content type: Research

  42. A vast number of audio features have been proposed in the literature to characterize the content of audio signals. In order to overcome specific problems related to the existing features (such as lack of discrimi...

    Authors: Toni Mäkinen, Serkan Kiranyaz, Jenni Raitoharju and Moncef Gabbouj

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2012 2012:23

    Content type: Research

  43. Humans exhibit a remarkable ability to reliably classify sound sources in the environment even in presence of high levels of noise. In contrast, most engineering systems suffer a drastic drop in performance wh...

    Authors: Sridhar Krishna Nemala, Dmitry N Zotkin, Ramani Duraiswami and Mounya Elhilali

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2012 2012:22

    Content type: Research

  44. A new method to secure speech communication using the discrete wavelet transforms (DWT) and the fast Fourier transform is presented in this article. In the first phase of the hiding technique, we separate the ...

    Authors: Siwar Rekik, Driss Guerchi, Sid-Ahmed Selouani and Habib Hamam

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2012 2012:20

    Content type: Research

  45. In this article, we present the evaluation results for the task of speaker diarization of broadcast news, which was part of the Albayzin 2010 evaluation campaign of language and speech technologies. The evalua...

    Authors: Martin Zelenák, Henrik Schulz and Javier Hernando

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2012 2012:19

    Content type: Research
