
Articles

Page 9 of 11

  1. The availability of haptic interfaces in music content processing offers interesting possibilities for performer-instrument interaction in musical expression. These new musical instruments can precisely modulate ...

    Authors: Victor Zappi, Antonio Pistillo, Sylvain Calinon, Andrea Brogni and Darwin Caldwell
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2012 2012:2
  2. Most voice activity detection (VAD) schemes operate in the discrete Fourier transform (DFT) domain, classifying each sound frame as speech or noise based on its DFT coefficients. These coefficients... (a generic sketch of this frame-wise scheme follows this list)

    Authors: Shiwen Deng and Jiqing Han
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2011:12
  3. The main objective of the work presented in this paper was to develop a complete system that would accomplish the original vision of the MALACH project. Its goals were to employ automatic speech recognition...

    Authors: Josef Psutka, Jan Švec, Josef V Psutka, Jan Vaněk, Aleš Pražák, Luboš Šmídl and Pavel Ircing
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2011:10
  4. In this article, a novel technique based on the empirical mode decomposition methodology for processing speech features is proposed and investigated. The empirical mode decomposition generalizes the Fourier an...

    Authors: Kuo-Hau Wu, Chia-Ping Chen and Bing-Feng Yeh
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2011:9
  5. This article proposes a multiscale product (MP)-based method for estimating the open quotient (OQ) from the speech waveform. The MP is obtained by calculating the wavelet transform coefficients of the speech s... (the multiscale-product operation is sketched after this list)

    Authors: Wafa Saidi, Aicha Bouzid and Noureddine Ellouze
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2011:8
  6. In high-quality conferencing systems, it is desirable to perform noise reduction with as little speech distortion as possible. Previous work, based on time-varying amplification controlled by signal-to-noise ra...

    Authors: Markus Borgh, Magnus Berggren, Christian Schüldt, Fredric Lindström and Ingvar Claesson
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2011:7
  7. The first large-vocabulary speech recognition system for the Persian language is introduced in this paper. This continuous speech recognition system uses standard, state-of-the-art speech and language ...

    Authors: Hossein Sameti, Hadi Veisi, Mohammad Bahrani, Bagher Babaali and Khosro Hosseinzadeh
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2011:6
  8. The spectrum subtraction method is one of the most common ways to remove noise from a spectrum. Like many noise reduction methods, it uses the discrete Fourier transform (D... (a textbook sketch of spectral subtraction follows this list)

    Authors: Toshio Yoshizawa, Shigeki Hirobayashi and Tadanobu Misawa
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2011:5
  9. This work studies the task of automatic emotion detection in music. Music may evoke more than one emotion at the same time, and single-label classification and regression cannot model this multiplicity. ... (a minimal multi-label sketch follows this list)

    Authors: Konstantinos Trohidis, Grigorios Tsoumakas, George Kalliris and Ioannis Vlahavas
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2011:4
  10. The frequency-to-channel mapping for cochlear implant (CI) signal processors was originally designed to optimize speech perception and generally does not preserve the harmonic structure of music sounds. An alg...

    Authors: Sherif Abdellatif Omran, Waikong Lai, Michael Büchler and Norbert Dillier
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2011:2
  11. Recently, audio segmentation has attracted research interest because of its usefulness in several applications like audio indexing and retrieval, subtitling, monitoring of acoustic scenes, etc. Moreover, a pre...

    Authors: Taras Butko and Climent Nadeu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2011:1
  12. Authors: Bhiksha Raj, Paris Smaragdis, Malcolm Slaney, Chung-Hsien Wu, Liming Chen and Hyoung-Gook Kim
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2010:467278
  13. We address the question of whether and how boosting and bagging can be used for speech recognition. In order to do this, we compare two different boosting schemes, one at the phoneme level and one at the utter...

    Authors: Christos Dimitrakakis and Samy Bengio
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2011:426792
  14. To overcome harmonic structure distortions of complex tones in the low-frequency range due to the frequency-to-electrode mapping function used in Nucleus cochlear implants, two modified frequency maps based on...

    Authors: Sherif A. Omran, Waikong Lai and Norbert Dillier
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2010:948565
  15. This paper describes a novel approach for localization of multiple sources overlapping in time. The proposed algorithm relies on acoustic maps computed in multi-microphone settings, which are descriptions of t...

    Authors: Alessio Brutti, Maurizio Omologo and Piergiorgio Svaizer
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:147495
  16. The correlogram is an important representation for periodic signals, widely used in pitch estimation and source separation. For these applications, the major problems of the correlogram are its low resolution and re... (a minimal correlogram pitch sketch follows this list)

    Authors: Xueliang Zhang, Wenju Liu and Bo Xu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:252374
  17. In this paper we present a method to search for environmental sounds in large unstructured databases of user-submitted audio, using a general sound events taxonomy from ecological acoustics. We discuss the use...

    Authors: Gerard Roma, Jordi Janer, Stefan Kersten, Mattia Schirosa, Perfecto Herrera and Xavier Serra
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:960863
  18. Degrouping is the key component in MPEG Layer II audio decoding. It mainly involves the arithmetic operations of division and modulo. So far, no dedicated degrouping algorithm or architecture has been well realized....

    Authors: Tsung-Han Tsai
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:737450
  19. Organizing a database of user-contributed environmental sound recordings allows sound files to be linked not only by the semantic tags and labels applied to them, but also to other sounds with similar acoustic...

    Authors: Gordon Wichern, Brandon Mechtley, Alex Fink, Harvey Thornburg and Andreas Spanias
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:192363
  20. A multimicrophone speech enhancement algorithm for binaural hearing aids that preserves interaural time delays was proposed recently. The algorithm is based on multichannel Wiener filtering and relies on a voi... (a per-bin Wiener filter sketch follows this list)

    Authors: Jasmina Catic, Torsten Dau, Jörg M. Buchholz and Fredrik Gran
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:840294
  21. A method is described for quantifying the quality of wideband speech codecs. Two parameters are derived from signal-based speech quality model estimations: (i) a wideband equipment impairment factor

    Authors: Sebastian Möller, Nicolas Côté, Valérie Gautier-Turbin, Nobuhiko Kitawaki and Akira Takahashi
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:782731
  22. In multiway loudspeaker systems, digital signal processing techniques have been used to correct the frequency response, the propagation time, and the lobing errors. These solutions are mainly based on correct...

    Authors: Hmaied Shaiek and Jean-Marc Boucher
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:928439
  23. Humans represent sounds to others and receive information about sounds from others using onomatopoeia. Such representation is useful for obtaining and reporting the acoustic features and impressions of actual ...

    Authors: Masayuki Takada, Nozomu Fujisawa, Fumino Obata and Shin-ichiro Iwamiya
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:674248
  24. We give a brief discussion of the amplitude and frequency variation rates of the sinusoid representation of signals. In particular, we derive three inequalities showing that these rates are upper bounded by ... (the underlying model and rates are written out after this list)

    Authors: Xue Wen and Mark Sandler
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:941732
  25. This paper presents a method for estimating the amplitude of coincident partials generated by harmonic musical sources (instruments and vocals). It was developed as an alternative to the commonly used interpol...

    Authors: Jayme Garcia Arnal Barbedo and George Tzanetakis
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:523791
  26. Audio podcasting is now widely used by online sites such as newspapers, web portals, and journals to deliver audio content to users through download or subscription. Within 1 to 30 ...

    Authors: MN Nguyen, Qi Tian and Ping Xue
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:572571
  27. Frequency-domain blind source separation (BSS) performs poorly in high reverberation because the independence assumption collapses in each frequency bin as the number of bins increases. To improve the separ...

    Authors: Lin Wang, Heping Ding and Fuliang Yin
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:797962
  28. Speaker identification performance is almost perfect in neutral talking environments. However, performance deteriorates significantly in shouted talking environments. This work is devoted to proposing, ...

    Authors: Ismail Shahin
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:862138
  29. Theoretical and applied environmental sound research is gaining prominence, but progress has been hampered by the lack of a comprehensive, high-quality, accessible database of environmental sounds. An ongoing ...

    Authors: Brian Gygi and Valeriy Shafiro
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:654914
  30. This paper presents a model-based method for coding the LSF parameters of LPC speech coders on a "long-term" basis, that is, beyond the usual 20–30 ms frame duration. The objective is to provide efficient LSF ...

    Authors: Laurent Girin
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:597039
  31. Authors: Georg Stemmer, Elmar Nöth and Vijay Parsa
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:835974
  32. When a number of speakers are simultaneously active, for example in meetings or noisy public places, the sources of interest need to be separated from interfering speakers and from each other in order to be ro...

    Authors: Dorothea Kolossa, Ramon Fernandez Astudillo, Eugen Hoffmann and Reinhold Orglmeister
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:651420
  33. The aim of the study is to transpose and extend to a set of environmental sounds the notion of sound descriptors usually used for musical sounds. Four separate primary studies dealing with interior car sounds,...

    Authors: Nicolas Misdariis, Antoine Minard, Patrick Susini, Guillaume Lemaitre, Stephen McAdams and Etienne Parizet
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:362013
  34. The mood of music is among the most relevant and commercially promising, yet challenging, attributes for retrieval in large music collections. In this respect, this article first provides a short overview of methods...

    Authors: Björn Schuller, Johannes Dorfner and Gerhard Rigoll
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:735854
  35. This work explores the effect of mismatches between adults' and children's speech due to differences in various acoustic correlates on the automatic speech recognition performance under mismatched conditions. ...

    Authors: Shweta Ghai and Rohit Sinha
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:318785
  36. Human communication about entities and events is primarily linguistic in nature. While visual representations of information have been shown to be highly effective as well, relatively little is known about the commu...

    Authors: Xiaojuan Ma, Christiane Fellbaum and Perry Cook
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:404860
  37. The paper considers the task of recognizing phonemes and words from a singing input by using a phonetic hidden Markov model recognizer. The system is targeted to both monophonic singing and singing in polyphon...

    Authors: Annamaria Mesaros and Tuomas Virtanen
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:546047
  38. With ageing, human voices undergo several changes which are typically characterized by increased hoarseness and changes in articulation patterns. In this study, we have examined the effect on Automatic Speech ...

    Authors: Ravichander Vipperla, Steve Renals and Joe Frankel
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:525783
  39. We revisit an original concept of speech coding in which the signal is separated into the carrier modulated by the signal envelope. A recently developed technique, called frequency-domain linear prediction (FD...

    Authors: Petr Motlicek, Sriram Ganapathy, Hynek Hermansky and Harinath Garudadri
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:856280
  40. Spoken utterance retrieval has been widely studied in recent decades, with the purpose of indexing large audio databases or detecting keywords in continuous speech streams. While the indexing of closed corpor...

    Authors: Mickael Rouvier, Georges Linarès and Benjamin Lecouteux
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:326578
  41. Breathy and whispery voices are nonmodal phonations produced by an air escape through the glottis and may carry important linguistic or paralinguistic information (intentions, attitudes, and emotions), dependi...

    Authors: Carlos Toshinori Ishi, Hiroshi Ishiguro and Norihiro Hagita
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:528193
  42. The automatic recognition of children's speech is well known to be a challenge, and so is the influence of affect, which is believed to degrade the performance of a speech recogniser. In this contribution, we inve...

    Authors: Stefan Steidl, Anton Batliner, Dino Seppi and Björn Schuller
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:783954
  43. The fractional Fourier transform (FrFT) has been proposed to improve the time-frequency resolution in signal analysis and processing. However, selecting the FrFT order for the proper analysis of multicom...

    Authors: Hui Yin, Climent Nadeu and Volker Hohmann
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2009:304579
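
The sketches below expand on the entries flagged above. First, for entry 2: a generic DFT-domain voice activity detector that classifies each frame as speech or noise from its spectral energy relative to a running noise-floor estimate. This is a minimal sketch of the general scheme the teaser describes, not the authors' statistical model; the window, threshold, and smoothing constants are illustrative assumptions.

```python
# Generic DFT-domain VAD sketch: one speech/noise decision per frame,
# based on frame energy vs. a recursively updated noise floor.
import numpy as np

def simple_dft_vad(x, frame_len=512, hop=256, threshold_db=6.0, alpha=0.95):
    """Return a list of booleans, True where a frame is judged speech."""
    window = np.hanning(frame_len)
    noise_floor = None
    decisions = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len] * window
        spectrum = np.fft.rfft(frame)                 # DFT coefficients
        energy = np.sum(np.abs(spectrum) ** 2)
        if noise_floor is None:
            noise_floor = energy                      # assume noise-only start
        snr_db = 10.0 * np.log10(energy / (noise_floor + 1e-12))
        is_speech = snr_db > threshold_db
        if not is_speech:                             # track floor in noise frames
            noise_floor = alpha * noise_floor + (1.0 - alpha) * energy
        decisions.append(is_speech)
    return decisions
```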
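For entry 5: the multiscale product multiplies wavelet transform coefficients of the signal across a few scales, which reinforces singularities (such as glottal events) while attenuating noise. A minimal sketch assuming a derivative-of-Gaussian wavelet and dyadic scales; it is not the authors' complete open-quotient estimator.

```python
# Multiscale product (MP) sketch: point-wise product of wavelet transform
# coefficients computed at several scales.
import numpy as np

def dog_wavelet(scale, support=4.0):
    """First derivative of a Gaussian, used here as an assumed analysis wavelet."""
    t = np.arange(-support * scale, support * scale + 1.0)
    g = np.exp(-(t ** 2) / (2.0 * scale ** 2))
    return -t / (scale ** 2) * g

def multiscale_product(x, scales=(1, 2, 4)):
    """x: 1-D signal array; returns the MP signal, same length as x."""
    mp = np.ones(len(x), dtype=float)
    for s in scales:
        mp *= np.convolve(x, dog_wavelet(s), mode="same")
    return mp
```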
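For entry 8: textbook spectral subtraction removes an estimate of the noise magnitude spectrum (gathered from noise-only frames) from each frame's magnitude spectrum and resynthesizes using the noisy phase. The spectral floor below is an assumed safeguard against negative magnitudes, not a detail taken from the paper.

```python
# Textbook spectral subtraction for one frame.
import numpy as np

def spectral_subtraction(frame, noise_mag, floor=0.01):
    """frame: 1-D time-domain frame; noise_mag: rfft-sized noise magnitude estimate."""
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
    mag, phase = np.abs(spectrum), np.angle(spectrum)
    clean_mag = np.maximum(mag - noise_mag, floor * mag)   # spectral floor
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(frame))
```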
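For entry 9: the simplest way to let one clip carry several emotion labels at once is binary relevance, i.e. one binary classifier per label. The sketch below assumes scikit-learn as a dependency and uses random placeholder features and labels; the paper's own multi-label algorithms are not reproduced here.

```python
# Binary-relevance multi-label classification: one classifier per emotion label.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))            # placeholder audio features
Y = rng.integers(0, 2, size=(100, 4))    # 4 emotion labels, not mutually exclusive

clf = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
print(clf.predict(X[:3]))                # each row may have several labels set
```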
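For entry 16: a correlogram stacks the short-time autocorrelation of successive frames, and the lag of the strongest peak within a plausible pitch range yields an F0 estimate. A single-frame sketch; the search-range defaults are assumptions, and the paper's higher-resolution variant is not reproduced.

```python
# Autocorrelation-based pitch estimate for one frame of a correlogram.
import numpy as np

def frame_pitch(frame, fs, f0_min=60.0, f0_max=400.0):
    """frame must be longer than fs / f0_min samples."""
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]  # lags >= 0
    lo, hi = int(fs / f0_max), int(fs / f0_min)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return fs / lag
```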
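For entry 20: the core of a multichannel Wiener filter, applied independently in each frequency bin, estimates the speech component at a reference microphone as w = R_y^{-1} R_x e_ref, where R_y and R_x are the noisy and speech covariance matrices (the latter typically obtained with the help of a voice activity detector). Sketch only; the paper's interaural-delay preservation is not included.

```python
# Per-bin multichannel Wiener filter weights.
import numpy as np

def mwf_weights(R_noisy, R_noise, ref=0):
    """R_noisy, R_noise: (M, M) covariance estimates for one frequency bin."""
    R_speech = R_noisy - R_noise          # speech covariance by subtraction
    e_ref = np.zeros(R_noisy.shape[0])
    e_ref[ref] = 1.0                      # select the reference microphone
    return np.linalg.solve(R_noisy, R_speech @ e_ref)
```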
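For entry 24: the standard time-varying sinusoid model and the two variation rates the abstract refers to can be written as below; the notation is an assumption of this sketch, and the paper's three inequalities themselves are not reproduced.

```latex
% Time-varying sinusoid model (assumed notation, not taken from the paper).
\[
  s(t) = a(t)\cos\!\bigl(\varphi(t)\bigr), \qquad
  f(t) = \frac{1}{2\pi}\,\varphi'(t)
\]
% The two rates that the paper's inequalities bound:
\[
  \text{amplitude variation rate } \frac{a'(t)}{a(t)}, \qquad
  \text{frequency variation rate } f'(t)
\]
```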


