Articles

Page 6 of 8

  1. We address the question of whether and how boosting and bagging can be used for speech recognition. In order to do this, we compare two different boosting schemes, one at the phoneme level and one at the utter...

    Authors: Christos Dimitrakakis and Samy Bengio

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2011:426792

    Content type: Research

  2. To overcome harmonic structure distortions of complex tones in the low frequency range due to the frequency to electrode mapping function used in Nucleus cochlear implants, two modified frequency maps based on...

    Authors: Sherif A. Omran, Waikong Lai and Norbert Dillier

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2011 2010:948565

    Content type: Research Article

  3. This paper describes a novel approach for localization of multiple sources overlapping in time. The proposed algorithm relies on acoustic maps computed in multi-microphone settings, which are descriptions of t...

    Authors: Alessio Brutti, Maurizio Omologo and Piergiorgio Svaizer

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:147495

    Content type: Research Article

  4. The correlogram is an important representation for periodic signals, widely used in pitch estimation and source separation. For these applications, the correlogram's major problems are its low resolution and re...

    Authors: Xueliang Zhang, Wenju Liu and Bo Xu

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:252374

    Content type: Research Article

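
To make the idea concrete (a generic single-channel illustration, not this paper's method): a correlogram stacks short-time autocorrelations of successive frames, and a periodic input peaks at its pitch period. A minimal Python sketch, with frame and lag sizes chosen arbitrarily:

```python
import numpy as np

def correlogram(x, frame_len=512, hop=256, max_lag=400):
    """Short-time autocorrelation of successive frames (single channel).

    Rows are frames, columns are lags 0..max_lag-1; a full auditory
    correlogram would compute this per cochlear channel.
    """
    rows = []
    for start in range(0, len(x) - frame_len + 1, hop):
        f = x[start:start + frame_len]
        ac = np.correlate(f, f, mode="full")[frame_len - 1:frame_len - 1 + max_lag]
        rows.append(ac / (ac[0] + 1e-12))  # normalise so lag 0 equals 1
    return np.array(rows)

fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 100 * t)          # 100 Hz tone: period = 80 samples
C = correlogram(x)
period = int(np.argmax(C[0, 20:])) + 20  # skip the broad lag-0 peak
# period == 80, the pitch period in samples
```

The integer-sample lag grid visible here is the kind of low resolution the abstract refers to.
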
  5. In this paper we present a method to search for environmental sounds in large unstructured databases of user-submitted audio, using a general sound events taxonomy from ecological acoustics. We discuss the use...

    Authors: Gerard Roma, Jordi Janer, Stefan Kersten, Mattia Schirosa, Perfecto Herrera and Xavier Serra

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:960863

    Content type: Research Article

  6. Degrouping is the key component in MPEG Layer II audio decoding. It mainly involves the arithmetic operations of division and modulo. So far, no dedicated degrouping algorithm or architecture has been well realized....

    Authors: Tsung-Han Tsai

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:737450

    Content type: Research Article

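
As background (a generic sketch, not the dedicated algorithm or architecture this paper proposes): in MPEG-1 Layer II, three consecutive samples quantized with n = 3, 5, or 9 steps are grouped into a single codeword, and degrouping recovers them with exactly the division and modulo operations mentioned above. One common packing convention (the bit-exact ordering is defined in ISO/IEC 11172-3):

```python
def group(s0, s1, s2, n):
    """Pack three samples (each in 0..n-1) into one grouped codeword."""
    return s0 + n * s1 + n * n * s2

def degroup(c, n):
    """Invert the packing using division and modulo."""
    s0 = c % n
    s1 = (c // n) % n
    s2 = c // (n * n)
    return s0, s1, s2

# Round trip over the three grouped step counts used by Layer II.
for n in (3, 5, 9):
    for triple in ((0, 0, 0), (1, 2, 0), (n - 1, n - 1, n - 1)):
        assert degroup(group(*triple, n), n) == triple
```

A hardware realization replaces `//` and `%` with cheaper logic, which is where a dedicated architecture comes in.
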
  7. Organizing a database of user-contributed environmental sound recordings allows sound files to be linked not only by the semantic tags and labels applied to them, but also to other sounds with similar acoustic...

    Authors: Gordon Wichern, Brandon Mechtley, Alex Fink, Harvey Thornburg and Andreas Spanias

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:192363

    Content type: Research Article

  8. A multimicrophone speech enhancement algorithm for binaural hearing aids that preserves interaural time delays was proposed recently. The algorithm is based on multichannel Wiener filtering and relies on a voi...

    Authors: Jasmina Catic, Torsten Dau, Jörg M. Buchholz and Fredrik Gran

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:840294

    Content type: Research Article

  9. A method is described for quantifying the quality of wideband speech codecs. Two parameters are derived from signal-based speech quality model estimations: (i) a wideband equipment impairment factor

    Authors: Sebastian Möller, Nicolas Côté, Valérie Gautier-Turbin, Nobuhiko Kitawaki and Akira Takahashi

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:782731

    Content type: Research Article

  10. In multiway loudspeaker systems, digital signal processing techniques have been used to correct the frequency response, the propagation time, and the lobing errors. These solutions are mainly based on correct...

    Authors: Hmaied Shaiek and Jean-Marc Boucher

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:928439

    Content type: Research Article

  11. Humans represent sounds to others and receive information about sounds from others using onomatopoeia. Such representation is useful for obtaining and reporting the acoustic features and impressions of actual ...

    Authors: Masayuki Takada, Nozomu Fujisawa, Fumino Obata and Shin-ichiro Iwamiya

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:674248

    Content type: Research Article

  12. We give a brief discussion on the amplitude and frequency variation rates of the sinusoid representation of signals. In particular, we derive three inequalities that show that these rates are upper bounded by ...

    Authors: Xue Wen and Mark Sandler

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:941732

    Content type: Research Article

  13. This paper presents a method for estimating the amplitude of coincident partials generated by harmonic musical sources (instruments and vocals). It was developed as an alternative to the commonly used interpol...

    Authors: Jayme Garcia Arnal Barbedo and George Tzanetakis

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:523791

    Content type: Research Article

  14. Nowadays, audio podcasting has been widely used by many online sites such as newspapers, web portals, journals, and so forth, to deliver audio content to users through download or subscription. Within 1 to 30 ...

    Authors: MN Nguyen, Qi Tian and Ping Xue

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:572571

    Content type: Research Article

  15. Frequency-domain blind source separation (BSS) performs poorly in high reverberation because the independence assumption collapses at each frequency bin as the number of bins increases. To improve the separ...

    Authors: Lin Wang, Heping Ding and Fuliang Yin

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:797962

    Content type: Research Article

  16. Speaker identification performance is almost perfect in neutral talking environments. However, performance deteriorates significantly in shouted talking environments. This work is devoted to proposing, ...

    Authors: Ismail Shahin

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:862138

    Content type: Research Article

  17. Theoretical and applied environmental sounds research is gaining prominence but progress has been hampered by the lack of a comprehensive, high quality, accessible database of environmental sounds. An ongoing ...

    Authors: Brian Gygi and Valeriy Shafiro

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:654914

    Content type: Research Article

  18. This paper presents a model-based method for coding the LSF parameters of LPC speech coders on a "long-term" basis, that is, beyond the usual 20–30 ms frame duration. The objective is to provide efficient LSF ...

    Authors: Laurent Girin

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:597039

    Content type: Research Article

  19. Authors: Georg Stemmer, Elmar Nöth and Vijay Parsa

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:835974

    Content type: Editorial

  20. When a number of speakers are simultaneously active, for example in meetings or noisy public places, the sources of interest need to be separated from interfering speakers and from each other in order to be ro...

    Authors: Dorothea Kolossa, Ramon Fernandez Astudillo, Eugen Hoffmann and Reinhold Orglmeister

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:651420

    Content type: Research Article

  21. The aim of the study is to transpose and extend to a set of environmental sounds the notion of sound descriptors usually used for musical sounds. Four separate primary studies dealing with interior car sounds,...

    Authors: Nicolas Misdariis, Antoine Minard, Patrick Susini, Guillaume Lemaitre, Stephen McAdams and Etienne Parizet

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:362013

    Content type: Research Article

  22. The mood of music is among the most relevant and commercially promising, yet challenging, attributes for retrieval in large music collections. In this respect, this article first provides a short overview of methods...

    Authors: Björn Schuller, Johannes Dorfner and Gerhard Rigoll

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:735854

    Content type: Research Article

  23. This work explores the effect of mismatches between adults' and children's speech due to differences in various acoustic correlates on the automatic speech recognition performance under mismatched conditions. ...

    Authors: Shweta Ghai and Rohit Sinha

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:318785

    Content type: Research Article

  24. Human communication about entities and events is primarily linguistic in nature. While visual representations of information are shown to be highly effective as well, relatively little is known about the commu...

    Authors: Xiaojuan Ma, Christiane Fellbaum and Perry Cook

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:404860

    Content type: Research Article

  25. The paper considers the task of recognizing phonemes and words from a singing input by using a phonetic hidden Markov model recognizer. The system is targeted to both monophonic singing and singing in polyphon...

    Authors: Annamaria Mesaros and Tuomas Virtanen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:546047

    Content type: Research Article

  26. With ageing, human voices undergo several changes which are typically characterized by increased hoarseness and changes in articulation patterns. In this study, we have examined the effect on Automatic Speech ...

    Authors: Ravichander Vipperla, Steve Renals and Joe Frankel

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:525783

    Content type: Research Article

  27. We revisit an original concept of speech coding in which the signal is separated into the carrier modulated by the signal envelope. A recently developed technique, called frequency-domain linear prediction (FD...

    Authors: Petr Motlicek, Sriram Ganapathy, Hynek Hermansky and Harinath Garudadri

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:856280

    Content type: Research Article

  28. Spoken utterance retrieval has been widely studied in recent decades, with the purpose of indexing large audio databases or of detecting keywords in continuous speech streams. While the indexing of closed corpor...

    Authors: Mickael Rouvier, Georges Linarès and Benjamin Lecouteux

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:326578

    Content type: Research Article

  29. Breathy and whispery voices are nonmodal phonations produced by an air escape through the glottis and may carry important linguistic or paralinguistic information (intentions, attitudes, and emotions), dependi...

    Authors: Carlos Toshinori Ishi, Hiroshi Ishiguro and Norihiro Hagita

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:528193

    Content type: Research Article

  30. The automatic recognition of children's speech is well known to be a challenge, and so is the influence of affect that is believed to downgrade performance of a speech recogniser. In this contribution, we inve...

    Authors: Stefan Steidl, Anton Batliner, Dino Seppi and Björn Schuller

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2010:783954

    Content type: Research Article

  31. The fractional Fourier transform (FrFT) has been proposed to improve the time-frequency resolution in signal analysis and processing. However, selecting the FrFT order for the proper analysis of multicom...

    Authors: Hui Yin, Climent Nadeu and Volker Hohmann

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2010 2009:304579

    Content type: Research Article

  32. This paper proposes a query by example system for generic audio. We estimate the similarity of the example signal and the samples in the queried database by calculating the distance between the probability den...

    Authors: Marko Helén and Tuomas Virtanen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2010:179303

    Content type: Research Article

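
As a toy illustration of ranking database items by a distance between probability densities of their features (the paper's actual models and distance measure may differ; single 1-D Gaussians and a symmetrized KL divergence are assumptions here):

```python
import math

def gaussian_kl(mu0, var0, mu1, var1):
    """KL divergence between 1-D Gaussians: KL(N(mu0,var0) || N(mu1,var1))."""
    return 0.5 * (var0 / var1 + (mu1 - mu0) ** 2 / var1 - 1.0
                  + math.log(var1 / var0))

def fit(frames):
    """Fit a 1-D Gaussian to a list of scalar features (e.g. frame energies)."""
    m = sum(frames) / len(frames)
    v = sum((f - m) ** 2 for f in frames) / len(frames)
    return m, max(v, 1e-12)

def similarity_distance(query, candidate):
    """Symmetrized KL between densities fitted to the two signals' features."""
    mq, vq = fit(query)
    mc, vc = fit(candidate)
    return gaussian_kl(mq, vq, mc, vc) + gaussian_kl(mc, vc, mq, vq)

# A candidate drawn from a similar distribution scores lower (more similar).
q = [0.0, 1.0, 2.0, 1.0, 0.0]
near = [0.1, 1.1, 1.9, 1.0, 0.1]
far = [5.0, 6.0, 7.0, 6.0, 5.0]
assert similarity_distance(q, near) < similarity_distance(q, far)
```

Ranking the whole database then reduces to sorting candidates by this distance to the example.
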
  33. This paper proposes a method for transcribing drums from polyphonic music using a network of connected hidden Markov models (HMMs). The task is to detect the temporal locations of unpitched percussive sounds (...

    Authors: Jouni Paulus and Anssi Klapuri

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:497292

    Content type: Research Article

  34. We are developing a method of Web-based unsupervised language model adaptation for recognition of spoken documents. The proposed method chooses keywords from the preliminary recognition result and retrieves We...

    Authors: Akinori Ito, Yasutomo Kajiura, Motoyuki Suzuki and Shozo Makino

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:140575

    Content type: Research Article

  35. Speech recognition applications are known to require a significant amount of resources. However, embedded speech recognition allows only a few KB of memory, a few MIPS, and a small amount of training data. In or...

    Authors: Christophe Lévy, Georges Linarès and Jean-François Bonastre

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:806186

    Content type: Research Article

  36. This paper describes SynFace, a supportive technology that aims at enhancing audio-based spoken communication in adverse acoustic conditions by providing the missing visual information in the form of an animat...

    Authors: Giampiero Salvi, Jonas Beskow, Samer Al Moubayed and Björn Granström

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:191940

    Content type: Research Article

  37. We describe here the control, shape and appearance models that are built using an original photogrammetric method to capture characteristics of speaker-specific facial articulation, anatomy, and texture. Two o...

    Authors: Gérard Bailly, Oxana Govokhina, Frédéric Elisei and Gaspard Breton

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:769494

    Content type: Research Article

  38. We describe a method for the synthesis of visual speech movements using a hybrid unit selection/model-based approach. Speech lip movements are captured using a 3D stereo face capture system and split up into p...

    Authors: James D. Edge, Adrian Hilton and Philip Jackson

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:597267

    Content type: Research Article

  39. Computer-Assisted Language Learning (CALL) applications for improving the oral skills of low-proficient learners have to cope with non-native speech that is particularly challenging. Since unconstrained non-na...

    Authors: Joost van Doremalen, Catia Cucchiarini and Helmer Strik

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2010:973954

    Content type: Research Article

  40. Robust recognition of general audio events constitutes a topic of intensive research in the signal processing community. This work presents an efficient methodology for acoustic surveillance of atypical situat...

    Authors: Stavros Ntalampiras, Ilyas Potamitis and Nikos Fakotakis

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:594103

    Content type: Research Article

  41. Wireless-VoIP communications introduce perceptual degradations that are not present with traditional VoIP communications. This paper investigates the effects of such degradations on the performance of three st...

    Authors: Tiago H. Falk and Wai-Yip Chan

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:104382

    Content type: Research Article

  42. This paper presents an image-based talking head system, which includes two parts: analysis and synthesis. The audiovisual analysis part creates a face model of a recorded human subject, which is composed of a ...

    Authors: Kang Liu and Joern Ostermann

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:174192

    Content type: Research Article

  43. Audiovisual text-to-speech systems convert a written text into an audiovisual speech signal. Typically, the visual mode of the synthetic speech is synthesized separately from the audio, the latter being either...

    Authors: Wesley Mattheyses, Lukas Latacz and Werner Verhelst

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:169819

    Content type: Research Article

  44. The paper presents an adaptive system for Voiced/Unvoiced (V/UV) speech detection in the presence of background noise. Genetic algorithms were used to select the features that offer the best V/UV detection acc...

    Authors: F Beritelli, S Casale, A Russo and S Serrano

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:965436

    Content type: Research Article

  45. Design and implementation strategies of spatial sound rendering for automotive scenarios are investigated in this paper. Six design methods are implemented for various rendering modes with different numbers of ...

    Authors: Mingsian R. Bai and Jhih-Ren Hong

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:876297

    Content type: Research Article

  46. In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appro...

    Authors: Andreas Maier, Tino Haderlein, Florian Stelzle, Elmar Nöth, Emeka Nkenke, Frank Rosanowski, Anne Schützenberger and Maria Schuster

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2010:926951

    Content type: Research Article

  47. The problem of overlapping harmonics is particularly acute in musical sound separation and has not been addressed adequately. We propose a monaural system based on binary time-frequency masking with an emphasi...

    Authors: Yipeng Li and DeLiang Wang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2009 2009:130567

    Content type: Research Article

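
For context on binary time-frequency masking in general (not this paper's monaural system): an ideal binary mask keeps the spectral cells where the target's magnitude exceeds the interference's. A single-FFT-frame toy, with bin-centred test tones chosen so each occupies exactly one bin:

```python
import numpy as np

def ideal_binary_mask(target_spec, interf_spec):
    """1.0 in cells where the target dominates, 0.0 elsewhere."""
    return (np.abs(target_spec) > np.abs(interf_spec)).astype(float)

fs, n = 8000, 1024
t = np.arange(n) / fs
target = np.sin(2 * np.pi * (39 * fs / n) * t)   # falls exactly in bin 39
interf = np.sin(2 * np.pi * (150 * fs / n) * t)  # falls exactly in bin 150

T, I = np.fft.rfft(target), np.fft.rfft(interf)
mask = ideal_binary_mask(T, I)
recovered = np.fft.irfft(mask * (T + I))  # mask applied to the mixture

# The masked mixture is close to the target alone.
err = np.linalg.norm(recovered - target) / np.linalg.norm(target)
```

Real systems must estimate the mask from the mixture alone, which is exactly where overlapping harmonics (target and interference energy in the same cell) make the problem hard.
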