Articles

Page 2 of 7

  1. The autocorrelation domain is a suitable domain for separating the clean speech signal from noise. In this paper, a method is proposed to decrease the effects of noise on the clean speech signal, autocorrelation-based noise ...

    Authors: Gholamreza Farahani

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:13

    Content type: Research

  2. Various musical descriptors have been developed for Cover Song Identification (CSI). However, different descriptors are based on various assumptions, designed for representing distinct characteristics of music...

    Authors: Ning Chen, Mingyu Li and Haidong Xiao

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:12

    Content type: Research

  3. Automatic sound event classification (SEC) has attracted growing attention in recent years. Feature extraction is a critical factor in an SEC system, and deep neural network (DNN) algorithms have achiev...

    Authors: Junjie Zhang, Jie Yin, Qi Zhang, Jun Shi and Yan Li

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:11

    Content type: Research

  4. This paper outlines a package synchronization scheme for blind speech watermarking in the discrete wavelet transform (DWT) domain. Following two-level DWT decomposition, watermark bits and synchronization code...

    Authors: Hwai-Tsu Hu, Shiow-Jyu Lin and Ling-Yuan Hsu

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:10

    Content type: Research

  5. With the exponential growth in computing power and progress in speech recognition technology, spoken dialog systems (SDSs), with which a user interacts through natural speech, have been widely used in human-compu...

    Authors: Chung-Hsien Wu, Ming-Hsiang Su and Wei-Bin Liang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:9

    Content type: Research

  6. The benefit of auditory models for solving three music recognition tasks—onset detection, pitch estimation, and instrument recognition—is analyzed. Appropriate features are introduced which enable the use of s...

    Authors: Klaus Friedrichs, Nadja Bauer, Rainer Martin and Claus Weihs

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:7

    Content type: Research

  7. The incorporation of grammatical information into speech recognition systems is often used to increase performance in morphologically rich languages. However, this introduces demands for sufficiently large tra...

    Authors: Gregor Donaj and Zdravko Kačič

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:6

    Content type: Research

  8. This article presents original results of a statistical analysis of the Polish language, based on an orthographic and phonemic language corpus. The phonemic language corpus for Polish was developed using automatic ...

    Authors: Piotr Kłosowski

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:5

    Content type: Research

  9. This research paper presents parametrization of emotional speech using a pool of common features utilized in emotion recognition such as fundamental frequency, formants, energy, MFCC, PLP, and LPC coefficients. T...

    Authors: Dorota Kamińska, Tomasz Sapiński and Gholamreza Anbarjafari

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:3

    Content type: Research

  10. Cantor Digitalis is a performative singing synthesizer that is composed of two main parts: a chironomic control interface and a parametric voice synthesizer. The control interface is based on a pen/touch graph...

    Authors: Lionel Feugère, Christophe d’Alessandro, Boris Doval and Olivier Perrotin

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:2

    Content type: Research

  11. Present-day IP transport platforms being what they are, it will never be possible to rule out conflicts between the available services. The logical consequence of this assertion is the inevitable conclusion th...

    Authors: Tadeus Uhl, Stefan Paulsen and Krzysztof Nowicki

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:1

    Content type: Research

  12. In this study, we investigate the effect of tiny acoustic differences on the efficiency of prosodic information transmission. Study participants listened to textually ambiguous sentences, which could be unders...

    Authors: Bohan Chen, Norihide Kitaoka and Kazuya Takeda

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:19

    Content type: Research

  13. Statistics of pauses in Polish speech are described as a potential source of biometric information for automatic speaker recognition. The usage of three main types of acoustic pauses (silent, filled and bre...

    Authors: Magdalena Igras-Cybulska, Bartosz Ziółko, Piotr Żelasko and Marcin Witkowski

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:18

    Content type: Research

  14. We present an algorithm for the estimation of fundamental frequencies in voiced audio signals. The method is based on an autocorrelation of a signal with a segment of the same signal. During operation, frequen...

    Authors: Michael Staudacher, Viktor Steixner, Andreas Griessner and Clemens Zierhofer

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:17

    Content type: Research

  15. We present a novel non-iterative and rigorously motivated approach for estimating hidden Markov models (HMMs) and factorial hidden Markov models (FHMMs) of high-dimensional signals. Our approach utilizes the a...

    Authors: Yochay R. Yeminy, Yosi Keller and Sharon Gannot

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:16

    Content type: Research

  16. Substantial amounts of resources are usually required to robustly develop a language model for an open vocabulary speech recognition system as out-of-vocabulary (OOV) words can hurt recognition accuracy. In th...

    Authors: Vataya Chunwijitra, Ananlada Chotimongkol and Chai Wutiwiwatchai

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:15

    Content type: Research

  17. A new voice activity detection algorithm based on long-term pitch divergence is presented. The long-term pitch divergence not only decomposes speech signals with a bionic decomposition but also makes full use ...

    Authors: Xu-Kui Yang, Liang He, Dan Qu and Wei-Qiang Zhang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:14

    Content type: Research

  18. In multichannel spatial audio coding (SAC), the accurate representations of virtual sounds and the efficient compressions of spatial parameters are the key to perfect reproduction of spatial sound effects in 3...

    Authors: Li Gao, Ruimin Hu, Xiaochen Wang, Gang Li, Yuhong Yang and Weiping Tu

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:13

    Content type: Research

  19. An adaptive muting method using an optimized parametric shaping function is proposed as part of the ITU-T G.722 Appendix IV packet loss concealment algorithm. The packet loss concealment algorithm incorporating...

    Authors: Bong-Ki Lee and Joon-Hyuk Chang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:11

    Content type: Research

  20. Automatic speech recognition is becoming more ubiquitous as recognition performance improves, capable devices increase in number, and areas of new application open up. Neural network acoustic models that can u...

    Authors: Ryan Price, Ken-ichi Iso and Koichi Shinoda

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:10

    Content type: Research

  21. Audio classification, classifying audio segments into broad categories such as speech, non-speech, and silence, is an important front-end problem in speech signal processing. Dozens of features have been propo...

    Authors: Xu-Kui Yang, Liang He, Dan Qu, Wei-Qiang Zhang and Michael T. Johnson

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:9

    Content type: Research

  22. Current text-to-speech systems do not support the effective provision of the semantics and the cognitive aspects of the documents’ typographic cues (e.g., font type, style, and size). A novel approach is intro...

    Authors: Dimitrios Tsonos and Georgios Kouroupetroglou

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:8

    Content type: Research

  23. Time-frequency (T-F) masking is an effective method for stereo speech source separation. However, reliable estimation of the T-F mask from sound mixtures is a challenging task, especially when room reverberati...

    Authors: Yang Yu, Wenwu Wang and Peng Han

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:7

    Content type: Research

  24. Today, a large amount of audio data is available on the web in the form of audiobooks, podcasts, video lectures, video blogs, news bulletins, etc. In addition, we can effortlessly record and store audio data s...

    Authors: Tejas Godambe, Sai Krishna Rallabandi, Suryakanth V. Gangashetty, Ashraf Alkhairy and Afshan Jafri

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:6

    Content type: Research

  25. Indian classical music, with its two varieties, Carnatic and Hindustani, has a rich musical tradition and enjoys a wide audience from various parts of the world. Carnatic music, which is more popul...

    Authors: Stanly Mammen, Ilango Krishnamurthi, A. Jalaja Varma and G. Sujatha

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:5

    Content type: Research

  26. Query-by-example spoken term detection (QbE STD) aims at retrieving data from a speech repository given an acoustic query containing the term of interest as input. Nowadays, it is receiving much interest due t...

    Authors: Javier Tejedor, Doroteo T. Toledano, Paula Lopez-Otero, Laura Docio-Fernandez and Carmen Garcia-Mateo

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2016 2016:1

    Content type: Research

  27. Using a proper distribution function for the speech signal or its representations is of crucial importance in statistics-based speech processing algorithms. Although the most commonly used probability density...

    Authors: Ali Aroudi, Hadi Veisi, Hossein Sameti and Zahra Mafakheri

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:35

    Content type: Research

  28. Using a recently proposed informed spatial filter, it is possible to effectively and robustly reduce reverberation from speech signals captured in noisy environments using multiple microphones. Late reverberat...

    Authors: Sebastian Braun and Emanuël A. P. Habets

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:34

    Content type: Research

  29. Audio segmentation is important as a pre-processing task to improve the performance of many speech technology tasks and, therefore, it has an undoubted research interest. This paper describes the database, the...

    Authors: Diego Castán, David Tavarez, Paula Lopez-Otero, Javier Franco-Pedroso, Héctor Delgado, Eva Navas, Laura Docio-Fernández, Daniel Ramos, Javier Serrano, Alfonso Ortega and Eduardo Lleida

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:33

    Content type: Research

  30. The need for a large amount of parallel data is a major hurdle to the practical use of voice conversion (VC). This paper presents a novel framework of exemplar-based VC that requires only a small number o...

    Authors: Ryo Aihara, Takao Fujii, Toru Nakashika, Tetsuya Takiguchi and Yasuo Ariki

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:32

    Content type: Research

  31. In this paper, a semi-fragile and blind digital speech watermarking technique for online speaker recognition systems based on the discrete wavelet packet transform (DWPT) and quantization index modulation (QIM...

    Authors: Mohammad Ali Nematollahi, Mohammad Ali Akhaee, S. A. R. Al-Haddad and Hamurabi Gamboa-Rosales

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:31

    Content type: Research

  32. The presence of physical task stress induces changes in the speech production system which in turn produces changes in speaking behavior. This results in measurable acoustic correlates including changes to for...

    Authors: Keith W. Godin and John H. L. Hansen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:29

    Content type: Research

  33. The identity of musical instruments is reflected in the acoustic attributes of musical notes played with them. Recently, it has been argued that these characteristics of musical identity (or timbre) can be bes...

    Authors: Kailash Patil and Mounya Elhilali

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:27

    Content type: Research

  34. In recent years, deep learning has not only permeated the computer vision and speech recognition research fields but also fields such as acoustic event detection (AED). One of the aims of AED is to detect and ...

    Authors: Miquel Espi, Masakiyo Fujimoto, Keisuke Kinoshita and Tomohiro Nakatani

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:26

    Content type: Research

  35. A multimodal voice conversion (VC) method for noisy environments is proposed. In our previous non-negative matrix factorization (NMF)-based VC method, source and target exemplars are extracted from parallel tr...

    Authors: Kenta Masaka, Ryo Aihara, Tetsuya Takiguchi and Yasuo Ariki

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:24

    Content type: Research

  36. In this paper we present the Latin Music Mood Database, an extension of the Latin Music Database but for the task of music mood/emotion classification. The method for assigning mood labels to the musical recor...

    Authors: Carolina L. dos Santos and Carlos N. Silla Jr

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:23

    Content type: Research

  37. Support vector machines (SVMs) have played an important role in state-of-the-art language recognition systems. The recently developed extreme learning machine (ELM) tends to have better scalability and ach...

    Authors: Jiaming Xu, Wei-Qiang Zhang, Jia Liu and Shanhong Xia

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:22

    Content type: Research

  38. Spoken term detection (STD) aims at retrieving data from a speech repository given a textual representation of the search term. Nowadays, it is receiving much interest due to the large volume of multimedia inf...

    Authors: Javier Tejedor, Doroteo T. Toledano, Paula Lopez-Otero, Laura Docio-Fernandez, Carmen Garcia-Mateo, Antonio Cardenal, Julian David Echeverry-Correa, Alejandro Coucheiro-Limeres, Julia Olcoz and Antonio Miguel

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:21

    Content type: Research

  39. The automatic recognition of MP3 compressed speech presents a challenge to the current systems due to the lossy nature of compression which causes irreversible degradation of the speech wave. This article eval...

    Authors: Michal Borsky, Petr Pollak and Petr Mizera

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:20

    Content type: Research

  40. We investigate the automatic recognition of emotions in the singing voice and study the worth and role of a variety of relevant acoustic parameters. The data set contains phrases and vocalises sung by eight re...

    Authors: Florian Eyben, Gláucia L Salomão, Johan Sundberg, Klaus R Scherer and Björn W Schuller

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:19

    Content type: Research

  41. Over recent years, the i-vector-based framework has been proven to provide state-of-the-art performance in speaker verification. Each utterance is projected onto a total factor space and is represented by a low-di...

    Authors: Wei Li, Tianfan Fu and Jie Zhu

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2015 2015:18

    Content type: Research
