Articles
Page 1 of 13
  1. Content type: Research

    Query-by-example Spoken Term Detection (QbE STD) aims to retrieve data from a speech repository given an acoustic (spoken) query containing the term of interest as the input. This paper presents the systems su...

    Authors: Javier Tejedor, Doroteo T. Toledano, Paula Lopez-Otero, Laura Docio-Fernandez, Jorge Proença, Fernando Perdigão, Fernando García-Granada, Emilio Sanchis, Anna Pompili and Alberto Abad

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2018 2018:2

  2. Content type: Research

    Automatic extraction of acoustic regions of interest from recordings captured in realistic clinical environments is a necessary preprocessing step in any cry analysis system. In this study, we propose a hidden...

    Authors: Gaurav Naithani, Jaana Kivinummi, Tuomas Virtanen, Outi Tammela, Mikko J. Peltola and Jukka M. Leppänen

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2018 2018:1

  3. Content type: Research

    Audio signals are a type of high-dimensional data, and their clustering is critical. However, distance calculation failures, inefficient index trees, and cluster overlaps, derived from the equidistance, redund...

    Authors: Wenfa Li, Gongming Wang and Ke Li

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:26

  4. Content type: Research

    In speech enhancement, noise power spectral density (PSD) estimation plays a key role in determining appropriate de-noising gains. In this paper, we propose a robust noise PSD estimator for binaural speech enha...

    Authors: Youna Ji, Yonghyun Baek and Young-cheol Park

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:25

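The entry above turns on noise PSD estimation for speech enhancement. As a generic illustration of the idea, not the binaural estimator the paper proposes, a common baseline recursively averages the noisy power spectrum, updating the noise estimate only in bins that look noise-dominated; the smoothing factor, threshold, and toy frame powers below are illustrative assumptions.

```python
# Minimal recursive-averaging noise PSD estimator (generic baseline,
# not the binaural method proposed in the paper above).
def update_noise_psd(noise_psd, frame_power, alpha=0.9, threshold=2.0):
    """Update the per-bin noise PSD estimate from one noisy frame.

    noise_psd   -- list of current noise PSD estimates, one per frequency bin
    frame_power -- list of |X(k)|^2 values for the current frame
    alpha       -- smoothing factor (closer to 1 = slower adaptation)
    threshold   -- bins with power > threshold * noise_psd are treated
                   as speech-dominated and left un-updated
    """
    updated = []
    for n, p in zip(noise_psd, frame_power):
        if p <= threshold * n:          # likely noise-only: smooth it in
            updated.append(alpha * n + (1.0 - alpha) * p)
        else:                           # likely speech: keep old estimate
            updated.append(n)
    return updated

# Toy usage: a flat initial estimate adapting to a quieter noise floor.
psd = [1.0, 1.0, 1.0]
for _ in range(50):
    psd = update_noise_psd(psd, [0.5, 0.4, 0.6])
```

The resulting PSD estimate then feeds a gain rule (e.g., spectral subtraction or Wiener filtering); the `threshold` test is the crude stand-in for the speech-presence logic that real estimators make much more careful.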
  5. Content type: Research

    Large vocabulary continuous speech recognition (LVCSR) is naturally in demand for transcribing daily conversations, yet developing spoken text data to train LVCSR is costly and time-consuming. In this p...

    Authors: Vataya Chunwijitra and Chai Wutiwiwatchai

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:24

  6. Content type: Research

    Robustness against background noise is a major research area for speech-related applications such as speech recognition and speaker recognition. One of the many solutions for this problem is to detect speech-d...

    Authors: Gökay Dişken, Zekeriya Tüfekci and Ulus Çevik

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:23

  7. Content type: Research

    Within search-on-speech, Spoken Term Detection (STD) aims to retrieve data from a speech repository given a textual representation of a search term. This paper presents an international open evaluation for sea...

    Authors: Javier Tejedor, Doroteo T. Toledano, Paula Lopez-Otero, Laura Docio-Fernandez, Luis Serrano, Inma Hernaez, Alejandro Coucheiro-Limeres, Javier Ferreiros, Julia Olcoz and Jorge Llombart

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:22

  8. Content type: Research

    The task of speaker diarization is to answer the question "who spoke when?" In this paper, we present different clustering approaches which consist of Evolutionary Computation Algorithms (ECAs) such as Genetic...

    Authors: Karim Dabbabi, Salah Hajji and Adnen Cherif

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:21

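The entry above applies evolutionary computation to speaker clustering. The following is a toy sketch of a genetic algorithm searching over cluster assignments, not the authors' diarization system: the 1-D "embeddings", population size, truncation selection, mutation scheme, and seed are all illustrative assumptions.

```python
import random

def ga_cluster(points, k, pop_size=30, generations=300, seed=0):
    """Toy genetic algorithm that assigns each point to one of k clusters
    by minimising within-cluster squared distance to the cluster mean."""
    rng = random.Random(seed)

    def cost(labels):
        total = 0.0
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                mean = sum(members) / len(members)
                total += sum((p - mean) ** 2 for p in members)
        return total

    # Random initial population of label strings (one label per point).
    pop = [[rng.randrange(k) for _ in points] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        survivors = pop[: pop_size // 2]          # truncation selection
        children = []
        for parent in survivors:
            child = parent[:]
            i = rng.randrange(len(points))        # single-point mutation
            child[i] = rng.randrange(k)
            children.append(child)
        pop = survivors + children                # elitist: best half kept
    return min(pop, key=cost)

# Two well-separated 1-D groups should end up in different clusters.
labels = ga_cluster([0.0, 0.1, 0.2, 5.0, 5.1, 5.2], k=2)
```

A real diarization system would cluster high-dimensional speaker embeddings and also search over the number of speakers; the fitness function and operators here are only the skeleton of the evolutionary approach.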
  9. Content type: Research

    An artificial neural network is an important model for training features of voice conversion (VC) tasks. Typically, neural networks (NNs) are very effective in processing nonlinear features, such as Mel Cepstr...

    Authors: Zhaojie Luo, Jinhui Chen, Tetsuya Takiguchi and Yasuo Ariki

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:18

  10. Content type: Research

    Audio fingerprinting has been an active research field typically used for music identification. Robust audio fingerprinting technology is used to successfully perform content-based audio identification regardl...

    Authors: Dominic Williams, Akash Pooransingh and Jesse Saitoo

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:17

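The entry above concerns robust, content-based audio identification. Below is a hedged toy version of the fingerprinting idea, not any specific published scheme: derive bits from the signs of energy differences between neighbouring bands, then compare recordings by Hamming distance. Real systems use FFT-derived frequency bands; the time-domain chunks here are a deliberate simplification.

```python
import math

def fingerprint(samples, frame=64, bands=4):
    """Toy audio fingerprint: split each frame into bands of consecutive
    samples, measure band energies, and keep one bit per adjacent band
    pair -- the sign of the energy difference.  Real systems compute the
    bands from an FFT; time-domain chunks are used here for brevity."""
    bits = []
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        size = frame // bands
        energies = [sum(x * x for x in chunk[b * size:(b + 1) * size])
                    for b in range(bands)]
        bits.extend(1 if energies[b + 1] > energies[b] else 0
                    for b in range(bands - 1))
    return bits

def hamming(a, b):
    """Bit-error count between two equal-length fingerprints."""
    return sum(x != y for x, y in zip(a, b))

# A pure tone and the same tone at half volume: energy-difference signs
# are scale-invariant, so the fingerprints match exactly.
tone = [math.sin(math.pi * t / 64) for t in range(640)]
quiet = [0.5 * x for x in tone]
```

Sign-of-difference bits are the reason such fingerprints survive volume changes and mild distortion: only the relative ordering of band energies matters, not their absolute values.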
  11. Content type: Research

    In this paper, we present a voice conversion (VC) method that does not use any parallel data while training the model. Voice conversion is a technique where only speaker-specific information in the source spee...

    Authors: Toru Nakashika and Yasuhiro Minami

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:16

  12. Content type: Research

    Onset detection still has room for improvement, especially when dealing with polyphonic music signals. For certain purposes in which the correctness of the result is a must, user intervention is hence required...

    Authors: Jose J. Valero-Mas and José M. Iñesta

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:15

  13. Content type: Research

    Speech synthesis has been applied in many practical applications. Currently, state-of-the-art speech synthesis uses statistical methods based on the hidden Markov model (HMM). Speech synthesized by statis...

    Authors: Gia-Nhu Nguyen and Trung-Nghia Phung

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:14

  14. Content type: Research

    The autocorrelation domain is well suited to separating the clean speech signal from noise. In this paper, a method is proposed to reduce the effects of noise on the clean speech signal, autocorrelation-based noise ...

    Authors: Gholamreza Farahani

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:13

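The entry above works in the autocorrelation domain. As a minimal illustration of why that domain is attractive for speech/noise work (generic, not the paper's method): the autocorrelation of a periodic signal preserves its periodicity, so a strong peak reappears at the lag equal to the period, while uncorrelated noise concentrates near lag zero.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[k] = sum_n x[n] * x[n + k] for k = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

# A square wave with period 8: r peaks again at lag 8 (one full period)
# and is strongly negative at lag 4 (a half-period shift inverts it).
x = [1, 1, 1, 1, -1, -1, -1, -1] * 8
r = autocorr(x, 10)
```

Here `r[0]` is the total energy, `r[8]` is almost as large (the signal matches itself one period later), and `r[4]` is large and negative, which is exactly the structure that period- and pitch-related processing exploits.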
  15. Content type: Research

    Various musical descriptors have been developed for Cover Song Identification (CSI). However, different descriptors are based on various assumptions, designed for representing distinct characteristics of music...

    Authors: Ning Chen, Mingyu Li and Haidong Xiao

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:12

  16. Content type: Research

    Automatic sound event classification (SEC) has attracted growing attention in recent years. Feature extraction is a critical factor in an SEC system, and deep neural network (DNN) algorithms have achiev...

    Authors: Junjie Zhang, Jie Yin, Qi Zhang, Jun Shi and Yan Li

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:11

  17. Content type: Research

    This paper outlines a package synchronization scheme for blind speech watermarking in the discrete wavelet transform (DWT) domain. Following two-level DWT decomposition, watermark bits and synchronization code...

    Authors: Hwai-Tsu Hu, Shiow-Jyu Lin and Ling-Yuan Hsu

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:10

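The entry above embeds watermark bits after two-level DWT decomposition. As a hedged sketch of just the transform step, not the paper's embedding or synchronization scheme, here is a one-level Haar DWT with perfect reconstruction; applying it again to the approximation band gives the second level. The Haar wavelet and the sample values are illustrative choices.

```python
import math

def haar_dwt(x):
    """One level of the Haar DWT: pairwise sums (approximation band)
    and pairwise differences (detail band), both scaled by 1/sqrt(2).
    Assumes len(x) is even."""
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    detail = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse one-level Haar DWT (perfect reconstruction)."""
    s = 1.0 / math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append(s * (a + d))
        x.append(s * (a - d))
    return x

# Two-level decomposition, as in the paper's setting: transform the
# signal once, then transform the first-level approximation band again.
a1, d1 = haar_dwt([4.0, 2.0, 5.0, 7.0])
a2, d2 = haar_dwt(a1)
```

A watermarking scheme would then modify selected coefficients (e.g., quantize values in `d2` to encode bits plus a synchronization code) and invert both levels to obtain the marked speech; that embedding logic is the part this sketch deliberately omits.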
  18. Content type: Research

    With the exponential growth in computing power and progress in speech recognition technology, spoken dialog systems (SDSs) with which a user interacts through natural speech have been widely used in human-compu...

    Authors: Chung-Hsien Wu, Ming-Hsiang Su and Wei-Bin Liang

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:9

  19. Content type: Research

    The benefit of auditory models for solving three music recognition tasks—onset detection, pitch estimation, and instrument recognition—is analyzed. Appropriate features are introduced which enable the use of s...

    Authors: Klaus Friedrichs, Nadja Bauer, Rainer Martin and Claus Weihs

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:7

  20. Content type: Research

    This article presents the original results of Polish language statistical analysis, based on the orthographic and phonemic language corpus. Phonemic language corpus for Polish was developed by using automatic ...

    Authors: Piotr Kłosowski

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:5

  21. Content type: Research

    The incorporation of grammatical information into speech recognition systems is often used to increase performance in morphologically rich languages. However, this introduces demands for sufficiently large tra...

    Authors: Gregor Donaj and Zdravko Kačič

    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2017 2017:6
