
Articles

Page 1 of 11

  1. The vast amount of information stored in audio repositories makes it necessary to develop efficient, automatic methods for searching audio content. In that direction, search on speech (SoS) has received...

    Authors: Javier Tejedor and Doroteo T. Toledano
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:15
  2. Analyzing songs is a problem that is being investigated to aid various operations on music access platforms. A first step in these problems is identifying the person who sings the song. In this s...

    Authors: Serhat Hizlisoy, Recep Sinan Arslan and Emel Çolakoğlu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:14
  3. Accurately representing the sound field with high spatial resolution is crucial for immersive and interactive sound field reproduction technology. In recent studies, there has been a notable emphasis on effici...

    Authors: Zining Liang, Wen Zhang and Thushara D. Abhayapala
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:13
  4. This work constitutes the first approach for automatically classifying the surface that the voiding flow impacts in non-invasive sound uroflowmetry tests using machine learning. Often, the voiding flow impacts...

    Authors: Marcos Lazaro Alvarez, Laura Arjona, Miguel E. Iglesias Martínez and Alfonso Bahillo
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:12
  5. Speech synthesis has made significant strides thanks to the transition from machine learning to deep learning models. Contemporary text-to-speech (TTS) models possess the capability to generate speech of excep...

    Authors: Huda Barakat, Oytun Turk and Cenk Demiroglu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:11
  6. Claimed identities of speakers can be verified by means of automatic speaker verification (ASV) systems, also known as voice biometric systems. Focusing on security and robustness against spoofing attacks on A...

    Authors: Priyanka Gupta, Hemant A. Patil and Rodrigo Capobianco Guido
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:10
  7. Audio effects are a ubiquitous tool in music production due to the interesting ways in which they can shape the sound of music. Guitar effects, the subset of all audio effects focusing on guitar signals, are ...

    Authors: Reemt Hinrichs, Kevin Gerkens, Alexander Lange and Jörn Ostermann
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:9
  8. Recent advancements in deep learning-based speech enhancement models have made extensive use of attention mechanisms, demonstrating their effectiveness in achieving state-of-the-art results. This paper proposes a t...

    Authors: Sivaramakrishna Yecchuri and Sunny Dayal Vanambathina
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:8
  9. Chinese traditional music, a vital expression of Chinese cultural heritage, possesses both a profound emotional resonance and artistic allure. This study sets forth to refine and analyze the acoustical feature...

    Authors: Lingyun Xie, Yuehong Wang and Yan Gao
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:7
  10. Speech coding is a method to reduce the amount of data needed to represent speech signals by exploiting the statistical properties of the speech signal. Recently, in the speech coding process, a neural network ...

    Authors: Gebremichael Kibret Sheferaw, Waweru Mwangi, Michael Kimwele and Adane Mamuye
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:6
  11. Melody harmonization, which involves generating a chord progression that complements a user-provided melody, continues to pose a significant challenge. A chord progression must not only be in harmony with the ...

    Authors: Shangda Wu, Yue Yang, Zhaowen Wang, Xiaobing Li and Maosong Sun
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:4
  12. Musical instrument sound synthesis (MISS) often utilizes a text-to-speech framework because of its similarity to speech in terms of generating sounds from symbols. Moreover, a plucked string instrument, such a...

    Authors: Junya Koguchi and Masanori Morise
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:3
  13. Shouted and normal speech classification plays an important role in many speech-related applications. The existing works are often based on magnitude-based features and ignore phase-based features, which are d...

    Authors: Khomdet Phapatanaburi, Longbiao Wang, Meng Liu, Seiichi Nakagawa, Talit Jumphoo and Peerapong Uthansakul
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:2
  14. Acoustic scene classification (ASC) is the process of identifying the acoustic environment or scene from which an audio signal is recorded. In this work, we propose an encoder-decoder-based approach to ASC, wh...

    Authors: Yun-Fei Shao, Xin-Xin Ma, Yong Ma and Wei-Qiang Zhang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:1
  15. Acoustic sensing by multiple devices connected in a wireless acoustic sensor network (WASN) creates new opportunities for multichannel signal processing. However, the autonomy of agents in such a network still...

    Authors: Aleksej Chinaev, Niklas Knaepper and Gerald Enzner
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:55
  16. Target speaker separation aims to separate the speech components of the target speaker from mixed speech and remove extraneous components such as noise. In recent years, deep learning-based speech separation m...

    Authors: Jing Wang, Hanyue Liu, Liang Xu, Wenjing Yang, Weiming Yi and Fang Liu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:53
  17. The task of bandwidth extension addresses the generation of missing high frequencies of audio signals based on knowledge of the low-frequency part of the sound. This task applies to various problems, such as a...

    Authors: Pierre-Amaury Grumiaux and Mathieu Lagrange
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:51
  18. This study focuses on exploring the acoustic differences between synthesized Guzheng pieces and real Guzheng performances, with the aim of improving the quality of synthesized Guzheng music. A dataset with con...

    Authors: Huiwen Xue, Chenxin Sun, Mingcheng Tang, Chenrui Hu, Zhengqing Yuan, Min Huang and Zhongzhe Xiao
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:50
  19. Predominant source separation is the separation of one or more desired predominant signals, such as voice or leading instruments, from polyphonic music. The proposed work uses time-frequency filtering on predo...

    Authors: Lekshmi Chandrika Reghunath and Rajeev Rajan
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:49
  20. Speakers with dysarthria often struggle to accurately pronounce words and effectively communicate with others. Automatic speech recognition (ASR) is a powerful tool for extracting the content from speakers wit...

    Authors: Zhaopeng Qian, Kejing Xiao and Chongchong Yu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:48
  21. This article presents the research work on improving speech recognition systems for the morphologically complex Malayalam language using subword tokens for language modeling. The speech recognition system is b...

    Authors: Kavya Manohar, Jayan A R and Rajeev Rajan
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:47
  22. Speaker embeddings, from the ECAPA-TDNN speaker verification network, were recently introduced as features for the task of clustering microphones in ad hoc arrays. Our previous work demonstrated that, in compa...

    Authors: Stijn Kindt, Jenthe Thienpondt, Luca Becker and Nilesh Madhu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:46

    The Correction to this article has been published in EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:5

  23. Non-parallel data voice conversion (VC) has achieved considerable breakthroughs in recent years due to the use of self-supervised pre-trained representations (SSPR). Features extracted by the pre-trained model ...

    Authors: Hao Huang, Lin Wang, Jichen Yang, Ying Hu and Liang He
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:45
  24. Appropriate background music in e-commerce advertisements can help stimulate consumption and build product image. However, many factors like emotion and product category should be taken into account, which mak...

    Authors: Le Ma, Xinda Wu, Ruiyuan Tang, Chongjun Zhong and Kejun Zhang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:44
  25. Snoring affects 57% of men, 40% of women, and 27% of children in the USA. Moreover, snoring is highly correlated with obstructive sleep apnoea (OSA), which is characterised by loud and frequent snoring. OSA ...

    Authors: Jingtan Li, Mengkai Sun, Zhonghao Zhao, Xingcan Li, Gaigai Li, Chen Wu, Kun Qian, Bin Hu, Yoshiharu Yamamoto and Björn W. Schuller
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:43
  26. Unsupervised anomalous sound detection (ASD) aims to detect unknown anomalous sounds of devices when only normal sound data is available. The autoencoder (AE) and self-supervised learning based methods are two...

    Authors: Jian Guan, Youde Liu, Qiuqiang Kong, Feiyang Xiao, Qiaoxi Zhu, Jiantong Tian and Wenwu Wang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:42
  27. In recent years, the speaker-independent, single-channel speech separation problem has made significant progress with the development of deep neural networks (DNNs). However, separating the speech of each inte...

    Authors: Chunxi Wang, Maoshen Jia and Xinfeng Zhang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:41
  28. Graph neural networks have recently been extended to the field of speech signal processing, as graphs offer a more compact and flexible way to represent speech sequences. However, the structures of the rel...

    Authors: Yan Li, Yapeng Wang, Xu Yang and Sio-Kei Im
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:40
  29. Acoustic echo cancelation (AEC) is a system identification problem that has been addressed by various techniques and most commonly by normalized least mean square (NLMS) adaptive algorithms. However, performin...

    Authors: Amin Saremi, Balaji Ramkumar, Ghazaleh Ghaffari and Zonghua Gu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:39
  30. In this paper, two approaches are proposed for estimating the direction of arrival (DOA) and power spectral density (PSD) of stationary point sources by using a single, rotating, directional microphone. These ...

    Authors: Elisa Tengan, Thomas Dietzen, Filip Elvander and Toon van Waterschoot
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:38
  31. This paper presents three cascade algorithms for combined acoustic feedback cancelation (AFC) and noise reduction (NR) in speech applications. A prediction error method (PEM)-based adaptive feedback cancelatio...

    Authors: Santiago Ruiz, Toon van Waterschoot and Marc Moonen
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:37
  32. A three-stage approach is proposed for speaker counting and speech separation in noisy and reverberant environments. In the spatial feature extraction, a spatial coherence matrix (SCM) is computed using whiten...

    Authors: Yicheng Hsu and Mingsian R. Bai
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:36
  33. In this paper, we propose a technique for removing a specific type of interference from a monaural recording. Nonstationary interferences are generally challenging to eliminate from such recordings. However, i...

    Authors: Takao Kawamura, Kouei Yamaoka, Yukoh Wakabayashi, Nobutaka Ono and Ryoichi Miyazaki
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:35
  34. Traffic congestion can lead to negative driving emotions, significantly increasing the likelihood of traffic accidents. Reducing negative driving emotions as a means to mitigate speeding, reckless overtaking, ...

    Authors: Lekai Zhang, Yingfan Wang, Kailun He, Hailong Zhang, Baixi Xing, Xiaofeng Liu and Fo Hu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:34
  35. Speaker recognition, the process of automatically identifying a speaker based on individual characteristics in speech signals, presents significant challenges when addressing heterogeneous-domain conditions. F...

    Authors: Zhiyong Chen and Shugong Xu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:33
  36. In many signal processing applications, metadata may be advantageously used in conjunction with a high dimensional signal to produce a desired output. In the case of classical Sound Source Localization (SSL) a...

    Authors: Eric Grinstein, Vincent W. Neo and Patrick A. Naylor
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:32
  37. In the past decades, convolutional neural networks (CNNs) have been commonly adopted in audio perception tasks, which aim to learn latent representations. However, for audio analysis, CNNs may exhibit limitati...

    Authors: Te Zeng and Francis C. M. Lau
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:31
  38. The presence of noise and reverberation significantly impedes speech clarity and intelligibility. To mitigate these effects, numerous deep learning-based network models have been proposed for speech enhancemen...

    Authors: Shiyun Xu, Zehua Zhang and Mingjiang Wang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:30
  39. In multichannel signal processing with distributed sensors, choosing the optimal subset of observed sensor signals to be exploited is crucial in order to maximize algorithmic performance and reduce computation...

    Authors: Michael Günther, Andreas Brendel and Walter Kellermann
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:29
  40. Personalized voice triggering is a key technology in voice assistants and serves as the first step for users to activate the voice assistant. Personalized voice triggering involves keyword spotting (KWS) and s...

    Authors: Xingwei Liang, Zehua Zhang and Ruifeng Xu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:28
  41. The goal of sound event detection and localization (SELD) is to identify each individual sound event class and its activity time from a piece of audio, while estimating its spatial location at the time of acti...

    Authors: Yuting Zhou and Hongjie Wan
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:27
  42. Analysis of couple interactions using speech processing techniques is an increasingly active multi-disciplinary field that poses challenges such as automatic relationship quality assessment and behavioral codi...

    Authors: Tuğçe Melike Koçak, Büşra Çilem Dibek, Esma Nafiye Polat, Nilüfer Kafesçioğlu and Cenk Demiroğlu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:26
  43. In recent years, more and more smart home devices with microphones have come into our lives; it is highly desirable to connect these microphones as wireless acoustic sensor networks (WASNs) so that these devices can b...

    Authors: Zhe Han, Yuxuan Ke, Xiaodong Li and Chengshi Zheng
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:25
  44. In recent years, the adoption of deep learning techniques has enabled major breakthroughs in the automatic music generation research field, sparking a renewed interest in generative music. A great de...

    Authors: Luca Comanducci, Davide Gioiosa, Massimiliano Zanoni, Fabio Antonacci and Augusto Sarti
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:24
  45. Emotion plays a dominant role in speech. The same utterance with different emotions can convey a completely different meaning. The ability to express various emotions while speaking is also one of the typi...

    Authors: Tong Liu and Xiaochen Yuan
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:23
  46. Speech emotion recognition (SER) is a hot topic in speech signal processing. With the advancement of cheap computing power and the proliferation of research in data-driven methods, deep learning appro...

    Authors: Gang Liu, Shifang Cai and Ce Wang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:22
  47. Recent years have witnessed great progress in single-channel speech separation through the application of self-attention-based networks. Despite the excellent performance in mining relevant long-sequence contextual informa...

    Authors: Kunpeng Wang, Hao Zhou, Jingxiang Cai, Wenna Li and Juan Yao
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:21

Annual Journal Metrics

  • 2022 Citation Impact
    2.4 - 2-year Impact Factor
    2.0 - 5-year Impact Factor
    1.081 - SNIP (Source Normalized Impact per Paper)
    0.458 - SJR (SCImago Journal Rank)

  • 2023 Speed
    17 days submission to first editorial decision for all manuscripts (Median)
    154 days submission to accept (Median)

  • 2023 Usage
    368,607 downloads
    70 Altmetric mentions
