Articles

  1. This study focuses on exploring the acoustic differences between synthesized Guzheng pieces and real Guzheng performances, with the aim of improving the quality of synthesized Guzheng music. A dataset with con...

    Authors: Huiwen Xue, Chenxin Sun, Mingcheng Tang, Chenrui Hu, Zhengqing Yuan, Min Huang and Zhongzhe Xiao
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:50
  2. Predominant source separation is the separation of one or more desired predominant signals, such as voice or leading instruments, from polyphonic music. The proposed work uses time-frequency filtering on predo...

    Authors: Lekshmi Chandrika Reghunath and Rajeev Rajan
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:49
  3. Speakers with dysarthria often struggle to accurately pronounce words and effectively communicate with others. Automatic speech recognition (ASR) is a powerful tool for extracting the content from speakers wit...

    Authors: Zhaopeng Qian, Kejing Xiao and Chongchong Yu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:48
  4. This article presents the research work on improving speech recognition systems for the morphologically complex Malayalam language using subword tokens for language modeling. The speech recognition system is b...

    Authors: Kavya Manohar, Jayan A R and Rajeev Rajan
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:47
  5. Speaker embeddings, from the ECAPA-TDNN speaker verification network, were recently introduced as features for the task of clustering microphones in ad hoc arrays. Our previous work demonstrated that, in compa...

    Authors: Stijn Kindt, Jenthe Thienpondt, Luca Becker and Nilesh Madhu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:46

    The Correction to this article has been published in EURASIP Journal on Audio, Speech, and Music Processing 2024 2024:5

  6. Non-parallel data voice conversion (VC) has achieved considerable breakthroughs in recent years thanks to the use of self-supervised pre-trained representations (SSPR). Features extracted by the pre-trained model ...

    Authors: Hao Huang, Lin Wang, Jichen Yang, Ying Hu and Liang He
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:45
  7. Appropriate background music in e-commerce advertisements can help stimulate consumption and build product image. However, many factors like emotion and product category should be taken into account, which mak...

    Authors: Le Ma, Xinda Wu, Ruiyuan Tang, Chongjun Zhong and Kejun Zhang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:44
  8. Snoring affects 57 % of men, 40 % of women, and 27 % of children in the USA. Besides, snoring is highly correlated with obstructive sleep apnoea (OSA), which is characterised by loud and frequent snoring. OSA ...

    Authors: Jingtan Li, Mengkai Sun, Zhonghao Zhao, Xingcan Li, Gaigai Li, Chen Wu, Kun Qian, Bin Hu, Yoshiharu Yamamoto and Björn W. Schuller
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:43
  9. Unsupervised anomalous sound detection (ASD) aims to detect unknown anomalous sounds of devices when only normal sound data is available. The autoencoder (AE) and self-supervised learning based methods are two...

    Authors: Jian Guan, Youde Liu, Qiuqiang Kong, Feiyang Xiao, Qiaoxi Zhu, Jiantong Tian and Wenwu Wang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:42
  10. In recent years, the speaker-independent, single-channel speech separation problem has made significant progress with the development of deep neural networks (DNNs). However, separating the speech of each inte...

    Authors: Chunxi Wang, Maoshen Jia and Xinfeng Zhang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:41
  11. Graph neural networks have recently been extended to the field of speech signal processing. Graphs offer a more compact and flexible way to represent speech sequences. However, the structures of the rel...

    Authors: Yan Li, Yapeng Wang, Xu Yang and Sio-Kei Im
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:40
  12. Acoustic echo cancelation (AEC) is a system identification problem that has been addressed by various techniques and most commonly by normalized least mean square (NLMS) adaptive algorithms. However, performin...

    Authors: Amin Saremi, Balaji Ramkumar, Ghazaleh Ghaffari and Zonghua Gu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:39
  13. In this paper, two approaches are proposed for estimating the direction of arrival (DOA) and power spectral density (PSD) of stationary point sources by using a single, rotating, directional microphone. These ...

    Authors: Elisa Tengan, Thomas Dietzen, Filip Elvander and Toon van Waterschoot
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:38
  14. This paper presents three cascade algorithms for combined acoustic feedback cancelation (AFC) and noise reduction (NR) in speech applications. A prediction error method (PEM)-based adaptive feedback cancelatio...

    Authors: Santiago Ruiz, Toon van Waterschoot and Marc Moonen
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:37
  15. A three-stage approach is proposed for speaker counting and speech separation in noisy and reverberant environments. In the spatial feature extraction, a spatial coherence matrix (SCM) is computed using whiten...

    Authors: Yicheng Hsu and Mingsian R. Bai
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:36
  16. In this paper, we propose a technique for removing a specific type of interference from a monaural recording. Nonstationary interferences are generally challenging to eliminate from such recordings. However, i...

    Authors: Takao Kawamura, Kouei Yamaoka, Yukoh Wakabayashi, Nobutaka Ono and Ryoichi Miyazaki
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:35
  17. Traffic congestion can lead to negative driving emotions, significantly increasing the likelihood of traffic accidents. Reducing negative driving emotions as a means to mitigate speeding, reckless overtaking, ...

    Authors: Lekai Zhang, Yingfan Wang, Kailun He, Hailong Zhang, Baixi Xing, Xiaofeng Liu and Fo Hu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:34
  18. Speaker recognition, the process of automatically identifying a speaker based on individual characteristics in speech signals, presents significant challenges when addressing heterogeneous-domain conditions. F...

    Authors: Zhiyong Chen and Shugong Xu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:33
  19. In many signal processing applications, metadata may be advantageously used in conjunction with a high dimensional signal to produce a desired output. In the case of classical Sound Source Localization (SSL) a...

    Authors: Eric Grinstein, Vincent W. Neo and Patrick A. Naylor
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:32
  20. In the past decades, convolutional neural networks (CNNs) have been commonly adopted in audio perception tasks, which aim to learn latent representations. However, for audio analysis, CNNs may exhibit limitati...

    Authors: Te Zeng and Francis C. M. Lau
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:31
  21. The presence of noise and reverberation significantly impedes speech clarity and intelligibility. To mitigate these effects, numerous deep learning-based network models have been proposed for speech enhancemen...

    Authors: Shiyun Xu, Zehua Zhang and Mingjiang Wang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:30
  22. In multichannel signal processing with distributed sensors, choosing the optimal subset of observed sensor signals to be exploited is crucial in order to maximize algorithmic performance and reduce computation...

    Authors: Michael Günther, Andreas Brendel and Walter Kellermann
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:29
  23. Personalized voice triggering is a key technology in voice assistants and serves as the first step for users to activate the voice assistant. Personalized voice triggering involves keyword spotting (KWS) and s...

    Authors: Xingwei Liang, Zehua Zhang and Ruifeng Xu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:28
  24. The goal of sound event detection and localization (SELD) is to identify each individual sound event class and its activity time from a piece of audio, while estimating its spatial location at the time of acti...

    Authors: Yuting Zhou and Hongjie Wan
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:27
  25. Analysis of couple interactions using speech processing techniques is an increasingly active multi-disciplinary field that poses challenges such as automatic relationship quality assessment and behavioral codi...

    Authors: Tuğçe Melike Koçak, Büşra Çilem Dibek, Esma Nafiye Polat, Nilüfer Kafesçioğlu and Cenk Demiroğlu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:26
  26. More and more smart home devices with microphones have entered daily life in recent years; it is highly desirable to connect these microphones as wireless acoustic sensor networks (WASNs) so that these devices can b...

    Authors: Zhe Han, Yuxuan Ke, Xiaodong Li and Chengshi Zheng
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:25
  27. In recent years, the adoption of deep learning techniques has enabled major breakthroughs in automatic music generation research, sparking a renewed interest in generative music. A great de...

    Authors: Luca Comanducci, Davide Gioiosa, Massimiliano Zanoni, Fabio Antonacci and Augusto Sarti
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:24
  28. Emotion plays a dominant role in speech. The same utterance with different emotions can lead to a completely different meaning. The ability to express various emotions while speaking is also one of the typi...

    Authors: Tong Liu and Xiaochen Yuan
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:23
  29. Speech emotion recognition (SER) is a hot topic in speech signal processing. With the development of cheap computing power and the proliferation of research in data-driven methods, deep learning appro...

    Authors: Gang Liu, Shifang Cai and Ce Wang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:22
  30. Recent years have witnessed great progress in single-channel speech separation by applying self-attention based networks. Despite the excellent performance in mining relevant long-sequence contextual informa...

    Authors: Kunpeng Wang, Hao Zhou, Jingxiang Cai, Wenna Li and Juan Yao
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:21
  31. The human auditory system employs a number of principles to facilitate the selection of perceptually separated streams from a complex sound mixture. The brain leverages multi-scale redundant representations of...

    Authors: Ashwin Bellur, Karan Thakkar and Mounya Elhilali
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:20
  32. Music inpainting is a sub-task of automated music generation that aims to infill incomplete musical pieces to help musicians in their musical composition process. Many methods have been developed for this task...

    Authors: Mauricio Araneda-Hernandez, Felipe Bravo-Marquez, Denis Parra and Rodrigo F. Cádiz
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:19
  33. A two-stage lightweight online dereverberation algorithm for hearing devices is presented in this paper. The approach combines a multi-channel multi-frame linear filter with a single-channel single-frame post-...

    Authors: Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning and Timo Gerkmann
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:18
  34. In the development of acoustic signal processing algorithms, their evaluation in various acoustic environments is of utmost importance. In order to advance evaluation in realistic and reproducible scenarios, s...

    Authors: Thomas Dietzen, Randall Ali, Maja Taseska and Toon van Waterschoot
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:17
  35. Voice activity detection remains a significant challenge in the presence of transients since transients are more dominant than speech, though it has achieved satisfactory performance in quasi-stationary noisy ...

    Authors: Xiao-Yuan Guo, Chun-Xian Gao and Hui Liu
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:16
  36. With the rise of deep learning, spoken language understanding (SLU) for command-and-control applications such as a voice-controlled virtual assistant can offer reliable hands-free operation to physically disab...

    Authors: Pu Wang and Hugo Van hamme
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:15
  37. Spoken language recognition has made significant progress in recent years, for which automatic speech recognition has been used as a parallel branch to extract phonetic features. However, there is still a lack...

    Authors: Zimu Li, Yanyan Xu, Dengfeng Ke and Kaile Su
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:14
  38. We present a new dataset of 3000 artificial music tracks with rich annotations based on real instrument samples and generated by algorithmic composition with respect to music theory. Our collection provides gr...

    Authors: Fabian Ostermann, Igor Vatolkin and Martin Ebeling
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:13
  39. Electromagnetic components greatly contribute to the peculiar timbre of analog audio gear. Indeed, distortion effects due to the nonlinear behavior of magnetic materials are known to play an important role in ...

    Authors: Oliviero Massi, Alessandro Ilic Mezza, Riccardo Giampiccolo and Alberto Bernardini
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:12
  40. Most music listeners have an intuitive understanding of the notion of rhythm complexity. Musicologists and scientists, however, have long sought objective ways to measure and model such a distinctively percept...

    Authors: Alessandro Ilic Mezza, Massimiliano Zanoni and Augusto Sarti
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:11
  41. The speech feature model is the basis of speech and noise separation, speech expression, and different styles of speech conversion. With the development of signal processing methods, the feature types and dimensio...

    Authors: Xiaoping Xie, Yongzhen Chen, Rufeng Shen and Dan Tian
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:10
  42. Speech is the most common form of human communication, and many conversations use digital communication links. For efficient transmission, acoustic speech waveforms are usually converted to digital form, with ...

    Authors: Douglas O’Shaughnessy
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:8
  43. The paper uses the K-graphs learning method to construct weighted, connected, undirected multiple graphs, aiming to reveal intrinsic inter-frame and intra-frame relationships among speech samples. To benefit ...

    Authors: Tingting Wang, Haiyan Guo, Zirui Ge, Qiquan Zhang and Zhen Yang
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:7
  44. Recently, supervised speech separation has made great progress. However, limited by the nature of supervised training, most existing separation methods require ground-truth sources and are trained on synthetic...

    Authors: Jiangyu Han and Yanhua Long
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:6
  45. The aim of this paper is to investigate the influence of personality traits, characterized by the BFI (Big Five Inventory) and its significant revision called BFI-2, on music recommendation error. The BFI-2 de...

    Authors: Mariusz Kleć, Alicja Wieczorkowska, Krzysztof Szklanny and Włodzimierz Strus
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:4
  46. The SincNet architecture has shown significant benefits over traditional Convolutional Neural Networks (CNN), especially for speaker recognition applications. SincNet comprises parameterized Sinc functions as filt...

    Authors: Prashanth H C, Madhav Rao, Dhanya Eledath and Ramasubramanian V
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:3

    The Correction to this article has been published in EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:9

  47. Music source separation (MSS) aims to isolate musical instrument signals from a given music mixture. Stripes widely exist in music spectrograms, which potentially indicate high-level music information. For exa...

    Authors: Jiale Qian, Xinlu Liu, Yi Yu and Wei Li
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:2
  48. The purpose of this paper is to show a music mixing system that is capable of automatically mixing separate raw recordings with good quality regardless of the music genre. This work recalls selected methods fo...

    Authors: Damian Koszewski, Thomas Görne, Grazina Korvel and Bozena Kostek
    Citation: EURASIP Journal on Audio, Speech, and Music Processing 2023 2023:1

Annual Journal Metrics

  • Citation Impact 2023
    Journal Impact Factor: 1.7
    5-year Journal Impact Factor: 1.6
    Source Normalized Impact per Paper (SNIP): 1.051
    SCImago Journal Rank (SJR): 0.414

  • Speed 2023
    Submission to first editorial decision (median days): 17
    Submission to acceptance (median days): 154

  • Usage 2023
    Downloads: 368,607
    Altmetric mentions: 70
