Call for Papers: Recent advances in computational sound scene analysis

Submission Deadline: 1st April 2022

The amount of audio data being generated has increased dramatically over the past decade, ranging from user-generated content and recordings in audiovisual archives to sensor data captured in urban, natural, or domestic environments. The need to detect and identify sound events in environmental recordings, as well as to recognize the context of an audio recording, has led to the emergence of a new field of research: computational sound scene analysis. Emerging applications of sound scene analysis include the development of sound recognition technologies for smart homes and smart cities, security, audio retrieval and archiving, ambient assisted living, and automatic biodiversity assessment. The field has been fueled by recent advances in machine learning and signal processing theory, from methodologies for learning from limited data or weak annotations to new audio representations for sound scenes and events. In addition, there are open questions on how to model sound taxonomies, how to evaluate user experience for sound scene analysis, and how to address ethical and policy issues regarding the creation and deployment of sound scene analysis systems. This special issue seeks to provide a venue for recent research on novel methods, data-driven empirical research and datasets, new software implementations and toolboxes, and overview and tutorial material relevant to the computational analysis of sound scenes and sound events.

Topics of interest include but are not limited to:

  • Methodology: signal processing, machine learning, auditory perception, taxonomies, and ontologies related to sound scenes and events
  • Tasks and applications: acoustic scene classification, sound event detection and localization, sound source separation, audio tagging, audio captioning, detection of rare sound events, anomalous audio event detection, computational bioacoustic scene analysis, urban soundscape analysis, and cross-modal analysis (e.g. audio recognition/analysis with information from video, text, images, or language)
  • Machine learning methodologies for sound scene analysis: self-supervised learning, few-shot learning, meta-learning, generative models, explainable machine learning, continual learning, curriculum learning, active learning, multi-task learning, and attention mechanisms
  • Human-centered sound scene analysis: human-computer interaction and interfaces, user-centered evaluation, visualization of audio events and scenes, and user annotation
  • Evaluation, datasets, software tools, and reproducibility in computational sound scene and event analysis
  • Ethics and policy: legal and societal aspects of computational sound scene analysis; ethical and privacy issues related to designing, implementing and deploying sound scene analysis systems; privacy-preserving sound scene analysis; federated learning for sound scene analysis
  • Performance metrics: studies for developing effective evaluation metrics and tools for related tasks in audio scene analysis, event detection, and audio tagging

The EURASIP Journal on Audio, Speech, and Music Processing recognizes novel contributions of the following types within its area:

  • Empirical Research: Data-driven research, new experimental results, and new datasets
  • Methodology: New theory and methods for the processing of speech, audio, and music signals
  • Software: New software implementations and toolboxes for speech, audio, and music processing
  • Review: Timely and comprehensive overview and tutorial material covering recent developments within the field

Submission instructions:

Guest Editors:
Jakob Abeßer, Fraunhofer IDMT, Germany
Emmanouil Benetos, Queen Mary University of London, UK
Annamaria Mesaros, Tampere University, Finland
Wenwu Wang, University of Surrey, UK