TY  - JOUR
AU  - Mesaros, Annamaria
AU  - Virtanen, Tuomas
PY  - 2010
DA  - 2010/02/23
TI  - Automatic Recognition of Lyrics in Singing
JO  - EURASIP Journal on Audio, Speech, and Music Processing
SP  - 546047
VL  - 2010
IS  - 1
AB  - The paper considers the task of recognizing phonemes and words from a singing input by using a phonetic hidden Markov model recognizer. The system is targeted to both monophonic singing and singing in polyphonic music. A vocal separation algorithm is applied to separate the singing from polyphonic music. Due to the lack of annotated singing databases, the recognizer is trained using speech and linearly adapted to singing. Global adaptation to singing is found to improve singing recognition performance. Further improvement is obtained by gender-specific adaptation. We also study adaptation with multiple base classes defined by either phonetic or acoustic similarity. We test phoneme-level and word-level n-gram language models. The phoneme language models are trained on the speech database text. The large-vocabulary word-level language model is trained on a database of textual lyrics. Two applications are presented. The recognizer is used to align textual lyrics to vocals in polyphonic music, obtaining an average error of 0.94 seconds for line-level alignment. A query-by-singing retrieval application based on the recognized words is also constructed; in 57% of the cases, the first retrieved song is the correct one.
SN  - 1687-4722
UR  - https://doi.org/10.1155/2010/546047
DO  - 10.1155/2010/546047
ID  - Mesaros2010
ER  - 