From: Improving speech recognition systems for the morphologically complex Malayalam language using subword tokens for language modeling
[Table: original text alongside the corresponding subword-tokenized text]