
Table 2 Subword tokenization illustrating the usage of the continuity marker symbol ‘+’

From: Improving speech recognition systems for the morphologically complex Malayalam language using subword tokens for language modeling

Original text

Subword tokenized text
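
The marker convention named in the caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the ‘+’ is appended to every subword that is followed by another subword of the same word, and it uses a hypothetical English segmentation rather than the table's own entries. The function names `mark_subwords` and `join_subwords` are invented for this sketch and do not come from the article.

```python
MARKER = "+"  # continuity marker symbol named in the table caption


def mark_subwords(pieces):
    """Append MARKER to every piece of a word except the last one.

    Assumption: the marker sits on the right of non-final subwords.
    """
    return [p + MARKER for p in pieces[:-1]] + pieces[-1:]


def join_subwords(tokens):
    """Rebuild whole words from a stream of marked subword tokens."""
    words, current = [], ""
    for tok in tokens:
        if tok.endswith(MARKER):
            # Non-final piece: strip the marker and keep accumulating.
            current += tok[:-len(MARKER)]
        else:
            # Final piece: close the current word.
            words.append(current + tok)
            current = ""
    if current:
        # A trailing piece that still carried a marker.
        words.append(current)
    return words


# Hypothetical segmentation, used only to show the convention.
marked = mark_subwords(["un", "break", "able"])
restored = join_subwords(marked + ["code"])
```

Under this assumption the marker makes the tokenization reversible: a recognizer's subword output can be merged back into whole words without a separate lexicon lookup.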