Table 1 Characteristics of Thai spoken language vs. written language

From: Classification-based spoken text selection for LVCSR language modeling

| Spoken language | Written language |
| --- | --- |
| (1) A sentence is incomplete or fragmented (missing a subject or a verb) [10, 34]. Connected phrases may be found continuously [34]. | (1) A sentence is complete. |
| (2) A sentence is less sophisticated: fewer subordinate clauses [34]. | (2) A sentence is more sophisticated: more subordinate clauses [34]. |
| (3) A sentence starts with a topic-comment structure [34]. | (3) A sentence starts with a subject-predicate form [34]. |
| (4) Repetition (word duplication or paraphrasing) often appears [35]. | (4) A sentence contains less repetition [35]. |
| (5) A filler, a word or expression inserted while the speaker is thinking, often appears [35]. | (5) A filler does not appear [35]. |
| (6) A final particle, e.g. /khâʔ/, /khráp/, /nî:aʔ/, and /c-â:ʔ/, often appears [35]. | (6) A sentence contains fewer final particles [35]. |
| (7) Slang and foreign words are often used. | (7) Formal lexicon is used. |