Table 9 Frequency of the word occurrence in the orthographic corpus file

From: Statistical analysis of orthographic and phonemic language corpus for word-based and phoneme-based Polish language modelling

No. Frequency of occurrence Word
i f(w_i)·100 [%] w_i
1 3.34041 w
2 2.31575 i
3 1.83890 na
4 1.80585 z
5 1.72883
6 1.56392 nie
7 1.26101 do
8 0.95783 że
9 0.94306 to
10 0.75176 o
11 0.75055 jest
12 0.61910 a
13 0.43553 jak
14 0.42700 po
15 0.39629 od
16 0.38103 ale
17 0.36794 za
18 0.33652 przez
19 0.32741 co
20 0.28822 dla
21 0.28032 czy
22 0.26489 tym
23 0.26386 juŻ
24 0.23640
25 0.23636 tak
26 0.23209 tylko
27 0.21745 ma
28 0.20633 moŻe
29 0.19593 tego
30 0.19353 ze
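
The second column, f(w_i)·100 [%], reads as the relative frequency of word w_i in the corpus, expressed as a percentage. A minimal sketch of how such a ranking could be computed is given below. It assumes whitespace tokenization and case-insensitive comparison, neither of which is confirmed by the table; the function name `word_frequency_percent` and the sample sentence are hypothetical and not taken from the corpus.

```python
from collections import Counter


def word_frequency_percent(text):
    """Return (word, percent) pairs sorted by descending frequency.

    Assumption: words are whitespace-separated and compared
    case-insensitively. The source corpus may have been tokenized differently.
    """
    tokens = text.lower().split()
    total = len(tokens)
    if total == 0:
        return []
    counts = Counter(tokens)
    # Sort by descending count, then alphabetically so ties rank stably.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, 100.0 * n / total) for word, n in ranked]


# Hypothetical toy input, used only to illustrate the output layout.
sample = "w domu i w ogrodzie nie ma nic"
for rank, (word, pct) in enumerate(word_frequency_percent(sample), start=1):
    print(f"{rank} {pct:.5f} {word}")
```

Each printed line follows the table's layout: rank, percentage to five decimal places, then the word.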