No. | Component type | No. of unique components | No. of components in the corpus |
---|---|---|---|
1 | single words | 1,943,462 | 230,301,313 |
2 | 2-word sequences | 75,395,184 | 246,110,034 |
3 | 3-word sequences | 170,180,746 | 246,066,692 |
4 | 4-word sequences | 217,586,930 | 246,023,356 |
5 | 5-word sequences | 232,439,967 | 245,980,021 |
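
As a rough illustration of how counts of this kind can be obtained, the sketch below tallies unique and total n-word sequences for a toy corpus. It is not the procedure used to build the table: the function name `ngram_stats`, the whitespace tokenization, and the choice to keep sequences within sentence boundaries are all assumptions made for the example.

```python
from collections import Counter


def ngram_stats(sentences, n):
    """Return (unique, total) counts of n-word sequences.

    `sentences` is an iterable of token lists. Assumption: sequences do
    not cross sentence boundaries; the source does not state this.
    """
    counts = Counter()
    for tokens in sentences:
        # Slide a window of n tokens; sentences shorter than n add nothing.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return len(counts), sum(counts.values())


# Toy corpus, split on whitespace (a hypothetical tokenization).
sentences = [
    "the cat sat on the mat".split(),
    "the cat sat down".split(),
]

for n in range(1, 6):
    unique, total = ngram_stats(sentences, n)
    print(f"{n}-word sequences: unique={unique}, total={total}")
```

Under these assumptions, each sentence of length m contributes m - n + 1 sequences, so the total falls slightly as n grows while the number of unique sequences approaches the total, which matches the general pattern in rows 2 to 5 of the table.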