Figure 3. From: RNN language model with word clustering and class-based output layer
Cumulative unigram probability distribution for the Penn Treebank Corpus, which contains about one million words (Zipf’s law).