Open Source Text Processing Project: berkeleylm


Project Website:

Github Link:


An N-gram Language Model Library from UC Berkeley

This project provides a library for estimating large n-gram language models, storing them in memory, and accessing them efficiently. It is described in an accompanying paper. Its data structures are faster and smaller than SRILM and nearly as fast as KenLM, despite being written in Java instead of C++. It also achieves the best published lossless encoding of the Google n-gram corpus.
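
For readers new to the idea, the following is a minimal sketch of what estimating, storing, and querying an n-gram model involves, here for bigrams with plain maximum-likelihood estimates. The class ToyBigramModel and its methods are hypothetical names made up for illustration. They are not part of the berkeleylm API, and the hash-map storage shown here is far simpler than the compact data structures the library uses.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy class for illustration only; not part of berkeleylm.
class ToyBigramModel {
    // How often each word appears as a context, i.e. with a word following it.
    private final Map<String, Integer> contextCounts = new HashMap<>();
    // How often each adjacent word pair appears, keyed as "prev next".
    private final Map<String, Integer> bigramCounts = new HashMap<>();

    // Estimation step: count contexts and adjacent pairs in one tokenized sentence.
    void addSentence(List<String> words) {
        for (int i = 0; i + 1 < words.size(); i++) {
            String prev = words.get(i);
            String next = words.get(i + 1);
            contextCounts.merge(prev, 1, Integer::sum);
            bigramCounts.merge(prev + " " + next, 1, Integer::sum);
        }
    }

    // Query step: maximum-likelihood estimate of P(next | prev).
    // Returns 0.0 for unseen contexts or pairs; real toolkits apply
    // smoothing and backoff instead of returning zero.
    double probability(String prev, String next) {
        int context = contextCounts.getOrDefault(prev, 0);
        if (context == 0) {
            return 0.0;
        }
        int pair = bigramCounts.getOrDefault(prev + " " + next, 0);
        return pair / (double) context;
    }
}
```

After adding the sentences "the cat sat" and "the dog sat", calling probability("the", "cat") on this sketch returns 0.5, because "the" is followed by "cat" in one of its two occurrences. Storing every n-gram as a separate string key, as done here, is exactly the kind of memory overhead that specialized language model libraries are designed to avoid.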