Open Source Text Processing Project: KenLM

KenLM: Faster and Smaller Language Model Queries
Project Website: http://kheafield.com/code/kenlm/
GitHub Link: https://github.com/kpu/kenlm
Description: KenLM Language Model Toolkit. KenLM estimates, filters, …

Open Source Text Processing Project: IRSTLM

IRSTLM: The IRST Language Modeling Toolkit
Project Website: http://hlt-mt.fbk.eu/technologies/irstlm
GitHub Link: https://github.com/irstlm-team/irstlm
Description: The IRST Language Modeling (IRSTLM) Toolkit features algorithms and data structures suitable to estimate, store, and access very large n-gram language models. Our software has been integrated …

Open Source Text Processing Project: SRILM

SRILM – The SRI Language Modeling Toolkit
Project Website: http://www.speech.sri.com/projects/srilm/
GitHub Link: None
Description: SRILM is a toolkit for building and applying statistical language models (LMs), primarily for use in speech recognition, statistical …

Open Source Text Processing Project: Thot

Thot: a Toolkit for Statistical Machine Translation
Project Website: http://daormar.github.io/thot/
GitHub Link: https://github.com/daormar/thot
Description: Thot is an open source software toolkit for statistical machine translation (SMT). Originally, Thot incorporated tools to train phrase-based models. The new version of Thot now …

Open Source Text Processing Project: Open Text Summarizer

Open Text Summarizer
Project Website: http://libots.sourceforge.net/
GitHub Link: None
Description: Automatic text summarization is a technique in which a computer program summarizes a document. A text is fed into the program and a highlighted (summarized) text is returned. The Open Text …

Open Source Text Processing Project: Maximum Entropy Modeling Toolkit

Maximum Entropy Modeling Toolkit for Python and C++
Project Website: http://homepages.inf.ed.ac.uk/lzhang10/maxent_toolkit.html
GitHub Link: https://github.com/lzhang10/maxent
Description: The Maximum Entropy Toolkit provides a set of tools and a library for constructing maximum entropy (maxent) models in either Python or C++. Maximum Entropy Model …

Open Source Text Processing Project: CRF++

CRF++: Yet Another CRF toolkit
Project Website: https://taku910.github.io/crfpp/
GitHub Link: None
Description: CRF++ is a simple, customizable, open source implementation of Conditional Random Fields (CRFs) for segmenting and labeling sequential data. CRF++ is designed for general-purpose use and will be applied …

Open Source Text Processing Project: GibbsLDA++

GibbsLDA++: A C/C++ Implementation of Latent Dirichlet Allocation
Project Website: http://gibbslda.sourceforge.net/
GitHub Link: None
Description: GibbsLDA++ is a C/C++ implementation of Latent Dirichlet Allocation (LDA) that uses Gibbs sampling for parameter estimation and inference. It is very fast and is …