Gensim: Topic Modelling for Humans
Project Website: https://radimrehurek.com/gensim/
GitHub Link: https://github.com/piskvorky/gensim/
Gensim is a FREE Python library:
Scalable statistical semantics
Analyze plain-text documents for semantic structure
Retrieve semantically similar documents
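To make the "retrieve similar documents" idea concrete, here is a minimal sketch in plain Python. It is not Gensim's API or implementation: it ranks documents by simple bag-of-words cosine similarity, which only captures word overlap, whereas Gensim's models aim at semantic structure beyond shared words. The function names (`bag_of_words`, `cosine`, `most_similar`) and the sample documents are made up for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count tokens (illustrative tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, documents, top_n=3):
    """Return (index, score) pairs for the top_n documents closest to the query."""
    query_vec = bag_of_words(query)
    scores = [
        (i, cosine(query_vec, bag_of_words(doc)))
        for i, doc in enumerate(documents)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical sample data, not taken from any real corpus.
documents = [
    "graph theory and minors of trees",
    "human computer interface for lab applications",
    "the intersection graph of paths in trees",
]
ranked = most_similar("graph of trees", documents, top_n=2)
for index, score in ranked:
    print(f"{score:.3f}  {documents[index]}")
```

The second document shares no words with the query and scores zero, which is exactly the limitation that semantic models are meant to address: two texts can be about the same topic without sharing vocabulary.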
Gensim started off as a collection of various Python scripts for the Czech Digital Mathematics Library dml.cz in 2008, where it served to generate a short list of the most similar articles to a given article (gensim = “generate similar”).
Later versions of gensim greatly improved its efficiency and scalability.
By now, gensim is, to my knowledge, the most robust, efficient and hassle-free piece of software for unsupervised semantic modelling from plain text. It stands in contrast to brittle homework-assignment implementations that do not scale on the one hand, and robust Java-esque projects that take forever just to run “hello world” on the other.