Open Source Text Processing Project: PTStemmer

PTStemmer – A Stemming toolkit for the Portuguese language
Project Website: https://code.google.com/archive/p/ptstemmer/
Github Link: None
Description: FEATURES
Java, Python, and .NET C# implementations of Orengo, Porter, and Savoy stemmers
Fast: can stem more than 1.5M words/second on a normal desktop …

Open Source Text Processing Project: OleanderStemmingLibrary

Oleander C++ stemming library
Project Website: http://www.oleandersolutions.com/stemming/stemming.html
Github Link: https://github.com/OleanderSoftware/OleanderStemmingLibrary
Description: Stemming is a normalization process used to reduce words down to their root. Stemming removes inflectional suffixes so that morphological variants of the same word can be compared more …
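
To make the idea of suffix removal concrete, here is a minimal Python sketch. It is not the Oleander, Porter, or Snowball algorithm: the suffix list, the `naive_stem` and `same_stem` names, and the minimum stem length are all hypothetical choices made for illustration, and real stemmers apply much richer, language-specific rules.

```python
# Illustrative only: a hypothetical, hand-picked suffix list.
# Real stemmers (Porter, Snowball, etc.) use far more detailed rules.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip when enough of the word remains to act as a stem.
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def same_stem(a: str, b: str) -> bool:
    """Compare two words by their naive stems rather than their surface forms."""
    return naive_stem(a) == naive_stem(b)


if __name__ == "__main__":
    for w in ("walking", "walked", "walks", "walk"):
        print(w, "->", naive_stem(w))
    print(same_stem("walked", "walking"))
```

With this sketch, "walking", "walked", and "walks" all reduce to the same string, so they compare as equal. A naive rule set like this one mishandles many words (for example, it leaves irregular forms untouched), which is why the libraries listed here implement carefully designed algorithms instead.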

Open Source Text Processing Project: PyStemmer

Python stemming library using snowball stemmers
Project Website: https://pypi.python.org/pypi/PyStemmer
Github Link: https://github.com/snowballstem/pystemmer
Description: PyStemmer is a Python interface to the stemming algorithms from the Snowball project (http://snowball.tartarus.org/). A stemming algorithm (or stemmer) is a process for removing the commoner morphological …

Open Source Text Processing Project: Snowball

Snowball
Project Website: http://snowballstem.org/
Github Link: https://github.com/snowballstem/snowball
Description: Snowball is a small string processing language designed for creating stemming algorithms for use in Information Retrieval. This site describes Snowball, and presents several useful stemmers which have been implemented using it. …

Open Source Text Processing Project: The Porter Stemming Algorithm

The Porter Stemming Algorithm
Project Website: http://tartarus.org/martin/PorterStemmer/
Github Link: None
Description: This is the ‘official’ home page for distribution of the Porter Stemming Algorithm, written and maintained by its author, Martin Porter. The Porter stemming algorithm (or ‘Porter stemmer’) is …