Open Source Text Processing Project: topia.termextract

topia.termextract: Content Term Extraction using POS Tagging
Project Website: https://pypi.python.org/pypi/topia.termextract/
Github Link: None
Description: This package determines important terms within a given piece of content. It uses linguistic tools such as Parts-Of-Speech (POS) and some simple statistical analysis to determine …

Open Source Text Processing Project: tagger

tagger: A Python module for extracting relevant tags from text documents
Project Website: None
Github Link: https://github.com/apresta/tagger
Description: Module for extracting tags from text documents. Extracting tags from a text document involves at least three steps: splitting the document into …

Open Source Text Processing Project: RAKE

RAKE: A python implementation of the Rapid Automatic Keyword Extraction
Project Website: None
Github Link: https://github.com/aneesha/RAKE
Description: A Python implementation of the Rapid Automatic Keyword Extraction (RAKE) algorithm as described in: Rose, S., Engel, D., Cramer, N., & Cowley, W. …

Open Source Text Processing Project: KEA

KEA: Keyphrase Extraction Algorithm
Project Website: http://www.nzdl.org/Kea/
Github Link: None
Description: Keywords and keyphrases (multi-word units) are widely used in large document collections. They describe the content of single documents and provide a kind of semantic metadata that is useful …