Open Source Text Processing Project: PyTeaser

PyTeaser: Summarizes news articles given a URL
Project Website: http://xiaoxu193.github.io/PyTeaser/
GitHub Link: https://github.com/xiaoxu193/PyTeaser
Description: PyTeaser takes any news article and extracts a brief summary from it. It’s based on the original Scala project. Summaries are created by ranking sentences …

Open Source Text Processing Project: Python TextTeaser

TextTeaser: Official version of TextTeaser
Project Website: None
GitHub Link: https://github.com/DataTeaser/textteaser
Description: TextTeaser is an automatic summarization algorithm. This is now the official version of TextTeaser. Future developments of TextTeaser will be in this repository. The original Scala TextTeaser can …

Open Source Text Processing Project: TextTeaser

TextTeaser is an automatic summarization algorithm
Project Website: None
GitHub Link: https://github.com/MojoJolo/textteaser
Description: TextTeaser is an automatic summarization algorithm that combines the power of natural language processing and machine learning to produce good results. TextTeaser is ported in Python and …

Open Source Text Processing Project: summarizer

summarizer: A multidocument text summarizer
Project Website: None
GitHub Link: https://github.com/kylehg/summarizer
Description: UNMAINTAINED: CIS-530 Final Project. NOTE: This was a school project. It is very likely riddled with bugs, and is entirely unmaintained. It should not be considered for any …

Open Source Text Processing Project: topia.termextract

topia.termextract: Content Term Extraction using POS Tagging
Project Website: https://pypi.python.org/pypi/topia.termextract/
GitHub Link: None
Description: This package determines important terms within a given piece of content. It uses linguistic tools such as Parts-Of-Speech (POS) and some simple statistical analysis to determine …

Open Source Text Processing Project: tagger

tagger: A Python module for extracting relevant tags from text documents
Project Website: None
GitHub Link: https://github.com/apresta/tagger
Description: Module for extracting tags from text documents. Extracting tags from a text document involves at least three steps: splitting the document into …

Open Source Text Processing Project: Moses

Moses, the machine translation system
Project Website: http://www.statmt.org/moses/
GitHub Link: https://github.com/moses-smt/mosesdecoder
Description: Moses is a statistical machine translation system that allows you to automatically train translation models for any language pair. All you need is a collection of translated texts …

Open Source Text Processing Project: RAKE

RAKE: A Python implementation of the Rapid Automatic Keyword Extraction algorithm
Project Website: None
GitHub Link: https://github.com/aneesha/RAKE
Description: A Python implementation of the Rapid Automatic Keyword Extraction (RAKE) algorithm as described in: Rose, S., Engel, D., Cramer, N., & Cowley, W. …

Open Source Text Processing Project: KEA

KEA: Keyphrase Extraction Algorithm
Project Website: http://www.nzdl.org/Kea/
GitHub Link: None
Description: Keywords and keyphrases (multi-word units) are widely used in large document collections. They describe the content of single documents and provide a kind of semantic metadata that is useful …

Open Source Text Processing Project: Open Text Summarizer

Open Text Summarizer
Project Website: http://libots.sourceforge.net/
GitHub Link: None
Description: Automatic text summarization is the technique by which a computer program summarizes a document. A text is put into the computer and a highlighted (summarized) text is returned. The Open Text …