Text Processing Book: Speech and Language Processing (3rd ed. draft)

Speech and Language Processing (3rd ed. draft)
Project Website: https://web.stanford.edu/~jurafsky/slp3/
Description (table columns: Chapter, Slides, Relation to 2nd ed.):
1: Introduction [Ch. 1 in 2nd ed.]
2: Regular Expressions, Text Normalization, and Edit Distance; slides: Text [pptx] [pdf], Edit Distance [pptx] [pdf] …

Open Source Text Processing Project: TextRank

Python implementation of TextRank algorithm
Project Website: None
Github Link: https://github.com/davidadamojr/TextRank
Description: This is a Python implementation of TextRank for automatic keyword and sentence extraction (summarization), as described in https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf. However, this implementation uses Levenshtein distance as the relation between …
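The excerpt above is cut off before it says which text units the Levenshtein relation connects. As a minimal sketch of the general idea, assuming sentences are the graph nodes and edge weights are a length-normalized edit-distance similarity, a TextRank-style ranking could look like the following. The names `levenshtein`, `similarity`, and `rank_sentences`, the damping factor 0.85, and the fixed iteration count are illustrative assumptions, not the project's API.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed edge weight: 1 minus the length-normalized edit distance."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Weighted PageRank over a fully connected sentence graph."""
    n = len(sentences)
    if n == 0:
        return []
    # Symmetric weight matrix with no self-loops.
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of the edge j -> i.
            incoming = sum(weights[j][i] / out_sums[j] * scores[j]
                           for j in range(n) if out_sums[j] > 0)
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return sorted(zip(sentences, scores), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    docs = ["the cat sat on the mat",
            "the cat sat on a mat",
            "quantum flux capacitors hum"]
    for sentence, score in rank_sentences(docs):
        print(f"{score:.3f}  {sentence}")
```

A summary would then keep the top few sentences from the ranked list. Character-level edit distance is a crude similarity signal; the original TextRank paper linked above uses word overlap between sentences instead.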

Open Source Text Processing Project: MEAD

MEAD
Project Website: http://www.summarization.com/mead/
Github Link: None
Description: MEAD is the most elaborate publicly available platform for multi-lingual summarization and evaluation. The platform implements multiple summarization algorithms such as position-based, centroid-based, longest common subsequence, and keywords. The methods for evaluating the …
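To illustrate two of the algorithm families named above, here is a rough sketch of centroid-based and position-based sentence scoring. It is not MEAD's code: the function names are hypothetical, the centroid uses raw term counts where a fuller system would likely use TF-IDF weights, and the position score is a simple linear decay chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def centroid_scores(sentences):
    """Score each sentence by its similarity to the document centroid."""
    vectors = [Counter(tokenize(s)) for s in sentences]
    centroid = Counter()
    for vector in vectors:
        centroid.update(vector)  # summed counts; cosine ignores the scale
    return [cosine(vector, centroid) for vector in vectors]


def position_scores(sentences):
    """Score sentences by position: earlier sentences score higher."""
    n = len(sentences)
    return [(n - i) / n for i in range(n)]


if __name__ == "__main__":
    docs = ["cats chase mice", "cats chase birds", "stocks fell sharply"]
    for s, c, p in zip(docs, centroid_scores(docs), position_scores(docs)):
        print(f"centroid={c:.2f} position={p:.2f}  {s}")
```

The two scores could be combined, for example as a weighted sum, before selecting the top sentences. The weights would be a tuning choice.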

Open Source Text Processing Project: SWING

SWING: An Open-Source Text Summarizer from WING
Project Website: http://wing.comp.nus.edu.sg/downloads/swing/
Github Link: https://github.com/WING-NUS/SWING
Description: The Summarizer from the Web IR / NLP Group (WING), hence SWING, is a modular, state-of-the-art automatic extractive text summarization system. It produces informative summaries from …

Open Source Text Processing Project: PyTeaser

PyTeaser: Summarizes news articles given a URL
Project Website: http://xiaoxu193.github.io/PyTeaser/
Github Link: https://github.com/xiaoxu193/PyTeaser
Description: PyTeaser takes any news article and extracts a brief summary from it. It’s based on the original Scala project. Summaries are created by ranking sentences …

Open Source Text Processing Project: Python TextTeaser

TextTeaser: Official version of TextTeaser
Project Website: None
Github Link: https://github.com/DataTeaser/textteaser
Description: TextTeaser is an automatic summarization algorithm. This is now the official version of TextTeaser. Future developments of TextTeaser will be in this repository. The original Scala TextTeaser can …

Open Source Text Processing Project: TextTeaser

TextTeaser is an automatic summarization algorithm
Project Website: None
Github Link: https://github.com/MojoJolo/textteaser
Description: TextTeaser is an automatic summarization algorithm that combines natural language processing and machine learning to produce good results. TextTeaser is ported to Python and …

Open Source Text Processing Project: summarizer

summarizer: A multidocument text summarizer
Project Website: None
Github Link: https://github.com/kylehg/summarizer
Description: UNMAINTAINED: CIS-530 Final Project. NOTE: This was a school project. It is very likely riddled with bugs, and is entirely unmaintained. It should not be considered for any …

Open Source Text Processing Project: Open Text Summarizer

Open Text Summarizer
Project Website: http://libots.sourceforge.net/
Github Link: None
Description: Automatic text summarization is a technique in which a computer program summarizes a document. A text is fed into the program and a highlighted (summarized) text is returned. The Open Text …

Open Source Text Processing Project: Reduction

Reduction
Project Website: None
Github Link: https://github.com/adamfabish/Reduction
Description: Reduction is a Python script that automatically summarizes a text by extracting the sentences deemed most important.
Example usage:
    from reduction import *
    reduction = Reduction()
    text = …