Open Source Text Processing Project: Stanford Word Segmenter

Stanford Word Segmenter
Project Website: http://nlp.stanford.edu/software/segmenter.shtml
Github Link: None
Description: Tokenization of raw text is a standard pre-processing step for many NLP tasks. For English, tokenization usually involves punctuation splitting and separation of some affixes like possessives. Other languages require …
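
To make the English case in the excerpt above concrete (punctuation splitting plus separation of possessives), here is a minimal sketch in Python. It is not the Stanford Word Segmenter's method or API: the regular expression and the names `TOKEN_RE` and `tokenize` are assumptions chosen for illustration, and the pattern handles only the possessive 's, not other clitics or contractions.

```python
import re

# Hypothetical pattern, for illustration only (not Stanford's implementation).
# Alternatives are tried left to right at each position:
#   's\b     the possessive clitic, kept as its own token
#   \w+      a run of word characters
#   [^\w\s]  any single punctuation mark
TOKEN_RE = re.compile(r"'s\b|\w+|[^\w\s]")


def tokenize(text):
    """Split English text into word, possessive, and punctuation tokens."""
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(tokenize("John's dog barked, loudly."))
```

A real tokenizer has to deal with many more cases (contractions such as "don't", abbreviations, URLs, numbers with separators), which is one reason dedicated tools like the one listed here exist.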

Open Source Text Processing Project: The Stanford Parser (A statistical parser)

The Stanford Parser: A statistical parser
Project Website: http://nlp.stanford.edu/software/lex-parser.shtml
Github Link: None
Description: A natural language parser is a program that works out the grammatical structure of sentences, for instance, which groups of words go together (as “phrases”) and which …

Open Source Text Processing Project: Stanford Named Entity Recognizer (NER)

Stanford Named Entity Recognizer (NER)
Project Website: http://nlp.stanford.edu/software/CRF-NER.shtml
Github Link: None
Description: Stanford NER is a Java implementation of a Named Entity Recognizer. Named Entity Recognition (NER) labels sequences of words in a text which are the names of things, …

Open Source Text Processing Project: Stanford Log-linear Part-Of-Speech Tagger

Stanford Log-linear Part-Of-Speech Tagger
Project Website: http://nlp.stanford.edu/software/tagger.shtml
Github Link: None
Description: A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads text in some language and assigns parts of speech to each word (and other token), such as …

Open Source Text Processing Project: Stanford CoreNLP

Stanford CoreNLP – a suite of core NLP tools
Project Website: http://stanfordnlp.github.io/CoreNLP/
Github Link: https://github.com/stanfordnlp/CoreNLP
Description: Stanford CoreNLP provides a set of natural language analysis tools. It can give the base forms of words, their parts of speech, whether they …

Open Source Text Processing Project: Pattern

Pattern
Project Website: http://www.clips.ua.ac.be/pattern
Github Link: https://github.com/clips/pattern
Description: Pattern is a web mining module for the Python programming language. It has tools for data mining (Google, Twitter and Wikipedia API, a web crawler, an HTML DOM parser), natural language processing …

Open Source Text Processing Project: MBSP

MBSP for Python
Project Website: http://www.clips.ua.ac.be/pages/MBSP
Description: MBSP is a text analysis system based on the TiMBL and MBT memory-based learning applications developed at CLiPS and ILK. It provides tools for Tokenization and Sentence Splitting, Part of Speech Tagging, …

Text Processing Course: Stanford Deep Learning for Natural Language Processing

Name: Deep Learning for Natural Language Processing
Website: http://cs224d.stanford.edu/
Description: Natural language processing (NLP) is one of the most important technologies of the information age. Understanding complex language utterances is also a crucial part of artificial intelligence. Applications of NLP …

Text Processing Course: Introduction to Natural Language Processing

Name: Introduction to Natural Language Processing
Website: https://www.coursera.org/course/nlpintro
Description: This course provides an introduction to the field of Natural Language Processing, including topics like Language Models, Parsing, Semantics, Question Answering, and Sentiment Analysis. …

Text Processing Course: Text Mining and Analytics

Name: Text Mining and Analytics
Website: https://www.coursera.org/course/textanalytics
Description: Explore algorithms for mining and analyzing big text data to discover interesting patterns, extract useful knowledge, and support decision making. This course will cover the major techniques for mining and analyzing text …