Open Source Text Processing Project: Jieba

Jieba: Chinese text segmentation Project Website: None Github Link: https://github.com/fxsjy/jieba Description “Jieba” (Chinese for “to stutter”) Chinese text segmentation: built to be the best Python Chinese word segmentation module. Features Support three types of segmentation mode: Accurate Mode attempts to … Continue reading

Open Source Text Processing Project: Stanford Log-linear Part-Of-Speech Tagger

Stanford Log-linear Part-Of-Speech Tagger Project Website: http://nlp.stanford.edu/software/tagger.shtml Github Link: None Description A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads text in some language and assigns parts of speech to each word (and other token), such as … Continue reading

Open Source Text Processing Project: Stanford CoreNLP

Stanford CoreNLP – a suite of core NLP tools Project Website: http://stanfordnlp.github.io/CoreNLP/ Github Link: https://github.com/stanfordnlp/CoreNLP Description Stanford CoreNLP provides a set of natural language analysis tools. It can give the base forms of words, their parts of speech, whether they … Continue reading