Getting started with NLTK

About NLTK Open Source Text Processing Project: NLTK Install NLTK 1. Install the latest NLTK pakage on Ubuntu 16.04.1 LTS: textprocessing@ubuntu:~$ sudo pip install -U nltk Collecting nltk Downloading nltk-3.2.2.tar.gz (1.2MB) 35% |███████████▍ | 409kB 20.8MB/s eta 0:00:0 …… 100% … Continue reading

Open Source Text Processing Project: segtok

segtok: sentence segmentation and word tokenization tools Project Website: http://fnl.es/segtok-a-segmentation-and-tokenization-library.html Github Link: https://github.com/fnl/segtok Description A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic features. The segtok package provides two modules, segtok.segmenter and segtok.tokenizer. The segmenter provides functionality for … Continue reading

Open Source Text Processing Project: OpenNLP

Apache OpenNLP Project Website: https://opennlp.apache.org/ Github Link: None Description The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, … Continue reading