textprocessing | TextProcessing | A Text Processing Portal for Humans

Getting started with WordNet

Posted on May 12, 2017 by textprocessingMay 13, 2017

About WordNet WordNet is a lexical database for English: WordNet® is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means … Continue reading →

A Beginner’s Guide to TextBlob

Posted on April 7, 2017 by textprocessingApril 7, 2017

About TextBlob Open Source Text Processing Project: TextBlob Install TextBlob Install the latest TextBlob on Ubuntu 16.04.1 LTS: textprocessing@ubuntu:~$ sudo pip install -U textblob Collecting textblob Downloading textblob-0.12.0-py2.py3-none-any.whl (631kB) Requirement already up-to-date: nltk>=3.1 in /usr/local/lib/python2.7/dist-packages (from textblob) Requirement already up-to-date: … Continue reading →

Getting started with Word2Vec

Posted on March 8, 2017 by textprocessingMarch 19, 2017

1. Source by Google Project with Code: Word2Vec Blog: Learning the meaning behind words Paper: [1] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. In Proceedings of Workshop at ICLR, 2013. … Continue reading →

Getting started with NLTK

Posted on February 19, 2017 by textprocessingFebruary 19, 2017

About NLTK Open Source Text Processing Project: NLTK Install NLTK 1. Install the latest NLTK pakage on Ubuntu 16.04.1 LTS: textprocessing@ubuntu:~$ sudo pip install -U nltk Collecting nltk Downloading nltk-3.2.2.tar.gz (1.2MB) 35% |███████████▍ | 409kB 20.8MB/s eta 0:00:0 …… 100% … Continue reading →

Open Source Text Processing Project: Wapiti

Posted on December 21, 2016 by textprocessingDecember 21, 2016

Wapiti – A simple and fast discriminative sequence labelling toolkit Project Website: https://wapiti.limsi.fr/ Github Link: https://github.com/Jekub/Wapiti Description Wapiti is a very fast toolkit for segmenting and labeling sequences with discriminative models. It is based on maxent models, maximum entropy Markov … Continue reading →

Open Source Text Processing Project: segtok

Posted on December 19, 2016 by textprocessingDecember 19, 2016

segtok: sentence segmentation and word tokenization tools Project Website: http://fnl.es/segtok-a-segmentation-and-tokenization-library.html Github Link: https://github.com/fnl/segtok Description A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic features. The segtok package provides two modules, segtok.segmenter and segtok.tokenizer. The segmenter provides functionality for … Continue reading →

Open Source Text Processing Project: nlp-with-ruby

Posted on December 13, 2016 by textprocessingDecember 13, 2016

nlp-with-ruby: Awesome NLP with Ruby Project Website: None Github Link: https://github.com/arbox/nlp-with-ruby Description This curated list comprises awesome resources, libraries, information sources about computational processing of texts in human languages with Ruby. That field is often referred to as NLP, Computational … Continue reading →

Open Source Text Processing Project: textacy

Posted on November 7, 2016 by textprocessingNovember 7, 2016

textacy: higher-level NLP built on spaCy Project Website: https://textacy.readthedocs.io Github Link: https://github.com/chartbeat-labs/textacy Description textacy is a Python library for performing higher-level natural language processing (NLP) tasks, built on the high-performance spaCy library. With the basics — tokenization, part-of-speech tagging, dependency … Continue reading →

Open Source Text Processing Project: vivekn sentiment

Posted on October 28, 2016 by textprocessingOctober 28, 2016

Sentiment analysis using machine learning techniques Project Website: http://sentiment.vivekn.com/ Github Link: https://github.com/vivekn/sentiment Description Sentiment analysis using machine learning techniques. Check info.py for the training and testing code. A demo of the tool is available here Refer this paper for more … Continue reading →

Open Source Deep Learning Project: Paddle

Posted on August 30, 2016 by textprocessingAugust 30, 2016

Paddle: PArallel Distributed Deep LEarning Project Website: http://www.paddlepaddle.org/ Github Link: https://github.com/baidu/Paddle Description PaddlePaddle (PArallel Distributed Deep LEarning) is an easy-to-use, efficient, flexible and scalable deep learning platform, which is originally developed by Baidu scientists and engineers for the purpose of … Continue reading →