Open Source Text Processing Project: Snowball

Snowball

Project Website:

GitHub Link:

Description

Snowball is a small string processing language designed for creating stemming algorithms for use in Information Retrieval. This site describes Snowball, and presents several useful stemmers which have been implemented using it.

The Snowball compiler translates a Snowball script into another language – currently ISO C, Java and Python are supported.

Since it effectively provides a ‘suffix STRIPPER GRAMmar’, I had toyed with the idea of calling it ‘strippergram’, but good sense has prevailed, and so it is ‘Snowball’ named as a tribute to SNOBOL, the excellent string handling language of Messrs Farber, Griswold, Poage and Polonsky from the 1960s.

Martin Porter
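
To make the idea of suffix stripping concrete, here is a minimal sketch in Python. It is not written in Snowball, and it is not the Porter algorithm or any stemmer shipped with the project; the rule table, the function name `strip_suffix`, and the `min_stem` parameter are all assumptions made up for illustration. It only shows the general shape of a stemmer: try an ordered list of suffix rules and rewrite the first one that matches.

```python
# Illustrative suffix rules as (suffix, replacement), tried in order.
# Hypothetical rules, far simpler than any real Snowball stemmer.
RULES = [
    ("sses", "ss"),
    ("ies", "i"),
    ("ing", ""),
    ("ed", ""),
    ("s", ""),
]


def strip_suffix(word: str, min_stem: int = 3) -> str:
    """Apply the first matching rule that leaves at least min_stem characters."""
    word = word.lower()
    for suffix, replacement in RULES:
        if word.endswith(suffix):
            stem = word[: len(word) - len(suffix)] + replacement
            # Skip rules that would cut the word too short (e.g. "sing" -> "s").
            if len(stem) >= min_stem:
                return stem
    # No rule applied: return the (lowercased) word unchanged.
    return word


if __name__ == "__main__":
    for w in ["caresses", "ponies", "jumping", "jumped", "cats", "sing"]:
        print(w, "->", strip_suffix(w))
```

Real stemmers add conditions on what precedes the suffix and apply several rule groups in sequence, which is the kind of logic a Snowball script is meant to express compactly.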

