Open Source Deep Learning Project: mxnet

MXNet: Flexible and Efficient Library for Deep Learning

Project Website:

Github Link:


MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to mix symbolic and imperative programming to maximize efficiency and productivity. At its core is a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer on top of that makes symbolic execution fast and memory efficient. The library is portable and lightweight, and it scales to multiple GPUs and multiple machines.
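The difference between the two styles can be sketched in plain Python. This is a conceptual illustration of eager versus declared-then-executed computation, not the MXNet API:

```python
# Imperative style: each statement executes immediately.
a = 2
b = a * 3        # computed right away
c = b + 1        # computed right away

# Symbolic style: first declare the computation as a graph of
# operations, then execute the graph as a whole, which gives a
# scheduler the chance to optimize and parallelize it.
graph = [('mul', 'a', 3, 'b'),   # b = a * 3
         ('add', 'b', 1, 'c')]   # c = b + 1

def execute(graph, inputs):
    env = dict(inputs)
    for op, src, const, dst in graph:
        env[dst] = env[src] * const if op == 'mul' else env[src] + const
    return env

result = execute(graph, {'a': 2})
assert result['c'] == c == 7
```

Because the symbolic graph is known in full before execution, independent operations can be scheduled in parallel, which is what MXNet's dependency scheduler does automatically.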

MXNet is more than a deep learning project. It is also a collection of blueprints and guidelines for building deep learning systems, and a source of interesting insights into DL systems for hackers.


Design notes providing useful insights that can be reused by other DL projects
Flexible configuration for arbitrary computation graph
Mix and match good flavours of programming to maximize flexibility and efficiency
Lightweight, memory efficient and portable to smart devices
Scales up to multiple GPUs and distributed settings with automatic parallelism
Support for Python, R, C++ and Julia
Cloud-friendly and directly compatible with S3, HDFS, and Azure

Open Source Deep Learning Project: TensorFlow

TensorFlow is an Open Source Software Library for Machine Intelligence

Project Website:

Github Link:


TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google’s Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.
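The nodes-and-edges model can be illustrated with a toy dataflow graph in Python. This is a conceptual sketch of the idea, not TensorFlow's actual API:

```python
# Each node holds an operation and the upstream nodes whose
# outputs it consumes; those input links are the graph's edges.
class Node:
    def __init__(self, op, inputs=()):
        self.op = op          # callable applied to the input values
        self.inputs = inputs  # upstream nodes

    def evaluate(self):
        # Recursively evaluate upstream nodes, then apply this op.
        return self.op(*(n.evaluate() for n in self.inputs))

# Build a graph computing (x + y) * y with x = 2, y = 3.
x = Node(lambda: 2.0)
y = Node(lambda: 3.0)
s = Node(lambda a, b: a + b, (x, y))
prod = Node(lambda a, b: a * b, (s, y))

print(prod.evaluate())  # 15.0
```

In a real system the graph is not evaluated recursively like this; it is compiled and dispatched to CPUs or GPUs, but the node/edge structure is the same.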

Open Source Deep Learning Project: Blocks

Blocks: A Theano framework for building and training neural networks

Project Website: None

Github Link:


Blocks is a framework that helps you build neural network models on top of Theano. Currently it supports and provides:

Constructing parametrized Theano operations, called “bricks”
Pattern matching to select variables and bricks in large models
Algorithms to optimize your model
Saving and resuming of training
Monitoring and analyzing values during training progress (on the training set as well as on test sets)
Application of graph transformations, such as dropout
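The idea of a parametrized “brick” can be sketched without Theano. The class and method names here are hypothetical; real Blocks bricks wrap Theano operations, but the shape of the abstraction is the same:

```python
# A "brick" bundles parameters with the operation that applies them,
# so one component can be allocated once and reused on many inputs.
class LinearBrick:
    def __init__(self, W, b):
        self.W = W  # input_dim x output_dim weight matrix
        self.b = b  # output_dim bias vector

    def apply(self, x):
        # y_j = sum_i x_i * W[i][j] + b_j
        return [sum(xi * row[j] for xi, row in zip(x, self.W)) + bj
                for j, bj in enumerate(self.b)]

brick = LinearBrick(W=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], b=[0.5, 0.5])
out = brick.apply([1.0, 2.0, 3.0])
print(out)  # [4.5, 5.5]
```

Because the parameters live inside the brick, the same component can be selected, saved, and transformed by the framework, which is what enables the pattern matching and graph transformations listed above.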
In the future we also hope to support:

Dimension, type and axes-checking
See Also:
Fuel, the data processing engine developed primarily for Blocks.
Blocks-examples for maintained examples of scripts using Blocks.
Blocks-extras for semi-maintained additional Blocks components.

Open Source Deep Learning Project: Pylearn2

Pylearn2: A machine learning research library

Project Website:

Github Link:


Pylearn2 is a machine learning library. Most of its functionality is built on top of Theano. This means you can write Pylearn2 plugins (new models, algorithms, etc) using mathematical expressions, and Theano will optimize and stabilize those expressions for you, and compile them to a backend of your choice (CPU or GPU).

Pylearn2 Vision
Researchers add features as they need them. We avoid getting bogged down by too much top-down planning in advance.
A machine learning toolbox for easy scientific experimentation.
All models/algorithms published by the LISA lab should have reference implementations in Pylearn2.
Pylearn2 may wrap other libraries such as scikit-learn when this is practical.
Pylearn2 differs from scikit-learn in that Pylearn2 aims to provide great flexibility and make it possible for a researcher to do almost anything, while scikit-learn aims to work as a “black box” that can produce good results even if the user does not understand the implementation.
Dataset interface for vectors, images, video, …
Small framework providing everything needed for typical MLP/RBM/SDA/convolution experiments.
Easy reuse of Pylearn2 sub-components.
Using one sub-component of the library does not force you to use or learn all of the other sub-components.
Support cross-platform serialization of learned models.
Remain approachable enough to be used in the classroom (IFT6266 at the University of Montreal).

Open Source Deep Learning Project: Torch


Project Website:

Github Link:


What is Torch?
Torch is a scientific computing framework with wide support for machine learning algorithms that puts GPUs first. It is easy to use and efficient, thanks to an easy and fast scripting language, LuaJIT, and an underlying C/CUDA implementation.

A summary of core features:

a powerful N-dimensional array
lots of routines for indexing, slicing, transposing, …
amazing interface to C, via LuaJIT
linear algebra routines
neural network, and energy-based models
numeric optimization routines
Fast and efficient GPU support
Embeddable, with ports to iOS, Android and FPGA backends
Why Torch?
The goal of Torch is to have maximum flexibility and speed in building your scientific algorithms while making the process extremely simple. Torch comes with a large ecosystem of community-driven packages in machine learning, computer vision, signal processing, parallel processing, image, video, audio and networking among others, and builds on top of the Lua community.

At the heart of Torch are the popular neural network and optimization libraries which are simple to use, while having maximum flexibility in implementing complex neural network topologies. You can build arbitrary graphs of neural networks, and parallelize them over CPUs and GPUs in an efficient manner.

Using Torch
Start with our Getting Started guide to download and try Torch yourself. Torch is open-source, so you can also start with the code on the GitHub repo.

Torch is constantly evolving: it is already used within Facebook, Google, Twitter, NYU, IDIAP, Purdue and several other companies and research labs.

Open Source Deep Learning Project: Theano


Project Website:

Github Link:


Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Theano features:

tight integration with NumPy – Use numpy.ndarray in Theano-compiled functions.
transparent use of a GPU – Perform data-intensive calculations up to 140x faster than on a CPU (float32 only).
efficient symbolic differentiation – Theano does your derivatives for functions with one or many inputs.
speed and stability optimizations – Get the right answer for log(1+x) even when x is really tiny.
dynamic C code generation – Evaluate expressions faster.
extensive unit-testing and self-verification – Detect and diagnose many types of errors.
Theano has been powering large-scale computationally intensive scientific investigations since 2007. But it is also approachable enough to be used in the classroom (University of Montreal’s deep learning/machine learning classes).
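The log(1+x) point is easy to demonstrate in plain Python: for tiny x the naive formula loses all precision, and rewriting it to a stable form is exactly the kind of optimization Theano applies to your expressions automatically:

```python
import math

x = 1e-20
naive = math.log(1 + x)   # 1 + 1e-20 rounds to exactly 1.0, so this is 0.0
stable = math.log1p(x)    # computes log(1+x) accurately: ~1e-20

assert naive == 0.0
assert abs(stable - 1e-20) < 1e-30
```

With Theano you would simply write log(1 + x); the stability rewrite happens during graph optimization rather than in user code.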

Text Processing Book: Speech and Language Processing (3rd ed. draft)

Speech and Language Processing (3rd ed. draft)

Project Website:



The chapters, their slides, and their relation to the 2nd edition:

1: Introduction [Ch. 1 in 2nd ed.]
2: Regular Expressions, Text Normalization, and Edit Distance (slides: Text [pptx] [pdf]; Edit Distance [pptx] [pdf]) [Ch. 2 and parts of Ch. 3 in 2nd ed.]
3: Finite State Transducers
4: N-Grams (slides: LM [pptx] [pdf]) [Ch. 4 in 2nd ed.]
5: Neural Language Models and RNNs
6: Spelling Correction and the Noisy Channel (slides: Spelling [pptx] [pdf]) [expanded from pieces in Ch. 5 in 2nd ed.]
7: Classification: Naive Bayes, Logistic Regression, Sentiment (slides: NB [pptx] [pdf]; Sentiment [pptx] [pdf]) [new in this edition]
8: Hidden Markov Models [Ch. 6 in 2nd ed.]
9: Part-of-Speech Tagging [Ch. 5 in 2nd ed.]
10: Formal Grammars of English
11: Syntactic Parsing
12: Statistical Parsing
13: Dependency Parsing
14: Language and Complexity
15: Vector Semantics (slides: Vector [pptx] [pdf]) [expanded from parts of Ch. 19 and 20 in 2nd ed.]
16: Semantics with Dense Vectors (slides: Dense Vector [pptx] [pdf]) [new in this edition]
18: Computing with Word Senses: WSD and WordNet (slides: Intro, Sim [pptx] [pdf]; WSD [pptx] [pdf]) [expanded from parts of Ch. 19 and 20 in 2nd ed.]
21: Lexicons for Sentiment and Affect Extraction (slides: SentLex [pptx] [pdf]) [new in this edition]
16: The Representation of Sentence Meaning
17: Computational Semantics
??: Neural Models of Sentence Meaning (LSTM, CNN, etc.)
20: Information Extraction [Ch. 22 in 2nd ed.]
22: Semantic Role Labeling and Argument Structure (slides: SRL [pptx] [pdf]; Select [pptx] [pdf]) [expanded from parts of Ch. 19 and 20 in 2nd ed.]
23: Coreference Resolution and Entity Linking
24: Discourse Coherence
25: Summarization
26: Machine Translation
27: Question Answering
28: Conversational Agents
29: Speech Recognition
30: Speech Synthesis

About the Author
Dan Jurafsky is an associate professor in the Department of Linguistics and, by courtesy, in the Department of Computer Science at Stanford University. Previously, he was on the faculty of the University of Colorado, Boulder, in the Linguistics and Computer Science departments and the Institute of Cognitive Science. He was born in Yonkers, New York, and received a B.A. in Linguistics in 1983 and a Ph.D. in Computer Science in 1992, both from the University of California at Berkeley. He received the National Science Foundation CAREER award in 1998 and the MacArthur Fellowship in 2002. He has published over 90 papers on a wide range of topics in speech and language processing.

James H. Martin is a professor in the Department of Computer Science and in the Department of Linguistics, and a fellow in the Institute of Cognitive Science at the University of Colorado at Boulder. He was born in New York City, received a B.S. in Computer Science from Columbia University in 1981 and a Ph.D. in Computer Science from the University of California at Berkeley in 1988. He has authored over 70 publications in computer science including the book A Computational Model of Metaphor Interpretation.

Open Source Text Processing Project: Festival

The Festival Speech Synthesis System

Project Website:
Github Link: None


Festival offers a general framework for building speech synthesis systems as well as including examples of various modules. As a whole it offers full text-to-speech through a number of APIs: from the shell level, through a Scheme command interpreter, as a C++ library, from Java, and via an Emacs interface. Festival is multi-lingual (currently English (British and American) and Spanish), though English is the most advanced. Other groups release new languages for the system, and full tools and documentation for building new voices are available through Carnegie Mellon’s FestVox project.

The system is written in C++ and uses the Edinburgh Speech Tools Library for low-level architecture, and has a Scheme (SIOD) based command interpreter for control. Documentation is given in the FSF texinfo format, which can generate a printed manual, info files and HTML.

Festival is free software. Festival and the speech tools are distributed under an X11-type licence allowing unrestricted commercial and non-commercial use alike.

Open Source Text Processing Project: PyJulius

PyJulius: Python interface to Julius speech recognition engine

Project Website:
Github Link:


pyjulius provides a simple interface to connect to a julius module server.

First you will need to run julius with the -module option (documentation here or man julius). Julius will wait for a client to connect; this is what Client does, in a threaded way.

Let’s just write a simple program that will print whatever the julius server sends until you press CTRL+C:

#!/usr/bin/env python
import sys
import pyjulius
import Queue

# Initialize and try to connect
client = pyjulius.Client('localhost', 10500)
try:
    client.connect()
except pyjulius.ConnectionError:
    print 'Start julius as module first!'
    sys.exit(1)

# Start listening to the server
client.start()
try:
    while 1:
        try:
            result = client.results.get(False)
        except Queue.Empty:
            continue
        print repr(result)
except KeyboardInterrupt:
    print 'Exiting...'
    client.stop()  # send the stop signal
    client.join()  # wait for the thread to die
    client.disconnect()  # disconnect from julius
If you are only interested in recognitions, wait for an instance of Sentence in the queue:

if isinstance(result, pyjulius.Sentence):
    print 'Sentence "%s" recognized with score %.2f' % (result, result.score)
If you do not want Client to interpret the raw XML Element, you can set the modelize attribute to False.

If you encounter any encoding issues, have a look at the -charconv option of julius and set Client.encoding to the right value.

Open Source Text Processing Project: eSpeak

eSpeak text to speech

Project Website:
Github Link: None


eSpeak is a compact open source software speech synthesizer for English and other languages, for Linux and Windows.
eSpeak uses a “formant synthesis” method. This allows many languages to be provided in a small size. The speech is clear, and can be used at high speeds, but is not as natural or smooth as larger synthesizers which are based on human speech recordings.

eSpeak is available as:

A command line program (Linux and Windows) to speak text from a file or from stdin.
A shared library version for use by other programs. (On Windows this is a DLL).
A SAPI5 version for Windows, so it can be used with screen-readers and other programs that support the Windows SAPI5 interface.
eSpeak has been ported to other platforms, including Android, Mac OSX and Solaris.
Includes different Voices, whose characteristics can be altered.
Can produce speech output as a WAV file.
SSML (Speech Synthesis Markup Language) is supported (not complete), and also HTML.
Compact size. The program and its data, including many languages, totals about 2 Mbytes.
Can be used as a front-end to MBROLA diphone voices, see mbrola.html. eSpeak converts text to phonemes with pitch and length information.
Can translate text into phoneme codes, so it could be adapted as a front end for another speech synthesis engine.
Potential for other languages. Several are included in varying stages of progress. Help from native speakers for these or other languages is welcome.
Development tools are available for producing and tuning phoneme data.
Written in C.