CUDAMat: Python module for performing basic dense linear algebra computations on the GPU using CUDA

Project Website: None

Github Link:

**Description**

The aim of the cudamat project is to make it easy to perform basic matrix calculations on CUDA-enabled GPUs from Python. cudamat provides a Python matrix class that performs calculations on a GPU. At present, the operations our GPU matrix class supports include:

* Easy conversion to and from instances of numpy.ndarray.
* Limited slicing support.
* Matrix multiplication and transpose.
* Elementwise addition, subtraction, multiplication, and division.
* Elementwise application of exp, log, pow, sqrt.
* Summation, maximum and minimum along rows or columns.
* Conversion of CUDA errors into Python exceptions.
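For readers who know NumPy, the operations listed above correspond roughly to the CPU-side calls below. This is a NumPy-only sketch of the semantics, not cudamat code: cudamat's own method names and in-place behaviour may differ, and all variable names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Offset by 1.0 so every entry is positive and log/sqrt are well defined.
x = rng.random((4, 3)).astype(np.float32) + 1.0
y = rng.random((4, 3)).astype(np.float32) + 1.0

# Matrix multiplication and transpose.
gram = x.T @ y                 # shape (3, 3)

# Elementwise arithmetic.
total = x + y
ratio = x / y

# Elementwise functions.
logs = np.log(x)
roots = np.sqrt(x)
squares = np.power(x, 2)

# Reductions along an axis: axis=0 collapses the rows,
# leaving one value per column; axis=1 leaves one value per row.
col_sums = x.sum(axis=0)       # shape (3,)
row_max = x.max(axis=1)        # shape (4,)

# Slicing.
first_two_cols = x[:, :2]      # shape (4, 2)
```

The axis convention is the part most worth checking when porting code: summing with axis=0 yields one value per column, which is what the example further down relies on.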

The current feature set of cudamat is biased towards features needed for implementing some common machine learning algorithms. We have included implementations of feedforward neural networks and restricted Boltzmann machines in the examples that come with cudamat.

Example:

    import numpy as np
    import cudamat as cm

    cm.cublas_init()

    # create two random matrices and copy them to the GPU
    a = cm.CUDAMatrix(np.random.rand(32, 256))
    b = cm.CUDAMatrix(np.random.rand(256, 32))

    # perform calculations on the GPU
    c = cm.dot(a, b)
    d = c.sum(axis=0)

    # copy d back to the host (CPU) and print
    print(d.asarray())
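One way to sanity-check a GPU result such as d is to repeat the computation on the CPU with NumPy and compare within a tolerance. The sketch below is NumPy-only and runs without a GPU. `cpu_reference` and `matches` are hypothetical helper names, and the float32 cast and the tolerance assume the GPU side computes in single precision.

```python
import numpy as np

def cpu_reference(a_host, b_host):
    """Column sums of a_host @ b_host, computed on the CPU in float32."""
    a32 = np.asarray(a_host, dtype=np.float32)
    b32 = np.asarray(b_host, dtype=np.float32)
    # keepdims=True keeps a 2-D (1, n) result rather than a 1-D vector.
    return (a32 @ b32).sum(axis=0, keepdims=True)

def matches(gpu_result, a_host, b_host, rtol=1e-4):
    """Return True if gpu_result agrees with the CPU reference within rtol."""
    expected = cpu_reference(a_host, b_host)
    candidate = np.asarray(gpu_result).reshape(expected.shape)
    return bool(np.allclose(candidate, expected, rtol=rtol))

a_host = np.random.rand(32, 256)
b_host = np.random.rand(256, 32)

# Stand-in for the array copied back from the GPU: a float64 CPU computation.
stand_in = (a_host @ b_host).sum(axis=0)
print(matches(stand_in, a_host, b_host))
```

To use this with the example, the host arrays would need to be kept in variables before being copied to the GPU, and the array returned by d.asarray() passed as gpu_result.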