Documentation for gnumpy


Getting started

Use "import gnumpy", and/or "from gnumpy import <what you put here is up to you>". Module gnumpy contains the class garray, which behaves much like numpy.ndarray. Module gnumpy also contains functions like tile() and rand(), which behave like their numpy counterparts except that they deal with gnumpy.garray instances instead of numpy.ndarray instances. Gnumpy imports Vlad's cudamat in order to use a GPU board, or, failing that, imports Ilya's npmat in order to run everything in simulation mode on the CPU. Don't touch cudamat (or gpu_lock) yourself; gnumpy takes care of that. The first time you create a garray instance, gnumpy claims a GPU board.
Switching from numpy to gnumpy

For some things you'll likely still use numpy
Avoiding unintended use of numpy arrays
Advanced features

gnumpy.status() will show some basic information about the internal state of gnumpy. Gnumpy arrays can safely be used with pickle/cPickle.

GPU memory usage debugging
Running on CPU vs. on GPU
Logistic function
Telling gnumpy which GPU board to use
More time-consuming error checking
The special values inf and nan
Known issues

As of 2010-05-21, cudamat has some bugs in dealing with really large arrays. Not all of those have been worked around in gnumpy.
As of 2010-05-16, cudamat has some bugs in dealing with arrays of size zero. Not all of those have been worked around in gnumpy.
Creating a numpy array from a tuple that includes garrays is terribly slow, and interrupting it results in a segfault.
Some numpy features are not (yet) implemented. You'll see a NotImplemented exception if you try to use them. Please let me know if this happens, and I'll try to implement that specific feature.
If you ever see an error message from cudamat, I consider that a bug and I would like to know about it.

Differences from numpy ndarrays


This section is only relevant if you like editing your arrays (using operators like += or slice assignment). In numpy, many operations don't copy the data to a new array, but only make a reference, a.k.a. an alias. In gnumpy, there are three operations that create an alias. Note: slice assignment "A[:,3]=B" has nothing to do with aliasing, and works exactly like in numpy.
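Since slice assignment works exactly like in numpy, the distinction can be illustrated with numpy itself: assigning into a slice writes values into the existing array's memory, whereas taking a row slice creates an alias. (The array contents below are arbitrary illustrative values.)

```python
import numpy as np  # gnumpy's slice-assignment semantics match numpy's, per the text

A = np.zeros((4, 5), dtype=np.float32)
B = np.arange(4, dtype=np.float32)

A[:, 3] = B          # slice assignment: copies B's values into A's memory
B[0] = 99.0          # later changes to B do not affect A (no aliasing happened)
print(A[0, 3])       # 0.0, not 99.0

R = A[1]             # a row slice, by contrast, IS an alias (a view)
R[0] = 7.0
print(A[1, 0])       # 7.0: writing through the alias changed A
```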
All data is of type float32.

However, there are some boolean operators. Those use 1.0 to represent True, and 0.0 to represent False. Note that 32 bits might not be enough for finite difference gradient approximations; to do those, you may have to run in simulation mode, with high numerical precision. Because there is only one data type, some comparisons to non-number data, like "A < None", might raise an exception (in numpy they don't).
Not all numpy operations are supported yet.

For example, cosines and argmax are not yet supported. Simple slice assignment is supported, but slice assignment using a stride other than 1 is not. Changing array shape by "A.shape = (3,4)" should be avoided; instead use "A = A.reshape((3,4))". When you try to use a feature that's missing, either you won't find a method for it, or your program will crash with a NotImplemented exception that describes which feature is missing. However, if you need (read: "would like") one of the missing features to be implemented, let me know and maybe you'll have it an hour later. Do realize, though, that a gpu cannot do everything faster than the cpu.
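The recommended reshape pattern has the same calling convention as numpy's, so it can be shown with numpy (used here purely for illustration):

```python
import numpy as np  # illustration only; garray's reshape is called the same way

A = np.arange(12, dtype=np.float32)
A = A.reshape((3, 4))   # preferred over assigning to A.shape
print(A.shape)          # (3, 4)
```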

Differences from cudamat

Some features behave more like numpy.ndarray:

Gnumpy supports "A=B+C" notation; "A" will be a newly created array.
garray objects have arbitrary dimensionality, and broadcasting for binary operators.
Data is stored in row-major order ("C" order). This means that row slices are cheaper than column slices (row slices are simply aliases).

Here, technically speaking, "row" means the first axis.
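Both points, row-major layout with cheap row slices and broadcasting for binary operators, can be illustrated with numpy, whose semantics gnumpy mirrors here (the array sizes below are arbitrary):

```python
import numpy as np  # gnumpy mirrors these numpy semantics, per the text above

A = np.zeros((3, 4), dtype=np.float32)   # stored in row-major ("C") order

row = A[1]        # row slice: contiguous memory, simply an alias of A's storage
row[:] = 7.0
print(A[1])       # [7. 7. 7. 7.]: the slice shares A's storage

# broadcasting for binary operators: a (3,4) array plus a (4,) array
bias = np.arange(4, dtype=np.float32)
B = A + bias      # bias is broadcast across all 3 rows
print(B.shape)    # (3, 4)
```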

Execution speed

Gnumpy is reasonably (but not maximally) optimized. If you profile a program that uses gnumpy and find an opportunity for an efficiency improvement, PLEASE let me know! If you make a program run much faster by using cudamat directly, PLEASE let me know.

Some things are slow on a gpu

Rule of thumb: anything that's simple is fast on the gpu; complicated operations may well be faster on the cpu. Second rule of thumb: if you care about execution speed and want to make a wise choice, just try both alternatives, with a stopwatch.
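A stopwatch comparison can be as simple as the sketch below. It is shown with numpy as a stand-in for the CPU side; the array size and iteration count are arbitrary, and with gnumpy you would time the garray version of the same loop in the same way.

```python
import time
import numpy as np   # CPU stand-in; substitute garray code to time the GPU side

A = np.random.rand(1000, 1000).astype(np.float32)

start = time.time()
for _ in range(20):
    B = A * A + A            # the operation whose speed you're wondering about
elapsed = time.time() - start
print("20 iterations took %.4f s" % elapsed)
```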