Overview of GPU Projects

This is a list of GPU projects that students can choose from. Some of the projects have also been contributed by other professors in CCIS.

Right now, not all GPU projects are fully defined. If there is interest in a particular project, we will refine its description. In particular, I also asked professors for interesting general algorithms. We can then examine those general algorithms to identify interesting computational micro-kernels that we will implement.

As an example, recall the breadth-first search algorithm. The kernel requires a hash array to detect if we have previously seen a new state. Hash arrays can be implemented to use streaming access by using delayed duplicate detection. In this case, we wait until we have many hash requests. We then apply the hash function to find hash indices. We sort based on hash index. We then stream the hash requests and the hash array, both in hash index order, and compare the two to detect which hash requests refer to previously seen states.

Where a faculty member provides only the higher-level algorithm, it will be our job to identify the micro-kernel, and reformulate the micro-kernel in a form suitable for streaming access.

GPU Projects

Multi-Dimensional Range Search (contributed by Prof. Donghui Zhang)
Traditionally, multi-dimensional range search are supported through multi-dimensional index structures such as the R-tree. Such indices typically suffer from the "curse of dimensionality". For instance, the R-tree performs poorer than linear scan for range search when the dimensionality is more than 12. Furthermore, such indices use random access. There are two ways to support multi-dimensional range search for high dimensions, both utilizing streaming access.

(a) X-tree. The X-tree combines the R-tree with array. Every node can either be the root of a sub-tree or be an array of all elements in the sub-tree. As the dimensionality increases, the X-tree nodes are more likely to be arrays.

(b) Space filling curves. Using space filling curves such as the Hilbert curve and the Z-ordering, one can organize a multi-dimensional dataset into an one-dim array (or sequential file), and a search region can be mapped to one or multiple ranges of curve values. To retrieve large chunks of records from this array using the ranges can utilize streaming access.

Nearest Neighbor Search (contributed by Prof. Donghui Zhang)
Traditionally, the R-tree are used to support NN search given an arbitrary query point. The idea is to perform best-first search: keep expanding the index entries that are the closest to the query point, till an object stored in some leaf node is revealed. It is a random-access method.

It will be interesting to implement the NN search algorithm when the objects are indexed using space filling curves.

Update the Space Filling Curve File (contributed by Prof. Donghui Zhang)
When the objects are organized into an array or file in increasing order of space filling curves, it will be interesting to implement objects updates. It will be really inefficient to reconstruct the whole file for each new object insertion. One idea is to buffer the updates, e.g. whenever there are 100 new updates, integrate them into the original file via reconstruction. Another idea of avoiding reconstruction is to leave some space in the file (so that a few insertions in the middle can be applied in-place and does notrewrite the whole file). Yet another idea is to break the file into multiple pieces. It will be interesting to design a systematic way of organizing the records which, probably by incorporating all these ideas and possibly some other ideas, efficiently supports updates.
Kernels for Face and Fingerprint Recognition (contributed by Merav Algon)
"Complete discriminant evaluation and feature extraction in kernel space for face recognition" by Xudong Jian, Bappaditya Mandal and Alex Kot. (Also, see: Manhua Liu, Xudong Jiang, Alex ChiChung Kot: Efficient fingerprint search based on database clustering. Pattern Recognition 40(6): 1793-1803 (2007)). We have an internal copy of the first paper. For copyright reasons, I can't post it publicly. You'll find it in the course directory (if you have a CCIS account), and you can always get it from a TA with a CCIS account. I recommend asking Kapil Arya (username same as first name, but lower case) at ccis.neu.edu.