Research of Gene Cooperman

This page will always be obsolete at any given point in time, but sometimes more obsolete than other times. I'm happy to correspond. Here are my refereed publications and some examples of my software. I last updated this page in April 2008.

A brief description of my research follows.

I have a background from the 80s and 90s in computational algebra (especially computational group theory). This has served me well as a testbed for parallel computing. This work led to the TOP-C (Task Oriented Parallel C/C++) model of parallel computing. In a nutshell, it was designed from the start for commodity computing, and it emphasizes a task-oriented model with lazy updates of globally shared memory. This gives good latency tolerance, while providing end users with an exceptionally easy way to implement a generalization of task-oriented parallelism that goes beyond trivial parallelism. Some outgrowths of that work are my support for parallel GAP (Groups, Algorithms and Programming), parallel GCL (GNU Common LISP), and ParGeant4 (Geant4 is a million-line program developed at CERN and elsewhere, which is used to design and simulate experiments on the LHC, the largest collider in the world). My software page describes this software further.
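To make "a task-oriented model with lazy updates of globally shared memory" concrete, here is a minimal sketch in Python. It is not the TOP-C API: the names `run_tasks`, `do_task`, `check_result`, and the action strings are hypothetical, and the workers are simulated sequentially rather than run in parallel. The idea it illustrates is that each worker computes against a possibly stale snapshot of the shared data, and the master decides afterwards whether a result can be applied, ignored, or must be redone. The toy problem (greedily collecting pairwise coprime integers) is also an assumption chosen only for brevity.

```python
from collections import deque
from math import gcd

def run_tasks(task_inputs, do_task, check_result, shared, workers=3):
    """Sequentially simulate a master that farms tasks out to `workers` workers."""
    pending = deque(task_inputs)
    version = 0  # bumped on every update of the shared data
    while pending:
        # Hand out one batch; every worker computes against the same snapshot.
        batch = [pending.popleft() for _ in range(min(workers, len(pending)))]
        snapshot, snapshot_version = list(shared), version
        results = [(task, do_task(task, snapshot)) for task in batch]
        # The master inspects results one at a time and decides what to do.
        for task, result in results:
            up_to_date = snapshot_version == version
            action = check_result(result, up_to_date)
            if action == "UPDATE":
                shared.append(task)  # toy update rule, hard-coded for this sketch
                version += 1
            elif action == "REDO":
                pending.appendleft(task)  # snapshot was stale: recompute later
    return shared

def is_coprime_to_all(n, chosen):
    """Worker-side task: test n against the (possibly stale) snapshot."""
    return all(gcd(n, c) == 1 for c in chosen)

def check(result, up_to_date):
    """Master-side decision for one task result."""
    if not result:
        return "NO_ACTION"  # a rejection stays valid: the shared list only grows
    return "UPDATE" if up_to_date else "REDO"

chosen = run_tasks(range(2, 20), is_coprime_to_all, check, shared=[])
print(chosen)
```

The shared list is updated lazily: workers never wait for the latest copy, and only an acceptance computed against stale data costs a recomputation. That is what lets a model of this kind tolerate latency.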

A current theme in my High Performance Computing Laboratory at Northeastern University is adaptation of data structures and low-level software access algorithms to quickly changing technology. In the 90s, computers became faster. Now, we simply have more of them, and with the growth of heterogeneous computing, we have more types of them.

Three research directions are:

  1. Disk-Based Parallel Computation: Commodity computing is now seeing many cores, but the RAM is not growing in proportion. Our solution is to use the disk as an extension of RAM. The aggregate bandwidth of 50 local disks in a cluster is approximately the same as that of a single RAM subsystem. While this may solve the bandwidth problem of disk, the latency problem remains. Over five years, we have developed a series of applications that overcome this barrier. We are now working on some general tools that others can use to quickly design and implement disk-based computations. A particular emphasis is on a broad variety of search algorithms. A byproduct of this work is our entry in the race to find how many moves are needed to solve Rubik's cube. (This serves as a widely accessible demonstration of the power of our methods.)
  2. User-Level Distributed, Multi-Threaded Checkpointing: The user-level approach allows us to bundle the checkpointing capability with the application or with the computational facility, as opposed to kernel-level solutions, which (at least in binary form) are bound to particular versions of the kernel, and therefore to the computational facility. As one expects, we require no modification of the kernel or of the application binary. We have demonstrated that it works with OpenMPI, with MPICH-2, with SciPy (IPython), with the Java JVM, and with a variety of other applications. Our latest version, DMTCP, is available at SourceForge under the GPL.
  3. Converting Distributed Memory Parallelism to Thread-Parallelism: This is a newer project. As the move to many-core computing provides less RAM per core (and for other reasons), it becomes desirable to migrate MPI or other distributed memory code to thread-parallel code. We are investigating to what extent some of this can be done semi-automatically for properly structured code.
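As an illustration of the disk-based search in item 1, here is a minimal sketch of breadth-first search in which the frontier and the set of visited states live in files rather than in RAM. The page does not say which technique the lab uses; this sketch assumes a standard external-memory approach (duplicate removal by sorting and streaming merges), which replaces latency-bound random lookups with sequential reads and writes. The function names, the file layout, and the toy graph (a 10-node cycle) are all hypothetical.

```python
import heapq
import os
import tempfile
from itertools import groupby, islice

def write_run(path, states):
    """Write an already-sorted stream of integer states, one per line."""
    with open(path, "w") as f:
        for s in states:
            f.write(f"{s}\n")

def read_states(path):
    """Stream integer states back from a file, one at a time."""
    with open(path) as f:
        for line in f:
            yield int(line)

def subtract_to_file(candidates, seen, path):
    """Write sorted `candidates` that are absent from sorted `seen`; return count."""
    count, seen = 0, iter(seen)
    current = next(seen, None)
    with open(path, "w") as out:
        for c in candidates:
            while current is not None and current < c:
                current = next(seen, None)
            if current != c:
                out.write(f"{c}\n")
                count += 1
    return count

def bfs_on_disk(start, neighbors, workdir, chunk=1000):
    """Return the number of states at each BFS depth, keeping only `chunk` states in RAM."""
    visited = os.path.join(workdir, "visited0.txt")
    frontier = os.path.join(workdir, "level0.txt")
    write_run(visited, [start])
    write_run(frontier, [start])
    sizes, depth = [1], 0
    while True:
        # 1. Expand the frontier chunk by chunk; each chunk yields one sorted run.
        runs, states = [], read_states(frontier)
        while block := list(islice(states, chunk)):
            run = os.path.join(workdir, f"run{depth}_{len(runs)}.txt")
            write_run(run, sorted({n for s in block for n in neighbors(s)}))
            runs.append(run)
        # 2. Merge the runs, dropping duplicates and already-visited states.
        merged = (k for k, _ in groupby(heapq.merge(*map(read_states, runs))))
        next_frontier = os.path.join(workdir, f"level{depth + 1}.txt")
        count = subtract_to_file(merged, read_states(visited), next_frontier)
        if count == 0:
            return sizes
        # 3. Fold the new level into the visited file with one more streaming merge.
        new_visited = os.path.join(workdir, f"visited{depth + 1}.txt")
        write_run(new_visited, heapq.merge(read_states(visited), read_states(next_frontier)))
        visited, frontier = new_visited, next_frontier
        sizes.append(count)
        depth += 1

def cycle_neighbors(s, n=10):
    """Toy graph: states 0..n-1 arranged in a ring."""
    return [(s + 1) % n, (s - 1) % n]

with tempfile.TemporaryDirectory() as workdir:
    level_sizes = bfs_on_disk(0, cycle_neighbors, workdir, chunk=1)
print(level_sizes)
```

Every pass over a file is sequential, so throughput is limited by disk bandwidth rather than seek latency. In a cluster, the runs could be partitioned across the local disks of many nodes, which is where the aggregate bandwidth mentioned in item 1 would come into play.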


The Blue Ribbon Online Free Speech Campaign

Gene Cooperman
College of Computer Science, 215CN
Northeastern University
Boston, MA 02115
e-mail: gene@ccs.neu.edu
Phone: (617) 373-8686
Fax: (617) 373-5121