Research of Gene Cooperman
This page will always be obsolete at any given point in time,
but sometimes more obsolete than other times. I'm happy to
correspond.
Here are my
refereed publications
and some examples of my software.
I last updated this page in April, 2008.
A brief description of my research follows.
I have a background from the 80s and 90s in computational algebra
(especially computational group theory). This has served me well
as a testbed for parallel computing. This work led to the
TOP-C
(Task Oriented Parallel C/C++) model of parallel computing.
In a nutshell, it was designed from the start for commodity computing,
and it emphasizes a task-oriented model with lazy updates
of globally shared memory. This gives good latency tolerance,
while providing end-users with an exceptionally easy way to implement
a generalization of task-oriented parallelism that allows for
non-trivial parallelism. Some outgrowths of that work are my
support for parallel GAP (Groups, Algorithms and Programming),
parallel GCL (GNU Common LISP), and ParGeant4. (Geant4 is a
million-line program developed at CERN and elsewhere, which is used
to design and simulate experiments at the LHC, the largest collider
in the world.) My software page
describes this software further.
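To make "task-oriented parallelism with lazy updates of shared memory" concrete, here is a minimal sketch in Python. It is not the TOP-C API: every name in it (do_task, check_result, primes_up_to, and the action constants) is hypothetical, and the parallel workers are simulated by a plain loop so that the behavior is deterministic. The idea it illustrates is that workers compute on a replica of the shared data that may be stale, and the master decides for each result whether to ignore it, fold it into the shared data, or redo the task.

```python
# Hypothetical action names for this sketch; not taken from any library.
NO_ACTION, UPDATE, REDO = "no_action", "update", "redo"


def do_task(n, replica):
    """Worker side: trial-divide n using a possibly stale replica of the prime list."""
    return all(n % p for p in replica if p * p <= n)


def check_result(n, looks_prime, replica, shared):
    """Master side: decide what to do with a result that was computed on `replica`."""
    # `replica` is a prefix of `shared`; whatever follows it was added after the snapshot.
    missed = shared[len(replica):]
    if any(p * p <= n for p in missed):
        return REDO  # the worker lacked a prime it needed, so its answer is not trusted
    return UPDATE if looks_prime else NO_ACTION


def primes_up_to(limit, batch_size=4):
    """Return (primes, number_of_redone_tasks) for the range 2..limit."""
    shared = []                          # master copy of the shared data
    pending = list(range(2, limit + 1))  # task inputs, in increasing order
    redo_count = 0
    while pending:
        batch, pending = pending[:batch_size], pending[batch_size:]
        # Lazy update: workers see one snapshot for the whole batch.
        replica = tuple(shared)
        # In a real system these tasks would run in parallel on the workers.
        results = [(n, do_task(n, replica)) for n in batch]
        retry = []
        for n, looks_prime in results:
            action = check_result(n, looks_prime, replica, shared)
            if action == UPDATE:
                shared.append(n)
            elif action == REDO:
                retry.append(n)
                redo_count += 1
        pending = retry + pending        # redone tasks go first, with a fresh replica
    return shared, redo_count


if __name__ == "__main__":
    print(primes_up_to(30))
    # prints ([2, 3, 5, 7, 11, 13, 17, 19, 23, 29], 2)
```

A larger batch_size models higher latency between updates: the answer stays the same, and only the number of redone tasks changes. That trade is the sense in which lazy updates tolerate latency in this sketch.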
A current theme in my High Performance Computing Laboratory
at Northeastern University is adaptation of data structures
and low-level software access algorithms to quickly changing
technology. In the 90s, computers became faster. Now, we simply
have more of them, and with the growth of heterogeneous computing,
we have more types of them.
Three research directions are:
- Disk-Based Parallel Computation:
Commodity computing is now seeing many cores, but the RAM
is not growing in proportion. Our solution is to use the disk as
an extension of RAM. The bandwidth of 50 local disks in a cluster
is approximately the same as a single RAM subsystem. While
this may solve the bandwidth problem of disk, the latency problem remains.
Over five years, we have developed a series of applications that overcome
this barrier. We are now working on some general tools that others can
use to quickly design and implement disk-based computations. A particular
emphasis is on a broad variety of search algorithms. A byproduct of this
work is our entry in the race to find how many moves are needed
to solve Rubik's cube. (This serves as a widely accessible demonstration
of the power of our methods.)
- User-Level Distributed, Multi-Threaded Checkpointing:
The user-level approach allows us to bundle the checkpointing
capability with the application or with the computational facility,
as opposed to kernel-level solutions, which (at least in binary form)
are bound to particular versions of the kernel, and therefore to
the computational facility. As one would expect of a user-level approach,
we require no modification of the kernel or of the application binary.
We have demonstrated that
it works with OpenMPI, with MPICH-2, with SciPy (iPython), with the
Java JVM, and a variety of other applications. Our latest version
is DMTCP
(hosted at SourceForge), and is released under the GPL.
- Converting Distributed Memory Parallelism to Thread-Parallelism:
This is a newer project. As the move to many-core computing provides
less RAM per core (and for other reasons), it becomes desirable
to migrate MPI or other distributed memory code to thread-parallel code.
We are investigating to what extent some of this can be done
semi-automatically for properly structured code.
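As an illustration of the first direction above (disk as an extension of RAM, with latency as the remaining obstacle), the following Python sketch runs a breadth-first search whose state sets live entirely in files. It is an assumption-laden toy, not the laboratory's software: it uses a standard external-memory technique, in which newly generated states are written out as sorted runs and duplicates are removed later in one merge pass, so that every file is only ever read or written sequentially. The function names, the file names, and the run_size parameter are all hypothetical, and states are restricted to integers for simplicity.

```python
import heapq
import os
import tempfile


def write_stream(path, values):
    """Write integers one per line in a single sequential pass; return the count."""
    count = 0
    with open(path, "w") as f:
        for v in values:
            f.write(f"{v}\n")
            count += 1
    return count


def read_stream(path):
    """Read the integers back with a single sequential pass."""
    with open(path) as f:
        for line in f:
            yield int(line)


def new_states(candidates, visited):
    """Merge pass over two sorted streams: drop repeats and states already visited."""
    visited = iter(visited)
    seen = next(visited, None)
    last = None
    for c in candidates:
        if c == last:
            continue                      # duplicate inside the candidate stream
        last = c
        while seen is not None and seen < c:
            seen = next(visited, None)
        if seen != c:
            yield c


def disk_bfs(start, neighbors, workdir, run_size=1000):
    """Return the number of states at each BFS depth, keeping all state sets on disk."""
    def path(name):
        return os.path.join(workdir, name)

    write_stream(path("visited"), [start])
    write_stream(path("frontier"), [start])
    level_sizes = [1]
    while True:
        # Phase 1: expand the frontier, spilling sorted runs once the buffer fills.
        runs, buffer = [], []
        for state in read_stream(path("frontier")):
            buffer.extend(neighbors(state))
            if len(buffer) >= run_size:
                runs.append(path(f"run{len(runs)}"))
                write_stream(runs[-1], sorted(buffer))
                buffer.clear()
        if buffer:
            runs.append(path(f"run{len(runs)}"))
            write_stream(runs[-1], sorted(buffer))
        # Phase 2: delayed duplicate removal by merging the runs against `visited`.
        merged = heapq.merge(*(read_stream(r) for r in runs))
        count = write_stream(path("next"),
                             new_states(merged, read_stream(path("visited"))))
        for r in runs:
            os.remove(r)
        if count == 0:
            return level_sizes
        level_sizes.append(count)
        # Phase 3: fold the new level into `visited`, again by sequential merge.
        write_stream(path("visited.tmp"),
                     heapq.merge(read_stream(path("visited")),
                                 read_stream(path("next"))))
        os.replace(path("visited.tmp"), path("visited"))
        os.replace(path("next"), path("frontier"))


if __name__ == "__main__":
    # Toy state space: a cycle of 10 states, each adjacent to its two neighbors.
    with tempfile.TemporaryDirectory() as d:
        print(disk_bfs(0, lambda n: [(n + 1) % 10, (n - 1) % 10], d, run_size=3))
        # prints [1, 2, 2, 2, 2, 1]
```

The design choice worth noting is that no step looks up an individual state on disk. Random lookups would pay the disk's latency once per state, while sorted runs and merge passes pay it roughly once per file and are otherwise limited by bandwidth.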
Gene Cooperman
College of Computer Science, 215CN
Northeastern University
Boston, MA 02115
e-mail: gene@ccs.neu.edu
Phone: (617) 373-8686
Fax: (617) 373-5121