Notes on a Solution to Robust Slaves
This is the current summary. Soon, I will add to this by splitting
the solution into individual tasks for the networks group.
We are looking for a way to continue the TOP-C computation even
after one slave dies or is slow to respond.
the previous web page on robust slaves for how the problem was posed.
This page tries to summarize the solution.
The TOP-C software is built in four layers:
Software in one layer can call a lower layer, but not vice versa.
We decided to divide the design into two parts:
- topc/src/topc.c: If master, keep table of slaves, and when
slave is available, call GenerateTaskInput() to create
task input, and to send it to a slave. If slave, wait for
task input, call DoTask(), and send back task output.
- topc/src/comm*.c: Communication layers for sequential, mpi,
Each file presents a common interface to topc.c for sending
and receiving messages, but internally, it does its work by
different means. In particular, topc/src/comm-mpi.c does
its work by calling the MPI layer.
- topc/src/mpinu/*: MPINU is a subset of MPI. It does its work
by making calls to sockets in the operating system.
- sockets: See UNIX man pages (section 2 of man pages), and
pointers to sockets on course web page.
- The master determining the state of a slave (within the lower layers).
- Determining a policy, given information about the state of a slave.
Note also that there are four interesting states in which a slave
- Dead slave
- Slave not responding; Transient delay
- Very large slave task
- Very slow slave (due to heavy CPU load, or intensive disk I/O, or other)
The issue for Berkeley Sockets is that if a process (or even a thread
in a process) reads or writes from a dead socket (e.g. socket connected
to dead slave), then the entire process dies.
The proposed solutions fall into the categories:
- Active Strategy: Create additional process or thread
- Strongly Active: manager process communicates with master process
via shared memory, via named pipes or via some other safe mechanism;
Manager process can keep data structure for queue of outstanding tasks
and collect replies from slaves. If the manager process dies,
the master recreates the manager process, either from its own
internal data structure of outstanding tasks, or else from
some persistent file committed by manager process.
- Weakly Active: manager process acting as taste tester:
try the socket. If manager process dies, then master
considers the corresponding slave dead, and creates a
new manager process
- Passive Strategy: Master waits for slave until some user-specified
timeout. If the slave does not reply in time, then the
master considers it dead.
Eugenio pointed out that the strongly active manager process can be viewed
as a database manager. We can require that slaves initiate contact
with the database manager at all times. The database manager can commit
its data, in case it dies. There is still a question about
how the master will inform the slave processes that an update of
the shared data is pending.
One possibility is for the master to send a UDP packet to the slave.
Another possibility is that the slave can collect its updates when
it communicates with the database manager.