We consider the problem of load balancing in a ring network. We analyze the following local algorithm: in each step, each node of the ring examines the number of tokens at its clockwise neighbor and sends a token to that neighbor if the neighbor has fewer tokens. We show that in a synchronous model, for any initial token distribution $D$, the algorithm converges to a completely balanced distribution within $4\,\mathrm{OPT}(D) + n$ steps, where $\mathrm{OPT}(D)$ is the time taken by the optimal centralized algorithm to balance $D$ completely and $n$ is the number of nodes. Our main result is an analysis of the algorithm in an asynchronous model in which local computations and messages may be arbitrarily delayed, subject to the constraint that each message is eventually delivered and each computation is eventually performed. By generalizing our analysis for the synchronous model, we show that for any initial token distribution $D$, the algorithm converges to a completely balanced distribution within $8\,\mathrm{OPT}(D) + 2n$ rounds, where a round is a minimal sequence of steps in which every component of the network is scheduled at least once. We also show that, for every initial token distribution, the message complexity of the algorithm is asymptotically optimal among all algorithms that move tokens only in the clockwise direction.
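The synchronous version of the local algorithm can be sketched as follows. This is a minimal illustrative simulation, not the paper's formal model: all nodes are assumed to read the same snapshot of token counts before any token moves, node $(i+1) \bmod n$ is taken as the clockwise neighbor of node $i$, and the name `balance_ring` is our own.

```python
def balance_ring(tokens, max_steps=10_000):
    """Run synchronous steps of the local algorithm until the token
    distribution is completely balanced (illustrative sketch; assumes
    the total number of tokens is divisible by the number of nodes)."""
    n = len(tokens)
    tokens = list(tokens)
    for step in range(max_steps):
        if len(set(tokens)) == 1:  # completely balanced
            return tokens, step
        # Every node decides simultaneously on the same snapshot:
        # send one token clockwise iff the clockwise neighbor has fewer.
        sends = [tokens[i] > tokens[(i + 1) % n] for i in range(n)]
        # Then all chosen tokens move in the clockwise direction at once.
        for i in range(n):
            if sends[i]:
                tokens[i] -= 1
                tokens[(i + 1) % n] += 1
    return tokens, max_steps

final, steps = balance_ring([4, 0, 0, 0])
# e.g. [4,0,0,0] -> [3,1,0,0] -> [2,1,1,0] -> [1,2,0,1] -> [1,1,1,1]
```

Separating the decision phase (`sends`) from the movement phase models the synchronous assumption that every node acts on the configuration at the start of the step, rather than on partially updated counts.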