Erlang-Style Concurrency

Jesse A. Tov
2 May 2007

Code: array.erl clock.erl dp.erl factorial.erl global_server.erl kv.erl listener.erl print_server.erl ring.erl subscription.erl

Let's design a scrabble server:
 - Game state with threads representing agents, shared memory,
   explicit synchronization
 - State machine (works okay so long as thing are I/O bound)
 - Message passing concurrent processes

Why?
 - The world is parallel and distributed.
 - The way we've learned to think about programming is not.
 - "To program a real-world application, we _observe_ the concurrency
   patterns = no guesswork."
 - Also parallel speed-up.

 - Erlang not just about parallel speed-up.  Also:
   - Reliability
   - Modeling inherent concurrency

So you've decided you want to model concurrency with concurrency.
You shouldn't have to pay for massive concurrency:
 - In performance
 - In complexity

Thesis of Concurrency Oriented Programming:
 - You can't add concurrency to a sequential language in a sane way.
 - You need a COPL.  COPL characteristics:
   - Processes (self contained)
   - Strong isolation (separate heaps . . .)
   - Unique, unforgeable pids
   - No sharing -- message passing
   - Message passing is asynchronous and potentially unreliable (send
     and pray)
   - You can find out if another process fails, and why (might lie,
     e.g. network errors)
 - Claim: Programming such a system is surprisingly easy.
 - Claim: Erlang satisfies the above.

Some Erlang success stories:

AXD301
 - Ericsson (ATM) phone switch
 - Impressive (nine 9s?) uptime
 - 1.1 M lines of code, mostly _not_ concurrent (more on this later)
 - 10 Gbit = 50-100k phone calls
 - expandable to 160 Gbit

Bluetail Mail Robustifier / Alteon SSL Accelerator
 - Network appliances
 - Each ~30 Kloc
 - BMR deployed by Swedish ISP since 1999 (as of 2003)
 - ASA "best in class", market leader

ejabberd
 - Server for XMPP chat/presence protocol
 - Replacing open source Jabber 1.4/2 as _the_ XMPP server to use
 - Why?  Anecdotally, less trouble
 - Easy to cluster
 - Everybody, including jabber.org, has switched over


LANGUAGE INTRODUCTION

Erlang is
 - Untyped, single-assignment functional language
 - Lists, tuples, symbols (called "atoms"), binary data
 - Light weight processes
 - Message passing
 - Transparent distribution
 - Runs on OS independent VM
 - Soft real-time GC

Sequential examples:

  fac(0) -> 1;
  fac(N) when N > 0 -> N * fac(N - 1).

  lookup(Key, {Key, Val, _, _}) -> {ok, Val};
  lookup(Key, {Key1, _, L, _}) when Key < Key1 -> lookup(Key, L);
  lookup(Key, {Key1, _, _, R}) -> lookup(Key, R);
  lookup(Key, _) -> not_found.

  adder(A) -> fun(B) -> A + B end.
  Add1 = adder(1).
  Add1(5).

A simple concurrent example:

  account(Balance) ->
    receive
      {deposit, Amount, Whom} ->
        Whom ! {deposit_receipt, Amount},
        account(Balance + Amount);
      {balance, Whom} ->
        Whom ! {balance, Balance},
        account(Balance);
      {withdrawal, Amount, Whom} when Amount > Balance ->
        Whom ! overdraft,
        account(Balance);
      {withdrawal, Amount, Whom} ->
        Whom ! {withdrawal_receipt, Amount},
        account(Balance - Amount)
    end.

  Account = spawn(fun () -> account(0) end).

  Account ! {withdrawal, 200}.
  receive
    {withdrawal_receipt, Amount} -> ok,
    overdraft -> not_ok
  end.

  Account ! {deposit, 300}.
  receive {deposit_receipt, A} -> A end.

Actor version:
  account(Balance) ->
    receive
      {deposit, Amount, Whom} ->
        become account(Balance + Amount) |
        Whom ! {deposit_receipt, Amount};
      {balance, Whom} ->
        Whom ! {balance, Balance};
      {withdrawal, Amount, Whom} when Amount > Balance ->
        Whom ! overdraft;
      {withdrawal, Amount, Whom} ->
        become account(Balance - Amount) |
        Whom ! {withdrawal_receipt, Amount},
    end.

Comparison with Actors
----------------------

Similarities:

 - can only talk to someone if you have their address (well, unless you
   look in the process dictionary)

 - can create new processes
 - addresses are first-class, so you can pass them in messages
    -> variable topology

 - message passing is asynchronous

 - Actor laws apply:
   1) finite number of events between any two events
   2) no event immediately causes more than a finite number of events
   3) each event is generated by at most one send
   4) arrivals are well-ordered
   5) event ordering is well-founded
   6) no event immediately causes more than a finite number of process
      creations
   7) actors have a finite number of acquaintances

Differences:

 - Erlang processes correspond to serialized actors.  Erlang has
   "constant values" instead of unserialized actors.  (No need for CPS
   all the time.)

 - If process A sends two messages to process B, they arrive in the order
   they were sent.  Actors don't have this restriction.  (Claim: this
   makes programming easier.)

 - Erlang messages are not guaranteed to be delivered -- if you want to
   be sure, get a receipt.

 - Erlang processes are internally sequential; Actors may have some
   internal parallelism (for example, becoming and sending
   simultaneously).

 - In the actor model, no global information.  In erlang, you can
   "register" a process to a global name, and you can create globally
   unique tokens (called refs).


Dining philosophers:

In CSP:

  FORK =
    *[ phil(i) ? pickup() -> phil(i) ? putdown()
       []
       phil((i-1) mod 5) ? pickup() -> phil((i-1) mod 5) ? putdown()
     ]

  ROOM =
    occupancy: integer;
    occupancy := 0;
    *[
       (i: 0..4) occupancy < 4; phil(i) ? enter() ->
          occupancy := occupancy + 1
       []
       (i: 0..4) phil(i) ? exit() ->
          occupancy := occupancy - 1
     ]

  PHIL ==
    *[ true ->
         THINK;
         room ! enter();
         fork(i) ! pickup();
         fork((i+1) mod 5) ! pickup();
         EAT;
         fork(i) ! putdown();
         fork((i+1) mod 5) ! putdown();
         room ! exit();
     ]


  fork(Left, Right) ->
    receive
      {pickup, Left} ->
        receive {putdown, Left} -> fork(Left, Right) end;
      {pickup, Right} ->
        receive {putdown, Right} -> fork(Left, Right) end
    end.

  room(Occupants, Capacity) ->
    receive
      {knock, Whom} when Occupants < Capacity ->
        Whom ! {enter, self()},
        room(Occupants + 1, Capacity);
      {leave, _} ->
        room(Occupants - 1, Capacity)
    end.

  philosopher(Name, Room, LFork, RFork) ->
    think(Name),
    Room ! {knock, self()},
    receive {enter, Room} -> ok end,
    LFork ! {pickup, self()},
    RFork ! {pickup, self()},
    receive
      {a_fork, LFork} ->
        receive {a_fork, RFork} -> ok end;
      {a_fork, RFork} ->
        receive {a_fork, LFork} -> ok end
    end.
    eat(Name),
    LFork ! {putdown, self()},
    RFork ! {putdown, self()},
    Room ! {leave, self()},
    philosopher(Name, Room, LFork, RFork).

  fork_setup() ->
    receive
      {left, Left} ->
        receive
          {right, Right} ->
            fork(Left, Right)
        end
    end.

  new_philosopher(N, Room, Left, Right) ->
    Phil = spawn_link(fun () -> philosopher(N, Room, Left, Right) end),
    Left ! {right, Phil},
    Right ! {left, Phil}.

  start(N) ->
    spawn(fun () ->
      Room = spawn_link(fun () -> room(0, N - 1) end),
      Fork = spawn_link(fun fork_setup/0),
      setup(N, Room, Fork, Fork)
    end).

  setup(N, Room, Left, Last) when N > 1 ->
    Right = spawn_link(fun fork_setup/0),
    new_philosopher(N, Room, Left, Right),
    setup(N - 1, Room, Right, Last);

  setup(1, Room, Left, Right) ->
    new_philosopher(1, Room, Left, Right),
    receive
      _ -> exit({done})
    end.

(Spawn_link/link: When a process exits, its linked processes receive a
signal.  When an internode connection goes does, linked processes
receive an exit signal with reason 'noconnection'.  Exit/2 can fake an
exit.)


SCHEDULING

Fairness:
 - processes can be waiting, runnable, or running.
 - processes blocked on receive are waiting; others are runnable, and
   one is running.
 - small time slices; when running process runs out of slice, marked
   runnable
 - goal is to schedule all runnable processes in the order they became
   runnable, subject to priority constraints

Priorities:
 - priorities: high, normal, low.
 - high processes are scheduled fairly, in preference to others.
 - normal (or low) processes are scheduled fairly if there are only
   normal (or low) processes runnable.
 - when mix of normal/low, normal get scheduled /normaladvantage/(=8)
   times as much as low.

CULTURE & CRITIQUE

Recommended programming techniques

Many are standard (non-Erlang-specific):
 - Use libraries
 - Reduce interdependency
 - Don't optimize

A few are interesting!

Abstract out concurrency.
 - Structure the system so 95% client code and 5% concurrency code written
   by experts.
 - A few patterns cover a whole lot of stuff:
    - client/server
    - worker/supervisor
    - event-handler
    - upgrade-handler
    - keep alive
    - hot/standby
 - The OTP library provides "behaviours", essentially higher-order
   implementations of patterns.
 - E.g. gen_server (we could use it to write our bank server)

    handle_call({deposit, Amount}, _, Balance) ->
      {reply, {deposit_receipt, Amount}, Balance + Amount};
    handle_call({balance}, _, Balance) ->
      {reply, {balance, Balance}, Balance};
    handle_call({withdrawal, Amount}, _, Balance) when Amount > Balance ->
      {reply, {overdraft}, Balance);
    handle_call({withdrawal, Amount}, _, Balance) ->
      {reply, {withdrawal_receipt, Amount}, Balance - Amount).

 - In AXD301, behavior instances:
    - gen_server: 122
    - gen_event: 36
    - supervisor: 20
    - gen_fsm: 10

 - In BRM/ASA:
    - gen_server: 56
    - gen_event: 9
    - supervisor: 19
    - gen_fsm: 1

 - More generally, considering dirty code (code with side effects):
    - AXD301:  776/1472 dirty modules
    -       :  4090/57412 dirty functions
    - BRM/ASA: 132/253 dirty modules
    -        : 610/6876 dirty functions
    - Observation: dirtiness isn't well partitioned

Error handling:
 - Separate error cases from normal cases -- generally good advice
 - In Erlang: handle errors in a separate process
    - Processes should have only one purpose
 - Identify the error kernel: small portion of your code that you need
   to be bug free.

 - Armstrong's expectation: structure processes in a worker/supervisor
   tree.
 - Reality: cascading restarts don't work -- supervision hierarchies are
   really flat.

Processes and messages:
 - Assign exactly one process to each concurrent activity in the system
 - Reality: too expensive for AXD301 (120 calls/second * 6 processes) +
   100k monitoring processes

 - Flush unknown messages (or you have space problems)
 - Can processes leak?

 - Encapsulate message passing into functions
 - Reality: then you can't parallelize RPCs

How well does this work?
 - No one knows -- only anecdotes, no evidence
 - Logs weren't design to keep track of non-events (e.g. successful
   upgrades)
 - There are very few events

Pi-calculus?


REFERENCES

Joe Armstrong.  Making reliable systems in the presence of software
errors.  Ph.D. thesis.

----.  Concurrency oriented programming in Erlang.  Invited talk at the
Lightweight Languages Workshop 2002 (LL2).

Thomas Noll and Chanchal Kumar Roy.  Modeling Erlang in the Pi-Calculus.
Erlang Workshop '05.

Erlang specification 4.7.

Erlang Reference Manual and implementation.

Programming rules and conventions.
http://www.erlang.se/doc/programming_rules.shtml.
Home