Friday, March 28, 2008

* Distributed computing models
* Agreement and consensus problems
* Byzantine Generals Problem
-- Impossibility proof
-- A distributed algorithm
-- With authenticated messages

DISTRIBUTED COMPUTING MODELS

A distributed system is a collection of processes (nodes, machines)
communicating with one another.  There are different dimensions that
capture the distributed computing models.

Mode of communication: shared memory vs message-passing.  We will
primarily consider message-passing models.

Asynchrony: Whether the computing and/or the communication happens in
a lock-step (synchronized) manner.  Asynchrony can apply to
computation (processes), to communication, or to both.  Asynchrony in
processes means different processing speeds; synchrony in processes
means a common clock.  Asynchrony in communication means arbitrary
message delays; synchrony means bounded (known) delays.  Whenever we
are working with a distributed system, we need to be very careful
about the precise model.  We hope to explore a range of models in this
unit.

Failures: Whether nodes and/or links can fail, and what kinds of
failures occur.  Fail-stop: a processor completely stops.  Byzantine:
a faulty processor can act arbitrarily.  Message loss: a link drops
messages.

Topology: In the message-passing case, is there an underlying network
that captures the nodes that a node can directly communicate with?
(Asynchrony on top of this would decide the delays on these links.)

CONSENSUS PROBLEMS

The most basic problem that has been studied extensively is that of
consensus.  In these problems, each process starts off with a
private value and the goal is to terminate with the following
correctness properties: (a) agreement: the output values of all
processes are identical, and (b) validity: the output value of each
process is the private (input) value of some process.
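
The two properties can be stated as a small checker.  This is only a
sketch for a single finished run with no faulty processes; the name
check_consensus and the dictionary layout are illustrative assumptions,
not part of the lecture.

```python
from typing import Dict, Hashable


def check_consensus(inputs: Dict[str, Hashable],
                    outputs: Dict[str, Hashable]) -> Dict[str, bool]:
    """Check agreement and validity for one finished run.

    inputs  maps each process to its private starting value.
    outputs maps each process to the value it terminated with.
    """
    # (a) agreement: all output values are identical.
    agreement = len(set(outputs.values())) <= 1
    # (b) validity: every output is the input value of some process.
    allowed = set(inputs.values())
    validity = all(v in allowed for v in outputs.values())
    return {"agreement": agreement, "validity": validity}


inputs = {"p1": "attack", "p2": "retreat", "p3": "attack"}

# Everyone outputs "attack": both properties hold.
print(check_consensus(inputs, {"p1": "attack", "p2": "attack", "p3": "attack"}))
# Everyone keeps its own input: validity holds, agreement does not.
print(check_consensus(inputs, {"p1": "attack", "p2": "retreat", "p3": "attack"}))
```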

We will discuss a couple of consensus problems, though their precise
definitions differ.

BYZANTINE GENERALS

Imagine an army of Byzantine generals distributed at the edge of a
city about to attack.  The generals can communicate only by messenger.
They must decide a common plan of action.  For simplicity, assume each
has a recommendation of "attack" or "retreat".  We would like all the
generals to exchange these recommendations and execute a private
function that decides "attack" or "retreat".  Assume that the
messengers are always safe.  Easy to do, right?  They exchange and all
execute the same function.

What if some of the generals are traitors?  How do we now specify the
requirements?  Suppose each general i sends a value v(i), so that each
general ends up with one value for each other general.  For every i,

-- Any two loyal generals use the same value of v(i).

-- If the ith general is loyal, then the value he sends must be used
by every loyal general as the value of v(i).

Byzantine Generals Problem: A commanding general sends an order to his
n-1 lieutenant generals such that:

IC1. All loyal lieutenant generals obey the same order.

IC2. If the commander is loyal, then every loyal lieutenant obeys the
order sent.

Application: The only way we know to implement a reliable computer
system is to use several different "processors" to compute a result
and then perform a majority vote.  This works fine if nonfaulty
processors work in a predictable way, but it does not work with
Byzantine failures.  Consensus problems of the above kind arise in
several distributed systems.

IMPOSSIBILITY PROOF FOR BYZANTINE GENERALS

Theorem: With 3 processes and 1 traitor, it is impossible to solve
Byzantine Generals.

Proof Sketch: Indistinguishability technique.  We will make it impossible for
a loyal lt. general to distinguish between two scenarios.

Let L1 be a loyal lt and L2 be the other lt.  Let C be the commander.
Suppose C is loyal and wants to send the order ATTACK to the two lts.
C and L2 both communicate with L1.  In order for condition IC2 to be
satisfied, L1 must execute the order if C is loyal.  Suppose the
communication from L2 is to suggest exactly the opposite of what C is
ordering.  Then, L1 hears ATTACK from C and RETREAT from L2.  To
satisfy IC2, L1 must attack.  Now suppose C is a traitor and sends
ATTACK to L1 and RETREAT to L2, and L2 forwards RETREAT to L1.  L1
sees exactly the same messages as before and cannot distinguish the
two scenarios, so again L1 must attack.

By the symmetric argument, a loyal L2 must follow the order that C
gives it, so in the second scenario L2 retreats.  But then condition
IC1 is not satisfied in the case that C is a traitor and sends
different messages to L1 and L2.

End Proof
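
The argument can be checked by brute force for a restricted family of
protocols.  The sketch below assumes a one-round relay protocol in which
each lt decides from the pair (order heard from C, order relayed by the
other lt); the proof itself covers all protocols, so this is only an
illustration.  The names all_rules and satisfies_ic are hypothetical.

```python
from itertools import product

ORDERS = ("ATTACK", "RETREAT")
# A lieutenant's view: (order heard from C, order relayed by the other lt).
VIEWS = list(product(ORDERS, repeat=2))


def all_rules():
    """Every deterministic map from a view to an order (16 in total)."""
    for outputs in product(ORDERS, repeat=len(VIEWS)):
        yield dict(zip(VIEWS, outputs))


def satisfies_ic(rule1, rule2):
    """Check IC1 and IC2 for decision rules of L1 and L2, one traitor."""
    # Loyal C sends x to both lts; the other lt may be a traitor and
    # relay any y.  A loyal lt must still obey x (IC2).
    for x, y in VIEWS:
        if rule1[(x, y)] != x or rule2[(x, y)] != x:
            return False
    # Traitor C sends a to L1 and b to L2; both lts are loyal and relay
    # honestly.  They must obey the same order (IC1).
    for a, b in VIEWS:
        if rule1[(a, b)] != rule2[(b, a)]:
            return False
    return True


rules = list(all_rules())
working = [(r1, r2) for r1 in rules for r2 in rules if satisfies_ic(r1, r2)]
print(len(rules) ** 2, "rule pairs tried,", len(working), "satisfy IC1 and IC2")
```

No pair of rules survives: IC2 forces each lt to obey what it heard from
C, and a traitor C then splits the two loyal lts.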

Theorem: No solution with fewer than 3m+1 generals can cope with m
traitors.

Proof: Suppose we have a solution with 3m generals that copes with m
traitors.  Call these generals the Albanian generals.  We will come up
with a solution for 3 generals that can cope with 1 traitor.  Each of
the 3 Byzantine generals simulates m of the Albanian generals in the
3m-protocol.  The Byzantine commander simulates the Albanian commander
and m-1 of the Albanian lts, while each Byzantine lt simulates m
Albanian lts.

Consider two cases.  The first is when the Byzantine commander is
loyal.  In this case, he is simulating the loyal Albanian commander
and m-1 loyal Albanian lts.  The loyal Byzantine lt is simulating m
loyal Albanian lts.  The Byzantine traitor can simulate at most m
Albanian traitors.  By the correctness of the 3m-protocol, all of the
2m-1 loyal Albanian lts end up with the order of the loyal Albanian
commander -- which is the same as the order of the loyal Byzantine
commander.  The loyal Byzantine lt obeys the order of the Albanian lts
he simulates, so both IC1 and IC2 are satisfied.

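The other case is when the Byzantine commander is not loyal.  In this
case, the two loyal Byzantine lts simulate m loyal Albanian lts each.
So by condition IC1 of the 3m-protocol, they execute the same order.
In both cases we obtain a solution for 3 generals that copes with 1
traitor, which contradicts the previous theorem.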

End Proof

SOLUTION FOR BYZANTINE GENERALS

Model: Synchronous communication and computation.  In each step, each
general can do any amount of computation and send any number of
messages to any of the other generals.  Assumptions about the message
system:

A1. Every message that is sent is delivered correctly in one round.

A2. The receiver of a message knows who sent it.

A3. (The absence of a message can be detected.)

Algorithm BG(0):

(1) The commander sends his value to every LT.

(2) Each LT uses the value sent by the commander, or the value RETREAT
if he received no value.

End Algorithm

Algorithm BG(m):

(1) The commander sends his value to every LT.

(2) For each i, let v_i denote the value received by LT i from the
commander (or RETREAT if no value was received).  LT i acts as
commander and uses BG(m-1) to send the value v_i to the other n-2 LTs.

(3) For each i, and for each j <> i, let v_j be the value that LT i
receives from LT j as part of the BG(m-1) algorithm.  LT i uses the
value majority(v_1,...,v_{n-1}).

End Algorithm
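
The recursion in BG(m) can be sketched as a small simulation.  The
following is a minimal sketch under stated assumptions, not a definitive
implementation: generals are integers, a traitor's behavior is supplied
by a hypothetical function lie(sender, receiver, value), and majority
falls back to RETREAT when there is no strict majority (the notes do not
fix a tie-breaking rule).

```python
from collections import Counter

ATTACK, RETREAT = "ATTACK", "RETREAT"


def majority(values):
    """Most common value; RETREAT if no strict majority (assumed default)."""
    value, count = Counter(values).most_common(1)[0]
    return value if 2 * count > len(values) else RETREAT


def bg(m, commander, lts, value, traitors, lie):
    """Return {lt: value that lt uses as this commander's order}."""
    # Step (1): a traitor may send a different value to each receiver.
    received = {i: lie(commander, i, value) if commander in traitors else value
                for i in lts}
    if m == 0:
        return received                       # BG(0), step (2)
    # Step (2): each LT j relays v_j to the other LTs using BG(m-1).
    relayed = {j: bg(m - 1, j, [i for i in lts if i != j],
                     received[j], traitors, lie)
               for j in lts}
    # Step (3): LT i takes the majority of its own v_i and the relayed v_j.
    return {i: majority([received[i]] + [relayed[j][i] for j in lts if j != i])
            for i in lts}


def flip(sender, receiver, value):
    """Traitor strategy: always send the opposite order."""
    return RETREAT if value == ATTACK else ATTACK


def split(sender, receiver, value):
    """Traitor strategy: ATTACK to odd-numbered receivers, RETREAT to the rest."""
    return ATTACK if receiver % 2 else RETREAT


# n = 4, m = 1, loyal commander 0, traitor LT 3: loyal LTs 1 and 2 use ATTACK.
print(bg(1, 0, [1, 2, 3], ATTACK, {3}, flip))
# n = 4, m = 1, traitor commander 0: the three loyal LTs agree.
print(bg(1, 0, [1, 2, 3], ATTACK, {0}, split))
# n = 3, m = 1, traitor LT 2: loyal LT 1 fails to obey the loyal commander.
print(bg(1, 0, [1, 2], ATTACK, {2}, flip))
```

The entries returned for traitors are meaningless and can be ignored;
only the values used by loyal LTs matter for IC1 and IC2.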

Note that BG(0) satisfies IC2 whenever C is not a traitor, and in
particular when the number of generals is more than twice the number
of traitors.

Lemma: For any m and k, Algorithm BG(m) satisfies IC2 if there are
more than 2k+m generals and at most k traitors.

Proof: By induction on m.  The base case m = 0 is the claim above.

We now assume the lemma for m-1, m > 0, and prove it for m.  Note that
IC2 concerns loyal commanders only.  The loyal commander sends the
same value v to all LTs.  Each BG(m-1) execution in step (2) involves
n-1 > 2k+(m-1) generals and at most k traitors.  So, by the induction
hypothesis, IC2 is satisfied whenever a loyal LT forwards the
commander's order: every loyal LT obtains v_j = v for each loyal LT j.
The number of loyal LTs is at least n-1-k > k+m-1 >= k, so they form a
majority of the n-1 LTs.  That is, the majority computed by each of
the loyal LTs is the same as the order sent by the commander.

End Proof
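
The counting step in the induction can be checked numerically.  This is
a small sanity check under the lemma's hypothesis n > 2k+m with m > 0
and a loyal commander (so all k traitors are LTs); the function name
loyal_relays_win is an illustrative assumption.

```python
def loyal_relays_win(n, k):
    """True if loyal LTs outnumber the traitors among the n-1 lieutenants."""
    loyal_lts = (n - 1) - k           # commander is loyal: all k traitors are LTs
    return loyal_lts > (n - 1) - loyal_lts


# Every n > 2k + m for small m > 0 and k.
checked = [(n, k, m)
           for m in range(1, 6)
           for k in range(0, 6)
           for n in range(2 * k + m + 1, 2 * k + m + 10)]
print(all(loyal_relays_win(n, k) for n, k, m in checked))

# Outside the hypothesis, e.g. n = 2k + 1, the loyal LTs only tie.
print(loyal_relays_win(2 * 3 + 1, 3))
```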

Theorem: For any m, BG(m) satisfies conditions IC1 and IC2 if there
are more than 3m generals and at most m traitors.

Proof: By induction on m.  Trivial for m = 0.  Suppose the theorem is
true for BG(m-1), m > 0.

If the commander is loyal, then the claim directly follows from the
above lemma with k = m.

If the commander is a traitor, all we need to ensure is that all the
loyal LTs end up with the same value.  At most m-1 of the LTs are
traitors.  Each LT executes BG(m-1) as a commander with the value
received.  Since the number of traitors is at most m-1 and the number
of generals is n-1 > 3m-1 > 3(m-1), all of these BG(m-1) executions
satisfy IC1 and IC2 by induction.  So every loyal LT ends up with the
same value v_j for every LT j.  So they compute the same majority
value.

End Proof

BYZANTINE GENERALS WITH AUTHENTICATED MESSAGES

A4. (a) A loyal general's signature cannot be forged, and any
alteration can be detected; (b) Anyone can verify the authenticity of
a general's signature.

Each general i keeps track of orders in a set V_i.  At termination, it
selects an order from V_i using a deterministic function choice(V_i)
that is the same for all generals.

Algorithm BGS(m):

Initially V_i is empty.

(1) The commander signs and sends his value to every LT.

(2) For each i:

(A) If LT i receives a message of the form v:0 from the commander and
he has not received an order yet, then

(i) V_i = {v}

(ii) LT i sends v:0:i to every other LT.

(B) If LT i receives a message of the form v:0:j_1:...:j_k and v is
not in the set V_i, then

(i) V_i = V_i U {v}

(ii) if k < m, then send message v:0:j_1:...:j_k:i to every LT other
than j_1, ..., j_k.

(3) For each i: When LT i will receive no more messages, he obeys the
order choice(V_i).
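
The message flow of BGS(m) can be sketched as a round-by-round
simulation.  This is a minimal sketch under stated assumptions:
signatures are modeled by the tuple of signer ids (so forgery is simply
impossible), a traitor LT can only withhold messages on chosen
(sender, receiver) pairs, the traitor commander sends at most one signed
order per LT, and choice returns the single order in V_i or RETREAT
otherwise.  The names bgs, commander_sends and withheld are illustrative.

```python
ATTACK, RETREAT = "ATTACK", "RETREAT"


def choice(orders):
    """Assumed rule: the only order if there is exactly one, else RETREAT."""
    return next(iter(orders)) if len(orders) == 1 else RETREAT


def bgs(m, n, commander_sends, traitors=frozenset(), withheld=frozenset()):
    """Simulate BGS(m) for commander 0 and LTs 1..n-1.

    commander_sends: {lt: order signed by the commander for that LT}
    withheld: (sender, receiver) pairs on which a traitor LT stays silent
    Returns {lt: choice(V_lt)} for the loyal LTs.
    """
    lts = range(1, n)
    V = {i: set() for i in lts}
    # Step (1).  A message is (order, chain of signers, receiver).
    inbox = [(v, (0,), i) for i, v in commander_sends.items()]
    while inbox:
        outbox = []
        for v, chain, i in inbox:
            if v in V[i]:
                continue                      # order already known: ignore
            V[i].add(v)                       # steps 2(A)(i) and 2(B)(i)
            k = len(chain) - 1                # number of LT signatures so far
            if k == 0 or k < m:               # steps 2(A)(ii) and 2(B)(ii)
                for r in lts:
                    if r == i or r in chain:
                        continue
                    if i in traitors and (i, r) in withheld:
                        continue              # traitor LT withholds the message
                    outbox.append((v, chain + (i,), r))
        inbox = outbox                        # delivered in the next round (A1)
    return {i: choice(V[i]) for i in lts if i not in traitors}


# 3 generals, traitor commander: both loyal LTs see {ATTACK, RETREAT}.
print(bgs(1, 3, {1: ATTACK, 2: RETREAT}, traitors={0}))
# 4 generals, traitors 0 and 3: BGS(2) reaches agreement ...
print(bgs(2, 4, {1: ATTACK, 3: RETREAT}, traitors={0, 3}, withheld={(3, 2)}))
# ... but BGS(1) does not, since it tolerates only one traitor.
print(bgs(1, 4, {1: ATTACK, 3: RETREAT}, traitors={0, 3}, withheld={(3, 2)}))
```

The first call contrasts with the impossibility result for 3 generals:
with unforgeable signatures, the two loyal LTs end up with the same set
of orders and therefore make the same choice.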

Theorem: For any m, Algorithm BGS(m) solves the Byzantine Generals
problem if there are at most m traitors.

Proof: If the commander is loyal, he signs only one order v, and by A4
no other order carrying his signature can exist.  So for every loyal
LT i, V_i will be the same singleton set {v}, and IC2 is satisfied.

Since IC1 follows from IC2 when the commander is loyal, we only need
to consider the case when the commander is a traitor.  Two loyal LTs i
and j obey the same order if V_i = V_j.

Suppose loyal LT i puts order v in V_i; we show that every other loyal
LT j also puts v in V_j.  If i receives this order in step 2(A), then
he sends it to j, so j receives it (and puts it in V_j).  If i
receives this order in step 2(B), then i receives a message of the
form v:0:j_1:...:j_k.  If j is one of j_1, ..., j_k, then j must have
received v before.  Otherwise, if k < m, then i will send it to j.  If
k = m, then, since the commander is a traitor, at most m-1 of the LTs
are traitors, so one of j_1, ..., j_k is loyal.  This loyal LT
received the value v and must have sent v to j.

End Proof