Friday, March 28, 2008

* Distributed computing models
* Agreement and consensus problems
* Byzantine Generals Problem
  -- Impossibility proof
  -- A distributed algorithm
  -- With authenticated messages

DISTRIBUTED COMPUTING MODELS

A distributed system is a collection of processes (nodes, machines) communicating with one another. Distributed computing models differ along several dimensions.

Mode of communication: shared memory vs message-passing. We will primarily consider message-passing models.

Asynchrony: Whether the computation and/or the communication happens in a lock-step (synchronized) manner. Asynchrony can apply to computation (processes), to communication, or to both. Asynchrony in processes means different processing speeds; synchrony in processes means a common clock. Asynchrony in communication means arbitrary message delays; synchrony in communication means a known bound on the delay.

Whenever we work with a distributed system, we need to be very careful about the precise model. We hope to explore a range of models in this unit.

Failures: Whether nodes and/or links can fail, and what kinds of failures occur.
  -- Fail-stop: a processor completely stops.
  -- Byzantine: a faulty processor can act arbitrarily.
  -- Message loss.

Topology: In the message-passing case, is there an underlying network that captures the nodes that a node can directly communicate with? (Asynchrony on top of this would decide the delays on these links.)

CONSENSUS PROBLEMS

The most basic problem that has been studied extensively is that of consensus. In these problems, each process starts off with a private value and the goal is to terminate with the following correctness properties:
(a) agreement: the output values of all processes are identical, and
(b) validity: the output value of each process is the initial value of some process.

We will discuss a couple of consensus problems, but their precise definitions differ.
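
The two correctness properties can be stated as a small check on one finished run. The following is only a minimal sketch: the name check_consensus and the dictionary representation of inputs and outputs are assumptions made for illustration, and it treats every process as correct (no failures).

```python
def check_consensus(inputs, outputs):
    """Return (agreement, validity) for one finished run.

    inputs  -- dict: process id -> private starting value
    outputs -- dict: process id -> value the process terminated with
    Both names and the dict layout are assumptions of this sketch.
    """
    decided = set(outputs.values())
    # (a) agreement: all processes output the same value.
    agreement = len(decided) <= 1
    # (b) validity: every output is the starting value of some process.
    validity = decided <= set(inputs.values())
    return agreement, validity
```

For example, if three processes start with "attack", "retreat", "attack" and all output "attack", both properties hold; if each process simply outputs its own input, validity holds but agreement fails.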

BYZANTINE GENERALS

Imagine an army of Byzantine generals camped around the edge of a city they are about to attack. The generals can communicate only by messenger, and they must decide on a common plan of action. For simplicity, assume each general has a recommendation of "attack" or "retreat". We would like all the generals to exchange these recommendations and then each execute a private function that decides "attack" or "retreat". Assume that the messengers are always safe.

Easy to do, right? They exchange recommendations and all execute the same function.

What if some of the generals are traitors? How do we now specify the requirements? Suppose each general ends up with a vector of values, one value v(i) for each general i. For every i,
  -- Any two loyal generals use the same value of v(i).
  -- If the ith general is loyal, then the value he sends must be used by every loyal general as the value of v(i).

Both conditions concern a single sender i at a time, so it suffices to consider one general sending his value to all the others.

Byzantine Generals Problem: A commanding general sends an order to his n-1 lieutenant generals such that:
IC1. All loyal lieutenant generals obey the same order.
IC2. If the commander is loyal, then every loyal lieutenant obeys the order sent.

Application: The only way we know to implement a reliable computer system is to use several different "processors" to compute a result and then perform a majority vote. This works fine if nonfaulty processors work in a predictable way, but it does not work with Byzantine failures. Consensus problems of the above kind arise in several distributed systems.

IMPOSSIBILITY PROOF FOR BYZANTINE GENERALS

Theorem: With 3 processes and 1 traitor, it is impossible to solve Byzantine Generals (when messages are not authenticated).

Proof Sketch: Indistinguishability technique. We will make it impossible for a loyal lt. general to distinguish between two scenarios. Let L1 be a loyal lt and L2 be the other lt. Let C be the commander. Suppose C is loyal and wants to send the order ATTACK to the two lts. C and L2 both communicate with L1. For condition IC2 to be satisfied, L1 must execute the order whenever C is loyal.
Suppose the traitor is L2, and L2 tells L1 exactly the opposite of what C is ordering. Then L1 hears ATTACK from C and RETREAT from L2. To satisfy IC2, L1 must attack. Now suppose instead that C is the traitor and sends ATTACK to L1 and RETREAT to L2, and the loyal L2 forwards RETREAT to L1. L1 receives exactly the same messages as before, so it cannot tell the two scenarios apart and again must attack. By the symmetric argument, a loyal L2 must follow the order that C gives it, so it must retreat. But then condition IC1 is not satisfied in the case that C is a traitor and sends different messages to L1 and L2: L1 attacks while L2 retreats.
End Proof

Theorem: No solution with fewer than 3m+1 generals can cope with m traitors.

Proof: Suppose we have a solution with 3m generals that copes with m traitors. Call these generals the Albanian generals. We will come up with a solution for 3 generals that can cope with 1 traitor, contradicting the previous theorem. Each of the 3 Byzantine generals simulates m of the Albanian generals in the 3m-protocol. The Byzantine commander simulates the Albanian commander and m-1 of the Albanian lts, while each Byzantine lt simulates m Albanian lts.

Consider two cases. The first is when the Byzantine commander is loyal. In this case, he is simulating the loyal Albanian commander and m-1 loyal Albanian lts. The loyal Byzantine lt is simulating m loyal Albanian lts. The Byzantine traitor simulates at most m Albanian traitors. By the correctness of the 3m-protocol, all of the 2m-1 loyal Albanian lts end up with the order of the loyal Albanian commander, which is the order of the loyal Byzantine commander. So both IC1 and IC2 are satisfied.

The other case is when the Byzantine commander is not loyal. In this case, the two loyal Byzantine lts simulate m loyal Albanian lts each, and the commander simulates at most m Albanian traitors. So by condition IC1 of the 3m-protocol, they execute the same order.
End Proof

SOLUTION FOR BYZANTINE GENERALS

Model: Synchronous communication and computation. In each step (round), each general can do any amount of computation and send any number of messages to any of the other generals.

Assumptions about the message system:
A1. Every message that is sent is delivered correctly in one round.
A2. The receiver of a message knows who sent it.
A3. The absence of a message can be detected.

Algorithm BG(0):
(1) The commander sends his value to every LT.
(2) Each LT uses the value sent by the commander, or the value RETREAT if he received no value.
End Algorithm

Algorithm BG(m), m > 0:
(1) The commander sends his value to every LT.
(2) For each i, let v_i denote the value received by LT i from the commander (or RETREAT if no value was received). LT i acts as commander and uses BG(m-1) to send the value v_i to the other n-2 LTs.
(3) For each i, and for each j <> i, let v_j be the value that LT i receives from LT j in step (2) as part of the BG(m-1) algorithm (or RETREAT if no value was received). LT i uses the value majority(v_1,...,v_{n-1}), where majority returns the majority value if one exists and a fixed default (say RETREAT) otherwise.
End Algorithm

Note that BG(0) satisfies IC2 whenever C is not a traitor; in particular this holds when the number of generals is more than twice the number of traitors, which is the case m = 0 of the lemma below.

Lemma: For any m and k, Algorithm BG(m) satisfies IC2 if there are more than 2k+m generals and at most k traitors.

Proof: By induction on m. The base case m = 0 is the claim above. We now assume the lemma for m-1, m > 0, and prove it for m. Note that IC2 concerns loyal commanders only. The loyal commander sends the same value v to all LTs in step (1). In step (2), each loyal LT runs BG(m-1) as commander among n-1 > 2k+(m-1) generals, of which at most k are traitors. So, by the induction hypothesis, IC2 is satisfied whenever a loyal LT forwards the commander's order: every loyal LT j obtains v_i = v for every loyal LT i. The number of loyal LTs is at least n-1-k > k+m-1 >= k, so the loyal LTs outnumber the traitors among the n-1 LTs. That is, the majority computed by each of the loyal LTs is the same as the order sent by the commander.
End Proof

Theorem: For any m, BG(m) satisfies conditions IC1 and IC2 if there are more than 3m generals and at most m traitors.

Proof: By induction on m. Trivial for m = 0. Suppose the theorem is true for BG(m-1), m > 0. If the commander is loyal, then the claim follows directly from the above lemma with k = m (IC1 follows from IC2 in this case). If the commander is a traitor, all we need to ensure is that all the loyal LTs end up with the same value. At most m-1 of the LTs are traitors.
Each LT executes BG(m-1) as a commander with the value received. Since the number of traitors among the LTs is at most m-1 and the number of generals in each of these executions is n-1 > 3m-1 > 3(m-1), all of these BG(m-1) executions satisfy IC1 and IC2 by induction. So any two loyal LTs end up with the same value v_j for every LT j. So they compute the same majority value.
End Proof

BYZANTINE GENERALS WITH AUTHENTICATED MESSAGES

In addition to A1-A3, we assume:
A4. (a) A loyal general's signature cannot be forged, and any alteration of his signed messages can be detected;
    (b) Anyone can verify the authenticity of a general's signature.

Notation: v:0 denotes the value v signed by the commander (general 0), and v:0:j_1:...:j_k denotes v signed by the commander and then by LTs j_1, ..., j_k in turn.

Each LT i keeps track of the orders it has received in a set V_i. At termination, it selects an order from V_i using a deterministic function choice(V_i) that is the same for all LTs.

Algorithm BGS(m):
Initially V_i is empty.
(1) The commander signs and sends his value to every LT.
(2) For each i:
  (A) If LT i receives a message of the form v:0 from the commander and he has not received an order yet, then
      (i) V_i = {v}
      (ii) LT i sends v:0:i to every other LT.
  (B) If LT i receives a message of the form v:0:j_1:...:j_k and v is not in the set V_i, then
      (i) V_i = V_i U {v}
      (ii) if k < m, then LT i sends the message v:0:j_1:...:j_k:i to every LT other than j_1, ..., j_k.
(3) For each i: When LT i will receive no more messages, he obeys the order choice(V_i).
End Algorithm

Theorem: For any m, Algorithm BGS(m) solves the Byzantine Generals problem if there are at most m traitors.

Proof: If the commander is loyal, he signs and sends the same order v to every LT, and by A4 no traitor can produce a message carrying his signature on a different value. So every loyal V_i will be the same singleton set {v}, where v is the order sent by the commander, and IC2 is satisfied. Since IC1 follows from IC2 when the commander is loyal, we only need to consider the case when the commander is a traitor.

Two loyal LTs i and j obey the same order if V_i = V_j, so it suffices to show that if i puts an order v in V_i, then j puts v in V_j. If i receives this order in step 2(A), then he sends it to j, so j receives it (and puts it in V_j). If i receives this order in step 2(B), then i receives a message of the form v:0:j_1:...:j_k. If j is one of the signers j_1, ..., j_k, then j must have received v before. Otherwise, if k < m, then i will send v:0:j_1:...:j_k:i to j. If k = m, then, since the commander is a traitor, at most m-1 of the LTs are traitors, so at least one of the signers j_1, ..., j_m is loyal. This loyal signer received the value v and must have sent v to j.
End Proof
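
The signed-message algorithm BGS(m) above can be tried out with a small round-based simulation. This is a sketch under stated assumptions, not a definitive implementation: the names run_bgs, split and relay_to are hypothetical, the particular choice function is just one deterministic rule, and m >= 1 is assumed so that the relay in step 2(A)(ii) is the k = 0 case of step 2(B)(ii). Only one kind of adversary is modelled (a traitorous commander who signs different orders for different LTs, and traitorous LTs who relay honestly but only to chosen recipients), so A4 holds by construction.

```python
RETREAT, ATTACK = "RETREAT", "ATTACK"

def choice(orders):
    """Deterministic rule shared by all LTs (an assumption of this sketch):
    obey a unique order, otherwise fall back to RETREAT."""
    return next(iter(orders)) if len(orders) == 1 else RETREAT

def run_bgs(n, m, traitors, order, split=None, relay_to=None):
    """Simulate BGS(m), m >= 1, with general 0 as commander and LTs 1..n-1.

    A signed message v:0:j_1:...:j_k is modelled as (v, (0, j_1, ..., j_k)).
    split    -- hypothetical: maps an LT to the order a traitorous commander signs for it
    relay_to -- hypothetical: maps a traitorous LT to the only LTs it relays to
    Returns the order obeyed by each loyal LT.
    """
    relay_to = relay_to or {}
    lts = range(1, n)
    V = {i: set() for i in lts}
    # Step (1): the commander signs and sends a value to every LT.
    inbox = {i: [(split(i) if 0 in traitors and split else order, (0,))]
             for i in lts}
    # Messages with k LT signatures arrive in round k+1, so m+1 rounds suffice.
    for _ in range(m + 1):
        outbox = {i: [] for i in lts}
        for i in lts:
            allowed = relay_to.get(i, lts) if i in traitors else lts
            for v, chain in inbox[i]:
                if v in V[i]:
                    continue            # steps 2(A)/2(B): only new orders matter
                V[i].add(v)
                if len(chain) - 1 < m:  # k < m: add own signature and relay
                    for j in lts:
                        if j != i and j not in chain and j in allowed:
                            outbox[j].append((v, chain + (i,)))
        inbox = outbox
    # Step (3): each loyal LT obeys choice(V_i).
    return {i: choice(V[i]) for i in lts if i not in traitors}
```

With n = 4, traitors {0, 1}, a commander that signs RETREAT only for LT 1 and ATTACK for the others, and LT 1 relaying only to LT 2, run_bgs with m = 2 returns the same order for the loyal LTs 2 and 3. The same adversary against m = 1 makes them disagree, which matches the requirement that m be at least the number of traitors.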