Leader election

Not to be confused with Leadership election.

In distributed computing, leader election is the process of designating a single process as the organizer of some task distributed among several computers (nodes). Before the task is begun, all network nodes are either unaware which node will serve as the "leader" (or coordinator) of the task, or unable to communicate with the current coordinator. After a leader election algorithm has been run, however, each node throughout the network recognizes a particular, unique node as the task leader.

The network nodes communicate among themselves in order to decide which of them will get into the "leader" state. For that, they need some method in order to break the symmetry among them. For example, if each node has unique and comparable identities, then the nodes can compare their identities, and decide that the node with the highest identity is the leader.

The definition of this problem is often attributed to LeLann, who formalized it as a method to create a new token in a token ring network in which the token has been lost.

Leader election algorithms are designed to be economical in terms of total bytes transmitted, and time. The algorithm suggested by Gallager, Humblet, and Spira[1] for general undirected graphs has had a strong impact on the design of distributed algorithms in general, and won the Dijkstra Prize for an influential paper in distributed computing.

Many other algorithms were suggested for different kind of network graphs, such as undirected rings, unidirectional rings, complete graphs, grids, directed Euler graphs, and others. A general method that decouples the issue of the graph family from the design of the leader election algorithm was suggested by Korach, Kutten, and Moran.[2]

Definition

The problem of leader election is for each processor eventually to decide that whether it is a leader or not subject to only one processor decides that it is the leader.[3] An algorithm solves the leader election problem if:

  1. States of processors are divided into elected and not-elected states. Once elected, it remains as elected (similarly if not elected).
  2. In every execution, exactly one processor becomes elected and the rest determine that they are not elected.

A valid leader election algorithm must meet the following conditions:[4]

  1. Termination: the algorithm should finish within a finite time once the leader is selected. In randomized approaches this condition is sometimes weakened (for example, requiring termination with probability 1).
  2. Uniqueness: there is exactly one processor that considers itself as leader.
  3. Agreement: all other processors know who the leader is.

An algorithm for leader election may vary in following aspects:[5]

Algorithms

Leader election in rings

Ring network topology

A ring network is a connected-graph topology in which each node is exactly connected to two other nodes, i.e., for a graph with n nodes, there are exactly n edges connecting the nodes. A ring can be unidirectional, which means processors only communicate in one direction (a node could only send messages to the left or only send messages to the right), or bidirectional, meaning processors may transmit and receive messages in both directions (a node could send messages to the left and right).

Anonymous rings

A ring is said to be anonymous if every processor is identical. More formally, the system has the same state machine for every processor.[6] There is no deterministic algorithm to elect a leader in anonymous rings, even when the size of the network is known to the processes.[7][8] This is due to the fact that there is no possibility of breaking symmetry in an anonymous ring if all processes run at the same speed. The state of processors after some steps only depends on the initial state of neighbouring nodes. So, because their states are identical and execute the same procedures, in every round the same messages are sent by each processor. Therefore, each processor state also changes identically and as a result if one processor is elected as a leader, so are all the others.

Randomized (probabilistic) leader election

A common approach to solve the problem of leader election in anonymous rings is the use of probabilistic algorithms. In such approaches, generally processors assume some identities based on a probabilistic function and communicate it to the rest of the network. At the end, through the application of an algorithm, a leader is selected (with high probability).

Synchronous ring

Itai and Rodeh[9] introduced an algorithm for a unidirectional ring with synchronized processes. They assume the size of the ring (number of nodes) is known to the processes. For a ring of size n, a≤n processors are active. Each processor decides with probability of a^(-1) whether to become a candidate. At the end of each phase, each processor calculates the number of candidates c and if it is equal to 1, it becomes the leader. To determine the value of c, each candidate sends a token (pebble) at the start of the phase which is passed around the ring, returning after exactly n time units to its sender. Every processor determines c by counting the number of pebbles which passed through.
This algorithm achieves leader election with expected message complexity of O(nlogn). A similar approach is also used in which a time-out mechanism is employed to detect deadlocks in the system.[10] There are also algorithms for rings of special sizes such as prime size[11][12] and odd size.[13]

Asynchronous ring

In the case of asynchronous systems, the problem of leader election becomes more difficult. This is because of the uncertainty introduced by the arbitrary response time of processors due to the lack of a global clock. To tackle this problem, there are various approaches introduced. For instance, Itai and Rodeh extended their algorithm by adding a message buffer and wake up messages to trigger the computation in processes.

Uniform algorithm

In typical approaches to leader election, the size of the ring is assumed to be known to the processes. In the case of anonymous rings, without using an external entity, it is not possible to elect a leader. Even assuming an algorithm exists, the leader could not estimate the size of the ring. i.e. in any anonymous ring, there is a positive probability that an algorithm computes a wrong ring size.[14] To overcome this problem, Fisher and Jiang used a so-called leader oracle Ω? that each processor can ask whether there is a unique leader. They show that from some point upward, it is guaranteed to return the same answer to all processes.[15]

Rings with unique IDs

In one of the early works, Chang and Roberts[16] proposed a uniform algorithm in which a processor with the highest ID is selected as the leader. Each processor sends its ID in a clockwise direction. A process receiving a message and compares it with its own. If it is bigger, it passes it through, otherwise it will discard the message. They show that this algorithm uses at most O(n^2) messages and O(nlogn) in the average case.
Hirschberg and Sinclair[17] improved this algorithm with O(nlogn) message complexity by introducing a 2 directional message passing scheme allowing the processors to send messages in both directions.

Leader Election in Mesh

Mesh network topology. Red nodes denote corners, blue border and gray interior.

The mesh is another popular form of network topology specially in parallel systems, redundant memory systems and interconnection networks.[18]
In a mesh structure, nodes are either corner (only two neighbours), border (only three neighbours) or interior (with four neighbours). The number of edges in a mesh of size a x b is m=2ab-a-b.

Unoriented mesh

A typical algorithm to solve the leader election in an unoriented mesh is to only elect one of the four corner nodes as the leader. Since the corner nodes might not be aware of the state of other processes, the algorithm should first wake up the corner nodes. A leader can be elected as follows.[19]

  1. Wake-up process: in which k nodes initiate the election process. Each initiator sends a wake-up message to all its neighbouring nodes. If a node is not initiator, it simply forwards the messages to the other nodes. In this stage at most 3n+k messages are sent.
  2. Election process: the election in outer ring takes two stages at most with 6(a+b)-16 messages.
  3. Termination: leader sends a terminating message to all nodes. This requires at most 2n messages.

The message complexity is at most 6(a+b)-16 and if the mesh is square shaped O(√n).

Oriented Mesh

An oriented mesh is a special case where are port numbers are compass labels, i.e. north, south, east and west. Leader election in an oriented mesh is trivial. We only need to nominate a corner, e.g. “north” and “east” and make sure that node knows it is a leader.

Torus

Torus network structure.

A special case of mesh architecture is a torus which is a mesh with “wrap-around”. In this structure, every node has exactly 4 connecting edges. One approach to elect a leader in such a structure is known as electoral stages. Similar to procedures in ring structures, this method in each stage eliminates potential candidates until eventually one candidate node is left. This node becomes the leader and then notifies all other processes of termination.[20] This approach can be used to achieve a complexity of O(n). There also more practical approaches introduced for dealing with presence of faulty links in the network.[21][22]

Election in Hypercubes

H_4 hypercube network topology.

A Hypercube H_k is a network consisting of n=2^k nodes, each with degree of k and O(n log n) edges. A similar electoral stages as before can be used to solve the problem of leader election. In each stage two nodes (called duelists) compete and the winner is promoted to the next stage. This means in each stage only half of the duelists enter the next stage. This procedure continues until only one duelist is left, and it becomes the leader. Once selected, it notifies all other processes. This algorithm requires O(n) messages. In the case of unoriented hypercubes, a similar approach can be used but with a higher message complexity of O(nloglogn).[23]

Election in complete networks

Complete network structure.

Complete networks are structures in which all processes are connected to one another, i.e., the degree of each node is n-1, n being the size of the network. An optimal solution with O(n) message and space complexity is known.[24] In this algorithm, processes have the following states:

  1. Dummy: nodes that do not participate in the leader election algorithm.
  2. Passive: the initial state of processes before start.
  3. Candidate: the status of nodes after waking up. The candidate nodes will be considered to become the leader.

To elect a leader, a virtual ring is considered in the network. All processors initially start in a passive state until they are woken up. Once the nodes are awake, they are candidates to become the leader. Based on a priority scheme, candidate nodes collaborate in the virtual ring. At some point, candidates become aware of the identity of candidates that precede them in the ring. The higher priority candidates ask the lower ones about their predecessors. The candidates with lower priority become dummies after replying to the candidates with higher priority. Based on this scheme, the highest priority candidate eventually knows that all nodes in the system are dummies except itself, at which point it knows it is the leader.

Universal leader election techniques

As the name implies, these algorithms are designed to be used in every form of process networks without any prior knowledge of the topology of a network or its properties such its size.[25]

Mega-Merger

This technique in essence is similar to finding a Minimum Spanning Tree (MST) in which the root of the tree becomes the leader. The basic idea in this method is individual nodes merge with each other to form bigger structures. The result of this algorithm is a tree (a graph with no cycle) whose root is the leader of entire system. The cost of mega-merger method is O(m+nlogn) where m is the number of edges and n is the number of nodes.

YO-YO

An example of YO-YO procedure. a) The network, b) Oriented network after setup phase, c) YO- phase in which source values are passed, d)-YO phase sending responses from sinks, e) updated structure after -YO phase.

YO-YO is a minimum finding algorithm consisting of two parts: a preprocessing phase and a series of iterations.[26] In the first phase or setup, each node exchanges its id with all its neighbours and based on the value it orients its incident edges. For instance, if node x has a smaller id than y, x orients towards y. If a node has a smaller id than all its neighbours it becomes a source. In contrast, a node with all inward edges (i.e., with id larger than all of its neighbours) is a sink. All other nodes are internal nodes.
Once all the edges are oriented, the iteration phase starts. Each iteration is an electoral stage in which some candidates will be removed. Each iteration has two phases: YO- and –YO. In this phase sources start the process to propagate to each sink the smallest values of the sources connected to that sink.

YO-

  1. A source (local minima) transmits its value to all its out-neighbours
  2. An internal node waits to receive a value from all its in-neighbours. It calculates the minimum and sends it to out-neighbour.
  3. A sink(a node with no outgoing edge) receives all the values and compute their minimum.

-YO

  1. A sink sends YES to neighbours from which saw the smallest value and NO to others
  2. An internal node sends YES to all in-neighbours from which it received the smallest value and NO to others. If it receives only one NO, it sends NO to all.
  3. A source waits until it receives all votes. If all YES, it survives and if not, it is no longer a candidate.
  4. When a node x sends NO to an in-neighbour y, the logical direction of that edge is reversed.
  5. When a node y receives NO from an out-neighbour, it flips the direction of that link.

After the final stage, any source who receives a NO is no longer a source and becomes a sink. An additional stage, pruning, also is introduced to remove the nodes that are useless, i.e. their existence has no impact on the next iterations.

  1. If a sink is leaf, then it is useless and therefore is removed.
  2. If, in the YO- phase the same value is received by a node from more than one in-neighbour, it will ask all but one to remove the link connecting them.

This method has a total cost of O(mlogn) messages. Its real message complexity including pruning is an open research problem and is unknown.

Applications

Radio networks

In radio network protocols, leader election is often used as a first step to approach more advanced communication primitives, such as message gathering or broadcasts.[27] The very nature of wireless networks induces collisions when adjacent nodes transmit at the same time; electing a leader allows to better coordinate this process. While the diameter D of a network is a natural lower bound for the time needed to elect a leader, upper and lower bounds for the leader election problem depend on the specific radio model studied.

Models and runtime

In radio networks, the n nodes may in every round choose to either transmit or receive a message. If no collision detection is available, then a node cannot distinguish between silence or receiving more than one message at a time. Should collision detection be available, then a node may detect more than one incoming message at the same time, even though the messages itself cannot be decoded in that case. In the beeping model, nodes can only distinguish between silence or at least one message via carrier sensing.

Known runtimes for single-hop networks range from a constant (expected with collision detection) to O(n log n) rounds (deterministic and no collision detection). In multi-hop networks, known runtimes differ from roughly O((D+ log n)(log² log n)) rounds (with high probability in the beeping model), O(D log n) (deterministic in the beeping model), O(n) (deterministic with collision detection) to O(n log3/2 n (log log n)0.5) rounds (deterministic and no collision detection).

See also

References

  1. R. G. Gallager, P. A. Humblet, and P. M. Spira (January 1983). "A Distributed Algorithm for Minimum-Weight Spanning Trees" (PDF). ACM Transactions on Programming Languages and Systems. 5 (1): 66–77. doi:10.1145/357195.357200.
  2. Ephraim Korach, Shay Kutten, Shlomo Moran (1990). "A Modular Technique for the Design of Efficient Distributed Leader Finding Algorithms". ACM Transactions on Programming Languages and Systems. 12 (1): 84–101. doi:10.1145/77606.77610.
  3. H. Attiya and J. Welch, Distributed Computing: Fundamentals, Simulations and Advance Topics, John Wiley & Sons inc., 2004, chap. 3
  4. I. Gupta, R. van Renesse, and K. P. Birman,2000, A Probabilistically Correct Leader Election Protocol for Large Groups, Technical Report , Cornell University
  5. R. Bakhshi, W. Fokkink, J. pang, and J. Van de Pol,c2008 "Leader Election in Anonymous Rings:Franklin Goes Probabilistic", TCS, Vol. 273, pp. 57-72.
  6. H. Attiya and J. Welch, Distributed Computing: Fundamentals, Simulations and Advance Topics, John Wiley & Sons inc., 2004, chap. 3
  7. H. Attiya and J. Welch, Distributed Computing: Fundamentals, Simulations and Advance Topics, John Wiley & Sons inc., 2004, chap. 3
  8. H. Attiya and M. Snir, 1988,"Computing on an anonymous ring",JACM,Vol. 35, issue. 4, pp. 845-875
  9. A. Itai and M. Rodeh, 1990,"Symmetry breaking in distributed networks", Vol. 88, issue 1, pp. 60-87.
  10. L. Higham and S. Myers, 1998, "Self-Stabilizing Token Circulation on Anonymous Message Passing Rings", Second International Conference On Principles Of DIstributed Systems.
  11. G. Itkis, C. Lin, and J. Simon,1995,"Deterministic, constant space, self-stabilizing leader election on uniform rings.", In Proc. 9th Workshop on Distributed Algorithms, Vol. 972, pp. 288-302.
  12. J. Burns and J. Pachl,1989,"Uniform self-stabilizing rings",ACM Trans. Program. Lang. Systems, Vol. 11, issue. 2, pp.330-344
  13. T. Herman, 1990, "Probabilistic self-stabilization", Inf. Process. Lett., Vol. 35, issue 2, pp.63-67.
  14. G. Tel,Introduction to Distributed Algorithms. Cambridge University Press, 2000.2nd edition
  15. M. Fischer and H. Jiang, 2006,"Self-stabilizing leader election in networks of _nite-state anonymous agents", In Proc. 10th Conf. on Principles of Distributed Systems,Vol. 4305, pp. 395-409.
  16. E. Chang and R. Roberts, 1979, "An improved algorithm for decentralized extrema-finding in circular configurations of processes", ACM, Vol. 22, issue 5, pp. 281-283.
  17. D. S. Hirschberg and J. B. Sinclair, 1980, "Decentralized extrema-finding in circular configurations of processors", ACM, Vol. 23, issue 11, pp. 627-628.
  18. N. Santoro, Design and Analysis of Distributed Algorithms, Wiley, 2006.
  19. H. Kallasjoki, 2007, "Election in Mesh, Cube and Complete Networks", Seminar on Theoretical Computer Science.
  20. N. Santoro, Design and Analysis of Distributed Algorithms, Wiley, 2006.
  21. M. Refai, A. Sharieh and . Alsmmari, 2010, "Leader Election Algorithm in 2D Torus Network with the Presence of One Link Failure", The International Arab Journal of Information Technology, Vol. 7, No. 2.
  22. M Al Refai,2014, "Dynamic Leader Election Algorithm in 2D Torus Network with Multi Links Failure", IJCST, Vol. 2, issue 5.
  23. N. Santoro, Design and Analysis of Distributed Algorithms, Wiley, 2006.
  24. J. Villadangos, A. Cordoba, F. Farina, and M. Prieto, 2005, "Efficient leader election in complete networks", PDP, pp.136-143.
  25. N. Santoro, Design and Analysis of Distributed Algorithms, Wiley, 2006.
  26. N. Santoro, Design and Analysis of Distributed Algorithms, Wiley, 2006.
  27. Haeupler, Bernhard; Ghaffari, Mohsen (2013). "Near Optimal Leader Election in Multi-Hop Radio Networks". Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithm. doi:10.1137/1.9781611973105.54.
This article is issued from Wikipedia - version of the 11/28/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.