org.jgroups.protocols
Class FD

java.lang.Object
  extended by org.jgroups.stack.Protocol
      extended by org.jgroups.protocols.FD
Direct Known Subclasses:
FD_ICMP, FD_PING

public class FD
extends Protocol

Failure detection based on a simple heartbeat protocol. Regularly polls members for liveness and multicasts SUSPECT messages when a member is not reachable. The algorithm works as follows: the membership is known and ordered. Each heartbeat (HB) protocol instance periodically sends an 'are-you-alive' message to its *neighbor*. A neighbor is the next in rank in the membership list, which is recomputed upon a view change. When no response has been received for n milliseconds and after m tries, the corresponding member is suspected (and eventually excluded if faulty).
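
The neighbor selection described above can be illustrated with a minimal sketch. It is not the FD source: String stands in for Address, the class and method names are hypothetical, and wrapping from the last member back to the first is an assumption about what "next in rank" means at the end of the list.

```java
import java.util.List;

// Illustrative only: String stands in for org.jgroups.Address, and the
// class and method names are hypothetical, not part of JGroups.
final class NeighborSketch {

    /**
     * Returns the member that follows self in the ordered membership list.
     * Assumption: the last member wraps around to the first one.
     * Returns null if there is nobody else to monitor.
     */
    static String neighborOf(List<String> members, String self) {
        if (members == null || members.size() < 2) {
            return null; // a single member has no neighbor
        }
        int idx = members.indexOf(self);
        if (idx < 0) {
            return null; // self is not part of the membership
        }
        return members.get((idx + 1) % members.size());
    }
}
```

With the ordered membership A, B, C, member A would monitor B, and under the wrap-around assumption C would monitor A.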

FD starts when it detects (in a view change notification) that there are at least 2 members in the group. It stops running when the membership drops below 2.
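
The start/stop rule can be sketched as a single check applied on every view change. The class below is a hypothetical stand-in, not JGroups API, and uses String in place of Address.

```java
import java.util.List;

// Hypothetical sketch; the names below are not part of JGroups.
final class DetectionSwitchSketch {
    private boolean running;

    /** Applies a new view: run with at least 2 members, stop otherwise. */
    void onViewChange(List<String> members) {
        running = members.size() >= 2;
    }

    boolean isRunning() {
        return running;
    }
}
```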

When a message is received from the monitored neighbor member, it causes the pinger thread to 'skip' sending the next are-you-alive message. Thus, traffic is reduced.
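
The interplay of timeout, tries and traffic from the neighbor can be sketched as follows. This is a simplified model under stated assumptions, not the FD implementation: the class, the Action enum and the explicit clock parameter are hypothetical, and it assumes that a message from the neighbor refreshes the last acknowledgement time and resets the try counter.

```java
// Hypothetical model of the periodic check; not the JGroups implementation.
final class MonitorSketch {
    enum Action { SKIP, PING, SUSPECT }

    private final long timeoutMs;
    private final int maxTries;
    private long lastAck;              // time of the last sign of life from the neighbor
    private int numTries;              // periodic checks that found the neighbor overdue
    private boolean heardFromNeighbor; // set when the neighbor sent us any message

    MonitorSketch(long timeoutMs, int maxTries, long nowMs) {
        this.timeoutMs = timeoutMs;
        this.maxTries = maxTries;
        this.lastAck = nowMs;
    }

    /** Assumption: any message from the monitored neighbor counts as proof of liveness. */
    void onMessageFromNeighbor(long nowMs) {
        lastAck = nowMs;
        numTries = 0;
        heardFromNeighbor = true;
    }

    /** Called once per period; decides what the pinger does next. */
    Action tick(long nowMs) {
        if (heardFromNeighbor) {
            heardFromNeighbor = false;
            return Action.SKIP; // neighbor was just heard from, save one are-you-alive message
        }
        if (nowMs - lastAck > timeoutMs && ++numTries >= maxTries) {
            return Action.SUSPECT; // overdue for maxTries checks in a row
        }
        return Action.PING;
    }
}
```

Passing the current time in explicitly keeps the sketch deterministic; a real monitor would read the system clock and run on a timer.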

Author:
Bela Ban

Nested Class Summary
protected  class FD.Broadcaster
          Task that periodically broadcasts a list of suspected members to the group.
protected  class FD.BroadcastTask
           
static class FD.FdHeader
           
protected  class FD.Monitor
          Task which periodically checks whether the last_ack from ping_dest exceeded timeout and, if so, broadcasts a SUSPECT message
 
Field Summary
protected  FD.Broadcaster bcast_task
          Transmits SUSPECT message until view change or UNSUSPECT is received
protected  long last_ack
           
protected  Address local_addr
           
protected  java.util.concurrent.locks.Lock lock
           
protected  int max_tries
           
protected  java.util.List<Address> members
           
protected  java.util.concurrent.Future<?> monitor_future
           
protected  int num_heartbeats
           
protected  int num_suspect_events
           
protected  java.util.concurrent.atomic.AtomicInteger num_tries
           
protected  Address ping_dest
           
protected  java.util.List<Address> pingable_mbrs
          Members from which we select ping_dest.
protected  BoundedList<Address> suspect_history
           
protected  long timeout
           
protected  TimeScheduler timer
           
 
Fields inherited from class org.jgroups.stack.Protocol
down_prot, ergonomics, id, log, name, stack, stats, up_prot
 
Constructor Summary
FD()
           
 
Method Summary
protected  void computePingDest(Address remove)
          Computes pingable_mbrs (based on the current membership and the suspected members) and ping_dest
 java.lang.Object down(Event evt)
          An event is to be sent down the stack.
 int getCurrentNumTries()
           
 java.lang.String getLocalAddress()
           
 int getMaxTries()
           
 java.lang.String getMembers()
           
 int getNumberOfHeartbeatsSent()
           
 int getNumSuspectEventsGenerated()
           
 java.lang.String getPingableMembers()
           
 java.lang.String getPingDest()
           
protected  Address getPingDest(java.util.List<Address> mbrs)
           
 long getTimeout()
           
 void init()
          Called after instance has been created (null constructor) and before protocol is started.
 boolean isMonitorRunning()
           
 java.lang.String printSuspectHistory()
           
 void resetStats()
           
protected  void sendHeartbeatResponse(Address dest)
           
 void setMaxTries(int max_tries)
           
 void setTimeout(long timeout)
           
 void startFailureDetection()
           
protected  void startMonitor()
          Requires lock to be held by caller
 void stop()
          This method is called on a Channel.disconnect().
 void stopFailureDetection()
           
protected  void stopMonitor()
          Requires lock to be held by caller
protected  void unsuspect(Address mbr)
           
 java.lang.Object up(Event evt)
          An event was received from the layer below.
protected  void updateTimestamp(Address sender)
           
 
Methods inherited from class org.jgroups.stack.Protocol
destroy, dumpStats, enableStats, getConfigurableObjects, getDownProtocol, getDownServices, getId, getIdsAbove, getLevel, getName, getProtocolStack, getSocketFactory, getThreadFactory, getTransport, getUpProtocol, getUpServices, getValue, isErgonomics, printStats, providedDownServices, providedUpServices, requiredDownServices, requiredUpServices, resetStatistics, setDownProtocol, setErgonomics, setId, setLevel, setProtocolStack, setSocketFactory, setUpProtocol, setValue, setValues, start, statsEnabled
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

timeout

protected long timeout

max_tries

protected int max_tries

num_heartbeats

protected int num_heartbeats

num_suspect_events

protected int num_suspect_events

suspect_history

protected final BoundedList<Address> suspect_history

local_addr

protected Address local_addr

last_ack

protected volatile long last_ack

num_tries

protected final java.util.concurrent.atomic.AtomicInteger num_tries

lock

protected final java.util.concurrent.locks.Lock lock

ping_dest

protected volatile Address ping_dest

members

protected final java.util.List<Address> members

pingable_mbrs

protected final java.util.List<Address> pingable_mbrs
Members from which we select ping_dest. This is a copy of members minus the suspected members.


timer

protected TimeScheduler timer

monitor_future

protected java.util.concurrent.Future<?> monitor_future

bcast_task

protected final FD.Broadcaster bcast_task
Transmits SUSPECT message until view change or UNSUSPECT is received

Constructor Detail

FD

public FD()
Method Detail

getLocalAddress

public java.lang.String getLocalAddress()

getMembers

public java.lang.String getMembers()

getPingableMembers

public java.lang.String getPingableMembers()

getPingDest

public java.lang.String getPingDest()

getNumberOfHeartbeatsSent

public int getNumberOfHeartbeatsSent()

getNumSuspectEventsGenerated

public int getNumSuspectEventsGenerated()

getTimeout

public long getTimeout()

setTimeout

public void setTimeout(long timeout)

getMaxTries

public int getMaxTries()

setMaxTries

public void setMaxTries(int max_tries)

getCurrentNumTries

public int getCurrentNumTries()

printSuspectHistory

public java.lang.String printSuspectHistory()

resetStats

public void resetStats()
Overrides:
resetStats in class Protocol

init

public void init()
          throws java.lang.Exception
Description copied from class: Protocol
Called after instance has been created (null constructor) and before protocol is started. Properties are already set. Other protocols are not yet connected and events cannot yet be sent.

Overrides:
init in class Protocol
Throws:
java.lang.Exception - Thrown if protocol cannot be initialized successfully. This will cause the ProtocolStack to fail, so the channel constructor will throw an exception.

stop

public void stop()
Description copied from class: Protocol
This method is called on a Channel.disconnect(). Stops work (e.g. by closing multicast socket). Will be called from top to bottom. This means that at the time of the method invocation the neighbor protocol below is still working. This method will replace the STOP, STOP_OK, CLEANUP and CLEANUP_OK events. The ProtocolStack guarantees that when this method is called all messages in the down queue will have been flushed.

Overrides:
stop in class Protocol

getPingDest

protected Address getPingDest(java.util.List<Address> mbrs)

stopFailureDetection

public void stopFailureDetection()

startFailureDetection

public void startFailureDetection()

startMonitor

protected void startMonitor()
Requires lock to be held by caller


stopMonitor

protected void stopMonitor()
Requires lock to be held by caller


isMonitorRunning

public boolean isMonitorRunning()

up

public java.lang.Object up(Event evt)
Description copied from class: Protocol
An event was received from the layer below. Usually the current layer will want to examine the event type and - depending on its type - perform some computation (e.g. removing headers from a MSG event type, or updating the internal membership list when receiving a VIEW_CHANGE event). Finally the event is either a) discarded, or b) an event is sent down the stack using down_prot.down() or c) the event (or another event) is sent up the stack using up_prot.up().

Overrides:
up in class Protocol

down

public java.lang.Object down(Event evt)
Description copied from class: Protocol
An event is to be sent down the stack. The layer may want to examine its type and perform some action on it, depending on the event's type. If the event is a message MSG, then the layer may need to add a header to it (or do nothing at all) before sending it down the stack using down_prot.down(). In case of a GET_ADDRESS event (which tries to retrieve the stack's address from one of the bottom layers), the layer may need to send a new response event back up the stack using up_prot.up().

Overrides:
down in class Protocol

sendHeartbeatResponse

protected void sendHeartbeatResponse(Address dest)

unsuspect

protected void unsuspect(Address mbr)

updateTimestamp

protected void updateTimestamp(Address sender)

computePingDest

protected void computePingDest(Address remove)
Computes pingable_mbrs (based on the current membership and the suspected members) and ping_dest

Parameters:
remove - The member to be removed from pingable_mbrs
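
A minimal sketch of the behavior described for computePingDest, assuming the ping destination is the member that follows the local address in pingable_mbrs, wrapping at the end of the list. The class is hypothetical, uses String in place of Address, and omits locking and monitor restarts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch with String addresses; not the JGroups implementation.
final class PingDestSketch {
    final List<String> pingableMbrs;
    final String localAddr;
    String pingDest;

    PingDestSketch(List<String> members, String localAddr) {
        this.pingableMbrs = new ArrayList<>(members); // copy of the membership
        this.localAddr = localAddr;
        computePingDest(null);
    }

    /** Drops remove (if non-null) from the pingable members, then re-selects the ping destination. */
    void computePingDest(String remove) {
        if (remove != null) {
            pingableMbrs.remove(remove);
        }
        int idx = pingableMbrs.indexOf(localAddr);
        // Assumption: next in rank after the local address, wrapping around.
        pingDest = (idx < 0 || pingableMbrs.size() < 2)
                ? null
                : pingableMbrs.get((idx + 1) % pingableMbrs.size());
    }
}
```

Starting from members A, B, C with local address A, the ping destination is B. Removing B moves it to C, and removing C as well leaves no destination.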


Copyright © 1998-2012 Bela Ban / Red Hat. All Rights Reserved.