Package org.jgroups.protocols
Class FD
- java.lang.Object
-
- org.jgroups.stack.Protocol
-
- org.jgroups.protocols.FD
-
public class FD extends Protocol
Failure detection based on a simple heartbeat protocol. Regularly polls members for liveness. Multicasts SUSPECT messages when a member is not reachable. The simple algorithm works as follows: the membership is known and ordered. Each HB protocol periodically sends an 'are-you-alive' message to its *neighbor*. A neighbor is the next in rank in the membership list, which is recomputed upon a view change. When a response hasn't been received for n milliseconds and m tries, the corresponding member is suspected (and eventually excluded if faulty). FD starts when it detects (in a view change notification) that there are at least 2 members in the group. It stops running when the membership drops below 2.
When a message is received from the monitored neighbor member, it causes the pinger thread to 'skip' sending the next are-you-alive message. Thus, traffic is reduced.
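The timeout, retry, and skip behaviour described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the FD implementation: the class `HeartbeatMonitorSketch` and all of its methods are hypothetical, time is passed in explicitly instead of being read from a clock, and the exact thresholds used by the real protocol may differ.

```java
// Illustrative stand-in, not the JGroups implementation: every name here is hypothetical.
final class HeartbeatMonitorSketch {
    private final long timeoutMs;        // the "n milliseconds" from the description
    private final int maxTries;          // the "m tries" from the description
    private long lastAck;                // time the neighbor was last heard from
    private int numTries;                // consecutive timeout checks without a response
    private boolean heardSinceLastPing;  // set when any message arrives from the neighbor

    HeartbeatMonitorSketch(long timeoutMs, int maxTries, long now) {
        this.timeoutMs = timeoutMs;
        this.maxTries = maxTries;
        this.lastAck = now;
    }

    /** Any message from the monitored neighbor counts as proof of liveness. */
    void onMessageFromNeighbor(long now) {
        lastAck = now;
        numTries = 0;
        heardSinceLastPing = true;
    }

    /** Sender side: returns false when the next are-you-alive message can be skipped. */
    boolean shouldSendHeartbeat() {
        boolean send = !heardSinceLastPing;
        heardSinceLastPing = false;
        return send;
    }

    /** Checker side: returns true once the neighbor should be suspected. */
    boolean checkTimeout(long now) {
        if (now - lastAck < timeoutMs)
            return false;                // heard from the neighbor recently enough
        numTries++;
        return numTries >= maxTries;
    }

    public static void main(String[] args) {
        HeartbeatMonitorSketch fd = new HeartbeatMonitorSketch(1000, 2, 0);
        System.out.println(fd.checkTimeout(1500)); // false: first miss
        System.out.println(fd.checkTimeout(2500)); // true: second miss reaches maxTries
    }
}
```

Resetting the counter on any traffic from the neighbor, not only on heartbeat responses, is what allows a busy group to send fewer are-you-alive messages.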
- Author:
- Bela Ban
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
protected class | FD.Broadcaster | Task that periodically broadcasts a list of suspected members to the group.
protected class | FD.BroadcastTask |
static class | FD.FdHeader |
protected class | FD.HeartbeatSender |
class | FD.TimeoutChecker | Task which periodically checks if the last_ack from ping_dest exceeded timeout and, if so, broadcasts a SUSPECT message
-
Field Summary
Fields
Modifier and Type | Field | Description
protected FD.Broadcaster | bcast_task | Transmits SUSPECT message until view change or UNSUSPECT is received
protected java.util.concurrent.Future<?> | heartbeat_sender_future |
protected long | last_ack |
protected Address | local_addr |
protected java.util.concurrent.locks.Lock | lock |
protected int | max_tries |
protected java.util.List<Address> | members |
protected int | num_heartbeats |
protected int | num_suspect_events |
protected java.util.concurrent.atomic.AtomicInteger | num_tries |
protected Address | ping_dest |
protected java.util.List<Address> | pingable_mbrs | Members from which we select ping_dest.
protected BoundedList<java.lang.String> | suspect_history |
protected long | timeout |
protected java.util.concurrent.Future<?> | timeout_checker_future |
protected TimeScheduler | timer |
-
Fields inherited from class org.jgroups.stack.Protocol
after_creation_hook, down_prot, ergonomics, id, log, stack, stats, up_prot
-
-
Constructor Summary
Constructors
Constructor | Description
FD() |
-
Method Summary
Modifier and Type | Method | Description
protected void | computePingDest(Address remove) | Computes pingable_mbrs (based on the current membership and the suspected members) and ping_dest
java.lang.Object | down(Event evt) | An event is to be sent down the stack.
int | getCurrentNumTries() |
java.lang.String | getLocalAddress() |
int | getMaxTries() |
java.lang.String | getMembers() |
int | getNumberOfHeartbeatsSent() |
int | getNumSuspectEventsGenerated() |
java.lang.String | getPingableMembers() |
java.lang.String | getPingDest() |
protected Address | getPingDest(java.util.List<Address> mbrs) |
long | getTimeout() |
boolean | isMonitorRunning() |
java.lang.String | printSuspectHistory() |
void | resetStats() |
protected void | sendHeartbeatResponse(Address dest) |
void | setMaxTries(int max_tries) |
void | setTimeout(long timeout) |
void | start() | This method is called on a JChannel.connect(String).
void | startFailureDetection() |
protected void | startMonitor() | Requires lock to be held by caller
void | stop() | This method is called on a JChannel.disconnect().
void | stopFailureDetection() |
protected void | stopMonitor() | Requires lock to be held by caller
protected void | unsuspect(Address mbr) |
java.lang.Object | up(Message msg) | A single message was received.
void | up(MessageBatch batch) | Sends up multiple messages in a MessageBatch.
protected void | updateTimestamp(Address sender) |
-
Methods inherited from class org.jgroups.stack.Protocol
accept, afterCreationHook, destroy, down, enableStats, getConfigurableObjects, getDownProtocol, getDownServices, getId, getIdsAbove, getLevel, getLog, getName, getProtocolStack, getSocketFactory, getThreadFactory, getTransport, getUpProtocol, getUpServices, getValue, init, isErgonomics, level, parse, providedDownServices, providedUpServices, requiredDownServices, requiredUpServices, resetStatistics, setDownProtocol, setErgonomics, setId, setLevel, setProtocolStack, setSocketFactory, setUpProtocol, setValue, statsEnabled, up
-
-
-
-
Field Detail
-
timeout
protected long timeout
-
max_tries
protected int max_tries
-
num_heartbeats
protected int num_heartbeats
-
num_suspect_events
protected int num_suspect_events
-
suspect_history
protected final BoundedList<java.lang.String> suspect_history
-
local_addr
protected Address local_addr
-
last_ack
protected volatile long last_ack
-
num_tries
protected final java.util.concurrent.atomic.AtomicInteger num_tries
-
lock
protected final java.util.concurrent.locks.Lock lock
-
ping_dest
protected volatile Address ping_dest
-
members
protected final java.util.List<Address> members
-
pingable_mbrs
protected final java.util.List<Address> pingable_mbrs
Members from which we select ping_dest. Copy of members minus the suspected members
-
timer
protected TimeScheduler timer
-
timeout_checker_future
protected java.util.concurrent.Future<?> timeout_checker_future
-
heartbeat_sender_future
protected java.util.concurrent.Future<?> heartbeat_sender_future
-
bcast_task
protected final FD.Broadcaster bcast_task
Transmits SUSPECT message until view change or UNSUSPECT is received
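As a rough illustration of "until view change or UNSUSPECT is received", the bookkeeping behind such a task might look like the sketch below. This is an assumption-laden model, not the code of FD.Broadcaster: the class `SuspectBroadcasterSketch` and its methods are hypothetical, strings stand in for Address, and the assumption that a view change keeps only suspects still present in the new view is mine.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the suspect bookkeeping; not the FD.Broadcaster source.
final class SuspectBroadcasterSketch {
    private final Set<String> suspects = new LinkedHashSet<>();

    /** Start (or keep) announcing this member as suspected. */
    void suspect(String member) { suspects.add(member); }

    /** An UNSUSPECT for this member arrived: stop announcing it. */
    void unsuspect(String member) { suspects.remove(member); }

    /** Assumed behaviour: members no longer in the view need no further SUSPECT messages. */
    void viewChanged(List<String> newMembers) { suspects.retainAll(newMembers); }

    /** The periodic task only needs to keep running while somebody is suspected. */
    boolean isActive() { return !suspects.isEmpty(); }

    /** The list one periodic run would broadcast to the group. */
    List<String> nextBroadcast() { return new ArrayList<>(suspects); }
}
```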
-
-
Method Detail
-
getLocalAddress
public java.lang.String getLocalAddress()
-
getMembers
public java.lang.String getMembers()
-
getPingableMembers
public java.lang.String getPingableMembers()
-
getPingDest
public java.lang.String getPingDest()
-
getNumberOfHeartbeatsSent
public int getNumberOfHeartbeatsSent()
-
getNumSuspectEventsGenerated
public int getNumSuspectEventsGenerated()
-
getTimeout
public long getTimeout()
-
setTimeout
public void setTimeout(long timeout)
-
getMaxTries
public int getMaxTries()
-
setMaxTries
public void setMaxTries(int max_tries)
-
getCurrentNumTries
public int getCurrentNumTries()
-
printSuspectHistory
public java.lang.String printSuspectHistory()
-
resetStats
public void resetStats()
- Overrides:
resetStats in class Protocol
-
start
public void start() throws java.lang.Exception
Description copied from class: Protocol
This method is called on a JChannel.connect(String). Starts work. Protocols are connected and queues are ready to receive events. Will be called from bottom to top. This call will replace the START and START_OK events.
- Overrides:
start in class Protocol
- Throws:
java.lang.Exception - Thrown if protocol cannot be started successfully. This will cause the ProtocolStack to fail, so JChannel.connect(String) will throw an exception
-
stop
public void stop()
Description copied from class: Protocol
This method is called on a JChannel.disconnect(). Stops work (e.g. by closing multicast socket). Will be called from top to bottom. This means that at the time of the method invocation the neighbor protocol below is still working. This method will replace the STOP, STOP_OK, CLEANUP and CLEANUP_OK events. The ProtocolStack guarantees that when this method is called all messages in the down queue will have been flushed.
-
stopFailureDetection
public void stopFailureDetection()
-
startFailureDetection
public void startFailureDetection()
-
startMonitor
protected void startMonitor()
Requires lock to be held by caller
-
stopMonitor
protected void stopMonitor()
Requires lock to be held by caller
-
isMonitorRunning
public boolean isMonitorRunning()
-
up
public java.lang.Object up(Message msg)
Description copied from class: Protocol
A single message was received. Protocols may examine the message and do something (e.g. add a header) with it before passing it up.
-
up
public void up(MessageBatch batch)
Description copied from class: Protocol
Sends up multiple messages in a MessageBatch. The sender of the batch is always the same, and so is the destination (null == multicast messages). Messages in a batch can be OOB messages, regular messages, or mixed messages, although the transport itself will create initial MessageBatches that contain only either OOB or regular messages. The default processing below sends messages up the stack individually, based on a matching criterion (calling Protocol.accept(org.jgroups.Message)), and, if true, calls Protocol.up(org.jgroups.Event) for that message and removes the message. If the batch is not empty, it is passed up, or else it is dropped. Subclasses should check if there are any messages destined for them (e.g. using MessageBatch.getMatchingMessages(short,boolean)), then possibly remove and process them and finally pass the batch up to the next protocol. Protocols can also modify messages in place, e.g. ENCRYPT could decrypt all encrypted messages in the batch, not remove them, and pass the batch up when done.
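The "remove what is yours, pass up the rest" pattern described for subclasses can be sketched without the JGroups types. In this hedged example a plain `List` stands in for MessageBatch and the nested `Msg` class stands in for Message; `BatchSketch`, `extractMatching`, and `handle` are hypothetical names and do not mirror the real method signatures.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Stand-in types only: a List plays the role of MessageBatch, Msg the role of Message.
final class BatchSketch {
    static final class Msg {
        final short headerId;   // id of the protocol whose header the message carries
        final String payload;
        Msg(short headerId, String payload) {
            this.headerId = headerId;
            this.payload = payload;
        }
    }

    /** Removes and returns the messages carrying the given protocol id; others stay in the batch. */
    static List<Msg> extractMatching(List<Msg> batch, short id) {
        List<Msg> matched = new ArrayList<>();
        for (Iterator<Msg> it = batch.iterator(); it.hasNext();) {
            Msg m = it.next();
            if (m.headerId == id) {
                matched.add(m);
                it.remove();
            }
        }
        return matched;
    }

    /**
     * Processes this protocol's messages (here: records their payloads) and
     * returns true if the remaining batch is non-empty and should be passed up.
     */
    static boolean handle(List<Msg> batch, short id, List<String> processed) {
        for (Msg m : extractMatching(batch, id))
            processed.add(m.payload);
        return !batch.isEmpty();
    }
}
```

Returning a flag instead of calling the next protocol directly keeps the sketch free of stack wiring; in a real protocol the non-empty batch would be handed to the protocol above.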
-
down
public java.lang.Object down(Event evt)
Description copied from class: Protocol
An event is to be sent down the stack. A protocol may want to examine its type and perform some action on it, depending on the event's type. If the event is a message MSG, then the protocol may need to add a header to it (or do nothing at all) before sending it down the stack using down_prot.down().
-
sendHeartbeatResponse
protected void sendHeartbeatResponse(Address dest)
-
unsuspect
protected void unsuspect(Address mbr)
-
updateTimestamp
protected void updateTimestamp(Address sender)
-
computePingDest
protected void computePingDest(Address remove)
Computes pingable_mbrs (based on the current membership and the suspected members) and ping_dest
- Parameters:
remove - The member to be removed from pingable_mbrs
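A minimal sketch of this computation, assuming that the ping destination is the member ranked directly after the local address in the pingable list, wrapping around at the end, as the class description suggests. Strings stand in for Address, and `PingDestSketch`, `pingable`, and `pingDest` are hypothetical names, not FD methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; strings stand in for org.jgroups.Address.
final class PingDestSketch {

    /** Copy of the membership minus the suspected members, preserving rank order. */
    static List<String> pingable(List<String> members, List<String> suspected) {
        List<String> result = new ArrayList<>(members);
        result.removeAll(suspected);
        return result;
    }

    /** Next member after `local` in rank order, wrapping around; null if there is nobody to ping. */
    static String pingDest(List<String> pingable, String local) {
        int idx = pingable.indexOf(local);
        if (idx < 0 || pingable.size() < 2)
            return null;
        return pingable.get((idx + 1) % pingable.size());
    }

    public static void main(String[] args) {
        List<String> members = List.of("A", "B", "C", "D");
        List<String> candidates = pingable(members, List.of("B"));
        System.out.println(pingDest(candidates, "A")); // C, because B is suspected and skipped
    }
}
```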
-
-