A fault-tolerant distributed coordination system developed in Java using a ring-based algorithm. A single token is passed among the nodes to detect faults, and the system recovers from the faults it detects.
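To make the ring idea concrete, here is a minimal in-memory sketch of token passing that skips crashed nodes and lets them re-join. It is an illustration under assumptions, not the project's actual code: the names `TokenRing`, `Node`, `crash`, `rejoin`, and `passToken` are hypothetical, a crash is simulated with a boolean flag rather than a network timeout, and the case where the token holder itself crashes (token loss) is not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a ring of department nodes held in memory.
class TokenRing {
    // Stand-in for a department node; 'alive' simulates crash faults.
    static final class Node {
        final String name;
        boolean alive = true;
        Node(String name) { this.name = name; }
    }

    private final List<Node> ring = new ArrayList<>();
    private int holder = 0; // index of the node currently holding the token

    TokenRing(List<String> names) {
        for (String n : names) {
            ring.add(new Node(n));
        }
    }

    // Simulate a crash fault: the node stops accepting the token.
    void crash(String name) { find(name).alive = false; }

    // Simulate recovery: the node keeps its old ring position and accepts the token again.
    void rejoin(String name) { find(name).alive = true; }

    private Node find(String name) {
        for (Node n : ring) {
            if (n.name.equals(name)) {
                return n;
            }
        }
        throw new IllegalArgumentException("unknown node: " + name);
    }

    // Pass the token to the next live node in ring order, skipping crashed
    // successors. Returns the name of the new token holder.
    String passToken() {
        for (int step = 1; step <= ring.size(); step++) {
            int candidate = (holder + step) % ring.size();
            if (ring.get(candidate).alive) {
                holder = candidate;
                return ring.get(candidate).name;
            }
        }
        throw new IllegalStateException("no live node left in the ring");
    }
}
```

In a real deployment the "is the successor alive?" check would typically be a send with a timeout or acknowledgement; the flag here only stands in for that decision.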
The objective of this assignment was to combine the concepts studied throughout the lectures, including distributed coordination, communication, and fault tolerance, in the design of a simple distributed system. We consider the reservation of the University Auditorium by various departments. When a client (an instructor) wants to reserve the auditorium for a specific day, they contact their department. Each department node, in turn, tries to update a shared data structure, Schedule, when servicing the client request. The system is capable of detecting and recovering from crash faults that can affect department nodes, through appropriate communication and re-organization actions. In addition, when a (faulty) department node becomes ready to re-join the group, the system allows it to do so.
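As a rough illustration of the shared Schedule, the sketch below maps each day to the department that reserved it and rejects a second reservation for the same day. The class layout, the method names `reserve` and `reservedBy`, and the use of `synchronized` for mutual exclusion are all assumptions made for this example; the description above does not say how the project serializes updates across department nodes.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the shared Schedule: one reservation per day.
class Schedule {
    // day -> name of the department holding the reservation
    private final Map<LocalDate, String> reservations = new HashMap<>();

    // Try to reserve the auditorium for 'day' on behalf of 'department'.
    // Returns true if the day was free, false if it was already taken.
    // 'synchronized' is an assumed local stand-in for distributed coordination.
    synchronized boolean reserve(LocalDate day, String department) {
        return reservations.putIfAbsent(day, department) == null;
    }

    // Returns the department that reserved 'day', or null if the day is free.
    synchronized String reservedBy(LocalDate day) {
        return reservations.get(day);
    }
}
```

A department node servicing a client request would call `reserve` and report success or failure back to the instructor.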