Wednesday, July 16, 2014

Failure of a participating site

Handling the failure of a participating site in the Two-Phase Commit (2PC) protocol in a distributed database

Recall the messages exchanged between the TC (Transaction Coordinator) and the participating sites when a transaction T is committed in a Distributed Database.

<Prepare T> - This message is sent by the TC to the TMs (Transaction Managers) of all the participating sites. It instructs the participating sites to get ready to commit the transaction T.
<Ready T>   - This message is sent by the TM of a participating site to the TC if that site is ready to commit the transaction T.
<Abort T>    - This message indicates that the system which sends it is not ready to commit or cannot commit T. It can be sent by a participating site (by its TM) or by the initiating site (by the TC).
<Commit T> - This message instructs the system to commit, i.e., to permanently store the changes in the database.
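
The four messages above can be written down as a small Python sketch. The names Msg, ALLOWED_SENDERS and may_send are made up for this illustration and are not part of any library; the sender of <Commit T> is taken to be the TC, as the recovery cases later in this post assume.

```python
from enum import Enum


class Msg(Enum):
    """The four 2PC control messages listed above."""
    PREPARE = "prepare"
    READY = "ready"
    ABORT = "abort"
    COMMIT = "commit"


# Who may send each message: "TC" is the coordinator, "TM" is the
# transaction manager of a participating site.
# Assumption: <Commit T> originates only at the TC.
ALLOWED_SENDERS = {
    Msg.PREPARE: {"TC"},
    Msg.READY: {"TM"},
    Msg.ABORT: {"TC", "TM"},
    Msg.COMMIT: {"TC"},
}


def may_send(role, msg):
    """Return True if the given role is allowed to send the given message."""
    return role in ALLOWED_SENDERS[msg]
```

For example, may_send("TM", Msg.ABORT) is True, while may_send("TM", Msg.PREPARE) is False.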

Handling a Failure of a participating site

Let us assume that the failed site is Si and the Transaction Coordinator is TC. There are two things we need to look into to handle such a failure:
1. The response of the Transaction Coordinator of transaction T.
          If the failed site has not sent a <ready T> message, the TC cannot decide to commit the transaction [Remember, in a distributed database all the participating sites must be ready to commit. If even one site is not ready, the whole transaction needs to be aborted by the TC]. Hence, the transaction T is aborted and the other participating sites are informed.
          If the failed site has already sent a <ready T> message, the TC counts Si as ready to commit and carries on with the protocol as usual: if every other participating site has also sent <ready T>, the TC commits T and informs the other sites to commit. In this case, the site Si, when it recovers, has to find out the outcome of T and bring its local database up to date.
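
The coordinator's rule can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name coordinator_decision and its parameters are hypothetical, sites are identified by plain strings, and a failed site that never voted simply does not appear in ready_votes.

```python
from enum import Enum


class Decision(Enum):
    COMMIT = "commit"
    ABORT = "abort"


def coordinator_decision(participants, ready_votes, abort_votes=frozenset()):
    """Decide the fate of T at the TC.

    participants: all sites taking part in T.
    ready_votes:  sites from which <ready T> was received.
    abort_votes:  sites from which <abort T> was received.
    """
    # Any explicit <abort T> forces an abort.
    if abort_votes:
        return Decision.ABORT
    # Commit only when every participant, including a site that has
    # since failed, had sent <ready T> before the decision is taken.
    if set(participants) <= set(ready_votes):
        return Decision.COMMIT
    # At least one site (for example the failed one) never voted.
    return Decision.ABORT
```

With sites S1, S2 and S3, a failure of S3 before it votes gives coordinator_decision({"S1", "S2", "S3"}, {"S1", "S2"}) == Decision.ABORT, while a failure of S3 after its <ready T> has arrived still lets the TC commit.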

2. The response of the failed site when it recovers.
When it recovers from the failure, the site Si must identify the fate of the transactions that were in progress when it failed. This can be done by examining the log file entries of site Si.
The following are the possible cases and the relevant actions:
[a] If the log contains a <commit T> entry - It means that all the sites, including Si, have responded with a <ready T> message to the TC and the TC must have sent <commit T> to all the participants, because a participating site is not allowed to insert a <commit T> record in its log file without the coordinator's decision. Hence, the recovered site Si can perform redo(T). That is, the logged updates of T are applied once again to the local database of Si.
[b] If the log contains an <abort T> entry - This entry means that T was aborted, typically because the decision taken by the coordinator TC was to abort the transaction T. Hence, site Si executes undo(T).
[c] If the log contains a <ready T> entry - This means that the site Si failed after declaring itself ready to commit T but before it learned the coordinator's decision. Now, it has to contact the TC or the other sites to decide the fate of the transaction T.
          The first choice is to contact the TC of transaction T. If the TC has a <commit T> entry, then, as in case [a], Si has to perform redo(T). If the TC has an <abort T> entry, then Si performs undo(T).
          The second choice is to contact the other sites which have participated in transaction T (this choice is taken only if the TC is not available). The decision is then based on the other sites' log entries: redo(T) if one of them has <commit T>, undo(T) if one of them has <abort T>.
[d] If the log contains no control records for T, i.e., no <abort T>, <commit T>, or <ready T> - It shows that the site Si failed before responding to the <prepare T> message. Since the TC never received <ready T> from Si, it must have aborted the transaction. So, Si needs to execute undo(T).
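
Cases [a] to [d] can be put together in a short Python sketch. The function name recovery_action, the string labels and the ask_outcome callback are hypothetical and stand in for whatever the real system uses to read its log and to query the TC or the other sites. Returning "wait" when nobody reachable knows the outcome is an assumption of this sketch; the cases above do not cover that situation.

```python
def recovery_action(log_records, ask_outcome=None):
    """Choose what the recovering site Si does for transaction T.

    log_records: control records for T found in Si's log, given as the
                 strings "ready", "commit" and/or "abort".
    ask_outcome: optional callable that asks the TC (or, failing that,
                 the other sites) and returns "commit", "abort" or None.
    """
    if "commit" in log_records:      # case [a]
        return "redo"
    if "abort" in log_records:       # case [b]
        return "undo"
    if "ready" in log_records:       # case [c]: Si is in doubt
        outcome = ask_outcome() if ask_outcome is not None else None
        if outcome == "commit":
            return "redo"
        if outcome == "abort":
            return "undo"
        # Assumption: nobody reachable knows the outcome yet, so Si can
        # neither redo nor undo and has to try again later.
        return "wait"
    return "undo"                    # case [d]: no control record at all
```

For instance, recovery_action(["ready"], ask_outcome=lambda: "commit") returns "redo", and recovery_action([]) returns "undo".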

This is how 2PC handles the failure of a participating site.

The handling of the other types of failures is covered in the following posts:

         Failure of a coordinator

         Network partition
