XA specification
XA is the interface specification (that is, a set of interface functions) between the transaction middleware and the database, defined by X/Open DTP. The transaction middleware uses it to notify the database of the start, end, commit, rollback, and so on of a transaction. The XA interface functions are provided by database vendors. The two-phase commit (2PC) and three-phase commit (3PC) protocols grew out of this idea. Two-phase commit is in fact the key to implementing XA distributed transactions (more precisely, two-phase commit mainly ensures the atomicity of a distributed transaction: all nodes either apply the whole transaction or apply none of it).
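To make the shape of such an interface concrete, here is a minimal Python sketch. The method names mirror the xa_start, xa_end, xa_prepare, xa_commit, and xa_rollback routines of the XA specification, but the signatures are simplified, and the InMemoryResourceManager class, its can_prepare switch, and run_global_transaction are assumptions made up for illustration, not a real database binding.

```python
from typing import Dict, List, Protocol


class XAResource(Protocol):
    """Calls a transaction manager makes on a resource manager (simplified)."""

    def xa_start(self, xid: str) -> None: ...
    def xa_end(self, xid: str) -> None: ...
    def xa_prepare(self, xid: str) -> bool: ...
    def xa_commit(self, xid: str) -> None: ...
    def xa_rollback(self, xid: str) -> None: ...


class InMemoryResourceManager:
    """Toy resource manager; a stand-in for a database, not a real driver."""

    def __init__(self, can_prepare: bool = True) -> None:
        self.data: Dict[str, int] = {}               # committed state
        self.staged: Dict[str, Dict[str, int]] = {}  # uncommitted writes per xid
        self.can_prepare = can_prepare               # hypothetical failure switch

    def xa_start(self, xid: str) -> None:
        self.staged[xid] = {}

    def write(self, xid: str, key: str, value: int) -> None:
        self.staged[xid][key] = value

    def xa_end(self, xid: str) -> None:
        pass  # nothing to detach in this toy model

    def xa_prepare(self, xid: str) -> bool:
        return self.can_prepare  # True means "ready to commit"

    def xa_commit(self, xid: str) -> None:
        self.data.update(self.staged.pop(xid))

    def xa_rollback(self, xid: str) -> None:
        self.staged.pop(xid, None)


def run_global_transaction(xid: str, resources: List[InMemoryResourceManager],
                           key: str, value: int) -> bool:
    """Drive start/end/prepare, then commit or roll back on every resource."""
    for rm in resources:
        rm.xa_start(xid)
        rm.write(xid, key, value)
        rm.xa_end(xid)
    votes = [rm.xa_prepare(xid) for rm in resources]
    if all(votes):
        for rm in resources:
            rm.xa_commit(xid)
        return True
    for rm in resources:
        rm.xa_rollback(xid)
    return False
```

With two resource managers that both prepare successfully, the write becomes visible on both; if either one refuses to prepare, neither keeps it.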
2PC
Two-phase commit (2PC) is an algorithm from the fields of computer networking and databases that keeps all nodes of a distributed system consistent when a transaction is committed. It is often also called a protocol. In a distributed system, each node knows whether its own operation succeeded or failed, but it cannot know the outcome on other nodes. When a transaction spans multiple nodes, preserving its ACID properties requires a component that acts as a coordinator: it collects the results of all nodes (called participants) and then instructs them whether to actually commit (for example, by writing the updated data to disk). The idea of two-phase commit can therefore be summarized as follows: each participant tells the coordinator whether its operation succeeded or failed, and the coordinator decides, based on the feedback from all participants, whether they should commit or abort. The two phases are the preparation phase (voting phase) and the commit phase (execution phase).
Preparation stage
The transaction coordinator (transaction manager) sends a Prepare message to each participant (resource manager). Each participant either returns a failure immediately (for example, because a permission check failed) or executes the transaction locally and writes its redo and undo logs without committing, so that everything is in place and only the final commit is missing.
The preparation stage can be further divided into the following three steps:
1) The coordinator node asks all participant nodes whether they can commit the transaction and starts waiting for a response from each of them.
2) Each participant node executes all transaction operations up to the point of the query and writes the Undo and Redo information to its log. (Note: if this succeeds, the participant has already executed the transaction operations, but has not yet committed them.)
3) Each participant node responds to the coordinator's query. If its transaction operations executed successfully, it returns an "Agree" message; if they failed, it returns an "Aborted" message.
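Steps 2 and 3 can be sketched from the participant's side as follows. This is only an illustration under stated assumptions: the Participant class, its in-memory store, working copy, and log, and the "balance may not go negative" rule that makes an operation fail are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Participant:
    """Hypothetical participant with an in-memory store and undo/redo log."""
    store: Dict[str, int] = field(default_factory=dict)    # committed values
    working: Dict[str, int] = field(default_factory=dict)  # executed, not committed
    log: List[Tuple[str, str, Optional[int]]] = field(default_factory=list)

    def prepare(self, deltas: Dict[str, int]) -> str:
        """Execute the operations without committing, log them, then vote."""
        working = dict(self.store)
        entries: List[Tuple[str, str, Optional[int]]] = []
        for key, delta in deltas.items():
            old = working.get(key)
            new = (old or 0) + delta
            if new < 0:
                # Assumed business rule: the operation fails, so vote to abort.
                return "Aborted"
            entries.append(("undo", key, old))  # value to restore on rollback
            entries.append(("redo", key, new))  # value to reapply on commit
            working[key] = new
        self.log.extend(entries)
        self.working = working  # results are held aside; store is untouched
        return "Agree"
```

Note that a successful prepare leaves the committed store unchanged: the work has been done and logged, but nothing is visible until the coordinator's decision arrives.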
Commit stage
If the coordinator receives a failure message from any participant, or a participant times out, it sends a Rollback message to every participant; otherwise it sends a Commit message. Participants commit or roll back as instructed by the coordinator and then release all lock resources held during the transaction. (Note: lock resources must not be released before this final phase.)
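The coordinator's decision rule described above fits in a few lines. In this sketch a vote of None stands for a reply that did not arrive before the timeout; the Participant class, the locked flag, and the function names are assumptions for illustration.

```python
from typing import Dict, List, Optional


class Participant:
    """Hypothetical participant that holds its locks until phase two ends."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.locked = True  # lock taken during the preparation phase
        self.outcome: Optional[str] = None

    def commit(self) -> None:
        self.outcome = "committed"
        self.locked = False  # locks are released only at the very end

    def rollback(self) -> None:
        self.outcome = "rolled back"
        self.locked = False


def decide(votes: Dict[str, Optional[str]]) -> str:
    """Return 'commit' only if every participant voted 'Agree'.

    A vote of None models a participant whose reply timed out.
    """
    if all(vote == "Agree" for vote in votes.values()):
        return "commit"
    return "rollback"


def second_phase(participants: List[Participant],
                 votes: Dict[str, Optional[str]]) -> str:
    """Apply the coordinator's decision to every participant."""
    decision = decide(votes)
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.rollback()
    return decision
```

A single "Aborted" vote or a single missing reply is enough to roll back everyone, which is what gives the protocol its all-or-nothing behaviour.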
Next, the two possible outcomes of the commit stage are discussed separately.
When the coordinator node receives an "Agree" message from every participant node:
1) The coordinator node issues a "commit" request to all participant nodes.
2) The participant node officially completes the operation and releases the resources occupied throughout the transaction period.
3) The participant node sends a "Done" message to the coordinator node.
4) The coordinator node completes the transaction after receiving the "Done" message from all participant nodes.
If any participant node returns an "Aborted" message in the first phase, or the coordinator node does not receive responses from all participant nodes before the first-phase query times out:
1) The coordinator node issues a "rollback" request to all participant nodes.
2) The participant node uses the previously written Undo information to roll back and releases the resources occupied throughout the transaction.
3) The participant node sends a "Rollback Complete" message to the coordinator node.
4) The coordinator node cancels the transaction after receiving the "Rollback Complete" message from all participant nodes.
Regardless of the outcome, the second phase ends the current transaction.
Two-phase commit does appear to provide atomicity, but unfortunately it still has several drawbacks:
1. Synchronous blocking. While the protocol runs, all participating nodes block on the transaction. While a participant holds a shared resource, any other node that needs that resource has to wait.
2. Single point of failure. The coordinator is critical: once it fails, the participants stay blocked. This is especially serious in the second phase, where all participants are still holding locks on transaction resources and cannot complete the transaction. (If the coordinator goes down, a new coordinator can be elected, but that does not unblock the participants left waiting by the failed coordinator.)
3. Data inconsistency. In the second phase, if a local network failure occurs or the coordinator crashes while it is sending commit requests, only some participants receive the request. Those participants commit, while the others, having received no commit request, cannot. The distributed system as a whole is then inconsistent.
4. A problem two-phase commit cannot solve. Suppose the coordinator goes down right after sending a commit message, and the only participant that received the message goes down as well. Even if a new coordinator is elected, the state of the transaction is unknown: no one can tell whether it was committed.
Because of these defects of two-phase commit (synchronous blocking, the single point of failure, and inconsistent state, or split brain), researchers improved on it and proposed three-phase commit.
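The partial-delivery failure behind drawbacks 2 and 3 is easy to reproduce in a toy simulation. The sketch below assumes a coordinator that delivers commit messages one participant at a time and may crash partway through; the crash_after parameter and all names are hypothetical.

```python
from typing import List, Optional


class Participant:
    """Hypothetical participant that has voted Agree and is holding locks."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = "prepared"  # waiting for the coordinator's decision

    def on_commit(self) -> None:
        self.state = "committed"


def send_commits(participants: List[Participant],
                 crash_after: Optional[int] = None) -> None:
    """Deliver commit messages in order.

    If crash_after is set, the coordinator crashes after that many
    deliveries and the remaining participants hear nothing.
    """
    for index, participant in enumerate(participants):
        if crash_after is not None and index >= crash_after:
            return  # coordinator is gone
        participant.on_commit()


def blocked(participants: List[Participant]) -> List[str]:
    """Names of participants still holding locks and waiting for a decision."""
    return [p.name for p in participants if p.state == "prepared"]
```

Calling send_commits with crash_after=1 on three participants leaves the first one committed and the other two stuck in the prepared state, which is both the inconsistency and the blocking described above.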
3PC
Three-phase commit, also known as the three-phase commit protocol, is an improved version of the two-phase commit (2PC).
Compared with two-phase commit, three-phase commit makes two changes.
1. It introduces a timeout mechanism in both the coordinator and the participants.
2. It inserts an extra preparatory phase between the first and second phases of 2PC. This ensures that the state of all participating nodes is consistent before the final commit phase.
In other words, besides introducing timeouts, 3PC splits the preparation phase of 2PC in two, so that three-phase commit consists of three phases: CanCommit, PreCommit, and DoCommit.
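As a rough overview before the details, the three phases named above can be sketched from the coordinator's side. The send callback is a hypothetical transport that returns a participant's reply, or None if the coordinator's timeout expired first; the reply strings and function names are assumptions chosen for this sketch.

```python
from typing import Callable, List, Optional

# Hypothetical transport: send(participant, message) returns the reply,
# or None if no reply arrived before the coordinator's timeout.
Send = Callable[[str, str], Optional[str]]


def broadcast(send: Send, participants: List[str],
              message: str) -> List[Optional[str]]:
    """Send one message to every participant and collect the replies."""
    return [send(p, message) for p in participants]


def three_phase_commit(participants: List[str], send: Send) -> str:
    """Run CanCommit, PreCommit, DoCommit; return 'committed' or 'aborted'."""
    # Phase 1, CanCommit: ask whether everyone is able to commit.
    replies = broadcast(send, participants, "CanCommit")
    if any(reply != "Yes" for reply in replies):
        broadcast(send, participants, "Abort")
        return "aborted"

    # Phase 2, PreCommit: participants execute and log, then acknowledge.
    replies = broadcast(send, participants, "PreCommit")
    if any(reply != "ACK" for reply in replies):
        broadcast(send, participants, "Abort")
        return "aborted"

    # Phase 3, DoCommit: everyone acknowledged, so make the commit final.
    broadcast(send, participants, "DoCommit")
    return "committed"
```

A No reply or a timeout in either of the first two phases leads to an abort; only a full set of acknowledgements reaches DoCommit.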
CanCommit stage
The CanCommit stage of 3PC is very similar to the preparation phase of 2PC. The coordinator sends a commit request to each participant, which returns a Yes response if it can commit and a No response otherwise.
1. Transaction inquiry. The coordinator sends a CanCommit request to the participants, asking whether they can perform the transaction commit, and then waits for their responses.
2. Response feedback. After receiving the CanCommit request, a participant returns a Yes response and enters the ready state if it believes the transaction can be executed successfully; otherwise it returns a No response.
PreCommit phase
The coordinator decides whether to proceed with the PreCommit operation based on the participants' responses. There are two possibilities.
If the coordinator receives a Yes response from all participants, the transaction is pre-executed:
1. Send a PreCommit request. The coordinator sends a PreCommit request to the participants and enters the prepared state.
2. Transaction pre-commit. After receiving the PreCommit request, a participant executes the transaction operations and records the undo and redo information in its transaction log.
3. Response feedback. If a participant executes the transaction operations successfully, it returns an ACK response and starts waiting for the final instruction.
If any participant sends a No response to the coordinator, or the coordinator times out without receiving a response from a participant, the transaction is interrupted:
1. Send an interrupt request. The coordinator sends an Abort request to all participants.
2. Interrupt the transaction. After a participant receives the Abort request from the coordinator (or times out without receiving any request from the coordinator), it interrupts the transaction.
doCommit phase
The actual transaction commit happens in this phase, which again has two possible outcomes.
Perform a commit
1. Send a commit request. Once the coordinator has received the ACK responses from the participants, it moves from the pre-commit state to the commit state and sends a doCommit request to all participants.
2. Transaction commit. After receiving the doCommit request, a participant performs the formal transaction commit and releases all transaction resources once the commit completes.
3. Response feedback. After committing the transaction, the participant sends an ACK response to the coordinator.
4. Complete the transaction. After the coordinator receives the ACK responses from all participants, the transaction is complete.
Interrupt the transaction
If the coordinator does not receive an ACK response from a participant (either the participant sent something other than an ACK, or the response timed out), it interrupts the transaction.
1. Send an interrupt request. The coordinator sends an Abort request to all participants.
2. Transaction rollback. After receiving the Abort request, a participant uses the undo information recorded in phase two to roll back the transaction, and releases all transaction resources once the rollback completes.
3. Feedback results. After completing the rollback, the participant sends an ACK message to the coordinator.
4. Interrupt the transaction. After the coordinator receives the ACK messages from the participants, the transaction is interrupted.
In the doCommit phase, if a participant does not receive a doCommit or Abort request from the coordinator in time, it goes ahead and commits the transaction once its timeout expires. (This is a decision based on probability. A participant that has entered the third phase has received a PreCommit request in the second phase, and the coordinator only sends PreCommit after receiving a Yes response to CanCommit from every participant. So once a participant receives PreCommit, it knows that all participants agreed to the change. In short, when a participant in the third phase receives neither a commit nor an abort request because of a network timeout or a similar problem, it has good reason to believe that a commit is very likely the right outcome.)
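The participant's side of this, including the two different timeout reactions, can be written as a small state machine. This is a sketch under assumptions: the State names, the message strings, and the on_timeout hook are hypothetical, and real timers and networking are left out.

```python
from enum import Enum


class State(Enum):
    INIT = "init"
    READY = "ready"                # voted Yes in CanCommit
    PRECOMMITTED = "precommitted"  # acknowledged PreCommit
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    """Hypothetical 3PC participant driven by messages and timeouts."""

    def __init__(self) -> None:
        self.state = State.INIT

    def on_message(self, message: str) -> None:
        if message == "CanCommit" and self.state is State.INIT:
            self.state = State.READY
        elif message == "PreCommit" and self.state is State.READY:
            self.state = State.PRECOMMITTED
        elif message == "DoCommit" and self.state is State.PRECOMMITTED:
            self.state = State.COMMITTED
        elif message == "Abort" and self.state in (State.READY,
                                                   State.PRECOMMITTED):
            self.state = State.ABORTED

    def on_timeout(self) -> None:
        """Called when no message arrived from the coordinator in time."""
        if self.state is State.READY:
            # PreCommit never arrived: interrupt the transaction.
            self.state = State.ABORTED
        elif self.state is State.PRECOMMITTED:
            # Everyone voted Yes earlier, so commit by default.
            self.state = State.COMMITTED
```

The asymmetry is the point: a timeout before PreCommit ends in an abort, while a timeout after PreCommit ends in a commit.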
The difference between 2PC and 3PC
Compared with 2PC, 3PC mainly addresses the single point of failure and reduces blocking: if a participant does not hear from the coordinator in time, it commits by default instead of holding transaction resources indefinitely in a blocked state. However, this mechanism can itself cause data inconsistency. If the Abort request sent by the coordinator does not reach a participant in time because of a network problem, that participant commits after its timeout expires, which leaves it inconsistent with the participants that did receive the Abort request and rolled back.
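The lost-Abort scenario can be replayed in a few lines. This sketch assumes every participant is already in the pre-commit state; the function names and the lost parameter (the set of participants the Abort never reaches) are hypothetical.

```python
from typing import Dict, List, Optional, Set


def participant_outcome(message: Optional[str]) -> str:
    """Outcome for a participant that is already in the pre-commit state.

    message is what arrived before its timer expired; None means that
    nothing arrived and the timeout fired.
    """
    if message == "Abort":
        return "rolled back"
    return "committed"  # DoCommit, or the 3PC default on timeout


def broadcast_abort(participants: List[str], lost: Set[str]) -> Dict[str, str]:
    """Coordinator sends Abort to everyone; names in lost never receive it."""
    return {
        name: participant_outcome(None if name in lost else "Abort")
        for name in participants
    }
```

If the Abort to one participant is lost, broadcast_abort reports that participant as committed while the others rolled back, which is exactly the inconsistency described above.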