High-Performance Parallel Database Processing and Grid Databases - P8
Parallel databases are database systems that are implemented on parallel computing platforms. High-performance query processing therefore focuses on query processing, including database queries and transactions, that makes use of parallelism techniques applied to an underlying parallel computing platform in order to achieve high performance.
(2) The global transactions currently executing are added to a set, which stores all active transactions. The set of active transactions is represented as Active_trans.

(3) The middleware appends a timestamp to every subtransaction of the global transaction before submitting it to the corresponding database.

(4) If two active global transactions access more than one database site simultaneously, there is a potential threat that the local databases may schedule their subtransactions in conflicting orders. The subtransactions are therefore executed strictly according to the timestamp attached to each subtransaction. Total-order is achieved by executing the conflicting subtransactions according to the timestamp.

(5) When all subtransactions of a global transaction complete execution at all sites, the transaction terminates and is removed from the Active_trans set (see details in the Termination Phase).

Note: Active_trans and Active_Trans(DB) are different. The former is the set of currently active global transactions; the latter is a function that takes a database site as an argument and returns the set of active transactions running at that database site.

Explanation of Figure 11.3. Line 1 of Figure 11.3 checks the number of subtransactions of the submitted transaction. If there is only a single subtransaction, the global transaction can start executing immediately: it is added to the active set (line 2) and submitted immediately to the database for execution (line 3). If the global transaction has more than one subtransaction, that is, the transaction accesses more than one database site, then total-order must be followed, and hence a timestamp must be appended to all subtransactions of the global transaction. The global transaction is added to the active set (line 4). Global transactions having only one subtransaction are filtered out of the active set, forming the new set (Conflict_Active_trans) of potentially conflicting global transactions (line 5). Timestamps are then appended to all subtransactions of the global transaction (lines 6 and 7). If the global transaction being submitted conflicts with other active global transactions, it must be submitted to the participant sites' queues to be executed in total-order. Conflict of the submitted global transaction (Ti) with any other active global transaction (Tj) having more than one active subtransaction is checked in line 8. If two global transactions each having more than one active subtransaction conflict (i.e., a global-global conflict), the global transaction is added to all participating sites' active transaction sets (Active_Trans(DBi)) (line 13) and the subtransactions are submitted to the participants' queues (line 14), to be executed strictly according to the total-order. If the submitted global transaction does not conflict with any other active global transaction (i.e., the condition in line 8 is true), the global transaction is added to the active transaction set of all participant sites (line 10), and each subtransaction is immediately submitted for scheduling (line 11).

Algorithm: Grid concurrency control algorithm for the submission phase

    input Ti: transaction
    var Active_trans: set of active transactions
    var Conflict_Active_trans: set of active transactions that conflict
        with the global transaction being submitted
    var Database_accessed[Ti]: database sites being accessed by global
        transaction Ti
    Generate timestamp ts: a unique timestamp is generated
    Split_trans(Ti)
    Database_accessed[Ti] ← DB_accessed(Ti)
    1.  if Cardinality(Database_accessed[Ti]) = 1
    2.      Active_Trans(DBi) ← Active_Trans(DBi) ∪ {Ti}
            // Ti has only one subtransaction
    3.      submit subtransaction to DBi
        else
    4.      Active_trans ← ∪i Active_Trans(DBi)
    5.      Conflict_Active_trans ← {Tj | Tj ∈ Active_trans ∧
                Cardinality(DB_accessed(Tj)) > 1}
    6.      for each subtransaction of Ti
    7.          Append_TS(subtransaction)
    8.      if Cardinality(Database_accessed[Ti] ∩
                (∪ Tj ∈ Conflict_Active_trans DB_accessed(Tj))) ≤ 1
    9.          for each DBi ∈ Database_accessed[Ti]
    10.             Active_Trans(DBi) ← Active_Trans(DBi) ∪ {Ti}
    11.             submit subtransaction to DBi
                    // subtransaction executes immediately
        else
    12.         for each DBi ∈ Database_accessed[Ti]
    13.             Active_Trans(DBi) ← Active_Trans(DBi) ∪ {Ti}
    14.             submit subtransaction to participant's DB queue
                    // signifies that the subtransaction must follow total-order

Figure 11.3 Grid concurrency control algorithm for the submission phase

Global transactions are said to be conflicting if the two global transactions have more than one active subtransaction executing in different participating sites simultaneously. This is different from the definition of conflicting transactions in Definition 11.2; the use of the two terms is easily distinguished by context.
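As a companion to Figure 11.3, the following is a minimal Python sketch of the submission-phase bookkeeping, assuming simple in-memory middleware state; all names (submit_to_db, enqueue, the dictionaries) are illustrative stand-ins, not from the book.

    import itertools

    _ts_counter = itertools.count(1)   # middleware-wide unique timestamp source

    active_trans = {}   # DB site -> set of active global transaction ids
    db_accessed = {}    # global transaction id -> set of DB sites it accesses

    def submit_to_db(site, tid, ts=None):
        # Placeholder: hand the subtransaction to the site for immediate execution.
        print(f"execute {tid} at {site} (ts={ts})")

    def enqueue(site, tid, ts):
        # Placeholder: put the subtransaction in the site's total-order queue.
        print(f"queue {tid} at {site} (ts={ts})")

    def submit_global_transaction(tid, sites):
        """sites: the set of database sites the transaction accesses."""
        db_accessed[tid] = set(sites)

        if len(sites) == 1:                                    # line 1
            (site,) = sites
            active_trans.setdefault(site, set()).add(tid)      # line 2
            submit_to_db(site, tid)                            # line 3
            return

        all_active = set().union(*active_trans.values()) if active_trans else set()
        # line 5: active global transactions spanning more than one site
        conflict_active = {t for t in all_active
                           if t != tid and len(db_accessed.get(t, ())) > 1}
        ts = next(_ts_counter)                                 # lines 6-7

        spanned = set()
        for t in conflict_active:
            spanned |= db_accessed[t]

        if len(set(sites) & spanned) <= 1:                     # line 8: no global-global conflict
            for site in sites:
                active_trans.setdefault(site, set()).add(tid)  # line 10
                submit_to_db(site, tid, ts)                    # line 11
        else:
            for site in sites:
                active_trans.setdefault(site, set()).add(tid)  # line 13
                enqueue(site, tid, ts)                         # line 14: total-order

For example, submit_global_transaction("T1", {"DB2", "DB3"}) followed by submit_global_transaction("T2", {"DB2", "DB3"}) would route T2's subtransactions to the queues, since the two transactions overlap at more than one site.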
Termination Phase

The global transaction is considered active until a response from all subtransactions is received. Because of the atomicity property of the transaction, the global transaction cannot reach a final decision (i.e., commit or abort) until it has received a decision from all of its subtransactions. The steps of the transaction termination phase are as follows:

(1) When any subtransaction finishes execution, the originator site of the global transaction is informed.

(2) The active transaction set, the conflicting active transaction set, and the set of databases accessed by the global transaction are adjusted to reflect the completion of the subtransaction.

(3) The site checks whether the completed subtransaction is the last subtransaction of the global transaction to terminate.

(3a) If the subtransaction is not the last to terminate, the subtransactions waiting in the queue cannot yet be scheduled.

(3b) If the subtransaction is the last subtransaction of the global transaction to terminate, the other conflicting subtransactions can be scheduled. The subtransactions from the queue then follow the normal submission steps discussed in Figure 11.3.

Algorithm: Grid concurrency control algorithm for the termination phase

    input ST: subtransaction of Ti at a site that completes execution
    1. Active_trans = (Active_trans - Ti)
       // removes the global transaction from the active set of the site
    2. Conflict_Active_trans = (Conflict_Active_trans - Ti)
    3. Database_accessed[Ti] = (Database_accessed[Ti] - DBk)
       // the database where the subtransaction committed is removed from
       // the set of databases being accessed by the global transaction
    4. if Database_accessed[Ti] = ∅
       // the subtransaction was the last cohort of global transaction Ti
    5.     resubmit subtransactions from the queue for execution
           // following Figure 11.3
       else
    6.     resubmit subtransactions to the queue
           // same as line (14) of Figure 11.3

Figure 11.4 Grid concurrency control algorithm for the termination phase

Explanation of Figure 11.4. The originator site of the global transaction is informed after any subtransaction completes execution. The global transaction, Ti, is then removed from the active transaction set (line 1). This follows the earlier assumption that a global transaction can have only one subtransaction running at any site at any particular time. The conflicting active transaction set is adjusted accordingly (line 2). The database site where the subtransaction completed is removed from the database accessed set (line 3). If the completed subtransaction is the last subtransaction of the global transaction, that is, the database accessed set is empty (line 4), the other waiting subtransactions in the queue are submitted for execution (line 5); the normal transaction submission procedure from Figure 11.3 is followed thereafter. If the completed subtransaction is not the last subtransaction, the queue is unaffected (line 6).
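A matching sketch of the termination bookkeeping in Figure 11.4, in the same illustrative style as the submission sketch; the structures and the resubmit stub are assumptions for illustration.

    active_trans = {}   # DB site -> set of active global transaction ids
    db_accessed = {}    # global transaction id -> set of sites still being accessed
    site_queues = {}    # DB site -> list of (tid, ts) waiting in total-order

    def resubmit(site, tid, ts):
        # Placeholder for re-entering the Figure 11.3 submission path.
        print(f"resubmit {tid} at {site} (ts={ts})")

    def subtransaction_completed(tid, site):
        # Lines 1-2: drop Ti from the site's active (and conflicting) sets.
        active_trans.get(site, set()).discard(tid)
        # Line 3: this site no longer counts as accessed by Ti.
        db_accessed.get(tid, set()).discard(site)

        # Line 4: was this the last cohort of the global transaction?
        if not db_accessed.get(tid):
            for qsite, waiting in site_queues.items():
                while waiting:                 # line 5: resubmit queued work
                    qtid, ts = waiting.pop(0)
                    resubmit(qsite, qtid, ts)
        # Line 6: otherwise the queues are left untouched until the last
        # cohort terminates.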
11.3.4 Revisiting the Earlier Example

Taking the same scenario as the earlier example, consider that global transactions T1 and T2 are submitted in quick succession. Since both transactions need to access data from more than one site, they are forwarded to the middleware to check the metadata service and form subtransactions (eq. 11.1, 11.2, 11.3, and 11.4) (step 1 of the GCC protocol). As data from multiple sites are to be accessed, the transactions are added to the Active_trans set (step 2 of the GCC protocol). Since the subtransactions of equations 11.1 and 11.2 belong to the same global transaction, T1, the middleware appends the same timestamp to both of them, say timestamp = 1 (step 3 of the protocol). Similarly, the subtransactions of equations 11.3 and 11.4 belong to T2, and hence the same timestamp is appended to both of them, say timestamp = 2 (step 3 of the protocol).

Looking at equation 11.5, we note that the history produced at database site DB2 schedules the subtransaction of global transaction T1 before the subtransaction of T2 (the history in equation 11.5 is serial, but that does not matter as long as H2 is serializable with serialization order T1 → T2, because the timestamp attached to T1 by the middleware is less than that of T2). Execution of equation 11.6 will be prohibited by line 14 (or step 4) of the algorithm, because T1 and T2 are conflicting global transactions and the serialization order would be T2 → T1, which does not follow the timestamp sequence. Hence, schedules H2 and H3 will be corrected by the GCC protocol as follows:

    H2 = r_1^2(O1) r_1^2(O2) w_1^2(O1) C_1^2 r_2^2(O1) w_2^2(O1) C_2^2   (same as eq. 11.5)
    H3 = w_1^3(O3) C_1^3 r_2^3(O3) w_2^3(O4) C_2^3   (execution order corrected by the GCC protocol)

(In this notation the subscript denotes the transaction and the superscript the site; C is the commit operation.) Thus in both schedules, T1 → T2. The schedules are not required to be serial; only the serialization order must be the same as the timestamp sequence from the middleware.
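The correction GCC enforces can be stated operationally: at every site, conflicting global transactions must appear in the order of their middleware timestamps. A small illustrative check (the history encoding is an assumption, simplified to per-site commit order):

    timestamps = {"T1": 1, "T2": 2}

    # H2 (site DB2) and the corrected H3 (site DB3): T1's subtransaction
    # precedes T2's at both sites, matching the timestamp order.
    site_histories = {
        "DB2": ["T1", "T2"],
        "DB3": ["T1", "T2"],
    }

    def follows_total_order(histories, ts):
        for site, order in histories.items():
            seen = [ts[t] for t in order]
            if seen != sorted(seen):     # out of timestamp order at this site
                return False, site
        return True, None

    ok, bad_site = follows_total_order(site_histories, timestamps)
    print("total-order respected" if ok else f"violation at {bad_site}")

Running the same check on the uncorrected H3 ({"DB3": ["T2", "T1"]}) reports the violation that line 14 of Figure 11.3 prevents.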
11.3.5 Comparison with Traditional Concurrency Control Protocols

Homogeneous distributed concurrency control protocols may be lock-based, timestamp-based, or hybrid. The following discusses the lock-based protocol only, but the arguments hold for the other protocols as well.

Homogeneous distributed concurrency control protocols can be broadly classified as (i) centralized and (ii) distributed. In a centralized protocol, the lock manager and the global lock table are situated at a central site. The flow of control (sequence diagram) for centralized concurrency control protocols in a distributed DBMS (e.g., centralized two-phase locking) is shown in Figure 11.5. All the global information is stored at a central site, which makes the central site a hotspot and prone to failure. To overcome the limitations of central management, distributed concurrency control protocols are used in distributed DBMSs. The flow of control messages for distributed concurrency control protocols (e.g., distributed two-phase locking) is shown in Figure 11.6.

[Figure 11.5 Operations of a general centralized locking protocol (e.g., centralized two-phase locking) in a homogeneous distributed DBMS. Participants: the coordinator site (typically where the transaction is submitted), the central site managing global information (e.g., the global lock table), and all participating sites (1, 2, ..., n). Message sequence: lock request; lock granted; operation command; operation decision; release lock request.]

[Figure 11.6 Operations of a general distributed locking protocol (e.g., decentralized two-phase locking) in a homogeneous distributed DBMS. Participants: the coordinator site (typically where the transaction is submitted), the participant's image of global information, and all participating sites (1, 2, ..., n). Message sequence: operation command embedded with lock request; operation; end of operation; release lock request.]
[Figure 11.7 Operations of a general multi-DBMS protocol. Participants: the originator site (where the transaction is submitted), the multidatabase management system (global management layer), and all participants (1, 2, ..., n). Message sequence: operation request embedded with global information; talk to the participant depending on its local protocol; check with the multi-DBMS layer if required; MDBS reply; final decision; forward final decision to the originator.]

[Figure 11.8 Operations of the GCC protocol. Participants: the originator site (where the transaction is submitted), Grid middleware services (metadata and timestamp services for this purpose), and all participants (1, 2, ..., n). Message sequence: operation request; forward operation request to participants; final decision; forward final decision to the originator.]

Figure 11.7 shows the sequence of operations for a heterogeneous distributed DBMS (e.g., a multidatabase system). Figure 11.8 shows the sequence of operations for the GCC protocol and highlights that the middleware's function is very lightweight in a Grid environment: it acts only as a rerouting node for the global transaction (specifically from the correctness perspective), unlike all the other architectures. The other architectures (Figs. 11.5-11.7) maintain a global image of the data and communicate more with the sites. Note that the final decision in Figure 11.8 runs in a straight line from the participants to the originator via the middleware; this shows that there is no processing at the middleware, which acts only as a forwarding node. Conversely, Figure 11.7 shows a time lag between receiving the responses from the participants and forwarding them to the originator, as the multi-DBMS layer has to map the responses into a protocol understandable to the originator.
The term "coordinator" is used in Figures 11.5 and 11.6 and "originator" in Figures 11.7 and 11.8. In both cases, the site is where the global transaction is submitted. The reason for distinguishing between the two terms is that in Figures 11.5 and 11.6 the site also acts as the coordinator of the global transaction, while in Figures 11.7 and 11.8, because of site autonomy, the site acts only as the originator of the global transaction. Figure 11.7 also involves far more communication than Figure 11.8, namely with the multi-DBMS layer, as that layer stores and processes all the global information.

11.4 CORRECTNESS OF GCC PROTOCOL

A Grid-serializable schedule is considered correct in the Grid environment for database systems. A concurrency control protocol conforming to Theorem 11.1 is Grid-serializable, and is thus correct. Hence, to show the correctness of the GCC protocol, it must be shown that any schedule produced by the GCC protocol has the Grid-serializability property.

Proposition 11.1 states the assumption that each DBMS can correctly schedule the transactions (local transactions and global subtransactions) submitted to its site.

Proposition 11.1: All local transactions and global subtransactions submitted to any local scheduler are scheduled in serializable order.

Because of the autonomy of sites, local schedulers cannot communicate with each other, and because of architectural limitations, a global scheduler cannot be implemented in a Grid environment. Owing to this lack of communication among local schedulers and the absence of a global scheduler, it becomes difficult to maintain the consistency of the data. Thus the execution of global subtransactions at local database sites must be handled in such a way that data consistency is maintained. The additional requirement for Grid-serializability is stated in Proposition 11.2.

Proposition 11.2: Any two global transactions having more than one subtransaction actively executing simultaneously must follow total-order.

Based on Propositions 11.1 and 11.2, the following theorem shows that all schedules produced by the GCC protocol are Grid-serializable.

Theorem 11.2: Every schedule produced by the GCC protocol is Grid-serializable.

Proof: The types of schedules the GCC protocol can produce are identified first, and then it is shown that these schedules are Grid-serializable. Global transactions are broadly classified into two categories:
(a) Global transactions having only one subtransaction: A global transaction with a single subtransaction can be scheduled immediately and will always either precede or follow any conflicting transaction, because it executes at only a single site. From Proposition 11.1, the local scheduler can schedule the transaction in serializable order.

(b) Global transactions having more than one subtransaction: Such global transactions fall into one of the following two cases:

(i) Although the global transaction has multiple subtransactions, it conflicts with other active global transactions at only a single site. This scenario is not a threat to data consistency, and thus the subtransactions can be scheduled immediately (Fig. 11.3, line 8). Local schedulers can correctly schedule transactions in this case.

(ii) The global transaction has multiple subtransactions and conflicts with other global transactions at more than one site. Local schedulers cannot correctly schedule global transactions in this scenario. Hence, the GCC protocol submits all subtransactions to the queue, and these subtransactions are executed strictly according to the timestamp attached at the Grid middleware. This ensures that if a subtransaction of any global transaction GTi precedes a subtransaction of any other global transaction GTj at any site, then subtransactions of GTi will precede subtransactions of GTj at all sites.

Thus all cases, (a), (b)(i), and (b)(ii), schedule conflicting global transactions in such a way that if any global transaction GTi precedes any other global transaction GTj at any site, then GTi precedes GTj at all sites. The types of schedules produced by the GCC protocol are thus identified. Next, it is shown that these schedules are Grid-serializable. To prove this, the Grid-serializability graph must be acyclic and global transactions must be in total-order. Conflicts of the following types may occur:

• Conflict between local and local transactions. The local scheduler is responsible for scheduling local transactions. Total-order is required only for schedules in which global subtransactions are involved. From Proposition 11.1, local schedulers can schedule transactions in serializable order.

• Conflict between a global transaction and a local transaction. A local transaction executes at only one site, so a subtransaction of the global transaction can conflict with the local transaction only at that site. Thus the local transaction and the subtransaction of the global transaction are scheduled by the same scheduler. From Proposition 11.1, they are scheduled in serializable order. Total-order is also maintained, as only one local scheduler is involved in the serialization process.
• Conflict between global and global transactions. Assume that an arc exists from GTi → GTj at some site DBi. It will be shown that an arc from GTj → GTi cannot exist under GCC. GTj could precede GTi either at the database site DBi or at some other database site DBn. Suppose GTj precedes and conflicts with GTi at site DBi: this contradicts Proposition 11.1, so GTj cannot precede GTi at DBi. Suppose GTj precedes and conflicts with GTi at some other site DBn: then total-order is not followed, which contradicts Proposition 11.2; Figure 11.3 (line 14) of the GCC protocol prevents such a scenario from occurring.

Thus schedules produced by the GCC protocol are Grid-serializable.
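The acyclicity requirement in this proof can be tested mechanically. Below is a small illustrative sketch that builds a precedence (Grid-serializability) graph from per-site execution orders of global transactions and checks it for cycles; the encoding is an assumption for illustration, not the book's formalism.

    from collections import defaultdict

    def build_graph(site_orders):
        """site_orders: DB site -> list of global transaction ids, in the
        order their conflicting subtransactions executed at that site."""
        edges = defaultdict(set)
        for order in site_orders.values():
            for i, earlier in enumerate(order):
                for later in order[i + 1:]:
                    edges[earlier].add(later)   # arc earlier -> later
        return edges

    def has_cycle(edges):
        WHITE, GRAY, BLACK = 0, 1, 2            # DFS colors
        color = defaultdict(int)

        def visit(node):
            color[node] = GRAY
            for nxt in edges[node]:
                if color[nxt] == GRAY or (color[nxt] == WHITE and visit(nxt)):
                    return True                 # back edge found: cycle
            color[node] = BLACK
            return False

        return any(color[n] == WHITE and visit(n) for n in list(edges))

    # A GCC-legal execution: GT1 precedes GT2 at every shared site -> acyclic.
    print(has_cycle(build_graph({"DB2": ["GT1", "GT2"], "DB3": ["GT1", "GT2"]})))  # False
    # The execution GCC prevents: orders disagree across sites -> cycle.
    print(has_cycle(build_graph({"DB2": ["GT1", "GT2"], "DB3": ["GT2", "GT1"]})))  # True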
11.5 FEATURES OF GCC PROTOCOL

A concurrency control protocol interleaves the operations of different transactions while maintaining the consistency of data in the presence of multiple users. The GCC protocol has the following main features:

(a) Concurrency control in a heterogeneous environment: The GCC protocol does not need to store global information about participating sites; in a traditional distributed DBMS, for example, a global lock table stores information about all locks being accessed by the global transaction. In the Grid environment, the database sites might not all be using the same concurrency control strategy (e.g., a locking protocol). Under the GCC protocol, individual subtransactions are free to execute under the local concurrency control protocol of the participating site, and the Grid middleware is used only to monitor the execution order of the conflicting transactions.

(b) Reducing the load on the originating site: Centralized scheduling schemes and decentralized consensus-based policies designate the originating site of the transaction as the coordinator. The coordinator site may therefore become a bottleneck when a transaction has to access multiple sites simultaneously. The GCC protocol delegates the scheduling responsibility to the respective sites where the data resides, without compromising the correctness of the data, and thus prevents the coordinator from becoming a bottleneck.

(c) Reducing the number of messages in the internetwork: Centralized and consensus-based decentralized scheduling schemes need to communicate with the coordinator to achieve correct schedules, and this communication increases the number of messages in the system. Messages are among the most expensive items to handle in any distributed infrastructure. The GCC protocol moves fewer messages across the network to achieve concurrency control.

Since the GCC protocol enforces total-order on global transactions, conflicting transactions always proceed in one direction, thereby avoiding the problem of distributed deadlocks. Local deadlock management remains the policy of the local database site; because of autonomy restrictions, external interference in the local policy is not possible. Other concurrency control anomalies, such as lost updates, dirty reads, and unrepeatable reads, are addressed at the local DBMS level.

The above-mentioned features follow from the architectural requirements of the Grid. But the Grid architecture also imposes a serious limitation on concurrency control protocols: because a global scheduler cannot be installed, it is difficult to monitor the execution of global subtransactions at the different database sites. As a result, some valid interleavings of transactions cannot take place, and the resulting schedules are stricter than required.

11.6 SUMMARY

Grids are evolving as a new distributed computing infrastructure. Traditional distributed databases, such as distributed database management systems and multidatabase management systems, make use of globally stored information for their concurrency control protocols, mostly employing centralized or decentralized consensus-based policies. The Grid architecture does not support the storage of global information such as global lock tables or global schedulers. Thus a new concurrency control protocol for Grid databases, called GCC, is needed.

The GCC protocol has several advantages: it operates in a heterogeneous environment; the load on the originator site is reduced compared with traditional distributed databases; and the number of messages in the network is reduced. At the same time, because of the lack of global control and the autonomy restrictions of the Grid architecture, it is difficult to optimize the scheduling process. In this chapter, the focus was the maintenance of data consistency during the scheduling of global transactions.

11.7 BIBLIOGRAPHICAL NOTES

Consistency and isolation, two of the ACID properties of transactions, are the focus of this chapter. Most of the important work on concurrency control has been mentioned in the Bibliographical Notes section at the end of Chapter 10. This covers the work on parallel and grid transaction management by Brayner (DEXA 2001), Burger et al. (BNCOD 1994), Colohan et al. (VLDB 2005), Machado and Collet (DASFAA 1997), Wang et al. (Parallel Computing 1997), and Weikum and Hasse (VLDB Journal).

11.8 EXERCISES

11.1. Explain how concurrency control helps to achieve the "C" and "I" of the ACID properties.

11.2. Explain why individually serializable schedules at each site of the Grid environment may not produce a serializable global schedule.
11.3. Explain the following terms:
a. Total-order
b. Grid-serial history
c. Grid-serializable history
d. Grid-serializability graph
e. Grid-serializability theorem

11.4. Summarize the main features of the grid concurrency control (GCC) protocol, and explain how it solves the concurrency issues in the Grid.

11.5. Compare and contrast GCC with other concurrency control protocols (e.g., those of distributed databases and multidatabase systems).

11.6. Discuss why the number of messages in the internetwork is reduced using GCC in comparison with other concurrency control protocols.
Chapter 12

Grid Transaction Atomicity and Durability

In this chapter, the "A" and "D" (atomicity and durability) of the ACID properties of transactions running on Grid databases are explained. Atomic commitment protocols (ACPs) such as two-phase commit (2PC), three-phase commit (3PC), and other variants of these protocols are used for homogeneous and heterogeneous distributed DBMSs. ACPs designed for homogeneous distributed DBMSs are synchronous and tightly coupled between participating sites, while ACPs designed for heterogeneous DBMSs, for example multidatabase systems, need a global management layer for monitoring the execution of global transactions. The former approach is unsuitable for a Grid database because communication among sites must be asynchronous; the latter is unsuitable because sites in Grid databases are autonomous and cannot accept functional or architectural changes imposed by external factors.

The purpose of this chapter is twofold. First, an ACP for a Grid database is described, that is, atomicity in a Grid database is addressed. Failures are unavoidable, and hence the latter part of the chapter discusses the effect of failure on transaction execution, including details of the different types of logs stored at the originator and participant sites.

The chapter is organized as follows. Section 12.1 presents the motivation for addressing atomic commitment in Grid databases. Section 12.2 describes the Grid atomic commit protocol (Grid-ACP) and proves the correctness of the protocol. The Grid-ACP is extended to handle site failures in Section 12.3, including a comparison of the Grid-ACP with centralized and distributed recovery models; the correctness of the recovery protocol is also given.
12.1 MOTIVATION

2PC is the most widely accepted ACP in distributed data environments. It is a consensus-based protocol, which needs to synchronize the individual decisions of all participating sites to reach a global decision. It involves two phases, the voting phase and the decision phase, and it is a blocking protocol. For n participants, 2PC needs 3n messages and three rounds of message exchange to reach the final decision: (1) the coordinator broadcasts a request to vote, (2) the participants reply with their votes, and (3) the coordinator broadcasts the decision. Many variations and optimizations have been proposed to increase the performance of 2PC, but homogeneity and synchronous communication among sites are the basic assumptions behind the original protocol and its variants. Grid architecture is heterogeneous and autonomous; thus dependence on other sites and synchronous communication between sites are not valid assumptions.

Multi/federated database systems are heterogeneous, and they have been extensively studied during the last decade. They are mostly studied, designed, and optimized for short-lived, noncollaborative transactions. These database systems are designed in a bottom-up fashion, that is, the system designer knows the design requirements before the system is designed and deployed. A Grid database, on the other hand, supports long-running collaborative transactions. The design requirements of Grid databases can vary rapidly, as the system is more dynamic than traditional distributed and multi/federated databases: sites should be able to join and leave the Grid dynamically without modifications to the database management systems. In addition, multidatabase systems have the leverage of a global management layer known as the multidatabase management system. The transaction models developed for long-running transactions were designed with nested transaction structures; hence, these models are not suitable for the collaborative environment of Grids.

To summarize, the following points are noted:

(1) ACPs designed for homogeneous, synchronous distributed database systems cannot be implemented in Grid databases because of architectural limitations (Grid databases are heterogeneous and asynchronous).

(2) Multidatabase systems, being heterogeneous and autonomous in nature, are architecturally closer to the Grid environment. But they enjoy the leverage of a multidatabase management system layer, which is absent in Grids. The multidatabase layer stores global information such as the global lock table, global logs, etc. Because of site autonomy, Grids cannot store global information.

(3) A multidatabase employs a redo, retry, and compensate approach for its ACP. Redo and retry cannot be implemented in a Grid environment because both approaches make use of a global management layer. The compensate approach assumes that no other transaction is serialized between the compensated-for transaction and the compensating transaction. This is impossible to enforce in Grids because of the lack of a top-layer management system and the autonomy restrictions of sites. Grid databases need to operate in a loosely coupled, service-oriented architecture.
Even from an architectural perspective (apart from the data consistency perspective), it is difficult to implement traditional ACPs in a Grid environment. Grid databases access data from globally separated data sites via the World Wide Web. Most distributed data architectures use distributed objects for communication, for example CORBA, which has major limitations when operating on the web: CORBA was not designed to work with HTTP, the standard web protocol. Thus there is a need to develop protocols that can be easily integrated into web services.

In this chapter, an ACP suitable for heterogeneous, autonomous, and asynchronous Grid databases is presented.

12.2 GRID ATOMIC COMMIT PROTOCOL (GRID-ACP)

The concept of compensating transactions is used in Grid-ACP. The execution of compensating transactions results in semantic atomicity, defined as follows:

Definition 12.1: Let Ti be a global transaction and CTi be a collection of local compensating subtransactions {CT_i^1, CT_i^2, ..., CT_i^n}, one for each site where Ti executes. Ti is semantically atomic if and only if either Ti is committed at all sites where Ti executes, or all CT_i^j (where j = 1, 2, ..., n) are committed at all sites where Ti has committed.

12.2.1 State Diagram of Grid-ACP

Figure 12.1 shows the state diagram of the proposed Grid atomic commitment protocol (Grid-ACP). A pre-abort state is introduced in the originator, and two new states, the sleep and compensate states, are introduced in the participant's state diagram. A subtransaction enters the "sleep" state when it has finished execution and is ready to release all acquired resources, such as locks on data items, computing resources, etc. Because of the autonomy restriction in Grids, all resources must be released according to the participating site's requirements and cannot be held pending the global decision. The "sleep" state is thus an indication to transaction managers that the local subtransaction of the global transaction at the participating site has decided to commit, although the global decision has not yet been made. When the subtransaction enters the "sleep" state, all computing and data resources can be released; all sites are autonomous and thus cannot hold resources for any external process.
[Figure 12.1 State diagram of Grid-ACP. Originator states: running → wait → commit, or wait → pre-abort → abort. Participant states: running → sleep → commit, running → abort, or sleep → compensate → abort.]

If any of the other participating sites decides to abort any cohort of the global transaction Ti, the originator site informs all the participating sites (in the "sleep" state) of Ti to start compensation. While the compensating transaction is executing, the site is in the compensate state. If the compensating transaction fails, it is re-executed until it completes successfully. This raises the question of whether the compensating transaction can keep executing forever. The answer is no, because the compensating transaction is the logical inverse of the committed subtransaction: since the subtransaction decided to commit successfully and was in the "sleep" state, its logical inverse must also eventually commit. Once the compensating transaction successfully commits, the subtransaction is semantically aborted.

If the global decision is to commit, the originator can enter the commit state directly. But if the global decision is to abort, the originator of the transaction cannot enter the abort state directly: it must first be assured that all participants have successfully compensated, that is, semantically aborted, and thus it enters the pre-abort state. After receiving confirmation that all sleeping subtransactions have been successfully aborted, the originator enters the abort state.
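A minimal sketch of the two state machines as described above; the transition sets are read off the prose around Figure 12.1, and the enum/table encoding is an illustrative assumption.

    from enum import Enum, auto

    class State(Enum):
        RUNNING = auto(); WAIT = auto(); SLEEP = auto()
        COMPENSATE = auto(); PRE_ABORT = auto()
        COMMIT = auto(); ABORT = auto()

    # Legal transitions, as inferred from the Figure 12.1 discussion.
    ORIGINATOR = {
        State.RUNNING: {State.WAIT},
        State.WAIT: {State.COMMIT, State.PRE_ABORT},
        State.PRE_ABORT: {State.ABORT},
    }
    PARTICIPANT = {
        State.RUNNING: {State.SLEEP, State.ABORT},
        State.SLEEP: {State.COMMIT, State.COMPENSATE},
        State.COMPENSATE: {State.ABORT},
    }

    def step(table, current, nxt):
        # Enforce that only the transitions of the diagram are taken.
        if nxt not in table.get(current, set()):
            raise ValueError(f"illegal transition {current.name} -> {nxt.name}")
        return nxt

    # A participant that slept and then had to compensate:
    s = State.RUNNING
    for nxt in (State.SLEEP, State.COMPENSATE, State.ABORT):
        s = step(PARTICIPANT, s, nxt)
    print(s.name)   # ABORT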
12.2.2 Grid-ACP Algorithm

The Grid-ACP algorithm proceeds as follows:

(1) Based on the information available at the middleware, the global transaction is divided into subtransactions, which are submitted to the participating database systems.

(2) Since the sites are autonomous, after executing their portions of the subtransactions the participants go into the "sleep" state. The participants then inform the originator about the outcome of their subtransactions, and the necessary information is logged. Details of the log files are discussed in the following sections.

(3) The originator, after collecting the responses from all participants, decides whether to commit or abort the global transaction. If all participants decide to go into the "sleep" state, the decision is to commit; otherwise the decision is to abort. If the decision is to abort, the message is sent only to those participants that are in the sleep state. If the decision is to commit, the decision is sent to all participants.

(4) (a) If the participating site decided to commit, is in the "sleep" state, and the global decision is also to commit, the subtransaction can go directly to the commit state, because the local and global decisions are the same.

(b) If the participating site decided to commit and is in the "sleep" state, but the final decision is to abort the global transaction, then the sleeping subtransaction must be aborted. But, as mentioned earlier, when the local site enters the sleep state it releases all locks on data items as well as all acquired resources, which makes direct abortion of the transaction impossible. Hence, a compensating transaction must be executed to reverse all the changes, using compensation rules, so that the semantics of the database are restored to those before the original subtransaction executed, thereby achieving semantic atomicity. If the compensating transaction fails, it is resubmitted until it commits; the compensating transaction must eventually commit, as it is the logical inverse of the "sleeping" transaction. Compensation rules are not defined here, as they are outside the scope of this study.

The Grid-ACP algorithm for the originator of the global transaction is shown in Figure 12.2, and for the participants in Figure 12.3.

Algorithm: Originator's algorithm for Grid-ACP

    submit subtransactions to participants;
    wait for responses from all participants;
    1. if all responses are sleep
    2.     write commit record in log;
    3.     send global commit to all participants;
       else
    4.     write abort record in log;
    5.     send global abort to the participants that decided to commit;
    6.     wait for responses from these participants;

Figure 12.2 Originator's algorithm for Grid-ACP
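A compact Python sketch of the originator's side (Figure 12.2), assuming a blocking gather of participant responses; the four callables are illustrative stand-ins, not a defined API.

    def originator_grid_acp(participants, gather_response, send, log):
        """participants: site ids; gather_response(site) -> 'sleep' or 'abort'
        (and, reused later, blocks for a compensation acknowledgement);
        send(site, msg) delivers the global decision; log(entry) appends to
        the originator's log."""
        responses = {site: gather_response(site) for site in participants}

        if all(r == "sleep" for r in responses.values()):   # lines 1-3
            log("commit")
            for site in participants:
                send(site, "global-commit")
            return "commit"

        log("abort")                                        # lines 4-6
        sleepers = [s for s, r in responses.items() if r == "sleep"]
        for site in sleepers:
            send(site, "global-abort")   # only sites that decided to commit
        for site in sleepers:
            gather_response(site)        # wait for compensation acknowledgements
        return "abort"

    # Example: two sleepers -> global commit.
    print(originator_grid_acp(
        ["DB2", "DB3"],
        gather_response=lambda s: "sleep",
        send=lambda s, m: print(f"to {s}: {m}"),
        log=print))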
Algorithm: Participant's algorithm for Grid-ACP

    receive subtransaction from originator
    if participant decides to commit
        write sleep in log
        send commit decision to originator
    1.  enter sleep state
        wait for decision from originator
        if decision is commit
            write commit in participant log
        else if decision is abort
            start compensating transaction for the subtransaction
    2.      if compensating transaction aborts
                restart compensating transaction until it commits
                write commit for compensating transaction
                send acknowledgement to originator
            else
                write commit for compensating transaction
    else if participant decides to abort
        write abort in log
        send abort decision to originator

Figure 12.3 Participant's algorithm for Grid-ACP
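The participant's side (Figure 12.3) in the same illustrative style; compensate() stands in for executing the compensating transaction and returns False on failure. All five callables are assumptions for the sketch.

    def participant_grid_acp(run_subtransaction, await_decision, send, log,
                             compensate):
        """run_subtransaction() -> True if the local site decides to commit;
        await_decision() -> 'global-commit' or 'global-abort';
        compensate() -> True when the compensating transaction commits."""
        if not run_subtransaction():
            log("abort")
            send("abort")                 # unilateral local abort
            return "abort"

        log("sleep")                      # local decision is to commit
        send("commit")                    # enter the sleep state (line 1)

        if await_decision() == "global-commit":
            log("commit")
            return "commit"

        # Global decision is abort: compensate, retrying until it commits (line 2).
        while not compensate():
            pass                          # compensating transaction is re-executed
        log("compensation-commit")
        send("ack")                       # report the semantic abort to the originator
        return "abort"

    # Example: local commit, but the global decision is abort -> compensation.
    print(participant_grid_acp(
        run_subtransaction=lambda: True,
        await_decision=lambda: "global-abort",
        send=lambda m: print("to originator:", m),
        log=lambda e: print("log:", e),
        compensate=lambda: True))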
12.2.3 Early-Abort Grid-ACP

Step 3 of the Grid-ACP algorithm can be modified to improve performance: the originator can decide to abort as soon as it receives the first abort from any of the participants. With this strategy, however, the abort message has to be sent to all participants, not only to those that decided to commit. There is thus a trade-off between the number of messages in the network and the processing time wasted by those subtransactions that are still active and have not yet reached a decision. The participants' algorithm for early-abort Grid-ACP is the same as in Figure 12.3 and is therefore omitted for brevity. Figure 12.4 shows the modified originator's algorithm for the early-abort protocol.

The originator in the Grid-ACP algorithm waits until responses from all participating sites have been received (line 1, Fig. 12.2). Participants that decided to abort are not affected by the originator's decision; thus, if the global decision is to abort the transaction, the decision is sent only to participants that decided to commit their subtransactions (and are in the "sleep" state). This strategy can become computationally expensive. Say a global transaction has n subtransactions, and suppose the first subtransaction returns an abort decision; the final decision must then be a global abort. Although the global decision can already be made from the available information, the originator still has to wait for the other (n − 1) decisions to arrive from the participants. If all the other participants decided to commit their subtransactions, the computation is effectively wasted, first in completing the subtransactions and then in compensating them.

Algorithm: Originator's algorithm for early-abort Grid-ACP

    submit subtransactions to participants;
    wait for responses from participants;
    1. if any response is abort
    2.     write abort record in log;
    3.     send global abort to all participants;
    4.     wait for responses from participants;
       else if all responses are sleep
    5.     write commit record in log;
    6.     send global commit to all participants;

Figure 12.4 Originator's algorithm for early-abort Grid-ACP

The originator's algorithm for the early-abort Grid-ACP can make the global decision as soon as the first abort decision from any participant is received (line 1, Fig. 12.4). An abort message can be sent to all participant sites immediately (line 3, Fig. 12.4), saving the computation cost at the other participants and thereby also reducing the overhead of compensation. If the abort decision is the last one received from the participants rather than the first, the early-abort Grid-ACP reduces to the normal Grid-ACP, because all the participants have already finished execution.
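A sketch of the early-abort variant, assuming responses arrive incrementally through an iterable; as before, the helpers are illustrative stand-ins.

    def early_abort_grid_acp(participants, responses, send, log):
        """responses: iterable of (site, 'sleep' | 'abort') pairs in arrival
        order; send and log are stand-ins as in the Figure 12.2 sketch."""
        seen = 0
        for site, reply in responses:
            seen += 1
            if reply == "abort":                 # line 1: first abort decides it
                log("abort")                     # line 2
                for p in participants:           # line 3: abort goes to everyone
                    send(p, "global-abort")
                return "abort"                   # line 4 would await acks here
            if seen == len(participants):        # all responses were sleep
                log("commit")                    # line 5
                for p in participants:           # line 6
                    send(p, "global-commit")
                return "commit"

    # Example: the second response is an abort, so the remaining site
    # (DB4) is told to abort before it finishes and sleeps needlessly.
    decisions = [("DB2", "sleep"), ("DB3", "abort"), ("DB4", "sleep")]
    print(early_abort_grid_acp(["DB2", "DB3", "DB4"], decisions,
                               send=lambda s, m: print(f"to {s}: {m}"),
                               log=lambda e: print("log:", e)))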
EXAMPLE

Consider a simple example to demonstrate the working of Grid-ACP; no subtransaction executes at the originator site in this case. Consider the following cases.

Case 1 (atomicity of a single transaction): Consider the execution of the subtransactions of equations 11.1 and 11.2 from Chapter 11:

    ST_1^2 = r_1^2(O1) r_1^2(O2) w_1^2(O1) C_1^2
    ST_1^3 = w_1^3(O3) C_1^3

Subtransactions running to successful completion: The subtransactions execute autonomously and enter the "sleep" state (step 2 of Grid-ACP and line 1 of Fig. 12.3). Since ST_1^2 and ST_1^3 both decided to commit, the originator's decision is to commit, which is communicated to the participants (step 3 of Grid-ACP and lines 1 to 3 of Fig. 12.2). As the global decision matches the local ones, both subtransactions update their state from "sleep" to "commit" (step 4a of Grid-ACP, and the else part of Fig. 12.3).

A subtransaction decides to abort: Suppose ST_1^3 decides to abort. The originator's decision is now to abort the global transaction (lines 4 to 6 of Fig. 12.2). ST_1^3 has unilaterally decided to abort, so the decision only needs to be sent to ST_1^2 at site 2. Since ST_1^2 decided to commit, it starts the compensating procedure (step 4b of Grid-ACP and the else-if part of Fig. 12.3). The compensating transaction nullifies the effect of ST_1^2 and may take the following form:

    CT_1^2 (compensating transaction for global transaction 1 at site 2) = w_1^2(old value of O1)

If CT_1^2 aborts, it is re-executed to successful completion, so that the database reflects that ST_1^2 has aborted. The acknowledgement is then sent back to the originator.

Case 2 (atomicity in the presence of multiple transactions): Maintaining atomicity in the presence of multiple transactions is more complicated, because other transactions may have read the values written by the "sleeping" transaction. If all the subtransactions execute to completion, the execution is similar to case 1; therefore only the case of aborting transactions is discussed. Consider global transactions T1 and T2 from Chapter 11:

    T1 = r1(O1) r1(O2) w1(O3) w1(O1) C1
    T2 = r2(O1) r2(O3) w2(O4) w2(O1) C2

Consider the following history:

    H1 = r_1^2(O1) r_1^2(O2) w_1^2(O1) S_1^2 r_2^2(O1) w_2^2(O1)   (S_1^2 means ST_1^2 is in the sleep state)

In history H1, ST_1^2 is waiting for the global decision on T1. Suppose the global decision is to abort T1; then ST_1^2 must also abort. Consequently, ST_2^2 should also abort, as it has read from ST_1^2. This situation may lead to cascading aborts. Given the autonomous behavior of Grids, this may be unavoidable: if a database site implements a strict schedule there will be no cascading aborts, but this is not under the control of the middleware. As preventive measures, the following two options can be considered:

(a) After a transaction enters the "sleep" state, a ceiling or cap value can be defined to restrict the number of cascading aborts.

(b) A conflicting global transaction may not be submitted to a conflicting site until the originator has made the final decision on the already executing global transaction, so that cascading aborts are avoided.
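Case 1's compensating write can be sketched by logging before-images while the subtransaction runs and replaying them in reverse on abort; this is a minimal illustration of the w_1^2(old value of O1) idea, with an invented in-memory store and invented data values.

    db = {"O1": 100, "O2": 7, "O3": 42}   # invented data values

    def run_with_before_images(writes):
        """writes: list of (object, new_value). Returns the before-image
        log needed to build the compensating transaction."""
        before = []
        for obj, new_value in writes:
            before.append((obj, db[obj]))   # remember the old value
            db[obj] = new_value
        return before

    def compensate(before):
        # Logical inverse of the subtransaction: restore old values in
        # reverse order, mirroring CT_1^2 = w_1^2(old value of O1).
        for obj, old_value in reversed(before):
            db[obj] = old_value

    log = run_with_before_images([("O1", 250)])   # ST_1^2 writes O1, then sleeps
    compensate(log)                               # global abort: semantically undo
    print(db["O1"])                               # 100 again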
12.2.4 Discussion

The purpose of Grid-ACP and early-abort Grid-ACP is to deal with autonomy and heterogeneity between sites. Because of autonomy, synchronous communication between the originator and the participants is not possible, and thus participants decide to commit unilaterally. Deciding to commit a transaction unilaterally, without consulting the originator, does not come without a cost: the algorithm pays a price for releasing locks and resources early, which is unavoidable in an autonomous environment like the Grid. If a transaction releases its resources early, other transactions may read the values it has written. To handle this problem, traditional multidatabase systems implement a top layer, the multidatabase management system, which enforces a criterion preventing the execution of any other global transaction between the compensated-for transaction and the compensating transaction.

Implementing such a top-layer management system is not possible in Grids for two reasons: (i) the autonomy of sites, and (ii) the criterion becomes too restrictive in the heterogeneous Grid environment. The absence of a global management layer increases the chances of cascading aborts. For example, any local transaction LTi that reads data written by a global transaction Ts in the sleep state cannot decide to commit: if Ts has to abort from the sleep state, then the local transaction LTi must also abort, giving cascading aborts or compensating transactions. Hence, to avoid cascading aborts/compensations, any transaction that reads values written by a transaction in the sleep state must also not commit until the "sleeping" transaction commits. But, considering the heterogeneity and autonomy properties of the Grid architecture, cascading aborts/compensations are unavoidable.

The purpose of the "sleep" state is thus twofold. First, it acts as an intermediate step before the commit state to accommodate the autonomy of the architecture; second, the "sleep" state can be used by the application to cap the number of cascading aborts. Implementing the "sleep" state does not require any modification to the local transaction manager module: the "sleep" state is defined in the interface, so no changes are required in any local modules, and the autonomy of the individual sites is maintained.

12.2.5 Message and Time Complexity Comparison Analysis

The time and message complexity of the algorithm are described below.

Time Complexity Analysis

The Grid-ACP needs two rounds of messages under normal conditions: (1) each participant sends its decision to commit or abort, and (2) the decision of the originator is communicated to the participants. This gives an impression of 2PC, but the state of the participants after sending their decisions is different: while a 2PC participant enters the wait state and holds all its resources, computing and data, until the global decision is received, a Grid-ACP participant releases all resources after sending its local decision and enters the "sleep" state.

Message Complexity Analysis

Let the number of participants be n. Considering that the originator sends the final decision to all the sites, including those that decided to abort, the number of messages in each round is n. Thus a maximum of 2n messages is required to reach a consistent decision under normal conditions. Early-abort Grid-ACP takes