Managing time in relational databases- P11

Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES

appears in it as a non-key column. Wellness program name is left unchanged. Episode begin date, effective end date, assertion end date and row create date are added as non-key columns. As before, unique constraints and indexes are augmented and modified as required.

Wellpgmcat_cd appears in the logical data model as a foreign key to the Wellness Program table, and so the AVF must convert it into a temporal foreign key. The foreign key declaration is dropped from the DDL, the wellness program category code column is also dropped, and a wellpgmcat_oid column replaces it. With these changes, the temporalization of this table is complete.

The Wellness Program Enrollment Table. Unlike the other tables in this sample database, Wellness Program Enrollment is an associative table, commonly called an “xref table”. But its conversion to a temporal table follows the pattern we have already seen. The only difference is that this table has two foreign keys to convert to temporal foreign keys, not just one, and two columns in its original primary key.

According to the Table Type metadata table, the Wellness Program Enrollment table is an asserted version table. Prior to temporalization, the primary key of this table consisted of the two foreign keys client_nbr and wellpgm_nbr. But asserted version tables must have single-column object identifiers, and so instead of creating an object identifier for both client and wellness program, we create a single object identifier and name it client_wellpgm_oid. We then add effective begin date and assertion begin date as the other two primary key columns.

As we see in Figure 8.8, the business key of this table is the pair of temporal foreign keys. The other four non-key columns are left unchanged. Episode begin date, effective end date, assertion end date and row create date are added as non-key columns.
As before, unique constraints and indexes are augmented and modified as required. Client_nbr and wellpgm_nbr appear in the logical data model as foreign keys to the Client and Wellness Program tables, respectively. The foreign key declarations are dropped from the DDL, the client number and wellness program number columns are also dropped, and the client_oid and wellpgm_oid columns, respectively, replace them. With these changes, the temporalization of this table is complete.

In fact, the temporalization of the entire physical data model is now complete. The result is the Asserted Versioning physical data model shown in Figure 8.8. But an asserted version database is not simply one that contains one or more temporal tables. It is also a database that includes the code which enforces
the semantic constraints without which those tables would just be a collection of columns with nothing particularly temporal about them at all.

Generating Temporal Entity and Temporal Referential Integrity Constraints

If this temporalized physical data model were submitted to the DBMS, and an empty database were created from it, we could begin to populate the tables in the database right away. We could populate them using conventional SQL insert, update and delete statements. But we would have to be very careful.

We already have some idea of what temporal entity integrity and temporal referential integrity are, but we have yet to see these integrity constraints at work. Some of the work they do is quite complex. The AVF enforces temporal integrity as data is being updated, not as it is being read. Today’s DBMSs do not support temporal integrity constraints on versions and episodes, so it is the AVF, or a developer-written framework, that must do it.

Applying those constraints, the AVF would reject some temporal transactions because they would violate one or both of those constraints. But if we write our transactions in native SQL, then whenever we do maintenance to the database, we will have to manually check the contents of the database, compare each transaction to those contents, and determine for ourselves whether or not the transactions both did what they were intended to do, and resulted in a temporally valid database state.

Past experience has shown us that doing our own application-developed bi-temporal data maintenance, using standard SQL, is resource-intensive and error-prone. It is a job for a company’s most experienced DBAs, and even they will have a difficult time with it.
Having an enterprise standard framework like the AVF to carry out these operations significantly reduces the work involved in maintaining temporal data, and will eliminate the errors that would otherwise inevitably happen as temporal data is maintained.

Using a framework like the AVF, temporal transactions will be no more difficult to write than conventional transactions. The reason is that the AVF supports a temporal insert, temporal update and temporal delete transaction in which all temporal qualifiers on the transaction are expressed declaratively. These transactions also preserve a fundamentally important feature of standard insert, update and delete transactions. They allow one bi-temporal semantic unit of work to be expressed in one transaction.
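To illustrate what "declarative temporal qualifiers" could look like, here is a minimal Python sketch. It is not the AVF's actual interface: the class TemporalTransaction, its field names and the function resolve_target_span are hypothetical. The sketch assumes that an omitted effective begin date defaults to the processing date and an omitted effective end date defaults to 12/31/9999.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional, Tuple

# 12/31/9999 stands for "until further notice".
END_OF_TIME = date(9999, 12, 31)


@dataclass(frozen=True)
class TemporalTransaction:
    """Hypothetical declarative request; not the AVF's real API."""
    action: str                      # "insert", "update" or "delete"
    table: str
    oid: str
    data: dict = field(default_factory=dict)
    eff_beg: Optional[date] = None   # omitted: default to processing date
    eff_end: Optional[date] = None   # omitted: default to 12/31/9999


def resolve_target_span(txn: TemporalTransaction, today: date) -> Tuple[date, date]:
    """Fill in the assumed defaults and return the closed-open effective span."""
    beg = txn.eff_beg if txn.eff_beg is not None else today
    end = txn.eff_end if txn.eff_end is not None else END_OF_TIME
    if not beg < end:
        raise ValueError("effective begin date must precede effective end date")
    return beg, end


# One semantic unit of work, expressed as one request with no temporal qualifiers.
txn = TemporalTransaction("insert", "Policy", "P861", {"type": "HMO", "copay": 15})
span = resolve_target_span(txn, today=date(2010, 1, 1))
```

The author of the request states only what should be true and for which span of effective time. How many physical rows must be written to bring that about is left to the framework.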
Typically, a single standard SQL transaction will insert, update or delete a single row in a conventional table. And typically, the corresponding temporal transaction will require two or three physical transactions to complete. In addition, many temporal update transactions, as we will see, and many temporal delete cascade transactions too, can require a dozen or more physical transactions to complete.

If we attempt to maintain a bi-temporal database ourselves, using standard SQL, then for each semantic intention we want to express in the database, we will have to figure out and write these multiple physical transactions ourselves. As Chapter 7 indicated, and as Chapters 9 through 12 will make abundantly clear, that is a daunting task.

Redundancies in the Asserted Versioning Bi-Temporal Schema

An Asserted Versioning database is a physical implementation of a logical data model, a logical model which does not contain any mention of temporal data in the model itself. In fact, the logical data models of Asserted Versioning databases are indistinguishable from the logical data models of conventional databases.

Apparent Redundancies in the Asserted Versioning Schema

However, some data modelers have objected to an apparent third normal form (3NF) violation in the bi-temporal schema common to all asserted version tables. They point to the effective end date, the assertion end date and the row creation date to support their claims. Their objections, in summary, are one or more of the following:

(i) The effective end date is redundant because it can be inferred from the effective begin date of the following version.
(ii) The assertion end date is redundant because it can be inferred from the assertion begin date of the next assertion of a version.
(iii) The row create date is redundant because it is the same as the assertion begin date.

Now in fact, none of these objections is correct.
As for the first objection, an effective end date would be redundant if every version of an object followed immediately after the previous version. If we could depend on that being true, which means if we could depend on there never being a requirement to support multiple episodes of the same object, then the effective end date would be redundant.

One could make the argument that all versions within one episode [meet] and so, within each episode, the end date could be inferred. Although that is true, we would still need an episode end date to mark the end of the episode. Furthermore, the end dates on each version significantly improve performance because both dates are searched on the same row, which avoids the expensive subselects that would otherwise be needed on every read.

Also, we are not interested in implementing just the minimal temporal requirements a specific business use may require, especially when it would be difficult and expensive to add additional functionality, such as support for multiple episodes (i.e. for temporal gaps between some adjacent versions of the same object), to a database already built and populated, and to a set of maintenance transactions and queries already written and in use. All asserted version tables are ready to support gaps between versions. On the other hand, as long as temporal transactions issued to the AVF do not specify an effective begin date, that capability of Asserted Versioning will remain unused and the mechanics of its use will remain invisible.

As for the second objection, an assertion end date would be redundant with the following asserted version’s assertion begin date only if every assertion of a version followed the previous one without a gap of even a single clock tick in assertion time. But once again, we are not interested in implementing just the minimal temporal requirements a specific business use may require. All asserted version tables are ready to support deferred assertions, and deferred assertions may involve a gap in assertion time.
On the other hand, as long as temporal transactions issued to the AVF do not specify an assertion begin date, that capability of Asserted Versioning will remain unused and the mechanics of its use will remain invisible.

In addition, as we will see in following chapters, single versions can be replaced by multiple versions as new assertions are made, and vice versa. In that case, the logic for inferring assertion end dates from the assertion begin dates of other versions could become quite complex. This complexity could affect the performance, not only of maintenance transactions, but also of queries. The reason is that, if we followed this suggestion, it would be impossible to determine, from just the data on any one row, whether or not that row has an Allen relationship with the assertion time specified on a query. To determine that, we
would need to know the assertion time period of the row, not just when that time period began.

As for the third objection, a row create date would be redundant with an assertion begin date if Asserted Versioning did not support deferred assertions. In fact, neither the standard temporal model, nor any more recent computer science research that we are aware of, includes deferred assertions. But Asserted Versioning does. Because it does, the AVF may insert rows into asserted version tables whose assertion begin dates are later than their row creation dates.

A Real Redundancy in the Asserted Versioning Schema

But there is one redundancy that we did introduce into the Asserted Versioning schema. It was to add the episode begin date to every row. The episode begin date, as we all know by now, is the effective begin date of the effective-time earliest version of an episode. So it is not functionally dependent on the primary key of any row which is not the initial version of an episode (see footnote 2).

The primary use of this column is to indicate, for any version, when the episode that version is a part of began. It efficiently associates every version with the one episode it belongs to. Lacking this column, we would only be able to find all versions of an episode by looking for versions with the same oid that [meet], and we would only be able to distinguish one episode from the next one by looking for a [before] or [before⁻¹] relationship between adjacent versions with the same oid.

Together with that version’s own effective end date, the episode begin date tells us that the object that version designates has been continuously represented, in current assertion time, from the effective-time beginning of that version’s episode to the effective-time end of that version.
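To make concrete what the episode begin date saves us from recomputing, the following Python sketch derives it from the versions themselves. It is an illustration only, not AVF code. The tuple layout and the function name stamp_episode_begin are hypothetical, and the sketch assumes closed-open effective time periods, so that two versions of the same object [meet] exactly when one's effective end date equals the other's effective begin date. It also assumes that all the versions passed in are currently asserted and do not overlap.

```python
from datetime import date

END_OF_TIME = date(9999, 12, 31)


def stamp_episode_begin(versions):
    """Attach an episode begin date to each (oid, eff_beg, eff_end) version.

    Versions of one oid that [meet] share an episode; a gap starts a new one.
    Returns a list of (oid, eff_beg, eff_end, epis_beg) sorted by oid and time.
    """
    stamped = []
    prev = None  # (oid, eff_end, epis_beg) of the previous version, if any
    for oid, eff_beg, eff_end in sorted(versions):
        if prev is not None and prev[0] == oid and prev[1] == eff_beg:
            epis_beg = prev[2]   # this version [meets] the previous one
        else:
            epis_beg = eff_beg   # first version of a new episode
        stamped.append((oid, eff_beg, eff_end, epis_beg))
        prev = (oid, eff_end, epis_beg)
    return stamped


versions = [
    ("P861", date(2010, 4, 1), date(2010, 7, 1)),
    ("P861", date(2010, 1, 1), date(2010, 4, 1)),
    ("P861", date(2010, 9, 1), END_OF_TIME),      # begins after a two-month gap
]
stamped = stamp_episode_begin(versions)
```

Without the stored column, a walk like this over all versions of the object would be needed every time we wanted to know which episode a version belongs to. With the column, the answer is on the row.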
Footnote 2: Interestingly enough, although clearly redundant, this replication of the effective begin date of each episode’s initial version onto all other versions of the episode is not a violation of any relational normal form. Its presence involves no partial, transitive or multi-valued dependencies. For other examples of redundancies that are not caught by fully normalizing a database, see Johnston’s articles in the archives at Information_Management.com (formerly DM Review), with links listed in the bibliography.

Since the parent managed object in a temporal referential integrity relationship is an episode, this means that when we are validating temporal referential integrity on a child version, all we need to do is find one parent version whose effective end date is not earlier than the effective end date of the new
child version, and whose episode begin date is not later than the effective begin date of the new version. In other words, the episode begin date enables us to do TRI checking from one parent-side row, rather than having to go back and find the row that begins that parent episode. This significantly improves performance for temporal referential integrity checking.

The result of TRI enforcement is to guarantee that the effective-time extent of any version representing a TRI child object completely [fills] the effective-time extent of one set of contiguous versions representing a TRI parent object.

In addition, note that the presence of this redundant column has little maintenance cost associated with it. As new versions are added to an episode, the episode begin date of the previous version is just copied onto that of the new version. Only in the rare cases in which an episode’s begin date is changed will this redundancy require us to update all the versions in the episode.

Glossary References

Glossary entries whose definitions form strong interdependencies are grouped together in the following list. The same glossary entries may be grouped together in different ways at the end of different chapters, each grouping reflecting the semantic perspective of each chapter. There will usually be several other, and often many other, glossary entries that are not included in the list, and we recommend that the Glossary be consulted whenever an unfamiliar term is encountered.

Allen relationships
contiguous
filled by
include
asserted version table
Asserted Versioning
Asserted Versioning database
Asserted Versioning Framework (AVF)
assertion begin date
assertion end date
assertion time
assertion time period
business key
reliable business key
unreliable business key
child object
clock tick
closed-open
granularity
conventional database
conventional table
conventional transaction
deferred assertion
design encapsulation
maintenance encapsulation
query encapsulation
effective begin date
effective end date
effective time
effective time period
episode
episode begin date
existence dependency
managed object
mechanics
object
object identifier
oid
parent episode
parent object
PERIOD datatype
represented
row creation date
temporal database
temporal entity integrity (TEI)
temporal foreign key (TFK)
temporal referential integrity (TRI)
temporal transaction
temporal update transaction
temporalize
version
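The single-row TRI check described earlier in this chapter can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the AVF's implementation: the ParentVersion type and the function tri_parent_found are hypothetical names, effective time periods are taken to be closed-open, and every row passed in is assumed to be currently asserted.

```python
from datetime import date
from typing import Iterable, NamedTuple

END_OF_TIME = date(9999, 12, 31)


class ParentVersion(NamedTuple):
    """Hypothetical shape of a currently asserted parent-side row."""
    oid: str
    epis_beg: date   # episode begin date, replicated onto every version
    eff_beg: date
    eff_end: date


def tri_parent_found(parents: Iterable[ParentVersion], parent_oid: str,
                     child_eff_beg: date, child_eff_end: date) -> bool:
    """True if one parent row shows continuous coverage of the child's span.

    The row must end no earlier than the child ends, and its episode must
    begin no later than the child begins.
    """
    return any(
        p.oid == parent_oid
        and p.eff_end >= child_eff_end
        and p.epis_beg <= child_eff_beg
        for p in parents
    )


client_versions = [
    ParentVersion("C882", date(2010, 1, 1), date(2010, 1, 1), date(2010, 6, 1)),
    ParentVersion("C882", date(2010, 1, 1), date(2010, 6, 1), END_OF_TIME),
]
ok = tri_parent_found(client_versions, "C882", date(2010, 3, 1), date(2010, 12, 1))
too_early = tri_parent_found(client_versions, "C882", date(2009, 12, 1), date(2010, 12, 1))
```

Note that the second parent version alone satisfies the check for a child that begins in March, even though that version itself begins in June. That is possible only because the episode begin date is carried on the row.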
Chapter 9 AN INTRODUCTION TO TEMPORAL TRANSACTIONS

CONTENTS
Effective Time Within Assertion Time
Explicitly Temporal Transactions: The Mental Model
A Taxonomy of Temporal Extent State Transformations
The Asserted Versioning Temporal Transactions
The Temporal Insert Transaction
The Temporal Update Transaction
The Temporal Delete Transaction
Glossary References

Managing Time in Relational Databases. DOI: 10.1016/B978-0-12-375041-9.00009-1. Copyright © 2010 Elsevier Inc. All rights of reproduction in any form reserved.

Temporal transactions are inserts, updates or deletes whose targets are asserted version tables. But temporal transactions are not submitted directly to the DBMS. The work that has to be done to manage conventional tables is straightforward enough that we can let users directly manipulate those tables. But bi-temporal tables, including asserted version tables, are too complex to expose to the transaction author. The difference between what the user wants done, and what has to take place to accomplish it, is too great.

And so temporal transactions are the way that the transaction author tells us what she wants done to the database, without having to tell us how to do it. The mechanics of how her intentions are carried out are encapsulated within our Asserted Versioning Framework. All that the application accepting the transaction has to do is to pass it on to the AVF.

A DBMS can enforce such constraints as entity integrity and referential integrity, but it cannot enforce the significantly more complex constraints of their temporal analogs. It is the AVF which enforces temporal entity integrity and temporal referential integrity. It is the AVF which rejects any temporal
transactions that violate the semantic constraints that give bi-temporal data its meaning. It is the AVF that gives the user a declarative means of expressing her intentions with respect to the transactions she submits.

In the Asserted Versioning temporal model, the two bi-temporal dimensions are effective time and assertion time. If assertion time were completely equivalent to the standard temporal model’s transaction time, then every row added to an asserted version table would use the date the transaction was physically applied as its assertion begin date. Important additional functionality is possible, however, if we permit rows to be added with assertion begin dates in the future. This is functionality not supported by the standard temporal model. But it comes at the price of additional complexity, both in its semantics and in its implementation.

Fortunately, it is possible to segregate this additional functionality, which is based on what we call deferred transactions and deferred assertions, and to discuss Asserted Versioning as though both its temporal dimensions are strictly analogous to the temporal dimensions of the standard temporal model. This makes the discussion easier to follow, and so this is the approach we will adopt. Deferred assertions, then, will not be discussed until Chapter 12.

Effective Time Within Assertion Time

A row in a conventional table makes a statement. Such a row, in a conventional Policy table, is shown in Figure 9.1. This row makes the following statement: “I represent a policy which has an object identifier of P861, a client of C882, a type of HMO and a copay of $15.” The statement makes no explicit reference to time. But we all understand that it means “I represent a policy which exists at the current moment, and which at the current moment has an object identifier of . . .”.

oid   client  type  copay
P861  C882    HMO   $15
Figure 9.1 A Non-Temporal Row.

This same row, with an effective time period attached, is shown in Figure 9.2. It makes the following statement: “I represent a policy which has an object identifier of P861 and which, from January 2010 to July 2010, has a client of C882, a type of HMO and a copay of $15.” In other words, the row shown in Figure 9.2 has been placed in a temporal container, and is treated as representing the object as it exists within that container, but as saying nothing about the object as it may exist outside that container.

oid   eff-beg  eff-end  client  type  copay
P861  Jan10    Jul10    C882    HMO   $15
Figure 9.2 A Uni-Temporal Version.

If we were managing uni-temporal versioned data, that would be the end of the story. But if we are managing bi-temporal data, there is one more temporal tag to add. This same row, with an assertion time period attached, is shown in Figure 9.3. It makes the following statement: “I represent the assertion, made in January 2010 but withdrawn in October 2010, that this row represents a policy which has an object identifier of P861 and which, from January 2010 to July 2010, has a client of C882, a type of HMO and a copay of $15.” In other words, the row shown in Figure 9.2, as included in its first temporal container, has been placed in a second temporal container, and is treated as representing what we claim, within that second container, is true of the object as it exists within that first container, but as saying nothing about what we might claim about the object within its first container outside that second container.

From January to July, this statement makes a current claim about what P861 is like during that period of time. From July to October, this statement makes an historical claim, a claim about what P861 was like at that time. But from October on, this statement makes no claim at all, not even an historical one. It is simply a record of what we once claimed was true, but no longer claim is true.
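The three phases just described (current claim, historical claim, no claim) can be expressed as a small Python function. This is only a sketch: the BiTemporalRow type and the claim_status function are hypothetical names, month labels such as Jan10 are read as the first day of the month, and periods are assumed to be closed-open. The "prospective claim" branch is our own addition for the case, not discussed above, in which a row is asserted before its effective time begins.

```python
from datetime import date
from typing import NamedTuple


class BiTemporalRow(NamedTuple):
    """Hypothetical shape of the row in Figure 9.3 (business data omitted)."""
    oid: str
    eff_beg: date
    eff_end: date
    asr_beg: date
    asr_end: date


def claim_status(row: BiTemporalRow, as_of: date) -> str:
    """Say what kind of claim the row makes at the moment `as_of`."""
    if not (row.asr_beg <= as_of < row.asr_end):
        return "no claim"            # outside assertion time: only a record
    if as_of < row.eff_beg:
        return "prospective claim"   # assumption: asserted ahead of effect
    if as_of < row.eff_end:
        return "current claim"       # asserted and currently in effect
    return "historical claim"        # asserted, but effective time is past


# Figure 9.3: effective Jan10 to Jul10, asserted Jan10 to Oct10.
figure_9_3 = BiTemporalRow("P861", date(2010, 1, 1), date(2010, 7, 1),
                           date(2010, 1, 1), date(2010, 10, 1))
march = claim_status(figure_9_3, date(2010, 3, 15))
august = claim_status(figure_9_3, date(2010, 8, 1))
november = claim_status(figure_9_3, date(2010, 11, 1))
```

The assertion time test comes first on purpose. Effective time is interpreted only inside assertion time, so a row that is no longer asserted makes no claim regardless of its effective time period.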
oid   eff-beg  eff-end  asr-beg  asr-end  client  type  copay
P861  Jan10    Jul10    Jan10    Oct10    C882    HMO   $15
Figure 9.3 A Bi-Temporal Row.

All this is another way of saying (i) that a non-temporal row represents an object; (ii) that when that row is tagged with an effective time period, it represents that object as it exists during that period of time (January to July in our example); and (iii) that when that tagged row receives an additional time period tag, it represents our assertion, during the indicated period of time
(January to October in our example) that the effective-time tagged row represents that object as it is/was during the other indicated period of time (January to July).

So an effective time tag qualifies the representation of an object, while an assertion time tag qualifies the effective-time qualified representation of an object. Effective time containment turns a row representing an object into a version. Assertion time containment turns a row representing a version into an assertion of a version, i.e. into a temporally delimited truth claim (see footnote 1). This is illustrated in Figure 9.4.

Figure 9.4 Assertions Are About Versions Are About Objects. (Diagram labels: object, version, assertion.)

Footnote 1: And if there were no versioning, and non-temporal statements were contained directly in assertion time, i.e. non-temporal rows were given an assertion time tag but not an effective time tag, then assertion time containment would turn non-temporal statements directly into temporally delimited truth claims.

Temporal integrity constraints govern the effective time relationships among bi-temporal rows. But, as we pointed out earlier, these effective time relationships apply only within shared assertion time. For example, when one version is asserted from January 2012 to April 2014, and another version of the same object is asserted from March 2012 to 12/31/9999, then the effective time periods of those two versions must not [intersect] from March 2012 to April 2014 in assertion time. But from January 2012 to March 2012, and again from April 2014 on, they neither [intersect] nor do not [intersect]. During those two periods of assertion time, the comparison doesn’t apply. During those times, those two versions are what philosophers call “incommensurable”.

In the following discussion of temporal integrity constraints, we will assume that all the rows involved exist in shared assertion time.

Note that it is effective time that exists within assertion time, and not vice versa. If the semantic containment were reversed, it would be possible to have some rows which are in effect and which we assert to be true, and also have other rows which are in effect but which we do not assert are true. But when we say that a row is in effect from January to June, we are saying that it makes a true statement about what its object is like during that period of time. In other words, we are asserting that it is true.

Consider a row with an assertion time of [Mar 2012 – Aug 2012] and an effective time of [Jan 2012 – Dec 2012]. Clearly this means that, from March to August, we assert that this row makes a true statement about what its object was like from January to December. Barring deferred assertions, we can tell that in March 2012, we retroactively inserted this version, effective as of January 2012, and that in August 2012, we withdrew the assertion.

Explicitly Temporal Transactions: The Mental Model

In every clock tick within a continuous period of effective time, an object is either represented by a row or not represented. If it is represented, there is business data which describes what that object is like during that clock tick. So our three temporal transactions affect the representation of an object in a period of time as follows:

(i) A temporal insert places business data representing an object into one or more clock ticks of effective time.
(ii) A temporal update replaces business data representing an object in one or more clock ticks of effective time.
(iii) A temporal delete removes business data representing an object from one or more clock ticks of effective time.

In all three cases, those clock ticks are contiguous with one another, as they must be since they constitute a continuous period of effective time. Let’s call that continuous span of clock ticks the target span for a temporal transaction. A designated target span can be anywhere along the calendar timeline. It can also be open or closed, i.e.
it can use either a normal date or 12/31/9999 to mark the end of the span.

When the user writes a temporal insert transaction, she is doing two things. First, she is designating a target span of clock ticks. Second, she is specifying business data that she wants inserted into the table, and that will occupy precisely that effective time target span within current assertion time. That current assertion time starts Now(), i.e. when the transaction is processed, and continues on until further notice. We can say that this transaction, like every transaction that accepts the default values for effective time, creates a version that describes what its object looks like from now on, and also that this transaction, like every transaction other than the deferred ones, creates an assertion that, from now on, claims that the version makes a true statement.

When the user writes a temporal update, she is also doing two things. First, she is designating a target span of clock ticks. Second, she is specifying a change in one or more columns of business data, a change that she wants applied to every version or part of a version of the designated object that falls, wholly or partially, within that effective time target span, a change that will be visible in current assertion time but not in past assertion time. She is not, however, necessarily specifying a change that will be applied to every clock tick in that target span, because a temporal update transaction does not require that the object it designates be represented in every clock tick within its target span, only that it be represented in at least one of those clock ticks.

A temporal delete is like a temporal update except that it specifies that every version or part of a version of the designated object that falls, wholly or partially, within that target span will be, in current assertion time, removed from that target effective timespan. Like a temporal update, a temporal delete does not require that its designated object occupy (be represented in) all of the clock ticks of the target span, only that it occupy at least one of them.

It follows that it is possible for more than one episode to be affected by an update or a delete. All episodes that fall within the target span of an update or delete transaction are affected. This includes parts of episodes as well as entire episodes.
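The scope rule for temporal updates and deletes can be sketched as a clipping operation. The Python below is an illustration, not the AVF's algorithm: the function names affected_parts and check_update_or_delete are hypothetical, effective time periods are assumed to be closed-open, and the versions passed in are assumed to be the currently asserted versions of one object.

```python
from datetime import date


def affected_parts(versions, span_beg, span_end):
    """Clip each (eff_beg, eff_end) version to the target span.

    Returns the parts of versions that fall inside the span, in time order.
    Versions wholly outside the span contribute nothing.
    """
    parts = []
    for eff_beg, eff_end in sorted(versions):
        beg = max(eff_beg, span_beg)
        end = min(eff_end, span_end)
        if beg < end:                # at least one clock tick in common
            parts.append((beg, end))
    return parts


def check_update_or_delete(versions, span_beg, span_end):
    """Reject the transaction if the object occupies no clock tick of the span."""
    parts = affected_parts(versions, span_beg, span_end)
    if not parts:
        raise ValueError("object is not represented anywhere in the target span")
    return parts


# Two episodes of one object, separated by a gap from April to June.
versions = [(date(2010, 1, 1), date(2010, 4, 1)),
            (date(2010, 6, 1), date(2010, 9, 1))]
parts = check_update_or_delete(versions, date(2010, 3, 1), date(2010, 7, 1))
```

Here one target span touches the tail of the first episode and the head of the second, so both are affected, each only in part. The gap between them is left as it was.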
Given a target timespan, it is possible for one episode to begin outside that span and either extend into it or extend past the end of it, and also for one episode to begin within that span and either end within that span or extend past the end of it. By the same token, within either or both of those partially included target episodes, the start or end of the target span may or may not line up with the start or end of a specific version. In other words, one version may start outside a target span but extend into or even through it, and another version may begin within a target span and either end within it or extend past the end of it.

The details of how these transactions work will be discussed in the rest of this chapter. But we can already see that the mental image of designating a target span of clock ticks and then issuing a transaction whose scope is limited to that target span, is intuitively clear. But it is one thing to provide a clear mental image, that of transactions as designating (i) an object, (ii) a span of time, (iii) business data and (iv) an action to take, but another thing to provide the details. To that task we now turn.

A Taxonomy of Temporal Extent State Transformations

Because of the complexities of managing temporal data, we need a way to be sure that we understand how to carry out every possible temporal extent state transformation that could be specified against one or more asserted version tables. A temporal extent state transformation is one which, within a given period of assertion time, adds to or subtracts from the total number of effective-time clock ticks in which a given object is represented.

We need a taxonomy of temporal extent state transformations. As we explained in Chapter 2, a taxonomy is not just any hierarchical arrangement we happen to come up with. It is one whose components are distinguished on the basis of what they mean, and not, for example, on the basis of what they contain, as parts explosion hierarchies are. It is also a hierarchical arrangement whose components are, based on their meanings, mutually exclusive and jointly exhaustive. Because good taxonomies are like this, constructing them is a way to be sure that we haven’t overlooked anything (because of the jointly exhaustive property) and haven’t confused anything with anything else (because of the mutually exclusive property).

We begin with objects and episodes. The target of every temporal transaction is an episode of an object. Semantically, it is episodes which are created or destroyed, or which are transformed from one state into another state. Physically, of course, it is individual rows of data which are created and modified (but never deleted) in an asserted version table. But what we are concerned with here is semantics, not bits and bytes, not strings of letters and numerals.
From a semantic point of view, episodes are the fundamental managed objects of Asserted Versioning. Figure 9.5 shows our taxonomy. Under each leaf node, we have a graphic representing that transformation. A shaded rect- angle represents an episode, and a non-shaded rectangle represents the absence of an episode. A short vertical bar separa- tes the before-state, on the left-hand side, from the after-state, on the right-hand side, produced by a transformation. Each of the nodes in our Allen relationship taxonomy are referred to, in the text, by surrounding the name of the node with
brackets, for example as with [meets], [during⁻¹], [intersects], etc. We also underline the names of nodes which are not leaf nodes. We will use a similar convention for this taxonomy of temporal extent transformations, referring to each of its nodes by surrounding the name of the node with curly braces, for example as with {erase}, {shorten forwards}, {shorten}, etc.

[Figure 9.5 A Taxonomy of Temporal Extent State Transformations. Root node: Asserted Versioning Temporal Extent State Transformations. First level: Create, Modify, Erase. Under Modify: Merge, Split, Lengthen, Shorten. Under Lengthen: Lengthen Backwards, Lengthen Forwards. Under Shorten: Shorten Backwards, Shorten Forwards. The figure also carries the labels Temporal Insert Transaction and Temporal Delete Transaction.]

With any type of thing we are concerned with, there are three basic things we can do with its instances. We can create an instance of it, modify an existing instance, or remove an instance. This is reflected in the three nodes of the first level underneath the root node. Of course, in the case of episodes, the {erase} transformation is neither a physical nor a logical deletion. Instead, it is the action in which the entire episode is withdrawn from current assertion time into past assertion time.

{Create}, {modify} and {erase} are clearly jointly exhaustive of the set of all temporal extent transformations, and also mutually exclusive of one another. Thus, at its first two levels, this taxonomy is, as all taxonomies must be, a partitioning.

At the level of abstraction we are dealing with, there is no further breakdown of either the {create} or {erase} transformations. Of course, there are variations on those themes, as we will see in the next chapter. For example, we can create an episode retroactively, or in current time, or proactively, and similarly for modifying or erasing an episode.

As for the {modify} transformation, we achieve a partitioning by distinguishing transformations which change one episode
into two episodes or vice versa, from transformations that transform episodes one for one, and then in the latter category, transformations that lengthen an episode’s representation in effective time from transformations that shorten it. Thus, we extend the property of being a mathematical partitioning down to the third level of this taxonomy. And again, as with the {create} and {erase} transformations, there are similar variations.

Next, for both the {lengthen} and {shorten} transformations, it is possible to do so at the beginning or at the end of the episode. And so we complete this taxonomy with the assurance that no instance of any parent node can fail to be an instance of a child node of that parent, and also that every instance of any parent node exists as no more than one instance across the set of its child nodes.

Another way to reassure ourselves of the completeness of this taxonomy is to note its bilateral symmetry. If the diagram were folded along a vertical line running between the {merge} and {split} nodes, and also between the {lengthen forwards} and {shorten backwards} nodes, each transformation would overlay the transformation that is its inverse.

A third way to assure ourselves of completeness is to analyze the taxonomy in terms of its topology. On a line representing a timeline, we can place a line segment representing an episode. We can also remove a line segment from that line. Given a line segment, we can either lengthen it forwards or backwards, shorten it forwards or backwards, or split it. Given two line segments with no other segments between them, we can merge them (by lengthening one forwards towards the other and/or lengthening the other backwards towards the first, until they [meet]). There is nothing that can be done with the placement of segments on a line that cannot be done by means of combinations and iterations of these basic operations.
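The topological reading of the taxonomy can be made concrete with a small sketch. It is illustrative only and is not how the AVF represents episodes: it assumes an episode is a half-open interval of integer clock ticks, and every function name is hypothetical. It also assumes, following the fold-symmetry remark above, that {shorten backwards} is the inverse of {lengthen forwards} (the end moves earlier) and that {shorten forwards} is the inverse of {lengthen backwards} (the beginning moves later).

```python
from typing import List, Tuple

# Assumption of this sketch: an episode is a half-open interval [begin, end)
# of integer effective-time clock ticks.
Episode = Tuple[int, int]


def _valid(ep: Episode) -> Episode:
    """Return ep unchanged, or raise if it covers less than one clock tick."""
    if ep[0] >= ep[1]:
        raise ValueError(f"episode {ep} must cover at least one clock tick")
    return ep


def _positive(ticks: int) -> int:
    """Return ticks unchanged, or raise if it is not a positive number."""
    if ticks <= 0:
        raise ValueError("ticks must be positive")
    return ticks


def extent(timeline: List[Episode]) -> int:
    """Total number of clock ticks in which the object is represented."""
    return sum(end - begin for begin, end in timeline)


def create(timeline: List[Episode], ep: Episode) -> List[Episode]:
    """{create}: place a new episode apart from every existing one."""
    _valid(ep)
    if any(ep[0] <= end and begin <= ep[1] for begin, end in timeline):
        raise ValueError("new episode would overlap or [meet] an existing one")
    return sorted(timeline + [ep])


def erase(timeline: List[Episode], ep: Episode) -> List[Episode]:
    """{erase}: remove an entire episode from the timeline."""
    return [e for e in timeline if e != ep]


def lengthen_backwards(ep: Episode, ticks: int) -> Episode:
    """{lengthen backwards}: the beginning moves earlier."""
    return _valid((ep[0] - _positive(ticks), ep[1]))


def lengthen_forwards(ep: Episode, ticks: int) -> Episode:
    """{lengthen forwards}: the end moves later."""
    return _valid((ep[0], ep[1] + _positive(ticks)))


def shorten_forwards(ep: Episode, ticks: int) -> Episode:
    """{shorten forwards}: the beginning moves later."""
    return _valid((ep[0] + _positive(ticks), ep[1]))


def shorten_backwards(ep: Episode, ticks: int) -> Episode:
    """{shorten backwards}: the end moves earlier."""
    return _valid((ep[0], ep[1] - _positive(ticks)))


def split(ep: Episode, gap_begin: int, gap_end: int) -> Tuple[Episode, Episode]:
    """{split}: open a gap strictly inside one episode, yielding two."""
    if not ep[0] < gap_begin < gap_end < ep[1]:
        raise ValueError("gap must lie strictly inside the episode")
    return (ep[0], gap_begin), (gap_end, ep[1])


def merge(earlier: Episode, later: Episode) -> Episode:
    """{merge}: fill the gap between two episodes, yielding one."""
    if not earlier[1] < later[0]:
        raise ValueError("episodes must be separated by a gap")
    return (earlier[0], later[1])
```

Each operation changes the value returned by `extent`, which is the defining property of a temporal extent state transformation, and each has an inverse: `merge(*split(ep, a, b))` returns `ep`, and so does `shorten_backwards(lengthen_forwards(ep, n), n)`.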
Finally, we need to be aware of the different scenarios possible under each of these nodes. As we have already pointed out, any of these transformations can result in changes to past, present or future effective time. Additional variations come into play when we distinguish between transformations that are applied to closed or to open episodes, and between transactions which leave an episode in a closed or open state.

With eight possible topological transformations, nine possible combinations of past, present and future effective time (three for the target and three for the transaction), and four possible open/closed combinations (two for the target and two for the transaction), we have a grand total of 8 × 9 × 4 = 288 scenarios. And this doesn’t even take into consideration deferred assertions, which
would at a minimum double the number of scenarios. Of course it might be possible to eliminate some of these scenarios as semantically impossible, i.e. as corresponding to no meaningful state and/or no meaningful transformation; but each one would still have to be analyzed.

We cannot analyze every one of these scenarios. Instead, our approach will be to analyze one variation of each temporal extent transformation, and then briefly discuss other variations which appear to be interestingly different.

The Asserted Versioning Temporal Transactions

Figure 9.6 shows three episodes (A, B and C) located along a four-year timeline. Eight versions make up these three episodes, which are all episodes of the same object, policy P861. We will be referring to this diagram throughout our discussion of temporal transactions.

In the syntax used for the transactions shown below, values are associated with business data columns by means of their position in a bracket-delimited, comma-separated series of values. Except for the object identifier, which occupies the first position, the other six columns which implement Asserted Versioning are not included in this list, those other columns being effective begin date, effective end date, assertion begin date, assertion end date, episode begin date and row create date. And of those six temporal parameters, only the first three may be specified on a temporal transaction.

The syntax used in this book for temporal transactions is not the syntax in which those transactions will be submitted to the AVF. That is, it is not the syntax with which the AVF will be invoked. We use it because it is unambiguous and compact. The actual transactions supported by release 1 of the AVF can be seen at our website, AssertedVersioning.com.

[Figure 9.6 Eight Versions and Three Episodes of Policy P861. Episodes A, B and C, made up of versions 1 through 8, on a timeline running from Jan 2010 to Jan 2014.]
Every temporal transaction may supply an oid, a business key, both or neither. In addition, every asserted version table has either reliable business keys, or unreliable ones. This gives us eight possible combinations (oid present or absent, business key present or absent, business key reliable or unreliable) for each of our three temporal transactions.

For every temporal transaction, the first thing the AVF does is to perform edits and validity checks. As we will see below, these eight combinations are the checklist the AVF uses for its validity checks.

The edit checks work like this. Following the object identifier are comma-delimited places for the business data associated with the object. In this case, that business data is, in order, client, policy type and copay. Following this set of values is a set of three asserted version dates. The effective begin date defaults to the current date, but may be overridden with any date, past, present or future, that is not 12/31/9999. The effective end date defaults to 12/31/9999, but may be overridden with any other date that is at least one clock tick later than the effective begin date. Assertion dates on transactions will be discussed in Chapter 12. Until then, we assume that the assertion begin date on all transactions takes on its default value of the date current when the transaction takes place. The assertion end date can never be specified on temporal transactions, and is always set to 12/31/9999.

After doing edit checks to ensure that each element of the transaction, including its data values, is well formed, the AVF does validity checks on the transaction as a whole. Only if a transaction passes both edit and validity checks will the AVF map it into one or more physical SQL transactions, submit those transactions to the DBMS, and monitor the results to ensure that either all of them or none of them update the database, i.e. that they make up a semantically complete atomic unit of work.
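The date defaults and edit checks just described can be sketched as follows. This is a hedged illustration, not the AVF's code: the names `TemporalDates`, `edit_check_dates` and `CLOCK_TICK` are hypothetical, and the clock tick is assumed to be one day, which the text does not specify.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

END_OF_TIME = date(9999, 12, 31)   # the 12/31/9999 value used in the text
CLOCK_TICK = timedelta(days=1)     # assumption: a clock tick is one day


@dataclass(frozen=True)
class TemporalDates:
    eff_beg_dt: date
    eff_end_dt: date
    asr_beg_dt: date
    asr_end_dt: date


def edit_check_dates(eff_beg_dt: Optional[date] = None,
                     eff_end_dt: Optional[date] = None,
                     today: Optional[date] = None) -> TemporalDates:
    """Apply the defaults, then reject dates that are not well formed."""
    today = today if today is not None else date.today()
    # Effective begin date defaults to the current date.
    beg = eff_beg_dt if eff_beg_dt is not None else today
    # Effective end date defaults to 12/31/9999.
    end = eff_end_dt if eff_end_dt is not None else END_OF_TIME
    if beg == END_OF_TIME:
        raise ValueError("effective begin date may not be 12/31/9999")
    if end - beg < CLOCK_TICK:
        raise ValueError("effective end date must be at least one clock "
                         "tick later than the effective begin date")
    # Assertion begin date takes its default (the current date) until
    # Chapter 12; assertion end date is always 12/31/9999.
    return TemporalDates(beg, end, today, END_OF_TIME)
```

Passing `today` explicitly keeps the function deterministic; a call with no arguments yields an effective period running from the current date to 12/31/9999.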
The Temporal Insert Transaction

The format of a temporal insert transaction is as follows:

INSERT INTO {tablename} [,,, ] eff_beg_dt, eff_end_dt, asr_beg_dt

The validity checks work like this:

(i) No oid, no business key, business key is reliable. In this case, the AVF rejects the insert. The reason is that if the business key is reliable, an insert must provide it, even if it also provides an oid. Otherwise, it would be like an insert to a conventional table with a missing or incomplete primary key.
(ii) No oid, no business key, business key is not reliable. In this case, the AVF accepts the insert and assigns it a new oid. The reason is that if a business key is not reliable, it is not required on an insert transaction. Since no business key match logic will ever be carried out on tables with unreliable business keys, it doesn’t matter if rows lacking those business keys make their way onto those tables.

(iii) No oid, business key present, business key is reliable. In this case, the AVF looks for a match on the business key. If it finds one, it assigns the oid of the effective-time latest matching row to the transaction. Otherwise, it assigns a new oid to the transaction. The reason is that multiple appearances of the same object, separated by gaps in time, should be recognized as multiple appearances of the same object whenever possible, and not as appearances of different objects.

(iv) No oid, business key present, business key is not reliable. In this case, the AVF accepts the insert and assigns it a new oid. The reason is that if the business key is not reliable, it doesn’t matter, and so the AVF proceeds as though the business key were not there. Note that in this case, multiple temporal inserts which lack an oid but which contain the same business key value will result in multiple object identifiers all using that same value. Semantically, the business key will be a homonym, a single value designating multiple different objects. This, of course, is precisely what “unreliable” means in “unreliable business key”.

(v) Oid present, no business key, business key is reliable. In this case, the AVF rejects the insert. The reason is the same as it was for case (i); if the business key is reliable, an insert must provide it. Otherwise, it would be like an insert to a conventional table with a missing or incomplete primary key.

(vi) Oid present, no business key, business key is not reliable.
In this case, the AVF accepts the insert and uses the oid supplied with it. The reason is that if the business key is not reliable, it doesn’t matter, and so the AVF proceeds as though the business key were not there. And if the oid is already in use, that doesn’t matter. If it is in use, and there is a collision in time periods, temporal entity integrity checks will catch it.

(vii) Oid present, business key present, business key is reliable. In this case, the AVF looks for a match on the business key. If it finds a match, and the oid of the matching row matches the oid on the transaction, the AVF accepts
the transaction. If it finds a match, but the oid of the matching row does not match the oid on the transaction, the AVF rejects the transaction. If it does not find a match, and the oid on the transaction does not match any oid already in use, the AVF accepts the transaction. And if it does not find a match, but the oid on the transaction does match an oid already in use, the AVF rejects the transaction. The reason behind all this logic is that when both an oid and a reliable business key are present, any conflict makes the transaction invalid.

(viii) Oid present, business key present, business key is not reliable. In this case, the AVF accepts the insert and uses the oid supplied with it. The reason is that if the business key is not reliable, it doesn’t matter, and so the AVF proceeds as though the business key were not there. And if the oid is already in use, that doesn’t matter. If it is in use, and there is a collision in time periods, temporal entity integrity checks will catch it. Note that if the oid is in use, and there is no collision in time periods, then if the business key is a new one, the result of applying the transaction is to assign a new business key to an object, in clock ticks not previously occupied by that object.

The Temporal Insert Transaction: Semantics

In a conventional table, if an object is not represented and the user wishes to represent it, she issues an insert transaction which creates a row that does just that. The insert transaction expresses not only her intentions, but also her beliefs. It expresses her intention to create a representation of an object, as it currently exists, in the target table. But it also expresses her belief that such a representation does not already exist. If she is mistaken in her belief, her transaction is rejected, which is precisely what she would expect and want to happen.
In an asserted version table, the question is not whether or not the object is already represented, but rather whether or not the object is already represented during all or part of the target timespan indicated on the transaction. If the user submits a temporal insert transaction, she also expresses both intention and belief. Her intention is to create a representation of an object in an effective time target span. Her belief is that such a representation does not already exist anywhere in that target span. If she is mistaken in her belief, her transaction is rejected, which is precisely what she would expect and want to happen.
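The eight validity-check cases (i) through (viii) and the target-span semantics just described can be combined in one small in-memory sketch. It rests on several assumptions that are not in the text: `Table`, `Row` and `Rejected` are hypothetical names, effective periods are treated as half-open (the end date is exclusive), assertion time is ignored so that every stored row counts as currently asserted, and a new oid is simply one more than the highest oid in use.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Row:
    oid: int
    business_key: Optional[str]
    eff_beg_dt: date
    eff_end_dt: date   # assumption: exclusive end of the effective period


class Rejected(Exception):
    """Raised when a temporal insert fails a validity or integrity check."""


class Table:
    def __init__(self, business_key_reliable: bool) -> None:
        self.business_key_reliable = business_key_reliable
        self.rows: List[Row] = []

    def _new_oid(self) -> int:
        return max((r.oid for r in self.rows), default=0) + 1

    def _resolve_oid(self, oid: Optional[int],
                     business_key: Optional[str]) -> int:
        if not self.business_key_reliable:
            # Cases (ii), (iv), (vi), (viii): the business key is ignored.
            return oid if oid is not None else self._new_oid()
        if business_key is None:
            # Cases (i) and (v): a reliable business key must be supplied.
            raise Rejected("reliable business key is missing")
        matches = [r for r in self.rows if r.business_key == business_key]
        latest = max(matches, key=lambda r: r.eff_beg_dt, default=None)
        if oid is None:
            # Case (iii): reuse the oid of the effective-time latest match.
            return latest.oid if latest is not None else self._new_oid()
        # Case (vii): any conflict between oid and business key is invalid.
        if latest is not None:
            if latest.oid != oid:
                raise Rejected("business key belongs to a different oid")
            return oid
        if any(r.oid == oid for r in self.rows):
            raise Rejected("oid is in use under a different business key")
        return oid

    def temporal_insert(self, eff_beg_dt: date, eff_end_dt: date,
                        oid: Optional[int] = None,
                        business_key: Optional[str] = None) -> Row:
        oid = self._resolve_oid(oid, business_key)
        # Temporal entity integrity: the object must not already be
        # represented anywhere in the target span.
        for r in self.rows:
            if (r.oid == oid and r.eff_beg_dt < eff_end_dt
                    and eff_beg_dt < r.eff_end_dt):
                raise Rejected("object already represented in target span")
        row = Row(oid, business_key, eff_beg_dt, eff_end_dt)
        self.rows.append(row)
        return row
```

With a reliable business key, two inserts of the same key into non-overlapping spans receive the same oid (case iii), while an insert whose span overlaps an existing row for that oid is rejected. With an unreliable key, the same two inserts receive different oids, which is the homonym situation described in case (iv).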