Managing time in relational databases- P10

Chia sẻ: Thanh Cong | Ngày: | Loại File: PDF | Số trang:20

0
37
lượt xem
4
download

Managing time in relational databases- P10

Mô tả tài liệu
  Download Vui lòng tải xuống để xem tài liệu đầy đủ

Tham khảo tài liệu 'managing time in relational databases- p10', công nghệ thông tin, cơ sở dữ liệu phục vụ nhu cầu học tập, nghiên cứu và làm việc hiệu quả

Chủ đề:
Lưu

Nội dung Text: Managing time in relational databases- P10

  1. 164 Part 3 DESIGNING, MAINTAINING AND QUERYING ASSERTED VERSION DATABASES transactions until it is the right time to apply them, Asserted Versioning applies them right away, but does not immediately assert them. These deferred assertions may themselves be updated or deleted, and the moment on which their assertion periods become current is the moment on which we begin to claim that the world was, is or will be as they describe it. Just as deferred assertions replace collections of transactions that have not yet been applied to the database, bi-temporal data in any of the other seven categories replaces other physically external datasets. Asserted version tables contain data in all these temporal categories and, in doing so, internalize what would otherwise be physically distinct datasets, ones whose management costs are obviously significant. In Chapter 13, we look more closely at the entire family of pipeline datasets. We distinguish eight logical categories of pipe- line datasets, based on where in a combination of past, present or future assertion and effective time their data is located. Hav- ing previously shown how to eliminate these physically distinct datasets by bringing them into the production tables which are their destinations and points of origin, we now discuss each of them and show how queries and views can reassemble, as queryable objects, exactly the data that had existed in those datasets. This demonstrates that while eliminating the manage- ment costs associated with this data, we can still make this data available in whatever combinations it is needed. In Chapter 14, we discuss how to query asserted version tables. As we said before, many queries, especially the ad hoc queries written by non-technical database users, will be directed against non-temporal or uni-temporal views of asserted version tables, not against those bi-temporal tables themselves. But many queries will be written directly against those physical tables, especially those we call production queries. In that case, the effective time period specified on the query, and which qualifies the result set, will have to be com- pared to the effective time periods of the rows targeted by the query; and as we know from our review of the Allen relationships, there are 13 different ways in which those two time periods may be positioned with respect to one another. And when those queries involve joins across two (or more) asserted version tables, then the Allen relationship issues can become even more difficult. In Chapter 15, we discuss how to optimize the performance of Asserted Versioning databases. Our focus is on optimizing access to currently asserted current versions, i.e. to the rows that correspond to rows in a conventional table of persistent objects.
  2. Part 3 DESIGNING, MAINTAINING AND QUERYING ASSERTED VERSION DATABASES 165 In this chapter, we focus on index design, although a wide range of other optimization techniques are also considered. In Chapter 16, we conclude our presentation of Asserted Versioning. We discuss each of the four objectives we had for Asserted Versioning, and which we described in the Preface, and explain why we think those objectives have been met. We point out that Asserted Versioning has value both as a bridge to a future standards-based and vendor-provided implementation of bi-temporal data, and as a destination, being itself a semanti- cally complete implementation of bi-temporal data which works with today’s SQL and today’s databases. In the last section, we discuss ongoing research and development at Asserted Versioning LLC, and explain how interested readers can learn more about Asserted Versioning. Glossary References Glossary entries whose definitions form strong inter- dependencies are grouped together in the following list. The same Glossary entries may be grouped together in different ways at the end of different chapters, each grouping reflecting the semantic perspective of each chapter. There will usually be sev- eral other, and often many other, Glossary entries that are not included in the list, and we recommend that the Glossary be consulted whenever an unfamiliar term is encountered. We note, in particular, that none of the nodes in our taxon- omy of data management methods, or our state transformation taxonomy, are included in this list. In general, we leave taxon- omy nodes out of these lists, but recommend that the reader look them up in the Glossary. Allen relationships asserted version table Asserted Versioning Framework (AVF) assertion time transaction time bi-temporal uni-temporal deferred assertion deferred transaction
  3. 166 Part 3 DESIGNING, MAINTAINING AND QUERYING ASSERTED VERSION DATABASES effective time valid time object persistent object physical transaction temporal transaction temporal parameter pipeline dataset production table temporal entity integrity (TEI) temporal referential integrity (TRI) temporal extent state transformation the standard temporal model
  4. DESIGNING AND GENERATING 8 ASSERTED VERSIONING DATABASES CONTENTS Translating a Non-Temporal Logical Data Model into a Temporal Physical Data Model 169 The Logical Data Model 169 Referential Constraints Between Non-Temporal and Bi-Temporal Tables 171 Asserted Versioning Metadata 173 The Physical Data Model 180 Generating an Asserted Versioning Database from a Physical Data Model and Metadata 181 Temporalizing the Physical Data Model 182 Generating Temporal Entity and Temporal Referential Integrity Constraints 185 Redundancies in the Asserted Versioning Bi-Temporal Schema 186 Apparent Redundancies in the Asserted Versioning Schema 186 A Real Redundancy in the Asserted Versioning Schema 188 Glossary References 189 An Asserted Versioning database is one that contains at least one asserted version table. An asserted version table is one whose schema is that shown in Chapter 6, and on which the two temporal integrity constraints are enforced. Figure 8.1 shows how Asserted Versioning databases are gener- ated from the combination of a conventional logical data model and a set of metadata entries. Note that the logical data model has no temporal features. This means that logical data models of conventional databases, developed perhaps years ago, do not have to be changed if a decision is made to convert one or more of the tables in those databases into bi-temporal asserted Managing Time in Relational Databases. Doi: 10.1016/B978-0-12-375041-9.00008-X Copyright # 2010 Elsevier Inc. All rights of reproduction in any form reserved. 167
  5. 168 Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES Logical Data Model Temporal Requirements Physical Data Model Temporal Metadata An Asserted Versioning Database Non-Temporal Tables Asserted Versioning Tables TEI Enforcement TRI Enforcement Figure 8.1 Designing and Generating an Asserted Versioning Database. version tables. This means that when building new logical data models, or extending old ones, data modelers can ignore tempo- ral requirements and focus on design issues which are often complex enough without introducing temporal considerations. It means that temporal requirements can be expressed declara- tively, in metadata associated with a conventional data model, rather than by hardcoding those requirements in the data model itself. This greatly simplifies the work of the data modeler. Her work, as far as temporality is concerned, is not to translate tem- poral requirements into data model constructs. Instead, it becomes that of simply expressing business requirements for temporal data as a set of metadata associated with the data model. As well as developing the logical model, the other task for the data modeler is to translate business requirements for temporal information into metadata. There are metadata entries for each table in the data model which is to be generated as an asserted version table. For these tables, there are entries to specify which business column or columns make up the business key for the table. This metadata also provides the information which the AVF needs to enforce temporal entity integrity and temporal referential integrity. Once the logical model and its associated metadata are com- plete, the next step is to generate a physical data model from the logical model. At this point, of course, the physical model that is
  6. Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES 169 generated has no temporal features; all of its tables are conven- tional non-temporal tables. The final step is a process in which a team consisting of the data modeler and a DBA uses the temporal metadata to modify the physical data model, changing specific tables into asserted version tables. In this process, pairs of date columns are added to implement assertion time and effective time. Surrogate pri- mary keys are created as object identifiers. Physical primary keys are converted into Asserted Versioning business keys, and physi- cal foreign keys into Asserted Versioning temporal foreign keys. However, for organizations using the ERwin data modeling tool, this manual process is unnecessary. In the first release of the AVF, we provide ERwin user-defined properties (UDPs) to hold all temporal metadata, and ERwin scripting macros which use these UDPs to generate a physical data model in which all the temporal conversion work has already been done. Note also that the Asserted Versioning database is more than a set of entries in a database catalog—more than the temporal data schemas shown in Figure 8.1. It is also the stored pro- cedures, triggers or other code that enforces temporal integrity constraints on temporal tables. In the Preface, we stated that Asserted Versioning simplifies the management of temporal databases by providing mainte- nance encapsulation, query encapsulation and design encapsu- lation. What we have just described here is how Asserted Versioning provides design encapsulation. In the rest of this chapter, we will see how design encapsulation works. Translating a Non-Temporal Logical Data Model into a Temporal Physical Data Model The Logical Data Model Figure 8.2 is the logical data model (LDM) of a sample data- base we have constructed, and which can be accessed at AssertedVersioning.com. The most important thing to notice about this LDM is that there is nothing special about it. In par- ticular, there is nothing explicitly temporal about it. And yet from this model, supplemented with metadata provided by the data modeler, the AVF will create an Asserted Versioning data- base in which all of the tables are bi-temporal tables. There may be other tables in an Asserted Versioning database which are non-temporal tables. But we are not concerned with them. The DBMS enforces entity integrity on them, while the
  7. 170 Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES Client Wellness-Program-Category client-nbr: CHAR(10) wellpgmcat-cd: CHAR(4) client-nm: VARCHAR(40) wellpgmcat-nm: VARCHAR(50) may own may categorize may be enrolled in Policy Wellness-Program policy-nbr: CHAR(10) wellpgm-nbr: CHAR(10) client-nbr: CHAR(10) (FK) wellpgmcat-cd: CHAR(4) (FK) policy-type: CHAR(3) wellpgm-nm: VARCHAR(50) copay-amt: MONEY may enroll may be amended by Wellness-Program-Enrollment client-nbr: CHAR(10) (FK) wellpgm-nbr: CHAR(10) (FK) Policy-Amendment policy-amend-nbr: CHAR(10) wellpgm-enroll-begin-wgt: SMALLINT wellpgm-enroll-end-wgt: SMALLINT policy-nbr: CHAR(10) (FK) wellpgm-enroll-begin-a1c-nbr: DECIMAL(2,1) policy-amend-txt: VARCHAR(100) wellpgm-enroll-end-a1c-nbr: DECIMAL(2,1) Figure 8.2 The Sample Database Logical Data Model. AVF enforces temporal entity integrity on its tables. The DBMS enforces referential integrity on them, while the AVF enforces temporal referential integrity on its tables. Later on, additional non-temporal tables may be converted to asserted version tables, and this can be done without making any changes to the logical data models of those databases. Temporality is introduced “downstream” from the logical data models, by making entries in asserted version metadata tables, and then by modifying DDL in accordance with this metadata before that DDL is submitted to the DBMS. This particular logical data model is a simple one. In it, a cli- ent may own any number of policies, each of which must be owned by exactly one client. Each policy may be amended by any number of policy amendments, each of which amends exactly one policy.1 A wellness program category categorizes any number of wellness programs, each of which is categorized by exactly one wellness program category. A client may be 1 “Any number of” is our substitute for the less graceful expression “zero, one or more”. The two expressions mean the same thing.
  8. Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES 171 enrolled in any number of wellness programs, each of which may enroll any number of clients. Thus, the entity Wellness Program Enrollment is an associative entity, implementing a many-to-many relationship between clients and programs. The business meaning of the entities, attributes and relation- ships should need no explanation, with the possible exception of the two attributes with a suffix of “a1c”. As all diabetics know, a1c is a blood test that measures what percentage of a person’s hemoglobin has glucose attached to it. As ERwin data modelers will immediately recognize, primary keys are shown above the horizontal line in each entity. Foreign keys, of course, have “(FK)” as a separate suffix. Since all of these entities will be generated as temporal tables, all these FKs will be replaced by temporal foreign keys, by TFKs. As we said earlier, the current implementation of Asserted Versioning uses ERwin’s user-defined properties to capture the metadata needed to generate a bi-temporal database schema from a non-temporal data model. In this chapter, however, we will organize that metadata as a set of five metadata tables. Referential Constraints Between Non-Temporal and Bi-Temporal Tables There is nothing semantically wrong about a bi-temporal table being the child table in a referential integrity relationship. In that case, the bi-temporal table will contain a conventional foreign key which points to a row in a parent non-temporal table. Conversely, there is nothing semantically wrong about a non-temporal table being the child table in a temporal referen- tial integrity relationship. In that case, the non-temporal table will contain a temporal foreign key which points to an episode in a parent bi-temporal table. In both cases, the referential relationships reflect an existence dependency between the objects involved. When both tables are non-temporal, we represent that existence dependency as a ref- erential integrity dependency. When both tables are bi-temporal, we represent it as a temporal referential integrity dependency. When one table is non-temporal and the other bi-temporal, the existence dependency between their objects isn’t somehow nullified because of our choice of how to represent it. And so our managed objects should be able to express that dependency even in that “mixed” case. As bi-temporal theory, Asserted Versioning interprets non- temporal tables as tables whose rows are bi-temporal, but
  9. 172 Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES implicitly so. Rows in non-temporal tables exist in an assertion time which is co-temporal with their physical presence, and so too for effective time. In other words, non-temporal rows are asserted for as long as they physically exist, and are versions which describe what their objects are currently like for as long as those rows physically exist. Their assertion time periods and their effective time periods are fixed; both are always [row create date – 12/31/9999]. In an alternative interpretation, non-temporal rows are asserted for as long as they physically exist in their current form, and are versions which describe their objects for as long as those rows physically exist in their current form. Each time a row is updated, its old form, i.e. an exact image of all of the data in that row, is lost because at least some of it is overwritten. In this interpretation, those rows must have a last update date, in which case their assertion time periods and their effective time periods are not fixed because both are [last update date – 12/31/9999]. In our initial release of the AVF, however, we will not support mixed referential relationships. One of these relationships won’t work, and the other one is dangerous. The relationship that won’t work is the one in which the child table is a non-temporal table, and contains a temporal foreign key. This temporal foreign key is not declared in DDL because current DBMSs cannot rec- ognize it. This temporal foreign key cannot be managed by the DBMS because, unlike normal foreign keys, it does not point to a specific row in the parent table. The relationship that is dangerous is the one in which the child table is an asserted version table, and contains a conven- tional foreign key. This foreign key is declared in DDL, and the DBMS can recognize it. The danger lies in the fact that the DBMS can then carry out a delete cascade from the parent table to the child table, if it is so directed. This delete cascade, however, is unaware of the temporal semantics of the child table. It will simply find every physical row in that child table that contains the referenced foreign key value, and will then physically delete that row. This is meat cleaver work where delicate surgery is required. It can destroy past, current and future episodes in the child table, leaving col- lections of versions which are semantically invalid, and which the AVF will be unable to manage. It will physically remove both version history and assertion history, whereas bi-temporal data management is a promise to preserve both. The conventional delete set null rule would be a safer alternative because episode timelines would not be destroyed. Nonetheless, column-level history would still be lost.
  10. Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES 173 Mixed referential relationships should be addressed, but they will not be addressed in the first release of the AVF. And so, in the remainder of this chapter, and in most of the remainder of this book, we will not discuss them. Asserted Versioning Metadata Figures 8.3 through 8.7 show the metadata needed by the AVF to generate an Asserted Versioning database from the LDM shown in Figure 8.2. As with other figures showing tables, we indicate foreign keys by italicizing the column heading, and primary keys by underlining the column heading. We show these metadata tables as themselves conventional tables, and therefore all relationships as ones implemented with conventional foreign keys. This simplifies the discussions in this chapter, and allows us to concentrate on the metadata without being concerned about keeping a bi-temporal history of changes to that data. Table Type Metadata In a logical data model that will generate an Asserted Versioning database, we need a metadata list of which entities to generate as non-temporal tables and which entities to gener- ate as asserted version tables. This metadata table lists all the tables that will be generated as asserted version tables, as shown in Figure 8.3. For this data model, we will generate all its entities as asserted version tables. The non-key column in this metadata table is the business key flag. If it is set to ‘Y’, then the table is considered to have a reliable business key. Otherwise, it is set to ‘N’, indicating that the business key for the table is not reliable. Table-Type tbl-nm bus-key- rlb-flag Client Y Policy Y Policy_Amendment Y Wellness_Program Y Wellness_Program_Category Y Wellness_Program_Enrollment Y Figure 8.3 The Table Type Metadata Table.
  11. 174 Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES The term business key usually refers to a set of one or more columns of data which can be used as unique identifiers for the objects they represent, and which contain business- meaningful data only, and no surrogate-valued columns. Some- times business keys are used as primary keys. But sometimes, surrogate-valued columns are used as primary keys instead of business keys. Asserted Versioning uses the term “business key” to refer to the one or more columns of an asserted version table which are the primary key of the corresponding entity in the logical data model, or of the corresponding conventional table which has been converted to an asserted version table. Sometimes these columns contain business-meaningful data, but some- times they do not. The role of business keys in asserted version tables is to identify the object represented by each row in the same way that object would be identified, or was identified, in a conventional table. Most of the time, business keys are reliable. In other words, most of the time, each business key value is a unique identifier for one and only one object. So in a non-temporal table, it would be possible to define a unique index on the business key, whether or not it is used as the primary key. Unfortunately, it is sometimes necessary to manage tables whose business keys are not reliable. If the business keys for a table are not completely reliable, we cannot be sure that each business key value represents one and only one object. We may sometimes have to manage transactions, and rows in tables, that have missing or incomplete business keys. In Chapter 5, we discussed how business keys are used when matching transactions to non-temporal tables, and how they are used when matching transactions to asserted version tables. But throughout that discussion, we assumed that the business keys for those tables were reliable. When they are not, it is more diffi- cult to match transactions to a target table, especially when that target table is bi-temporal. In the next chapter, we will discuss the match logic that must be used when temporal inserts, updates and deletes are applied both to asserted version tables with reliable business keys, and also to asserted version tables with unreliable business keys. Temporal Foreign Key Metadata The temporal foreign key metadata table contains one entry for every temporal foreign key. Recall that temporal foreign keys, like conventional foreign keys, express an existence dependency, one in which the existence of the child object represented by the
  12. Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES 175 TFK src-tbl-nm col-nm tar-tbl-nm req- del-rule- flg ind Policy client_oid Client Yes Cascade Policy_ policy_oid Policy Yes Cascade Amendment Wellness_ wellpgmcat_ Wellness_ No Set Null Program oid Program_ Category Wellness_ client_oid Client Yes Cascade Program_ Enrollment Wellness_ wellpgm_oid Wellness_ Yes Restrict Program_ Program Enrollment Figure 8.4 The Temporal Foreign Key Metadata Table. child row depends on the existence of the parent object represented by a parent episode. In our model, for example, policies are existence dependent on clients, policy amendments are existence dependent on policies, and so on. But unlike a conventional foreign key, which picks out a unique row in the parent table, a temporal foreign key designates an object, but does not pick out any one managed object, any specific episode or version of that object. The AVF takes the oid and searches for an episode of that object whose assertion and effective time periods include those of the child row. The temporal foreign key itself is always a single column, containing an oid value; none of the other columns making up a full Asserted Versioning primary key are part of the TFK. Both the source table name and the target table name columns are foreign keys back to the Table Type metadata table. For each entry, the source table name column says which table contains the TFK, while the target table name column says which table is referenced by the TFK. Column name provides the name of the TFK column itself. Generally, all TFK columns have the same name as the oid they point to, but this obviously will not work when there are two or more TFKs in the same child table that point to the same parent table. The required flag says whether the TFK is required or not. Like FKs, TFKs may be optional or required. If they are required, the TFK in each row of the table must, at all times, contain an oid to an object one of whose episodes has an effective time period that includes ([fills-1]) the effective time period of the row that contains the TFK. If the TFK is not required, the TFK in each row must either contain a valid oid reference, or be null. In our sample database, we have made the TFK to Wellness
  13. 176 Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES Program Category optional so we can illustrate how the SET NULL option works with temporal foreign keys. The delete rule indicator has the usual three choices. A delete may be restricted if there are any dependent children; or the foreign key in those children may be set to NULL; or the children may be delete cascaded. These same three choices exist when the dependent children are temporally dependent, i.e. when their dependency is expressed by means of a TFK rather than an FK. The delete indicator for a TFK cannot be set to the null option, of course, unless the TFK required flag is set to “No”. But unlike a conventional delete cascade, a temporal delete cascade does not simply delete a dependent child managed object. Instead, a temporal delete cascade removes the represen- tation of a dependent child managed object from all of the clock ticks from which the managed object it is dependent on is being removed. So, for example, if a delete to a client removes that cli- ent’s representation from the months of March and April 2015, then the delete cascade down to that client’s policies will guar- antee that none of those policies are in effect during those two months. Business Key Metadata The business key metadata table, shown in Figure 8.5, lists the business key for every temporal table. This is the business key that uniquely identifies an object. In a non-temporal table, the business key may or may not also function as the primary key. But even if a business key is not used as a primary key, a reliable business key still uniquely identifies a row, since in a non-temporal table each object is represented by only one row. This is usually enforced with a unique index in a conventional database. Buskey tbl-nm bus-key-col-nm Client client_nbr Policy policy_nbr Policy_Amendment policy_amend_nbr Wellness_Program_Category wellpgmcat_cd Wellness_Program wellpgm_nbr Wellness_Program_Enrollment client_oid Wellness_Program_Enrollment wellpgm_oid Figure 8.5 The Business Key Metadata Table.
  14. Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES 177 But in a temporal table, multiple rows may represent the same object, and so all of those rows will have the same business key. Consequently, we cannot guarantee that each business key points to one and only one object by defining a unique index on it. Nor can we simply extend the scope of the index by defin- ing a unique index on the business key plus assertion begin and end dates and effective begin and end dates. On the other hand, once SQL supports a PERIOD datatype, then new index methods already defined by computer scientists could enforce uniqueness on the combination of a business key plus an assertion and effective time period. A business key can be made up of any number of columns. For example, the business key for the Wellness Program Enroll- ment table consists of two columns. Each column is a temporal foreign key which points back to an object in a parent table. The Wellness Program Enrollment table is a temporal associative table, expressing a many-to-many relationship between wellness programs and the clients that enroll in them. Business keys are unique to the objects they designate. But in temporal tables, a business key may appear on multiple rows, and consequently their uniqueness cannot be guaranteed simply by defining a unique index on them. Nonetheless, they have an important role to play. We discuss business keys, how the AVF’s enforcement of temporal entity integrity guarantees that no two objects will ever have the same business key, and how busi- ness keys help the business user clarify her intentions when submitting transactions to an Asserted Versioning database, in Chapter 9. Foreign Key Mapping Metadata The foreign key mapping metadata table, shown in Figure 8.6, lists the foreign keys used by tables that were originally conven- tional tables, and that have been converted into asserted version tables. For each foreign key, whether it consists of a single col- umn or multiple columns, the table shows the single-column FK-TFK tbl-nm fkey-col-nm tfk-col-nm Policy client_nbr client_oid Policy_Amendment policy_nbr policy_oid Wellness_Program wellpgmcat_cd wellpgmcat_oid Wellness_Program_Enrollment client_nbr client_oid Wellness_Program_Enrollment wellpgm_nbr wellpgm_oid Figure 8.6 The Foreign Key to Temporal Foreign Key Metadata Table.
  15. 178 Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES temporal foreign key that replaces it. The TFK must be included in this table to avoid ambiguities in the FK to TFK mapping when the same parent and child tables have more than one ref- erential relationship between them. The purpose of this metadata table is to assist in reconstructing a view of the asserted version table which is exactly like the original conventional table. This view will have exactly the same name as the original table. And so if it is also column for column and row for row identical with that original table, all queries which worked against the original table will work, unchanged, against this view. As we can see in Figure 8.8, this foreign key information is lost when the tables are converted. None of the DBMS catalog entries for the tables in Figure 8.8 will contain any foreign key declarations because as far as the DBMS knows, those tables contain no foreign keys. None of the foreign keys shown in Fig- ure 8.2 have made their way into the DBMS catalog for this database. In place of their original foreign keys (FKs), these tables now contain temporal foreign keys (TFKs). But today’s DBMSs cannot recognize temporal foreign keys, and so to them, these non-PK oid columns are just non-key columns, not references to objects on which the object represented by their own row is existence- dependent. To these DBMSs, the PK oid columns are just pri- mary key columns, and so they are oblivious to the role of these columns as object identifiers. To these DBMSs, the two PK date columns in each table are just two PK date columns. They do not recognize these two dates as each representing a PERIOD of time, which is a concept, i.e. a datatype, that these DBMSs don’t have to begin with. All in all, to these DBMSs, there is noth- ing especially temporal about the database described by Figure 8.8. Miscellaneous Metadata The miscellaneous metadata table, shown in Figure 8.7, contains additional metadata needed by the AVF. In release 1 of the AVF, there are only two such items. One is the granularity MiscMetaData type data assertion time granularity {microsecond timestamp} effective time granularity {month, 1} Figure 8.7 The Miscellaneous Metadata Table.
  16. Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES 179 for assertion time, and the other the granularity for effective time. Granularity, of course, is the size of the clock tick. As we have said, the clock ticks being used in examples in this book are one-month clock ticks, for both assertion time and effec- tive time. But in our consulting experience, companies often use one granularity for versioning, and a different granularity to record row creation dates. Versioning, of course, is the best practice approximation to Asserted Versioning’s effective time, and row creation dates are an approximation to Asserted Versioning’s assertion time. And in our own prior implementations of bi- temporal data management, we have used dates for effective time and microsecond timestamps for assertion time. By using the same granularity for all assertion times in the same database, and the same granularity for all effective times, it is easy to determine the Allen relationship between any two time periods. So suppose that the same granularity is not used, and that there are two time periods which start at the same time, one of which is delimited by dates and the other by timestamps. The values, each of which designate the same point in time, are not identical. But if the same granularity is used, the EQUALS operator will tell us whether or not those time periods begin at the same time. Of particular importance is whether or not two time periods have a gap in time between them or not. By using the same gran- ularity for all asserted version tables in the same database, it is easy to spot two versions of the same object that are contiguous in either assertion or in effective time. Because of the closed- open convention, two time periods [meet] (are contiguous) if and only if the end point in time of one has the same value as the begin point in time of the other. This is a very important temporal relationship because two versions of the same object that effective-time [meet] belong to the same episode, whereas two versions that do not [meet] belong to different episodes. We note that no granularity mismatch issues should arise when assertion time and effective time are based on different granularities. This is because there seem to be no semantically meaningful queries that would compare an assertion time period or point in time to an effective time period or point in time. One final point. We recommend that assertion time granular- ity be set to the level of the atomic clock tick for the DBMS, i.e. the smallest clock tick that could occur between two successive modifications to the database. The reason is that if a non-atomic clock tick is used for assertion time, then it would be physically possible to place two or more asserted version rows, for the same
  17. 180 Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES object and the same or [intersecting] effective time periods, in the same clock tick of assertion time. For example, let’s assume that one row is a policy row that results from a temporal insert, and the other row results from a temporal update, and that the temporal update changes the policy’s type from HMO to PPO. Let’s also assume that the clock tick granularity chosen for assertion time is one day. Now if both the insert and the update take place on the same day, then we have two truth-functionally conflicting assertions. We assert, with those two rows, that during a shared period of effective time, the policy was an HMO policy, and was also a PPO policy. We now have all the temporal metadata we need. Together with the logical data model, this is all the information we need to both generate an Asserted Versioning database, and to manage it. The Physical Data Model Figure 8.8 shows the physical data model which the AVF cre- ated from (i) the non-temporal physical data model generated by ERwin from the logical data model shown in Figure 8.2, together with (ii) the metadata shown in Figures 8.3 through 8.7. Client Wellness_Program_Enrollment Wellness_Program_Category client_oid: bigint client_wellpgm_oid: bigint wellpgmcat_oid: bigint eff_beg_dt: datetime eff_beg_dt: datetime eff_beg_dt: datetime asr_beg_dt: datetime asr_beg_dt: datetime asr_beg_dt: datetime client_nbr: char(10) client_oid: bigint wellpgmcat_cd: char(4) epis_beg_dt: datetime wellpgm_oid: bigint epis_beg_dt: datetime client_nm: varchar(40) epis_beg_dt: char(18) wellpgmcat_nm: varchar(50) eff_end_dt: datetime wellpgm_enroll_begin_wgt: smallint eff_end_dt: datetime asr_end_dt: datetime wellpgm_enroll_end_wgt: smallint asr_end_dt: datetime row_crt_dt: datetime wellpgm_enroll_begin_a1c_nbr: decimal(2,1) row_crt_dt: datetime wellpgm_enroll_end_a1c_nbr: decimal(2,1) eff_end_dt: datetime asr_end_dt: datetime row_crt_dt: datetime Policy Policy_Amendment Wellness_Program policy_oid: bigint eff_beg_dt: datetime policy-amend_oid: bigint Wellpgm_oid: bigint asr_beg_dt: datetime eff_beg_dt: datetime eff_beg_dt: datetime asr_beg_dt: datetime asr_beg_dt: datetime policy_nbr: char(10) epis_beg_dt; datetime policy_oid: bigint wellpgm_nbr: char(10) client_oid: bigint policy_amend_nbr: char(10) epis_beg_dt: datetime policy_type: char(3) epis_beg_dt: datetime wellpgmcat_oid: bigint copay_amt: money policy_amend_txt: varchar(100) wellpgm_nm: varchar(50) eff_end_dt: datetime eff_end_dt: datetime eff_end_dt: datetime asr_end_dt: datetime asr_end_dt: datetime asr_end_dt: datetime row_crt_dt: datetime row_crt_dt: datetime row_crt_dt: datetime Figure 8.8 The Sample Database Physical Data Model.
  18. Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES 181 A primary key column with an oid suffix is the unique identi- fier of the object whose type is indicated by that table, and which is represented by one or more managed objects contained in that table. Thus client_oid, in the Client table, identifies a partic- ular client which is represented in the Client table by one or more episodes each consisting of one or more versions. A non-primary key column with an oid suffix is a temporal foreign key. Thus client_oid, in the Policy and Wellness Program Enrollment tables, is a temporal foreign key. Temporal foreign key relationships are graphically represented, in Figure 8.8, by arrows from the primary key oid to the TFK oid. Since we do not append a suffix to business key column names, we have adopted the convention of listing the business key column or columns of a table immediately below the pri- mary key, and immediately followed by the episode begin date. This shows us that Client, Policy, Wellness Program Category and Wellness Program all have single-column business keys, that the business key of Policy Amendment is the policy identifier plus an amendment number, and that the business key of Well- ness Program Enrollment is the concatenation of the client and wellness program identifiers. The process by which this physical data model (PDM) is gen- erated from the non-temporal PDM corresponding to the LDM of Figures 8.1 and 8.2 is straightforward. As illustrated in Fig- ure 8.1, the AVF does this automatically. However, as with all the work the AVF does, this transformation of a non-temporal PDM to a temporal PDM can be done manually. We describe this process in the following section. Generating an Asserted Versioning Database from a Physical Data Model and Metadata An Asserted Versioning database consists of data structures plus constraints on the instances of data that conform to them. The data structures are the table definitions. The constraints are temporal entity integrity and temporal referential integrity, applied to the asserted version tables in the database, and implemented by the AVF as stored procedures. However, other techniques could also be used to enforce these constraints, including triggers, generalized and reusable application code, and Java object persistence frameworks. The AVF generates this logic based on the metadata associated with each data model, and so it does not need to recreate this code for each table.
  19. 182 Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES Temporalizing the Physical Data Model ERwin generates a physical data model from the logical data model shown in Figure 8.2. At that point, the physical data model is a conventional model. The next step is to apply the metadata shown in Figures 8.3 through 8.7. The result of that step is the temporalized physical data model shown in Figure 8.8. An enterprise which chooses to build its own framework, based on the ideas presented in this book, may choose to use some other data modeling tool, of course. But Asserted Versioning LLC’s commercially available AVF stores this metadata in ERwin User- Defined Properties, not in metadata tables like those shown in this chapter; and it automatically applies that metadata as the physical model is generated. Absent that automated process in which temporal metadata is applied, the temporalization process works like this. The Client Table. According to the Table Type metadata table, the Client table is an asserted version table. Prior to temporalization, the primary key of this table was client_nbr. From this column we derive the name client_oid for the first column in the bi-temporal primary key by dropping the suffix and replacing it with oid. We then add effective begin date and assertion begin date as the other two primary key columns. Client number itself is now treated as the business key of the table, and so it appears in the temporalized table as a non-key column. Client name is left unchanged. Episode begin date, effective end date, assertion end date and row create date are added as non-key columns. Where necessary, unique constraints and indexes defined in the non-temporal model are augmented with temporal date columns, and converted to non-unique indexes. And at this point, the temporalization of the Client table is complete. The Policy Table. According to the Table Type metadata table, the Policy table is an asserted version table. Prior to temporalization, the primary key of this table was policy_nbr. From this column we derive the name policy_oid for the first col- umn in the bi-temporal primary key. We then add effective begin date and assertion begin date as the other two primary key columns. Policy number itself is designated as the business key of this table, and appears in it as a non-key column. Policy type and copay amount are left unchanged. Episode begin date, effec- tive end date, assertion end date and row create date are added as non-key columns. As before, unique constraints and indexes are augmented and modified, as required. Client_nbr appears in the logical data model as a foreign key to the Client table, and so the AVF must convert it into a
  20. Chapter 8 DESIGNING AND GENERATING ASSERTED VERSIONING DATABASES 183 temporal foreign key. So the foreign key declaration is dropped from the DDL, the client number column is also dropped, and a client_oid column replaces it. With these changes, the temporalization of this table is complete. The Policy Amendment Table. According to the Table Type metadata table, the Policy Amendment table is an asserted version table. Prior to temporalization, the primary key of this table was policy_amend_nbr. From this column we derive the name policy_amend_oid for the first column in the bi-temporal primary key. We then add effective begin date and assertion begin date as the other two primary key columns. Policy amend- ment number itself is designated as the business key of this table, and appears in it as a non-key column. Policy amendment text is left unchanged. Episode begin date, effective end date, assertion end date and row create date are added as non-key columns. As before, unique constraints and indexes are aug- mented and are modified, as required. Policy_nbr appears in the logical data model as a foreign key to the Policy table, and so the AVF must convert it into a temporal foreign key. So the foreign key declaration is dropped from the DDL, the policy number column is also dropped, and a policy_oid column replaces it. With these changes, the temporalization of this table is complete. The Wellness Program Category Table. According to the Table Type metadata table, the Wellness Program Category table is an asserted version table. Prior to temporalization, the primary key of this table was wellpgmcat_cd. From this column we derive the name wellpgmcat_oid for the first column in the bi-temporal primary key. We then add effective begin date and assertion begin date as the other two primary key columns. Wellness pro- gram category code itself is designated as the business key of this table, and appears in it as a non-key column. Wellness pro- gram category name is left unchanged. Episode begin date, effective end date, assertion end date and row create date are added as non-key columns. As before, unique constraints and indexes are augmented and are modified, as required. With these changes, the temporalization of this table is complete. The Wellness Program Table. According to the Table Type metadata table, the Wellness Program table is an asserted ver- sion table. Prior to temporalization, the primary key of this table was wellpgm_nbr. From this column we derive the name wellpgm_oid for the first column in the bi-temporal primary key. We then add effective begin date and assertion begin date as the other two primary key columns. Wellness program num- ber itself is designated as the business key of this table, and
Đồng bộ tài khoản