Managing Time in Relational Databases - P21

Chapter 16 CONCLUSION

Query Encapsulation

As we have already pointed out, production queries against Asserted Versioning databases do not have to check for TEI or TRI violations. The maintenance processes carried out by the AVF guarantee that asserted version tables will already conform to those semantic requirements. For example, when joining from a TRI child to a TRI parent, these queries do not have to check that the parent object is represented by an effective-time set of contiguous and non-overlapping rows whose end-to-end time period fully includes that of the child row. Asserted Versioning already guarantees that those parent version rows [meet] within an episode, and that they [fill⁻¹] the effective time period of the child row.

Ad hoc queries against Asserted Versioning databases can be written directly against asserted version tables. But as far as possible, they should be written against views in order to simplify the query-writing task of predominantly non-technical query authors. So we recommend that a basic set of views be provided for each asserted version table. Additional subject-matter-specific views written against these basic views could also be created.

Some basic views that we believe might prove useful for these query authors are:

(i) The Conventional Data View, consisting of all currently asserted current versions in the table. This is a one-row-per-object view.
(ii) The Current Versions View, consisting of all currently asserted versions in the table, past, present and future. This is a view that will satisfy all the requirements satisfied by any best practice versioning tables, as described in Chapter 4.
(iii) The Episode View, consisting of one current assertion for each episode. That is the current version for current episodes, the last version for past episodes, and the latest version for future episodes. This view is useful because it filters out the “blow-by-blow” history which version tables provide, and leaves only a “latest row” to represent each episode of an object of interest.
(iv) The Semantic Logfile View, consisting of all no longer asserted versions in the table. This view collects all asserted version data that we no longer claim is true, and should be of particular interest to auditors.
(v) The Transaction File View, consisting of all near future asserted versions. These are deferred assertions that will become currently asserted data soon enough that the business is willing to let them become current by means of the passage of time.
(vi) The Staging Area View, consisting of all far future asserted versions. These are deferred assertions that are still a work in progress. They might be incomplete data that the business fully intends to assert once they are completed. They might also be hypothetical data, created to try out various what-if scenarios.

We also note that existing queries against conventional tables will execute properly when their target tables are converted to asserted version tables. In the conversion, the tables are given new names. For example, we use the suffix “_AV” on asserted version tables and only on those tables. One of the views provided on each table, then, is one which selects exactly those columns that made up the original table, and all and only those rows that dynamically remain currently asserted and currently in effect. This dynamic view provides, as a queryable object, a set of data that is row for row and column for column identical to the original table. The view itself is given the name the original table had. Every column has the same name it originally had. This provides temporal upward compatibility for all queries, whether embedded in application code or free-standing.

We conclude that Asserted Versioning does provide query encapsulation for bi-temporal data, and also temporal upward compatibility for queries.

The Internalization of Pipeline Datasets

Non-current data is often found in numerous nooks and crannies of conventional databases. Surrounding conventional tables whose rows have no time periods explicitly attached to them, and which represent our current beliefs about what their objects are currently like, there may be various history tables, transaction tables, staging area tables and developer-maintained logfile tables. In some cases, temporality has even infiltrated some of those tables themselves, transforming them into one or another of some variation on the four types of version tables which we described in Chapter 4.

When we began writing, we knew that deferred transactions and deferred assertions went beyond the standard bi-temporal semantics recognized in the computer science community. We knew that they corresponded to insert, update or delete transactions written but not yet submitted to the DBMS. The most familiar collections of transactions in this state, we recognized, are those called batch transaction datasets.
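
The basic views recommended under Query Encapsulation, including the two views over deferred assertions, can be sketched with SQLite driven from Python. This is a minimal illustration under stated assumptions, not the AVF's own definitions: the table `Policy_AV`, its column names, the closed-open period convention, the `9999-12-31` end date, and the one-row `Clock` table (a stand-in for Now() and for a business-chosen boundary between near and far future assertion time) are all hypothetical. The Episode View is omitted because it needs an episode begin date that this sketch does not model.

```python
import sqlite3

FOREVER = "9999-12-31"  # assumed "until further notice" end date

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical asserted version table; periods are closed-open [beg, end).
CREATE TABLE Policy_AV (
    policy_oid INTEGER NOT NULL,
    eff_beg_dt TEXT NOT NULL,   -- effective time period
    eff_end_dt TEXT NOT NULL,
    asr_beg_dt TEXT NOT NULL,   -- assertion time period
    asr_end_dt TEXT NOT NULL,
    copay      INTEGER NOT NULL
);

-- One-row stand-in for Now() and for the near/far future boundary.
CREATE TABLE Clock (now_dt TEXT NOT NULL, near_future_dt TEXT NOT NULL);
INSERT INTO Clock VALUES ('2010-06-01', '2010-09-01');

-- Conventional Data View, given the name the original table had.
CREATE VIEW Policy AS
SELECT p.policy_oid, p.copay
FROM Policy_AV p, Clock c
WHERE p.asr_beg_dt <= c.now_dt AND c.now_dt < p.asr_end_dt
  AND p.eff_beg_dt <= c.now_dt AND c.now_dt < p.eff_end_dt;

-- Current Versions View: currently asserted, any effective time.
CREATE VIEW Policy_Current_Versions AS
SELECT p.* FROM Policy_AV p, Clock c
WHERE p.asr_beg_dt <= c.now_dt AND c.now_dt < p.asr_end_dt;

-- Semantic Logfile View: assertions that have already ended.
CREATE VIEW Policy_Semantic_Logfile AS
SELECT p.* FROM Policy_AV p, Clock c
WHERE p.asr_end_dt <= c.now_dt;

-- Transaction File View: deferred assertions in the near future.
CREATE VIEW Policy_Transaction_File AS
SELECT p.* FROM Policy_AV p, Clock c
WHERE p.asr_beg_dt > c.now_dt AND p.asr_beg_dt <= c.near_future_dt;

-- Staging Area View: deferred assertions in the far future.
CREATE VIEW Policy_Staging_Area AS
SELECT p.* FROM Policy_AV p, Clock c
WHERE p.asr_beg_dt > c.near_future_dt;
""")

conn.executemany(
    "INSERT INTO Policy_AV VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "2010-01-01", "2010-04-01", "2010-01-01", FOREVER, 20),       # past version
        (1, "2010-04-01", FOREVER, "2010-04-01", "2010-05-01", 25),       # withdrawn assertion
        (1, "2010-04-01", FOREVER, "2010-05-01", FOREVER, 30),            # current version
        (2, "2010-08-01", FOREVER, "2010-07-01", FOREVER, 40),            # near future assertion
        (3, "2011-01-01", FOREVER, "2011-01-01", FOREVER, 50),            # far future assertion
    ],
)


def fetch(view):
    # View names come from the fixed set above, never from user input.
    sql = f"SELECT policy_oid, copay FROM {view} ORDER BY policy_oid, copay"
    return conn.execute(sql).fetchall()


conventional = fetch("Policy")
current_versions = fetch("Policy_Current_Versions")
semantic_logfile = fetch("Policy_Semantic_Logfile")
transaction_file = fetch("Policy_Transaction_File")
staging_area = fetch("Policy_Staging_Area")

for name, result in [
    ("Policy", conventional),
    ("Current versions", current_versions),
    ("Semantic logfile", semantic_logfile),
    ("Transaction file", transaction_file),
    ("Staging area", staging_area),
]:
    print(name, result)
```

A query written against `Policy` before the conversion keeps working afterwards, because the view carries the original name and columns; that is the temporal upward compatibility the text describes.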

But as soon as we identified the nine logical categories of bi-temporal data, we realized that deferred transactions and deferred assertions dealt with only three of those nine categories: future assertions about past, present or future versions. What, then, we wondered, did the three categories of past assertions correspond to?

The answer is that past assertions play the role of a DBMS semantic logfile, one specific to a particular production table. Of course, by now we understand that past assertions do not make it possible to fully recreate the physical state of a table as of any point in past time because of deferred assertions which are not, by definition, past assertions. Instead, they make it possible to recreate what we claimed, at some past point in time, was the truth about the past, present and future of the things we were interested in at the time. In this way, past assertions support a semantic logfile, and allow us to recreate what we once claimed was true, as of any point of time in the past. They provide the as-was semantics for bi-temporal data.

But Asserted Versioning also supports a table-specific physical logfile. It does so with the row create date. With this date, we can almost recreate everything that was physically in a table as of any past point in time, no matter where in assertion time or effective time any of those rows are located.²

² The exception is deferred assertions that have been moved backwards in assertion time. Currently, Asserted Versioning does not preserve information about the far future assertion time these assertions originally existed in.

This leaves us with only three of the nine categories: the current assertion of past, present and future versions of objects. The current assertions of current versions, of course, are the conventional data in an asserted version table. This leaves currently asserted past versions and currently asserted future versions. But these are nothing new to IT professionals. They are what IT best practice version tables have been trying to manage for several decades.

Now it all comes together. Instead of conventional physical logfiles, Asserted Versioning supports queries which make both semantic logfile data and physical logfile data available. Instead of batch transaction datasets, Asserted Versioning keeps track of what the database will look like when those transactions are applied, which, for asserted version tables, means when those future assertions pass into currency. Instead of variations on best practice version tables which support some part of the semantics of versioning, Asserted Versioning is an enterprise solution which implements versioning, in every case, with the same schemas and with support for the full semantics of versioning, whether or not the specific business requirements, at the time, specify those full semantics.

With all these various physical datasets internalized within the production tables they are directed to or derived from, Asserted Versioning eliminates the cost of managing them as distinct physical data objects. Asserted Versioning also eliminates the cost of coordinating maintenance to them. There is no latency as updates to production tables ripple out to downstream copies of that same data, such as separate history tables. On the inward-bound side, there is also no latency. As soon as a transaction is written, it becomes part of its target table. The semantics supported here is, for maintenance transactions, “submit it and forget it”.

We conclude that Asserted Versioning does support the semantics of the internalization of pipeline datasets.

Performance

We have provided techniques on how to index, partition, cluster and query an Asserted Versioning database. We’ve recommended key structures for primary keys, foreign keys and search keys, and recommended the placement of temporal columns in indexes for optimal performance. We have also shown how to improve performance with the use of currency flags. All these techniques help to provide query performance in Asserted Versioning databases which is nearly equivalent to the query performance in equivalent conventional databases.

We conclude that queries against even very large Asserted Versioning databases, especially those queries retrieving currently asserted current versions of persistent objects, will perform as well or nearly as well as the corresponding queries against a conventional database.

Enterprise Contextualization

As temporal data has become increasingly important, much of it has migrated from being reconstructable temporal data to being queryable temporal data. But much of that queryable temporal data is still isolated in data warehouses or other historical databases, although some of it also exists in production databases as history tables, or as version tables. Often, this queryable temporal data fails to distinguish between data which reflects changes in the real world, and data which corrects mistakes in earlier data.
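
The distinction drawn in the last sentence, between a change in the real world and a correction to earlier data, can be made concrete with a small sketch. It assumes closed-open periods, a `9999-12-31` end date, and hypothetical names (`Row`, `lookup`, `balance`); none of these come from the AVF itself. A real-world change shows up as a different answer at a different effective date, while a correction shows up as a different answer for the same effective date at a different assertion date, which is the as-was versus as-is distinction.

```python
from dataclasses import dataclass
from datetime import date

FOREVER = date(9999, 12, 31)  # assumed "until further notice" end date


@dataclass(frozen=True)
class Row:
    oid: int
    eff_beg: date   # effective time period, closed-open
    eff_end: date
    asr_beg: date   # assertion time period, closed-open
    asr_end: date
    balance: int


def lookup(rows, oid, effective_on, asserted_on):
    """Balance in effect on effective_on, according to what was asserted on asserted_on."""
    for r in rows:
        if (r.oid == oid
                and r.eff_beg <= effective_on < r.eff_end
                and r.asr_beg <= asserted_on < r.asr_end):
            return r.balance
    return None


rows = [
    # Version in effect January through March; still asserted.
    Row(1, date(2010, 1, 1), date(2010, 4, 1), date(2010, 1, 1), FOREVER, 100),
    # Real-world change effective April 1, recorded with a mistaken value;
    # the assertion was withdrawn on May 15.
    Row(1, date(2010, 4, 1), FOREVER, date(2010, 4, 1), date(2010, 5, 15), 150),
    # Correction: same effective period, asserted from May 15 onwards.
    Row(1, date(2010, 4, 1), FOREVER, date(2010, 5, 15), FOREVER, 175),
]

today = date(2010, 6, 1)

in_march_as_is = lookup(rows, 1, date(2010, 3, 1), today)             # before the change
in_april_as_is = lookup(rows, 1, date(2010, 4, 10), today)            # after the change, as-is
in_april_as_was = lookup(rows, 1, date(2010, 4, 10), date(2010, 5, 1))  # same date, as-was

print(in_march_as_is, in_april_as_is, in_april_as_was)
```

A uni-temporal history table would have to overwrite 150 with 175 or store both as if they were successive states of the object; keeping assertion time separate from effective time is what lets both questions be answered.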

So business needs for a collection of temporal data against which queries can be written are often difficult to meet. Some of the needed data may be in a data warehouse; the rest of it may be contained in various history tables and version tables in the production database, and the odds of those history tables all using the same schemas and all being updated according to the same rules are not good. As for version tables, we have seen how many different kinds there are, and how difficult it can be to write queries that extract exactly the desired data from them.

We need an enterprise solution to the provision of queryable bi-temporal data. We need one consistent set of schemas, across all tables and all databases. We need one set of transactions that update bi-temporal data, and enforce the same temporal integrity constraints, across all tables and all databases. We need a standard way to ask for uni-temporal or bi-temporal data. And we need a way to remove all temporal logic from application programs, isolate it in a separate layer of code, and invoke it declaratively.

Asserted Versioning is that enterprise solution.

Asserted Versioning as a Bridge and as a Destination

Asserted Versioning, either in the form of the AVF or of a home-grown implementation of its concepts, has value as both a bridge and as a destination. As a bridge to a standards-based, vendor-supported implementation of bi-temporal data management, Asserted Versioning is a way to begin migrating databases and applications right away, using the DBMSs available today and the SQL available today. As a destination, Asserted Versioning is an implementation of a more complete semantics for bi-temporality than has yet been defined in the academic literature.

Asserted Versioning as a Bridge

Applications which manage temporal data intermingle code expressing subject-matter-specific business rules with code for managing these different forms in which temporal data is stored. Queries which access temporal data in these databases cannot be written correctly without a deep knowledge of the specific schemas used to store the data, and of both the scope and limits of the semantics of that data. Assembling data from two or more temporal tables, whether in the same or in different physical databases, is likely to require complicated logic to mediate the discrepancies between different implementations of the same semantics.

As a bridge to the new SQL standards and to DBMS support for them, Asserted Versioning standardizes temporal semantics by removing history tables, various forms of version tables, transaction datasets, staging areas and logfile data from databases. In their place, Asserted Versioning provides a standard canonical form for bi-temporal data, that form being the Asserted Versioning schema used by all asserted version tables.

By implementing Asserted Versioning, businesses can begin to remove temporal logic from their applications, and at each point where often complex temporal logic is hardcoded inside an application program, they can begin to replace that code with a simple temporal insert, update or delete statement.

Sometimes this will be difficult work. Some implementations of versioning, for example, are more convoluted than others. The code that supports those implementations will be correspondingly difficult to identify, isolate and replace. But if a business is going to avail itself of standards-based temporal SQL and commercial support for those temporal extensions (as it surely will, sooner or later), then this work will have to be done, sooner or later. With an Asserted Versioning Framework available to the business, that work can begin sooner rather than later. It can begin right now.

Asserted Versioning as a Destination

Even if the primary motivation for using the AVF (ours or a home-grown version) is as a bridge to standards-based and vendor-implemented bi-temporal functionality, that is certainly not its only value. For as soon as the AVF is installed, hundreds of person hours will typically be saved on every new project to introduce temporal data into a database. Based on our own consulting experience, which jointly spans about half a century and several dozen client engagements, we can confidently say, without exaggeration, that many large projects involving temporal data will save thousands of person hours.

Here’s how. Temporal data modeling work that would otherwise have to be done will be eliminated. Project-specific designs for history tables or version tables, likely differing in some way from the many other designs that already exist in the databases across the enterprise, will no longer proliferate. Separate code to maintain these idiosyncratically different structures will no longer have to be written. Temporal entity integrity rules and temporal referential integrity rules will no longer be overlooked, or only partially or incorrectly implemented. Special instructions to those who will write the often complex sets of SQL transactions required to carry out what is a single insert, update or delete action from a business user perspective will no longer have to be provided and remembered each time a transaction is written. Special instructions to those who will write queries against these tables, possibly joining them with slightly different temporal tables designed and written by some other project team, will no longer have to be provided and remembered each time a query is written.

When the first set of tables is converted to asserted version tables, seamless real-time access to bi-temporal data will be immediately available for that data. This is declaratively specified access, with the procedural complexities encapsulated within the AVF. In addition, the benefits of the internalization of pipeline datasets will also be made immediately available, this being one of the principal areas in which Asserted Versioning extends bi-temporal semantics beyond the semantics of the standard model.

We conclude that Asserted Versioning has value both as a bridge and as a destination. It is a bridge to a standards-based SQL that includes support for PERIOD datatypes, Allen relationships and the declarative specification of bi-temporal semantics. It is a destination in the sense that it is a currently available solution which provides the benefits of declaratively specified, seamless real-time access to bi-temporal data, including the extended semantics of objects, episodes and internalized pipeline datasets.

Ongoing Research and Development

Bi-temporal data is an ongoing research and development topic within the computer science and DBMS vendor communities. Most of that research will affect IT professionals only as products delivered to us, specifically in the form of enhancements to the SQL language and to relational DBMSs.

But bi-temporal data and its management by means of Asserted Versioning’s conceptual and software frameworks is an ongoing research and development topic for us as well. Some of this ongoing work will appear as future releases of the Asserted Versioning AVF. Some of it will be published on our website, AssertedVersioning.com, and some of it will be made available as seminars.

Following is a partial list of topics that we are working on as this book goes to press.

(i) An Asserted Versioning Ontology. A research topic. We have begun to formalize Asserted Versioning as an ontology by translating our Glossary into a FOPL axiomatic system. The undefined predicates of the system are being collected into a controlled vocabulary. Multiple taxonomies will be identified as KIND-OF threads running through the ontology. Theorems will be formally proved, demonstrating how automated inferencing can extract useful information from a collection of statements that are not organized as a database of tables, rows and columns.

(ii) Asserted Versioning and the Relational Model. A research topic. Bi-temporal extensions to the SQL language have been blocked for over 15 years, in large part because of objections that those extensions violate Codd’s relational model and, in particular, his Information Principle. We will discuss those objections, especially as they apply to Asserted Versioning, and respond to them.

(iii) Deferred Transaction Workflow Management and the AVF. A development topic. When deferred assertion groups are moved backwards in assertion time, and when isolation cannot be maintained across the entire unit of work, violations of bi-temporal semantics may be exposed to the database user. We are developing a solution that identifies semantic components within and across deferred assertion groups, and moves those components backwards in a sequence that preserves temporal semantic integrity at each step of the process.

(iv) Asserted Versioning and Real-Time Data Warehousing. A methodology topic. Asserted Versioning supports bi-temporal tables in OLTP source system databases and/or Operational Data Stores. It is a better solution to the management of near-term historical data than is real-time data warehousing, for several reasons. First, much near-term historical data remains operationally relevant, and must be as accessible to OLTP systems as current data is. Thus, it must either be maintained in ad hoc structures within OLTP systems, or retrieved from the data warehouse with poorly-performing federated queries. Second, data warehouses, and indeed any collection of uni-temporal data, do not support the important as-was vs. as-is distinction. Third, real-time feeds to data warehouses change the warehousing paradigm. Data warehouses originally kept historical data about persistent objects as a time-series of periodic snapshots. Real-time updating of warehouses forces versioning into warehouses, and the mixture of snapshots and versions is conceptually confused and confusing. Asserted Versioning makes real-time data warehousing neither necessary nor desirable.

(v) Temporalized Unique Indexes. A development topic. Values which are unique to one row in a conventional table may appear on any number of rows when the table is converted to an asserted version table. So unique indexes on conventional tables are no longer unique after the conversion. To make those indexes unique, both an assertion and an effective time period must be added to them. This reflects the fact that although those values are no longer unique across all rows in the converted table, they remain unique across all rows in the table at any one point in time, specifically at any one combination of assertion and effective time clock ticks.

(vi) Instead Of Triggers. A development topic. Instead Of triggers function as updatable views. These updatable views make Asserted Versioning’s temporal transactions look like conventional SQL. When invoked, the triggered code recognizes insert, update and delete statements as temporal transactions. As described in this book, it will translate them into multiple physical transactions, apply TEI and TRI checks, and manage the processing of those physical transactions as atomic and isolated units of work. The utilization of Instead Of triggers by the AVF is ongoing work, as we go to press.

(vii) Java and Hibernate. A research and development topic. Hibernate is an object/relational persistence and query service framework for Java. It hides the complexities of SQL, and functions as a data access layer supporting object-oriented semantics (not to be confused with the semantics of objects, as Asserted Versioning uses that term). Hibernate and other frameworks can be used to invoke the AVF logic to enforce TEI and TRI while maintaining an Asserted Versioning bi-temporal database.

(viii) Archiving. A methodology topic. An important archiving issue is how to archive integral semantic units, i.e. how to archive without leaving “dangling references” to archived data in the source database. Assertions, versions, episodes and objects define integral semantic units, and we are developing an archiving strategy, and AVF support for it, based on those Asserted Versioning concepts.

(ix) Star Schema Temporal Data. A methodology topic. Bi-temporal dimensions can make the “cube explosion problem” unmanageable, and bi-temporal semantics do not apply to fact tables the same way they apply to dimension tables. We are developing a methodology for supporting both versioning, and the as-was vs. as-is distinction, in both fact and dimension tables.

Going Forward

We thank our readers who have stuck with us through an extended discussion of some very complex ideas. For those who would like to learn more about bi-temporal data, and about Asserted Versioning, we recommend that you visit our website, AssertedVersioning.com, and our webpage at Elsevier.com.

At our website, we have also created a small sample database of asserted version tables. Registered users can write both maintenance transactions and queries against that database. Because these tables contain data from all nine temporal categories, we recommend that interested readers first print out the contents of these tables before querying them. It is by comparing the full contents of those tables to query result sets that the work of each query can best be understood, and the semantic richness of the contents of Asserted Versioning databases best be appreciated.

Glossary References

Glossary entries whose definitions form strong interdependencies are grouped together in the following list. The same glossary entries may be grouped together in different ways at the end of different chapters, each grouping reflecting the semantic perspective of each chapter. There will usually be several other, and often many other, glossary entries that are not included in the list, and we recommend that the Glossary be consulted whenever an unfamiliar term is encountered.

ad hoc query
production query
Allen relationships
time period
as-is query
as-was query
asserted version table
assertion
assertion time
Asserted Versioning
Asserted Versioning Framework (AVF)
history table
row create date
semantic logfile
transaction table
bi-temporal data
bi-temporal data management
business data
conventional data
conventional table
currently asserted current version
implicitly temporal data
deferred assertion
deferred transaction
far future assertion time
near future assertion time
design encapsulation
maintenance encapsulation
query encapsulation
effective time
version
version table
versioned data
enterprise contextualization
episode
non-current data
non-temporal data
Now()
PERIOD datatype
point in time
object
temporalized unique index
physical transaction
temporal transaction
temporal insert transaction
temporal update transaction
temporal delete transaction
pipeline dataset
internalization of pipeline datasets
production database
production table
queryable object
queryable temporal data
reconstructable temporal data
seamless access
temporal data
temporal dimension
temporal entity integrity (TEI)
temporal foreign key (TFK)
temporal referential integrity (TRI)
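
The temporalized unique index topic in the list of ongoing work rests on one property: a value that was unique in the conventional table stays unique at any one combination of assertion-time and effective-time clock ticks. A minimal sketch of checking that property follows. It is not how the AVF or any DBMS index enforces it; the function names, the tuple layout, the sample values and the closed-open period convention are all assumptions made for illustration.

```python
from datetime import date
from itertools import combinations

FOREVER = date(9999, 12, 31)  # assumed "until further notice" end date


def periods_overlap(beg1, end1, beg2, end2):
    """True if two closed-open periods share at least one clock tick."""
    return beg1 < end2 and beg2 < end1


def uniqueness_violations(rows):
    """Pairs of rows carrying the same value at some shared point in
    both effective time and assertion time.

    Each row is (value, eff_beg, eff_end, asr_beg, asr_end).
    """
    violations = []
    for a, b in combinations(rows, 2):
        same_value = a[0] == b[0]
        shares_effective_time = periods_overlap(a[1], a[2], b[1], b[2])
        shares_assertion_time = periods_overlap(a[3], a[4], b[3], b[4])
        if same_value and shares_effective_time and shares_assertion_time:
            violations.append((a, b))
    return violations


# The same value appears on three rows, yet never twice at one point
# in (assertion time, effective time).
rows = [
    ("TAX-001", date(2010, 1, 1), date(2010, 4, 1), date(2010, 1, 1), FOREVER),
    ("TAX-001", date(2010, 4, 1), FOREVER, date(2010, 4, 1), date(2010, 5, 1)),
    ("TAX-001", date(2010, 4, 1), FOREVER, date(2010, 5, 1), FOREVER),
]
clean = uniqueness_violations(rows)

# A fourth row that reuses the value while the third row is still
# asserted and in effect breaks point-in-time uniqueness.
conflicting = rows + [
    ("TAX-001", date(2010, 6, 1), FOREVER, date(2010, 6, 1), FOREVER),
]
broken = uniqueness_violations(conflicting)

print(len(clean), len(broken))
```

The pairwise scan is quadratic and only meant to make the rule readable; an index-based implementation would reach the same verdict by other means.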

APPENDIX: BIBLIOGRAPHICAL ESSAY

Except for the 1983 and 1988 articles, all the references listed here are readily accessible by IT professionals. Those two articles are listed because of their seminal importance in the field of temporal data management.

1983: The Allen Relationships

James F. Allen. “Maintaining Knowledge About Temporal Intervals.” Communications of the ACM (November 1983), 26 (11), 832–843.

This article defined a set of 13 positional relationships between two time periods along a common timeline. These relationships are a partitioning of all possible positional temporal relationships. They are mutually exclusive, and there are no others.

1988: Architecture for a Business and Information System

B. A. Devlin and P. T. Murphy. “An Architecture for a Business and Information System.” IBM Systems Journal (1988), 27(1).

To the best of our knowledge, this article is the origin of data warehousing in just as incontrovertible a sense as Dr. E. F. Codd’s early articles were the origins of relational theory.

1996: Building the Data Warehouse

William Inmon. Building the Data Warehouse, 2nd ed. (John Wiley, 1996). (The first edition was apparently published in 1991, but we can find no reliable references to it.)

With this book, Inmon began his work of introducing the concepts of data warehousing to the rest of the IT profession, in the process extending the concept into several iterations of his own data warehousing architecture.

1996: The Data Warehouse Toolkit

Ralph Kimball. The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses (John Wiley, 1996).

This book, and later “data warehouse toolkit” books, introduced and developed Kimball’s event-centric approach to managing historical data. Concepts such as dimensional data marts, the fact vs. dimension distinction, and star schemas and snowflake schemas are all grounded in Kimball’s work, as is the entire range of OLAP and business intelligence software.

2000: Developing Time-Oriented Database Applications in SQL

R. T. Snodgrass. Developing Time-Oriented Database Applications in SQL (Morgan-Kaufmann, 2000).

Both this book and our own are concerned with explaining how to support bi-temporal data management using current DBMSs and current SQL. This book is an invaluable source of SQL code fragments that illustrate the complexity of managing bi-temporal data and, in particular, that illustrate how to write temporal entity integrity and temporal referential integrity checks. This book is available in PDF form, at no cost, at Dr. Snodgrass’s website: http://www.cs.arizona.edu/people/rts/publications.html.

2000: Primary Key Reengineering Projects

Tom Johnston. “Primary Key Reengineering Projects: The Problem.” Information Management Magazine (February 2000). http://www.information-management.com/issues/20000201/1866-1.html

Tom Johnston. “Primary Key Reengineering Projects: The Solution.” Information Management Magazine (March 2000). http://www.information-management.com/issues/20000301/2004-1.html

These two articles, by one of the authors, explain why he believes that all relational tables should use surrogate keys rather than business keys. Additional material on this topic can be found at his website, MindfulData.com. For anyone contemplating the idea of an Asserted Versioning Framework of their own, in which they use business keys as primary keys instead of Asserted Versioning’s object identifiers (oids), we recommend that you read these articles first.

2001: Unobvious Redundancies in Relational Data Models

Tom Johnston. “Unobvious Redundancies in Relational Data Models, Part 1.” InfoManagement Direct (September 2001). http://www.information-management.com/infodirect/20010914/4007-1.html

Tom Johnston. “Unobvious Redundancies in Relational Data Models, Part 2.” InfoManagement Direct (September 2001). http://www.information-management.com/infodirect/20010921/4017-1.html

Tom Johnston. “Unobvious Redundancies in Relational Data Models, Part 3.” InfoManagement Direct (September 2001). http://www.information-management.com/infodirect/20010928/4037-1.html

Tom Johnston. “Unobvious Redundancies in Relational Data Models, Part 4.” InfoManagement Direct (October 2001). http://www.information-management.com/infodirect/20011005/4103-1.html

Tom Johnston. “Unobvious Redundancies in Relational Data Models, Part 5.” InfoManagement Direct (October 2001). http://www.information-management.com/infodirect/20011012/4132-1.html

These five articles, by one of the authors, show how fully normalized relational data models may still contain data redundancies. The issue of redundancies that do not violate normal forms was raised in Chapter 15, where we discussed our reasons for repeating the effective begin date of the initial version of every episode on all the non-initial versions of those same episodes.

2002: Temporal Data and The Relational Model

C. J. Date, Hugh Darwen, Nikos Lorentzos. Temporal Data and the Relational Model (Morgan-Kaufmann, 2002).

While the main focus of our book and the book by Dr. Snodgrass is row-level bi-temporality, the main focus of Date, Darwen, and Lorentzos’s book is column-level versioning. While the main focus of our book and Snodgrass’s is on implementing temporal data management with today’s DBMSs and today’s SQL, the main focus of their book is on describing language extensions that contain new operators for manipulating versioned data.

2007: Time and Time Again

This series of some two dozen articles by the authors, succeeded by a bi-monthly column of the same name and about the same number of installments, began in the May 2007 issue of DM Review magazine, now Information Management. The entire set, amounting to some 50 articles and columns combined, ended in June of 2009.

Although we had designed and built bi-temporal databases prior to writing these articles, our ideas evolved a great deal in the process of writing them. For example, although we emphasized the importance of maintenance encapsulation in the first article, we did not distinguish between temporal and physical transactions. All in all, we do not believe that these articles can usefully be consulted to gain additional insight into the topics discussed in this book. Although we intended them as instructions to other modelers and developers on how to implement bi-temporal data in today’s DBMSs, we now look back on them as an on-line diary of our evolving ideas on the subject.

2009: Oracle 11g Workspace Manager

Oracle Database 11g Workspace Manager Overview. An Oracle White Paper (September 2009). http://www.oracle.com/technology/products/database/workspace_manager/pdf/twp_AppDev_Workspace_Manager_11g.pdf

A discussion of the Oracle 11g Workspace Manager, and in particular its key role in implementing Oracle’s support for bi-temporal data management. On our website, we compare and contrast this implementation of a framework for bi-temporal data management with Asserted Versioning.

Philosophical Concepts

The best Internet source for an introduction to philosophical concepts, including those used in this book, is the Stanford Encyclopedia of Philosophy, at http://plato.stanford.edu/. Unfortunately, while each entry is individually excellent, the choice of which concepts to include seems somewhat idiosyncratic. For example, there is no general entry for ontology. Nonetheless, we recommend the following entries there, as relevant to the concepts used in this book: assertion, change, epistemology, facts, Arthur Prior, propositional attitude reports, speech acts, temporal logic, temporal parts.
  17. There are, of course, numerous other excellent introductions to philosophical concepts available on the Web. The problem is that there are also numerous very poor ones. Philosophy is a topic that seems to lend itself to this kind of variety. We would recommend to those interested that, as a general rule, sources at dot-edu domains can be presumed reliable, while sources at other domains should be treated with caution.

The Computer Science Literature

In this bibliography, we include no direct references to the computer science literature on temporal data management because most of that literature will not be available to many of our readers. For those who wish to access this material, we recommend getting a membership in the ACM and subscribing to the ACM Digital Library. Downloadable PDF copies of hundreds and probably thousands of articles on the management of temporal data are available from that source.

Another invaluable (and free!) source of information on temporal databases, with many links to other resources, can be found on Dr. Snodgrass’s website, specifically at http://www.cs.arizona.edu/people/rts/publications.html.
  18. THE ASSERTED VERSIONING GLOSSARY

This Glossary contains approximately 300 definitions, nearly all of which are specific to Asserted Versioning. Most expressions have both a Mechanics entry and a Semantics entry. A Mechanics entry describes how the defined concept is implemented in the “machinery” of Asserted Versioning. A Semantics entry describes what that concept means. We can also think of a Mechanics entry as telling us what a component of Asserted Versioning is or what it does, and a Semantics entry as telling us why it is important.

In linguistics, the usual contrast to semantics is syntax. But syntax is only the bill of materials of Asserted Versioning. The Asserted Versioning Framework, or any other implementation of Asserted Versioning, has an intricately interconnected set of parts, which correspond to the syntax of a language. But when it is turned on, it is a software engine which translates metadata and data models into the database schemas it uses to do its work, transforms the data instances it manages from one state to another state, augments or diminishes the totality of the representation of the objects its data corresponds to, and facilitates the ultimate purpose of this wealth of activity, which is to provide meaningful information about the time-varying state of the world an enterprise is a part of and needs to remain cognizant of.

Grammar

Grammatical variations of the same glossary term will not usually be distinguished. Thus both “version” and “versions” are in this book, but only the former is a Glossary entry. “Currently asserted” is listed as a component of one or more definitions, but the corresponding Glossary entry is “current assertion”.

Dates and Times

All references to points in time in this Glossary, unless otherwise noted, refer to them using the word “date”. This is done for the same reason that all examples of points in time in the
  19. text, unless otherwise noted, are dates. This reason is simply convenience. Periods of time in either of the two bi-temporal dimensions are delimited by their starting point in time and ending point in time. These points in time may be timestamps, dates, or any other point in time recognizable by the DBMS. As defined in this Glossary, they are clock ticks.

Components

Components of a definition are other Glossary entries used in the definition. Listing the components of every definition separately makes it easier to pick them out and follow cross-reference trails.

The Components sections of these definitions are also working notes towards a formal ontology of temporal data. If we assume first-order predicate logic as an initial formalization, we can think of the components of a Glossary definition, together with a set of primitive (formally undefined) terms, as the predicates with which the Mechanics and Semantics sections of those definitions can be expressed as statements in predicate logic.

Once the definitions are formalized in this way, automated inferencing and theorem-proving mechanisms can be used to discover new theorems. And the point of that activity, of course, is that it can make us aware of the deductive implications of things we already know, of statements we already recognize as true statements. These deductive implications are other true statements. But until we are aware of them, they are not part of our knowledge about the world. These mechanisms can also be used to prove or disprove conjectures about temporal data, thus adding some of them to the totality of that knowledge, and adding, for the rest of them, the knowledge that they are wrong.

Of particular note are those few Glossary entries whose list of components is empty (indicated by “N/A”).
In an ontology, the collection of undefined terms is called a controlled vocabulary, and these Glossary entries with empty component lists are part of the controlled vocabulary for a formal ontology of Asserted Versioning.

Non-Standard Glossary Definitions

Broadly speaking, the Semantics entry of a Glossary definition describes a concept, while the Mechanics entry describes
  20. its implementation. However, in some cases, there doesn’t seem to be a need for both kinds of entry, and so those definitions will have just a Mechanics section, or just a Semantics section. And in other cases, it seems more appropriate to provide a general description rather than to attempt a precise definition.

But the heart of this Glossary is the set of definitions that have both a Semantics and a Mechanics section. Together, the collection of their Semantics entries is a summary statement of Asserted Versioning as a theory of bi-temporal data management, while the collection of their Mechanics entries is a summary statement of the implementation of the theory in the Asserted Versioning Framework.

Allen Relationships

The original Allen relationships are leaf nodes in our Allen relationship taxonomy. Most of the Allen relationships, as well as our taxonomic groupings which are OR’d collections of those relationships, have an inverse. The inverse of an Allen relationship or relationship group, between two time periods which do not both begin and end on the same clock tick, is the relationship in which the two time periods are reversed. Following Allen’s original notation, we use a superscript suffix (x⁻¹) to denote the inverse relationship. Inverse relationships exist in all cases where one of the two time periods is shorter than the other and/or begins on an earlier clock tick than the other. Consequently, all the Allen relationships except [equals] have an inverse.

“Trivial” Definitions

Some Glossary definitions may appear to be “trivial”, in the sense that we can reliably infer what those expressions mean from the expressions themselves. For example, “end date” is defined as “an assertion end date or an effective end date”. Definitions like these exist because the expressions they define are used in the definitions of other expressions. So they are a kind of shorthand.
But in addition, our ultimate objective, with this Glossary, is to formalize it as an ontology expressed in predicate logic. For that purpose, apparently trivial entries such as “end date” are needed as predicates in the formal definitions of, for example, expressions like “assertion end date”.
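The inverse rule described under Allen Relationships can be illustrated with a short Python sketch. It is not part of the Asserted Versioning Framework. The names Period, allen and inverse are hypothetical, time periods are assumed to be closed-open pairs of integer clock ticks, and the ASCII suffix ^-1 stands in for the superscript. The sketch classifies one time period against another and shows that reversing the two periods yields the inverse relationship, with [equals] as the one relationship that has no distinct inverse.

```python
from typing import NamedTuple


class Period(NamedTuple):
    """A time period over integer clock ticks (closed-open: an assumption)."""
    begin: int  # first clock tick inside the period
    end: int    # first clock tick after the period


def inverse(rel: str) -> str:
    """Return the inverse relationship name; [equals] has no distinct inverse."""
    if rel == "equals":
        return rel
    return rel[:-3] if rel.endswith("^-1") else rel + "^-1"


def allen(a: Period, b: Period) -> str:
    """Name the Allen relationship that period a has to period b."""
    if a.begin >= a.end or b.begin >= b.end:
        raise ValueError("a period must contain at least one clock tick")
    if a == b:
        return "equals"
    # The two periods share no clock tick.
    if a.end < b.begin:
        return "before"
    if b.end < a.begin:
        return "before^-1"
    if a.end == b.begin:
        return "meets"
    if b.end == a.begin:
        return "meets^-1"
    # From here on, the two periods share at least one clock tick.
    if a.begin == b.begin:
        return "starts" if a.end < b.end else "starts^-1"
    if a.end == b.end:
        return "finishes" if a.begin > b.begin else "finishes^-1"
    if a.begin > b.begin and a.end < b.end:
        return "during"
    if a.begin < b.begin and a.end > b.end:
        return "during^-1"
    return "overlaps" if a.begin < b.begin else "overlaps^-1"


if __name__ == "__main__":
    p, q = Period(1, 3), Period(3, 5)
    print(allen(p, q))  # meets
    print(allen(q, p))  # meets^-1
```

Reversing the arguments always produces the inverse of the original result, which is why a single classifier plus a renaming function is enough to cover all thirteen relationships.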