# Managing time in relational databases- P18

Chia sẻ: Thanh Cong | Ngày: | Loại File: PDF | Số trang:20

0
40
lượt xem
5

## Managing time in relational databases- P18

Mô tả tài liệu

Tham khảo tài liệu 'managing time in relational databases- p18', công nghệ thông tin, cơ sở dữ liệu phục vụ nhu cầu học tập, nghiên cứu và làm việc hiệu quả

Chủ đề:

Bình luận(0)

Lưu

## Nội dung Text: Managing time in relational databases- P18

1. 326 Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES The inverse of this relationship is: P2 [overlapsÀ1] P1. In the superscripted relationship, the first time period is the later one. The predicate for this relationship, as it holds between two time periods expressed as pairs of dates using the closed-open con- vention, is: (eff_beg_dt1 > eff_beg_dt2) AND (eff_beg_dt1 < eff_end_dt2) AND (eff_end_dt1 > eff_end_dt2) It says that P1 starts after P2 starts and before P2 ends, and ends after P2 ends. Consider the following request for information: which policies began before the Diabetes Management Wellness Pro- gram for 2009, and ended while that program was still going on? The SQL written to fulfill this request is: SELECT * FROM V_Allen_Example WHERE pol_eff_beg_dt < wp_eff_beg_dt AND pol_epis_end_dt > wp_eff_beg_dt AND pol_epis_end_dt < wp_epis_end_dt P1 [intersects] P2 This not an Allen relationship. It is the node in our taxonomy of Allen relationships which includes the [starts], [startsÀ1], [finishes], [finishesÀ1], [during], [duringÀ1], [equals], [overlaps] and [overlapsÀ1] relationships. In other words, it combines the [overlaps] relationships with the [fills] relationships. In the non-superscripted relationship, the first time period is the earlier one. The predicate for this relationship, as it holds between two time periods expressed as pairs of dates using the closed-open convention, is: (eff_beg_dt1 eff_beg_dt2) It says that P1 starts no later than P2 starts, and ends after P2 starts. The idea behind it is that it includes every relationship in which P1 and P2 have at least one clock tick in common and in which P1 is the earlier time period. The limiting case is that in which P1 ends at the same time P2 starts. So let P1 be [4/15/2010 – 5/13/2010] and let P2 be [5/12/2010 – 9/18/2010]. The clock tick they share is 5/12/2010. The inverse of this relationship is: P1 [intersectsÀ1] P2. The first time period in this non-superscripted relationship is the later one. The predicate for this relationship, as it holds between
2. Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES 327 Time Period Relationships Along a Common Timeline Intersects Fills Overlaps |----------| |----------| Equals Occupies |-----| |-----| Aligns During |-----| |------------| Starts Finishes |-----| |-----| |------------| |-----------| Figure 14.10 P1 [intersects] P2. two time periods expressed as pairs of dates using the closed- open convention, is: (eff_beg_dt1 >¼ eff_beg_dt2) AND (eff_beg_dt1 < eff_end_dt2) It says that P2 starts no later than P1 starts, and ends after P1 starts. The idea behind it is that it includes every relationship in which P1 and P2 have at least one clock tick in common and in which P1 is the earlier time period. All pairs of time periods that share at least one clock tick sat- isfy one or the other of these two predicates. So the predicate that expresses the truth condition for all time periods that share at least one clock tick is: ((eff_beg_dt1 < eff_end_dt2) AND (eff_end_dt1 > eff_beg_dt2)) It says that either one of the clock ticks in P1 is also in P2 or that one of the clock ticks in P2 is also in P1. The idea behind it is that it covers all the cases where two time periods have at least one clock tick in common, regardless of which is the later time period. It is interesting to look at this relationship in terms of what falls outside its scope. For any two relationships that share at
3. 328 Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES least one clock tick, neither ends before the other begins. Other- wise, they could not share a clock tick. Looking at [includes] in terms of what falls outside its scope, we can express it as follows: NOT(eff_end_dt1 ¼ eff_end_dt2) And for those who like as few NOTs as possible, a logical rule (one of the transformation rules known as the De Morgan’s equi- valences) gives us the following predicate: NOT((eff_end_dt1 ¼ eff_end_dt2)) In other words, if two things are both not true, then it isn’t true that either of them is true! On such profundities are the rules of logic constructed. Consider the following request for information: which policies share any clock tick with the Diabetes Management Wellness Program for 2009? The SQL written to fulfill this request is: SELECT * FROM V_Allen_Example WHERE pol_eff_beg_dt < wp_epis_end_dt AND pol_epis_end_dt > wp_eff_beg_dt Notice how this SQL is much simpler than the OR’d collection of all of the conditions that make up the leaf nodes of its Allen relationships. P1 [before] P2 This is a pair of relationships, one the inverse of the other. In the non-superscripted relationship, the first time period is the earlier one. Time Period Relationships Along a Common Timeline Excludes Before |-----| |-----| Figure 14.11 P1 [before] P2.
4. Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES 329 The predicate for this relationship, as it holds between two time periods expressed as pairs of dates using the closed-open convention, is: (eff_end_dt1 < eff_beg_dt2) It says that after P1 ends, there is at least one clock tick before P2 begins. For example, consider the case where eff_end_dt1 is 5/13/2014 and eff_beg_dt2 is 5/14/2014. Because of the closed- open convention, the last clock tick in P1 is 5/12/2014, and so there is one clock tick gap between the two time periods, that clock tick being 5/13/2014. The inverse of this relationship is: P1 [beforeÀ1] P2. In the superscripted relationship, the first time period is the later one. The predicate for this relationship, as it holds between two time periods expressed as pairs of dates using the closed-open convention, is: (eff_beg_dt1 > eff_end_dt2) It says that before P1 begins, there is at least one clock tick after P2 ends. For example, consider the case where eff_beg_dt1 is 5/14/2014 and eff_end_dt2 is 5/13/2014. Because of the closed-open convention, the last clock tick in P2 is 5/12/2014, and so there is one clock tick gap between the two time periods, that clock tick being 5/13/2014. Throughout this book, if it isn’t important which time period comes first, we will simply say that the two time periods are non- contiguous. This is a particularly useful pair of relationships because they distinguish episodes of the same object from one another. Two adjacent versions—versions of an object with no other ver- sion of the same object between them—belong to different episodes just in case the earlier one is [before] the later one. Of two adjacent episodes of the same object, one is [before] the other, and the other is [beforeÀ1] the former. Consider the following request for information: which policies ended at least one date before the Diabetes Manage- ment Wellness Program for 2009 began? The SQL written to fulfill this request is: SELECT * FROM V_Allen_Example WHERE pol_epis_end_dt < wp_eff_beg_dt P1 [meets] P2 This is a pair of relationships, one the inverse of the other. In the non-superscripted relationship, the first time period is the earlier one.
5. 330 Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES Time Period Relationships Along a Common Timeline Excludes Meets |-----|-----| Figure 14.12 P1 [meets] P2. The predicate for this relationship, as it holds between two time periods expressed as pairs of dates using the closed-open convention, is: (eff_end_dt1 ¼ eff_beg_dt2) It says that after P1 ends, P2 begins on the very next clock tick. There is no clock tick gap between them. Say that both dates are 5/13/2004. This means that the last clock tick in P1 is 5/12/2004 and the first clock tick in P2 is 5/13/2004, and so there are no clock ticks between the two time periods. The inverse of this relationship is: P2 [meetsÀ1] P1. In the super- scripted relationship, the first time period is the later one. The pred- icate for this relationship, as it holds between two time periods expressed as pairs of dates using the closed-open convention, is: (eff_beg_dt1 ¼ eff_end_dt2) It says that before P1 begins, P2 ends on the previous clock tick. There is no clock tick gap between them. This is a particularly useful relationship because it defines a collection of versions of the same object that belong to the same episode. Every adjacent pair of versions of the same object that do not share any clock ticks, i.e. in which neither includes the other, and which also do not have a single clock tick between them, belong to the same episode. The earlier version of the pair meets the later one; the later version is met by the earlier one. Throughout this book, if it isn’t important which of two time periods that meet come first, we will simply say that the two time periods are contiguous. Consider the following request for information: which policies ended immediately before the Diabetes Management Wellness Program for 2009 began?
6. Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES 331 The SQL written to fulfill this request is: SELECT * FROM V_Allen_Example WHERE pol_epis_end_dt ¼ wp_eff_beg_dt P1 [excludes] P2 This not an Allen relationship. It is the node in our taxonomy of Allen relationships which includes the [before], [beforeÀ1], [meets] and [meetsÀ1] relationships. In the non-superscripted relationship, the first time period is the earlier one. The predicate for this relationship, as it holds between two time periods expressed as pairs of dates using the closed-open convention, is: (eff_end_dt1 ¼ eff_end_dt2) It says that P2 ends either immediately before the start of P1, or earlier than that. The idea behind it is that it includes every relationship in which P1 and P2 have no clock ticks in common and in which P1 is the later time period. All pairs of time periods that share no clock ticks satisfy one or the other of these two predicates. So the predicate that Time Period Relationships Along a Common Timeline Excludes Before Meets |-----| |-----| |-----|-----| Figure 14.13 P1 [excludes] P2.
7. 332 Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES designates all and only those time periods that share no clock ticks is: (eff-end-dt1 ¼ eff-end-dt2) It says that either P2 starts after P1 ends or ends before P2 starts. The idea behind it is that regardless of which time period comes first, they share no clock ticks. It should be the case that two time periods [exclude] one another if and only if they do not [intersect] one another. If so, then if we put a NOT in front of the predicate for the [intersects] relationship, we should get a predicate which expresses the [excludes] relationship.2 Putting a NOT in front of the [intersects] relationship, we get: NOT((eff_beg_dt1 < eff_end_dt2) AND (eff_end_dt1 > eff_beg_dt2)) This is a statement of the form NOT(X AND Y). The first thing we will do is transform it, according to the De Morgan’s rules, into (NOT-X OR NOT-Y). This gives us: NOT(eff_beg_dt1 < eff_end_dt2) OR NOT(eff_end_dt1 > eff_beg_dt2) Next, we can replace NOT(eff_beg_dt1 < eff_end_dt2) with (eff_beg_dt1 >¼ eff_end_dt2), and NOT(eff_end_dt1 > eff_beg_dt2) with (eff_end_dt1 ¼eff_end_dt2) OR (eff_end_dt1
8. Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES 333 Point in Time to Period of Time Queries A point in time is a period of time that includes only one clock tick. Thus, using the closed-open convention, a point in time, T1, is identical to the period of time [T1 – T2] where T2 is the next clock tick after T1. The only difference is in the notation. In the following discussions, we will use the simpler notation, T1, for the point in time. In this section, we consider periods of time that are longer than a single clock tick. Periods of time that are one clock tick in length are points in time, and we consider Allen relationships between two points in time later. Given that P1 is longer than a single clock tick, it may or may not share a clock tick with T1. If it does, then T1 [occupies] P1. Otherwise, either one is [before] the other, or else they [meet]. In Asserted Versioning databases, all temporal periods are delimited with the same point in time granularities. When compar- ing time periods to time periods, the logic in the AVF does not depend on the granularity of the clock ticks used in temporal parameters, as long as all of them are the same. The clock ticks could be months (as they are in the examples throughout this book), days, seconds or microseconds of any size. As we noted in Chapter 3, the AVF can carry out its temporal logic without caring about granularity specifically because of the closed-open convention. However, when comparing a point in time to a period of time, we must be aware of the granularity of the clock tick, and must often either add a clock tick to a point or period in time, or sub- tract a clock tick from a point or period in time. Consequently, we need to specify the clock tick duration used in the specific implementation to correctly perform this arithmetic. We will use “fCTD”, standing for “clock tick duration”, as the name of a function that converts an integer into that integer number of clock ticks of the correct granularity. So, for example, in: eff_end_dt – fCTD(1) f CTD takes on the value of one clock tick. If the granularity is a month, as it is in most of the examples in this book, the result will be to subtract one month from the effective end date. If the gran- ularity is a millisecond-level timestamp, it will subtract one milli- second from that date. The fCTD function determines the granularity for a specific Asserted Versioning database from the miscellaneous metadata table, shown as Figure 8.7 in Chapter 8. Different DBMSs use different date formats for date literals. It is also dependent on the default language and the date format currently set. These formats are shown in Figure 14.14.
9. 334 Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES Name Layout Example ISO yyyy-mm-dd 2010-09-25 USA mm/dd/yyyy 09/25/2010 EUR dd.mm.yyyy 25.09.2010 JIS yyyy-mm-dd 2010-09-25 Figure 14.14 Date Formats for Date Literals. We used the USA format in parts of the book, so we will assume that the default date format in our sample DBMS is the same. Different DBMSs use different syntax for date arithmetic. SQL Server would use something like this: AND DATEADD(DAY, -1, pol.eff_end_dt) > ‘07/15/2010’ where DAY is the granularity (which can also be abbreviated as DD or D), while DB2 might use: AND (pol.eff_end_dt - 1 DAY) > ‘07/15/2010’ with the reserved word DAY indicating the granularity. We will use the T-SQL format for our examples, and will assume our clock tick granularity is one month, to keep it in synch with the examples used in the book. However, in real-world databases, the granularity would more likely be a day, a second or a microsecond. This fCTD translation could be built into a reusable database function as part of the framework based on metadata. T1 [starts] P1 This is a pair of relationships, one the inverse of the other. In the non-superscripted relationship, the first time period is the point in time, i.e. the single clock-tick time period. Figure 14.15 shows this relationship, and its place in our taxonomy. The two dashed lines in the illustration graphically represent T1 and P1, with T1 being the upper dashed line. The predicate for this relationship, as it holds between a period of time expressed as a pair of dates using the closed-open convention, and a point in time, is: (T1 ¼ eff_beg_dt) It says that T1 starts at P1. Consider the following request for information: which policies begin on the same date as the 2009 Diabetes Manage- ment Wellness Program?
10. Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES 335 Time Period Relationships Along a Common Timeline Intersects Fills Occupies Aligns Starts |-| |------------| Figure 14.15 T1 [starts] P1. The SQL written to fulfill this request is: SELECT * FROM V_Allen_Example WHERE pol_eff_beg_dt ¼ wp_eff_beg_dt T1 [finishes] P1 This is a pair of relationships, one the inverse of the other. In the non-superscripted relationship, the first time period is the point in time, i.e. the single clock-tick time period. Figure 14.16 shows this relationship, and its place in our taxonomy. The two dashed lines in the illustration graphically represent T1 and P1, with T1 being the upper dashed line. The predicate for this relationship, as it holds between a period of time expressed as a pair of dates using the closed-open convention, and a point in time, is: (T1 ¼ eff_end_dt – fCTD(1)) Since the effective end date of a time period is the next clock tick after the last clock tick in that time period, this predicate says that P1 ends on the clock tick that is T1.
11. 336 Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES Time Period Relationships Along a Common Timeline Intersects Fills Occupies Aligns Finishes |-| |-----------| Figure 14.16 T1 [finishes] P1. Consider the following request for information: which policies began on the same date as the 2009 Diabetes Wellness Management Program ended? The SQL written to fulfill this request is: SELECT * FROM V_Allen_Example WHERE pol_eff_beg_dt ¼ DATEADD(MONTH, -1, wp_epis_end_dt) T1 [during] P1 This is the relationship in which a single clock tick occurs after the start of a period of time, and before that period of time ends. Figure 14.17 shows this relationship, and its place in our taxonomy. The two dashed lines in the illustration graphically represent T1 and P1, with T1 being the upper dashed line. The predicate for this relationship, as it holds between a period of time expressed as a pair of dates using the closed-open convention, and a point in time, is: (eff_beg_dt < T1) AND (eff_end_dt À fCTD(1) > T1) It says that T1 occurs during P1 just in case P1 starts before T1 and ends after T1.
12. Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES 337 Time Period Relationships Along a Common Timeline Intersects Fills Occupies During |-| |------------| Figure 14.17 T1 [during] P1. Consider the following request for information: which policies began before the 2009 Diabetes Wellness Management Program started, and ended after it started? The SQL written to fulfill this request is: SELECT * FROM V_Allen_Example WHERE pol_eff_beg_dt < wp_eff_beg_dt AND DATEADD(MONTH, À1, pol_epis_end_dt) > wp_eff_beg_dt T1 [occupies] P1 This not an Allen relationship. It is the node in our taxonomy of Allen relationships which, when one of the time periods is a point in time, includes the [starts], [finishes], and [during] relationships. In other words, it combines the [during] relationships with the [aligns] relationships. These are all the relationships in which a time period (of more than one clock tick) includes a point in time. The predicate for this relationship, as it holds between a period of time expressed as a pair of dates using the closed-open convention, and a point in time, is: (eff_beg_dt T1)
13. 338 Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES Time Period Relationships Along a Common Timeline Intersects Fills Occupies Aligns During |-| |------------| Starts Finishes |-| |-| |------------| |-----------| Figure 14.18 T1 [occupies] P1. It says that P1 occupies T1 just in case P1 starts on or before T1 and ends on or after T1. Consider the following request for information: which policies began on or before the 2009 Diabetes Management Wellness Program started, and ended on or after it started? The SQL written to fulfill this request is: SELECT * FROM V_Allen_Example WHERE pol_eff_beg_dt wp_eff_beg_dt) T1 [before] P1 The predicate for this relationship, as it holds between a period of time expressed as a pair of dates using the closed-open convention, and a point in time, is: (T1 þ fCTD(1) < eff_beg_dt) It says that P1 starts at least one clock tick after T1; similarly T occurs at least one clock tick before P1 starts. The inverse of this relationship is: P1 [beforeÀ1] T1. In this superscripted relationship, the time period is later than the point
14. Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES 339 Time Period Relationships Along a Common Timeline Excludes Before |-| |-----| Figure 14.19 T1 [before] P1. in time. The predicate for this relationship, as it holds between two time periods expressed as pairs of dates using the closed- open convention, is: (eff_end_dt < T1) It says that P1 ends at least one clock tick before T1. It follows that to pick out those versions or episodes which are non-contiguous with a given point in time, the predicate is: (T1 þ fCTD(1) < eff_beg_dt) OR (eff_end_dt < T1) Consider the following request for information: which policies have a temporal gap between when they began and when the 2009 Diabetes Management Wellness Program ended? The SQL written to fulfill this request is: SELECT * FROM V_Allen_Example WHERE wp_epis_end_dt < pol_eff_beg_dt T1 [meets] P1 The predicate for this relationship, as it holds between a period of time expressed as a pair of dates using the closed-open convention, and a point in time, is: (T1 þ fCTD(1) ¼ eff_beg_dt) It says that P1 starts immediately after T1. The inverse of this relationship is: P1 [meetsÀ1] T1. In this superscripted relationship, the time period is earlier than the point in time. The predicate for this relationship, as it holds
15. 340 Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES Time Period Relationships Along a Common Timeline Excludes Meets |-|-----| Figure 14.20 T1 [meets] P1. between two time periods expressed as pairs of dates using the closed-open convention, is: (eff_end_dt ¼ T1) It says that P1 ends immediately before T1. It follows that to pick out those versions, or episodes which are contiguous with a given point in time, the predicate is: (T1 þ fCTD(1) ¼ eff_beg_dt) OR (eff_end_dt ¼ T1) Consider the following request for information: which policies began at the same time the 2009 Diabetes Management Wellness Program ended? The SQL written to fulfill this request is: SELECT * FROM V_Allen_Example WHERE wp_epis_end_dt ¼ pol_eff_beg_dt P1 [excludes] T1 This not an Allen relationship. It is the node in our taxonomy of Allen relationships which includes the [before], [beforeÀ1], [meets] and [meetsÀ1] relationships. The predicate for this relationship is: (T1 þ fCTD(1))
16. Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES 341 Time Period Relationships Along a Common Timeline Excludes Before Meets |-| |-----| |-|-----| Figure 14.21 P1 [excludes] T1. the same as a request for policies which ended [before] the 2009 Diabetes Management Wellness Program began. The SQL written to fulfill this request is: SELECT * FROM V_Allen_Example WHERE (pol_epis_end_dt
17. 342 Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES Time Period Relationships Along a Common Timeline Excludes Before |-| |-| Figure 14.22 T1 [before] T2. T1 [meets] T2 The predicate for this relationship, as it holds between two points in time, expressed as points in time, is: ((T1 þ fCTD(1)) ¼ T2) It says that T1 comes immediately before T2. The inverse of this relationship is: T1 [meetsÀ1] T2. The predi- cate for this relationship, as it holds between two points in time, expressed as points in time, is: ((T1 – fCTD(1)) ¼ T2) It says that T1 comes immediately after T2. T1 [equals] T2 The predicate for this relationship, as it holds between two points in time, expressed as points in time, is: (T1 ¼ T2) It says that T1 and T2 are equal if and only if they occur on the same clock tick. Time Period Relationships Along a Common Timeline Excludes Meets |-|-| Figure 14.23 T1 [meets] T2.
18. Chapter 14 ALLEN RELATIONSHIP AND OTHER QUERIES 343 Time Period Relationships Along a Common Timeline Intersects Fills Equals |-| |-| Figure 14.24 T1 [equals] T2. A Claims Processing Example With the Asserted Versioning queries developed and discussed in this book, we have generally chosen to sacrifice realism in the examples in order to guarantee completeness of coverage. For example, in Chapter 13 we developed and discussed SQL views of nine categories of bi-temporal data, eight of those categories being what we called pipeline datasets. And yet few real-world collections of bi-temporal data, we acknowl- edged, would correspond to precisely one of those nine categories of data (with the one exception of currently asserted current versions of data, the category which corresponds to the data in a conventional table). Completeness of coverage was guaranteed, however, because those nine categories are a math- ematical partitioning of all possible combinations of past, pres- ent and future assertion and effective time data. Thus far in Chapter 14, we have developed and discussed SQL predicates, and SQL statements illustrating their use, each corresponding to one of the Allen relationships or one of the nodes in our taxonomy of Allen relationships. We have reviewed all possible Allen relationships, and taxonomic groupings of them, between pairs of time periods, between a time period and a point in time, and between two points in time. Once again, completeness of coverage has been guaranteed because of the use of a mathematical partitioning of all possible types of queries. But once again, the examples have not been very realistic.