Advances in Database Technology- P18

Chia sẻ: Thanh Cong | Ngày: | Loại File: PDF | Số trang:46

0
55
lượt xem
5
download

Advances in Database Technology- P18

Mô tả tài liệu
  Download Vui lòng tải xuống để xem tài liệu đầy đủ

Tham khảo tài liệu 'advances in database technology- p18', công nghệ thông tin, cơ sở dữ liệu phục vụ nhu cầu học tập, nghiên cứu và làm việc hiệu quả

Chủ đề:
Lưu

Nội dung Text: Advances in Database Technology- P18

  1. 832 K.-Y. Lam and H.C.W. Pang Due to the dynamic properties of sensor data, the probability of satisfying the condition in a sub-query at a node may change with time. Therefore the coordinator node needs to reorder the sequence of the participating nodes periodically. The reorder procedure is performed when the following condition is satisfied: the evaluation is stopped at the same node, called the false node, consecutively for a pre- defined period of time, and the false node is not the first node. Satisfaction of these conditions suggests that the sensor data values generated at the false node may have a high probability to be false in next evaluation. Hence the coordinator node will reorder the sequence of the nodes using the following procedure: a. The false node is now the first node in the sequence. b. All the nodes following the false node will remain in the same relative order to each other. c. All the nodes in front of the false node remain in their original relative order. They rejoin the node sequence but are now attached after the last node of the original sequence. 4 Implementation CMQES is implemented with MICA Motes [MM]. In CMQES, one of the MSPUs is connected with the base station through a mote interface board. It is the base station MSPU. CMQES contains two main software components: the sensor program in the MPSU, and the server program in the base station. The sensor program is implemented in NesC, which is a C-like language for TinyOS [TINY], and is responsible for capturing sensor data values, evaluating sub-queries, submitting sub- query results to the coordinator nodes, and collecting performance statistics at each MPSU. We have implemented SeqPush in the sensor program. The evaluation results of a CMQ and performance statistics are periodically sent to the base station through the base station MSPU for reporting. The second program is the server program residing at the base station. This program is implemented in Java, with MS Windows 2000 and MS SQL server chosen respectively as the operating systems and database. The server program is responsible for collecting results and performance statistics. In addition, the following parameters can be set using the server program at the base station for submitting a CMQ and controlling the operations at the MPSUs: 1. The sampling rate of the MSPUs. 2. The number of nodes participating in a CMQ. 3. Aggregation functions to be performed at a node, i.e., calculate the mean, maximum and minimum from the values of sub-queries. 4. Condition for the sub-query of a CMQ at an MPSU. 5 Demonstration and Sample Results In the demonstration, through the interface at the base station, we can submit CMQs for processing at the MPSUs, as shown in Figure 1. Currently, the MPSU is programmed to capture the light intensity of its surrounding environment periodically, Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  2. Aggregation of Continuous Monitoring Queries 833 i.e., every 2 sec, as sensor data values, and a message is sent every 30 sec to the coordinator node. The sampling rate and message reporting period can be varied at the base station. The message size for communication in TinyOS is 34 bytes. Five bytes are reserved for the message header and the message contains the results from 10 evaluations with each reading taking up 2 bytes. The remaining 9 bytes are for cycle number, message type and the destination address. Currently, the transmission delay of a single message from a MSPU to another one in the testing environment is between 300ms and 700ms. As TinyOS only provides best effort message delivery service. A lost message will be considered a missed evaluation cycle and logged accordingly by the coordinator node. Fig. 1. Program at the base station Fig. 2. Real time display of received messages and statistics Our experiment results show that the number of messages submitted in central aggregation scheme (CAS), i.e. all the participating MSPU submits sub-query result to a central MUPU for data aggregation periodically, is much larger than that in SeqPush. Two ammeters are connected to one of the participating nodes and the coordinator node to measure the energy consumption rates of the nodes when different operations are performed at the nodes. The result captured by the base station is displayed in real time as shown in Fig. 2. The statistics include: (1) Number of message transmitted, including sending and receiving messages. (2) Number of successful evaluations and number of missed query results due to loss of messages. (3) Number of reorder of nodes in SeqPush. References [MM] www.xbow.com [TINY] http://today.cs.berkeley.edu/tos/ Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  3. eVitae: An Event-Based Electronic Chronicle Bin Wu, Rahul Singh, Punit Gupta, Ramesh Jain Experiential Systems Group Georgia Institute of Technology 30308 Atlanta, USA {binwu,rsingh,jain}@ece.gaetch.edu punit@cc.gatech.edu Abstract. We present an event based system for storing, managing, and presenting personal multimedia history. At the heart of our approach is information organization using the concept of an event. Events allow modeling of the data in a manner that is independent of media. To store events, a novel database called EventBase is developed which is indexed by events. The unique characteristics of events make multidimensional querying and multiple perspective explorations of personal history information feasible. In this demo we present the major functions of eVitae. 1 Introduction Personal history systems electronically record important activities from a person’s life in the form of photographs, text, video, and audio. Examples of some existing systems such as [3] have shown that there are a number of salient challenges in this domain. First, information is fundamentally anchored to space and time and people often exploit them as cues for querying information. Second, the data as the carrier of information stays in respective silos. This fragments meaningful information across data. Third, to break down these silos, an information model independent of media is required to depict the content of information. Lastly, presentation of the information must be dynamically generated according to individual users’ preferences. We have developed a personal eChronicle [1] called eVitae [4]. In eVitae we utilize a novel generative theory based upon the concept of event to design an information system [2]. In this paper, we show how eVitae system as an access environment ingests heterogeneous data into meaningful information conveyed by events, aids the user to quickly focus on what is of interest and presents a multidimensional environment for exploration of events with their details stored in appropriate media. 2 Event-Based Data Modeling The approach we employ to design and implement eVitae is based on the notion of events [6]. An event is an occurrence or happening of significance that can be defined as a region or collection of regions in spatial-temporal-attribute space. Given k events, E. Bertino et al. (Eds.): EDBT 2004, LNCS 2992, pp. 834–836, 2004. © Springer-Verlag Berlin Heidelberg 2004 Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  4. eVitae: An Event-Based Electronic Chronicle 835 the event is formally denoted as and uniquely identified by eID (event identifier). In this notation t characterizes the event temporally, s denotes the spatial location(s) associated with the event, and are the attribute associated with the event. An event is defined by its event models, which includes the mandatory attributes: space, time, transcluded-media, event-name, and event-topic, and a finite set of free attributes. Events can be grouped together in collections, called event categories. Formally, an event category can be represented as: where is the set of events that comprise the category. Event categories provide a powerful construct to support the multiple ways of organizing information, definition of complex queries and notification, personalized views of information space where the user is interested. According to the definition of the event, the information layer implemented by events breaks down the data silos. This layer uses an event-based data model to construct a new index that is independent of data type. The organization, indexing, and storage of events conveying potentially changing information are accomplished by parsing the data as it is entered, and storing all events in a database of events called EventBase. The data is parsed by the system and events are produced using the event model. The EventBase also stores links to original data sources, which means the system can only present the appropriate media in the context of a particular event. EventBase is the extension of traditional database. In the implementation of prototype eVitae system, we use MySQL as the database to store and index events. 3 System Architecture The architecture of eVitae system comprises three modules, namely, Event Entry, EventBase, and What-You-See-Is-What-You-Get (WYSIWYG) query and exploration environments. The key features of the system are briefly discussed as following. Event Entry. An event may include multifarious data from different sources, such as video, audio, sensors, texts. The Event Entry module of the system is designed to produce events using event models, and record the link between events and related data. EventBase. EventBase is the backend of eVitae system which stores the Events. The transclusion of media is maintained by storing links between an event and the data it is based upon. EventBase uses the eID attribute of events as the unified index and is supported by MySQL. WYSIWYG query and exploration environment. This is an integrated interaction environment to explore electronic chronicle of a person, as shown in figure 1. By using temporal and spatial relationship as cues and by defining event categories to organize events, we create an exploratory environment for the user. The event exhibit panel (Fig. 1) presents an overall picture of events. Options for zooming, filtering, extraction, viewing relations, history keeping, and details-on-demand make the environment appealing and powerful. Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  5. 836 B. Wu et al. Fig. 1. WYSIWYG Query and Exploration Environment 4 Conclusion eVitae system demonstrates an event-based approach for organization, storage, management, and querying of personal history information comprising of multifarious data. The system provides an event entry module for importing data from heterogeneous data sources and assimilating them into events. This information is stored in EventBase which is indexed by events. Event-based modeling allows multidimensional querying and exploration of personal history information. Furthermore, flexibility in event definition and organization allows exploration of the data from multiple perspectives. Preliminary results indicate that an event-based system not only offers significant advantages in information organization, but also in exploration and discovery of information from data. References 1. R. Jain. “Multimedia Electronic Chronicles”, IEEE MultiMedia, pp. 102-103, Volume 10, Issue 03, July 2003. 2. R. Jain, “Events in Heterogeneous Data Environments”, Proc. International Conference on Data Engineering, Bangalore, March 2003. 3. J. Gemmell, G. Bell, R. Lueder, S. Drucker, and C. Wong. “MyLifeBits: fulfilling the Memex vision”, ACM Multimedia, pp. 235-238, ACM, 2002. 4. R. Singh, B. Wu, P. Gupta, R. Jain. “eVitae: Designing Experiential eChronilces”, ESG Technical Report Number : GT-ESG-01-10-03, http://www.esg.gatech.edu/report/GT- ESG-01-10-03.pdf Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  6. CAT: Correct Answers of Continuous Queries Using Triggers Goce Trajcevski1, Peter Scheuermann1*, Ouri Wolfson2, and Nimesh Nedungadi2 1 Department of Electrical and Computer Engineering, Northwestern University, Evanston, Il 60208, {goce,peters}@ece.northwestern.edu 2 Department of Computer Science, University of Illinois at Chicago, Chicago, Il 60607, {wolfson,nnedunga}@cs.uic.edu 1 Introduction and Motivation Consider the query Q1: Retrieve all the motels which will be no further then 1.5 miles from my route, sometime between 7:00PM and 8:30PM, which a mobile user posed to the Moving Objects Database (MOD). Processing such queries is of interest to wide range of applications (e.g. tourist information systems and context awareness [1,2]). These queries pertain to the future of a dynamic 1 world. Since MOD is only a model of the objects moving in the real world, the accuracy of the representation has to be continuously verified and updated, and the answer-set of Q1 has to be re-evaluated in every clock-tick . However, the re-evaluation of such queries can be avoided if an update to the MOD does not affect the answer-set. The motion of the moving object is typically represented as a trajectory – a sequence of 3D points and its projection in the X- Y plane is called a route. The details of the construction based on electronic maps and the speed-profiles of the city blocks are given in [5]. After a trajectory is constructed, a traffic abnormality may occur at a future time-point, due to an accident, road-work, etc..., and once it occurs, we need to: identify the trajectories that are affected, and update them properly (c.f. [5]). In the sequel, we focus on the impacts of the abnormalities to the continuous queries. Figure 1 shows three trajectories – and and their respective routes and If a road-work starts at 4:30PM on the segment between A and B which will last 5 hours, slow down the speed between 4:30PM and 9:30PM. enters that segment after 4:30PM, and its future portion will need to be modified. As illustrated by the thicker portion of instead of being at the point B at 4:50, the object will be there at 5:05. A key observation is that if the object, say whose trajectory is issued the query Q1, we have to re-evaluate the answer. * Research partially supported by NSF grant EIA-000 0536 1 Hence the name continuous queries – formally defined for MOD in [3]. E. Bertino et al. (Eds.): EDBT 2004, LNCS 2992, pp. 837–840, 2004. © Springer-Verlag Berlin Heidelberg 2004 Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  7. 838 G. Trajcevski et al. Fig. 1. Trajectories, Routes and Updates 2 System Architecture Figure 2 illustrates the main components and behavioral aspects of the CAT system: There are tables which: store the trajectories (MOT) and landmarks of inter- est (BUILDINGS); keep track of the queries posed and their answers (PEND- ING-QUERIES and ANSWERS); and store the information about the traffic abnormalities (TRAFFIC_ABN). The trajectories and the landmarks were ob- tained using the real maps of Cook County, Chicago. In the heart of the CAT system is the set of Triggers, part of which we illustrate in the context of the example query Q1. If posed the query Q1 at 3:30PM, its answer contains the motels and to which should be close at 7:55PM and 8:20PM, respectively. When an abnormality is detected, its relevant parameters are inserted in the TRAFFIC_ABN table. This, however, “awakes” whose event is: ON INSERT TO TRAFFIC_ABN checks if the abnormality affects some of the trajectories stored in the MOT and, if so, updates their future part. However, the action part of which updates the MOT, in turn, is the event for ON UPDATE TO MOT IF HAS_PENDING_QUERIES ... checks if the corresponding moving object has posed some queries which are still of interest (i.e. their time has not “expired”). The condition of is satisfied by and its action part will re-evaluate the query Q1, based on the new future-portion of Due to the delay, trajectory will Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  8. CAT: Correct Answers of Continuous Queries Using Triggers 839 Fig. 2. Behavioral Aspects of the CAT be near at 8:35PM, which is a bit too late for the user. On the other hand, will be near the motel at 7:05PM. Before the traffic incident was not part of the answer, (it would have had the desired proximity at 6:50PM). 3 Demonstration All the back-end components are implemented using Oracle 9i as a server. We used User-Defined Types (UDT) to model the entities and User-Defined Func- tions (UDF) to implement the processing, exploiting the Oracle Spatial predi- cates. The front-end client, which is the GUI presented to the end-user, is imple- mented in Java. The GUI gives the options of specifying the queries (i.e. time of submission; relevant time interval; objects of interest; etc...). Once the user clicks the SUBMIT button, the query is evaluated and its answer is displayed. In the server, the query is assigned an id number and it is stored in the PEND- ING_QUERIES table. Clearly, in a real MOD application, the client will be either a wireless (mobile) user of a web browser-based one, properly interfaced to the server. To test the execution of the triggers and the updates of the answer(s) to the continuous queries posed, the GUI offers a window for generating a traffic ab- normality. The user enters the beginning and the end times of the incident as well as its “type” (which determines the impact on the speed-profile). He also enters the route segments along which the traffic incident is spread. The moment this information is submitted to the server, the affected trajectories are updated AND the new answer (s) to the posed continuous queries are displayed back to the user. Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  9. 840 G. Trajcevski et al. References 1. A. Hinze and A. Voisard. Location-and time-based information delivery in tourism. In SSTD, 2003. 2. A. Pashtan, R. Blatter, A. Heusser, and P. Scheuermann. Catis: A context-aware tourist information system. In IMC, 2003. 3. A. P. Sistla, O. Wolfson, S. Chamberlain, and S. Dao. Modeling and querying moving objects. In ICDE, 1997. 4. G. Trajcevski and P. Scheuermann. Triggers and continuous queries in moving objects databases. In MDDS, 2003. 5. G. Trajcevski, O. Wolfson, B. Xu, and P. Nelson. Real-time traffic updates in moving object databases. In MDDS, 2002. Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  10. Hippo: A System for Computing Consistent Answers to a Class of SQL Queries Jan Chomicki1, Jerzy Marcinkowski2, and Slawomir Staworko1 1 Dept. Computer Science and Engineering, University at Buffalo {chomicki,staworko}@cse.buffalo.edu 2 Instytut Informatyki, Wroclaw Uniwersity, Poland Jerzy.Marcinkowski@ii.uni.wroc.pl 1 Motivation and Introduction Integrity constraints express important properties of data, but the task of pre- serving data consistency is becoming increasingly problematic with new database applications. For example, in the case of integration of several data sources, even if the sources are separately consistent, the integrated data can violate the in- tegrity constraints. The traditional approach, removing the conflicting data, is not a good option because the sources can be autonomous. Another scenario is a long-running activity where consistency can be violated only temporarily and future updates will restore it. Finally, data consistency may be neglected because of efficiency or other reasons. In [1] Arenas, Bertossi, and Chomicki have proposed a theoretical framework for querying inconsistent databases. Consistent query answers are defined to be those query answers that are true in every repair of a given database instance. A repair is a consistent database instance obtained by changing the given instance using a minimal set of insertions/deletions. Intuitively, consistent query answers are independent of the way the inconsistencies in the data would be resolved. Example 1. Assume that an instance of the relation Student is as follows: Assume also that the functional dependency is given. The above instance has two repairs: one obtained by deleting the first tuple, the other - by deleting the second tuple. A query asking for the address of Jones returns Chicago as a consistent answer because the third tuple is in both repairs. However, the query asking for the address of Smith has no consistent answers because the addresses in different repairs are different. On the other hand, the query asking for those people who live in Los Angeles or New York returns Smith as a consistent answer. This conservative definition of consistent answers has one shortcoming: the number of repairs. Even for a single functional dependency, the number of repairs E. Bertino et al. (Eds.): EDBT 2004, LNCS 2992, pp. 841–844, 2004. © Springer-Verlag Berlin Heidelberg 2004 Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  11. 842 J. Chomicki, J. Marcinkowski, and S. Staworko can be exponential in the number of tuples in the database [3]. Nevertheless, several practical mechanisms for the computation of consistent query answers without computing all repairs have been developed (see [5] for a survey): query rewriting [1], logic programs [2,4,9], and compact representations of repairs [6, 7]. The first is based on rewriting the input query Q into a query such that the evaluation of returns the set of consistent answers to Q. This method works only for SJD1 queries in the presence of universal binary constraints. The second approach uses disjunctive logic programs to specify all repairs, and then with the help of a disjunctive LP system [8] finds the consistent answers to a given query. Although this approach is applicable to very general queries in the presence of universal constraints, the complexity of evaluating disjunctive logic programs makes this method impractical for large databases. 2 The System Hippo The system Hippo is an implementation of the third approach. All information about integrity violations is stored in a conflict hypergraph. Every hyperedge connects the tuples violating together an integrity constraint. Using the conflict hypergraph, we can find if a given tuple belongs to the set of consistent answers without constructing all repairs [6]. Because the conflict hypergraph has polynomial size, this method has polynomial data complexity and allows us to efficiently deal even with large databases [7]. Currently, our ap- plication computes consistent answers to SJUD queries in the presence of denial constraints (a class containing functional dependency constraints and exclusion constraints). Allowing union in the query language is crucial for being able to extract indefinite disjunctive information from an inconsistent database (see Ex- ample 1). Future work includes the support for restricted foreign key constraints, uni- versal tuple-generating dependencies and full PSJUD2 queries. However, because computing consistent query answers for SPJ queries is co-NP-data-complete [3, 6], polynomial data complexity cannot be guaranteed once projection is allowed. The whole system is implemented in Java as an RDBMS frontend. Hippo works with any RDBMS that can execute SQL queries, and provides a JDBC access interface (we use PostgreSQL). The data stored in the RDBMS needs not be altered. The flow of data in Hippo is presented on Figure 1. Before processing any input query, the system performs Conflict Detection and creates Conflict Hyper- graph for further usage. We are assuming that the number of conflicts is small enough for the hypergraph to be stored in main memory. The only output of this system is the Answer Set consisting of the consistent answers to the input 1 When describing a query class, P stands for projection, S for selection, U for union, J for cartesian product, and D for difference. 2 Currently, our application supports only those cases of projection that don’t intro- duce existential quantifiers in the corresponding relational calculus query. Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  12. Hippo: A System for Computing Consistent Answers 843 Fig. 1. Data flow in Hippo Query in the database instance DB with respect to a set of integrity constraints IC. The processing of the Query starts from Enveloping. As a result of this step we get a query defining Candidates (candidate consistent query answers). This query subsequently undergoes Evaluation by the RDBMS. For every tuple from the set of candidates, the system uses Prover to check if the tuple is a consistent answer to the Query. Depending on the result of this check, the tuple is either added to the Answer Set or not. For every tuple that Prover processes, several membership checks have typi- cally to be performed. In the base version of the system this is done by simply executing the appropriate membership queries on the database. This is a costly procedure and it has a significant influence on the overall time performance of the system. We have introduced several optimizations addressing this problem. In general, by modifying the expression defining the envelope (the set of can- didates) the optimizations allow us to answer the required membership checks without executing any queries on the database. Also, using an expression select- Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  13. 844 J. Chomicki, J. Marcinkowski, and S. Staworko ing a subset of the set of consistent query answers, we can significantly reduce the number of tuples that have to be processed by Prover. A more detailed description of those techniques can be found in [7]. 3 Demonstration The presentation of the Hippo system will consist of three parts. First, we will demonstrate that using consistent query answers we can extract more informa- tion from an inconsistent database than in the approach where the input query is evaluated over the database from which the conflicting tuples have been removed. Secondly, we will show the advantages of our method over competing approaches by demonstrating the expressive power of supported queries and integrity con- straints. And finally, we will compare the running times of our approach and the query rewriting approach, showing that our approach is more efficient. For every query being tested, we will also measure the execution time of this query by the RDBMS backend (it corresponds to the approach when we ignore the fact that the database is inconsistent). This will allow us to conclude that the time overhead of our approach is acceptable. References 1. M. Arenas, L. Bertossi, and J. Chomicki. Consistent Query Answers in Inconsistent Databases. In ACM Symposium on Principles of Database Systems (PODS), pages 68–79, 1999. 2. M. Arenas, L. Bertossi, and J. Chomicki. Answer Sets for Consistent Query An- swering in Inconsistent Databases. Theory and Practice of Logic Programming, 3(4–5):393–424, 2003. 3. M. Arenas, L. Bertossi, J. Chomicki, X. He, V. Raghavan, and J. Spinrad. Scalar Aggregation in Inconsistent Databases. Theoretical Computer Science, 296(3) :405– 434, 2003. 4. P. Barcelo and L. Bertossi. Logic Programs for Querying Inconsistent Databases. In International Symposium on Practical Aspects of Declarative Languages (PADL), pages 208–222. Springer–Verlag, LNCS 2562, 2003. 5. L. Bertossi and J. Chomicki. Query Answering in Inconsistent Databases. In J. Chomicki, R. van der Meyden, and G. Saake, editors, Logics for Emerging Appli- cations of Databases. Springer-Verlag, 2003. 6. J. Chomicki and J. Marcinkowski. Minimal-Change Integrity Maintenance Using Tuple Deletions. Technical Report cs.DB/0212004, arXiv.org e-Print archive, De- cember 2002. Under journal submission. 7. J. Chomicki, J. Marcinkowski, and S. Staworko. Computing Consistent Query An- swers Using Conflict Hypergraphs. In preparation. 8. T. Eiter, W. Faber, N. Leone, and G. Pfeifer. Declarative Problem-Solving in DLV. In J. Minker, editor, Logic-Based Artificial Intelligence, pages 79–103. Kluwer, 2000. 9. G. Greco, S. Greco, and E. Zumpano. A Logical Framework for Querying and Repairing Inconsistent Databases. IEEE Transactions on Knowledge and Data En- gineering, 15(6):1389–1408, 2003. Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  14. An Implementation of P3P Using Database Technology Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, and Yirong Xu IBM Almaden Research Center 650 Harry Road, San Jose, CA 95120, USA {ragrawal,kiernan,srikant}@almaden.ibm.com, xuyirong@cn.ibm.com http://www.almaden.ibm.com/software/quest 1 Introduction The privacy of personal information on the Internet has become a major concern for governments, businesses, media, and the public. Platform for Privacy Pref- erences (P3P), developed by the World Wide Web Consortium (W3C), is the most significant effort underway to enable web users to gain more control over their private information. P3P provides mechanisms for a web site to encode its data-collection and data-use practices in a standard XML format, known as a P3P policy [3], which can be programmatically checked against a user’s privacy preferences. This demonstration presents an implementation of the server-centric archi- tecture for P3P proposed in [1]. The novel aspect of this implementation is that it makes use of the proven database technology, as opposed to the prevailing client-centric implementation based on specialized policy-preference matching engines. Not only does this implementation have qualitative advantages, our experiments indicate that it performs significantly better (15-30 times faster) than the sole public-domain client-centric implementation and that the latency introduced by preference matching is small enough (0.16 second on average) for real-world deployments of P3P [1]. 2 Overview of P3P The P3P protocol has two parts: 1. Privacy Policies: An XML format in which a web site can encode its data- collection and data-use practices [3]. For example, an online bookseller can publish a policy which states that it uses a customer’s name and home phone number for telemarketing purpose, but that it does not release this information to external parties. 2. Privacy Preferences: An XML format for specifying privacy preferences and an algorithm for programmatically matching preferences against policies. The W3C APPEL working draft provides such a format and corresponding policy-preference matching algorithm [2]. For example, a privacy-conscious consumer may define a preference stating that she does not want retailers to use her personal information for telemarketing and product promotion. E. Bertino et al. (Eds.): EDBT 2004, LNCS 2992, pp. 845–847, 2004. © Springer-Verlag Berlin Heidelberg 2004 Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  15. 846 R. Agrawal et al. Fig. 1. Client-centric policy-preference matching 2.1 Client-Centric Implementation A client-centric architecture for implementing P3P has been described in [4]. As a user browses a web site, the site’s P3P policy is fetched to the client side. The policy is then checked by a specialized APPEL engine against the user’s APPEL preference to see if the policy conforms to the preference (see Figure 1). There are two prominent implementations of this architecture: Microsoft IE6 and AT&T Privacy Bird. 2.2 Server-Centric Implementation Figure 2 shows the server-centric architecture we have developed. A web site de- ploying P3P first installs its privacy policy in a database system. Then database querying is used for matching a user’s privacy preference against privacy policies. The server-centric implementation has several advantages including: setting up the infrastructure necessary for ensuring that web sites act according to their stated policies, allowing P3P to be deployed in thin, mobile clients that are likely to dominate Internet access in the future, and allowing site owners to refine their policies based on the privacy preferences of their users. 3 Demonstration Our implementation consists of both client and server components. 3.1 Client Components We extend Microsoft Internet Explorer to invoke preference checking at the server before a web page is accessed. The IE extension allows a user to specify her privacy preference at different sensitivity levels. It invokes the preference checking by sending the preference to the server. Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  16. An Implementation of P3P Using Database Technology 847 Fig. 2. Server-centric policy-preference matching 3.2 Server Components We define a schema in DB2 for storing policy data in the relational tables. This schema contains a table for every element defined in the P3P policy. The tables are linked using foreign keys reflecting the XML structure of the policies. We extend IBM Tivoli Privacy Wizard (a web-based GUI tool for web site owners to define P3P policies) with the functionality of parsing and shredding P3P policies as a set of records in the database tables. When the server receives the APPEL preference from the client, it translates the preference into SQL queries to be run against the policy tables. The SQL queries corresponding to the preference are submitted to the database engine. The result of the query evaluation yields the action to be taken. The evaluation result is sent back to the client. If the policy does not conform to the preference, the IE extension will block the web page and prompt a message to the user. Otherwise, the requested web page is displayed. References 1. Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, and Yirong Xu. Implement- ing P3P using database technology. In 19th Int’l Conference on Data Engineering, Bangalore, India, March 2003. 2. Lorrie Cranor, Marc Langheinrich, and Massimo Marchiori. A P3P Preference Ex- change Language 1.0 (APPEL1.0). W3C Working Draft, April 2002. 3. Lorrie Cranor, Marc Langheinrich, Massimo Marchiori, Martin Presler-Marshall, and Joseph Reagle. The Platform for Privacy Preferences 1.0 (P3P1.0) Specifica- tion. W3C Recommendation, April 2002. 4. The World Wide Web Consortium. P3P 1.0: A New Standard in Online Privacy. Available from http://www.w3.org/P3P/brochure.html. Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  17. XQBE: A Graphical Interface for XQuery Engines Daniele Braga, Alessandro Campi, and Stefano Ceri Politecnico di Milano Piazza Leonardo da Vinci,32 - 20133 Milano, Italy {braga,campi,ceri}@elet.polimi.it Abstract. XQuery is increasingly popular among computer scientists with a SQL background, since queries in XQuery and SQL require com- parable skills to be formulated. However, the number of these experts is limited, and the availability of easier XQuery “dialects” could be ex- tremely valuable. Something similar happened with QBE, initially pro- posed as an alternative to SQL, that has then become popular as the user-friendly query language supported by MS Access. We designed and implemented XQBE, a visual dialect of XQuery that uses hierarchical structures, coherent with the hierarchical nature of XML, to denote the input and output documents. Our demo consists of examples of queries in XQBE and shows how our prototype allows to switch between equivalent representations of the same query. 1 Introduction The diffusion of XML sets a pressing need for providing the capability to query XML data to a wide spectrum of users, typically lacking in computer program- ming skills. This demonstration presents a user friendly interface, based on an in- tuitive visual query language (XQBE, XQuery By Example), that we developed for this purpose, inspired by the QBE [2]. QBE showed that a visual interface to a query language is effective in supporting the intuitive formulation of queries when the basic graphical constructs are close to the visual abstraction of the underlying data model. Accordingly, while QBE is a relational query language, based on the representation of tables, XQBE is based on the use of annotated trees, to adhere to the hierarchical nature of XML. XQBE was designed with the objectives of being intuitive and easy to map directly to XQuery. Our interface is capable of generating the visual representation of many XQuery statements that belong to a subset of XQuery, defined by our translation algorithm (sketched later). XQBE allows for arbitrarily deep nesting of XQuery FLWOR expressions, construction of new XML elements, and restructuring of existing documents. However, the expressive power of XQBE is limited in comparison with XQuery, which is Turing-complete. The particular purpose of XQBE makes usability one E. Bertino et al. (Eds.): EDBT 2004, LNCS 2992, pp. 848–850, 2004. © Springer-Verlag Berlin Heidelberg 2004 Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  18. XQBE: A Graphical Interface for XQuery Engines 849 Fig. 1. A sample document (bib.xml) of its critical success factors, and we considered this aspect during the whole de- sign and implementation process. Still from a usability viewpoint, our prototype is a first step towards an integrated environment to support both XQuery and XQBE, where users alternate between the XQBE and XQuery representations. 2 XQuery by Example XQBE is fully described in [1]. Here we only introduce its basics by means of the query (Q1) “List books published by Addison-Wesley after 1991, including their year and title”, on the data in Figure 1. Its XQBE version is in Figure 2(a), while its XQuery version is A query is formed by a source part (on the left) and a construct part (on the right). Both parts contain labelled graphs that express properties of XML fragments: the source part describes the XML data to be matched, the construct part specifies which are to be retained. Correspondence between the two parts is expressed by explicit bindings. XML elements in the target are represented as labelled rectangles, their attributes as black circles (with the name on the arc), and their PCDATA as an empty circle. In the construct part, the paths that branch out of a bound node indicate which of its contents are to be retained. In Figure 2(a) the source part matches the book elements with a year greater than 1991 and a publisher equal to “Addison-Wesley”. The binding edge between the book nodes states that the result shall contain as many book elements as those matched. The trapezoidal bib node means that all the generated books are to be contained into one bib element. The translation process translates an XQBE query into a sentence of the XQuery subset defined by the grammar in figure 3. The generated translation of Q1 is: It is also possible to obtain the XQBE version of an XQuery statement. The automatically generated XQBE version of Q1 is shown in Figure 2(b). Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  19. 850 D. Braga, A. Campi, and S. Ceri Fig. 2. The XQBE version of Q1(a) and the automatically generated XQBE for Q1(b). Fig. 3. EBNF specification of the XQuery subset expressible with XQBE 3 Conclusions The contribution of our work is the availability of an environment in which users can query XML data with a GUI, access the generated XQuery statement, and also visualize the XQBE version of a large class of XQuery statements. Moreover they can modify any of the representations and observe the changes in the other representation. References 1. D. Braga and A. Campi. A graphical environment to query xml data with xquery. In Proc. of the 4th WISE, Roma (Italy), December 2003. 2. M. M. Zloof. Query-by-example: A data base language. IBM Systems Journal, 1977. Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
  20. P2P-DIET: One-Time and Continuous Queries in Super-Peer Networks Stratos Idreos, Manolis Koubarakis, and Christos Tryfonopoulos Dept. of Electronic and Computer Engineering Technical University of Crete, GR73100 Chania, Greece {sidraios,manolis,trifon}@intelligence.tuc.gr 1 Introduction In peer-to-peer (P2P) systems a very large number of autonomous computing nodes (the peers) pool together their resources and rely on each other for data and services. P2P systems are application level virtual or overlay networks that have emerged as a natural way to share data and resources. The main applica- tion scenario considered in recent P2P data sharing systems is that of one-time querying: a user poses a query (e.g., “I want music by Moby”) and the system returns a list of pointers to matching files owned by various peers in the network. Then, the user can go ahead and download files of interest. The complementary scenario of selective dissemination of information (SDI) or selective information push is also very interesting. In an SDI scenario, a user posts a continuous query to the system to receive notifications whenever certain resources of interest ap- pear in the system (e.g., when a song of Moby becomes available). SDI can be as useful as one-time querying in many target applications of P2P networks ranging from file sharing, to more advanced applications such as alert systems for digital libraries, e-commerce networks etc. At the Intelligent Systems Laboratory of the Technical University of Crete, we have recently concentrated on the problem of SDI in P2P networks in the context of project DIET (http://www.dfki.de/diet). Our work, summarized in [3], has culminated in the implementation of P2P-DIET, a service that uni- fies one-time and continuous query processing in P2P networks with super- peers. P2P-DIET is a direct descendant of DIAS, a Distributed Information Alert System for digital libraries, that was presented in [4]. P2P-DIET combines one-time querying as found in other super-peer networks and SDI as proposed in DIAS. P2P-DIET has been implemented on top of the open source DIET Agents Platform (http://diet–agents.sourceforge.net/) and it is available at http://www.intelligence.tuc.gr/p2pdiet. 2 The System P2P-DIET A high-level view of the P2P-DIET architecture is shown in Figure 1(a) and a layered view in Figure 1(b). There are two kinds of nodes: super-peers and clients. All super-peers are equal and have the same responsibilities, thus the E. Bertino et al. (Eds.): EDBT 2004, LNCS 2992, pp. 851–853, 2004. © Springer-Verlag Berlin Heidelberg 2004 Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark
Đồng bộ tài khoản