
Mobile Telecommunications Protocols for Data Networks P7



Mobile Telecommunications Protocols For Data Networks. Anna Hać
Copyright © 2003 John Wiley & Sons, Ltd. ISBN: 0-470-85056-6

7 XML, RDF, and CC/PP

Extensible Markup Language (XML) describes a class of data objects called XML documents and partially describes the behavior of the computer programs that process them. XML is an application profile or restricted form of the Standard Generalized Markup Language (SGML).

Resource Description Framework (RDF) can be used to create a general, yet extensible framework for describing user preferences and device capabilities. This information can be provided by the user to servers and content providers. The servers can use this information describing the user’s preferences to customize the service or content provided. The ability of RDF to reference profile information via URLs assists in minimizing the number of network transactions required to adapt content to a device, while the framework fits well into the current and future protocols.

A Composite Capability/Preference Profile (CC/PP) is a collection of the capabilities and preferences associated with a user and the agents used by the user to access the World Wide Web. These user agents include the hardware platform, system software, and applications used by the user. User agent capabilities and preferences can be thought of as metadata or properties and descriptions of the user agent hardware and software.

7.1 XML DOCUMENT

XML documents are made up of storage units called entities, which contain either parsed or unparsed data. Parsed data is made up of characters, some of which form character data and some of which form markup. Markup encodes a description of the document’s storage layout and logical structure. XML provides a mechanism to impose constraints on the storage layout and logical structure. A software module called an XML processor is used to read XML documents and provide access to their content and structure.
It is assumed that an XML processor is doing its work on behalf of another module called the application. An XML processor reads XML data and provides the information to the application.
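The division of labor between processor and application can be sketched with Python's standard `xml.sax` module, which plays the role of the XML processor here. The handler class `EventCollector`, the helper `read_document`, and the sample document are hypothetical names chosen for illustration; this is a minimal sketch, not a description of any particular product.

```python
import xml.sax


class EventCollector(xml.sax.ContentHandler):
    """Plays the 'application': it only sees what the processor reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # The processor reports structure: element name and attributes.
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # The processor reports content: character data between tags.
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        self.events.append(("end", name))


def read_document(xml_text):
    """Run the XML processor over xml_text and return what the application saw."""
    handler = EventCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


# Hypothetical sample document.
DOC = '<profile device="phone"><sound>on</sound></profile>'
events = read_document(DOC)
```

The application never touches the raw markup; it receives only the element boundaries, attributes, and character data that the processor extracts.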
The design goals for XML are

• to be straightforwardly usable over the Internet,
• to support a wide variety of applications,
• to be compatible with SGML,
• to create easy-to-write programs that process XML documents,
• to keep the number of optional features in XML to the absolute minimum, ideally zero,
• to have XML documents human-legible and reasonably clear,
• to prepare XML design quickly,
• to have the design of XML formal and concise,
• to have XML documents that are easy to create,
• to have terseness in XML markup of minimal importance.

A data object is an XML document if it is well formed; a well-formed document may in addition be valid if it meets certain further constraints. Each XML document has both a logical and a physical structure. Physically, the document is composed of units called entities. An entity may refer to other entities to cause their inclusion in the document. A document begins in a root or document entity. Logically, the document is composed of declarations, elements, comments, character references, and Processing Instructions (PIs), all of which are indicated in the document by explicit markup. The logical and physical structures must nest properly.

Matching the document production implies that it contains one or more elements, and there is exactly one element, called the root or document element, no part of which appears in the content of any other element. For all other elements, if the start-tag is in the content of another element, the end-tag is in the content of the same element. The elements, delimited by start- and end-tags, nest properly within each other.

A parsed entity contains text, a sequence of characters, which may represent markup or character data. Characters are classified for convenience as letters, digits, or other characters. A letter consists of an alphabetic or syllabic base character or an ideographic character.
A Name is a token beginning with a letter or one of a few punctuation characters, and continuing with letters, digits, hyphens, underscores, colons, or full stops, together known as name characters. The XML name space mechanism assigns a meaning to names containing colon characters. Therefore, authors should not use the colon in XML names except for name space purposes, but XML processors must accept the colon as a name character. An Nmtoken (name token) is any mixture of name characters.

Literal data is any quoted string not containing the quotation mark used as a delimiter for that string. Literals are used for specifying the content of internal entities (EntityValue), the values of attributes (AttValue), and external identifiers (SystemLiteral). Note that a SystemLiteral can be parsed without scanning for markup.

Text consists of intermingled character data and markup. Markup takes the form of start-tags, end-tags, empty-element tags, entity references, character references, comments, Character Data (CDATA) section delimiters, document type declarations, processing instructions, XML declarations, text declarations, and any white space that is at the top level of the document entity (that is, outside the document element and not inside any other markup). All text that is not markup constitutes the character data of the document.
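The notions of well-formedness and of character data versus markup can be illustrated with a small sketch using Python's standard `xml.etree.ElementTree` parser. The function names `is_well_formed` and `character_data` and the two sample strings are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text as a well-formed document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def character_data(xml_text):
    """Return the character data of the document, i.e. the text that is not markup."""
    return "".join(ET.fromstring(xml_text).itertext())


# Hypothetical samples: GOOD nests properly, BAD closes its tags in the wrong order.
GOOD = "<note><to>server</to><!-- a comment --><body>hello <b>world</b></body></note>"
BAD = "<note><to>server</note></to>"
```

In this sketch the comment and all tags are markup, so they do not appear in the result of `character_data(GOOD)`, while `BAD` is rejected because its start- and end-tags do not nest properly.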
Comments may appear anywhere in a document outside other markup; in addition, they may appear within the document type declaration at places allowed by the grammar. They are not part of the document’s character data; an XML processor may, but need not, make it possible for an application to retrieve the text of comments. For compatibility, the string "--" (double-hyphen) must not occur within comments. Parameter entity references are not recognized within comments.

PIs allow documents to contain instructions for applications. PIs are not part of the document’s character data, but must be passed through to the application. The PI begins with a target (PITarget) used to identify the application to which the instruction is directed. The target names XML, xml, and so on are reserved for specification standardization. The XML Notation mechanism may be used for formal declaration of PI targets. Parameter entity references are not recognized within PIs.

Markup declarations can affect the content of the document, as passed from an XML processor to an application; examples are attribute defaults and entity declarations. The stand-alone document declaration, which may appear as a component of the XML declaration, signals whether there are such declarations, which appear external to the document entity or in parameter entities. An external markup declaration is defined as a markup declaration occurring in the external subset or in a parameter entity (external or internal, the latter being included because nonvalidating processors are not required to read them).

In a stand-alone document declaration, the value ‘yes’ indicates that there are no external markup declarations that affect the information passed from the XML processor to the application. The value ‘no’ indicates that there are or may be such external markup declarations.
The stand-alone document declaration only denotes the presence of external declarations; the presence, in a document, of references to external entities, when those entities are internally declared, does not change its stand-alone status. If there are no external markup declarations, the stand-alone document declaration has no meaning. If there are external markup declarations but there is no stand-alone document declaration, the value ‘no’ is assumed.

Each XML document contains one or more elements, the boundaries of which are either delimited by start-tags and end-tags, or, for empty elements, by an empty-element tag. Each element has a type, identified by name, sometimes called its Generic Identifier (GI), and may have a set of attribute specifications. Each attribute specification has a name and a value.

An element is valid if there is an element declaration in which the Name matches the element type, and one of the following holds:

1. The declaration matches EMPTY and the element has no content.
2. The declaration matches CHILDREN and the sequence of child elements belongs to the language generated by the regular expression in the content model, with optional white space between the start-tag and the first child element, between child elements, or between the last child element and the end-tag.
3. The declaration matches MIXED and the content consists of character data and child elements whose types match names in the content model.
4. The declaration matches ANY, and the types of any child elements have been declared.
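Rule 2 above treats a content model as a regular expression over the sequence of child element names. The following sketch makes that idea concrete; it is not a DTD validator. The `book` element, its content model `(title, author+, year?)`, and the names `BOOK_MODEL` and `children_match` are hypothetical, and the content model is translated to a Python regular expression by hand.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content model, as it might be declared in a DTD:
#   <!ELEMENT book (title, author+, year?)>
# Hand-translated into a regular expression over space-terminated child names.
BOOK_MODEL = re.compile(r"(title )(author )+(year )?")


def children_match(element, model):
    """Return True if the sequence of child element names is in the model's language."""
    names = "".join(child.tag + " " for child in element)
    return model.fullmatch(names) is not None


# White space between child elements is allowed and does not affect the result.
valid = ET.fromstring(
    "<book> <title>T</title> <author>A</author> <author>B</author> </book>"
)
# Same element types, but in an order the content model does not generate.
invalid = ET.fromstring("<book><author>A</author><title>T</title></book>")
```

Only the order and types of the child elements are checked here; a validating processor would also check attributes and the declarations of the children themselves.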
The element structure of an XML document may, for validation purposes, be constrained using element-type and attribute-list declarations. An element-type declaration constrains the element’s content. Element-type declarations often constrain which element types can appear as children of the element. At the user option, an XML processor may issue a warning when a declaration mentions an element type for which no declaration is provided, but this is not an error.

An element type has element content when elements of that type must contain only child elements (no character data), optionally separated by white space. In this case, the constraint includes a content model, a simple grammar governing the allowed types of the child elements and the order in which they are allowed to appear. The grammar is built on content particles, which consist of names, choice lists of content particles, or sequence lists of content particles.

Attribute-list declarations may be used

• to define the set of attributes pertaining to a given element type;
• to establish type constraints for these attributes;
• to provide default values for attributes.

Attribute-list declarations specify the name, data type, and default value (if any) of each attribute associated with a given element type.

An XML document may consist of one or many storage units. These are called entities; they all have content and are all [except for the document entity and the external Document Type Definition (DTD) subset] identified by entity name. Each XML document has one entity called the document entity, which serves as the starting point for the XML processor and may contain the whole document. Entities may be either parsed or unparsed. A parsed entity’s contents are referred to as its replacement text; this text is considered an integral part of the document. An unparsed entity is a resource whose contents may or may not be text, and if text, may be other than XML.
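The statement that a parsed entity's replacement text becomes an integral part of the document can be seen in a short sketch. It assumes Python's standard `xml.etree.ElementTree` parser, which expands entities declared in the internal DTD subset; the entity name `publisher`, its value, and the element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The document entity declares one internal parsed entity and refers to it by name.
DOC = """<!DOCTYPE book [
  <!ENTITY publisher "Example Press">
]>
<book><imprint>&publisher;</imprint></book>"""

root = ET.fromstring(DOC)
# The application sees the replacement text, not the entity reference.
imprint_text = root.find("imprint").text
```

After parsing, the application cannot tell whether the text was typed in place or included through an entity reference.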
Each unparsed entity has an associated notation, identified by name. Beyond a requirement that an XML processor makes the identifiers for the entity and notation available to the application, XML places no constraints on the contents of unparsed entities. Parsed entities are invoked by name using entity references; unparsed entities are invoked by name, given in the value of ENTITY or ENTITIES attributes.

General entities are entities for use within the document content. General entities are sometimes referred to with the unqualified term entity when this leads to no ambiguity. Parameter entities are parsed entities for use within the DTD. These two types of entities use different forms of reference and are recognized in different contexts. Furthermore, they occupy different name spaces; a parameter entity and a general entity with the same name are two distinct entities.

7.2 RESOURCE DESCRIPTION FRAMEWORK (RDF)

The RDF is a foundation for processing metadata; it provides interoperability between applications that exchange machine-understandable information on the Web. RDF uses
XML to exchange descriptions of Web resources but the resources being described can be of any type, including XML and non-XML resources. RDF emphasizes facilities to enable automated processing of Web resources. RDF can be used in a variety of application areas, for example, in resource discovery to provide better search engine capabilities; in cataloging for describing the content and content relationships available at a particular Web site, page, or digital library; by intelligent software agents to facilitate knowledge sharing and exchange; in content rating; in describing collections of pages that represent a single logical document; in describing intellectual property rights of Web pages; and in expressing the privacy preferences of a user as well as the privacy policies of a Web site. RDF with digital signatures is the key to building the Web of Trust for electronic commerce, collaboration, and other applications.

Descriptions used by these applications can be modeled as relationships among Web resources. The RDF data model defines a simple model for describing interrelationships among resources in terms of named properties and values. RDF properties may be thought of as attributes of resources and in this sense correspond to traditional attribute-value pairs. RDF properties also represent relationships between resources. As such, the RDF data model can resemble an entity-relationship diagram. The RDF data model, however, provides no mechanisms for declaring these properties, nor does it provide any mechanisms for defining the relationships between these properties and other resources. That is the role of RDF Schema.

To describe bibliographic resources, for example, descriptive attributes including author, title, and subject are common. For digital certification, attributes such as checksum and authorization are often required.
The declaration of these properties (attributes) and their corresponding semantics are defined in the context of RDF as an RDF schema. A schema defines not only the properties of the resource (e.g., title, author, subject, size, color, etc.) but may also define the kinds of resources being described (books, Web pages, people, companies, etc.). The type system is specified in terms of the basic RDF data model, as resources and properties. Thus, the resources constituting this system become part of the RDF model of any description that uses them.

The schema specification language is a declarative representation language influenced by ideas from knowledge representation (e.g., semantic nets, frames, predicate logic) as well as database schema specification languages and graph data models. The RDF schema specification language is less expressive and simpler to implement than full predicate calculus languages.

RDF adopts a modular approach to metadata that can be considered an implementation of the Warwick Framework. RDF represents an evolution of the Warwick Framework model in that the Warwick Framework allows each metadata vocabulary to be represented in a different syntax. In RDF, all vocabularies are expressed within a single well-defined model. This allows for a finer grained mixing of machine-processable vocabularies and addresses the need to create metadata in which statements can draw upon multiple vocabularies that are managed in a decentralized fashion by independent communities of expertise.

RDF Schemas may be contrasted with XML DTDs and XML Schemas. Unlike an XML DTD or Schema, which gives specific constraints on the structure of an XML document, an RDF Schema provides information about the interpretation of the statements given in
an RDF data model. While an XML Schema can be used to validate the syntax of an RDF/XML expression, a syntactic schema alone is not sufficient for RDF purposes. RDF Schemas may also specify constraints that should be followed by these data models.

The RDF Schema specification was directly influenced by consideration of the following problems:

• Platform for Internet Content Selection (PICS): The RDF Model and Syntax is adequate to represent PICS labels; however, it does not provide a general-purpose mapping from PICS rating systems into an RDF representation.
• Simple web metadata: An application for RDF is in the description of Web pages. This is one of the basic goals of the Dublin Core Metadata Initiative. The Dublin Core Element Set is a set of 15 elements believed to be broadly applicable to describing Web resources to enable their discovery. The Dublin Core has been a major influence on the development of RDF. An important consideration in the development of the Dublin Core was to not only allow simple descriptions but also to provide the ability to qualify descriptions in order to provide both domain-specific elaboration and descriptive precision. The RDF Schema specification provides a machine-understandable system for defining schemas for descriptive vocabularies like the Dublin Core. It allows designers to specify classes of resource types and properties to convey descriptions of those classes, relationships between those properties and classes, and constraints on the allowed combinations of classes, properties, and values.
• Sitemaps and concept navigation: A sitemap is a hierarchical description of a Web site. Subject taxonomy is a classification system that might be used by content creators or trusted third parties to organize or classify Web resources. The RDF Schema specification provides a mechanism for defining the vocabularies needed for such applications.
Thesauri and library classification schemes are examples of hierarchical systems for representing subject taxonomies in terms of the relationships between named concepts. The RDF Schema specification provides sufficient resources for creating RDF models that represent the logical structure of thesauri and other library classification systems.
• P3P: The World Wide Web Consortium (W3C) Platform for Privacy Preferences Project (P3P) has specified a grammar for constructing statements about a site’s data collection practices and personal preferences as exercised over those practices, as well as a syntax for exchanging structured data. Although personal data collection practices have been described in P3P using an application-specific XML tagset, there are benefits of using a general metadata model for this data. The structure of P3P policies can be interpreted as an RDF model. Using a metadata schema to describe the semantics of privacy practice descriptions will permit privacy practice data to be used along with other metadata in a query during resource discovery, and will permit a generic software agent to act on privacy metadata using the same techniques as used for other descriptive metadata. Extensions to P3P that describe the specific data elements collected by a site could use RDF Schema to further specify how those data elements are used.
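The data model behind these applications, resources described by named properties and values, can be sketched as a list of (resource, property, value) statements. The example below uses three Dublin Core element names under the Dublin Core element namespace; the page URI, the property values, and the helper `values` are hypothetical and chosen only for illustration.

```python
# Dublin Core element namespace; property names are formed by appending the element name.
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical Web resource being described.
PAGE = "http://example.org/reports/annual"

# Each statement is (resource, property, value), as in the RDF data model.
statements = [
    (PAGE, DC + "title", "Annual Report"),
    (PAGE, DC + "creator", "A. Author"),
    (PAGE, DC + "subject", "finance"),
]


def values(statements, resource, prop):
    """Return all values of the given property for the given resource."""
    return [v for (r, p, v) in statements if r == resource and p == prop]


creators = values(statements, PAGE, DC + "creator")
```

Because every property is named by a URI, statements from independently managed vocabularies can be mixed in the same list without name clashes.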
Resources may be instances of one or more classes. Classes are often organized in a hierarchical fashion; for example, a class Cat might be considered a subclass of Mammal, which is a subclass of Animal, meaning that any resource that is of type Cat is also considered to be of type Animal. This specification describes a property, rdfs:subClassOf, to denote such relationships between classes.

The RDF Schema type system is similar to the type systems of object-oriented programming languages such as Java. However, RDF differs from many such systems in that instead of defining a class in terms of the properties its instances may have, an RDF schema defines properties in terms of the classes of resource to which they apply. For example, we could define the author property to have a domain of Book and a range of Literal, whereas a classical object-oriented system may typically define a class Book with an attribute called author of type Literal. One benefit of the RDF property-centric approach is that it is very easy for anyone to say anything they want about existing resources, which is one of the architectural principles of the Web.

The following resources are the core classes that are defined as part of the RDF Schema vocabulary. Every RDF model that draws upon the RDF Schema name space (implicitly) includes these:

• rdfs:Resource: All things being described by RDF expressions are called resources and are considered to be instances of the class rdfs:Resource. The RDF class rdfs:Resource represents the set called ‘Resources’ in the formal model for RDF.
• rdf:Property: The rdf:Property represents the subset of RDF resources that are properties, that is, all the elements of the set introduced as ‘Properties’.
• rdfs:Class: This corresponds to the generic concept of a Type or Category, similar to the notion of a Class in object-oriented programming languages such as Java.
When a schema defines a new class, the resource representing that class must have an rdf:type property whose value is the resource rdfs:Class. RDF classes can be defined to represent almost anything, such as Web pages, people, document types, databases, or abstract concepts.

Every RDF model that uses the schema mechanism also (implicitly) includes the following core properties. These are instances of the rdf:Property class and provide a mechanism for expressing relationships between classes and their instances or superclasses.

• rdf:type: This indicates that a resource is a member of a class, and thus has all the characteristics that are to be expected of a member of that class. When a resource has an rdf:type property whose value is some specific class, we say that the resource is an instance of the specified class. The value of an rdf:type property for some resource is another resource that must be an instance of rdfs:Class. The resource known as rdfs:Class is itself a resource of rdf:type rdfs:Class. Individual classes (e.g., ‘Cat’) will always have an rdf:type property whose value is rdfs:Class (or some subclass of rdfs:Class).
• rdfs:subClassOf: This property specifies a subset/superset relation between classes. The rdfs:subClassOf property is transitive. If class A is a subclass of some broader class B, and B is a subclass of C, then A is also implicitly a subclass of C. Consequently,
resources that are instances of class A will also be instances of C, since A is a subset of both B and C. Only instances of rdfs:Class can have the rdfs:subClassOf property and the property value is always of rdf:type rdfs:Class. A class may be a subclass of more than one class. A class can never be declared to be a subclass of itself, nor of any of its own subclasses.

An example class hierarchy is shown in Figure 7.1. In this figure, we define a class Art. Two subclasses of Art are defined as Painting and Sculpture. We define a class Reproduction – Limited Edition, which is a subclass of both Painting and Sculpture. The arrows in Figure 7.1 point to the subclasses and the type.

RDF schemas can express constraints that relate vocabulary items from multiple independently developed schemas. Since URI references are used to identify classes and properties, it is possible to create new properties whose domain or range constraints reference classes defined in another name space. These constraints include the following:

• The value of a property should be a resource of a designated class. This is a range constraint. For example, a range constraint applying to the author property might express that the value of an author property must be a resource of class Person.
• A property may be used on resources of a certain class. This is a domain constraint. For example, a domain constraint might express that the author property could only originate from a resource that was an instance of class Book.

RDF uses the XML Name space facility to identify the schema in which the properties and classes are defined. Since changing the logical structure of a schema risks breaking other RDF models that depend on that schema, a new name space URI should be declared whenever an RDF schema is changed.

[Figure 7.1 Class hierarchy in RDF. Legend: s = rdfs:subClassOf, t = rdf:type. Nodes: rdfs:Resource, rdfs:Class, xyz:Art, xyz:Painting, xyz:Sculpture, xyz:Reproduction-Limited Edition.]
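The class hierarchy of Figure 7.1 and the transitivity of rdfs:subClassOf can be sketched with a plain dictionary. This is an illustrative model only, not an RDF implementation; the class identifiers are written as simple strings (with `xyz:Reproduction-LimitedEdition` standing for the Reproduction – Limited Edition class), and the function names `superclasses` and `is_instance_of` are assumptions of this sketch.

```python
# Direct rdfs:subClassOf statements from Figure 7.1: class -> set of direct superclasses.
SUBCLASS_OF = {
    "xyz:Painting": {"xyz:Art"},
    "xyz:Sculpture": {"xyz:Art"},
    "xyz:Reproduction-LimitedEdition": {"xyz:Painting", "xyz:Sculpture"},
}


def superclasses(cls):
    """Return every class that cls is a subclass of, following subClassOf transitively."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_instance_of(resource_types, cls):
    """True if a resource with the given rdf:type values is an instance of cls."""
    return any(t == cls or cls in superclasses(t) for t in resource_types)
```

A resource typed only as `xyz:Reproduction-LimitedEdition` is therefore also an instance of `xyz:Painting`, `xyz:Sculpture`, and `xyz:Art`, which is what a range or domain constraint naming `xyz:Art` would rely on.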
In effect, changing the RDF statements, which constitute a schema, creates a new one; new schema name spaces should have their own URI to avoid ambiguity. Since an RDF Schema URI unambiguously identifies a single version of a schema, software that uses or manages RDF (e.g., caches) should be able to safely store copies of RDF schema models for an indefinite period. The problems of RDF schema evolution share many characteristics with XML DTD version management and the general problem of Web resource versioning.

Since each RDF schema has its own unchanging URI, these can be used to construct unique URI references for the resources defined in a schema. This is achieved by combining the local identifier for a resource with the URI associated with that schema name space. The XML representation of RDF uses the XML name space mechanism for associating elements and attributes with URI references for each vocabulary item used.

The resources defined in RDF schemas are themselves Web resources and can be described in other RDF schemas. This principle provides the basic mechanism for RDF vocabulary evolution. The ability to express specialization relationships between classes (subClassOf) and between properties (subPropertyOf) provides a simple mechanism for making statements about how such resources map to their predecessors. Where the vocabulary defines properties, the same approach can be taken, using rdfs:subPropertyOf to make statements about relationships between properties defined in successive versions of an RDF vocabulary.

7.3 CC/PP – USER SIDE FRAMEWORK FOR CONTENT NEGOTIATION

RDF can be used to create a general, yet extensible framework for describing user preferences and device capabilities. This information can be provided by the user to servers and content providers. The servers can use this information describing the user’s preferences to customize the service or content provided.
The ability of RDF to reference profile information via URLs assists in minimizing the number of network transactions required to adapt content to a device, while the framework fits well into the current and future protocols.

A CC/PP is a collection of the capabilities and preferences associated with a user and the agents used by the user to access the World Wide Web. These user agents include the hardware platform, system software, and applications used by the user. User agent capabilities and preferences can be thought of as metadata or properties and descriptions of the user agent hardware and software.

A description of the user’s capabilities and preferences is necessary but insufficient to provide a general content negotiation solution. A general framework for content negotiation requires a means for describing the metadata or attributes and preferences of the user and his, her, or its agents, the attributes of the content, and the rules for adapting content to the capabilities and preferences of the user. Existing mechanisms, such as accept headers and tags, are somewhat limited. For example, the content might be authored in multiple languages with different levels of confidence in the translation and the user might be able
to understand multiple languages with different levels of proficiency. To complete the negotiation, some rule is needed for selecting a version of the document on the basis of weighing the user’s proficiency in different languages against the quality of the document’s various translations.

The CC/PP proposal describes an interoperable encoding for capabilities and preferences of user agents, specifically Web browsers. The proposal is also intended to support applications other than browsers, including e-mail, calendars, and so on. Support for peripherals such as printers and fax machines requires other types of attributes such as type of printer, location, Postscript support, color, and so on. We believe an XML/RDF-based approach would be suitable. However, metadata descriptions of devices such as printers or fax machines may use a different scheme.

The basic data model for a CC/PP is a collection of tables. Though RDF makes modeling a wide range of data structures possible, it is unlikely that this flexibility will be used in the creation of complex data models for profiles. In the simplest form, each table in the CC/PP is a collection of RDF statements with simple, atomic properties. These tables may be constructed from default settings, persistent local changes, or temporary changes made by a user. One extension to the simple table of properties data model is the notion of a separate, subordinate collection of default properties. Default settings might be properties defined by the vendor. In the case of hardware, the vendor often has a very good idea of the physical properties of any given model of product. However, the current owner of the product may be able to add options, such as memory or persistent store or additional I/O devices that add new properties or change the values of some original properties. These would be persistent local changes. An example of a temporary change would be turning sound on or off.
The profile is associated with the current network session or transaction. Each major component may have a collection of attributes or preferences. Examples of major components are the hardware platform, upon which all the software is executing, the software platform, upon which all the applications are hosted, and each of the applications.

Some collections of properties and property values may be common to a particular component. For example, a specific model of a smart phone may come with a specific CPU, screen size, and amount of memory by default. Gathering these default properties together as a distinct RDF resource makes it possible to independently retrieve and cache those properties. A collection of default properties is not mandatory, but it may improve network performance, especially the performance of relatively slow wireless networks.

From the point of view of any particular network transaction, the only property or capability information that is important is whatever is current. The network transaction does not care about the differences between defaults or persistent local changes; it only cares about the capabilities and preferences that apply to the current network transaction. Because this information may originate from multiple sources and because different parts of the capability profile may be differentially cached, the various components must be explicitly described in the network transaction.

The CC/PP is the encoding of profile information that needs to be shared between a client and a server, gateway, or proxy. The persistent encoding of profile information and the encoding for the purposes of interoperability (communication) need not be the same.
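The layering of vendor defaults, persistent local changes, and temporary changes into a single current profile can be sketched with Python's standard `collections.ChainMap`. The attribute names and values below are hypothetical and are not taken from any CC/PP vocabulary; the function name `current_profile` is likewise an assumption of this sketch.

```python
from collections import ChainMap

# Hypothetical attribute tables for one component (the hardware platform).
vendor_defaults = {"screen_width": 160, "memory_kb": 512, "sound": "on"}
persistent_changes = {"memory_kb": 1024}   # e.g. a memory upgrade by the owner
temporary_changes = {"sound": "off"}       # e.g. sound turned off for this session


def current_profile(temporary, persistent, defaults):
    """Return the attributes that apply to the current transaction.

    Lookups consult the temporary changes first, then the persistent
    changes, then the vendor defaults.
    """
    return dict(ChainMap(temporary, persistent, defaults))


profile = current_profile(temporary_changes, persistent_changes, vendor_defaults)
```

The resulting dictionary no longer records where each value came from, which mirrors the point that a network transaction cares only about what is current.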
Instead of enumerating each set of attributes, a remote reference can be used to name a collection of attributes such as the hardware platform defaults. This has the advantage of enabling the separate fetching and caching of functional subsets. It is particularly useful when the link between the gateway or proxy and the client agent is slow and the link between the gateway or proxy and the site named by the remote reference is fast, a typical case when the user agent is a smart phone. Another advantage is the simplification of the development of different vocabularies for hardware vendors and software vendors.

It is important to be able to add to and modify attributes associated with the current CC/PP. We need to be able to modify the value of certain attributes, such as turning sound on and off, and we need to make persistent changes to reflect things like a memory upgrade. We need to be able to override the default profile provided by the vendor.

When used in the context of a Web-browsing application, a CC/PP should be associated with a notion of a current session rather than a user or a node. HTTP and WSP (the WAP session protocol) define different session semantics. The client, server, gateways, and proxies may already have their own, well-defined notions of what constitutes a connection or a session. The protocol strategy is to send as little information as possible; if anyone is missing something, they have to ask for it. If there is good reason to believe that someone is going to ask for a profile, the client can elect to send the most efficient form of the profile that makes sense.

We consider the following possible interaction between a server and a client. When the client begins a session, it sends a minimal profile using as much indirection as possible. If the server/gateway/proxy does not have a CC/PP for this session, then it asks for one.
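The interaction just outlined (send a minimal profile by reference, let the receiver resolve and cache the referenced defaults, and have the receiver ask when it has nothing for the session) might be sketched as follows. This is an illustration under stated assumptions, not the protocol itself: the `Server` class, the `ProfileNeeded` exception, the dictionary layout of the minimal profile, and the `vendor.example` URL are all hypothetical, and the `fetch` callable stands in for an HTTP retrieval.

```python
class ProfileNeeded(Exception):
    """Raised when the receiver has no profile for a session and must ask."""


class Server:
    def __init__(self, fetch):
        self.fetch = fetch   # callable: reference URL -> dict of default attributes
        self.cache = {}      # referenced defaults, cached independently of sessions
        self.sessions = {}   # session id -> fully resolved profile

    def receive(self, session_id, minimal=None):
        """Resolve a minimal profile if one was sent; otherwise use the stored one."""
        if minimal is not None:
            ref = minimal["defaults_ref"]
            if ref not in self.cache:
                self.cache[ref] = self.fetch(ref)
            # Non-default attributes named by the client override the defaults.
            self.sessions[session_id] = {**self.cache[ref], **minimal["overrides"]}
        if session_id not in self.sessions:
            raise ProfileNeeded(session_id)
        return self.sessions[session_id]


fetched = []


def fake_fetch(url):
    """Stand-in for retrieving vendor defaults over the fast link."""
    fetched.append(url)
    return {"screen": "160x120", "sound": "on"}


server = Server(fake_fetch)
minimal = {"defaults_ref": "http://vendor.example/hw/model-x",
           "overrides": {"sound": "off"}}

try:
    server.receive("s1")
    asked = False
except ProfileNeeded:
    asked = True   # the server would now request a profile from the client

resolved = server.receive("s1", minimal)
```

Because the defaults are cached by reference rather than by session, a second session that names the same reference costs no additional fetch.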
When a profile is sent, the client tries a minimal form, that is, it uses as much indirection as possible and only names the nondefault attributes of the profile. The server/gateway/proxy can try to fill in the profile using the indirect HTTP references (which may be independently cached). If any of these fail, a request for additional data can be sent to the user, which can reply with a fully enumerated profile. If the client changes the value of an attribute, such as turning sound off, only that change needs to be sent.

It is likely that servers and gateways/proxies are concerned with different preferences. For example, the server may need to know which language the user prefers and the gateway may have responsibility to trim images to eight bits of color (to save bandwidth). However, the exact use of profile information by each server/gateway/proxy is hard to predict. Therefore, gateways/proxies should forward all profile information to the server. Any requests for profile information that the gateway/proxy cannot satisfy should be forwarded to the client.

The ability to compose a profile from sources provided by third parties at run-time exposes the system to a new type of attack. For example, if the URL that named the hardware platform defaults were to be compromised via an attack on the domain name system (DNS), it would be possible to load incorrect profile information. If cached within a server/gateway/proxy, this could be a serious denial-of-service attack. If this is a serious enough problem, it may be worth adding digital signatures to the URLs used to refer to profile components.

The CC/PP framework is a mechanism for describing the capabilities and preferences associated with users and user agents accessing the World Wide Web. Information about
user agents includes the hardware platform, system software, applications, and user preferences. The user agent capabilities and preferences can be thought of as metadata, or properties and descriptions of the user agent’s hardware and software. The CC/PP descriptions are intended to provide information necessary to adapt the content and the content delivery mechanisms to best fit the capabilities and preferences of the user and its agents.

The major disadvantage of this format is that it is verbose. Some networks are very slow and this would be a moderately expensive way to handle metadata. There are several optimizations possible to help deal with network performance issues. One strategy is to use a compressed form of XML, and a complementary strategy is to use references (URIs). Instead of enumerating each set of attributes, a reference can be used to name a collection of attributes such as the hardware platform defaults. This has the advantage of enabling the separate fetching and caching of functional subsets.

Another problem is to propagate changes to the current CC/PP descriptions to an origin server, a gateway, or a proxy. One solution is to transmit the entire CC/PP descriptions with each change. This is not ideal for slow networks. An alternative is to send only the changes.

The CC/PP exchange protocol does not depend on the profile format that it conveys. Therefore, another profile format besides the CC/PP description format can be applied to the CC/PP exchange protocol.

The basic requirements for the CC/PP exchange protocol are as follows:
• The transmissions of the CC/PP descriptions should be HTTP/1.1-compatible.
• The CC/PP exchange protocol should support an indirect addressing scheme based on Request For Comment RFC2396 (Generic Syntax for URIs) for referencing profile information.
• Components used to construct CC/PP descriptions, such as vendor default descriptions, should be independently cacheable.
• The CC/PP exchange protocol should provide a lightweight exchange mechanism that permits the client to avoid resending the elements of the CC/PP descriptions that have not changed since the last time the information was transmitted.

A CC/PP repository is an application program that maintains CC/PP descriptions. The CC/PP repository should be HTTP/1.0 or HTTP/1.1-compliant. The CC/PP repository is not required to comply with the CC/PP exchange protocol.

The protocol strategy is to send a request with profile information that is as limited as possible, by using references (URIs). For example, a user agent issues a request with URIs that address the profile information, and if the user agent changes the value of an attribute, such as turning sound off, only that change is sent together with the URIs. When an origin server receives the request, it queries the CC/PP repositories for the CC/PP descriptions using the list of URIs. Then the origin server creates tailored content using the fully enumerated CC/PP descriptions.

The origin server might not obtain the fully enumerated CC/PP descriptions when any one of the CC/PP repositories is not available. In this case, it depends on the implementation whether the origin server should respond to the request with tailored content,
a nontailored content, or an error. In any case, the origin server should inform the user agent of the fact. A warning mechanism is introduced for this purpose.

It is likely that an origin server, a gateway, or a proxy will be concerned with different device capabilities or user preferences. For example, the origin server may have responsibility to select content according to the user-preferred language, while the proxy may have responsibility to transform the encoding format of the content. Therefore, gateways or proxies might not forward all profile information to an origin server.

The CC/PP exchange protocol might convey natural language codes within header field values. Therefore, internationalization issues must be considered. The internationalization policy of the CC/PP exchange protocol is based on RFC2277 (IETF Policy on Character Sets and Language).

Considering how to maintain a session like real-time streaming protocol (RTSP) is worthwhile from the point of view of minimizing transactions (i.e., the session mechanism could permit the client to avoid resending the elements of the CC/PP descriptions that have not changed since the last time the information was transmitted). However, a session mechanism would reduce cache efficiency and would require maintaining state between a user agent and an origin server. The CC/PP exchange protocol is therefore designed as a session-less (stateless) protocol.

The CC/PP exchange protocol is based on the HTTP Extension Framework. The HTTP Extension Framework is a generic extension mechanism for HTTP/1.1, which is designed to interoperate with existing HTTP applications. An extension declaration is used to indicate that an extension has been applied to a message and possibly to reserve a part of the header name space identified by a header field prefix.
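To make the idea of an extension declaration that reserves a header prefix more concrete, the sketch below formats a declaration and a prefixed `Profile` field as header lines. Several things here are assumptions rather than statements from the text above: the `Opt` field name and the `ns=` parameter follow the layout of RFC 2774 as best recalled and should be checked against that specification, and the extension URI, the prefix value, and the `vendor.example` references are hypothetical placeholders.

```python
def extension_headers(extension_uri, prefix, profile_refs, declaration="Opt"):
    """Return header lines: one extension declaration plus a prefixed Profile field.

    `declaration` is the declaration header field name (assumed here to be "Opt");
    `prefix` is the header-name prefix reserved by the declaration.
    """
    quoted_refs = ", ".join(f'"{ref}"' for ref in profile_refs)
    return [
        f'{declaration}: "{extension_uri}"; ns={prefix}',
        f"{prefix}-Profile: {quoted_refs}",
    ]


# Hypothetical extension identifier and profile references.
headers = extension_headers(
    "http://example.org/ccpp-exchange",
    19,
    ["http://vendor.example/hw", "http://vendor.example/sw"],
)
for line in headers:
    print(line)
```

Because every request carries its own declaration and references, the receiver needs no per-client state, which matches the session-less design described above.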
The HTTP Extension Framework introduces two types of extension declaration strength, mandatory and optional, and two types of extension declaration scope, hop-by-hop and end-to-end. Which strength and which scope should be used depends on what the user agent needs to do.

The strength of the extension declaration should be mandatory if the user agent needs to obtain an error response when a server (an origin server, a gateway, or a proxy) does not comply with the CC/PP exchange protocol. The strength of the extension declaration should be optional if the user agent needs to obtain the nontailored content when a server does not comply with the CC/PP exchange protocol.

The scope of the extension declaration should be hop-by-hop if the user agent has a priori knowledge that the first-hop proxy complies with the CC/PP exchange protocol. The scope of the extension declaration should be end-to-end if the user agent has a priori knowledge that the first-hop proxy does not comply with the CC/PP exchange protocol, or if the user agent does not use a proxy.

The integrity and persistence of the extension should be maintained and kept unquestioned throughout the lifetime of the extension. The name space prefix is generated dynamically. The profile header field is a request-header field, which conveys a list of references that address CC/PP descriptions.

The goal of the CC/PP framework is to specify how client devices express their capabilities and preferences (the user agent profile) to the server that originates content (the origin server). The origin server uses the user agent profile to produce and deliver content appropriate to the client device. In addition to
computer-based client devices, particular attention is paid to other kinds of devices such as mobile phones.

The requirements on the framework emphasize three aspects: flexibility, extensibility, and distribution. The framework must be flexible, since we cannot today predict all the different types of devices that will be used in the future, or the ways those devices will be used. It must be extensible for the same reasons: it should not be hard to add and test new descriptions. And it must be distributed, since relying on a central registry might make it inflexible.

The basic problem that the CC/PP framework addresses is to create a structured and universal format for how a client device tells an origin server about its user agent profile. The design used to convey the profile is independent of the protocols used to transport it. It does not present mechanisms or protocols to facilitate the transmission of the profile.

The framework describes a standardized set of CC/PP attributes (a vocabulary) that can be used to express a user agent profile in terms of capabilities and the user’s preferences for the use of these capabilities. This is implemented using the XML application RDF. This enables the framework to be flexible, extensible, and decentralized, thus fulfilling the requirements.

RDF is used to express the client device’s user agent profile. The client device may be a workstation, personal computer, mobile terminal, or set-top box. When used in a request-response protocol like HTTP, the user agent profile is sent to the origin server that, subsequently, produces content that satisfies the constraints and preferences expressed in the user agent profile. The CC/PP framework may be used to convey to the client device what variations in the requested content are available from the origin server.

Fundamentally, the CC/PP framework starts with RDF and then overlays a CC/PP-defined set of semantics that describe profiles.
The CC/PP framework does not specify whether the client device or the origin server initiates this exchange of profiles. The CC/PP framework specifies the RDF usage and associated semantics that should be applied to all profiles that are being exchanged.

The HTTP use case with repository for the profile information is as follows:
1. Request from client with profile information.
2. Server resolves and retrieves profile (from CC/PP repository in the network), and uses it to adapt the content.
3. Server returns adapted content.
4. Proxy forwards response to the client.

The notion of a proxy resolving the information and retrieving it from a repository might assume the use of an XML processor and encoding of the profile in XML.

In case the document contains a profile, the above could still apply. However, there will be some interactions inside the server, as the client profile information needs to be matched with the document profile. The interactions in the server are not defined.

The document profile use case is as follows:
1. Request (extended method) with profile information.
2. Document profile is matched against device profile to derive optimum representation.
3. Document is adapted.
4. Response to the client with adapted content.

The mobile environment requires small messages and has a much narrower bandwidth than fixed environments. When a user agent profile is used with a WAP device, the scenario is as follows:
1. WSP request with profile information or difference relative to a specified default.
2. Gateway caches WSP header, composes the current profile (using the cached header as defaults and diffs from the client). The user agent profile values can change at setup or resume of session.
3. Gateway passes request to server using extended HTTP method.
4. Server returns adapted information.
5. Response in WSP with adapted content.

The user agent profile is transmitted as a parameter of the WSP session to the WAP gateway and cached; it is then transferred over HTTP using the CC/PP Exchange Protocol, which is an application of the HTTP Extension Framework.

The WAP system uses wireless markup language (WML) as its content format, not HTML. This is an XML application, and the adaptation could, for instance, be transformation from another XML format into WML.

The Conneg (Content Negotiation) working group in the IETF has developed a form of media feature descriptors, which are registered with the Internet Assigned Numbers Authority (IANA). Like the CC/PP format and vocabulary, this is intended to be independent of protocol. The Conneg working group also defined a matching semantics based on constraints.

The Conneg framework defines an IANA registry for feature tags, which are used to label media feature values associated with the presentation of data (e.g., display resolution, color capabilities, audio capabilities, etc.). To describe a profile, Conneg uses predicate expressions (feature predicates) on collections of media feature values (feature collection) as an acceptable set of media feature combinations (feature set).
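The relationship between feature collections, feature predicates, and feature sets can be illustrated with a small sketch. It models a feature collection as a dictionary and a feature predicate as a function returning true or false; the feature set is then simply the collections the predicate accepts. The helper names (`eq`, `le`, `all_of`) and the particular tags and values are illustrative assumptions, and this is not the Conneg expression syntax itself.

```python
def eq(tag, value):
    """Predicate: the feature `tag` is present and equals `value`."""
    return lambda features: features.get(tag) == value


def le(tag, limit):
    """Predicate: the feature `tag` is present and does not exceed `limit`."""
    return lambda features: tag in features and features[tag] <= limit


def all_of(*predicates):
    """Predicate: logical AND of the given predicates."""
    return lambda features: all(p(features) for p in predicates)


# Illustrative receiver capabilities: at most 640 pixels wide, limited color.
receiver = all_of(le("pix-x", 640), eq("color", "limited"))

small_image = {"pix-x": 320, "color": "limited"}   # a feature collection
wide_image = {"pix-x": 800, "color": "limited"}    # another feature collection

accepts_small = receiver(small_image)
accepts_wide = receiver(wide_image)
```

Combining two profiles with `all_of` yields a predicate that accepts only the feature collections acceptable to both.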
The same basic framework is applied to describe receiver and sender capabilities and preferences, and also document characteristics. Profile matching is performed by finding the feature set that matches two (or more) profiles. This involves finding the feature predicate that is equivalent to the logical-AND of the predicates being matched.

Conneg is protocol independent, but can be used for server-initiated transactions, for example:
1. Server sends to proxy.
2. Proxy retrieves profile from client (or checks against a cache).
3. Client returns profile.
4. Proxy formats information and forwards it.

The TV/broadcast use case describes a push situation, in which a broadcaster sends out an information set to a device without a back channel. The server cannot get capabilities for all devices, so it broadcasts a minimum set of elements or a multipart document, which
is then adapted to the optimal presentation for the device. Television manufacturers desire to turn their appliances into interactive devices. This effort is based on the use of extensible HTML (XHTML) as the language for content representation, which, for instance, enables the use of content profiles as seen. A television set does not have local intelligence of its own and does not allow for bidirectional communication with the origin server. This architecture also applies to several different device classes, such as pagers, e-mail clients, and other similar devices.

It is not the case that they are entirely without interaction, however. In reality, these devices follow a split-client model, in which the broadcaster, cable head-end, or similar entity interacts with the origin server and sends a renderable version of the content to the part of the client that resides at the user site.

There are also use cases in which the entire data set is downloaded into the client, and the optimal rendering is constructed there, for instance, in a set-top box. In these cases, the CC/PP client profile will need to be matched against a document profile representing the author’s preferences for the rendering of the document.

The protocol interactions are as follows:
1. Document is pushed to the client including alternate information and document profile.
2. Client matches the rules in the document profile and its own profile.
3. The client adapts content to its optimal presentation using the derived intersection of the two sets.

When a request for content is made by a user agent to an origin server, a CC/PP profile describing the capabilities and preferences is transmitted along with the request. It is possible that intermediate network elements such as gateways and transcoding proxies that have additional capabilities might be able to translate or adapt the content before rendering it to the device.
Such capabilities are not known to the user agents and therefore cannot be included in the original profile. However, these capabilities would need to be conveyed to the origin server or proxy serving/generating the content. In some instances, the profile information provided by the requesting client device may need to be overridden or augmented.

The CC/PP framework must therefore support the ability for such proxies and gateways to assert their capabilities using the existing vocabulary or extensions thereof. This can be done as amendments or overrides to the profile included in the request. Given the use of XML as the base format, these can be in-line references to be downloaded from a repository as the profile is resolved.

The protocol interactions are as follows:
1. The CC/PP-compliant user agent requests content with the profile.
2. The transcoding proxy appends additional capabilities (profile segment), or overrides the default values, and forwards the profile to the network.
3. The origin server constructs the profile and generates adapted content.
4. The transcoding proxy transcodes the content received on the basis of its abilities, and forwards the resulting customized content to the device for rendering.

The foundation of RDF is a model for representing named properties and property values. The RDF model draws on principles from various data representation communities.
RDF properties may be thought of as attributes of resources and in this sense correspond to traditional attribute-value pairs. RDF properties also represent relationships between resources and an RDF model can therefore resemble an entity-relationship diagram. In object-oriented design terminology, resources correspond to objects and properties correspond to instance variables.

The RDF data model is a syntax-neutral way of representing RDF expressions. The data model representation is used to evaluate equivalence in meaning. Two RDF expressions are equivalent if and only if their data model representations are the same. This definition of equivalence permits some syntactic variation in expression without altering the meaning.

The basic data model consists of three object types:
• Resources: Resources are described by RDF expressions. A resource may be an entire Web page or a part of a Web page, for example, a specific HTML or XML element within the document source. A resource may also be a whole collection of pages, for example, an entire Web site. A resource may also be an object that is not directly accessible via the Web, for example, a printed book. Anything can have a URI; the extensibility of URIs allows the introduction of identifiers for any entity.
• Properties: A property is a specific aspect, characteristic, attribute, or relation used to describe a resource. Each property has a specific meaning, defines its permitted values, the types of resources it can describe, and its relationship with other properties.
• Statements: A specific resource together with a named property plus the value of that property for that resource is an RDF statement. These three individual parts of a statement are called the subject, the predicate, and the object, respectively.
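The three object types can be modeled directly as a small data structure. The sketch below is only an illustration of the subject/predicate/object shape of a statement; the class names, the `values_of` helper, and every URI in it are hypothetical and are not taken from any RDF library or vocabulary.

```python
from typing import NamedTuple, Union


class Resource(NamedTuple):
    """Anything identified by a URI."""
    uri: str


class Statement(NamedTuple):
    """One RDF statement: subject, predicate (property URI), and object."""
    subject: Resource
    predicate: str
    obj: Union[Resource, str]   # another resource or a literal value


# Hypothetical resources and property URIs, for illustration only.
page = Resource("http://www.example.org/index.html")
site = Resource("http://www.example.org/")

statements = [
    Statement(page, "http://example.org/terms/creator", "A. Author"),
    Statement(page, "http://example.org/terms/partOf", site),
]


def values_of(stmts, subject, predicate):
    """Return every object recorded for the given subject and property."""
    return [s.obj for s in stmts if s.subject == subject and s.predicate == predicate]


creators = values_of(statements, page, "http://example.org/terms/creator")
parents = values_of(statements, page, "http://example.org/terms/partOf")
```

The first statement has a literal as its object and the second has another resource, which is what lets a set of statements form a graph rather than a flat table.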
The object of a statement (i.e., the property value) can be another resource or it can be a literal, that is, a resource (specified by a URI) or a simple string or other primitive datatype defined by XML. In RDF terms, a literal may have content that is XML markup but is not further evaluated by the RDF processor. There are some syntactic restrictions on how markup in literals may be expressed.

RDF properties may be thought of as attributes of resources and in this sense correspond to traditional attribute-value pairs. RDF properties also represent relationships between resources. As such, the RDF data model can resemble an entity-relationship diagram. The RDF data model, however, provides no mechanisms for declaring these properties, nor does it provide any mechanisms for defining the relationships between these properties and other resources. That is the role of RDF Schema.

Each RDF schema is identified by its own static URI. The schema’s URI can be used to construct unique URI references for the resources defined in a schema. This is achieved by combining the local identifier for a resource with the URI associated with that schema name space. The XML representation of RDF uses the XML name space mechanism for associating elements and attributes with URI references for each vocabulary item used.

A CC/PP profile describes client capabilities in terms of a number of CC/PP attributes or features. Each of these features is identified by a name in the form of a URI. A collection of such names used to describe a client is called a vocabulary.

CC/PP defines a small, core set of features that are applicable to a wide range of user agents and that provide a broad indication of a client’s capabilities. This is called the core
vocabulary. It is expected that any CC/PP processor will recognize all the names in the core vocabulary, together with an arbitrary number of additional names drawn from one or more extension vocabularies.

When using names from the core vocabulary or an extension vocabulary, it is important that all system components (clients, servers, proxies, etc.) that generate or interpret the names apply a common meaning to the same name. It is preferable that different components use the same name to refer to the same feature, even when they are a part of different applications, as this improves the chances of effective interworking across applications that use capability information.

Within an RDF expression describing a device, a vocabulary name appears as the label on a graph edge linking a resource to a value for the named attribute. The attribute value may be a simple string value, or another resource, with its own attributes representing the component parts of a composite value.

Vocabulary extensions are used to identify more detailed information than can be described using the core vocabulary. Any application or operational environment that uses CC/PP may define its own vocabulary extensions, but wider interoperability is enhanced if vocabulary extensions are defined that can be used more generally, for example, a standard extension vocabulary for imaging devices, or voice messaging devices, or wireless access devices, and so on. Any CC/PP expression can use terms drawn from an arbitrary number of different vocabularies, so there is no restriction caused by reusing terms from an existing vocabulary rather than defining new names to identify the same information.

CC/PP attribute names are in the form of a URI. Any CC/PP vocabulary is associated with an XML name space, which combines a base URI with a local XML element name (or XML attribute name) to yield a URI corresponding to an element name.
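The construction of attribute names from a name space base URI and a local name, and the way a processor separates names it recognizes from those it does not, can be sketched as follows. All name space URIs and attribute names here (`CORE`, `IMAGING`, `screenSize`, `dpi`) are hypothetical stand-ins, not the actual CC/PP core vocabulary.

```python
def term_uri(namespace_base, local_name):
    """Combine a name space base URI with a local name to form an attribute URI."""
    return namespace_base + local_name


# Hypothetical name spaces, for illustration only.
CORE = "http://example.org/ccpp-core#"
IMAGING = "http://vendor.example/imaging#"

profile = {
    term_uri(CORE, "screenSize"): "160x120",
    term_uri(IMAGING, "dpi"): "300",
}


def split_by_vocabulary(attributes, known_namespaces):
    """Separate attributes whose names fall in a known name space from the rest."""
    known, unknown = {}, {}
    for name, value in attributes.items():
        bucket = known if any(name.startswith(ns) for ns in known_namespaces) else unknown
        bucket[name] = value
    return known, unknown


# A processor that only understands the core vocabulary.
recognized, unrecognized = split_by_vocabulary(profile, [CORE])
```

Since each name is a full URI, two vocabularies can use the same local name without colliding, which is why terms from several vocabularies can be mixed in one profile.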
Thus, CC/PP vocabulary terms are constructed from an XML name space base URI and a local attribute name.

Anyone can define and publish a CC/PP vocabulary extension (assuming administrative control or allocation of a URI for an XML name space). For such a vocabulary to be useful, it must be interpreted in the same way by communicating entities. Thus, use of an existing extension vocabulary or publication of a new vocabulary definition containing detailed descriptions of the various CC/PP attribute names is encouraged wherever possible. Many extension vocabularies will be drawn from existing applications and protocols.

CC/PP expresses the user agent capabilities and how the user wants to use them. XHTML document profiles express the required functionalities for what the author perceives as optimal rendering and how the author wants them to be used. We regard the CC/PP format as the common format, to which other profile formats have been mapped.

The interactions are as follows:
1. Request (extended method) with profile information.
2. Profile translation (this refers to functional elements. The entire process can also take place in the origin server).
3. Schema for document profile is retrieved (from a repository or other entity).
4. Server resolves mappings and creates an intermediary CC/PP schema for the matching.
5. Document profile is matched against device profile to derive optimum representation.
6. Document is adapted.
7. Response to client with adapted content.

Depending on the format of the document profile, the translation can be done in different ways. In the case of a dedicated XML-based format, mapping the XML Schema for the dedicated format to the schema for RDF will allow the profile to be expressed as RDF by the translating entity. In the case of a non-XML-based format, a one-to-one mapping will have to be provided for the translation.

7.4 CC/PP EXCHANGE PROTOCOL BASED ON THE HTTP EXTENSION FRAMEWORK

The CC/PP framework is a mechanism for describing the capabilities and preferences associated with users and user agents accessing the World Wide Web. Information about user agents includes the hardware platform, system software, applications, and user preferences (P3P). The user agent capabilities and preferences can be thought of as metadata, or properties and descriptions of the user agent’s hardware and software. The CC/PP descriptions are intended to provide information necessary to adapt the content and the content delivery mechanisms to best fit the capabilities and preferences of the user and its agents.

Instead of enumerating each set of attributes, a reference can be used to name a collection of attributes such as the hardware platform defaults. This has the advantage of enabling the separate fetching and caching of functional subsets.

Another problem is to propagate changes to the current CC/PP descriptions to an origin server, a gateway, or a proxy. One solution is to transmit the entire CC/PP descriptions with each change. This is not ideal for slow networks. An alternative is to send only the changes.

The CC/PP exchange protocol does not depend on the profile format that it conveys. Therefore, another profile format besides the CC/PP description format can be applied to the CC/PP exchange protocol.
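The alternative of sending only the changes, rather than the entire description, amounts to computing a difference on the sending side and applying it on the receiving side. The following is a minimal sketch that assumes the profile can be treated as a flat table of attribute values; the function names and the attribute names are hypothetical, and the real CC/PP descriptions are RDF rather than dictionaries.

```python
def profile_diff(previous, current):
    """Return (changed, removed): attributes to set and attribute names to drop."""
    changed = {k: v for k, v in current.items()
               if k not in previous or previous[k] != v}
    removed = [k for k in previous if k not in current]
    return changed, removed


def apply_diff(previous, changed, removed):
    """Rebuild the current profile from the previous one plus a diff."""
    result = {k: v for k, v in previous.items() if k not in removed}
    result.update(changed)
    return result


# Hypothetical attributes: the user turns sound off; nothing else changes.
before = {"sound": "on", "memory_mb": 16, "screen": "160x120"}
after = {"sound": "off", "memory_mb": 16, "screen": "160x120"}

changed, removed = profile_diff(before, after)
rebuilt = apply_diff(before, changed, removed)
```

Only the one changed attribute has to cross the slow link; the receiver reconstructs the full profile from what it already holds.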
The basic requirements for the CC/PP exchange protocol are as follows:
1. The transmissions of the CC/PP descriptions should be HTTP/1.1-compatible.
2. The CC/PP exchange protocol should support an indirect addressing scheme based on RFC2396 for referencing profile information.
3. Components used to construct CC/PP descriptions, such as vendor default descriptions, should be independently cacheable.
4. The CC/PP exchange protocol should provide a lightweight exchange mechanism that permits the client to avoid resending the elements of the CC/PP descriptions that have not changed since the last time the information was transmitted.

For example, a user agent issues a request with URIs that address the profile information, and if the user agent changes the value of an attribute, such as turning sound off, only that change is sent together with the URIs. When an origin server receives the request, it queries the CC/PP repositories for the CC/PP descriptions using the
list of URIs. Then the origin server creates tailored content using the fully enumerated CC/PP descriptions.

The origin server might not obtain the fully enumerated CC/PP descriptions when any one of the CC/PP repositories is not available. In this case, it depends on the implementation whether the origin server should respond to the request with tailored content, nontailored content, or an error. In any case, the origin server should inform the user agent of the fact. A warning mechanism is introduced for this purpose.

It is likely that an origin server, a gateway, or a proxy will be concerned with different device capabilities or user preferences. For example, the origin server may have responsibility to select content according to the user-preferred language, while the proxy may have responsibility to transform the encoding format of the content. Therefore, gateways or proxies might not forward all profile information to an origin server.

The CC/PP exchange protocol is based on the HTTP Extension Framework. The HTTP Extension Framework is a generic extension mechanism for HTTP/1.1, which is designed to interoperate with existing HTTP applications. An extension declaration is used to indicate that an extension has been applied to a message and possibly to reserve a part of the header name space identified by a header field prefix.

The HTTP Extension Framework introduces two types of extension declaration strength, mandatory and optional, and two types of extension declaration scope, hop-by-hop and end-to-end. Which strength and which scope should be used depends on what the user agent needs to do. The strength of the extension declaration should be mandatory if the user agent needs to obtain an error response when a server (an origin server, a gateway, or a proxy) does not comply with the CC/PP exchange protocol.
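The choice of strength and scope for an extension declaration can be written as a small decision helper. The decision rules come from the requirements stated for the user agent; the header field names (`Man`, `C-Man`, `Opt`, `C-Opt`) are not given in the text and are taken from RFC 2774 as an assumption that should be checked against that specification. The function and parameter names are hypothetical.

```python
# Declaration header field names, assumed from RFC 2774 (not stated in the text).
DECLARATION_HEADERS = {
    ("mandatory", "end-to-end"): "Man",
    ("mandatory", "hop-by-hop"): "C-Man",
    ("optional", "end-to-end"): "Opt",
    ("optional", "hop-by-hop"): "C-Opt",
}


def choose_declaration(wants_error_if_unsupported, first_hop_proxy_complies):
    """Pick strength and scope from what the user agent needs and knows.

    wants_error_if_unsupported: True if a noncompliant server should yield an
        error rather than nontailored content.
    first_hop_proxy_complies: True only if the user agent knows a priori that
        its first-hop proxy complies; False if it does not or if no proxy is used.
    """
    strength = "mandatory" if wants_error_if_unsupported else "optional"
    scope = "hop-by-hop" if first_hop_proxy_complies else "end-to-end"
    return strength, scope, DECLARATION_HEADERS[(strength, scope)]


# A user agent without a proxy that insists on tailored content or an error.
strength, scope, header = choose_declaration(True, False)
```

Separating the decision from the header formatting keeps the policy (what the user agent needs) independent of the wire syntax.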
The strength of the extension declaration should be optional if the user agent needs to obtain the nontailored content when a server does not comply with the CC/PP exchange protocol.

The scope of the extension declaration should be hop-by-hop if the user agent has a priori knowledge that the first-hop proxy complies with the CC/PP exchange protocol. The scope of the extension declaration should be end-to-end if the user agent has a priori knowledge that the first-hop proxy does not comply with the CC/PP exchange protocol, or if the user agent does not use a proxy.

The absoluteURI in the Profile header field addresses an entity of a CC/PP description, which exists in the World Wide Web. CC/PP descriptions may originate from multiple sources (e.g., hardware vendors, software vendors, etc.). A CC/PP description that is provided by a hardware vendor or a software vendor should be addressed by an absoluteURI. A user agent issues a request with these absoluteURIs in the Profile header instead of sending whole CC/PP descriptions, which reduces the amount of data transmitted. The syntax of the absoluteURI must conform to RFC2396.

The scenario of mandatory and end-to-end using the CC/PP exchange protocol is as follows:
1. The user agent issues a mandatory extension request.
2. The origin server examines the extension declaration header and determines whether it is supported for this message; if not, it responds with the Not Extended status code.