The New C Standard- P11

Compound literals

Commentary
All objects defined outside the body of a function have static storage duration. The storage for such objects is initialized before program startup, so their initializers can only consist of constant expressions. This constraint differs from the equivalent one for initializers only by being framed in terms of “occurring outside the body of a function” rather than “an object that has static storage duration.”
Cross-references: 1065 compound literal outside function body; 455 static storage duration; 151 static storage duration initialized before startup; 1644 initializer static storage duration object.

Semantics

1057 A postfix expression that consists of a parenthesized type name followed by a brace-enclosed list of initializers is a compound literal.

Commentary
This defines the term compound literal. A compound literal differs from an initializer list in that it can occur outside of an object definition. Because there need be no associated type definition, a type name must be specified (for initializers the type is obtained from the type of the object being initialized).
Cross-references: 1641 initialization syntax.

Other Languages
A form of compound literal is supported in some languages (e.g., Ada, Algol 68, CHILL, and Extended Pascal). These languages do not always require a type name to be given. The type of the parenthesized list of expressions is deduced from the context in which it occurs.

Coding Guidelines
From the coding guideline point of view, the use of compound literals appears fraught with potential pitfalls, including the use of the term compound literal, which suggests a literal value, not an unnamed object. However, this construct is new in C99 and there is not yet sufficient experience in its use to know whether any specific guideline recommendations might apply to it.
Cross-references: 1066 compound literal inside function body; 1061 compound literal is lvalue.
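As a minimal sketch of the definition above, the following shows a parenthesized type name followed by a brace-enclosed initializer list used where no object definition is present. The names struct point, sum_point, literal_as_argument, and literal_as_array are hypothetical and chosen only for illustration.

```c
#include <assert.h>

struct point { int x; int y; };

/* Hypothetical helper: adds the two members of a point passed by value. */
static int sum_point(struct point p)
{
    return p.x + p.y;
}

/* The argument is an unnamed struct point created by a compound literal. */
int literal_as_argument(void)
{
    return sum_point((struct point){3, 4});
}

/* The compound literal has array type; it converts to a pointer to its first element. */
int literal_as_array(void)
{
    int *p = (int [3]){10, 20, 30};

    assert(p[1] == 20);
    return p[0] + p[1] + p[2];
}
```

In both functions the type name is required because, unlike an initializer in a declaration, there is no declared object from which the type could be obtained.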
1058 It provides an unnamed object whose value is given by the initializer list.81)

Commentary
The difference between this kind of unnamed object and that created by a call to a memory allocation function (e.g., malloc) is that its definition includes a type and it has a storage duration other than allocated (i.e., either static or automatic).

Other Languages
Some languages treat their equivalent of compound literals as just that, a literal. For instance, like other literals, it is not possible to take their address.

Common Implementations
In those cases where a translator can deduce that storage need not be allocated for the unnamed object, the as-if rule can be used, and it need not allocate any storage. This situation is likely to occur for compound literals because, unless their address is taken (explicitly using the address-of operator, or in the case of an array type implicit conversion to pointer type), they are only assigned a value at one location in the source code. At their point of definition, and use, a translator can generate machine code that operates on their constituent values directly rather than copying them to an unnamed object and operating on that.

Coding Guidelines
Guideline recommendations applicable to the unnamed object are the same as those that apply to objects having the same storage duration; for instance, the guideline recommendation dealing with assigning the address of objects to pointers.
Cross-references: 1088.1 object address assigned.

Example
The following example not only requires that storage be allocated for the unnamed object created by the compound literal, but that the value it contains be reset on every iteration of the loop.

June 24, 2009 v 1.2
    struct s_r {
        int mem;
    };

    extern void glob(struct s_r *);

    void f(void)
    {
        struct s_r *p_s_r;

        do {
            glob(p_s_r = &((struct s_r){1}));
            /*
             * Instead of writing the above we could have written:
             *    struct s_r unnamed_s_r = {1};
             *    glob(p_s_r = &unnamed_s_r);
             * which assigns 1 to the member on every iteration, as
             * part of the process of defining the object.
             */
            p_s_r->mem++; /* Increment value held by unnamed object. */
        } while (p_s_r->mem != 10);
    }

1059 If the type name specifies an array of unknown size, the size is determined by the initializer list as specified in 6.7.8, and the type of the compound literal is that of the completed array type.

Commentary
This behavior is discussed elsewhere.
Cross-references: 1683 array of unknown size initialized.

Coding Guidelines
Some of the issues involved in declaring arrays having an unknown size are discussed elsewhere.
Cross-references: 1573 array incomplete type.

1060 Otherwise (when the type name specifies an object type), the type of the compound literal is that specified by the type name.

Commentary
Presumably this is the declared type of the unnamed object initialized by the initializer list and therefore also its effective type.
Cross-references: 948 effective type.

1061 In either case, the result is an lvalue.

Commentary
While the specification for a compound literal meets the requirements needed to be an lvalue, wording elsewhere might be read to imply that the result is not an lvalue. This specification clarifies the behavior.
Cross-references: 721 lvalue; 725 lvalue converted to value.

Other Languages
Some languages consider their equivalent of compound literals to be just that, literals. For such languages the result is an rvalue.
Cross-references: 736 rvalue.

1062 81) Note that this differs from a cast expression.

Commentary
A cast operator takes a single scalar value (if necessary any lvalue is converted to its value) as its operand and returns a value as its result.
Coding Guidelines
Developers are unlikely to write expressions such as (int){1} when (int)1 had been intended (on standard US PC-compatible keyboards the pair of characters ( { and the pair ) } appear on four different keys). Such usage may occur through the use of parameterized macros. However, at the time of this writing there is insufficient experience with the use of this new language construct to know whether any guideline recommendation is worthwhile.

Example
The following all assign a value to loc. The first two assignments involve an lvalue to value conversion. In the second two assignments the operand being assigned is already a value.

    extern int glob = 1;

    void f(void)
    {
        int loc;

        loc=glob;
        loc=(int){1};

        loc=2;
        loc=(int)2;
    }

1063 For example, a cast specifies a conversion to scalar types or void only, and the result of a cast expression is not an lvalue.

Commentary
These are restrictions on the types and operands of such an expression and one property of its result.
Cross-references: 1134 cast scalar or void type; 1131 footnote 85.

Example

    &(int)x;   /* Constraint violation. */
    &(int){x}; /* Address of an unnamed object containing the current value of x. */

1064 The value of the compound literal is that of an unnamed object initialized by the initializer list.

Commentary
The distinction between a compound literal acting as if the initializer list was its value, and an unnamed object (initialized with values from the initializer list) being its value, is only apparent when the address-of operator is applied to it. The creation of an unnamed object does not mean that locally allocated storage is a factor in this distinction. Implementations of languages where compound literals are defined to be literals sometimes use locally allocated temporary storage to hold their values. C implementations may find they can optimize away allocation of any actual unnamed storage.
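The distinction described in the commentary above can be sketched as follows. This is an illustrative fragment only; the function name modify_unnamed_copy is hypothetical. Taking the address of the compound literal yields a pointer to an unnamed object that is separate from x, so modifying it leaves x alone.

```c
#include <assert.h>

/* Returns the value of x after the unnamed object has been modified. */
int modify_unnamed_copy(void)
{
    int x = 5;
    int *p = &(int){x};   /* address of an unnamed int initialized with the value of x */

    *p += 1;              /* modifies the unnamed object, not x */
    assert(*p == 6);

    /* Writing &(int)x instead would be a constraint violation:
       the result of a cast is not an lvalue. */
    return x;
}
```

The unnamed object has automatic storage duration associated with the function body, so p remains valid until the function returns.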
Common Implementations
If a compound literal occurs in a context where its value is required (e.g., assignment) there are obvious opportunities for implementations to use the values of the initializer list directly. C99 is still too new to know whether most implementations will make use of this optimization.
Coding Guidelines
The distinction between the value of a compound literal being an unnamed object and being the values of the initializer list could be viewed as an unnecessary complication that is not worth educating a developer about. Until more experience has been gained with the kinds of mistakes developers make with compound literals, it is not possible to recommend any guidelines.

Example

    #include <string.h>

    struct TAG {
        int mem_1;
        float mem_2;
    };

    struct TAG o_s1 = (struct TAG){1, 2.3};

    void f(void)
    {
        memcpy(&o_s1, &(struct TAG){4, 5.6}, sizeof(struct TAG));
    }

1065 If the compound literal occurs outside the body of a function, the object has static storage duration;

Commentary
This specification is consistent with how other object declarations, outside of function bodies, behave. The storage duration of a compound literal is based on the context in which it occurs, not whether its initializer list consists of constant expressions.
Cross-references: 448 storage duration object.

    struct s_r {
        int mem;
    };

    static struct s_r glob = {4};
    static struct s_r col = (struct s_r){4};   /* Constraint violation. */
    static struct s_r *p_g = &(struct s_r){4};

    void f(void)
    {
        static struct s_r loc = {4};
        static struct s_r col = (struct s_r){4};    /* Constraint violation. */
        static struct s_r *p_l = &(struct s_r){4};  /* Constraint violation. */
    }

Other Languages
The storage duration specified by other languages, which support some form of compound literal, varies. Some allow the developer to choose (e.g., Algol 68), others require them to be dynamically allocated (e.g., Ada), while in others (e.g., Fortran and Pascal) the issue is irrelevant because it is not possible to obtain their address.

otherwise, it has automatic storage duration associated with the enclosing block.
1066 (compound literal inside function body)

Commentary
A parallel can be drawn between an object definition that includes an initializer and a compound literal (that is, the definition of an unnamed object). The lifetime of the associated objects starts when the block that
contains their definition is entered. However, the objects are not assigned their initial value, if any, until the declaration is encountered during program execution.
Cross-references: 458 object lifetime from entry to exit of block; 462 initialization performed every time declaration reached.

The unnamed object associated with a compound literal is initialized each time the statement that contains it is encountered during program execution. Previous invocations, which may have modified the value of the unnamed object, or nested invocations in a recursive call, do not affect the value of the newly created object.
Cross-references: 1711 object initializer evaluated when; 1026 function call recursive.

Storage for the unnamed object is created on block entry. Executing a statement containing a compound literal does not cause any new storage to be allocated. Recursive calls to a function containing a compound literal will cause different storage to be allocated, for the unnamed object, for each nested call.
Cross-references: 1078 EXAMPLE compound literal single object.

    #include <stddef.h>

    struct foo {
        struct foo *next;
        int i;
    };

    void WG14_N759(void)
    {
        struct foo *p,
                   *q;
        /*
         * The following loop ...
         */
        p = NULL;
        for (int j = 0; j < 10; j++)
        {
            q = &((struct foo){ .next = p, .i = j });
            p = q;
        }
        /*
         * ... is equivalent to the loop below.
         */
        p = NULL;
        for (int j = 0; j < 10; j++)
        {
            struct foo T;

            T.next = p;
            T.i = j;
            q = &T;
            p = q;
        }
    }

Common Implementations
To what extent is it worth trying to optimize compound literals made up of a list of constant expressions; for instance, by detecting those that are never modified, or by placing them in a static region of storage that can be copied from or pointed at? The answer to these and many other optimization issues relating to compound literals will have to wait until translator vendors get a feel for how their customers use this construct, which is new to C.
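The initialization and storage behavior described in the commentary for sentence 1066 can be sketched with two small functions. The names struct counter, reinitialized_each_time, and depth_sum are hypothetical; this is an illustration under the rules quoted above, not code from the standard.

```c
#include <assert.h>

struct counter { int n; };

/* The unnamed object is set back to 10 each time the statement is executed,
   so a modification made in one iteration is not visible in the next. */
int reinitialized_each_time(void)
{
    int total = 0;

    for (int i = 0; i < 3; i++) {
        struct counter *c = &(struct counter){10};

        c->n += i;       /* 10, 11, 12 on successive iterations */
        total += c->n;
    }
    return total;
}

/* Each nested call has its own unnamed object, so the value stored
   before the recursive call is still there afterwards. */
int depth_sum(int depth)
{
    int *slot = &(int){depth};
    int below = (depth > 0) ? depth_sum(depth - 1) : 0;

    assert(*slot == depth);
    return *slot + below;
}
```

In reinitialized_each_time the pointer c is only used inside the loop body, which is the block whose lifetime the unnamed object shares.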
Coding Guidelines
Parallels can be drawn between the unnamed object associated with a compound literal and the temporaries created in C++. Experience has shown that C++ developers sometimes assume that the lifetime of a temporary is greater than it is required to be by that language’s standard. Based on this experience it is to be expected that developers using C might make similar mistakes with the lifetime of the unnamed object associated with a compound literal. Only time will tell whether these mistakes will be sufficiently common, or serious, that the benefits of being able to apply the address-of operator to a compound literal (the operator that needs to be used to extend the range of statements over which an unnamed object can be accessed) are outweighed by the probable cost of faults.
The guideline recommendation dealing with assigning the address of an object to a pointer object, whose lifetime is greater than that of the addressed object, is applicable here.
Cross-references: 1088.1 object address assigned.

    #include <stddef.h>

    extern int glob;
    struct s_r {
        int mem;
    };

    void f(void)
    {
        struct s_r *p_s_r;

        if (glob == 0)
        {
            p_s_r = &((struct s_r){1});
        }
        else
        {
            p_s_r = &((struct s_r){2});
        }
        /* The value of p_s_r is indeterminate here. */

        /*
         * The iteration-statements all enclose their associated bodies in
         * a block. The effect of this block is to start and terminate
         * the lifetime of the contained compound literal.
         */
        p_s_r=NULL;
        while (glob < 10)
        {
            /*
             * In the following test the value of p_s_r is indeterminate
             * on the second and subsequent iterations of the loop.
             */
            if (p_s_r == NULL)
                ;
            p_s_r = &((struct s_r){1});
        }
    }

1067 All the semantic rules and constraints for initializer lists in 6.7.8 are applicable to compound literals.82)

Commentary
They are the same except

• initializer lists don’t create objects, they are simply a list of values with which to initialize an object; and
• the type is deduced from the object being initialized, not a type name.

Coding Guidelines
Many of the coding guideline issues discussed for initializers also apply to compound literals.
Cross-references: 1641 initialization syntax.

1068 String literals, and compound literals with const-qualified types, need not designate distinct objects.83)
Commentary
A strictly conforming program can deduce if an implementation uses the same object for two string literals, or compound literals, by performing an equality comparison on their addresses (an infinite number of comparisons would be needed to deduce whether an implementation always used distinct objects). This permission for string literals is also specified elsewhere.
Cross-references: 1076 EXAMPLE string literals shared; 908 string literal distinct array.

The only way a const-qualified object can be modified is by casting a pointer to it to a non-const-qualified pointer. Such usage results in undefined behavior. The undefined behavior, if the pointer was used to modify such an unnamed object that was not distinct, could also modify the values of other compound literal object values.
Cross-references: 746 pointer converting qualified/unqualified.

Other Languages
Most languages do not consider any kind of literal to be modifiable, so whether they share the same storage locations is not an issue.

Common Implementations
The extent to which developers will use compound literals having a const-qualified type, for which storage is allocated and whose values form a sharable subset with another compound literal, remains to be seen. Without such usage it is unlikely that implementors of optimizers will specifically look for savings in this area, although they may come about as a consequence of optimizations not specifically aimed at compound literals.

Example
In the following there is an opportunity to overlay the two unnamed objects containing zero values.

    const int *p1 = (const int [99]){0};
    const int *p2 = (const int [20]){0};

1069 EXAMPLE 1 The file scope definition

    int *p = (int []){2, 4};

initializes p to point to the first element of an array of two ints, the first having the value two and the second, four. The expressions in this compound literal are required to be constant. The unnamed object has static storage duration.
Commentary
This usage, rather than the more obvious int p[] = {2, 4};, can arise because the initialization value is derived through macro replacement. The same macro replacement is used in noninitialization contexts.

1070 EXAMPLE 2 In contrast, in

    void f(void)
    {
        int *p;
        /* ... */
        p = (int [2]){*p};
        /* ... */
    }

p is assigned the address of the first element of an array of two ints, the first having the value previously pointed to by p and the second, zero. The expressions in this compound literal need not be constant. The unnamed object has automatic storage duration.

Commentary
The assignment of values to the unnamed object occurs before the value of the right operand is assigned to p.
Example
The above example is not the same as declaring p to be an array.

    void f(void)
    {
        int p[2]; /* Storage for p is created by its definition. */

        /*
         * Cannot assign new object to p, can only change existing values.
         */
        p[1]=0;
    }

1071 EXAMPLE 3 Initializers with designations can be combined with compound literals. Structure objects created using compound literals can be passed to functions without depending on member order:

    drawline((struct point){.x=1, .y=1},
             (struct point){.x=3, .y=4});

Or, if drawline instead expected pointers to struct point:

    drawline(&(struct point){.x=1, .y=1},
             &(struct point){.x=3, .y=4});

Commentary
This usage removes the need to create a temporary in the calling function. The arguments are passed by value, like any other structure argument.

1072 EXAMPLE 4 A read-only compound literal can be specified through constructions like:

    (const float []){1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6}

Commentary
An implementation may choose to place the contents of this compound literal in read-only memory, but it is not required to do so. The term read-only is something of a misnomer, since it is possible to cast its address to a non-const-qualified type and assign to the pointed-to object. (The behavior is undefined, but unless the values are held in a kind of storage that cannot be modified, they are likely to be modified.)

Other Languages
Some languages support a proper read-only qualifier.

Common Implementations
On some freestanding implementations this compound literal might be held in ROM.

1073 82) For example, subobjects without explicit initializers are initialized to zero.

Commentary
This behavior reduces the volume of the visible source code when the object type includes large numbers of members or elements.
Cross-references: 1682 initializer fewer in list than members.

Coding Guidelines
Some of the readability issues applicable to statements have different priorities than those for declarations.
These are discussed elsewhere.
Cross-references: 1641 initialization syntax.

1074 83) This allows implementations to share storage for string literals and constant compound literals with the same or overlapping representations.
Commentary
The need to discuss an implementation’s ability to share storage for string literals occurs because it is possible to detect such sharing in a conforming program (e.g., by comparing two pointers assigned the addresses of two string literals that are distinct in the visible source code). The C Committee chose to permit this implementation behavior. (There were existing implementations, when the C90 Standard was being drafted, that shared storage.)

1075 EXAMPLE 5 The following three expressions have different meanings:

    "/tmp/fileXXXXXX"
    (char []){"/tmp/fileXXXXXX"}
    (const char []){"/tmp/fileXXXXXX"}

The first always has static storage duration and has type array of char, but need not be modifiable; the last two have automatic storage duration when they occur within the body of a function, and the first of these two is modifiable.

Commentary
In all three cases, a pointer to the start of storage is returned and the first 16 bytes of the storage allocated will have the same set of values. If all three expressions occurred in the same source file, the first and third could share the same storage even though their storage durations were different. Developers who see a potential storage saving in using a compound literal instead of a string literal (the storage for one only need be allocated during the lifetime of its enclosing block) also need to consider potential differences in the number of machine code instructions that will be generated. Overall, there may be no savings.
Cross-references: 1076 EXAMPLE string literals shared.

1076 EXAMPLE 6 Like string literals, const-qualified compound literals can be placed into read-only memory and can even be shared. For example,

    (const char []){"abc"} == "abc"

might yield 1 if the literals’ storage is shared.

Commentary
In this example pointers to the first element of the compound literal and a string literal are being compared for equality.
Permission to share the storage allocated for a compound literal only applies to those having a const-qualified type (there is no such restriction on string literals).
Cross-references: 1068 compound literal distinct object; 908 string literal distinct array.

Coding Guidelines
Comparing strings using an equality operator, rather than a call to the strcmp library function, is a common beginner mistake. Training is the obvious solution.

Usage
In the visible source of the .c files 0.1% of string literals appeared as the operand of the equality operator (representing 0.3% of the occurrences of this operator).

1077 EXAMPLE 7 Since compound literals are unnamed, a single compound literal cannot specify a circularly linked object. For example, there is no way to write a self-referential compound literal that could be used as the function argument in place of the named object endless_zeros below:

    struct int_list { int car; struct int_list *cdr; };
    struct int_list endless_zeros = {0, &endless_zeros};
    eval(endless_zeros);
Commentary
A modification using pointer types, and an additional assignment, creates a circularly linked list that uses the storage of the unnamed object:

    struct int_list { int car; struct int_list *cdr; };
    struct int_list *endless_zeros = &(struct int_list){0, 0};

    endless_zeros->cdr=endless_zeros; /* Let’s follow ourselves. */

The following statement would not have achieved the same result:

    endless_zeros = &(struct int_list){0, endless_zeros};

because the second compound literal would occupy a distinct object, different from the first. The value of endless_zeros in the second compound literal would be pointing at the unnamed object allocated for the first compound literal.

Other Languages
Algol 68 supports the creation of circularly linked objects (see the Other Languages subsection in the following C sentence).

1078 EXAMPLE 8 Each compound literal creates only a single object in a given scope:

    struct s { int i; };

    int f (void)
    {
        struct s *p = 0, *q;
        int j = 0;

    again:
        q = p, p = &((struct s){ j++ });
        if (j < 2) goto again;

        return p == q && q->i == 1;
    }

The function f() always returns the value 1.
Note that if an iteration statement were used instead of an explicit goto and a labeled statement, the lifetime of the unnamed object would be the body of the loop only, and on entry next time around p would have an indeterminate value, which would result in undefined behavior.

Commentary
Specifying that a single object is created helps prevent innocent-looking code consuming large amounts of storage (e.g., use of a compound literal in a loop).

Other Languages
In Algol 68 LOC creates storage for block scope objects. However, it generates new storage every time it is executed. The following allocates 1,000 objects on the stack.

    MODE M = STRUCT (REF M next, INT i);
    M p;
    INT i := 0

    again:
       p := LOC M := (p, i);
       i +:= 1;
       IF i < 1000 THEN
          GO TO again
       FI;
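For readers who want to check the C behavior stated in EXAMPLE 8, the function can be written out as a self-contained fragment. This is only a sketch: the function name single_object_example is hypothetical and the body follows the standard's example. The expected result rests on the rule that the compound literal designates one object for the whole function body, so the second execution of the statement reuses and reinitializes it.

```c
#include <assert.h>

struct s { int i; };

/* Follows EXAMPLE 8: the compound literal is executed twice via goto,
   but both executions refer to the same unnamed object. */
int single_object_example(void)
{
    struct s *p = 0, *q;
    int j = 0;

again:
    q = p, p = &((struct s){ j++ });   /* first pass: i is 0; second pass: i is 1 */
    if (j < 2)
        goto again;

    assert(p != 0);
    return p == q && q->i == 1;        /* same object, holding the latest value */
}
```

This contrasts with the Algol 68 LOC fragment, where each execution generates new storage.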
1079 Forward references: type names (6.7.6), initialization (6.7.8).

6.5.3 Unary operators

1080 (unary-expression syntax)

    unary-expression:
        postfix-expression
        ++ unary-expression
        -- unary-expression
        unary-operator cast-expression
        sizeof unary-expression
        sizeof ( type-name )

    unary-operator: one of
        & * + - ~ !

Commentary
Note that the operand of unary-operator is a cast-expression, not a unary-expression. A unary operator usually refers to an operator that takes a single argument. Technically all of the operators listed here, plus the postfix increment and decrement operators, could be considered as being unary operators.
Cross-references: 1133 cast-expression syntax.

    Unary plus was adopted by the C89 Committee from several implementations, for symmetry with unary minus. (Rationale)

Other Languages
Some languages (e.g., Ada and Pascal) specify the unary operators to have lower precedence than the multiplicative operators; for instance, -x/y is equivalent to -(x/y) in Ada, but (-x)/y in C. Most languages call all operators that take a single operand unary operators.
Cross-references: 1143 multiplicative-expression syntax.

Languages that support the unary + operator include Ada, Fortran, and Pascal. Some languages use the keyword NOT rather than !. In the case of Cobol this keyword can also appear to the left of an operator, indicating negation of the operator (i.e., NOT < meaning not less than).

Coding Guidelines
Coding guidelines need to be careful in their use of the term unary operator. Its meaning, as developers understand it, may be different from its actual definition in C. The operators in a unary-expression occur to the left of the operand. The only situation where a developer’s incorrect assumption about precedence relationships might lead to a difference between predicted and actual behavior is when a postfix operator occurs immediately to the right of the unary-expression.
Dev 943.1 Except when sizeof ( type-name ) is immediately followed visually by a token having the lexical form of an additive operator, if a unary-expression is not immediately followed by a postfix operator it need not be parenthesized.

Although the expression sizeof (int)-1 may not occur in the visible source code, it could easily occur as the result of macro replacement of the operand of the sizeof operator. This is one of the reasons behind the guideline recommendation specifying the parenthesizing of macro bodies (without parentheses the expression is equivalent to (sizeof(int))-1).
Cross-references: 1931.2 macro definition expression.

Example

    struct s {
        int x;
    };
    struct s *a;
    int x;

    void f(void)
    {
        xx;
        xx;
        xx;
        xx;

        sizeof(long)-3; /* Could be mistaken for sizeof a cast-expression. */
        (sizeof(long))-3;
        sizeof((long)-3);
    }

Usage
See the Usage section of postfix-expression for ++ and -- digraph percentages.
Cross-references: 985 postfix-expression syntax.

Table 1080.1: Common token pairs involving sizeof, unary-operator, prefix ++, or prefix -- (as a percentage of all occurrences of each token). Based on the visible form of the .c files.

    Token Sequence           % Occurrence of   % Occurrence of
                             First Token       Second Token
    ! defined                     2.0              16.7
    *v --v                        0.3               7.8
    -v floating-constant          0.3               6.7
    *v ++v                        0.5               6.3
    ! --v                         0.2               4.8
    -v integer-constant          69.0               4.1
    &v identifier                96.1               1.9
    sizeof (                     97.5               1.8
    *v identifier                86.8               1.0
    ! identifier                 81.9               0.8
    ! (                          14.5               0.5
    -v identifier                30.2               0.4
    *v (                          9.0               0.4
    ~ integer-constant           20.1               0.2
    ++v identifier               97.3               0.1
    ~ identifier                 56.3               0.1
    ~ (                          23.4               0.1
    +v integer-constant          49.0               0.0
    --v identifier               97.1               0.0

[Figure 1080.1 plot not reproduced: two panels, unary - and unary ~; vertical axis Occurrences (logarithmic, 1 to 100,000); horizontal axis Numeric value (0 to 255); separate series for decimal-constant and hexadecimal-constant operands.]

Figure 1080.1: Number of integer-constants having a given value appearing as the operand of the unary minus and unary ~ operators. Based on the visible form of the .c files.
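The sizeof parsing issue raised in the Coding Guidelines for unary-expression (Dev 943.1) can be checked with a small fragment. This is a minimal sketch; the function names sizeof_then_subtract and sizeof_of_cast are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Parsed as (sizeof(long)) - 3: the size of long, minus three. */
size_t sizeof_then_subtract(void)
{
    return sizeof(long)-3;
}

/* Here the operand of sizeof is the cast-expression (long)-3,
   so the result is simply the size of long. */
size_t sizeof_of_cast(void)
{
    return sizeof((long)-3);
}
```

The two results differ by three on any implementation, which is why the visual similarity of the two forms is worth a guideline deviation of its own.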
Table 1080.2: Occurrence of the unary-operators, prefix ++, and prefix -- having particular operand types (as a percentage of all occurrences of the particular operator; an _ prefix indicates a literal operand). Based on the translated form of this book’s benchmark programs.

    Operator   Type               %
    -v         _int             96.0
    *v         ptr-to           95.3
    +v         _int             72.2
    --v        int              54.7
    !          int              50.0
    ~          _int             49.3
    &v         other-types      45.1
    ++v        int              43.8
    ++v        ptr-to           33.3
    ~          int              28.5
    --v        unsigned int     22.1
    !          ptr-to           20.1
    --v        ptr-to           14.6
    &v         struct           13.9
    &v         char             13.1
    ++v        unsigned int     12.6
    +v         int              11.1
    !          char              9.2
    ~          unsigned long     6.8
    &v         int               6.2
    ~          unsigned int      6.0
    +v         unsigned long     5.6
    +v         long              5.6
    +v         float             5.6
    !          other-types       5.6
    ++v        unsigned long     5.2
    &v         struct *          4.9
    --v        unsigned long     4.7
    !          unsigned int      4.7
    *v         fnptr-to          4.1
    &v         unsigned long     4.0
    --v        other-types       4.0
    &v         long              3.4
    &v         unsigned int      3.0
    &v         unsigned short    2.9
    !          enum              2.9
    !          _long             2.7
    ~          unsigned char     2.5
    &v         unsigned char     2.4
    !          unsigned long     2.1
    ~          long              2.0
    ++v        unsigned char     1.9
    ~          _unsigned long    1.7
    ~          _unsigned int     1.7
    !          unsigned char     1.6
    ~          other-types       1.6
    -v         _double           1.4
    -v         other-types       1.3
    ++v        long              1.2
    -v         int               1.2
    !          _int              1.2
    ++v        unsigned short    1.1
    &v         char *            1.1

Prefix increment and decrement operators

Constraints

1081 The operand of the prefix increment or decrement operator shall have qualified or unqualified real or pointer type and shall be a modifiable lvalue.

Commentary
This constraint mirrors that for the postfix forms of these operators.
Cross-references: postfix operator operand.

C++
The use of an operand of type bool with the prefix ++ operator is deprecated (5.3.2p1); there is no corresponding entry in annex D, but the proposed response to C++ DR #145 inserted one. In the case of the decrement operator:

    The operand shall not be of type bool.
5.3.2p1

A C source file containing an instance of the prefix -- operator applied to an operand having type _Bool is likely to result in a C++ translator issuing a diagnostic.

Coding Guidelines
Enumerated types are usually thought about in symbolic rather than arithmetic terms. The increment and decrement operators can also be given a symbolic interpretation. They are sometimes thought about in terms of moving on to the next symbolic name in a list. This move to next operation relies on the enumeration constants being represented by successive numeric values. While this usage is making use of representation information, there is often a need to step through a series of symbolic names (and C provides no other built-in mechanism), for instance, iterating over the named constants defined by an enumerated type.
Cross-references: 822 symbolic name; 517 enumeration set of named constants; 1199 relational operators real operands.

Dev 569.1 The operand of a prefix increment or decrement operator may have an enumerated type, provided the enumeration constants defined by that type have successive numeric values.

Semantics
1082 The value of the operand of the prefix ++ operator is incremented.

Commentary
The ordering of this and the following C sentence is the reverse of that specified for the postfix ++ operator (see 1047).

Common Implementations
The implementation of this operator is usually very straightforward. A value is loaded into a register, incremented, and then stored back into the original object, leaving the result in the register. Some CISC processors contain instructions that increment the contents of storage directly. Processors that have a stack-based architecture either need to contain store instructions that leave the value on the stack, or be willing to pay the penalty of another load from storage.

Coding Guidelines
Translators have now progressed to the point where the optimizations many of them perform are much more sophisticated than those needed to detect the more verbose sequence of operations equivalent to the prefix ++ operator. The writers of optimizers study existing source code to find out what constructs occur frequently (they don’t want to waste time and money implementing optimizations for constructs that rarely occur). However, in existing code it is rare to see an object being incremented (or decremented) without one of these operators being used. Consequently, optimizers are unlikely to attempt to transform the C source i=i+1 into ++i (which they might have to do for, say, Pascal, which has no increment operators, requiring optimizers to analyze an expression looking for operations that are effectively an increment of an object). So the assertion that ++i can be written as i=i+1 and that it will be optimized by the translator is not guaranteed, even for a highly optimizing translator. However, this is rarely an important issue anyway; the difference in quality of generated machine code rarely has any impact on program performance.
From the coding guidelines perspective, uses of these operators can be grouped into three categories:

1. The only operator in an expression statement. In this context the result returned by the operation is ignored. The statement simply increments/decrements its operand. Use of the prefix, rather than the postfix, form does not follow the pattern seen at the start of most visible source code statement lines: an identifier followed by an operator (see Figure 940.2). A reader’s scanning of the source looking for objects that are modified will be disrupted by the initial operator. For this reason, use of the postfix form is recommended (see 1046, postfix operator constraint).

2. One of the operators in a full expression that contains other operators. It is possible to write the code so that a prefix operator does not occur in the same expression as other operators. The evaluation can be moved back before the containing expression (see the postfix operator constraint, 1046, for a fuller discussion of this point).

       ...++i...

   becomes the equivalent form:

       i++;
       ...i...

   The total cognitive effort needed to comprehend the equivalent form may be less than the prefix form, and the peak effort is likely to be less (because the operations may have been split into smaller chunks in serial rather than nested form).

3. The third point is the same as for the postfix operators (see 1046).

Cg 1082.1 The prefix operators shall not appear in an expression statement.
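The rewrite described in the second category can be sketched in a small, self-contained form. The function names, the array, and the index parameter below are hypothetical and chosen only for illustration; the sketch assumes the caller guarantees that the incremented index stays within the array.

```c
#include <stddef.h>

/* Nested form: the prefix increment is buried inside the subscript
   expression, so the reader has to track the update and the use of
   *index within a single full expression. */
int next_nested(const int *values, size_t *index)
{
    return values[++*index];
}

/* Serial form: the increment has been moved back before the containing
   expression and written with the postfix operator, so that the
   statement starts with the object being modified. */
int next_serial(const int *values, size_t *index)
{
    (*index)++;
    return values[*index];
}
```

Both functions leave the index with the same value and return the same element; only the way the work is split across statements differs.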
1083 The result is the new value of the operand after incrementation.

Other Languages
Pascal contains the succ operator. This returns the successor value (i.e., it adds one to its operand), but it does not modify the value of an object appearing as its operand.

1084 The expression ++E is equivalent to (E+=1).

Commentary
The expression ++E need not be equivalent to E=E+1 (e.g., the expression E may contain a side effect).

C++
C++ lists an exception (5.3.2p1) for the case when E has type bool. This is needed because C++ does not define its boolean type in the same way as C (see 476). The behavior of this operator on such operands is defined as a special case in C++. The final result is the same as in C.

1085 See the discussions of additive operators and compound assignment for information on constraints, types, side effects, and conversions and the effects of operations on pointers.

Commentary
The same references are given for the postfix operators (see 1050).

C++
    [Note: see the discussions of addition (5.7) and assignment operators (5.17) for information on conversions. ]
    5.3.2p1

There is no mention that the conditions described in these clauses also apply to this operator.

1086 The prefix -- operator is analogous to the prefix ++ operator, except that the value of the operand is decremented.

Commentary
The same Commentary and Coding Guidelines’ issues also apply (see 1082). See the discussion elsewhere (1052) for cases where the effects are not analogous.

C++
The prefix -- operator is not analogous to the prefix ++ operator in that its operand may not have type bool.

Other Languages
Pascal contains the pred reserved identifier. This returns the predecessor value, but does not modify the value of its operand.
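Returning to sentence 1084, the difference between ++E and E=E+1 when E contains a side effect can be made visible with a small sketch. Everything named here (calls, select_slot, bump_once, bump_twice) is hypothetical; the helper simply counts how often the operand expression is evaluated, and the array passed in is assumed to have at least two elements.

```c
static int calls;   /* number of times the operand expression was evaluated */

/* Operand expression with a side effect: each evaluation selects the
   next slot and bumps the counter. */
static int *select_slot(int *slots)
{
    ++calls;
    return &slots[calls - 1];
}

/* ++E: the operand expression is evaluated exactly once. */
int bump_once(int *slots)
{
    calls = 0;
    ++*select_slot(slots);
    return calls;
}

/* E = E + 1: the operand expression appears, and is evaluated, twice.
   The two calls are indeterminately sequenced, so which slot is read
   and which is written is unspecified; only the count is relied on. */
int bump_twice(int *slots)
{
    calls = 0;
    *select_slot(slots) = *select_slot(slots) + 1;
    return calls;
}
```

The sketch only relies on the number of evaluations, which is the point of the commentary: the long-hand form is not a drop-in replacement once the operand has side effects.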
Coding Guidelines
The guideline recommendation for the prefix ++ operator (Cg 1082.1, prefix operators in an expression statement) has been worded to apply to either operator.

1087 Forward references: additive operators (6.5.6), compound assignment (

Address and indirection operators

Constraints

1088 The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
Commentary
Bit-fields are permitted (intended even) to occupy part of a storage unit. Requiring bit addressing could be a huge burden on implementations. Very few processors support bit addressing and C is based on the byte being the basic unit of addressability.

The register storage-class specifier is only a hint to the translator. Taking the address of an object could effectively prevent a translator from keeping its value in a register. A harmless consequence, but the C Committee decided to make it a constraint violation.

C90
The words:

    . . . , the result of a [ ] or unary * operator,

are new in C99 and were added to cover the following case:

    int a[10];

    for (int *p = &a[0]; p < &a[10]; p++)
       /* ... */

where C90 requires the operand to refer to an object. The expression a+10 exists, but does not refer to an object. In C90 the expression &a[10] is undefined behavior, while C99 defines the behavior.

C++
Like C90, the C++ Standard does not say anything explicit about the result of a [] or unary * operator. The C++ Standard does not explicitly exclude objects declared with the register storage-class specifier appearing as operands of the unary & operator. In fact, there is wording suggesting that such a usage is permitted:

    A register specifier has the same semantics as an auto specifier together with a hint to the implementation that the object so declared will be heavily used. [Note: the hint can be ignored and in most implementations it will be ignored if the address of the object is taken. -end note]
    7.1.1p3

Source developed using a C++ translator may contain occurrences of the unary & operator applied to an operand declared with the register storage-class specifier, which will cause a constraint violation if processed by a C translator.
    void f(void)
    {
    register int a[10]; /* undefined behavior */
                        // well-formed

    &a[1] /* constraint violation */
          // well-formed
          ;
    }

Other Languages
Many languages that support pointers have no address operator (e.g., Pascal and Java, which has references, not pointers). In these languages, pointers can only point at objects returned by the memory-allocation functions. The address-of operator was introduced in Ada 95 (it was not available in Ada 83). Many languages do not allow the address of a function to be taken.

Coding Guidelines
In itself, use of the address-of operator is relatively harmless. The problems occur subsequently when the value returned is used to access storage. The following are three coding-guideline-related consequences of being able to take the address of an object:
• It provides another mechanism for accessing the individual bytes of an object representation (a pointer to an object can be cast to a pointer to character type, enabling the individual bytes of an object representation to be accessed; see 761).

• It is an alias for the object having that address.

• It provides a mechanism for accessing the storage allocated to an object after the lifetime of that object has terminated.

Assigning the address of an object potentially increases the scope over which that object can be accessed. When is it necessary to increase the scope of an object? What are the costs/benefits of referring to an object using its address rather than its name? (If a larger scope is needed, could an object’s definition be moved to a scope where it is visible to all source code statements that need to refer to it?)

The parameter-passing mechanism in C is pass by value. What is often known as pass by reference is achieved, in C, by explicitly passing the address of an object. Different calls to a function having pass-by-reference arguments can involve different objects in different calls. Passing arguments, by reference, to functions is not a necessity; it is possible to pass information into and out of functions using file scope objects.

Assigning the address of an object creates an alias for that object. It then becomes possible to access the same object in more than one way. The use of aliases creates technical problems for translators (the behavior implied by the use of the restrict keyword was introduced into C99 to help get around this problem; see 1491) and can require developers to use additional cognitive resources (they need to keep track of aliased objects).
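A minimal sketch of the two points just made, explicit pass by reference and the alias that results from taking an address, might look as follows. The function and parameter names are hypothetical, invented for this illustration.

```c
/* "Pass by reference" in C: the caller passes addresses explicitly and
   the callee modifies the pointed-to objects. */
void swap_ints(int *left, int *right)
{
    int saved = *left;

    *left = *right;
    *right = saved;
}

/* Inside this function nothing says whether total and alias refer to
   the same object. If they do, the value read through alias reflects
   the update made through total; a translator has to allow for that
   possibility, and so does a reader. */
int add_then_read(int *total, const int *alias)
{
    *total += 10;
    return *alias;
}
```

Calling add_then_read(&t, &t) returns the updated value of t, while add_then_read(&t, &u) returns the untouched value of u; the body of the function is identical in both cases, which is what makes aliasing easy to overlook.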
A classification often implicitly made by developers is to categorize objects based on how they are accessed, the two categories being those accessed by the name they were declared with and those accessed via pointers. A consequence of using this classification is that developers overlook the possibility, within a sequence of statements, of a particular object being modified via both methods. When readers are aware of an object having two modes of reference (a name and a pointer dereference), is additional cognitive effort needed to comprehend the source? Your author knows of no research on this subject. These coding guidelines discuss the aliasing issue purely from the oversight point of view (faults being introduced because of lack of information), because there is no known experimental evidence for any cognitive factors.

One way of reducing aliasing issues at the point of object access is to reduce the number of objects whose addresses are taken. Is it possible to specify a set of objects whose addresses should not be taken, and what are the costs of having no alternatives for these cases? Is the cost worth the benefit? Restricting the operands of the address operator to be objects having block scope would limit the scope over which aliasing could occur. However, there are situations where the addresses of objects at file scope need to be used, including:

• An argument to a function could be an object with block scope, or file scope; for instance, the qsort function might be called.

• In resource-constrained environments it may be decided not to use dynamic storage allocation. For instance, all of the required storage may be defined at file scope and pointers to objects within this storage used by the program.

• The return from a function call is sometimes a pointer to an object, holding information. It may simplify storage management if this is a pointer to an object at file scope.
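The three situations listed above can be combined in one short sketch: storage defined at file scope instead of being allocated dynamically, its address passed to qsort, and a pointer to it returned from a function. The names samples, SAMPLE_COUNT, compare_ints, and sorted_samples are hypothetical; only qsort is a standard library function.

```c
#include <stdlib.h>

#define SAMPLE_COUNT 5

/* File-scope storage, in the style of a program that avoids dynamic
   storage allocation. */
static int samples[SAMPLE_COUNT] = { 42, 7, 19, 3, 25 };

/* Comparison function in the form required by qsort. */
static int compare_ints(const void *lhs, const void *rhs)
{
    int a = *(const int *)lhs;
    int b = *(const int *)rhs;

    return (a > b) - (a < b);
}

/* The address of the file-scope array is passed to qsort, and a pointer
   to the same object is handed back to the caller. */
const int *sorted_samples(void)
{
    qsort(samples, SAMPLE_COUNT, sizeof samples[0], compare_ints);
    return samples;
}
```

Because the array has static storage duration, the returned pointer remains valid for the whole execution of the program, which is the storage-management simplification the last bullet refers to.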
The following guideline recommendation ensures that the storage allocated to an object is not accessed once the object’s lifetime has terminated.

Cg 1088.1 The address of an object shall not be assigned to another object whose scope is greater than that of the object assigned.

Dev 1088.1 An object defined in block scope, having static storage duration, may have its address assigned to any other object.
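As an illustration of the deviation, the following sketch assigns the address of a block-scope object to a file-scope pointer, which is acceptable only because the object has static storage duration. The identifiers saved, remember_counter, and bump_saved are hypothetical; the form the guideline is aimed at is shown only in a comment, since executing it would access an object after its lifetime has ended.

```c
static int *saved;   /* file scope: visible to every function below */

void remember_counter(void)
{
    static int counter = 41;   /* block scope, static storage duration */

    saved = &counter;          /* covered by Dev 1088.1 */
}

/* Access through the saved address is well defined because counter
   exists for the whole execution of the program. */
int bump_saved(void)
{
    return ++*saved;
}

/*
 * The form the guideline recommendation excludes:
 *
 *    void remember_local(void)
 *    {
 *       int local = 41;     (automatic storage duration)
 *       saved = &local;     (saved outlives local)
 *    }
 */
```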
A function designator can appear as the operand of the address-of operator. However, taking the address of a function is redundant. This issue is discussed elsewhere (see 732). Likewise for objects having an array type (see 729).

Example
In the following it is not possible to take the address of a or any of its elements.

    register int a[3];

In fact this object is virtually useless (the identifier a can appear as the operand to the sizeof operator).

If allocated memory is not permitted (we know the memory requirements of the following on program startup):

    extern int *p;

    void init(void)
    {
    static int p_obj[20];

    p = &p_obj[0]; /* address of the first element; &p_obj would have type int (*)[20] */
    }

This provides pointers to objects, but hides those objects within a block scope. There is no pointer/identifier aliasing problem.

1089 The operand of the unary * operator shall have pointer type.

Commentary
Depending on the context in which it occurs, there may be restrictions on the pointed-to type (because of the type of the result; see 1098).

C++
    The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type . . .
    5.3.1p1

C++ does not permit the unary * operator to be applied to an operand having a pointer to void type.

    void *g_ptr;

    void f(void)
    {
    &*g_ptr; /* DR #012 */
             // DR #232
    }

Other Languages
In some languages indirection is a postfix operator; for instance, Pascal uses the token ^ as a postfix operator.

Semantics

1090 The unary & operator yields the address of its operand.
Commentary
For operands with static storage duration, the value of the address operator may be a constant (objects having an array type also need to be indexed with a constant expression; see 1341). There is no requirement that the address of an object be the same between different executions of the same program image (for objects with static storage duration) or different executions of the same function (for objects with automatic storage duration).

All external function references are resolved during translation phase 8. Any identifier denoting a function definition will have been resolved.

The C99 Standard refers to this as the address-of operator (see footnote 79).

C90
This sentence is new in C99 and summarizes what the unary & operator does.

C++
Like C90, the C++ Standard specifies a pointer to its operand (5.3.1p1). But later on (5.3.1p2) goes on to say: “In particular, the address of an object of type “cv T” is “pointer to cv T,” with the same cv-qualifiers.”

Other Languages
Many languages do not contain an address-of operator. Fortran 95 has an address assignment operator, =>. The left operand is assigned the address of the right operand.

Common Implementations
Early versions of K&R C treated p=&x as being equivalent to p&=x.[734]

In the case of constant addresses the value used in the program image is often calculated at link-time. For objects with automatic storage duration, their address is usually calculated by adding a value known at translation time (the offset of an object within its local storage area) to the value of the frame pointer for that function invocation. Addresses of elements, or members, of objects can be calculated using the base address of the object plus the offset of the corresponding subobject.
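The base-plus-offset calculation described for members and elements can be written out in source form. This is a sketch of the arithmetic only, not of what any particular translator emits; struct record and both function names are hypothetical, while offsetof is the standard macro from <stddef.h>.

```c
#include <stddef.h>

struct record {
    char   tag;
    int    count;
    double weight;
};

/* Address of a member computed as base address plus member offset. */
int *count_via_offset(struct record *rec)
{
    return (int *)((char *)rec + offsetof(struct record, count));
}

/* Address of an array element computed as base address plus
   index * element size; returns nonzero when it matches &base[index]. */
int element_address_matches(int *base, size_t index)
{
    return (char *)&base[index] == (char *)base + index * sizeof base[0];
}
```

For any struct record object r, count_via_offset(&r) compares equal to &r.count, which is the same address a translator would form from the base address of r and the offset of the member.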
Having an object appear as the operand of the address-of operator causes many implementations to play safe and not attempt to perform some optimizations on that object. For instance, without sophisticated pointer analysis, it is not possible to know which object a pointer dereference will access. (Implementations often assume all objects that have had their address taken are possible candidates; others might use information on the pointed-to type to attempt to reduce the set of possible accessed objects.) This often results in no attempt being made to keep the values of such objects in registers.

Implementations’ representation of addresses is discussed elsewhere (see 540).

1091 If the operand has type “type”, the result has type “pointer to type”.

Commentary
Although developers often refer to the address returned by the address-of operator, C does not have an address type.

1092 If the operand is the result of a unary * operator, neither that operator nor the & operator is evaluated and the result is as if both were omitted, except that the constraints on the operators still apply and the result is not an lvalue.

Commentary
The only effect of the operator pair &* is to remove any lvalueness from the underlying operand. The combination *& returns an lvalue if its operand is an lvalue (see footnote 84, 1114, and 1115). This specification is consistent with the behavior of the last operator applied controlling lvalue-ness.

This case was added in C99 to cover a number of existing coding idioms; for instance:

    #include <stddef.h> /* NULL */
    void DR_076(void)
    {
    int *n = NULL;
    int *p;

    /*
     * The following case is most likely to occur when the
     * expression *n is a macro argument, or body of a macro.
     */
    p = &*n;
    /* ... */
    }

C90
The responses to DR #012, DR #076, and DR #106 specified that the above constructs were constraint violations. However, no C90 implementations known to your author diagnosed occurrences of these constructs.

C++
This behavior is not specified in C++. Given that either operator could be overloaded by the developer to have a different meaning, such a specification would be out of place. At the time of this writing a response to C++ DR #232 is being drafted (a note from the Oct 2003 WG21 meeting says: “We agreed that the approach in the standard seems okay: p = 0; *p; is not inherently an error. An lvalue-to-rvalue conversion would give it undefined behavior.”).

    void DR_232(void)
    {
    int *loc = 0;

    if (&*loc == 0) /* no dereference of a null pointer, defined behavior */
                    // probably not a dereference of a null pointer.
       ;

    &*loc = 0; /* not an lvalue in C */
               // how should an implementation interpret the phrase must not (5.3.1p1)?
    }

Common Implementations
Some C90 implementations did not optimize the operator pair &* into a no-op. In these implementations the behavior of the unary * operator was not altered by the subsequent address-of operator. C99 implementations are required to optimize away the operator pair &*.

1093 Similarly, if the operand is the result of a [] operator, neither the & operator nor the unary * that is implied by the [] is evaluated and the result is as if the & operator were removed and the [] operator were changed to a + operator.

Commentary
This case was added in C99 to cover a number of coding idioms; for instance:

    void DR_076(void)
    {
    int a[10];
    int *p;

    /*
     * It is possible to point one past the end of an object.