Advanced PHP Programming- P11

Shared by: Thanh Cong | Date: | File type: PDF | Pages: 50


Chapter 20: PHP and Zend Engine Internals

- opcode 1: Here the ZEND_ASSIGN handler assigns to Register 0 (the pointer to $hi) the value hello. Register 1 is also assigned to, but it is never used. Register 1 would be utilized if the assignment were being used in an expression like this:

      if($hi = 'hello'){}

- opcode 2: Here you re-fetch the value of $hi, now into Register 2. You use the op ZEND_FETCH_R because the variable is used in a read-only context.
- opcode 3: ZEND_ECHO prints the value of Register 2 (or, more accurately, sends it to the output buffering system). echo (and print, its alias) are operations that are built in to PHP itself, as opposed to functions that need to be called.
- opcode 4: ZEND_RETURN is called, setting the return value of the script to 1. Even though return is not explicitly called in the script, every script contains an implicit return 1, which is executed if the script completes without return being explicitly called.

Here is a more complex example:

    <?php
    $hi = 'hello';
    echo strtoupper($hi);
    ?>

The intermediate code dump looks similar:

    opnum  line  opcode          op1           op2      result
    0      2     ZEND_FETCH_W    "hi"                   '0
    1      2     ZEND_ASSIGN     '0            "hello"  '0
    2      3     ZEND_FETCH_R    "hi"                   '2
    3      3     ZEND_SEND_VAR   '2
    4      3     ZEND_DO_FCALL   "strtoupper"           '3
    5      3     ZEND_ECHO       '3
    6      5     ZEND_RETURN     1

Notice the differences between these two scripts.

- opcode 3: The ZEND_SEND_VAR op pushes a pointer to Register 2 (the variable $hi) onto the argument stack. This argument stack is how the called function receives its arguments. Because the function called here is an internal function (implemented in C and not in PHP), its operation is completely hidden from PHP. Later you will see how a userspace function receives arguments.
- opcode 4: The ZEND_DO_FCALL op calls the function strtoupper and indicates that Register 3 is where its return value should be set.

Here is an example of a trivial PHP script that implements conditional flow control:
    <?php
    $i = 0;
    while($i < 5) {
        $i++;
    }
    ?>

    opnum  line  opcode           op1   op2  result
    0      2     ZEND_FETCH_W     "i"        '0
    1      2     ZEND_ASSIGN      '0    0    '0
    2      3     ZEND_FETCH_R     "i"        '2
    3      3     ZEND_IS_SMALLER  '2    5    '2
    4      3     ZEND_JMPZ        $3
    5      4     ZEND_FETCH_RW    "i"        '4
    6      4     ZEND_POST_INC    '4         '4
    7      4     ZEND_FREE        $5
    8      5     ZEND_JMP
    9      7     ZEND_RETURN      1

Note here that you have a ZEND_JMPZ op to set a conditional branch point (to evaluate whether you should jump to the end of the loop if $i is greater than or equal to 5) and a ZEND_JMP op to bring you back to the top of the loop to reevaluate the condition at the end of each iteration.

Observe the following in these examples:

- Six registers are allocated and used in this code, even though only two registers are ever used at any one time. Register reuse is not implemented in PHP. For large scripts, thousands of registers may be allocated.
- No real optimization is performed on the code. This postincrement:

      $i++;

  could be optimized to a pre-increment:

      ++$i;

  because it is used in a void context (that is, it is not used in an expression where the former value of $i needs to be stored). This would save you having to stash its value in a register.
- The targets of the jump oplines are not displayed in the dump. This is really the fault of the assembly dumper. The Zend Engine leaves ops used for some internal purposes marked as unused.

Before we move on, there is one last important example to look at. The example showing function calls earlier in this chapter uses strtoupper, which is a built-in function. Calling a function written in PHP looks similar to calling a built-in function:
    <?php
    function hello($name) {
        echo "hello\n";
    }
    hello("George");
    ?>

    opnum  line  opcode          op1       op2  result
    0      2     ZEND_NOP
    1      5     ZEND_SEND_VAL   "George"
    2      5     ZEND_DO_FCALL   "hello"        '0
    3      7     ZEND_RETURN     1

But where is the function code? This code simply sets the argument stack (via ZEND_SEND_VAL) and calls hello, but you don't see the code for hello anywhere. This is because functions in PHP are op arrays as well, as if they were miniature scripts. For example, here is the op array for the function hello:

    FUNCTION: hello
    opnum  line  opcode         op1         op2  result
    0      2     ZEND_FETCH_W   "name"           '0
    1      2     ZEND_RECV      1                '0
    2      3     ZEND_ECHO      "hello%0A"
    3      4     ZEND_RETURN    NULL

This looks pretty similar to the inline code you've seen before. The only difference is ZEND_RECV, which reads off the argument stack. As with standalone scripts, even though you don't explicitly return at the end, a ZEND_RETURN op is implicitly added, and it returns null.

Calling includes works similarly to function calls:

    <?php
    include("file.inc");
    ?>

    opnum  line  opcode                 op1         op2  result
    0      2     ZEND_INCLUDE_OR_EVAL   "file.inc"       '0
    1      4     ZEND_RETURN            1

This illustrates an important aspect of the PHP language: All includes and requires happen at runtime. So when a script is initially parsed, the op array for that script is generated, and any functions and classes defined in its top-level file (the one that is actually run) are inserted into the symbol table; but no potentially included scripts are parsed yet. When the script is executed, if an include statement is encountered, the include is then parsed and executed on the spot. Figure 20.1 illustrates the flow of a normal PHP script.
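The split between compile time and runtime described above can be observed from userspace. The following is a minimal sketch rather than code from the book: the function names early() and late() and the temporary include file are hypothetical, and it assumes a reasonably recent PHP CLI. A function declared unconditionally at the top level is registered when the file is compiled, a function declared inside a conditional block is registered only when execution reaches it, and an included file is compiled and run each time the include statement executes.

```php
<?php
// Declared unconditionally at the top level of the file, so it is registered
// when the file is compiled and can be called before its definition appears.
$early = early("George");

function early($name) {
    return "hello $name";
}

// Declared inside a conditional block: registered only when this code runs.
$before = function_exists('late');
if (true) {
    function late() {
        return "declared at runtime";
    }
}
$after = function_exists('late');

// An included file is compiled and executed every time include is reached.
// The temporary file below stands in for a real include file.
$file = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($file, '<?php $runs++;');
$runs = 0;
include $file;
include $file;
unlink($file);

echo $early, "\n";                   // hello George
var_dump($before, $after, $runs);    // bool(false) bool(true) int(2)
```

The second include repeats the compile-and-execute work, which is the cost the Speed discussion in this chapter refers to.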
[Figure 20.1: The execution path of a PHP script. Labels recovered from the figure: SCRIPT ENTRY, ZEND_COMPILE, USER CALL (FUNCTION/METHOD), INCLUDE/REQUIRE.]

This design choice has a number of repercussions:

- Flexibility: It is an oft-vaunted fact that PHP is a runtime language. One of the important things that being a runtime language means for PHP is that it supports conditional inclusion of files and conditional declaration of functions and classes. Here's an example:
      if($condition) {
          include("file1.inc");
      }
      else {
          include("file2.inc");
      }

  In this example, the runtime parsing and execution of included files makes this operation more efficient (because files are included only when needed), and it eliminates the potential hassles of symbol conflicts if two files contain different implementations of the same function or class.
- Speed: Having to actually compile includes on-the-fly means that a significant portion of a script's execution time is spent simply compiling its dependent includes. If a file is included twice, it must be parsed and executed twice. include_once and require_once partially solve that problem, but it is further exacerbated by the fact that PHP resets its compiler state completely between script executions. (We'll talk about that more in a minute, as well as some ways to minimize that effect.)

Variables

Programming languages come in two basic flavors when it comes to how variables are declared:

- Statically typed: Statically typed languages include languages such as C++ or Java, where a variable is assigned a type (for example, int or String) and that type is fixed at compile time.
- Dynamically typed: Dynamically typed languages include languages such as PHP, Perl, Python, and VBScript, where types are automatically inferred at runtime. If you use this:

      $variable = 0;

  PHP will automatically create it as an integer type.

Furthermore, there are two additional criteria for how types are enforced or converted between:

- Strongly typed: In a strongly typed language, if an expression receives an argument of the wrong type, an error is generated. Without exception, statically typed languages are strongly typed (although many allow one type to be cast, or forced to be interpreted, as another type). Some dynamically typed languages, such as Python and Ruby, have strong typing; in them, exceptions are thrown if variables are used in an incorrect context.
- Weakly typed: A weakly typed language does not necessarily enforce types. This is usually accompanied by autoconversion of variables to appropriate types. For instance, in this:

      $string = "The value of \$variable is $variable.";

  $variable (which was autocast into an integer when it was first set) is now autoconverted into a string type so that it can be used to create $string.

All these typing strategies have their relative benefits and drawbacks. Static typing allows you to enforce a certain level of data validation at compile time. Because that work is done once at compile time rather than on every execution, dynamically typed languages tend to be slower than statically typed languages. Dynamic typing is, of course, more flexible. Most interpreted languages choose to go with dynamic typing because it fits their flexibility.

Strong typing similarly allows you a good amount of built-in data validation, in this case at runtime. Weak typing provides additional flexibility by allowing variables to autoconvert between types as necessary. The interpreted languages are pretty well split on strong typing versus weak typing. Python and Ruby (both of which bill themselves as general-purpose "enterprise" languages) implement strong typing, whereas Perl, PHP, and JavaScript implement weak typing.

PHP is both dynamically typed and weakly typed. One slight exception is the optional type checking for argument types in functions. For example, this:

    function foo(User $array) { }

and this:

    function bar(Exception $array) {}

enforce being passed a User or an Exception object (or one of its descendants or implementers), respectively.

To fully understand types in PHP, you need to look under the hood at the data structures used in the engine.
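Before looking at those structures, the typing behavior described above can be checked from a script. This is a small sketch under stated assumptions, not an excerpt from the book: the User class body and the argument values are made up for illustration, and catching TypeError assumes PHP 7 or later (older versions raised a catchable fatal error instead).

```php
<?php
class User {}

function foo(User $user) {
    return get_class($user);
}

// Weak typing: the integer is converted to a string for interpolation,
// and the numeric string is converted to a number for arithmetic.
$variable = 0;
$string = "The value of \$variable is $variable.";
$sum = "5" + 3;

// The exception: a class type on a parameter is enforced at call time.
try {
    foo("not a user");
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

echo $string, "\n";                               // The value of $variable is 0.
var_dump(gettype($variable), gettype($string));   // "integer", "string"
var_dump($sum, foo(new User), $rejected);         // int(8), "User", bool(true)
```

Note that $variable itself stays an integer; only the value used to build $string is converted.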
In PHP, all variables are zvals, represented by the following C structure:

    struct _zval_struct {
        /* Variable information */
        zvalue_value value;    /* value */
        zend_uint refcount;
        zend_uchar type;       /* active type */
        zend_uchar is_ref;
    };

and its complementary data container:

    typedef union _zvalue_value {
        long lval;             /* long value */
        double dval;           /* double value */
        struct {
            char *val;
            int len;
        } str;                 /* string value */
        HashTable *ht;         /* hashtable value */
        zend_object_value obj; /* handle to an object */
    } zvalue_value;

The zval consists of its own value (which we'll get to in a moment), a refcount, a type, and the flag is_ref.

A zval's refcount is the reference counter for the value associated with that variable. When you instantiate a new variable, like this, it is created with a reference count of 1:

    $variable = 'foo';

If you create a copy of $variable, the zval for its value has its reference count incremented. So after you perform the following, the zval for 'foo' has a reference count of 2:

    $variable_copy = $variable;

If you then change $variable, it will be associated to a new zval with a reference count of 1, and the original string 'foo' will have its reference count decremented to 1, as follows:

    $variable = 'bar';

When a variable falls out of scope (say it's defined in a function and that function is returned from), or when the variable is destroyed, its zval's reference count is decremented by one. When a zval's refcount reaches 0, it is picked up by the garbage-collection system and its contents will be freed.

The zval type is especially interesting. The fact that PHP is a weakly typed language does not mean that variables do not have types. The type attribute of the zval specifies what the current type of the zval is; this indicates which part of the zvalue_value union should be looked at for its value.

Finally, is_ref indicates whether this zval actually holds data or is simply a reference to another zval that holds data.

The zvalue_value value is where the data for a zval is actually stored. This is a union of all the possible base types for a variable in PHP: long integers, doubles, strings, hashtables (arrays), and object handles. A union in C is a composite data type that uses a minimal amount of space to store different possible types at different times.
Practically, this means that the data stored for a zval is either a numeric representation, a string representation, an array representation, or an object representation, but never more than one at a time. This is in contrast to a language such as Perl, where all these potential representations can coexist (this is how in Perl you can have a variable that has entirely different representations when accessed as a string than when accessed as a number).

When you switch types in PHP (which is almost never done explicitly; it almost always happens implicitly, when a usage demands a zval be in a different representation than it currently is), zvalue_value is converted into the required format. This is why you get behavior like this:

    $a = "00";
    $a += 0;
    echo $a;

which prints 0 and not 00 because the extra characters are silently discarded when $a is converted to an integer on the second line.

Variable types are also important in comparison. When you compare two variables with the identical operator (===), like this, the active types for the zvals are compared, and if they are different, the comparison fails outright:

    $a = 0;
    $b = '0';
    echo ($a === $b)?"Match":"Doesn't Match";

For that reason, this example prints Doesn't Match.

With the is equal operator (==), the comparison that is performed is based on the active types of the operands. If the operands are strings or nulls, they are compared as strings; if either is a Boolean, they are converted to Boolean values and compared; otherwise they are converted to numbers and compared. Although this results in the == operator being symmetrical (that is, $a == $b gives the same result as $b == $a), it actually is not transitive. The following example of this was kindly provided by Dan Cowgill:

    $a = "0";
    $b = 0;
    $c = "";
    echo ($a == $b)?"True":"False"; // True
    echo ($b == $c)?"True":"False"; // True
    echo ($a == $c)?"True":"False"; // False

Although transitivity may seem like a basic feature of an operator algebra, understanding how == works makes it clear why transitivity does not hold. Here are some examples:

- "0" == 0 because both variables end up being converted to integers and compared.
- $b == $c because both $b and $c are converted to integers and compared.
- However, $a != $c because both $a and $c are strings, and when they are compared as strings, they are decidedly different.

In his commentary on this example, Dan compared this to the == and eq operators in Perl, which are both transitive. They are both transitive, though, because they are both typed comparisons. == in Perl coerces both operands into numbers before performing the comparison, whereas eq coerces both operands into strings. The PHP == is not a typed comparator, though, and it coerces variables only if they are not of the same active type. Thus the lack of transitivity.
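The conversions and comparisons discussed above can be reproduced with a short script. This is a sketch for experimentation rather than part of the original text; the variable names $x, $y, and $z are arbitrary. One caveat that postdates the book: PHP 8 changed how a number is compared with a non-numeric string, so the result of 0 == "" depends on the interpreter version, and the sketch records that instead of assuming one answer.

```php
<?php
// Implicit conversion: the string "00" becomes the integer 0.
$a = "00";
$a += 0;
var_dump($a);                 // int(0)

// === checks the active types first, so int 0 never matches string '0'.
$identical = (0 === '0');
var_dump($identical);         // bool(false)

// == converts its operands according to their active types.
$x = "0";
$y = 0;
$z = "";
$xy = ($x == $y);             // true: "0" is compared numerically with 0
$xz = ($x == $z);             // false: both are strings, compared as strings
$yz = ($y == $z);             // true before PHP 8, false from PHP 8 onward
var_dump($xy, $xz, $yz);

printf("Running PHP %s: 0 == \"\" is %s\n", PHP_VERSION, $yz ? "True" : "False");
```

On interpreters where $yz is true, the three results together show the non-transitivity described in the text.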
Functions

You've seen that when a piece of code calls a function, it populates the argument stack via ZEND_SEND_VAL and uses a ZEND_DO_FCALL op to execute the function. But what does that really do? To really understand how these things work, you need to go back to even before compilation. When PHP starts up, it looks through all its registered extensions (both the ones that were compiled statically and any that were registered in the php.ini file) and registers all the functions that they define. These functions look like this:

    typedef struct _zend_internal_function {
        /* Common elements */
        zend_uchar type;
        zend_uchar *arg_types;
        char *function_name;
        zend_class_entry *scope;
        zend_uint fn_flags;
        union _zend_function *prototype;
        /* END of common elements */
        void (*handler)(INTERNAL_FUNCTION_PARAMETERS);
    } zend_internal_function;

The important things to note here are the type (which is always ZEND_INTERNAL_FUNCTION, meaning that it is an extension function written in C), the function name, and the handler, which is a C function pointer to the function itself and is part of the extension code.

Registering one of these functions basically amounts to its being inserted into the global function table (a hashtable in which functions are stored).

User-defined functions are, of course, inserted by the compiler. When the compiler (by which I still mean the lexer, parser, and code generator all together) encounters a piece of code like this:

    function say_hello($name)
    {
        echo "Hello $name\n";
    }

it compiles the code inside the function's block as a new op array, creates a zend_function with that op array, and inserts that zend_function into the global function table with its type set to ZEND_USER_FUNCTION. A zend_function looks like this:

    typedef union _zend_function {
        zend_uchar type;
        struct {
            zend_uchar type; /* never used */
            zend_uchar *arg_types;
            char *function_name;
            zend_class_entry *scope;
            zend_uint fn_flags;
            union _zend_function *prototype;
        } common;
        zend_op_array op_array;
        zend_internal_function internal_function;
    } zend_function;

This definition can be rather confusing if you don't recognize one of the design goals: For the most part, zend_functions, zend_internal_functions, and op arrays are interchangeable. They are not identical structs, but all the elements that are in "common" they hold in common. Thus they can safely be cast to each other.

In practice, this means that when a ZEND_DO_FCALL op is executed, it stashes away the current scope, populates the argument stack, and looks up the requested function by name (actually by the lowercase version of the name because PHP implements case-insensitive function names), returning a pointer to a zend_function. If the function's type is ZEND_INTERNAL_FUNCTION, it can be recast to a zend_internal_function and executed via zend_execute_internal, which executes internal functions. Otherwise, it will be executed via zend_execute, the same function that is called to execute scripts and includes. This works because user functions are completely identical to op arrays.

As you can likely infer from the way that PHP functions work, ZEND_SEND_VAL does not push an argument's zval onto the argument stack; instead, it copies it and pushes the copy onto the stack. This has the consequence that unless a variable is passed by reference (with the exception of objects), changing its value in a function does not change the argument passed; it changes only the copy. To change a passed argument in a function, pass it by reference.

Classes

Classes are similar to functions in that, like functions, they are stashed in their own global symbol table; but they are more complex than functions. Whereas functions are similar to scripts (possessing the same instruction set), classes are like a miniature version of the entire execution scope.
A class is represented by a zend_class_entry, like this:

    struct _zend_class_entry {
        char type;
        char *name;
        zend_uint name_length;
        struct _zend_class_entry *parent;
        int refcount;
        zend_bool constants_updated;
        zend_uint ce_flags;

        HashTable function_table;
        HashTable default_properties;
        HashTable properties_info;
        HashTable class_table;
        HashTable *static_members;
        HashTable constants_table;
        zend_function_entry *builtin_functions;

        union _zend_function *constructor;
        union _zend_function *destructor;
        union _zend_function *clone;
        union _zend_function *__get;
        union _zend_function *__set;
        union _zend_function *__call;

        /* handlers */
        zend_object_value (*create_object)(zend_class_entry *class_type TSRMLS_DC);

        zend_class_entry **interfaces;
        zend_uint num_interfaces;

        char *filename;
        zend_uint line_start;
        zend_uint line_end;
        char *doc_comment;
        zend_uint doc_comment_len;
    };

Like the main execution scope, a class contains its own function table (for holding class methods), and its own constants table. The class entry also contains a number of other items, including tables for its attributes (for example, default_properties, properties_info, static_members) as well as the interfaces it implements, its constructor, its destructor, its clone, and its overloadable access functions. In addition, there is the create_object function pointer, which, if defined, is used to create a new object and define its handlers, which allow for fine-grained control of how that object is accessed.

One of the major changes in PHP 5 is the object model. In PHP 4, when you create an object, you are returned a zval whose zvalue_value looks like this:

    typedef struct _zend_object {
        zend_class_entry *ce;
        HashTable *properties;
    } zend_object;

This means that zend_objects in PHP 4 are little more than hashtables (of attributes) with a zend_class_entry floating around to hold its methods. When objects are passed to functions, they are copied (as all other variable types are), and implementing controls of attribute accessors is extremely hackish.

In PHP 5, an object's zval contains a zend_object_value, like this:

    struct _zend_object_value {
        zend_object_handle handle;
        zend_object_handlers *handlers;
    };

The zend_object_value in turn contains a zend_object_handle (an integer that identifies the location of the object in a global object store, effectively a pointer to the object proper) and a set of handlers, which regulate all accesses to the object.

This intrinsically changes the way that objects are handled in PHP. In PHP 5, when an object's zval is copied (as happens on assignment or when passed into a function), the data is not copied; another reference to the object is created. These semantics are much more standard and correspond to the object semantics in Java, Python, Perl, and other languages.

The Object Handlers

In PHP 5 it is possible (in the extension API) to control almost all access to an object and its properties.
A handler API is provided that implements the following access handlers:

    typedef struct _zend_object_handlers {
        /* general object functions */
        zend_object_add_ref_t add_ref;
        zend_object_del_ref_t del_ref;
        zend_object_delete_obj_t delete_obj;
        zend_object_clone_obj_t clone_obj;
        /* individual object functions */
        zend_object_read_property_t read_property;
        zend_object_write_property_t write_property;
        zend_object_read_dimension_t read_dimension;
        zend_object_write_dimension_t write_dimension;
        zend_object_get_property_ptr_ptr_t get_property_ptr_ptr;
        zend_object_get_t get;
        zend_object_set_t set;
        zend_object_has_property_t has_property;
        zend_object_unset_property_t unset_property;
        zend_object_has_dimension_t has_dimension;
        zend_object_unset_dimension_t unset_dimension;
        zend_object_get_properties_t get_properties;
        zend_object_get_method_t get_method;
        zend_object_call_method_t call_method;
        zend_object_get_constructor_t get_constructor;
        zend_object_get_class_entry_t get_class_entry;
        zend_object_get_class_name_t get_class_name;
        zend_object_compare_t compare_objects;
        zend_object_cast_t cast_object;
    } zend_object_handlers;

We'll explore each handler in greater depth in Chapter 22, "Extending PHP: Part II," where you'll actually implement extension classes. In the meantime, you just need to know that the handler names offer a relatively clear indication as to what they do. For example, add_ref is called whenever a reference to an object is added:

    $object2 = $object;

and compare_objects is called whenever two objects are compared by using the is equal operator (==):

    if($object2 == $object) {}

Object Creation

In the Zend Engine version 2, object creation happens in two phases. When you call this:

    $object = new ClassName;

a new zend_object is created and placed in the object store, and a handle to it is assigned to $object. By default (as happens when you instantiate a userspace class), the object is allocated by using the default allocator, and it is assigned the default access handlers. Alternatively, if the class's zend_class_entry has its create_object function defined, that function is called to handle the allocation of the object and returns the array of zend_object_handlers for that object.

This level of control is especially useful if you need to override the basic operations of an object and if you need to store resource data in an object that should not be touched by the normal memory management mechanisms. The Java and mono extensions both use these facilities to allow PHP to instantiate and access objects from these other languages.

Only after the zend_object_value is created is the constructor called on the object. Even in extensions, the constructor (and destructor and clone) are "normal" zend_functions. They do not alter the object's access handlers, which have already been established.
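The copy semantics described in the Functions and Classes sections can be seen side by side in userspace. The sketch below is illustrative only: the Counter class and the bump_* function names are hypothetical, and it shows observable behavior on PHP 5 and later, not the engine's internal calls. Arrays passed by value are copied, reference parameters are not, and an object argument carries a copy of the handle, so callee and caller see the same object.

```php
<?php
class Counter {
    public $n = 0;
}

// Passed by value: the function works on a copy of the array.
function bump_array($list) { $list[] = 1; }
// Passed by reference: the caller's array is modified.
function bump_array_ref(&$list) { $list[] = 1; }
// Object argument: the handle is copied, the object itself is shared.
function bump_counter($c) { $c->n++; }

$list = array();
bump_array($list);
$afterValue = count($list);   // 0
bump_array_ref($list);
$afterRef = count($list);     // 1

$object = new Counter;
$object2 = $object;           // a second handle to the same object
$copy = clone $object;        // a separate object with the same properties

bump_counter($object);
BUMP_COUNTER($object);        // function names are matched case-insensitively

var_dump($afterValue, $afterRef);    // int(0) int(1)
var_dump($object2->n, $copy->n);     // int(2) int(0)
var_dump($object2 === $object);      // bool(true): same object
var_dump($copy == $object);          // bool(false): property values now differ
```

Using clone is the explicit way to get the PHP 4 style copy when an independent object is actually wanted.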
Other Important Structures

In addition to the function and class tables, there are a few other important global data structures worth mentioning. Knowledge of how these work isn't terribly important for a user of PHP, but it can be useful if you want to modify how the engine itself works. Most of these are elements of either the compiler_globals struct or the executor_globals struct and are most often referenced in the source via the macros CG() and EG(), respectively. These are some of the global data structures you should know about:

- CG(function_table) and EG(function_table): These structures refer to the function table we've talked about up until now. It exists in both the compiler and executor globals. Iterating through this hashtable gives you every callable function.
- CG(class_table) and EG(class_table): These structures refer to the hashtable in which all the classes are stored.
- EG(symbol_table): This structure refers to a hashtable that is the main (that is, global) symbol table. This is where all the variables in the global scope are stored.
- EG(active_symbol_table): This structure refers to a hashtable that contains the symbol table for the current scope.
- EG(zend_constants): This structure refers to the constants hashtable, where constants set with the function define are stored.
- CG(auto_globals): This structure refers to the hashtable of autoglobals ($_SERVER, $_ENV, $_POST, and so on) that are used in the script. This is a compiler global so that the autoglobals can be conditionally initialized only if the script utilizes them. This boosts performance because it avoids the work of initializing and populating these variables when they are not needed.
- EG(regular_list): This structure refers to a hashtable that is used to store "regular" (that is, nonpersistent) resources. Resources here are PHP resource-type variables, such as streams, file pointers, database connections, and so on. You'll learn more about how these are used in Chapter 22.
- EG(persistent_list): This structure is like EG(regular_list), but EG(persistent_list) resources are not freed at the end of every request (persistent database connections, for example).
- EG(user_error_handler): This structure refers to a pointer to a zval that contains the name of the current user_error_handler function (as set via the set_error_handler function). If no error-handler function is set, this structure is NULL.
- EG(user_error_handlers): This structure refers to the stack of error-handler functions.
- EG(user_exception_handler): This structure refers to a pointer to a zval that contains the name of the current global exception handler, as set via the function set_exception_handler. If none has been set, this structure is NULL.
- EG(user_exception_handlers): This structure refers to the stack of global exception handlers.
- EG(exception): This is an important structure. Whenever an exception is thrown, EG(exception) is set to the zval of the object that is thrown. Whenever a function call is returned, EG(exception) is checked. If it is not NULL, execution halts and the script jumps to the op for the appropriate catch block. We will explore throwing exceptions from within extension code in depth in Chapter 21, "Extending PHP: Part I," and Chapter 22.
- EG(ini_directives): This structure refers to a hashtable of the php.ini directives that are set in this execution context.

This is just a selection of the globals set in executor_globals and compiler_globals. The globals listed here were chosen either because they are used in interesting optimizations in the engine (the just-in-time population of autoglobals) or because you will want to interact with them in extensions (such as resource lists).

The Principle of Sandboxing

The principle of sandboxing is that nothing that a user does in handling one request should in any way affect a subsequent request. PHP is an extremely well-sandboxed language in that at the end of every request, the interpreter is returned to a clean starting state. This specifically entails the following:

- All function and class tables have all ZEND_USER_FUNCTION and ZEND_USER_CLASS (that is, all userspace-defined functions and classes) removed.
- All op arrays for any parsed files are discarded. (They are actually discarded immediately after use.)
- The symbol tables and constants tables are completely cleaned of all data.
- All resources not on the persistent list are destructed.

Solutions such as mod_perl make it easy to accidentally instantiate global variables that have persistent (and thus potentially unexpected) values between requests. PHP's request-end sterilization makes that sort of problem almost impossible. It also means that data that is known not to change between requests (for example, the compilation results of a file) needs to be regenerated on every request in which it is used.
As we've discussed before in relation to compiler caches such as APC, IonCube, and the Zend Accelerator, avoiding certain aspects of this sandboxing can be beneficial from a performance standpoint. We'll look at some methods for that in Chapter 23.

The PHP Request Life Cycle

Now that you have a decent understanding of how the Zend Engine works, let's look at how the engine sits inside PHP and how PHP itself sits inside other applications.

Any discussion of the architecture of PHP starts with a diagram such as Figure 20.2, which shows the application layers in PHP.

The outermost layer, where PHP interacts with other applications, is the Server Abstraction API (SAPI) layer. The SAPI layer partially handles the startup and shutdown of PHP inside an application, and it provides hooks for handling data such as cookies and POST data in an application-agnostic manner.
[Figure 20.2: The architecture of PHP. Labels recovered from the figure: Application (apache, thttpd, cli, etc.); SAPI (see Chapter 23); PHP API (streams, output, etc.) (see Chapter 22); PHP Extensions (mysql, standard library, etc.) (see Chapter 22); Modular Code; Zend Extension API; Zend API (see Chapter 23); Zend Engine.]

Below the SAPI layer lies the PHP engine itself. The core PHP code handles setting up the running environment (populating global variables and setting default .ini options), providing interfaces such as the streams I/O interface, parsing of data, and most importantly, providing an interface for loading extensions (both statically compiled extensions and dynamically loaded extensions).

At the core of PHP lies the Zend Engine, which we have discussed in depth here. As you've seen, the Zend Engine fully handles the parsing and execution of scripts. The Zend Engine was also designed for extensibility and allows for entirely overriding its basic functionality (compilation, execution, and error handling), overriding selective portions of its behavior (overriding op_handlers in particular ops), and having functions called on registerable hooks (on every function call, on every opcode, and so on). These features allow for easy integration of caches, profilers, debuggers, and semantics-altering extensions.
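The layers just described are visible from a running script through a few introspection functions. The following is only a quick sketch for orientation and not something the original text includes; the output depends entirely on how the local PHP was built and which application hosts it.

```php
<?php
// The SAPI hosting this interpreter, for example "cli" on the command line.
$sapi = php_sapi_name();

// Extensions loaded on top of the PHP API layer.
$extensions = get_loaded_extensions();

// The Zend Engine version at the core, next to the PHP version wrapped around it.
$engine = zend_version();

printf("SAPI: %s\n", $sapi);
printf("PHP %s on Zend Engine %s\n", phpversion(), $engine);
printf("%d extensions loaded, including: %s\n",
    count($extensions),
    implode(", ", array_slice($extensions, 0, 5)));
```

Running the same script from the command line and under a Web server is a simple way to see the SAPI name change while the engine and extension layers stay the same.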
The SAPI Layer

The SAPI layer is the abstraction layer that allows for easy embedding of PHP into other applications. Some SAPIs include the following:

- mod_php5: This is the PHP module for Apache, and it is a SAPI that embeds PHP into the Apache Web server.
- fastcgi: This is an implementation of FastCGI that provides a scalable extension to the CGI standard. FastCGI is a persistent CGI daemon that can handle multiple requests. FastCGI is the preferred method of running PHP under IIS and shows performance almost as good as that of mod_php5.
- CLI: This is the standalone interpreter for running PHP scripts from the command line, and it is a thin wrapper around a SAPI layer.
- embed: This is a general-purpose SAPI that is designed to provide a C library interface for embedding a PHP interpreter in an arbitrary application.

The idea is that regardless of the application, PHP needs to communicate with an application in a number of common places, so the SAPI interface provides a hook for each of those places. When an application needs to start up PHP, for instance, it calls the startup hook. Conversely, when PHP wants to output information, it uses the provided ub_write hook, which the SAPI layer author has coded to use the correct output method for the application PHP is running in.

To understand the capabilities of the SAPI layer, it is easiest to look at the hooks it implements.
Every SAPI interface registers the following struct with PHP, describing the callbacks it implements:

    struct _sapi_module_struct {
        char *name;
        char *pretty_name;
        int (*startup)(struct _sapi_module_struct *sapi_module);
        int (*shutdown)(struct _sapi_module_struct *sapi_module);
        int (*activate)(TSRMLS_D);
        int (*deactivate)(TSRMLS_D);
        int (*ub_write)(const char *str, unsigned int str_length TSRMLS_DC);
        void (*flush)(void *server_context);
        struct stat *(*get_stat)(TSRMLS_D);
        char *(*getenv)(char *name, size_t name_len TSRMLS_DC);
        void (*sapi_error)(int type, const char *error_msg, ...);
        int (*header_handler)(sapi_header_struct *sapi_header,
                              sapi_headers_struct *sapi_headers TSRMLS_DC);
        int (*send_headers)(sapi_headers_struct *sapi_headers TSRMLS_DC);
        void (*send_header)(sapi_header_struct *sapi_header,
                            void *server_context TSRMLS_DC);
        int (*read_post)(char *buffer, uint count_bytes TSRMLS_DC);
        char *(*read_cookies)(TSRMLS_D);
        void (*register_server_variables)(zval *track_vars_array TSRMLS_DC);
        void (*log_message)(char *message);
        char *php_ini_path_override;
        void (*block_interruptions)(void);
        void (*unblock_interruptions)(void);
        void (*default_post_reader)(TSRMLS_D);
        void (*treat_data)(int arg, char *str, zval *destArray TSRMLS_DC);
        char *executable_location;
        int php_ini_ignore;
        int (*get_fd)(int *fd TSRMLS_DC);
        int (*force_http_10)(TSRMLS_D);
        int (*get_target_uid)(uid_t * TSRMLS_DC);
        int (*get_target_gid)(gid_t * TSRMLS_DC);
        unsigned int (*input_filter)(int arg, char *var, char **val,
                                     unsigned int val_len TSRMLS_DC);
        void (*ini_defaults)(HashTable *configuration_hash);
        int phpinfo_as_text;
    };

The following are some of the notable elements from this struct:

- startup: This is called the first time the SAPI is initialized. In an application that will serve multiple requests, this is performed only once. For example, in mod_php5, this is performed in the parent process before children are forked.
- activate: This is called at the beginning of each request. It reinitializes all the per-request SAPI data structures.
- deactivate: This is called at the end of each request. It ensures that all data has been correctly flushed to the application, and then it destroys all the per-request data structures.
- shutdown: This is called at interpreter shutdown. It destroys all the SAPI structures.
- ub_write: This is what PHP will use to output data to the client. In the CLI SAPI, this is as simple as writing to standard output; in mod_php5, the Apache library call rwrite is called.
- sapi_error: This is a handler for reporting errors to the application. Most SAPIs use php_error, which instructs PHP to use its own internal error system.
- flush: This tells the application to flush its output. In the CLI, this is implemented via the C library call fflush; mod_php5 uses the Apache library call rflush.
- send_header: This sends a single specified header to the client. Some servers (such as Apache) have built-in functions for handling header transmission. Others (such as the PHP CGI) require you to manually send them. Others still (such as the CLI) do not handle sending headers at all.
- send_headers: This sends all headers to the client.
- read_cookies: During SAPI activation, if a read_cookies handler is defined, it will be called to populate SG(request_info).cookie_data. This is then used to populate the $_COOKIE autoglobal.
- read_post: During SAPI activation, if the request method is a POST (or if the php.ini variable always_populate_raw_post_data is true), the read_post handler is called to populate $HTTP_RAW_POST_DATA and $_POST.

Chapter 23 takes a closer look at using the SAPI interface to integrate PHP into applications and does a complete walkthrough of the CGI SAPI.

The PHP Core

There are several key steps in activating and running a PHP interpreter. When an application wants to start a PHP interpreter, it starts by calling php_module_startup. This function is like the master switch that turns on the interpreter. It activates the registered SAPI, initializes the output buffering system, starts the Zend Engine, reads in and acts on the php.ini file, and prepares the interpreter for its first request. Some important functions that are used in the core are

- php_module_startup: This is the master startup for PHP.
- php_startup_extensions: This runs the initialization function in all registered extensions.
- php_output_startup: This starts the output system.
- php_request_startup: At the beginning of a request, this is the master function, which calls up to the SAPI per-request functions, calls down into the Zend Engine for per-request initialization, and calls the request startup function in all registered modules.
- php_output_activate: This activates the output system, setting the output functions to use the SAPI-specified output functions.
- php_init_config: This reads in the php.ini file and acts on its contents.
- php_request_shutdown: This is the master function to destroy per-request resources.
- php_end_ob_buffers: This is used to flush output buffers, if output buffering has been enabled.
- php_module_shutdown: This is the master shutdown function for PHP, triggering all the rest of the interpreter shutdown functions.

The PHP Extension API

Most of our discussion regarding the PHP extension API will be carried on in Chapter 22, where you will actually implement extensions. Here we’ll only look at the basic callbacks available to extensions and when they are called.

Extensions can be registered in two ways. When an extension is compiled statically into PHP, the configuration system permanently registers that module with PHP. An extension can also be loaded from the .ini file, in which case it is registered during the .ini parsing.

The hooks that an extension can register are contained in its zend_module_entry structure, like so:

    struct _zend_module_entry {
        unsigned short size;
        unsigned int zend_api;
        unsigned char zend_debug;
        unsigned char zts;
        struct _zend_ini_entry *ini_entry;
        char *name;
        zend_function_entry *functions;
        int (*module_startup_func)(INIT_FUNC_ARGS);
        int (*module_shutdown_func)(SHUTDOWN_FUNC_ARGS);
        int (*request_startup_func)(INIT_FUNC_ARGS);
        int (*request_shutdown_func)(SHUTDOWN_FUNC_ARGS);
        void (*info_func)(ZEND_MODULE_INFO_FUNC_ARGS);
        char *version;
        int (*global_startup_func)(void);
        int (*global_shutdown_func)(void);
        int globals_id;
        int module_started;
        unsigned char type;
        void *handle;
        int module_number;
    };
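To make the timing of the four startup and shutdown hooks concrete, here is a minimal standalone C sketch. It is a model under stated assumptions rather than PHP source: toy_module_entry keeps only the four lifecycle callbacks (without their real argument macros), and the toy_* driver functions and the counter extension are hypothetical names. It assumes, as the core function list above describes, that module startup runs once when the interpreter starts and that request startup and shutdown run once per request in every registered module.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SUCCESS 0
#define TOY_FAILURE (-1)

/* Hypothetical, reduced stand-in for zend_module_entry: lifecycle hooks only. */
typedef struct {
    const char *name;
    int (*module_startup_func)(void);
    int (*module_shutdown_func)(void);
    int (*request_startup_func)(void);
    int (*request_shutdown_func)(void);
} toy_module_entry;

/* A hypothetical "counter" extension and its state. */
static int counter_loaded = 0;   /* survives across requests */
static int counter_requests = 0; /* survives across requests */
static int counter_hits = 0;     /* reset for every request */

int counter_minit(void)     { counter_loaded = 1; counter_requests = 0; return TOY_SUCCESS; }
int counter_mshutdown(void) { counter_loaded = 0; return TOY_SUCCESS; }
int counter_rinit(void)     { counter_hits = 0; counter_requests++; return TOY_SUCCESS; }
int counter_rshutdown(void) { counter_hits = 0; return TOY_SUCCESS; }

toy_module_entry counter_module = {
    "counter", counter_minit, counter_mshutdown, counter_rinit, counter_rshutdown
};

/* Registry of modules known to the toy core. */
static toy_module_entry *toy_modules[] = { &counter_module };
static const size_t toy_module_count = sizeof(toy_modules) / sizeof(toy_modules[0]);

/* Selects one of the four hooks of a module. */
typedef int (*toy_hook)(void);
enum toy_phase { TOY_MINIT, TOY_MSHUTDOWN, TOY_RINIT, TOY_RSHUTDOWN };

static toy_hook toy_pick(const toy_module_entry *m, enum toy_phase phase)
{
    switch (phase) {
    case TOY_MINIT:     return m->module_startup_func;
    case TOY_MSHUTDOWN: return m->module_shutdown_func;
    case TOY_RINIT:     return m->request_startup_func;
    default:            return m->request_shutdown_func;
    }
}

/* Runs the chosen hook in every registered module; a NULL hook is skipped. */
static int toy_run_phase(enum toy_phase phase)
{
    size_t i;

    for (i = 0; i < toy_module_count; i++) {
        toy_hook hook = toy_pick(toy_modules[i], phase);
        if (hook != NULL && hook() != TOY_SUCCESS) {
            return TOY_FAILURE;
        }
    }
    return TOY_SUCCESS;
}

/* Toy counterparts of the master startup and shutdown functions. */
int toy_module_startup(void)   { return toy_run_phase(TOY_MINIT); }
int toy_request_startup(void)  { return toy_run_phase(TOY_RINIT); }
int toy_request_shutdown(void) { return toy_run_phase(TOY_RSHUTDOWN); }
int toy_module_shutdown(void)  { return toy_run_phase(TOY_MSHUTDOWN); }
```

A host serving two requests would call toy_module_startup once, then the toy_request_startup and toy_request_shutdown pair twice, then toy_module_shutdown once. Per-request state such as counter_hits starts from zero on each request, while counter_requests keeps accumulating until module shutdown.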