Chapter 6: The Lexer, Compiler, Resolver, and Interpreter Objects
Now that you're familiar with Mason's basic syntax and some of its more
advanced features, it's time to explore the details of how the various pieces
of the Mason architecture work together to process components. By knowing
the framework well, you can use its pieces to your advantage, processing
components in ways that match your intentions.
In this chapter we'll discuss four of the persistent objects in the Mason
framework: the Interpreter, Resolver, Lexer, and Compiler. These objects
are created once (in a mod_perl setting, they're typically created when the
server is starting up) and then serve many Mason requests, each of which
may involve processing many Mason components.
Each of these four objects has a distinct purpose. The Resolver is responsible
for all interaction with the underlying component source storage mechanism,
which is typically a set of directories on a filesystem. The main job of the
Resolver is to accept a component path as input and return various properties
of the component such as its source, time of last modification, unique
identifier, and so on.
The Lexer is responsible for actually processing the component source code
and finding the Mason directives within it. It interacts quite closely with the
Compiler, which takes the Lexer's output and generates a Mason component
object suitable for interpretation at runtime.
The Interpreter ties the other three objects together. It is responsible for
taking a component path and arguments and generating the resultant output.
This involves getting the component from the Resolver, compiling it, then
caching the compiled version so that the next time the Interpreter encounters
the same component it can skip the resolving and compiling phases.
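The resolve, compile, and cache sequence described above can be sketched in plain Perl. This is only a toy model: the function names, the %SOURCE hash standing in for component storage, and the <NAME> placeholder syntax are all invented for illustration and are not Mason's API. It also leaves out details a real implementation needs, such as checking whether the source has changed since it was cached.

```perl
use strict;
use warnings;

# Toy stand-ins only; none of these names belong to Mason.
my %SOURCE = ( '/hello' => 'Hello, <NAME>!' );   # plays the role of component storage
my %CACHE;                                       # component path => compiled code ref
my $compile_count = 0;                           # lets us see how often we compile

# "Resolver": map a component path to its source text (undef if unknown).
sub resolve {
    my ($path) = @_;
    return $SOURCE{$path};
}

# "Compiler": turn source text into a callable code ref.
sub compile {
    my ($source) = @_;
    $compile_count++;
    return sub {
        my (%args) = @_;
        # Replace each <WORD> placeholder with the matching lowercase argument.
        (my $out = $source) =~ s/<(\w+)>/defined $args{lc $1} ? $args{lc $1} : ''/ge;
        return $out;
    };
}

# "Interpreter": resolve and compile on first use, then reuse the cached result.
sub exec_component {
    my ($path, %args) = @_;
    unless ($CACHE{$path}) {
        my $source = resolve($path);
        die "no such component: $path\n" unless defined $source;
        $CACHE{$path} = compile($source);
    }
    return $CACHE{$path}->(%args);
}

print exec_component('/hello', name => 'world'), "\n";
print exec_component('/hello', name => 'again'), "\n";
print "compiled $compile_count time(s)\n";
```

The second call finds the code ref in %CACHE, so the toy resolver and compiler run only once for the path.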
Figure 6-1 illustrates the relationship between these four objects. The
Interpreter has a Compiler and a Resolver, and the Compiler has a Lexer.
Figure 6-1. The Interpreter and its cronies
Passing Parameters to Mason Classes
An interesting feature of the Mason code is that, if a particular object
contains another object, the containing object will accept constructor
parameters intended for the contained object. For example, the Interpreter
object will accept parameters intended for the Compiler or Resolver and do
the right thing with them. This means that you often don't need to know
exactly where a parameter goes. You just pass it to the object at the top of
the chain.
Even better, if you decide to create your own Resolver for use with Mason,
the Interpreter will take any parameters that your Resolver accepts -- not the
parameters defined by Mason's default Resolver class.
Also, if an object creates multiple delayed instances of another class, as the
Interpreter does with Request objects, it will accept the created class's
parameters in the same way, passing them to the created class at the
appropriate time. So if you pass the autoflush parameter to the
Interpreter's constructor, it will store this value and pass it to any Request
objects it creates later.
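To make the idea concrete, here is a minimal sketch of the two behaviors just described: handing parameters down to a contained object, and storing parameters for objects created later. The Toy:: classes and the valid_params method are hypothetical stand-ins written for this example. They are not Mason's classes, and Mason's real mechanism is considerably more general. Only the parameter names allow_globals and autoflush come from Mason itself.

```perl
use strict;
use warnings;

# All three packages are toy stand-ins, not Mason's classes.
package Toy::Compiler;
sub valid_params { return qw(allow_globals) }
sub new { my ($class, %p) = @_; return bless { %p }, $class }

package Toy::Request;
sub valid_params { return qw(autoflush) }
sub new { my ($class, %p) = @_; return bless { %p }, $class }

package Toy::Interp;

# Pick out of %$args the parameters that $class says it understands.
sub _params_for {
    my ($class, $args) = @_;
    my %mine;
    for my $name ($class->valid_params) {
        $mine{$name} = $args->{$name} if exists $args->{$name};
    }
    return %mine;
}

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    # Contained object: built immediately from its share of the arguments.
    $self->{compiler} = Toy::Compiler->new(_params_for('Toy::Compiler', \%args));
    # Delayed objects: remember their parameters until one is needed.
    $self->{request_params} = { _params_for('Toy::Request', \%args) };
    return $self;
}

# Create a request on demand, using the parameters stored at construction.
sub make_request {
    my ($self) = @_;
    return Toy::Request->new(%{ $self->{request_params} });
}

package main;

my $interp = Toy::Interp->new(
    allow_globals => ['$dbh'],   # ends up in the contained compiler
    autoflush     => 1,          # held back for each request created later
);
my $request = $interp->make_request;

print "compiler globals: @{ $interp->{compiler}{allow_globals} }\n";
print "request autoflush: $request->{autoflush}\n";
```

The caller passes everything to the object at the top of the chain and never names the compiler or request classes directly.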
This system was motivated in part by the fact that many users want to be
able to configure Mason from an Apache config file. Under this system, the
user just sets a certain configuration directive (such as MasonAutoflush
to set the autoflush parameter) in her httpd.conf file, and it gets directed
automatically to the Request objects when they are created.
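As a rough illustration, an httpd.conf fragment along these lines would turn autoflush on for every request. The two paths and the /mason location are assumptions made up for this sketch, so substitute your own.

```apache
# Hypothetical paths and location; adjust for your own server.
PerlModule HTML::Mason::ApacheHandler

PerlSetVar MasonCompRoot /var/www/mason/comps
PerlSetVar MasonDataDir  /var/www/mason/data

# Passed along to each Request object as its autoflush parameter.
PerlSetVar MasonAutoflush 1

<Location /mason>
    SetHandler  perl-script
    PerlHandler HTML::Mason::ApacheHandler
</Location>
```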
The details of how this system works are fairly magical and the code
involved is so funky its creators don't know whether to rejoice or weep, but
it works, and you can take advantage of this if you ever need to create your
own custom Mason classes. Chapter 12 covers this in its discussion of the
Class::Container class, where all the funkiness is located.
The Lexer
Mason's built-in Lexer class is, appropriately enough,
HTML::Mason::Lexer. All it does is parse the text of Mason
components and pass off the sections it finds to the Compiler. As of Version
1.10, the Lexer doesn't actually accept any parameters that alter its behavior,
so there's not much for us to say in this section.
Future versions of Mason may include other Lexer classes to handle
alternate source formats. Some people -- crazy people, we assure you -- have
expressed a desire to write Mason components in XML, and it would be
fairly simple to plug in a new Lexer class to handle this. If you're one of
these crazy people, you may be interested in Chapter 12 to see how to use
objects of your own design as pieces of the Mason framework.
By the way, you may be wondering why the Lexer isn't called a Parser, since
its main job seems to be to parse the source of a component. The answer is
that previous implementations of Mason had a Parser class with a different
interface and role, and a different name was necessary to maintain forward
(though not backward) compatibility.
The Compiler
By default, Mason will use the
HTML::Mason::Compiler::ToObject class to do its compilation. It
is a subclass of the generic HTML::Mason::Compiler class, so we
describe here all parameters that the ToObject variety will accept,
including parameters inherited from its parent:
allow_globals
You may want to allow access to certain Perl variables across all
components without declaring or initializing them each time. For
instance, you might want to let all components share access to a $dbh
variable that contains a DBI database handle, or you might want to
allow access to an Apache::Session %session variable.
For cases like these, you can set the allow_globals parameter to
an array reference containing the names of any global variables you
want to declare. Think of it like a broadly scoped use vars
declaration; in fact, that's exactly the way it's implemented under the
hood. If you wanted to allow the $dbh and %session variables, you
would pass an allow_globals parameter like the following:
allow_globals => ['$dbh', '%session']
Or in an Apache configuration file:
PerlSetVar MasonAllowGlobals $dbh
PerlAddVar MasonAllowGlobals %session
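The effect of a use vars declaration can be seen in a small standalone script. The package name My::Components and the compile_snippet helper are invented for this sketch, and it makes no claim about how Mason arranges the declaration internally. It shows only that code compiled under strict in a package may use variables that were declared there with use vars, while undeclared variables are still rejected.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the package in which component code is compiled.
package My::Components;
use vars qw($dbh %session);   # comparable to allow_globals => ['$dbh', '%session']

package main;

# Compile a snippet of code under strict inside My::Components.
# Returns the code ref (undef on failure) and the compile error, if any.
sub compile_snippet {
    my ($code) = @_;
    my $sub = eval 'package My::Components; use strict; sub { ' . $code . ' }';
    return ($sub, $@);
}

$My::Components::dbh = 'fake-handle';

# $dbh was declared with use vars, so this compiles cleanly.
my ($ok_sub, $ok_err) = compile_snippet('return $dbh');

# $not_declared was never declared, so strict refuses to compile it.
my ($bad_sub, $bad_err) = compile_snippet('return $not_declared');

print "declared global: ", $ok_sub->(), "\n";
print "undeclared global: ", (defined($bad_sub) ? "compiled" : "rejected"), "\n";
```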
The allow_globals parameter can be used effectively with the
Perl local() function in an autohandler. The top-level autohandler
is a convenient place to initialize global variables, and local() is
exactly the right tool to ensure that they're properly cleaned up at the
end of the request:
# In the top-level autohandler:
<%init>
 # $dbh and %session have been declared using 'allow_globals'
 local $dbh = DBI->connect(...connection parameters...);
 local *session; # Localize the glob so the tie() expires properly
 tie %session, 'Apache::Session::MySQL',
     Apache::Cookie->fetch->{session_id}->value,
     { Handle => $dbh, LockHandle => $dbh };