Chapter 3: Special Components: Dhandlers and Autohandlers
In previous chapters you've seen an overview of the basic structure and
syntax of Mason components, and you've seen how components can
cooperate by invoking one another and passing arguments.
In this chapter you'll learn about dhandlers and autohandlers, two powerful
mechanisms that help lend reusable structure to your site and help you
design creative solutions to unique problems. Mason's dhandlers provide a
flexible way to create "virtual" URLs that don't correspond directly to
components on disk, and autohandlers let you easily control many structural
aspects of your site with a powerful object-oriented metaphor.
Dhandlers
The term "dhandler" stands for "default handler." The concept is simple: if
Mason is asked to process a certain component but that component does not
exist in the component tree, Mason will look for a component called
dhandler and serve that instead of the requested component. Mason looks
for dhandlers in the apparent requested directory and all parent directories.
For instance, if your web server receives a request for
/archives/2001/March/21 and passes that request to Mason, but no such
Mason component exists, Mason will sequentially look for
/archives/2001/March/dhandler, /archives/2001/dhandler, /archives/dhandler,
and /dhandler. If any of these components exist, the search will terminate
and Mason will serve the first dhandler it finds, making the remainder of the
requested component path available to the dhandler via
$m->dhandler_arg. For instance, if the first dhandler found is
/archives/dhandler, then inside this component (and any components it
calls), $m->dhandler_arg will return 2001/March/21. The dhandler can
use this information to decide how to process the request.
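The search order described above can be sketched in plain Perl. This is only an illustration of the lookup rule, not Mason's own code: dhandler_candidates is a hypothetical helper, and it lists every path Mason might try without checking which of them actually exist on disk.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mason's API): for a requested path,
# return the dhandler paths that would be tried, nearest directory first,
# each paired with the value $m->dhandler_arg would hold if it were chosen.
sub dhandler_candidates {
    my ($request_path, $dhandler_name) = @_;
    $dhandler_name = 'dhandler' unless defined $dhandler_name;

    my @segments = grep { length } split m{/}, $request_path;
    my @candidates;

    # Move one path segment at a time from the directory part
    # to the leftover argument.
    for my $keep (reverse 0 .. $#segments) {
        my $dir  = join '/', @segments[0 .. $keep - 1];
        my $arg  = join '/', @segments[$keep .. $#segments];
        my $path = length($dir) ? "/$dir/$dhandler_name" : "/$dhandler_name";
        push @candidates, { path => $path, dhandler_arg => $arg };
    }
    return @candidates;
}

# Print each candidate next to the argument it would receive.
for my $c (dhandler_candidates('/archives/2001/March/21')) {
    printf "%-32s %s\n", $c->{path}, $c->{dhandler_arg};
}
```

Mason stops at the first candidate that exists, so only one of these rows ever applies to a given request.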
Dhandlers can be useful in many situations. Suppose you have a large
number of documents that you want to serve to your users through your web
site. These documents might be PDF files stored on a central document
server, JPEG files stored in a database, text messages from an electronic
mailing list archive (as in the example from the previous paragraph), or even
PNG files that you create dynamically in response to user input. You may
want to use Mason's features to create or process these documents, but it
wouldn't be feasible to create a separate Mason component for each
document on your server.
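As a rough sketch of this idea, a single dhandler can stand in for an entire archive. The component below assumes it is saved as /archives/dhandler; the %ARCHIVE hash is stub data invented for the example, standing in for whatever database or document store a real site would consult.

```mason
<%once>
# Stub data for illustration only; a real site would query its archive.
my %ARCHIVE = (
    '2001/March/21' => [ { subject => 'Meeting notes' } ],
);
</%once>
<%init>
# Assumption: this file is /archives/dhandler, so a request for
# /archives/2001/March/21 arrives here with dhandler_arg '2001/March/21'.
my $arg      = $m->dhandler_arg;
my $messages = $ARCHIVE{$arg} || [];
</%init>
<h1>Archive for <% $arg |h %></h1>
% if (@$messages) {
%   foreach my $msg (@$messages) {
<p><% $msg->{subject} |h %></p>
%   }
% } else {
<p>No messages found.</p>
% }
```

The request path comes from the user, so the sketch HTML-escapes it with the |h flag before printing it.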
In many situations, the dhandler feature is simply a way to make URLs more
attractive to the end user of the site. Most people probably prefer URLs like
http://www.yoursite.com/docs/corporate/decisions.pdf over URLs like
http://www.yoursite.com/doc.cgi?domain=corporate&format=pdf&content=decisions.
It also lets you design an intuitive browsing interface, so that
people who chop off the tail end of the URL and request
http://www.yoursite.com/docs/corporate/ can see a listing of available
corporate documents if your dhandler chooses to show one.
The alert reader may have noticed that using dhandlers is remarkably similar
to capturing the PATH_INFO environment variable in a CGI application. The
two are not quite the same, however: Apache's PATH_INFO mechanism is still
available to you if you're running Mason under mod_perl, but it is
triggered under different conditions than Mason's dhandler mechanism.
If Apache receives a request with a certain path, say,
/path/to/missing/component, then its actions depend on what the final
existing part of that path is. If the /path/to/missing/ directory exists but
doesn't contain a component file, then Mason will be invoked, a dhandler
will be searched for, and the remainder of the URL will be placed in
$m->dhandler_arg. On the other hand, if /path/to/missing exists as a regular
Mason component instead of as a directory, this component will be invoked
by Mason and the remainder of the path will be placed (by Apache) into
$r->path_info. Note that the majority of this handling is done by
Apache; Mason steps into the picture after Apache has already decided
whether the given URL points to a file, what that file is, and what the
leftover bits are.
What are the implications of this? The behavioral differences previously
described may help you determine what strategy to use in different
situations. For example, if you've got a bunch of content sitting in a database
but you want to route requests through a single Mason component, you may
want to construct "file-terminating" URLs and use $r->path_info to get
at the remaining bits. However, if you've got a directory tree under Mason's
control and you want to provide intelligent behavior for requests that don't
exist (perhaps involving customized 404 document generation, massaging of
content output, and so on) you may want to construct "directory-
terminating" URLs and use $m->dhandler_arg to get at the rest.
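For the "file-terminating" case, a component might read the leftover path as in the sketch below. The component name /articles and the URL layout are assumptions made up for this example, and the code relies on Apache routing the request to Mason as described above.

```mason
<%init>
# Assumption: this component is saved as /articles and requested as
# /articles/42/print, so Apache (not Mason) sets PATH_INFO to '/42/print'.
# $r is the Apache request object available to components under mod_perl.
my $extra = $r->path_info;

# The leading slash yields an empty first field, which is discarded.
my (undef, $id, $view) = split m{/}, $extra;
$id   = ''     unless defined $id;
$view = 'full' unless defined $view;
</%init>
<p>Article <% $id |h %>, view <% $view |h %></p>
```

A dhandler serving the same URLs would instead read $m->dhandler_arg, which holds the leftover path without a leading slash (2001/March/21 in the earlier example).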
Finer Control over Dhandlers
Occasionally you will want more control over how Mason delegates
execution to dhandlers. Several customization mechanisms are available.
First, any component (including a dhandler) may decline to handle a request,
so that Mason continues its search for dhandlers up the component tree. For
instance, given components located at /docs/component.mas, /docs/dhandler,
and /dhandler, /docs/component.mas may decline the request by calling
$m->decline, which passes control to /docs/dhandler. If /docs/dhandler calls
$m->decline, it will pass control to /dhandler. Each component may do
some processing before declining, so that it may base its decision to decline
on specific user input, the state of the database, or the phase of the moon. If
any output has been generated, $m->decline will clear the output buffer
before starting to process the next component.
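A minimal sketch of a declining dhandler follows. It assumes the file is /docs/dhandler and invents a rule for illustration: this dhandler serves only requests ending in .pdf and leaves everything else to the next dhandler up the tree.

```mason
<%init>
# Assumption: this file is /docs/dhandler. The ".pdf only" rule is
# hypothetical; any test on user input or application state would do.
my $arg = $m->dhandler_arg;

# Hand the request to the next dhandler up the tree (/dhandler here).
# Execution of this component stops at the call to decline.
$m->decline unless $arg =~ /\.pdf\z/;
</%init>
<p>Serving PDF document: <% $arg |h %></p>
```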
Second, you may change the filename used for dhandlers, so that instead of
searching for files called dhandler, Mason will search for files called
default.mas or any other name you might wish. To do this, set the
dhandler_name Interpreter parameter (see Chapter 6 for details on
setting parameters). This may be useful if you use a text editor that
recognizes Mason component syntax (we mention some such editors in
Appendix C) by file extension, if you want to configure your web server to
handle (or deny) requests based on file extension, or if you simply don't like
the name dhandler.
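One way this might look in a handler.pl file is sketched below. It assumes a Mason version whose HTML::Mason::ApacheHandler constructor accepts Interpreter parameters directly; the data_dir path is a placeholder, and Chapter 6 remains the reference for how parameters are actually set.

```perl
use strict;
use warnings;
use HTML::Mason::ApacheHandler;

# Assumption: constructor parameters are passed through to the Interpreter.
my $ah = HTML::Mason::ApacheHandler->new(
    comp_root     => '/home/httpd/html',
    data_dir      => '/home/httpd/mason-data',   # hypothetical location
    dhandler_name => 'default.mas',              # used instead of "dhandler"
);
```

With this setting, a request for a missing component under /archives would cause Mason to look for /archives/default.mas rather than /archives/dhandler.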
Dhandlers and Apache Configuration
You may very well have something in your Apache configuration file that
looks something like this:
DocumentRoot /home/httpd/html
<LocationMatch "\.html$">
  SetHandler perl-script
  PerlHandler HTML::Mason::ApacheHandler
</LocationMatch>
This directive has a rather strange interaction with Mason's dhandler
mechanism. If you have a dhandler at /home/httpd/html/dhandler on the
filesystem, which corresponds to the URL /dhandler, and a request arrives for
the URL /nonexistent.html, Mason will be asked to handle the request. Since
the file doesn't exist, Mason will call your dhandler, just as you would
expect.
However, if you request the URL /subdir/nonexistent.html, Apache will
never call Mason at all and will instead simply return a NOT FOUND (404)
error. Why, you ask? A good question indeed. It turns out that in the process
of answering the request, Apache notices that there is no
/home/httpd/html/subdir directory on the filesystem before it even gets to the
content generation phase, so it never invokes Mason. In fact, if you
were to create an empty /home/httpd/html/subdir directory, Mason would be
called.
One possible solution is simply to create empty directories for each path you
would like to be handled by a dhandler, but this is not a very practical
solution in most cases. Fortunately, you can add another configuration
directive like this:
<Location /subdir>
  SetHandler perl-script
  PerlHandler HTML::Mason::ApacheHandler
</Location>